You're building a new application. So you need somewhere to store your data. It's time to create a table in your database!
In this post you'll find out how to create tables, copy them with create-table-as-select, and choose between the many table types available in Oracle Database.
The basic create table statement takes the form:
create table <table_name> ( <column1> <data type>, <column2> <data type>, <column3> <data type>, ... );
So to create a table called toys, with the columns toy_name, weight, and colour, run:
create table toys ( toy_name varchar2(10), weight number, colour varchar2(10) );
Oracle Database has many data types to choose from. Common data types are varchar2 for text, number for numeric values, and date or timestamp for datetimes.
NOTE: Create table is a form of data-definition language (DDL) statement. These change the objects in your database. Oracle Database runs a commit before and after DDL. So if the create works, it's saved to your database.
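You can see this for yourself. A rollback straight after a create table has no effect, because the DDL already committed (a sketch; this assumes toys doesn't already exist in your schema):

```sql
create table toys ( toy_name varchar2(10), weight number, colour varchar2(10) );

rollback;

-- the table is still there: the create committed itself
select table_name from user_tables where table_name = 'TOYS';
```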
You can also create a table based on a select statement. This makes a table with the same columns and rows as the source query. This operation is known as create-table-as-select (CTAS).
This is a handy way to copy one table to another. For example, the following creates toys_clone from toys:
create table toys_clone as select * from toys;
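You can also add a where clause to the query. A handy trick is a condition that's always false, which copies the table's structure but none of its rows:

```sql
create table toys_empty as
  select * from toys
  where  1 = 0;
```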
Yes. But, as always, there's more to it than this. You'll want to add some constraints to your table. And there are many types of table available in Oracle Database, including heap-organized, index-organized, external, temporary, partitioned, clustered, and blockchain tables.
These affect how the database physically stores the data. Which can have a big impact on performance.
Database tables tend to last a long time. And it's tricky to change the type of a table storing millions of rows. So it's worth spending a few minutes deciding which you need.
For an overview of these types, watch this video, taken from the first module of the beginner's SQL course: Databases for Developers: Foundations:
This is the default for tables in Oracle Database. But if you want to be explicit, add the "organization heap" clause at the end:
create table toys ( toy_name varchar2(10), weight number, colour varchar2(10) ) organization heap;
Heaps are good general purpose tables. They are the most common type you'll see in Oracle Database installations.
With these, the database is free to store new rows wherever there is space. So if you read ten rows, they could be anywhere on disk.
If you're lucky, they're all in the same place. So the query can get them all in one trip to disk.
But it's not guaranteed. Each row could be in a different location. Meaning you need ten I/O operations to read them all.
This is bad news if you want the query to be as fast as possible. The more disk reads your SQL does, the slower it will be. Even if the rows are cached in memory, accessing ten memory addresses is slower than hitting one.
Luckily you can force the database to store rows with similar values in the same place. This can reduce work your query does to get them. Making your SQL faster!
To force this physical order, you need to change your table's properties. The first we'll look at is the index-organized table (IOT).
Indexes are ordered data structures. So an IOT stores rows physically sorted according to its primary key.
NOTE: A primary key (PK) is a constraint. Each set of values in its columns can only appear once. So you can't have duplicates. It also has a not null constraint. And creates a unique index in the background.
To create one, add the organization index clause to the table definition:
create table toys ( toy_name varchar2(10) primary key, weight number, colour varchar2(10) ) organization index;
So why would you use this instead of a default heap table?
A couple of reasons.
First up, when using heap organization, the table and its primary key index are separate data structures. An IOT combines these into one.
This can reduce some overheads.
You no longer need an extra index for the primary key. Which can save you some space.
And SQL accessing a row using the primary key only has to access one structure. Instead of two. So these queries are that tiny bit faster.
But the biggest advantage (IMO) comes for tables with a multi-column primary key. For example many-to-many join tables. Such as customers to their addresses:
create table customer_addresses ( customer_id integer, address_id integer, primary key ( customer_id, address_id ) ) organization index;
This stores the values sorted by customer_id, then address_id. So all the addresses for customer 1 are next to each other on disk.
So if you search for all the addresses for this customer, like so:
select * from customer_addresses where customer_id = 1;
You know all the rows will be in the same few locations. Making your SQL that tiny bit faster.
Whereas with a heap table, the database could store them anywhere.
You can extend this principle to any table with a composite PK, provided you (almost always) search for rows where the first column of the PK equals some value.
This is common for tables in a master-detail relationship. For example, orders and their order items, or invoices and their invoice lines.
Here you usually get the rows from the detail table matching a row in the parent. Such as all products in an order.
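For instance, you could define a hypothetical order_items detail table as an IOT, sorted by its parent order (a sketch; the table and column names are illustrative):

```sql
create table order_items (
  order_id    integer,
  item_number integer,
  product_id  integer,
  quantity    integer,
  primary key ( order_id, item_number )
) organization index;
```

All the items for a given order are then stored next to each other, so fetching an order's items needs fewer reads.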
Remember, to use an IOT the table must have a primary key. So the values you want to sort by must be unique. If they're not, you can get around this by creating a fake primary key. But this is a niche technique. Only use it if you're sure you know what you're doing!
You can also use partitioning or table clusters to impose order on your data. But let's cover off the organization clause first.
The final option for the organization clause is external. You use this to read text files stored on the database's file system. This enables you to read CSV or other formatted files into your database using SQL.
To create one you must have a directory object in place. This points to the folder where the file is:
create or replace directory ext_files as '/path/to/files';
To read the file toys.csv in /path/to/files, use this directory and define the file like so:
create table toys_csv ( toy_name varchar2(10), weight number, colour varchar2(10) ) organization external ( default directory ext_files location ( 'toys.csv' ) );
Now, when you query toys_csv, you're reading the records in the file toys.csv.
There are many options for creating external tables. If the file you're reading isn't a straightforward CSV, set the external clause as needed.
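For example, assuming a hypothetical pipe-delimited file toys.txt with a header row, you could describe its format with access parameters like so:

```sql
create table toys_pipe (
  toy_name varchar2(10),
  weight   number,
  colour   varchar2(10)
) organization external (
  type oracle_loader
  default directory ext_files
  access parameters (
    records delimited by newline
    skip 1
    fields terminated by '|'
  )
  location ( 'toys.txt' )
);
```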
External tables are great for loading files using SQL. But sometimes you may need to massage the data before saving it in your real tables. Or need to read the same record many times during the load.
When doing this, it's handy to stage the data in a private working area. Somewhere you can store rows that only you can see.
Rows in temporary tables are private to your session. Only you can view them. And, once you disconnect, the database removes any rows you added.
Oracle Database has two types of temporary table: global (GTT) and private (PTT).
The syntax to create a global temporary table is:
create global temporary table toys_gtt ( toy_name varchar2(10), weight number, colour varchar2(10) );
By default the database will auto-delete all the rows at the end of each transaction. So as soon as you commit, it's empty!
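You can see this for yourself. Insert a row into toys_gtt, and after you commit the table is empty again:

```sql
insert into toys_gtt values ( 'Blue Dino', 8, 'blue' );

select count(*) from toys_gtt;  -- 1: your session sees the row

commit;

select count(*) from toys_gtt;  -- 0: the commit removed it
```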
If you need to keep the rows across transactions, set the on commit clause to preserve rows:
create global temporary table toys_gtt_keep ( toy_name varchar2(10), weight number, colour varchar2(10) ) on commit preserve rows;
Global temporary tables have a standard definition across the database. You create it once and leave it in your application. This is in sharp contrast to:
PTTs are new in Oracle Database 18c. With these, not only are the rows hidden from other sessions, so is the table itself!
The following creates a PTT called ora$ptt_toys:
create private temporary table ora$ptt_toys ( toy_name varchar2(10), weight number, colour varchar2(10) );
NOTE: the table name has the cryptic prefix ora$ptt_. This must match whatever your database's private_temp_table_prefix parameter is set to, otherwise the create will fail! Also, unlike all other forms of DDL, create private temporary table does NOT commit!
By default, these only last for the duration of a transaction. As soon as you commit (or rollback) the table is gone.
If you need it to persist for the length of your connection, set the on commit clause to preserve definition:
create private temporary table ora$ptt_toys_keep ( toy_name varchar2(10), weight number, colour varchar2(10) ) on commit preserve definition;
Unlike every other type of table, two people can create a PTT at the same time with the same name. But different columns! Handy if you need to stage values from dynamic data sources.
As the number of rows in a table grows, it gets harder to manage them. Queries can become sluggish, adding indexes takes longer, and archiving off old data is slow.
Partitioning addresses these issues by allowing you to logically split a table into several smaller tables. But still access these as one table.
You can even subdivide each of these partitions further with subpartitions.
NOTE: to partition a table, you must purchase the Partitioning option. Check you have this before diving in!
To partition a table, choose your partitioning column(s) and method. Supported methods include range, list, and hash partitioning.
When you add rows to the table, they will go in the relevant partition.
For example, you may want to split up your toys by colour. To do so, list partition it by this column. And define which values will go in each partition:
create table toys_partitioned ( toy_name varchar2(10), weight number, colour varchar2(10) ) partition by list ( colour ) ( partition p_green values ( 'green' ), partition p_red values ( 'red' ), partition p_blue values ( 'blue' ) );
Each partition only stores rows where the row's colour value matches the partition value. And queries searching for a given colour will now only access that partition.
So if you search for all the red rows, the database knows they're all located in the same place. Which can make your SQL that bit faster.
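For example, this query only needs to scan the p_red partition; the database can skip the others entirely (known as partition pruning):

```sql
select * from toys_partitioned
where  colour = 'red';
```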
A common partitioning task is to swap data in and out using partition exchange. For this you need a non-partitioned table whose structure matches the partitioned table exactly, which is fiddly to get right by hand. So Oracle Database 12.2 introduced a new create table option. Create-table-for-exchange:
create table toys_stage for exchange with table toys_partitioned;
This makes a non-partitioned table with the same structure as the source. Ensuring you can exchange it with the partitioned table.
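Once you've loaded rows into the staging table, you can swap it with a partition:

```sql
alter table toys_partitioned
  exchange partition p_red with table toys_stage;
```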
Partitioning can bring huge benefits. But get it wrong and you can make your database an unmaintainable mess! Read up on the pros and cons before jumping in.
To learn more, read my colleague Connor's guide to partitioning for developers.
So far we've dealt with making queries against one table faster. But what if you want to get rows from two tables at the same time?
Joining tables is one of the most common operations in a database. To do this, first the database reads rows from one table. Then finds rows matching the join criteria in the other.
So you need at least two lots of reads to get the data. One for each table. In practice joins can do many more.
Table clusters avoid this "join penalty" by storing rows from different tables in the same place. Rows with the same value for the cluster key from each table go in the same location. This means you get "two-for-the-price-of-one" access. Instead of an I/O operation per table you query, you can get all the rows in one shot.
Say you have a lookup table for colours. And you often join the toy table to this on the colour. Clustering these tables by colour ensures the database stores rows with the same colour in the same place.
To use table clusters, first you need to create the cluster. This can be a hash or index cluster. The following creates a hash cluster:
create cluster colour_clus ( colour varchar2(10) ) hashkeys 1024;
The cluster's columns are its key. Place the tables in this by adding the cluster clause to your create table. Here you state the cluster columns. These must have the same data type as in the cluster.
The following adds colours_clustered and toys_clustered to the colour_clus:
create table colours_clustered ( colour varchar2(10), rgb_hex_value varchar2(6) ) cluster colour_clus ( colour );

create table toys_clustered ( toy_name varchar2(10), weight number, colour varchar2(10) ) cluster colour_clus ( colour );
Now, if you insert a row with the colour red in both colours_clustered and toys_clustered, the database will stick them in the same place.
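So a join on colour can fetch the matching rows from both tables with less work:

```sql
select t.toy_name, c.rgb_hex_value
from   toys_clustered t
join   colours_clustered c
on     t.colour = c.colour
where  t.colour = 'red';
```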
Table clusters are an advanced feature. While they can make joins faster, they come with several caveats. Read up on them before diving in!
Added in Oracle Database 21c (and now backported to 19.10), blockchain tables are insert-only. These solve a specific problem: where you need to guarantee no one has changed the data. For example, audit records.
Use caution when trying these out; you can easily end up with tables you can't remove!
The syntax to create one is:
create blockchain table blockchain_tab (
  audit_id        integer not null primary key,
  audit_timestamp timestamp not null,
  audit_data      clob not null
)
no drop until 0 days idle -- !!CAREFUL!!
no delete until 16 days after insert
hashing using sha2_512 version v1;
Pay close attention to the no drop clause. The until N days idle part is optional. If you exclude it, you're unable to remove the table! When playing with these tables, I urge you to set this to:

no drop until 0 days idle

The number of days you specify here is how long you have to wait after the last row was added before you can drop the table. Using zero days means you can drop the table immediately. You can only increase this value after you create the table, so think carefully before setting it to non-zero values.
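For example, to raise the wait to 16 days after the last insert, you could run:

```sql
alter table blockchain_tab no drop until 16 days idle;
```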
The no delete clause controls how many days you have to wait before you can remove rows. Again, by default you will never be able to remove rows from the table. To allow deletions, set something like:
no delete until 16 days after insert
This means you can only remove rows 16 days after you inserted them. As with the drop clause, you can only increase this value with alter table, unless you set it as locked, in which case the property is fixed for the lifetime of the table.
Whatever you set the no delete clause to, you can never update rows in a blockchain table:
insert into blockchain_tab values ( 1, systimestamp, 'Audit' );

update blockchain_tab set audit_data = 'Overwrite attempt';

ORA-05715: operation not allowed on the blockchain table
That was a lot of options! And there are a few other, more specialized types available too. For a full list of options, read up on tables and table clusters in the Concepts Guide. Or view the full create table syntax.
So now you may be wondering: which type should I use?
In most cases, a heap table is the way to go. These are the most versatile.
But think about how you'll access the rows in the table. Will you have one or two queries that need to be as fast as possible? If so, an index-organized or partitioned table may be the way to go.
Or will it need to support many queries on different columns? In which case a default heap table is the better option.
And bear in mind you can combine some options. For example, you can have a partitioned IOT. Knowing which to use will come from experience. But more importantly testing!
So get familiar with the table types available. Play around with them to see how they work. When building new functionality, try different table settings. And test to see how they perform.
But making new tables is a small part of database development. Often you'll need to change existing tables. It's time to find out how to add and remove columns, and how to drop tables you no longer need.
So you've created a shiny new table. But the chances are most of your development will be against existing tables. Tables you'll need to extend to store new information. So you need to add extra columns.
To add columns to your table, you need an alter table statement. This lists the table you're changing, with the names and data types of the columns you want to add. So to add a price column to toys, run:
alter table toys add ( price number );
If you want to add more than one column, you can list them out in the add clause:
alter table toys add ( cuddliness_factor integer, quantity_of_stuffing integer );
So that's easy enough. But what about columns you no longer need? How do you get rid of them?
Every now and then you may want to remove a column from a table. Maybe the business has realised each toy has many colours. So there's no point storing this on the toys table. Thus you want to remove this column.
You can do so with an alter table drop command:
alter table toys drop ( colour );
This is an expensive operation. On tables storing millions of rows it will take a long time to run. And it blocks other changes to the table. So your application may be unusable while this runs!
A better option is to set the column unused:
alter table toys set unused column colour;
This is an instant operation. No matter how many rows are in the table, it will take the same amount of time.
This is because it doesn't physically remove the columns from the database. It marks them as unavailable. The data still exists. You just can't get to it!
If you're hoping to reclaim the space these columns used, you need to wipe them from the table. Do this with the following command:
alter table toys drop unused columns;
As you're now doing the work of deleting the data, this can take a long time. The key difference is that dropping unused columns is non-blocking, so your application can continue to run as normal.
Whichever method you use, take care when removing columns. Dropping columns or setting them unused are both one-way operations. There is no "undrop column" command. If you made a mistake, you'll have to recover the table from a backup!
So that helps if you're removing some of the columns. But what if you want to scrap the whole table?
It's time to drop it!
Sometimes you may want to delete a whole table. To do this, drop the table like so:
drop table toys;
This removes the whole table and its data. Like dropping columns, this is a one-way operation.
Starting your SQL journey and want to know more? The video above is from the first module of Databases for Developers: Foundations. Learn SQL by taking this free online tutorial.
The best part?
It's 100% FREE!
Head to the registration page to join.
UPDATE 2 Feb 2021: Added blockchain tables