Pat Shuff's Blog

database option - partitioning

Database partitioning has been around since the 8i version of the database, well over ten years ago. The initial features of partitioning were manual processes that allowed you to split data by range, such as dates, or by sequences, such as zip codes. Tables could be split across multiple tablespaces and files, with indexes applied to each piece. If a select statement was executed with a where clause that matched a partition boundary, a full table scan was not necessary.

Splitting the data into different tablespaces not only lets us read just the relevant data into memory, it also lets us split our database into storage tiers. We can keep the most used data on high speed disk (or in memory if we are lucky) and historic data on slower, lower cost storage. We can also compress the data that is no longer changing so that it takes up less space. This tiering is only available with partitioning and is the reason why many customers purchase it as an option. We can use flash drives for our current quarter/month data, 10K rpm drives for last quarter/month data, 7.2K rpm drives for the rest of the year's data, and NFS mounts for data more than a year old. The savings on storage more than pay for the cost of partitioning.

Unfortunately, this does not carry over to cloud services since you really don't get tiered storage behind a database when you consume DBaaS or PaaS. There we need to focus on improving performance by helping partitions and subpartitions fit into the available memory to speed up select statements.

Some places to learn more about partitioning include

Before we go down the rabbit hole and dive deep into partitioning, let's review how a select statement works and how data is stored. Say for example we have a database that contains addresses for customers. The table contains an id number, a first name, last name, address, phone number, city, state, zip code, credit card number, credit card expiration, and email address. We have a second table for our on-line catalog that contains part numbers, a title, a description, and a file link for photos. We have a third table for our orders and it contains a customer id number, a part number, an order quantity, and order date. We would create our tables with the following commands

-- customer addresses and payment details
create table customers (
customer_id number(8),
first_name varchar2(32),
last_name varchar2(32),
address varchar2(64),
phone_number varchar2(10),
city varchar2(32),
state varchar2(16),
zip_code varchar2(16),
credit_card_number varchar2(16),
credit_card_expiration varchar2(8),
email_address varchar2(64)
);

-- on-line catalog
create table catalog (
part_number number(8),
title varchar2(32),
description varchar2(128),
part_image blob
);

-- one row per order line
create table order_entry (
order_number number(8),
customer_id number(8),
part_number number(8),
part_quantity number(8),
order_date date
);

If we have ten million items in our catalog we potentially consume 128 + 32 + 8 + 16 = 184 bytes times 10,000,000 rows. This makes our table roughly 2 GB in size, not counting the image data itself, which can be far larger than the 16 bytes assumed here for the blob column. If we have two million orders, the same kind of rough estimate (four 8-byte numbers plus a date) puts the order_entry table at under 100 MB. When we create a table we have the option of defining not only the storage type that we want our table to reside in but also how and where to store the data associated with this table. By default, a table is stored in the default tablespace of the user who creates it. In this walkthrough we assume that this is the SYSTEM tablespace, which is what you get when no other default has been assigned. All three of these tables will be stored in the DATA area under the SYSTEM tablespace since we did not specify a storage area or tablespace to hold the tables. For the database that we created in previous blog entries using Oracle DBaaS, these files are stored in /u02. We can dive down into /u02/app/oracle/oradata/ORCL/PDB1 and see that there is a system01.dbf file. This corresponds to the SYSTEM tablespace in the PDB1 pluggable database. As tables are added, they are added to the system01.dbf file. If we are in the container database ORCL, the file is /u02/app/oracle/oradata/ORCL/system01.dbf.
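To check where the tables actually landed, the data dictionary can be queried. The sketch below assumes the three tables were created in the current schema and that the session is allowed to read dba_data_files, which normally needs DBA-level privileges. On a restricted account only the first and last queries will work. The last query simply redoes the catalog size estimate from the declared column widths quoted above.

```sql
-- Which tablespace holds each of our tables (current schema only)
select table_name, tablespace_name
from user_tables
where table_name in ('CUSTOMERS', 'CATALOG', 'ORDER_ENTRY');

-- Which data files back each tablespace (assumes DBA-level privileges)
select tablespace_name, file_name, round(bytes / 1024 / 1024) as size_mb
from dba_data_files
order by tablespace_name;

-- Rough catalog size: declared bytes per row times ten million rows
select round((128 + 32 + 8 + 16) * 10000000 / power(1024, 3), 2) as approx_gb
from dual;
```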

To help with database performance, indexes are created on tables so that a query knows where in the system01.dbf file the matching rows of the customers or catalog table are located. For example, if we are looking for all orders that happened in February, we can select this data more quickly with an index on order_date, which keeps the order_date values presorted along with a pointer to each row. The index allows us to go directly to the table entries in the system01.dbf file rather than scanning the whole table. The index is also stored in the system01.dbf file and is updated whenever we enter new data into the order_entry table. Hopefully our indexes are small enough to stay resident in memory so that we don't need to go back to storage to reload them. Partitioning helps keep indexes smaller as well, since an index can be partitioned along with its table and the index partitions that are not being used can age out of memory. If we never look at data that is two years old, we don't need to keep the index on that data in memory; we can pull it in from disk when needed.
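As a minimal sketch of the February example, assuming the order_entry table defined earlier: the index name order_entry_date_idx is made up for illustration, and the year 2016 is an assumption.

```sql
-- Index on the column we filter by most often (index name is hypothetical)
create index order_entry_date_idx on order_entry (order_date);

-- All orders placed in February 2016. The half-open date range also
-- catches rows whose order_date carries a time-of-day component, and
-- it is a predicate the optimizer can satisfy from the index.
select order_number, customer_id, part_number, part_quantity
from order_entry
where order_date >= date '2016-02-01'
  and order_date <  date '2016-03-01';
```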

To reduce the access time and select time we can pre-sort the data in a different way. We can partition the data and store the table information in different files. Rather than storing everything in system01.dbf, we can store February order data in february.dbf. When a row is inserted, it is written to the january.dbf, february.dbf, or march.dbf file according to its order date rather than to system01.dbf. When we transition into April an april.dbf file is created and the january.dbf data can be moved into a q1_2016.dbf file. The key advantage is that when we perform a select statement and look for data in March and April, we only look in the march.dbf and april.dbf files. The rest of the data is not loaded because we know that it is not in the other partitions. This reduces the amount of data that is loaded into memory and the number of disk operations that are performed for every select statement. If everything were stored in the system01.dbf file, we would need to load all two million orders just to find the one or two hundred that happened in April. We basically read and then throw away 97% of the data because it does not match our request. True, an index would help, but it requires additional writes to the disk every time an insert happens. With partitioning enabled on the order_date column, all order entries are stored pre-sorted by date in the different partitions. Since 11g, interval partitioning creates new partitions automatically. As we cross from February to March, the march.dbf partition is created and all writes corresponding to March orders go to the new partition.
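A minimal sketch of the monthly scheme described above, using interval partitioning (11g or later). The table name order_entry_part, the partition name, and the starting date are assumptions chosen for illustration. The explain plan at the end is one way to see whether a March/April query touches only the matching partitions.

```sql
-- Monthly partitions are created automatically as rows arrive
-- (table and partition names are hypothetical)
create table order_entry_part (
  order_number  number(8),
  customer_id   number(8),
  part_number   number(8),
  part_quantity number(8),
  order_date    date not null
)
partition by range (order_date)
interval (numtoyminterval(1, 'MONTH'))
(
  -- everything before 2016 goes into one starting partition
  partition p_before_2016 values less than (date '2016-01-01')
);

-- Ask the optimizer how it would run a March/April query
explain plan for
select count(*)
from order_entry_part
where order_date >= date '2016-03-01'
  and order_date <  date '2016-05-01';

-- The Pstart/Pstop columns of the plan show which partitions are read
select * from table(dbms_xplan.display);
```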

There are a variety of partition types that you can use to divide data

  • Range partitioning - typically based on date columns (months, quarters) or on a range of numeric or character values. Each partition is defined with a values less than upper bound, and a row lands in the first partition whose bound is greater than its key. The bound is a specific date, number, or character value.
  • List partitioning - this names discrete values and assigns them to their own partition, which can be placed in its own tablespace. We might split our catalog into plumbing products, lawn and garden products, or appliances. This helps searches into our catalog for a specific item. Note that when you do a search at HomeDepot or Amazon you have the option to search within a department. List partitioning is a natural fit for that kind of back end.
  • Hash partitioning - this is good if you don't have a specific range that will split the data. If you want to split by address, for example, it is difficult to list all of the addresses or sort them into ranges. The hash partition allows you to split your data into, say, 16 different partitions, and the database will make a best effort to spread the data evenly across the number of partitions you define.
  • Composite partitioning - this is a combination of two of the types described above. Composite partitioning is accomplished with the subpartition clause, where we first split by one method and then sub-split by another. We could use list-list or list-range, for example. Combining two of the methods above helps break a large table into smaller chunks.
  • Reference partitioning - this allows you to partition a child table based on a referential constraint to a parent table. If we create an orders table with a constraint that its customer_id must match a customer_id in our customers table, we can partition by this constraint. This effectively splits the orders table the same way the customers table is partitioned, even when the partitioning column is not defined in the orders table.
  • Virtual column-based partitioning - virtual column partitioning allows us to split a table based on a value derived from a column, such as part of a column value. If, for example, we number the parts in our catalog with a leading digit of 1-3 representing plumbing and 4-6 representing appliances, we can partition based on the first digit of our part_number and effectively split the catalog by department without having to define the department as a column in our table. We just need to make sure that all part numbers that are inserted into our catalog follow our numbering convention and that we don't put a gas range into the catalog with a part number starting with a 2.
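The less familiar types from the list above can be sketched as follows. All table, partition, column, and constraint names ending in _hash, _vc, or _ref are made up for illustration. The virtual column example also assumes that every part number has exactly eight digits, so that dividing by 10,000,000 yields the leading digit.

```sql
-- Hash: let the database spread rows over 16 partitions
create table customers_hash (
  customer_id number(8),
  address     varchar2(64)
)
partition by hash (address)
partitions 16;

-- Virtual column: partition on the leading digit of an 8-digit part number
create table catalog_vc (
  part_number number(8),
  title       varchar2(32),
  dept_digit  as (trunc(part_number / 10000000))
)
partition by range (dept_digit)
(
  partition plumbing    values less than (4),        -- leading digit 1-3
  partition appliances  values less than (7),        -- leading digit 4-6
  partition other_parts values less than (maxvalue)
);

-- Reference: the child table inherits the parent's partitioning
create table customers_ref (
  customer_id number(8) primary key,
  state       varchar2(16)
)
partition by list (state)
(
  partition tola         values ('Texas', 'Louisiana', 'Oklahoma', 'Arkansas'),
  partition other_states values (default)
);

create table orders_ref (
  order_number number(8),
  customer_id  number(8) not null,  -- the foreign key column must be not null
  order_date   date,
  constraint orders_ref_cust_fk
    foreign key (customer_id) references customers_ref (customer_id)
)
partition by reference (orders_ref_cust_fk);
```

In the last pair, orders_ref ends up split by the customer's state even though it has no state column of its own.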

If we change the customers table described above and append a partition by range clause with the command

create table customers ( .... ) partition by range (state) ( .... );

where the second set of dots stands for the list of partition definitions, we divide the table into potentially fifty different partitions, one per state. As a new customer is added, the row goes into the partition for that customer's state. Inserts happen quicker, selects happen quicker, and backups happen quicker, unless all of our customers are located in one state.

If we group our customers into regions and want to store data not in fifty states but in three regions, we could do this with a list partition. Note that we can define the tablespace name when we define the partition.

create table customers (....) partition by list (state)
(partition part1 values ('Texas', 'Louisiana', 'Oklahoma', 'Arkansas') tablespace tola_ts ,
partition part2 values ('California', 'Oregon', 'Washington', 'Hawaii') tablespace pac_ts,
partition category_other values (default));

In this example we use the tola_ts and pac_ts tablespaces plus the default tablespace. We split eight of the states into two named buckets, with a third bucket for everything else, and store customers in these three areas. This makes reporting simpler and optimizes for select statements looking for customers in or around Texas or along the Pacific Ocean. Note that we could also subpartition this data to separate the big cities from rural areas:

create table customers (.....) partition by list (state)
subpartition by list (city)
(partition part1 values ('Texas')
(subpartition texas_cities values ('Houston', 'Dallas', 'San Antonio', 'Austin', 'Fort Worth', 'El Paso') tablespace big_texas_ts,
subpartition texas_other values (default) tablespace small_texas_ts),
partition part2 values ('California', 'Oregon', 'Washington', 'Hawaii') tablespace pac_ts,
partition category_other values (default));

This spreads the data across four tablespaces: one for Texas big cities, one for Texas small cities, one for Pacific rim states, and the default tablespace for all other states.
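The tablespaces named in the partition clauses have to exist before the create table statement runs. A minimal sketch of creating two of them follows. The directory is the PDB1 data file location mentioned earlier in this post, while the file names and sizes are assumptions, and the session needs the create tablespace privilege.

```sql
-- File names and sizes are illustrative, not taken from a real system
create tablespace tola_ts
  datafile '/u02/app/oracle/oradata/ORCL/PDB1/tola_ts01.dbf'
  size 100m autoextend on next 100m;

create tablespace pac_ts
  datafile '/u02/app/oracle/oradata/ORCL/PDB1/pac_ts01.dbf'
  size 100m autoextend on next 100m;
```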

Database 12c extended the commands that help manage and maintain partitions. We can alter a table to add, truncate, drop, split, and merge partitions, and as of 12c these operations can act on several partitions in a single statement. The add, split, and merge operations are especially valuable because they allow us to update ranges. If, for example, we paid a consultant two years ago to define a partition by range and they defined yearly partitions going out four years with the following

create table sales (.....) partition by range (salesdate)
(partition part_2015 values less than (TO_DATE('01-Jan-2016', 'DD-MON-YYYY')),
partition part_2016 values less than (TO_DATE('01-Jan-2017', 'DD-MON-YYYY')));

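But if we want to start saving data by quarter rather than by year, we could split the 2016 partition. An add would be rejected here because the new bounds fall below the upper bound of an existing partition, so we execute the following
alter table sales split partition part_2016 into
(partition p_q1_2016 values less than (TO_DATE('01-Apr-2016', 'DD-MON-YYYY')),
partition p_q2_2016 values less than (TO_DATE('01-Jul-2016', 'DD-MON-YYYY')),
partition p_q3_2016 values less than (TO_DATE('01-Oct-2016', 'DD-MON-YYYY')),
partition p_q4_2016);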

This would slide in quarterly reporting and keep each partition smaller as volume grows. If at the end of the year we want to aggregate everything back into a yearly rather than a quarterly basis, we can do this with a merge command
alter table sales
merge partitions p_q1_2016, p_q2_2016, p_q3_2016, p_q4_2016
into partition part_2016;
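After a split or merge it is worth confirming what the table looks like. A small sketch, assuming the sales table lives in the current schema:

```sql
-- List the partitions of SALES in order, with their upper bounds
select partition_name, high_value, tablespace_name
from user_tab_partitions
where table_name = 'SALES'
order by partition_position;
```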

Fortunately, Enterprise Manager has a partition advisor that looks at the history of your select statements and suggests how you should divide your tables into partitions. It notices, for example, that you do a lot of selects by state or by zip code and recommends partitioning by list or by hash based on your usage patterns. This was a new feature added with Enterprise Manager 11 and has gotten more robust and reliable with 13c. We should see a significant speed up if we get the right combination of partitions and indexes, and could potentially take a select statement from 45 seconds to under a second as shown in the Enterprise Manager screen shots below.

In summary, partitioning is very powerful. It helps you split up your larger tables so that the pieces you actually use fit into the memory that you have allocated. The return on investment is difficult to calculate because the cost of partitioning versus the cost of memory and the resulting speed up for queries is hard to measure. Enterprise Manager has tools to help you with this analysis, but it is difficult to translate the results into future dollars and what-if analysis. It would be nice if you could say that splitting your table into partitions would reduce your buffer cache and allow you to shrink your SGA size by 25%. The tools are not quite there. They do tell you that you can reduce your select times by partitioning the data, and they predict relatively accurately how much faster a select statement will be with partitioning based on your current hardware configuration. All of these functions should work on Amazon RDS with the exception of manipulating a tablespace. This requires a different command syntax since manipulation of a tablespace requires system access. Typically the command would be alter database default tablespace users2, but with Amazon RDS you have to execute exec rdsadmin.rdsadmin_util.alter_default_tablespace('users2') instead. Given that this is not done very often, it is up to you to decide how and where you deploy your large table database.
