X

Database Storage Optimization best practices, tips and tricks and guidance from Database Compression Product Management

  • May 24, 2021

Implementing an Automated Compression Tiering and Storage Tiering Solution using Automatic Data Optimization

Gregg Christman
Product Manager

Previously, when implementing a data lifecycle management strategy with Oracle Database, organizations used Advanced Row Compression and Data Partitioning to manually create and deploy a compression tiering and storage tiering solution – a solution that required organizations to have sharp insight into data access, and usage patterns, across applications and tables.

Based upon their combined insights, DBAs, along with their storage counterparts, would manually compress and/or move data, using their best estimations regarding actual data usage, ideally trying to ensure that the most frequently accessed data remained on the highest performance storage.

Oracle Database can automate this previously manual operation using the heat map and Automatic Data Optimization (ADO) features of Advanced Compression.

Heat Map

Heat Map automatically tracks usage information at the row and segment levels. Data modification times are tracked at the row level and aggregated to the block level, and modification times, full table scan times and index lookup times are tracked at the segment level. Heat Map enables a detailed view of data access, and how access patterns change over time.

Automatic Data Optimization

Automatic Data Optimization is PL/SQL scripting capability that enables organizations to create data compression tiering, index optimization and/or storage tiering ADO policies. ADO policies specify what conditions (of data/index access) will initiate an ADO data compression or index optimization operation – such as no access, or no modification, or creation time – and when the policy will take effect – for example, after “n” days or months or years. Custom conditions can also be created by the DBA, this enables other factors be used for ADO policy conditions.

All ADO operations execute automatically and in the background, with no user intervention required. In addition to being evaluated and executed automatically, in the background during the maintenance window, policies can also be evaluated and executed anytime by a DBA, manually or via a script.

Implementing an automated data compression, index optimization and storage tiering solution using Automatic Data Optimization and Heat Map is straightforward, as the example below will show.

Data Compression

In this example, we have a table named “orders” that was initially created without any compression. We have turned on Heat Map and are tracking the usage of this table over time. It is the intention, of the organization, to wait until the majority of the post-load activities, performed initially on the table complete, and then compress the table using an ADO policy, with Advanced Row Compression, without moving the table (meaning the table is compressed in place, no rebuild needed).

Once the table cools down (with no or few DML inserts/updates), and begins to be primarily used for reports/queries, another ADO policy will then compress the table with HCC QUERY HIGH. When the table has become colder and is only occasionally queried (historic data), the final ADO policy will then compress the table even further with HCC ARCHIVE HIGH.

Please note, compression can be specified at the “row” level or the “segment” level. Row level allows the table to be compressed in place, block-by-block, as all the rows on a block meet the ADO policy condition. Tables/partitions can also be compressed at the segment level; this means the entire segment is compressed at the same time.

First Policy – Enable Advanced Row Compression (the examples use the ADO condition “no modification”)

The ADO policy below enables Advanced Row Compression, and since the policy specified “row” versus “segment” level compression, the tables’ blocks will be individually compressed when all the rows on a block meet the ADO compression policy that is specified (that being AFTER 2 DAYS OF NO MODIFICATION)

ALTER TABLE orders DATA LIFECYCLE MANAGEMENT ADD POLICY
ROW STORE COMPRESS ADVANCED ROW
AFTER 2 DAYS OF NO MODIFICATION;

This policy allows the post-load activity to subside on the table before compression is enabled. For organizations with SLA’s around the load times, this allows the table to be created and populated as quickly as possible, before implementing compression.

Second Policy - Enable HCC Query High Compression

The next policy, that was specified by the DBA, will be automatically enforced by the database (at the segment level) when Heat Map determines there has been no data modifications for 90 days. The policy changes the compression level of the table to a higher level of compression (HCC QUERY HIGH) when the data is being used primarily for queries/reporting.

ALTER TABLE orders DATA LIFECYCLE MANAGEMENT ADD POLICY
COLUMN STORE COMPRESS FOR QUERY HIGH SEGMENT
AFTER 90 DAYS OF NO MODIFICATION;

Changing the compression from Advanced Row Compression, to Hybrid Columnar Compression (HCC QUERY HIGH), occurs during a maintenance window after the specified ADO policy criteria has been met.

Third Policy – Enable HCC Archive High Compression

When this table further “cools down” additional storage gains can be realized when ADO automatically compresses the data to the highest level possible (HCC ARCHIVE HIGH) with Oracle Hybrid Columnar Compression. In this example, this data is still needed for query purposes, but is no longer being actively modified (no or few DML inserts/updates) and only occasionally queried or used for reporting. This cold/historic data is an ideal candidate for HCC ARCHIVE HIGH compression.

After 180 days of no modification to the data, this ADO policy will be applied.

ALTER TABLE orders DATA LIFECYCLE MANAGEMENT ADD POLICY
COLUMN STORE COMPRESS FOR ARCHIVE HIGH SEGMENT
AFTER 180 DAYS OF NO MODIFICATION;

With the final ADO compression tiering policy criteria being satisfied, the data is now compressed to the HCC ARCHIVE HIGH level and could be moved to lower cost storage (Tier 2). This allows active data to remain on higher performance tiers and allows the historic data, which remains online, to be accessed by applications as needed and ensures a smaller footprint for the historic data.

This example used the “best practice” approach of compressing using both Advanced Row Compression and Hybrid Columnar Compression. Advanced Row Compression (as well as Heat Map and ADO) are features of Advanced Compression. While HCC does not require Advanced Compression, it does have other requirements, please see the Oracle HCC Tech Brief for more details. While data compression tiering best practice does include the use of HCC, if an organization does not have access to HCC, then they would use only Advanced Row Compression in their ADO policies.

Index Optimization

The ADO policy “OPTIMIZE” clause provides an opportunity for ADO to optimize an index whenever the ADO policy condition is met, for example:

ALTER INDEX orders_idx DATA LIFECYCLE MANAGEMENT ADD POLICY
OPTIMIZE AFTER 3 DAYS OF NO MODIFICATION;

The optimize action invoked by ADO is determined automatically by Oracle Database, and includes:

Compress:           Compresses portions of the key values in an index segment

Shrink:                  Merges the contents of index blocks where possible to free blocks for reuse

Rebuild:               Rebuild index to improve space usage and access speed

Using the example above, as with ADO for data segments, the Heat Map framework collects activity statistics on the index over time. When the index ADO policy qualifies for execution, the database automatically determines which index optimization to implement (compress, shrink or rebuild index).

In the example, the ADO policy automatically optimizes the index based on the recommendation of the database when the policy conditions (no modification for 3 days) is met and the policy is executed.

Storage Tiering

ADO-based storage tiering (ADO “Tier To” policy) is not based upon the ADO condition clause (i.e. after “x” days of NO MODIFICATION) as is data compression tiering and index optimization and instead, is based upon tablespace space pressure.

The justification for making storage tiering dependent on "space pressure" is based on the belief that organizations will want to keep as much data as possible on their high performance (and most expensive) storage tier, and not move data to a lower performance storage tier until it is absolutely required. The exception to the storage pressure requirement are storage tiering policies with the 'READ ONLY' option, these are triggered by a heat-map based condition clause.

The value for the ADO parameter TBS_PERCENT_USED specifies the percentage of the tablespace quota when a tablespace is considered full. The value for TBS_PERCENT_FREE specifies the targeted free percentage for the tablespace. When the percentage of the tablespace quota reaches the value of TBS_PERCENT_USED, ADO begins to move segments (that have TIER TO policies) so that percent free of the tablespace quota approaches the value of TBS_PERCENT_FREE. This action by ADO is a best effort and not a guarantee.

Organizations can set data lifecycle management ADO parameters with the CUSTOMIZE_DATA LIFECYCLE MANAGEMENT procedure in the DBMS_DATA LIFECYCLE MANAGEMENT_ADMIN PL/SQL package, for example:

BEGIN
DBMS_DATA LIFECYCLE MANAGEMENT_ADMIN.CUSTOMIZE_DATA LIFECYCLE MANAGEMENT(DBMS_DATA LIFECYCLE MANAGEMENT_ADMIN.TBS_PERCENT_USED,85):
DBMS_DATA LIFECYCLE MANAGEMENT_ADMIN.CUSTOMIZE_DATA LIFECYCLE MANAGEMENT(DBMS_DATA LIFECYCLE MANAGEMENT_ADMIN.TBS_PERCENT_FREE,25):
END;

In this example, when a tablespace reaches the USED threshold of 85%, the database will automatically move the coldest table/partition(s) in the tablespace to the target tablespace until the tablespace quota has at least 25 percent free. Of course, this only applies to tables and partitions that have a "TIER TO" ADO policy defined (see examples below). This frees up space on tier 1 storage for the segments that would truly benefit from the performance while moving colder segments that do not need Tier 1 performance, to lower cost Tier 2 storage.

For example:

ALTER TABLE orders DATA LIFECYCLE MANAGEMENT ADD POLICY TIER TO lessactivetbs;
ALTER INDEX orders_idx DATA LIFECYCLE MANAGEMENT ADD POLICY TIER TO lessactivetbs;

In this simple TIER TO example, Oracle Database automatically evaluated the ADO policies (and tablespace fullness) to determine when data and/or index segments are eligible to move to a different tablespace. This ensures data accessibility and performance, while reducing the storage footprint even further – with no additional burden placed on database administrators or storage management staff.

Similar to ADO data segment storage tiering ADO index segment storage tiering also operates at the segment level, so when an ADO policy implements storage tiering the entire segment is moved and this movement is one direction, meaning that ADO storage tiering is meant to move colder segments from high performance storage to slower, lower cost storage. If an ADO index optimization policy, and a storage tiering policy both qualify for execution, the database will execute both operations in a single segment reorganization step.

Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.