Subscribe

Share

Database, SQL and PL/SQL

Automatic Data Optimization

Learn how to enable information lifecycle management to automatically move data to lower-cost storage tiers and compress it.

By Arup Nanda Oracle ACE Director

November/December 2013

John, the principal database architect at Acme Bank, sips his coffee while listening to his visitors’ woes. The storage demand for databases at Acme is rising rapidly, due to the seemingly endless stream of data points. Because data—however old—can be good for marketing analytics, no one wants to purge old data, though. Although Acme does have an information lifecycle management (ILM) policy to put older and infrequently accessed data on lower-cost storage, the implementation has not been very effective, and unclear identification of “infrequently accessed” data is the primary reason. Everyone agrees on the time period during which the lack of access defines infrequently accessed data but not the exact mechanics of identifying it. The standard approach for identifying the data is to put a trigger on the tables to flag data that hasn’t changed, but Acme’s DBAs eschew such an approach, because it negatively affects performance. Additionally, everyone agrees that an automated system to move old, unchanged data to lower-cost storage would be ideal. Is there, all want to know, a better approach to achieving Acme’s ILM objectives?

There is, responds John: using the Automatic Data Optimization feature of Oracle Database 12c.

Setup

Acme deals with several currencies, including the British pound, the euro, and the Canadian dollar, in addition to its primary currency, the US dollar. It’s vital to record the exact exchange rate of the currencies. The rate can change every day, so the bank stores the rates along with specific dates. Because financial transactions can be made effective as of a different date, past records of exchange rates must be stored and available. The following is what the EXCHANGE_RATE table looks like:

SQL> desc exchange_rate
NAME TYPE
—————————— ————————————
FROM_CURR VARCHAR2(3)
TO_CURR VARCHAR2(3)
START_DATE DATE
END_DATE DATE
RATE NUMBER

The FROM_CURR and TO_CURR columns show the codes of the source and target currencies, with the RATE column showing the exchange rate. Because the rate changes, the START_DATE and END_DATE columns show the dates during which that rate is effective. The table is partitioned by range on END_DATE, with one partition per month. The partitions are named in the format YyyMmm, where yy is the two-digit year and mm is the two-digit month. For example, the partition y13m12 holds all data for December 2013. The table and its data can be created by the SQL script in the Online Setup Listing. All the partitions of the table exist in the LATEST_TS tablespace. There is another tablespace—MIDTERM_TS—built on lower-cost storage.

Online Setup Listing

connect sys/oracle@localhost:1522/ANL2 as sysdba
create tablespace latest_ts datafile
‘C:\ORACLE\ORADATA\ANL2\LATEST_TS_01.DBF’ size 111M autoextend on next 1m;
create tablespace midterm_ts datafile
‘C:\ORACLE\ORADATA\ANL2\MIDTERM_TS_01.DBF’ size 111M autoextend on next 1m;
grant create session, create table to forex identified by forex
/
alter user forex quota unlimited on latest_ts
/
alter user forex quota unlimited on midterm_ts
/
connect forex/forex@localhost:1522/ANL2
create table exchange_rate (
from_curr varchar2(3),
to_curr varchar2(3),
start_date date,
end_date date,
rate number
)
tablespace latest_ts
/
declare
l_rand number;
begin
for d in 1..365 loop
l_rand := dbms_random.value(1,2);
insert into exchange_rate values
(‘USD’,’GBP’,sysdate-d-1,sysdate-d,l_rand);
insert into exchange_rate values
(‘USD’,’CAD’,sysdate-d-1,sysdate-d,l_rand);
insert into exchange_rate values
(‘USD’,’EUR’,sysdate-d-1,sysdate-d,l_rand);
end loop;
commit;
end;
/

Jill, the lead developer, has two objectives:

  1. When the free space in the LATEST_TS tablespace drops to a certain threshold, 15 percent of the total space allocated, the less-used partitions of the EXCHANGE_RATE table should be relocated to the MIDTERM_TS tablespace. It’s important, Jill cautions, that this relocation of partitions should be based on access, not on time. If an older partition is being accessed quite frequently, it should be left in LATEST_TS. Similarly, if a newer partition is accessed less frequently, it should be moved to MIDTERM_TS.

  2. The partitions that do not see their data modified should be compressed to save space. The choice of segments for compression, Jill further cautions, should be independent of the relocation to a different storage tier. For example, a partition can exist in the LATEST_TS tablespace but be compressed because the data there doesn’t get modified. Similarly, a partition may be moved to MIDTERM_TS because it is accessed less frequently, but because the access is mostly write activity, it may not be compressed.

Acme can satisfy both requirements by using Automatic Data Optimization, assures John. This feature, he cautions, is not available in a pluggable database in a multitenant environment, however, and it’s also not available in materialized views and tables with object types.

Heat Map

The first step in using Automatic Data Optimization, John explains, is to enable a new feature in Oracle Database 12c called Heat Map. Simply speaking, it shows the popularity or “hotness” of a segment such as a table or a partition, and it is used by Automatic Data Optimization to decide which segments are candidates for action. If the segment is accessed frequently, it’s considered hot and therefore may not be a candidate for some actions, such as relocation to lower-cost and less-efficient storage.

John enables the Heat Map feature by executing the following command as a DBA user:

alter system set heat_map = on scope=both;

It is a one-time activity. Using scope=both, John ensures that the parameter is set in SPFILE and therefore will be in effect when the database is restarted.

To check the heat map of a specific segment, he looks into the DBA_HEAT_MAP_SEGMENT view. The time stamps of the last activity on the table—UPDATE, SELECT, full table scan, and reference constraint lookup—are recorded in this view. He describes the columns of the view for his listeners:

Enabling Automatic Data Optimization

Before starting the Automatic Data Optimization demonstration, John checks a few things, including the presence of segments in the tablespaces, using the following SQL statement:

select partition_name, 
tablespace_name
from user_segments
where segment_name =
'EXCHANGE_RATE';
PARTITION TABLESPACE
————————— ——————————
Y13M07 LATEST_TS
… output truncated …
Y12M07 LATEST_TS

John directs everyone’s attention to the output to confirm that all the partitions are in the LATEST_TS tablespace as expected. Then he checks the allocated and free space in the tablespace by running the script shown in Listing 1.

Code Listing 1: Checking for free space

select t.tablespace_name, 100*sum(f.bytes)/sum(t.bytes) free_pct
from (select tablespace_name, sum(bytes) bytes
from dba_data_files group by tablespace_name) t,
(select tablespace_name, sum(bytes) bytes
from dba_free_Space group by tablespace_name) f
where f.tablespace_name = t.tablespace_name
group by t.tablespace_name;
TABLESPACE FREE_PCT
——————————— —————————
LATEST_TS 30.00
MIDTERM_TS 99.33

The output in Listing 1 shows that only 30 percent of the LATEST_TS tablespace is free space, whereas the MIDTERM_TS tablespace—where no partitions are located—is mostly free, with 99.33 percent free space.

Next, John enables Automatic Data Optimization for the EXCHANGE_RATE table, by executing the following SQL command as the user FOREX (who owns the table):

alter table exchange_rate ilm add policy tier to midterm_ts;

In this command, John creates an ILM policy on the EXCHANGE_RATE table that tells Automatic Data Optimization that the second storage tier for the table is the MIDTERM_TS tablespace. Data should be relocated to this tablespace if the free space in the original tablespace drops below the default threshold level of 15 percent. Because the current tablespace is 30 percent free now, the ILM policy has no effect on the table—at least not yet—John explains.

As time passes, Acme DBAs will add more partitions to the table to hold new data, and Listing 2 shows the SQL statements they will use to add the partitions for October, November, and December 2013, named Y13M10, Y13M11, and Y13M12, respectively. To simulate the storage volume growth for that time, John inserts some rows that will go into these partitions, as shown in Listing 2.

Code Listing 2: Adding new partitions

alter table exchange_rate add partition y13m10 
values less than (to_date('11/01/2013','mm/dd/yyyy')) tablespace latest_ts;
alter table exchange_rate add partition y13m11
values less than (to_date('12/01/2013','mm/dd/yyyy')) tablespace latest_ts;
alter table exchange_rate add partition y13m12
values less than (to_date('01/01/2014','mm/dd/yyyy')) tablespace latest_ts;
insert into exchange_rate values ('USD', 'GBP', add_months(sysdate,2),
add_months(sysdate,2),1);
insert into exchange_rate values ('USD', 'GBP', add_months(sysdate,3),
add_months(sysdate,3),1);
insert into exchange_rate values ('USD', 'GBP', add_months(sysdate,4),
add_months(sysdate,4),1);
commit;

With these additional partitions, John points out, the LATEST_TS tablespace is now more populated. He checks the free space again by executing the SQL shown in Listing 1 and reviewing the result:

TABLESPACE  FREE_PCT
—————————— —————————
LATEST_TS 8.67
MIDTERM_TS 99.33

The free space in LATEST_TS has now dropped to 8.67 percent—below the threshold of 15 percent—so the ILM policy should trigger the relocation of some segments from this tablespace to free up space. John waits for the result of the ILM policy and checks the location of the partitions.

select partition_name, 
tablespace_name
from user_segments
where segment_name =
'EXCHANGE_RATE';
PARTITION TABLESPACE
————————— ——————————
Y13M11 LATEST_TS
… output truncated …
Y12M08 LATEST_TS
Y13M07 MIDTERM_TS
Y13M06 MIDTERM_TS
Y12M12 MIDTERM_TS
Y12M07 MIDTERM_TS

John directs everyone’s attention to the last four partitions—Y12M07, Y12M12, Y13M06, and Y13M07. Earlier, these partitions were located in the LATEST_TS tablespace, but now they are in MIDTERM_TS. This activity resulted in the freeing up of space in the LATEST_TS tablespace. This relocation of the partitions was done automatically without the intervention of the DBAs and only after the free space dropped below the threshold. Jill nods her approval. The DBAs appreciate that the relocation required nothing more than defining the ILM policy on the table. All in all, everyone is happy.

Checking the Heat Map

Jill observes that only a few partitions—not all—were relocated to the spillover tablespace. The segments for relocation, John explains, are based on the heat map of the segments, described earlier. The less recently the segment is accessed, the greater the chance that it will be relocated. John shows everyone the heat map of the EXCHANGE_RATE table, by using the SQL statement shown in Listing 3.

Code Listing 3: Checking the last access time for a table

select
subobject_name “Part Name”,
to_char(segment_write_time,'mm/dd/yy hh24:mi:ss') write_time,
to_char(segment_read_time,'mm/dd/yy hh24:mi:ss') read_time,
to_char(full_scan,'mm/dd/yy hh24:mi:ss') fts_time
from dba_heat_map_segment
where owner = 'FOREX'
and object_name = 'EXCHANGE_RATE'
order by full_scan desc;
Part Name WRITE_TIME READ_TIME FTS_TIME
————————— —————————————————— ————————————————— —————————————————
Y13M03 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y12M09 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y13M04 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y12M11 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y13M10 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y12M08 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y13M09 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y13M02 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y13M05 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y13M01 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y12M10 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y13M11 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 19:48:44
Y13M07 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 00:57:12
Y12M07 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 00:57:12
Y12M12 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 00:57:12
Y13M06 07/14/13 12:23:01 07/14/13 12:24:34 07/15/13 00:57:12

Jill examines the output carefully and observes that for the partitions Y13M07, Y12M07, Y12M12, and Y13M06, the full table scan occurred at 07/15/13 00:57:12, compared to 07/15/13 19:48:44 for the rest of the partitions. These are exactly the same partitions relocated to the MIDTERM_TS tablespace by Automatic Data Optimization. “Was that why these partitions were candidates for relocation?” she asks.

Exactly, John confirms. When Automatic Data Optimization had to choose the least recently accessed segment in the LATEST_TS tablespace, it looked at the heat map of the objects in that tablespace and selected these partitions, because they were the least recently accessed.

Administration

The group—now suitably impressed—wants to know more about managing the ILM policies. “How can I check the threshold at which Automatic Data Optimization kicks in?” asks Jill. To see that and other parameters ILM uses, John looks at the DBA_ILMPARAMETERS view:

select * from dba_ilmparameters;
NAME VALUE
——————————————————— —————
ENABLED 1
JOB LIMIT 10
EXECUTION MODE 3
EXECUTION INTERVAL 15
TBS PERCENT USED 85
TBS PERCENT FREE 25

The TBS PERCENT USED parameter specifies the percentage at which the tablespace is considered to be full. The value in this example is 85 percent, so the LATEST_TS tablespace was considered full when the free space earlier fell to 8.67 percent, John explains. This triggered Automatic Data Optimization’s relocation of segments to the next tier of storage. “When does it stop relocating?” asks Jill. This is where, John responds, another parameter—TBS PERCENT FREE—comes into play. The relocation of segments out of the tablespace continues until the free space percentage reaches this parameter value, 25 percent in this case. However, John cautions, this is just an estimate; Automatic Data Optimization does not guarantee that there will be that much free space.

“Suppose I want to change the 85 percent used to consider the tablespace full,” Jill muses. “Can I change it to, say, 90 percent?” Of course, John answers, and executes the following SQL to change the parameter to 90 percent:

begin
dbms_ilm_admin.customize_ilm (
dbms_ilm_admin.tbs_percent_used,
90);
end;
/

Likewise, John explains, all the properties listed in the DBA_ILMPARAMETERS view can be changed with the CUSTOMIZE_ILM procedure in the DBMS_ILM_ADMIN package.

Someone else has another question: “After a period of time, we would like to know what tables are under ILM policies and how the second tier of storage has been defined. Is there a way to get that information?” “Yes, there is,” John assures. He executes the SQL statements shown in Listing 4 to get that information. From the output, he demonstrates that the policy named P21 is defined on the table EXCHANGE_RATE, with the spillover tablespace being MIDTERM_TS. All the partitions of the table inherit the policy from the table (as shown in the INHERITED_FROM column). The last row in the result shows that the table itself is under the policy but is not inherited.

Code Listing 4: Identifying ILM data movement policies

select policy_name, action_type, scope, tier_tablespace, condition_days
from user_ILMDataMovementPolicies;
POLICY_NAME ACTION_TYPE SCOPE TIER_TABLESPACE
——————————— ——————————— —————— ———————————————
P21 STORAGE SEGMENT MIDTERM_TS
select * from user_ILMObjects where object_name = 'EXCHANGE_RATE';
POLICY_NAME SUBOBJECT_NAME OBJECT_TYPE INHERITED_FROM ENA
——————————— —————————————— ——————————————— —————————————— ————
P21 Y12M01 TABLE PARTITION TABLE YES
P21 Y12M02 TABLE PARTITION TABLE YES
P21 Y12M03 TABLE PARTITION TABLE YES
… output truncated …
P21 TABLE POLICY NOT INHERITED YES

“What actually moves the segments to a different tablespace?” Jill wants to know. Fair question, John concedes and explains that there is a job that moves the segments when the time comes. The job kicks in during the maintenance window in the database and calls the EXECUTE_ILM procedure in the DBMS_ILM package. To see details of this job, John queries DBA_ILMTASKS, as shown in Listing 5. It shows him the start and end time of the task. To get more details on the task, he executes the second query in Listing 5.

Code Listing 5: Checking ILM tasks

select task_id, task_owner,
to_char(start_time, 'mm/dd/yy hh24:mi:ss') start_time,
to_char(completion_time, 'mm/dd/yy hh24:mi:ss') completion_time
from dba_ilmtasks
where task_owner != 'SYS'
TASK_ID TASK_OWNER START_TIME COMPLETION_TIME
—————————— —————————— ————————————————— —————————————————
3 FOREX 07/15/13 10:41:37 07/15/13 10:41:43
-- Find the task details
select job_name, job_state,
to_char(start_time, 'mm/dd/yy hh24:mi:ss') start_time,
to_char(completion_time, 'mm/dd/yy hh24:mi:ss') completion_time
from dba_ilmresults
where task_id = 3;
JOB_NAME JOB_STATE START_TIME COMPLETION_TIME
———————— —————————————————————— ————————————————— —————————————————
ILMJOB18 COMPLETED SUCCESSFULLY 07/13/13 10:41:41 07/13/13 10:41:43


Compression

Jill is happy to see her first requirement—partition relocation—successfully met. She asks about her second request—to compress unused rows in tables. Although the exchange rates set in the table are usually immutable, they may occasionally be changed due to a mistake in data entry or a miscommunication. However, changes to data more than seven days old are rare. Therefore, Jill wants to automatically compress rows in the EXCHANGE_RATE table that have not changed for the last seven days.

To accomplish that objective, John adds another ILM policy to the table:

alter table exchange_rate
ilm add policy row store
compress advanced segment
after 7 days
of no modification;

With this ILM policy in place, the segment is automatically compressed after a day passes without modification of any row in a segment. To check on the new policy, John uses the SQL query shown in Listing 6. The result shows a new policy—named P41—that has the COMPRESSION action type, which indicates that the policy acts by compressing data in the table. CONDITION_TYPE shows how Automatic Data Optimization should decide on the applicability of the policy (data not modified, in this case), and CONDITION_DAYS shows the number of days after which the policy should choose the object for compression (seven, in this case).

Code Listing 6: Checking ILM policies on user objects

select policy_name, action_type, scope, compression_level, 
condition_type, condition_days
from user_ilmdatamovementpolicies
order by policy_name;
POLICY_NAME ACTION_TYPE SCOPE CO...VEL CONDITION_TYPE CONDITION_DAYS
——————————— ——————————— ——————— ———————— —————————————————————— ——————————————
P21 STORAGE SEGMENT 0
P41 COMPRESSION SEGMENT ADVANCED LAST MODIFICATION TIME 7

After the policy has been in place and seven days have passed, John could fire up the query shown in Listing 7 to check on the compression status of the partitions. The result would demonstrate how various partitions have been compressed, as shown by an ENABLED value in the COMPRESSION column. If the partition is not compressed, this column would report DISABLED.

Code Listing 7: Checking compression of partitions

select partition_name, tablespace_name, compression, compress_for
from dba_tab_partitions
where table_owner = 'FOREX'
and table_name = 'EXCHANGE_RATE'
order by partition_name;
PARTITION_NAME TABLESPACE COMPRESSION COMPRESS_FOR
—————————————— —————————— ——————————— ————————————
Y12M01 MIDTERM_TS ENABLED ADVANCED
Y12M02 MIDTERM_TS ENABLED ADVANCED
Y12M03 MIDTERM_TS ENABLED ADVANCED
Y12M04 MIDTERM_TS ENABLED ADVANCED
Y12M05 LATEST_TS DISABLED
Y12M06 LATEST_TS DISABLED
Y12M07 MIDTERM_TS ENABLED ADVANCED
Y12M08 LATEST_TS ENABLED ADVANCED
Y12M09 LATEST_TS ENABLED ADVANCED
Y12M10 LATEST_TS ENABLED ADVANCED
Y12M11 LATEST_TS ENABLED ADVANCED
Y12M12 MIDTERM_TS ENABLED ADVANCED
Y13M01 LATEST_TS ENABLED ADVANCED
Y13M02 LATEST_TS ENABLED ADVANCED
Y13M03 LATEST_TS ENABLED ADVANCED
Y13M04 LATEST_TS ENABLED ADVANCED
Y13M05 LATEST_TS ENABLED ADVANCED
Y13M06 MIDTERM_TS ENABLED ADVANCED
Y13M07 MIDTERM_TS ENABLED ADVANCED
Y13M08 MIDTERM_TS ENABLED ADVANCED
Y13M09 LATEST_TS DISABLED
Y13M10 LATEST_TS DISABLED
Y13M11 LATEST_TS DISABLED
Y13M12 LATEST_TS DISABLED


Conclusion

Revisiting the original requirements, John shows how they have been addressed by the new Advanced Data Optimization feature, introduced in Oracle Database 12c. Under the storage-tier-based ILM policy, a segment is moved to a different tablespace (on lower-cost storage) when the free space in the original tablespace falls below a defined threshold. Under the compression-based ILM policy, segments are compressed when the data has not been modified for a specified number of days. And all this is done automatically without DBA intervention. Agreeing on the high value of this feature, the DBAs as well as the developers thank John and leave happy.

Next Steps

READ more about Automatic Data Optimization Oracle Database
 Oracle Database VLDB and Partitioning Guide 12c Release 1 (12.1)

 LEARN more about Automatic Data Optimization




Photography by Ricardo Gomez, Unsplash