Thursday Jul 18, 2013

Migrated Rows and PDML

Recently I have seen a rising trend of issues about chained rows, so this post shows a not so obvious side effect of them.
I see no reason to explain what a chained row is or to show basic examples, as there are very good postings on the internet that already cover that.
Migrated rows are just a special case of chained rows.
I am going to use a test case similar to the ones in those postings to show my point:
a similar table, just with bigger columns, since the standard block size nowadays is 8K.

EMB11203> cat setup.sql
drop table  row_mig_chain_demo;
CREATE TABLE row_mig_chain_demo (
  x int PRIMARY KEY,
  a CHAR(2000),
  b CHAR(2000),
  c CHAR(2000),
  d CHAR(2000),
  e CHAR(2000)
);
EMB11203> cat data.sql
var pk number;
begin
 select nvl(max(x),0) into :pk from row_mig_chain_demo;
end;
/
Prompt just in case I want to add more, I just re-execute the script.
INSERT INTO row_mig_chain_demo (x)
select rownum+:pk from dual connect by rownum <= 100;
COMMIT;
Prompt migrate the rows.
UPDATE row_mig_chain_demo SET a = rownum, b = rownum, c = rownum;
commit;
analyze table ROW_MIG_CHAIN_DEMO compute statistics;

The script creates 100 rows and migrates 99 of them:

SQL> select  table_name,num_rows,blocks,CHAIN_CNT
  2  from user_tables where table_name='ROW_MIG_CHAIN_DEMO';
TABLE_NAME                       NUM_ROWS     BLOCKS  CHAIN_CNT
------------------------------ ---------- ---------- ----------
ROW_MIG_CHAIN_DEMO                    100        103         99


So, first interesting thing: why did I use ANALYZE instead of DBMS_STATS?
Because DBMS_STATS does not collect CHAIN_CNT.
In May 2008 an internal bug was opened requesting that DBMS_STATS collect CHAIN_CNT, and it is still waiting to be implemented, so while I wait I'll use ANALYZE to collect it.
The CBO was enhanced in 10.2.0.3 to take CHAIN_CNT into account in its cost calculations, so if you feel it would be good for DBMS_STATS to collect CHAIN_CNT, drop Optimizer Development a line.
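For reference, a minimal sketch on this test table of what each call does; DBMS_STATS simply leaves CHAIN_CNT alone:

exec dbms_stats.gather_table_stats(user, 'ROW_MIG_CHAIN_DEMO')

select table_name, num_rows, chain_cnt
  from user_tables
 where table_name = 'ROW_MIG_CHAIN_DEMO';  -- CHAIN_CNT is not refreshed by DBMS_STATS

analyze table row_mig_chain_demo compute statistics;

select table_name, num_rows, chain_cnt
  from user_tables
 where table_name = 'ROW_MIG_CHAIN_DEMO';  -- CHAIN_CNT now reflects the 99 migrated rows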
Second, why were 99 rows migrated and not all 100?
Because the update on the first row modified only 3 columns, with a combined length of about 6,000 bytes, which still fits in the block.
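Just to corroborate the arithmetic, the stored size of each updated row can be checked with VSIZE (a quick sketch; columns d and e are NULL in this test case and add nothing):

select x,
       vsize(x) + vsize(a) + vsize(b) + vsize(c)
       + nvl(vsize(d),0) + nvl(vsize(e),0) as row_bytes
  from row_mig_chain_demo
 where x <= 3;

Each row comes back at roughly 6,000 bytes, so after the UPDATE only one such row can stay in each 8K block.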

The insert placed all the rows in the same block, and the update migrated the rest.
The following query shows that the head piece of every row is still in that original block:

set pagesize 200 lines 180
select x,DBMS_ROWID.ROWID_RELATIVE_FNO (rowid) , dbms_rowid.ROWID_BLOCK_NUMBER (rowid)
from row_mig_chain_demo
order by 2,3
         X DBMS_ROWID.ROWID_RELATIVE_FNO(ROWID) DBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID)
---------- ------------------------------------ ------------------------------------
         1                                    4                               248213
         2                                    4                               248213
         3                                    4                               248213
...         
       100                                    4                               248213         

A block dump shows that all rows but the first are migrated:

Start dump data blocks tsn: 4 file#:4 minblk 248213 maxblk 248215
Block dump from cache:
Dump of buffer cache at level 4 for tsn=4 rdba=17025429
BH (0x6a3884b8) file#: 4 rdba: 0x0103c995 (4/248213) class: 1 ba: 0x6bcc4000
...
===============
tsiz: 0x1f98
hsiz: 0xda
pbl: 0x2b017976ba64
     76543210
flag=--------
ntab=1
nrow=100                               <-- 100 rows.
...
block_row_dump:
tab 0, row 0, @0x5c1
tl: 6015 fb: --H-FL-- lb: 0x1  cc: 4   <-- Head, First and Last Piece
...
tab 0, row 1, @0x5b8
tl: 9 fb: --H----- lb: 0x1  cc: 0      <-- Only the head
nrid:  0x0103c996.0                    <-- Where is the next piece.
tab 0, row 2, @0x5af
tl: 9 fb: --H----- lb: 0x1  cc: 0
nrid:  0x0103c997.0
....
tab 0, row 99, @0x246
tl: 9 fb: --H----- lb: 0x1  cc: 0
nrid:  0x0103ed84.0
end_of_block_dump
Dump of buffer cache at level 4 for tsn=4 rdba=17025430
BH (0x6a30e928) file#: 4 rdba: 0x0103c996 (4/248214) class: 1 ba: 0x6b048000
...
===============
tsiz: 0x1f80
hsiz: 0x14
pbl: 0x2b017976ba7c
     76543210
flag=--------
ntab=1
nrow=1                                <-- 1 row
...
block_row_dump:
tab 0, row 0, @0x7fb
tl: 6021 fb: ----FL-- lb: 0x2  cc: 4  <-- First and Last Piece
hrid: 0x0103c995.1                    <-- Where is the head 
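By the way, the same 99 rows can be identified without a block dump, assuming the CHAINED_ROWS table from ?/rdbms/admin/utlchain.sql has been created (a quick sketch):

analyze table row_mig_chain_demo list chained rows into chained_rows;

select count(*)
  from chained_rows
 where table_name = 'ROW_MIG_CHAIN_DEMO';  -- expect 99, one entry per migrated row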


Everything is set up, so now to experiment.
Everywhere I looked on the internet, everyone talked about queries, so I thought this was a good opportunity to show the effect of migrated rows on DML.
So I will use a serial UPDATE and a parallel UPDATE.

EMB11203> cat sdml.sql
update row_mig_chain_demo
set a=rowid||'-'||to_char(sysdate,'yyyymmddhh24miss')
/
EMB11203> cat pdml.sql
alter session enable parallel dml;
alter session force parallel query;
update /*+ parallel */row_mig_chain_demo
set a=rowid||'-'||to_char(sysdate,'yyyymmddhh24miss')
/
EMB11203> cat d.sql
set pages 2000 lines 180
SELECT a.name, b.value value_QC
  FROM v$statname a, v$mystat b
 WHERE a.statistic# = b.statistic#
   AND lower(a.name) = 'table fetch continued row';
SELECT a.name, b.value value_system
  FROM v$statname a, v$sysstat b
 WHERE a.statistic# = b.statistic#
   AND lower(a.name) = 'table fetch continued row';

The update will modify only the first column of all 100 rows, but since the column is CHAR there is not going to be any change in the space the rows occupy.

First Test, the serial UPDATE

SQL> @d
NAME                                                               VALUE_QC
---------------------------------------------------------------- ----------
table fetch continued row                                                 0
NAME                                                             VALUE_SYSTEM
---------------------------------------------------------------- ------------
table fetch continued row                                                 482
SQL> @sdml
100 rows updated.
SQL> @d
NAME                                                               VALUE_QC
---------------------------------------------------------------- ----------
table fetch continued row                                                 0
NAME                                                             VALUE_SYSTEM
---------------------------------------------------------------- ------------
table fetch continued row                                                 482
SQL> commit;
Commit complete.
SQL> @d
NAME                                                               VALUE_QC
---------------------------------------------------------------- ----------
table fetch continued row                                                 0
NAME                                                             VALUE_SYSTEM
---------------------------------------------------------------- ------------
table fetch continued row                                                 482
Why did I query the statistics before and after the commit? The answer will become evident in the PDML test. The execution plan shows a full scan, so the expectation is that multiblock reads will be done, and sure enough the 10046 trace shows them.

It is interesting that the first IO is a single-block read but all the later ones are multiblock.
So far, everything is as promised: the full table scan is not affected by the migrated rows.
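If you want to confirm the plan yourself, one option is DBMS_XPLAN right after running the statement in the same session (a sketch; keep SERVEROUTPUT off so the UPDATE is still the last cursor):

select * from table(dbms_xplan.display_cursor(format => 'BASIC'));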

Now the Parallel Update

NAME                                                               VALUE_QC
---------------------------------------------------------------- ----------
table fetch continued row                                                 0

NAME                                                             VALUE_SYSTEM
---------------------------------------------------------------- ------------
table fetch continued row                                                 602
SQL> @pdml
Session altered.

Session altered.

100 rows updated.
SQL> @d
NAME                                                               VALUE_QC
---------------------------------------------------------------- ----------
table fetch continued row                                                 0

NAME                                                             VALUE_SYSTEM
---------------------------------------------------------------- ------------
table fetch continued row                                                 700
SQL> commit;
Commit complete.
SQL> @d
NAME                                                               VALUE_QC
---------------------------------------------------------------- ----------
table fetch continued row                                                98

NAME                                                             VALUE_SYSTEM
---------------------------------------------------------------- ------------
table fetch continued row                                                 700


To answer the earlier question: why did I get the statistics before and after the commit?
The reason is that the statistics of the slaves do not appear to be rolled up to the Query Coordinator (QC) until the transaction is over,
but they are reflected immediately in the system-wide statistics, and as those show, 98 "table fetch continued row" (TFCR) operations were done.
98 is a strange number, as there are 99 migrated rows.
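Incidentally, the slave-side counters can also be watched live from another session while the PDML is still uncommitted, by joining V$PX_SESSTAT to V$STATNAME (a sketch):

select s.qcsid, s.server#, n.name, s.value
  from v$px_sesstat s, v$statname n
 where s.statistic# = n.statistic#
   and lower(n.name) = 'table fetch continued row'
 order by s.server#;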

The execution plan shows a full table scan too, so why did the parallel UPDATE do TFCR when the serial one did not?

The explanation given by development is that for some operations, in particular PDML, the slaves need to do a head piece scan in order to avoid certain deadlock scenarios, while a serial DML does a row piece scan.

Why do I care whether it is a head piece scan or a row piece scan?

Because a head piece scan reads the entire row, which means accessing every block where a row piece resides in order to get them all. The worst part is that this has to be resolved immediately; it cannot be deferred until the pieces are found by the multiblock reads, so the head piece scan issues single-block reads to go look for the pieces.

The 10046 trace of the slaves shows those single-block reads.

This is particularly bad on Exadata, as customers like to have very wide tables (rows with more than 255 columns are automatically chained) and like to do PDML.

On an Exadata system, the wait event will be 'cell single block physical read'.

There is hope, though: Smart Scan can cope with some cases of chained rows as long as the head and the pieces are within the 1MB boundary of the IO. But do not tempt your luck; do your best not to produce chained or migrated rows, it is in everyone's best interest.
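If you already produced them, the usual cleanup is to rebuild the segment with more free space so the rows fit in place; a rough sketch on the demo table (the index name is a placeholder, and MOVE invalidates all indexes on the table):

alter table row_mig_chain_demo move pctfree 30;        -- rewrite the table, leaving room for rows to grow
alter index SYS_C0099999 rebuild;                      -- placeholder name: rebuild every index after the MOVE
analyze table row_mig_chain_demo compute statistics;   -- CHAIN_CNT should be back to 0 (or close to it)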

Thursday Apr 11, 2013

How to identify "think time" in a SQL trace?

There are multiple blog posts, presentations and pieces of documentation about "think time" and "service time", and some tools will translate that for you,
but if you do not have those tools available, or you need to understand a pattern of behavior from the client application, how can you identify "think time" in a SQL trace?

This case is about a customer who reported a performance problem with JDBC connections running reports that were constantly timing out, and who needed to identify where in the stack the problem was.

From the database point of view we got a 10046 level 12 trace (SQL trace with wait events and bind variables).
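For reference, this kind of trace is typically switched on with something like the following (a sketch; level 12 means waits plus binds, and the SID/SERIAL# values are placeholders):

-- in the session to be traced
alter session set events '10046 trace name context forever, level 12';
-- ... reproduce the workload ...
alter session set events '10046 trace name context off';

-- or, for another session, via DBMS_MONITOR
exec dbms_monitor.session_trace_enable(session_id => 1234, serial_num => 5678, waits => TRUE, binds => TRUE)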


=====================
PARSING IN CURSOR #4... len=50 dep=0 uid=888 oct=3 lid=888 tim=1365524355043575 
SELECT STATUS FROM <table> WHERE ID = :1
END OF STMT
PARSE #4...:c=1000,e=1126,p=0,cr=2,cu=0,mis=1,r=0,dep=0,og=1,plh=0,tim=1365524355043574
BINDS #4...:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=873 siz=24 off=0
  kxsbbbfp=2ab1d8538888  bln=22  avl=06  flg=05
value=123456789
EXEC #4...:c=2999,e=3049,p=0,cr=2,cu=0,mis=1,r=0,dep=0,og=1,plh=3860417426,tim=1365524355046681
WAIT #4...: nam='SQL*Net message to client' ela= 1 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524355046730
WAIT #4...: nam='SQL*Net message from client' ela= 274 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524355047023
WAIT #4...: nam='SQL*Net message to client' ela= 1 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524355047055
WAIT #4...: nam='SQL*Net message from client' ela= 259 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524355047330
WAIT #4...: nam='SQL*Net message to client' ela= 1 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524355047385
FETCH #4...:c=0,e=41,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=3860417426,tim=1365524355047402

*** 2013-04-09 12:19:17.047
WAIT #4...: nam='SQL*Net message from client' ela= 2000161 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524357047634
WAIT #4...: nam='SQL*Net message to client' ela= 1 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524357047707
WAIT #4...: nam='SQL*Net message from client' ela= 258 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524357047973
BINDS #4...:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=873 siz=24 off=0
  kxsbbbfp=2ab1d8538888  bln=22  avl=06  flg=05
value=123456789
EXEC #4...:c=0,e=58,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3860417426,tim=1365524357048051
WAIT #4...: nam='SQL*Net message to client' ela= 1 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524357048086
FETCH #4...:c=0,e=26,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=3860417426,tim=1365524357048098

*** 2013-04-09 12:19:19.054
WAIT #4...: nam='SQL*Net message from client' ela= 2006775 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524359054902
WAIT #4...: nam='SQL*Net message to client' ela= 1 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524359055015
WAIT #4...: nam='SQL*Net message from client' ela= 294 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524359055324
BINDS #4...:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=873 siz=24 off=0
  kxsbbbfp=2ab1d8538888  bln=22  avl=06  flg=05
value=123456789
EXEC #4...:c=0,e=74,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3860417426,tim=1365524359055435
WAIT #4...: nam='SQL*Net message to client' ela= 1 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524359055482
FETCH #4...:c=0,e=40,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=3860417426,tim=1365524359055503

*** 2013-04-09 12:19:21.055
WAIT #4...: nam='SQL*Net message from client' ela= 2000065 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524361055603
WAIT #4...: nam='SQL*Net message to client' ela= 1 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524361055688
WAIT #4...: nam='SQL*Net message from client' ela= 267 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524361055967
BINDS #4...:
 Bind#0
  oacdty=02 mxl=22(22) mxlc=00 mal=00 scl=00 pre=00
  oacflg=01 fl2=1000000 frm=01 csi=873 siz=24 off=0
  kxsbbbfp=2ab1d8538888  bln=22  avl=06  flg=05
value=123456789
.....
.....
Pattern repeats 30 times.
.....
.....
WAIT #4...: nam='SQL*Net message to client' ela= 1 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524415383807
FETCH #4...:c=0,e=45,p=0,cr=4,cu=0,mis=0,r=1,dep=0,og=1,plh=3860417426,tim=1365524415383827
WAIT #4...: nam='SQL*Net message from client' ela= 282 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524415384145
WAIT #4...: nam='SQL*Net message to client' ela= 1 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524415384170
WAIT #4...: nam='SQL*Net message from client' ela= 317 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1365524415384497


The trace shows a query that looks for the STATUS of a specific ID over and over again.
In between each round trip there is a delay of 2 seconds (notice the ela= 2000000+ in 'SQL*Net message from client').

I asked the customer whether he considered it normal for the session to execute the same query with the same binds over and over again.
The answer was no, and there was no explanation either for the 2-second delay between calls.

From the point of view of the database, the query was parsed, bound, executed and its rows fetched in microseconds.
After that, the Oracle process sits idle waiting for the client software to issue the next statement, which only came 2 seconds later.

That is the application "think time", and it is reflected by the 'SQL*Net message from client' waits in the SQL trace.
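The same idle time also shows up cumulatively without a trace; for example, against the session in question (the SID is a placeholder):

select event, total_waits, round(time_waited_micro/1000000, 1) as seconds_waited
  from v$session_event
 where sid = 1234
   and event = 'SQL*Net message from client';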

After some digging, the customer recognized that this session came from a JDBC program that was
in a loop keeping tabs on the execution of a report run by another session, and it was doing so by checking
the STATUS of the report over and over after sleeping for 2 seconds.

The objective of my post is to show what "think time" looks like, but just to complete the root cause:

Other traces showed the reports were finishing in a timely manner, but these "tracking" sessions were on the same connection pool as the reports.
The connection pool did not recognize that these "tracking" sessions were "idle" during those 2 seconds, so each connection was considered active and was not yielded to other incoming reports; effectively half the slots were used by useful work and half by tracking sessions.

To make things worse, for some reason the "tracking" sessions were not going away after the reports finished, and new ones kept being created as new reports came in, so eventually the connection pool got exhausted and no work was being done.



About

I am Abel Macias, an Oracle support engineer specialized in performance, part of Exadata Support.
Disclaimer: This blog aims to share my experiences while diagnosing performance problems on Oracle's RDBMS, mostly on Exadata.
The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.
