Tuesday May 21, 2013

Diagnosis of a High Version Count


In the past, I have been involved in many situations where I had to debug the mechanisms that generate a large number of child cursors. I learned that, more often than not, a cursor is not shared for many reasons, and those reasons may change from version to version, so I wrote a tool to help debug it.

First, I think it is pertinent to explain briefly what child cursors are and what a high version count is.

A cursor is the instance of a query or DML statement in the library cache.
A hash key to locate the cursor in the library cache is calculated based on the text of the statement.
But if the text is the only criterion used to identify the cursor in the library cache, what happens if the cursor is not compatible with the conditions under which it is executed?
For example, how can the same cursor be used when the same query is executed by two users with different privileges, different languages, or even different objects?

The database creates a version of the cursor with the unique conditions that make it possible to execute.
A version of a cursor is also known as a child cursor. So if there is a child, is there a parent?
Yes, the parent cursor is the root to which all those child cursors are linked.

To find the correct version to execute, a session first uses the hash key generated from the statement text to locate the parent cursor in the library cache. Once it finds the parent, it scans the list of versions (or children) one by one, looking for one with the correct set of conditions. If it does not find one, it creates a new version and adds it to the list.
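
As a rough way to see the parent/child relationship, the children linked to one parent can be listed from V$SQL. This is only a minimal sketch, assuming a 10g or later instance (where SQL_ID is available) and a user that can read the V$ views. The SQL_ID is the one used in the example report later in this post.

```sql
-- One row per child cursor (version) linked to the same parent.
-- ADDRESS identifies the parent; CHILD_ADDRESS and CHILD_NUMBER identify each child.
SELECT sql_id,
       address,
       child_address,
       child_number,
       parsing_schema_name,
       plan_hash_value
FROM   v$sql
WHERE  sql_id = '8swypbbr0m372'
ORDER  BY child_number;
```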

For example, let's say I have 3 schemas, A, B and C, a table T in schema A, and a public synonym for T.
Let's say further that A grants access on T to B.

What happens if I run the query "SELECT * FROM T" from each schema?

If the query is run by A, "T" points to the schema object "A"."T" and the query runs OK. A version is created that indicates that a session connected with schema "A" can access "A"."T".
If the query is run by B, "T" points to the public synonym "T", which points to "A"."T", and because A granted access on T to B, the query runs OK. A version is created that indicates that a session connected with schema "B" can access "A"."T".
If the query is run by C, "T" points to the public synonym "T", which points to "A"."T", but because A has not granted access on T to C, the query fails and no version is created.

If "A" now grants access on "T" to "C" and "C" runs the query again, then "T" is resolved again to the public synonym "T", which points to "A"."T", and since "C" now has privileges, the query runs OK and a version is created that indicates that a session connected with schema "C" can access "A"."T".
If "A" now revokes access on "T" from "B", then the version associated with schema "B" is invalidated, to let any session that has a pointer to that child cursor know that it is no longer valid.

If the query is run by "A" but on a client that has German NLS settings, and all the previous executions were done in English, then a new version is created for the German NLS settings.
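
The scenario above could be reproduced roughly as follows. This is a sketch under assumptions, not a tested script: schemas A, B and C are assumed to already exist, "access" is assumed to mean the SELECT privilege, and the column definition of T is made up for illustration.

```sql
-- Hypothetical setup for the example; schemas A, B and C are assumed to exist.
-- Connected as A (needs the CREATE PUBLIC SYNONYM privilege):
CREATE TABLE t (id NUMBER);
CREATE PUBLIC SYNONYM t FOR a.t;
GRANT SELECT ON t TO b;

-- Connected as A, then as B, then as C, run exactly the same text each time:
SELECT * FROM T;

-- Connected as a user that can read V$SQL, list the versions created so far
-- and the schema that parsed each one:
SELECT child_number,
       parsing_schema_name
FROM   v$sql
WHERE  sql_text = 'SELECT * FROM T'
ORDER  BY child_number;
```

The statement text must be identical in every session (including case and spacing), otherwise each variant hashes to a different parent cursor instead of adding a child to the same one.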

I could easily continue with examples, as there are over 50 different reasons why a new child may need to be created.
These are tracked in V$SQL_SHARED_CURSOR.
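
As an illustration, the reasons recorded for each child of one statement can be read directly from that view. This is a minimal sketch, assuming 10g or later (where the view exposes SQL_ID). Only a handful of the reason columns are selected, since the full set differs between versions.

```sql
-- Each mismatch column holds 'Y' when that reason prevented the child
-- from sharing an existing version, and 'N' otherwise.
SELECT child_number,
       bind_mismatch,
       language_mismatch,
       optimizer_mismatch,
       optimizer_mode_mismatch,
       auth_check_mismatch,
       translation_mismatch
FROM   v$sql_shared_cursor
WHERE  sql_id = '8swypbbr0m372'
ORDER  BY child_number;
```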

There are reasons related to binds, privileges, languages, parallel execution, optimization, and more,
and reasons are added, removed, or changed in meaning with each version.

Even though in theory there could be 50+ combinations of reasons, and each reason may depend on "n" variables that make a cursor shareable only by sessions with those same conditions, the number of versions usually remains low. The norm is that most cursors have fewer than 100 versions, but there are times when hundreds or thousands of versions are created and managing them causes significant performance degradation in the system.
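
One simple way to spot candidates is to list the statements whose version count is above that norm. This is a sketch only; the threshold of 100 is taken from the rule of thumb above and should be adjusted to the system being checked.

```sql
-- Statements with an unusually high number of versions, worst first.
SELECT sql_id,
       version_count,
       loaded_versions,
       sharable_mem,
       sql_text
FROM   v$sqlarea
WHERE  version_count >= 100
ORDER  BY version_count DESC;
```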

As each reason is related to different features in the database, there is no single way to debug a high version count situation.
For example, to diagnose a case where the reason for not sharing is language, it might be necessary to know which languages are involved.
If the reason is binds, it might be necessary to know how many binds there are, and their datatypes and sizes.
And so on.
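
For the bind case, for example, the captured bind metadata can be summarized per bind position. This is a minimal sketch of my own (not the query the tool runs), assuming 10g or later, where V$SQL_BIND_CAPTURE is available.

```sql
-- Summarize captured binds for one statement: a bind position that shows
-- several datatypes or lengths is a likely source of BIND_MISMATCH children.
SELECT position,
       datatype_string,
       MIN(max_length) AS min_max_length,
       MAX(max_length) AS max_max_length,
       COUNT(*)        AS captures
FROM   v$sql_bind_capture
WHERE  sql_id = '8swypbbr0m372'
GROUP  BY position, datatype_string
ORDER  BY position, datatype_string;
```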

For a more in-depth explanation, see: Troubleshooting: High Version Count Issues [ID 296377.1]

After working on several of these high version count cases, I came to identify some common diagnostics, their presentation and their interpretation, so I made a diagnostic tool to collect them.

Round trips between support and the customer have always been annoying and riddled with mistakes and omissions, so the tool aims to collect as much relevant information as possible, limited to the reasons identified.

The tool is VERSION_RPT and can be found in: High SQL Version Counts - Script to determine reason(s) [ID:438755.1]

The tool adapts to the number of reasons in each version, so it works from 9.2 to 11gR2, and I expect it to work on 12c too once it is released.
The tool can be used to diagnose one or many cursors, including pseudocursors, using SQL_ID or HASH_VALUE.

The output of the report shows a summary of the reasons, plus information about the factors that each reason may take into consideration when deciding whether to share, such as the optimizer parameters when the reason is OPTIMIZER_MISMATCH.

These reasons are searchable in MOS. Searching for the phrase:

"Cursor not shared due to USER_BIND_PEEK_MISMATCH"

finds BugTag notes such as Bug 8981059 - High Version Count (due to USER_BIND_PEEK_MISMATCH) with bind peeking [ID 8981059.8]

To collect data for a group of SQL statements, the report can be run like this:


SELECT b.*
FROM v$sqlarea a ,
     TABLE(version_rpt(a.sql_id)) b
WHERE loaded_versions >=100;

Here is an example of its use for one cursor:

===============

SQL> set pages 2000 lines 180
SQL> SELECT * FROM TABLE(version_rpt('8swypbbr0m372'));

Version Count Report Version 3.2.1 -- Today's Date 02-may-12 11:50

RDBMS Version :11.2.0.3.0 Host: XXXXXXXXXXXXXXXXXXXX Instance 1 : XXXXXXXXXX
==================================================================
Addr: 00000003D8B6A570  Hash_Value: 3993603298  SQL_ID 8swypbbr0m372
Sharable_Mem: 208913 bytes   Parses: 477
Stmt:
0 select order#,columns,types from access$ where d_obj#=:1
1

Versions Summary
----------------
OPTIMIZER_MISMATCH :10
BIND_MISMATCH :1
LANGUAGE_MISMATCH :1
OPTIMIZER_MODE_MISMATCH :3
PX_MISMATCH :2
Total Versions:10
~
Plan Hash Value Summary
-----------------------
Plan Hash Value Count
=============== =====
      872636971 11
              0 1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Details for OPTIMIZER_MISMATCH :

1 versions with is_recur_flags = 167
2 versions with is_recur_flags = 161
9 versions with _pga_max_size = 614400 KB
7 versions with optimizer_mode = choose
3 versions with is_recur_flags = 35
7 versions with hash_area_size = 10485760
1 versions with optimizer_mode = rule
2 versions with advanced_queuing_internal_cursor = 1
2 versions with is_recur_flags = 163
1 versions with parallel_execution_enabled = false
1 versions with _pga_max_size = 204800 KB
10 versions with _optimizer_order_by_elimination_enabled = false
2 versions with is_recur_flags = 39
1 versions with hash_area_size = 2000000
1 versions with sort_area_size = 1000000
8 versions with _db_file_optimizer_read_count = 128
10 versions with _optimizer_cost_based_transformation = on
1 versions with parallel_threads_per_cpu = 4
1 versions with parallel_max_degree = 16
1 versions with _parallel_syspls_obey_force = false
~~~~~~~~~~~~~~~~~~~~~~~~~~~

Details for BIND_MISMATCH :

Consolidated details for :
BIND_MISMATCH,USER_BIND_PEEK_MISMATCH,BIND_UACS_DIFF and
BIND_EQUIV_FAILURE (Mislabled as ROW_LEVEL_SEC_MISMATCH BY bug 6964441 in 11gR1)

from v$sql_bind_capture
COUNT(*) POSITION MIN(MAX_LENGTH) MAX(MAX_LENGTH) DATATYPE (PRECISION,SCALE)
======== ======== =============== =============== ======== ================
      11        1              22              22        2 (,)

SUM(DECODE(column,Y, 1, 0) FROM V$SQL
IS_OBSOLETE IS_BIND_SENSITIVE IS_BIND_AWARE IS_SHAREABLE
=========== ================= ============= ============
          0                 0             0            9
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Details for LANGUAGE_MISMATCH :

No details available
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Details for OPTIMIZER_MODE_MISMATCH :

1 versions with RULE
9 versions with CHOOSE
2 versions with ALL_ROWS
~~~~~~~~~~~~~~~~~~~~~~~~~

Details for PX_MISMATCH :

No details available

####

To further debug Ask Oracle Support for the appropiate level LLL.

alter session set events 'immediate trace name cursortrace address 3993603298, level LLL';

To turn it off do use address 1, level 2147483648
================================================================

84 rows selected.


About

I am Abel Macias, an Oracle support engineer specializing in performance who belongs to Exadata Support.
Disclaimer: This blog aims to share my experiences while diagnosing performance problems on Oracle's RDBMS, mostly on Exadata.
The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.
