X

An Oracle blog about Oracle Text Index

  • January 26, 2017

Parallel Sync of Text Indexes

Roger Ford
Product Manager

When syncing a CONTEXT index, we would normally use a command such as:

  exec ctx_ddl.sync_index( index_name=>'MYINDEX', part_name=>'P1' )

However, we can "suggest" that this is run in parallel using the command

  exec ctx_ddl.sync_index( index_name=>'MYINDEX', part_name=>'P1', parallel_degree=>16 )

Why "suggest"? Well, sometimes the kernel thinks it knows better than you, and will decide not to run things in parallel even if you tell it to. But in most cases this will work.

Now, if you generate an AWR report, or otherwise examine your system's SQL, you may see a query something like this, which appears to be taking a great deal of resources:

select column_value from
table(cast(ctxsys.drvparx.ParallelPopuIndex(cursor(select /*+
DYNAMIC_SAMPLING(0) ordered PARALLEL(pnd.dr$pending 16)
INDEX_FFS(pnd.dr$pending) NOCACHE(base) */ base.rowid, NULL, NULL,
NULL, NULL, nvl(pnd_lock_failed, 'N'), base."TEXT" from
ctxsys.drv$pending pnd, "ROGER"."DOCS" PARTITION("P1") base where
pnd.pnd_cid = :CID and pnd.pnd_pid = :PID and pnd.pnd_timestamp <=
to_date(:TSTAMP, 'YYYYMMDDSSSSS') and base.rowid = pnd.pnd_rowid),
:idxownid, :idxid, :idxown, :idxname, :ixpname, :popstate) as
sys.odcivarchar2list))

What's going on there?  Well, the answer is that this is a "driver query" for the parallel sync.  SQL doesn't provide generalized syntax for executing PL/SQL statements in parallel, but it does provide a parallel query mechanism. That mechanism knows stuff like how many CPUs the system has, and can make 'intelligent' decisions about whether running things in parallel is good or not. 

So we call a parallel query, but put a PL/SQL function in the SELECT list. This PL/SQL function will be called in parallel by all the parallel query threads - thus giving us "free" parallel execution of our PL/SQL functions. By having a "PARRALLEL" hint in the query, we can give the kernel a recommended degree of parallelism.

One side-issue about that query: People often look at it, or similar queries and ask "why is using dynamic sampling"? Dynamic sampling is an optimizer technique which involves sampling data in the table to figure out which query path makes sense. It's not a good way to approach any query where you know for sure the best plan to use. The answer to that question is "it's not".  The key is in the zero argument - DYNAMIC_SAMPLING(0) means "don't use dynamic sampling, even if it seems like a good idea".


Be the first to comment

Comments ( 0 )
Please enter your name.Please provide a valid email address.Please enter a comment.CAPTCHA challenge response provided was incorrect. Please try again.