
Master Note for Troubleshooting Streams Apply Errors ORA-1403, ORA-26787 or ORA-26786 (Doc ID 265201.1)

Copyright (c) 2010, Oracle Corporation. All Rights Reserved.

In this Document
  

Purpose
  Last Review Date
  Instructions for the Reader
  Troubleshooting Details
     Retrieve information about the error
     Identify root cause
     Fixing the error
     Avoid new errors
  References


Applies to:

Oracle Server - Enterprise Edition - Version: 9.2.0.1 to 11.2.0.2 - Release: 9.2 to 11.2
Information in this document applies to any platform.

 

Purpose

This article explains how to diagnose Streams Apply errors raised when the Apply process hits a data error while applying a transaction, such as ORA-1403, ORA-26787 or ORA-26786.

It does not cover critical errors suffered by the apply process that are not related to data errors.

 

Last Review Date

October 15, 2010

 

Instructions for the Reader

A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting.

 

Troubleshooting Details

In a Streams environment the data captured by the capture process must be in sync with the data that the apply process is going to modify. When the two sites are not completely in sync, any captured DML may raise an error on the apply side.

A good example is the errors ORA-1403, ORA-26787 and ORA-26786, which typically occur when an UPDATE or DELETE statement is attempted on an existing row and the OLD_VALUES of the LCR do not match the current values at the target database. Please see a sample video on how to troubleshoot these errors:

Video - Sample on Troubleshooting an error on Streams Apply error queue (4:35)


There are two potential actions when you get a data error in the apply error queue: delete the transaction(s) from the error queue, or re-execute the transaction(s) in the error queue after fixing the data on the apply side. To decide, it is necessary to get the information from the LCR that raised the error and compare it with the information existing in the database, if any.

 

Retrieve information about the error

When an apply process gets a data error applying an LCR, it rolls back the whole transaction and inserts it into the error queue, which can be checked through the DBA_APPLY_ERROR view.

When analyzing a data error, the most useful columns in this view are:
  • APPLY_NAME, name of the apply process that got the error when applying the transaction.
  • LOCAL_TRANSACTION_ID, id of the transaction that has been rolled back and that will be used to resolve the error.
  • MESSAGE_NUMBER, number of the message that hit the error.
  • ERROR_NUMBER, error number hit.
  • ERROR_MESSAGE, error message hit.
  • MESSAGE_COUNT, number of LCRs in the transaction.
You can query this view yourself (see the example below) or view the errors through other mechanisms such as db console, OEM console, RDA, Streams Health Check or the print_errors procedure described below.
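For example, a minimal query listing these columns, ordered by SOURCE_COMMIT_SCN so the errors appear in commit order:

SET LINESIZE 200
COLUMN error_message FORMAT A60
SELECT apply_name, local_transaction_id, message_number,
       message_count, error_number, error_message
FROM dba_apply_error
ORDER BY source_commit_scn;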

It is recommended to generate a Streams Health Check report from both the source and the target site when troubleshooting these errors (Note 273674.1).

 

View failed LCR

To see what information the apply process has tried to apply, you have to print the LCR. Depending on the size (MESSAGE_COUNT) of the failed transaction, it may be preferable to print the whole transaction or only a single LCR.

To do this you can use the procedures print_transaction, print_errors, print_lcr and print_any, documented in:

Oracle Streams Concepts and Administration
  Chapter - Monitoring Streams Apply Processes
     Section - Displaying Detailed Information About Apply Errors

These procedures are also available through Note 405541.1 - Procedure to Print LCRs

To print the whole transaction, use the print_transaction procedure; to print the errors in the error queue, use the print_errors procedure; and to print a single LCR you can do it as follows:

SET SERVEROUTPUT ON;

DECLARE
   lcr SYS.AnyData;
BEGIN
    lcr := DBMS_APPLY_ADM.GET_ERROR_MESSAGE
                (<MESSAGE_NUMBER>, <LOCAL_TRANSACTION_ID>);
    print_lcr(lcr);
END;
/
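
For completeness, and assuming the procedures are created as in Note 405541.1 (where print_transaction accepts the local transaction id and print_errors takes no arguments), the other two prints would look like this:

SET SERVEROUTPUT ON

-- Print every LCR of one failed transaction (assumed signature from Note 405541.1)
exec print_transaction('<LOCAL_TRANSACTION_ID>');

-- Print every LCR of every transaction currently in the error queue
exec print_errors;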

Check data on the target side

Based on the information retrieved in the previous step, it is necessary to retrieve the data actually stored in the database, if any. Select from the problematic table using the information from the previous step, referring to the values in the old value columns.

NOTE: Streams will not raise an ORA-1403 for columns that were not updated at the source site and are not supplementally logged; the OLD column value will not be included in the LCR for comparison at the destination.

You may need to dump the problematic row from both the source and destination site using the Dump Function. Check for any data mismatch. Be sure to dump ALL primary key columns and date columns.

For Example:

SELECT dump(deptno), dump(dname), dump(loc)
FROM dept where deptno=40;

See Note 108377.1 - How Do You Read the Output from the Dump Function?

 

Enable Tracing

In certain situations it may be necessary to enable tracing and re-execute the failed transaction to observe exactly when and where the error is raised.
To do this, run the following as strmadmin:

alter session set max_dump_file_size=unlimited;
alter session set events '10046 trace name context forever, level 4';
exec DBMS_APPLY_ADM.EXECUTE_ERROR ('<local_transaction_id>');   -- or DBMS_APPLY_ADM.EXECUTE_ALL_ERRORS
 -- Error will be reported
alter session set events '10046 trace name context off';

These operations create a trace file in the udump (or diag) directory. To interpret this output use Note 39817.1 - Interpreting Raw SQL_TRACE and DBMS_SUPPORT.START_TRACE output.

 

Identify root cause

 

Here is a list of potential root causes that may lead to the error; there can be more than one reason for an error when applying an LCR. When an LCR is being applied, processing stops at the first error found, so further issues may still be hidden behind it.

Supplemental Logging

 

Ensure that all key columns that uniquely identify a row (usually the primary key) at the source site are included in the LCR that has failed; if they are not, this may be a problem of inaccurate supplemental logging.

Verify that supplemental logging has been specified at the source either at the database level or for the individual replicated table.

Database level logging

SELECT supplemental_log_data_pk, supplemental_log_data_ui FROM V$DATABASE;

Table level logging

To add a table-level supplemental log group, for example:

ALTER TABLE scott.dept
ADD SUPPLEMENTAL LOG GROUP dept_log_group (deptno, dname, loc) ALWAYS;

To check existing supplemental log groups:

SELECT owner, log_group_name, table_name, always FROM dba_log_groups;

Check columns in supplemental log groups:

SELECT log_group_name, column_name, position
FROM dba_log_group_columns
WHERE table_name = 'DEPT' AND owner = 'SCOTT';

 

Key columns identification

Ensure that the failing LCR has values assigned for the key columns at the target site; this means verifying that the target table has a primary key and that the LCR contains values for all of its columns.

If the target table does not have a primary key, you can either add the primary key constraint to the table, or use the DBMS_APPLY_ADM.SET_KEY_COLUMNS procedure to specify the identification key for apply. When using SET_KEY_COLUMNS be sure to list ALL of the key columns, in order, in the column list; do not specify each column in a separate call to SET_KEY_COLUMNS (see the sketch below).
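
For illustration, a minimal sketch of defining a substitute key in a single call (scott.dept and its columns are placeholders only):

BEGIN
  DBMS_APPLY_ADM.SET_KEY_COLUMNS(
    object_name => 'scott.dept',        -- placeholder schema.table
    column_list => 'deptno,dname');     -- ALL key columns, in order, in one call
END;
/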

This is even more important if the tables being managed contain LOB columns; otherwise you will get errors ORA-1403 or ORA-26572 while applying a LOB LCR, as described in Note 567623.1.

In the case that the columns of the primary key at the source site do not match the columns of the primary key at the target, the target site key columns must be supplementally logged at the source database; otherwise the LCR will not contain the values needed to identify the row at the target site. This applies to using SET_KEY_COLUMNS as well.

 

Triggers and Constraints

The LCR identified as the cause of the error indicates in which row the error is hit, but note that the operation may involve additional operations fired by a trigger or, in the case of a DELETE operation, by a constraint.

Therefore ensure that triggers associated with the target table have their firing property set to fire once. This verification is done using the DBMS_DDL.IS_TRIGGER_FIRE_ONCE function. By default, DML and DDL triggers are set to fire once. If a trigger is not set to fire once, it will fire when a relevant change is made by an apply process and/or when a relevant change results from the execution of one or more apply errors using the EXECUTE_ERROR or EXECUTE_ALL_ERRORS procedure in the DBMS_APPLY_ADM package.

set serveroutput on

DECLARE
   res boolean;
BEGIN
   res := DBMS_DDL.IS_TRIGGER_FIRE_ONCE('triggerowner', 'triggername');
   IF res THEN
     DBMS_OUTPUT.PUT_LINE('Fire Once? YES');
   ELSE
     DBMS_OUTPUT.PUT_LINE('Fire Once? NO');
   END IF;
END;
/

If necessary set the property using DBMS_DDL.SET_TRIGGER_FIRING_PROPERTY.
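
For example, a minimal sketch of setting the property (the trigger owner and name are placeholders):

BEGIN
  DBMS_DDL.SET_TRIGGER_FIRING_PROPERTY(
    trig_owner => 'SCOTT',              -- placeholder owner
    trig_name  => 'DEPT_AUDIT_TRG',     -- placeholder trigger name
    fire_once  => TRUE);                -- do not fire for apply-generated changes
END;
/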

Verify whether the affected table is referenced by a foreign key that includes an ON DELETE CASCADE clause, as the error may be caused by the cascade operation. In that case, either disable the constraint or create an error handler to manage the ORA-1403 error for DELETE statements.
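
To locate such constraints, a query along these lines against DBA_CONSTRAINTS can help (the SCOTT.DEPT filter is a placeholder for the affected parent table):

SELECT c.owner, c.table_name, c.constraint_name, c.delete_rule, c.status
FROM dba_constraints c, dba_constraints p
WHERE c.constraint_type = 'R'
  AND c.delete_rule = 'CASCADE'
  AND c.r_owner = p.owner
  AND c.r_constraint_name = p.constraint_name
  AND p.owner = 'SCOTT'
  AND p.table_name = 'DEPT';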

 

Transformations and Handlers

Declarative transformations, custom rule-based transformations and DML or DDL handlers transform the contents and the format of an LCR; if this modification is not correct it may lead to errors in the apply process. It is important to review whether the transformation that is occurring is the desired one and, if not, correct it accordingly.

Transformations may occur at any phase of a Streams flow (Capture, Propagation and Apply), so it is also important to check whether any transformation has been defined at the source site. Transformations occur at rule evaluation time, so if the evaluation of the LCR is positive the transformation takes place. Transformations are registered in the DBA_STREAMS_TRANSFORMATIONS view (see the query after this list) and you can easily identify them in a Streams Health Check output in the sections:

  • Capture Rule Transformations By Rule Set
  • Propagation Rule Transformations By Rule Set
  • Apply Rule Transformations By Rule Set
  • Rule Transformations Status
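Outside of the Health Check, a quick query against this view gives the same picture (the exact column list may vary slightly by release):

SELECT rule_owner, rule_name, transform_type, declarative_type, user_function_name
FROM dba_streams_transformations;
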
On the other side there are apply handlers, which only run on the apply site; handlers are executed directly by the apply server and run after a transformation (if defined) has taken place. It is important to distinguish 3 groups of handlers:

  • DML and DDL handlers, executed by the apply slave before applying the LCR.
  • Precommit handlers, executed before the COMMIT of the transaction; remember that DDL has an implicit commit.
  • Error handlers, executed by the apply slave after applying the LCR, but only in the case an error was raised while applying the LCR.
Handlers are registered in the DBA_APPLY_DML_HANDLERS and DBA_APPLY views (see the query after this list), or you can find further information about them in a Streams Health Check output in the sections:
  • Apply Process Handlers
  • Apply Handlers
  • Apply Dml Handlers
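A minimal sketch of querying the dictionary directly:

-- DML and error handlers registered per object
SELECT object_owner, object_name, operation_name, user_procedure, error_handler
FROM dba_apply_dml_handlers;

-- DDL, message and precommit handlers registered per apply process
SELECT apply_name, ddl_handler, message_handler, precommit_handler
FROM dba_apply;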

DDL operations

Under certain circumstances a DDL operation may fire DML operations, for example: ALTER TABLE ... ADD (column DEFAULT x).
As per the design of Streams, the LogMiner session associated with the Capture process ignores these DMLs, so they are not captured.
If the Streams setup in place does not replicate DDLs, a data mismatch may appear when the subsequent DMLs are replicated.

 

Unrecoverable operations

The Capture process relies on redo/archive log information to replicate data; however, it is possible that some operations are not recorded in the redo files at all. This happens with unrecoverable operations or nologging settings.

It is possible to set tables and tablespaces to NOLOGGING, which means that operations on that table/tablespace will not write redo information into the redo log files. So, please review whether the table, or the tablespace where the table is stored, has had this setting enabled at any moment. You can check the DBA_TABLES and DBA_TABLESPACES views.
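
For instance, a quick check of the logging attributes (the owner filter is a placeholder):

-- Tables created or altered with NOLOGGING
SELECT owner, table_name, logging
FROM dba_tables
WHERE logging = 'NO'
  AND owner = 'SCOTT';

-- Tablespaces whose default is NOLOGGING
SELECT tablespace_name, logging
FROM dba_tablespaces
WHERE logging = 'NOLOGGING';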

In addition, even when the table and tablespace settings are correct, it is possible to force operations not to write to the redo log files by specifying the operation as NOLOGGING or UNRECOVERABLE; this usually occurs on massive data loads.

Please check whether a massive data load has been done recently in the source database as UNRECOVERABLE; if so, it may be worth repeating the data load on the target site.

Other possible causes

Bug 2849670

This bug occurs when no primary key columns are being used on the target site, so the DBMS_APPLY_ADM.SET_KEY_COLUMNS procedure has been used, and a DELETE LCR is being processed. It occurs if the schema name has been misspelled in the call to DBMS_APPLY_ADM.SET_KEY_COLUMNS.

The solution is to specify the schema name correctly in DBMS_APPLY_ADM.SET_KEY_COLUMNS.

Data modification on target site

If none of the previous causes applies, this is probably the cause.
It occurs when the target data has been manipulated, manually or automatically, causing a data mismatch.

Compare the column values for the existing row in the target table with the values of the LCR. LOB columns should not be compared.

Be sure to expand the DATE columns to show full year value and time.
ALTER SESSION SET NLS_DATE_FORMAT='HH24:MI:SS DD/MM/YYYY';
Starting with release 11g the DBMS_COMPARISON package is included. This package performs a bulk data comparison between source and target data, making it quick and simple to identify the divergent rows between the tables; it is also very useful for resolving large numbers of reported ORA-1403, ORA-26787 and ORA-26786 errors. In addition, it can repair the divergent data.
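
As an illustration only, a minimal sketch of comparing one table with its source copy; the comparison name, the scott.dept table and the source_db database link are placeholders:

SET SERVEROUTPUT ON
DECLARE
  scan_info  DBMS_COMPARISON.COMPARISON_TYPE;
  consistent BOOLEAN;
BEGIN
  DBMS_COMPARISON.CREATE_COMPARISON(
    comparison_name => 'cmp_dept',      -- placeholder comparison name
    schema_name     => 'scott',         -- placeholder schema
    object_name     => 'dept',          -- placeholder table
    dblink_name     => 'source_db');    -- placeholder db link to the source

  consistent := DBMS_COMPARISON.COMPARE(
    comparison_name => 'cmp_dept',
    scan_info       => scan_info,
    perform_row_dif => TRUE);           -- record row-level differences

  IF consistent THEN
    DBMS_OUTPUT.PUT_LINE('Tables are in sync');
  ELSE
    DBMS_OUTPUT.PUT_LINE('Divergence found, scan_id = ' || scan_info.scan_id);
  END IF;
END;
/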

Please refer to the Oracle Streams Replication Administrator's Guide for further information on how to use DBMS_COMPARISON with Streams.

 

Fixing the error

 

Once the root cause of the error has been identified, then there are two options:

  • Re-execute the transaction after fixing the root cause.
  • Ignore the failed transaction and delete the error from the error queue.
Fixing the root cause may involve different types of actions, such as disabling constraints, altering a table, and so on; what is not possible is to modify the LCR stored in the error queue.
If the only solution is altering the LCR, then it is necessary to define a handler that will be executed when the error is re-executed.

 

Data Correction

Data correction is a typical solution to a data mismatch; in the case of uni-directional replication, just correct the data on the apply site.
If bi-directional replication is configured, however, the data manipulation must not be replicated back to the source site, where it would cause a new error there; to achieve this you can set a Streams tag before doing the data correction on the target site.

exec DBMS_STREAMS.SET_TAG('ff');
  -- fix the data on the apply site
exec DBMS_STREAMS.SET_TAG(null);


Starting with release 11g you can converge the data between the source and the remote table through the DBMS_COMPARISON package, if it has been used to identify the differences; in bi-directional Streams environments you should be aware of the tags here too.

That is, if parameter converge_options is set to DBMS_COMPARISON.CMP_CONVERGE_LOCAL_WINS, then parameter remote_converge_tag should be set; and if parameter converge_options is set to DBMS_COMPARISON.CMP_CONVERGE_REMOTE_WINS, then parameter local_converge_tag should be set.
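
A minimal sketch of the converge step under these assumptions; the comparison name 'cmp_dept' and the scan id are placeholders from a previous COMPARE run, and the tag value 'FF' is just an example:

DECLARE
  scan_info DBMS_COMPARISON.COMPARISON_TYPE;
BEGIN
  DBMS_COMPARISON.CONVERGE(
    comparison_name     => 'cmp_dept',                               -- placeholder
    scan_id             => 1234,                                     -- placeholder scan id
    scan_info           => scan_info,
    converge_options    => DBMS_COMPARISON.CMP_CONVERGE_LOCAL_WINS,  -- local data wins
    remote_converge_tag => HEXTORAW('FF'));  -- keep remote fixes from replicating back
END;
/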

Please refer to Oracle Streams Replication Administrator Guide for further information on how to use DBMS_COMPARISON with Streams.

 

Reexecuting the error

To re-execute a transaction you can use two procedures: DBMS_APPLY_ADM.EXECUTE_ERROR to re-execute a single transaction, or DBMS_APPLY_ADM.EXECUTE_ALL_ERRORS to re-execute all the failed transactions in the error queue. This should be done as the Streams Administrator.

exec DBMS_APPLY_ADM.EXECUTE_ERROR ('<local_transaction_id>');
or
exec DBMS_APPLY_ADM.EXECUTE_ALL_ERRORS ();

If the root cause of the error has been properly corrected, the transaction will apply properly and the error will disappear from the error queue.
If the execution fails again, the error will remain in the error queue. Remember that there can be multiple reasons for a transaction to fail; even though the error returned is the same, the root cause of the problem can be different, and therefore identification of the root cause has to be done again.

 

Deleting error from error queue

If your conclusion is that the failed transaction is no longer relevant for your system and therefore can be removed, then you can delete it.
To delete transactions you can use the procedures DBMS_APPLY_ADM.DELETE_ERROR, to delete one failed transaction, or DBMS_APPLY_ADM.DELETE_ALL_ERRORS, to delete the whole error queue.

exec DBMS_APPLY_ADM.DELETE_ERROR ('<local_transaction_id>');
or
exec DBMS_APPLY_ADM.DELETE_ALL_ERRORS ();

Please be aware that after deleting a transaction from the error queue it is not possible to recover that information unless a new Capture process is defined.

Avoid new errors

To avoid data errors in the future you have to identify the root cause of the errors; depending on the root cause, the solution can be to define a handler or to implement conflict resolution methods.

For further information about conflict resolution see Streams Conflict Resolution chapter on Streams Replication Administrator's Guide, or See Note 230049.1 - Streams Conflict Resolution.
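
As an illustration of a prebuilt update conflict handler, a minimal sketch (hr.employees, the column list and the resolution column are placeholders):

DECLARE
  cols DBMS_UTILITY.NAME_ARRAY;
BEGIN
  cols(1) := 'salary';
  cols(2) := 'commission_pct';
  DBMS_APPLY_ADM.SET_UPDATE_CONFLICT_HANDLER(
    object_name       => 'hr.employees',   -- placeholder table
    method_name       => 'MAXIMUM',        -- keep the higher value on conflict
    resolution_column => 'salary',         -- column used to decide the winner
    column_list       => cols);
END;
/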

Setting up conflict resolution takes effect from the moment you set it up, but you still need to re-execute the failed transactions.
Also note that even with conflict resolution in place you may still receive these errors.

References

NOTE:108377.1 - How Do You Read the Output from the Dump Function?
NOTE:215372.1 - Streams Apply Process Fails With ORA-1031: Insufficient Privileges
NOTE:230049.1 - Streams Conflict Resolution
NOTE:273674.1 - Streams Configuration Report and Health Check Script
NOTE:345119.1 - ORA-26688: missing key in lcr
NOTE:39817.1 - Interpreting Raw SQL_TRACE and DBMS_SUPPORT.START_TRACE output
NOTE:405541.1 - Procedure to Print LCRs
NOTE:463295.1 - Compare and Converge in an Oracle Streams Replication Environment
NOTE:466882.1 - Streams Apply Process Aborts On Decimal Values Using Tg4sybase - Error ORA-28500
NOTE:567623.1 - Streams Apply fails with ORA-26572 when the table has a BLOB column and no primary or substitute key defined
NOTE:760421.1 - Streams Apply Reports ORA-1422
NOTE:858554.1 - SQL to Compare Streams Replicated Table Objects between Source and Target Sites