
An Oracle blog about Adapters

  • April 9, 2015

Encoding resolution in NXSD framework

Adapters accept native messages in XML or non-XML format and publish them as XML messages. Adapters can also accept XML messages and convert them back to the native EIS format. This translation from native data format to XML and back is performed using a definition file (non-XML schema definition or NXSD), which itself is defined in XML schema format.

In char mode, the NXSD framework assumes that all incoming messages are UTF-8 encoded. If a message is encoded in some other charset, that character encoding must be passed to the NXSD framework. Similarly, the NXSD framework writes all outbound messages as UTF-8 by default; to write messages in a different encoding, the character encoding must be specified to the framework.
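To see why the encoding has to be communicated, here is a minimal sketch in plain Java. It does not use the NXSD framework itself; the class name EncodingMismatchDemo and the decode helper are made up for illustration. It decodes the same bytes once with the UTF-8 default and once with the charset the sender actually used.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any Oracle adapter API.
final class EncodingMismatchDemo {

    // Decodes raw bytes with the given charset, as a char-mode reader would.
    static String decode(byte[] nativeBytes, Charset charset) {
        return new String(nativeBytes, charset);
    }

    public static void main(String[] args) {
        // "café" written by a system that uses ISO-8859-1: é is the single byte 0xE9.
        byte[] nativeBytes = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        // With the UTF-8 default, 0xE9 is malformed input and is replaced by U+FFFD.
        System.out.println(decode(nativeBytes, StandardCharsets.UTF_8));

        // With the sender's charset, the text survives intact.
        System.out.println(decode(nativeBytes, StandardCharsets.ISO_8859_1));
    }
}
```

The first line printed contains a replacement character where the é should be, which is the kind of corruption you would see if a non-UTF-8 message were translated without telling the framework its encoding.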

The character encoding for a native message can be specified in multiple places and the NXSD framework resolves the character encoding according to the following order of precedence:

For incoming messages:

1. Message encoding is often part of the incoming message metadata, and this information takes the highest precedence. A few examples of messages that carry their encoding are as follows:

a. BOM (byte order mark): The first few bytes of a file (up to four) can carry a byte order mark that identifies the encoding and endianness of the message; this is common for files produced in Windows environments. The NXSD translator peeks at the first 4 bytes of a message to look for the byte order mark when the annotation nxsd:parseBom="true" is set. If the byte order mark is not found, the encoding information is derived according to one of the following schemes.

b. XML Prolog: In the case of XML messages, the encoding can be part of the XML prolog. For XML messages, this takes the highest precedence.

<?xml version="1.0" encoding="UTF-8"?>

c. Native MQ Message (MQMD.ccsid): In the case of native MQ Series messages, the ccsid can contain encoding information. To enable the MQ Series adapter to derive the encoding, the "UseMessageEncodingForNativeTranslation=true" attribute must be added to the inbound adapter activation spec. If specified, the MQ Series adapter looks for the encoding information in the ccsid of the MQMD of the dequeued message. The same value is propagated downstream to the native translation framework.

2. jca.message.encoding: This property is used to override the encoding specified in the NXSD schema for inbound Oracle File and FTP Adapters. The adapter propagates the value of this property to the native translation framework.

3. nxsd:encoding annotation: Users can specify the encoding as a custom annotation on the schema element of the NXSD grammar.

4. Default UTF-8: If the encoding is not specified in any of the above-mentioned places, then the default value of UTF-8 is assumed.
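The inbound precedence above can be sketched in plain Java. This is only an illustration of the ordering, not the framework's actual code: the class InboundEncodingResolver and its methods are hypothetical, the jca.message.encoding and nxsd:encoding values are assumed to arrive as plain strings, and only the BOM case is shown for message-level metadata.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical helper that mirrors the inbound order of precedence.
final class InboundEncodingResolver {

    // Returns the charset announced by a byte order mark, if the bytes start with one.
    static Optional<Charset> sniffBom(byte[] head) {
        // UTF-32 is tested before UTF-16 because FF FE 00 00 also starts with the UTF-16LE mark.
        if (startsWith(head, 0x00, 0x00, 0xFE, 0xFF)) return Optional.of(Charset.forName("UTF-32BE"));
        if (startsWith(head, 0xFF, 0xFE, 0x00, 0x00)) return Optional.of(Charset.forName("UTF-32LE"));
        if (startsWith(head, 0xEF, 0xBB, 0xBF)) return Optional.of(StandardCharsets.UTF_8);
        if (startsWith(head, 0xFE, 0xFF)) return Optional.of(StandardCharsets.UTF_16BE);
        if (startsWith(head, 0xFF, 0xFE)) return Optional.of(StandardCharsets.UTF_16LE);
        return Optional.empty();
    }

    // Compares the leading bytes of data (as unsigned values) with the given prefix.
    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    // 1. message metadata, 2. jca.message.encoding, 3. nxsd:encoding, 4. UTF-8.
    static Charset resolve(Optional<Charset> fromMessage, String jcaMessageEncoding, String nxsdEncoding) {
        if (fromMessage.isPresent()) return fromMessage.get();
        if (jcaMessageEncoding != null && !jcaMessageEncoding.isEmpty()) return Charset.forName(jcaMessageEncoding);
        if (nxsdEncoding != null && !nxsdEncoding.isEmpty()) return Charset.forName(nxsdEncoding);
        return StandardCharsets.UTF_8;
    }
}
```

For example, resolve(sniffBom(firstBytes), "ISO-8859-1", "UTF-16") would return the BOM's charset when one is present and ISO-8859-1 otherwise, because the binding property outranks the schema annotation.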

For outgoing messages:

1. Binding property: jca.message.encoding

2. Custom schema annotation: nxsd:encoding

3. Default: UTF-8
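The outbound order is shorter because there is no message metadata to consult. A minimal sketch, again with hypothetical names (OutboundEncodingSketch, resolve, encode) and with both settings assumed to be available as plain strings:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that mirrors the outbound order of precedence.
final class OutboundEncodingSketch {

    // 1. jca.message.encoding, 2. nxsd:encoding, 3. UTF-8.
    static Charset resolve(String jcaMessageEncoding, String nxsdEncoding) {
        if (jcaMessageEncoding != null && !jcaMessageEncoding.isEmpty()) return Charset.forName(jcaMessageEncoding);
        if (nxsdEncoding != null && !nxsdEncoding.isEmpty()) return Charset.forName(nxsdEncoding);
        return StandardCharsets.UTF_8;
    }

    // Encodes the translated native text with the resolved charset.
    static byte[] encode(String nativeText, Charset charset) {
        return nativeText.getBytes(charset);
    }
}
```

With neither setting present, an é is written as the two-byte UTF-8 sequence; with nxsd:encoding set to ISO-8859-1 it is written as a single byte.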

By default, all charsets supported by the JDK are also supported by the translation framework. The JDK also lets you plug in additional Charset implementations by writing a CharsetProvider class. All such additions are automatically supported by the translation framework.
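Before configuring an encoding name, it can help to check that the JDK running the adapter actually knows it. The sketch below uses only the standard java.nio.charset API; the class name CharsetSupportCheck and the isUsable helper are made up for illustration. Charsets contributed by a custom CharsetProvider (registered through a META-INF/services/java.nio.charset.spi.CharsetProvider file) show up in the same lookups.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper for checking charset names against the running JDK.
final class CharsetSupportCheck {

    // True if the running JDK (including installed providers) supports the name.
    static boolean isUsable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        // Lists every canonical charset name the running JDK can resolve.
        for (String canonicalName : Charset.availableCharsets().keySet()) {
            System.out.println(canonicalName);
        }
    }
}
```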
