Fast Infoset and WCF's binary XML encoding

While browsing Microsoft's WCF APIs I came across some interesting information on the WCF "binary XML" format.

XmlReader is a pull-based API for processing an XML infoset. The same API can be used for processing XML documents or "binary XML" documents. How?

The static XmlReader.Create method can take as input a Stream of octets that is either an XML document or a "binary XML" document. The documentation of this method states:

"The first two bytes of the stream is checked first to determine if it is using tokenized binary XML format. If the first two bytes contain the 0xDF and 0xFF value the binary XML reader is returned."

Here is the bit I find interesting. Fast Infoset specifies that the first two octets of a Fast Infoset document are 0xE0 and 0x00. There is only a difference of 1 between the two!! That holds whether you compare the first octets as 8-bit integers (0xE0 versus 0xDF) or both octets as a 16-bit integer, most significant byte first (0xE000 versus 0xDFFF).
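The arithmetic is easy to check. Here is a small sketch in Python; the constant names are mine, and the octet values are just the ones quoted above.

```python
# Leading octets quoted above for each format.
WCF_BINARY_MAGIC = bytes([0xDF, 0xFF])    # from the XmlReader.Create documentation
FAST_INFOSET_MAGIC = bytes([0xE0, 0x00])  # from the Fast Infoset specification

# Compare the first octet of each format as an 8-bit integer.
first_octet_gap = FAST_INFOSET_MAGIC[0] - WCF_BINARY_MAGIC[0]

# Compare both octets as a 16-bit integer, most significant byte first.
word_gap = (int.from_bytes(FAST_INFOSET_MAGIC, "big")
            - int.from_bytes(WCF_BINARY_MAGIC, "big"))

print(first_octet_gap, word_gap)  # prints "1 1"
```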

Given that the two binary formats use different "magic numbers" it should be possible to integrate Fast Infoset into WCF without any conflicts :-)
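To make the idea concrete, here is a rough sketch of a factory-style check that peeks at the first two octets and picks a format. It is only an illustration in Python, not how WCF does it internally; the function name sniff_format and the returned labels are made up for this example.

```python
import io

WCF_BINARY_MAGIC = b"\xDF\xFF"    # tokenized binary XML, per the quoted documentation
FAST_INFOSET_MAGIC = b"\xE0\x00"  # Fast Infoset document

def sniff_format(stream):
    """Label the format of a seekable binary stream without consuming it."""
    start = stream.tell()
    head = stream.read(2)
    stream.seek(start)  # rewind so the real parser sees the whole document
    if head == WCF_BINARY_MAGIC:
        return "wcf-binary-xml"
    if head == FAST_INFOSET_MAGIC:
        return "fast-infoset"
    # Anything else is assumed to be a textual XML document.
    return "text-xml"

print(sniff_format(io.BytesIO(b"\xE0\x00\x00\x01")))
print(sniff_format(io.BytesIO(b"\xDF\xFF\x01")))
print(sniff_format(io.BytesIO(b"<?xml version='1.0'?><a/>")))
```

Because the two magic numbers differ, the order of the two checks does not matter.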

The first two octets of a Fast Infoset document were chosen because they are different from the first two octets that can occur for an XML document encoded using a well-known character encoding scheme (see Appendix F of XML 1.0, on autodetection of character encodings).
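A quick way to convince yourself is to compare the magic numbers against the leading octets listed in Appendix F. The table below is my own transcription of those byte sequences, so treat it as an assumption and verify it against the specification before relying on it.

```python
FAST_INFOSET_MAGIC = bytes.fromhex("E000")
WCF_BINARY_MAGIC = bytes.fromhex("DFFF")

# Leading octets of XML entities as transcribed from XML 1.0 Appendix F
# (assumption: check against the spec). The labels are informal.
XML_AUTODETECT_PREFIXES = {
    "UCS-4 BOM, big-endian": bytes.fromhex("0000FEFF"),
    "UCS-4 BOM, little-endian": bytes.fromhex("FFFE0000"),
    "UCS-4 BOM, unusual order 2143": bytes.fromhex("0000FFFE"),
    "UCS-4 BOM, unusual order 3412": bytes.fromhex("FEFF0000"),
    "UTF-16 BOM, big-endian": bytes.fromhex("FEFF"),
    "UTF-16 BOM, little-endian": bytes.fromhex("FFFE"),
    "UTF-8 BOM": bytes.fromhex("EFBBBF"),
    "UCS-4 '<', big-endian": bytes.fromhex("0000003C"),
    "UCS-4 '<', little-endian": bytes.fromhex("3C000000"),
    "UCS-4 '<', unusual order 2143": bytes.fromhex("00003C00"),
    "UCS-4 '<', unusual order 3412": bytes.fromhex("003C0000"),
    "UTF-16 '<?', big-endian": bytes.fromhex("003C003F"),
    "UTF-16 '<?', little-endian": bytes.fromhex("3C003F00"),
    "ASCII-compatible '<?xm'": bytes.fromhex("3C3F786D"),
    "EBCDIC '<?xm'": bytes.fromhex("4C6FA794"),
}

# Names of any textual prefixes whose first two octets match a binary magic number.
collisions = [
    name
    for name, prefix in XML_AUTODETECT_PREFIXES.items()
    if prefix[:2] in (FAST_INFOSET_MAGIC, WCF_BINARY_MAGIC)
]

print(collisions)  # prints "[]"
```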

We have used the same type of factory mechanism based on the first couple of octets when prototyping solutions for the Java Web Services Developer Pack. For the final integration into JWSDP 1.6 we chose to rely on the MIME type instead.
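For comparison, choosing a parser from the MIME type looks roughly like the sketch below. This is not the JWSDP code. The media types in the two sets are my assumption about what a service would see (the post above does not name them), and choose_parser and its labels are hypothetical.

```python
# Assumed media types; not taken from the JWSDP implementation.
FAST_INFOSET_TYPES = {"application/fastinfoset", "application/soap+fastinfoset"}
XML_TYPES = {"text/xml", "application/xml", "application/soap+xml"}

def choose_parser(content_type):
    """Pick a parser label from a Content-Type header value."""
    # Drop parameters such as charset and normalize case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type in FAST_INFOSET_TYPES:
        return "fast-infoset"
    if media_type in XML_TYPES:
        return "text-xml"
    raise ValueError("unsupported media type: " + media_type)

print(choose_parser("application/fastinfoset"))
print(choose_parser("text/xml; charset=utf-8"))
```

The nice property of this approach is that the stream is never touched before the parser is chosen, so it also works for streams that cannot be rewound.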


Also of note is that XmlReader and XmlWriter have specific methods to read and write typed data, such as octets or integers. When using the text-based implementations the data is converted to and from characters in accordance with the lexical representations of the data types specified by W3C XML Schema. When using a binary-based implementation, however, such data can be encoded much more efficiently.
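To get a feel for the difference, here is a small Python comparison of the XML Schema lexical forms (base64 for octets, decimal digits for an integer) against raw octets. The raw layouts are purely illustrative: a real binary format adds its own framing, and this is not the WCF or Fast Infoset wire encoding.

```python
import base64
import struct

# 300 arbitrary octets standing in for some binary payload.
payload = bytes(range(256)) + bytes(44)

# Text-based writer: octets become base64 characters (xs:base64Binary).
text_octets = base64.b64encode(payload)

# Text-based writer: an integer becomes decimal digits (xs:int).
value = 2147483647
text_int = str(value).encode("ascii")

# Binary-based writer (illustrative layout): 32-bit big-endian integer.
binary_int = struct.pack(">i", value)

print(len(payload), len(text_octets))   # prints "300 400"
print(len(binary_int), len(text_int))   # prints "4 10"
```

Base64 alone costs a third more space for the octets, before counting the time spent encoding and decoding.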

Having such methods on XmlReader and XmlWriter is rather useful IMHO. Not only is it a very useful aid for developers, it also makes integrating specially optimized binary encodings of data quite easy while hiding the implementation details from the developer.

About

sandoz
