If you think that XML is the answer, chances are you misunderstood the question

But people try to apply XML to everything they see nevertheless. Sometimes, however, the results are not just sad, but downright spooky:
Date: Wed, 23 May 2007 10:11:57 +0200
From: Lluis Batlle 
Subject: Re: [9fans] XML
To: Fans of the OS Plan 9 from Bell Labs <9fans@cse.psu.edu>

Lately I've been told at work to use a library in C. Most calls have
the signature

  ErrorType function(const char *xml);

I have to pass to them xmls of more than two levels deep, attributes,
and around ten elements.

When I asked why such interface to a library, claiming that it was
uncomfortable to me, the lib developer told me that in fact
xml-parameter-passing was one of the techniques he liked most, and
helped him solve a lot of problems easily.
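
To make the complaint concrete, here is a minimal sketch in C of the same request passed two ways. Everything in it is an assumption made up for illustration: the names resize_xml, resize_struct, ResizeRequest, the ERR_* values and the "width" attribute do not come from the library in the email. The XML side is a toy sscanf lookup, not a real parser.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the library's ErrorType. */
typedef enum { ERR_OK = 0, ERR_BAD_ARG = 1 } ErrorType;

/* Style 1: the whole request travels as one XML string.
 * The compiler only checks that the argument is a char pointer.
 * This toy lookup finds a width="..." attribute and nothing else;
 * it is NOT a real XML parser. */
ErrorType resize_xml(const char *xml, int *width_out)
{
    const char *attr;

    if (xml == NULL || width_out == NULL)
        return ERR_BAD_ARG;
    attr = strstr(xml, "width=\"");
    if (attr == NULL || sscanf(attr, "width=\"%d\"", width_out) != 1)
        return ERR_BAD_ARG;
    return ERR_OK;
}

/* Style 2: the same request as a plain struct (hypothetical fields). */
typedef struct {
    int width;
    int height;
} ResizeRequest;

ErrorType resize_struct(const ResizeRequest *req, int *width_out)
{
    if (req == NULL || width_out == NULL)
        return ERR_BAD_ARG;
    *width_out = req->width;
    return ERR_OK;
}

/* Usage sketch: both styles deliver the same width. */
void demo(void)
{
    int w = 0;
    ResizeRequest req = { 640, 480 };

    assert(resize_xml("<resize width=\"640\" height=\"480\"/>", &w) == ERR_OK);
    assert(w == 640);

    /* A misspelled attribute compiles fine and fails only at run time. */
    assert(resize_xml("<resize widht=\"640\"/>", &w) == ERR_BAD_ARG);

    w = 0;
    assert(resize_struct(&req, &w) == ERR_OK);
    assert(w == 640);
}
```

With the struct, writing req.widht is a compile error. With the string, the same typo survives until the call runs, and the caller still has to build and escape the XML by hand.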

Occasionally, however, XML is the answer, and what's equally frustrating when it is the answer is to watch people treat it as if it were a CSV file, particularly after they were the ones who suggested using XML in the first place :(.

Posted by guest on June 18, 2007 at 08:08 AM PDT #

I'd be very curious to see the examples where XML is really the answer? I've asked a similar question on the 9fans mailing list and I don't think I've seen any convincing examples there: Thanks, Roman.

Posted by Roman Shaposhnik on June 19, 2007 at 04:16 AM PDT #

XML was originally intended for *documents*. Then everybody and their brother started using it for transfer notation and syntax. Then they discovered how inefficient XML is for that. So now we have ASN.1 PER encodings of XML. Funny. In any case, XML definitely has its place, including for things one might not think of as documents (e.g., SMF manifests). Note that XML might be appropriate as an internal implementation detail of an API like the one you describe, but it's definitely not appropriate to present errors as XML that the application then has to parse, unless the XML snippet is meant to be integrated into a larger XML document (think AJAX).

Posted by Nico on June 19, 2007 at 04:26 AM PDT #

So basically your argument boils down to using XML as a serialization format for internal data structures. Something the human being is never supposed to see, edit or create from scratch. That's a convincing argument, except for the fact that my experience with computing has taught me one thing: never leave the human being out of the loop. Ever. See, that's why I love SMTP (regardless of how naive its security model might be) and I hate SOAP. Thanks, Roman. P.S. Consider joining 9fans. Judging by your blog you should fit right in! ;-)

Posted by Roman Shaposhnik on June 19, 2007 at 04:44 AM PDT #

Roman: Are you talking about my comment above yours?

If so, no, that's not quite my argument. Clearly whenever you have complex objects you may need a way to represent them to users/developers, and that's yet another interface, though if you're using XML you might use an XML toolchain to build such an interface. In any case, I wouldn't recommend using XML for internal object serialization purposes unless said serialized objects are intended to fit into a larger XML document, and I imagine that that can happen in some AJAX environments.

Don't be so sure that line-oriented, all-US-ASCII Internet protocols are superior to protocols that use formal syntaxes (like ABNFs, XDR, ASN.1, XML DTDs/Schemas, etc...); I'm certain that they are not.

Posted by Nico on June 20, 2007 at 03:29 AM PDT #

Nico wrote:
||| Roman: Are you talking about my comment above yours?
Yes I am.
||| If so, no, that's not quite my argument.
Well, that is how I read it. Otherwise please explain what you mean by "XML was originally intended for *documents*". My understanding of a document is that it is either data+metadata (in which case the metadata layer is supposed to be *really* simple and thin) or what I said: it is a serialization format for internal data structures.
||| Don't be so sure that...
Well, define superior. To me all of computer science is about battling the complexity of the real world. With that definition of superiority (less complexity with the same amount of usefulness), I still stick to my argument.

Posted by Roman Shaposhnik on June 20, 2007 at 07:40 AM PDT #

I don't see how serialized internal data == documents, though sometimes it's true (in that documents have internal representations in running programs such that the external representation is a serialized form of the internal one).

I won't define "superior," but I'll explain why I think formal syntaxes are superior to ad-hoc ones: because you can write tools that consume machine-parseable message definitions, including code generators. To be sure, there's a point at which too much formality can make documents difficult to read, or can lead to redundancy (natural language descriptions and formal definitions of the same things) where it is difficult to ascertain that the two do not conflict. So there is a limit to how much formality is practical, but I think too that there is a minimum degree of formality that is needed to describe complex protocols and documents.

Posted by Nico on June 20, 2007 at 08:16 AM PDT #

Nico wrote:
||| I think too that there is a minimum degree
||| of formality that is needed to describe complex
||| protocols and documents.
Well, the same can be said about programming languages in general. After all, their main purpose is to let us "describe complex protocols..." ...and problems. Yet I feel much better about C than about C++. To quote Erik Quanstrom:
i think a better question is, can you think of an 
application for xml that can't be done more simply?

the debate over typed variables in programming 
languages is pretty well over. but i think asn/1 and 
xml seem to show that data type definitions can get
out of control.  and that the dtd is itself a program 
that needs debugging.

Posted by Roman Shaposhnik on June 20, 2007 at 08:26 AM PDT #

But C is a formal language! Lots of pre-ABNF/XDR/ASN.1/... Internet protocols didn't use any formal language at all. That is definitely not good, and you can't say that IP/FTP/SMTP/whatever is to, say, TLS, as C is to C++ -- assembler to C++ might be a better analogy.

I don't know how you make the jump to your question about XML apps that couldn't be done more simply in some other way. But I'll say this: I'm not a great fan of XML -- it's a wheel re-invention as far as I'm concerned -- but in terms of tool chains, libraries, adoption and so on, it has long surpassed its predecessors (Lisp S-expressions and Lisp macros like destructuring-bind, ASN.1, etc...). So, in one very important sense XML is superior to the alternatives: that is, in terms of market penetration.

And BTW, thanks for the news that the debate over typing is over. I had no idea... :)

Posted by Nico on June 20, 2007 at 08:49 AM PDT #
