By bsankararao on Feb 10, 2009
The treatment of extension-headers in RFC 3261 is ambiguous: depending on how an implementation chooses to read the specification, an extension-header may be treated as multi-valued or as single-valued.
According to RFC 3261 25.1 Basic Rules:
extension-header = header-name HCOLON header-value
header-name = token
header-value = \*(TEXT-UTF8char / UTF8-CONT / LWS)
message-body = \*OCTET
According to the above grammar, extension-headers (user-defined headers that are not defined by any known RFC) may be treated as single-valued headers. Most applications, however, use extension-headers as multi-valued headers, which leaves implementors in doubt. Whichever way an implementation chooses, some applications may not behave as intended. For example, should "My-Header : Foo,Bar" be treated as My-Header having the two values Foo and Bar? Or did the user intend the single value "Foo,Bar", where the comma is part of the value and not a value separator? I have looked for guidance from the RFC authors on this issue, but without success.
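The two readings of "My-Header : Foo,Bar" can be made concrete with a minimal Java sketch. The class and method names here are hypothetical and do not come from any SIP stack; the code only shows that both readings are mechanically plausible, so a parser with no knowledge of the header cannot pick the right one.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical demo class, not part of SailFin or any SIP API.
class CommaAmbiguityDemo {

    // Reading 1: the comma is a value separator.
    static List<String> asMultiValued(String rawValue) {
        return Arrays.stream(rawValue.split(","))
                .map(String::trim)
                .collect(Collectors.toList());
    }

    // Reading 2: the comma is just another character of the one value.
    static List<String> asSingleValued(String rawValue) {
        return Collections.singletonList(rawValue.trim());
    }

    public static void main(String[] args) {
        String rawValue = "Foo,Bar";
        System.out.println(asMultiValued(rawValue));  // prints [Foo, Bar]
        System.out.println(asSingleValued(rawValue)); // prints [Foo,Bar]
    }
}
```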
Mattias from Ericsson has explained the issue in detail. In his own words:
IMHO an implementation that is unaware of the exact (BNF) definition of an extension header cannot know whether a comma in its value is intended as just another byte in the value or as a value separator. Further, it cannot know whether any double-quote character in the value is intended as just another byte in the value or as a quoting character, which could, but does not necessarily, imply that any subsequent comma should be interpreted differently. The same goes for any other character (for example single quote, angle or square brackets, etc) that may or may not be used as a quoting character by the extension. In fact, since it's impossible to know _what_ characters an extension may define for quoting, it's completely impossible to even guess at how to interpret a comma.
In short, an unaware implementation cannot be expected to interpret the values of extension headers. The only safe way to define an API is to treat the rest of the (possibly folded) line as a single value.
This does not imply that multiple values are not possible. For example, an unaware proxy that forwards the header unaltered will of course not know whether there is one or more values, but if the ultimately receiving UA is aware of the extension, it will still correctly interpret the contents as one or more values. The important thing here is for the unaware proxy to forward the header _unaltered_!
I also believe that you are correct in your interpretation of the BNF. From a strict BNF perspective, it indeed allows multiple To, From and other well-known single-value-only headers. To maintain a level of simplicity in the BNF, many such semantic rules are instead defined in the text of the RFC. Thus, the argument that the BNF allows multiple instances of "extension-header" is worthless. For the record, it would be virtually impossible to define a BNF that allows for an arbitrary number of (different) extension headers but limits the number of each header to one, at least without enumerating all possible extension header names (which, of course, would be an infinitely long list given that there are no length restrictions on such names). Further, it would be impossible to define it such that some extension headers would be allowed in multiple instances but others would not (which, as we've seen in _actual_ extensions, is a perfectly legitimate requirement). The only reasonable way is to keep the BNF "flexible" in this sense, and, as you point out, define for each extension, as it is defined in its own RFC, whether it's allowed one or more times, and, in the latter case, whether comma separation is allowed.
In conclusion, an unaware implementation cannot possibly know whether to allow an extension header name only once or more than one time, even on separate lines. Again, the only safe bet is to be tolerant and assume that whoever sent the extension knows what they're doing and forward any and all such extension headers, again unaltered (and to make them available on the API, accessible as one header instance per line, with a single value per such instance). It has to be left to a node (another proxy or the ultimate UA) that _is_ aware of the extension to judge whether the occurrence of one or more header instances, on a single line with comma separation or on multiple lines, or indeed on a combination of both, makes the SIP message valid.
On a final note: The above outline allows perfectly well for an application built on top of the unaware API to add awareness of the extension. It will just have to parse each "single" value given from the API according to the specific rules of the extension, to see if that "single" value needs to be "split" into several values. Further, the application may use its extension awareness to judge whether additional values on separate lines are permissible. The application, with its additional extension awareness, can do this but the API implementor/provider could never do it correctly. The API implementation would have to choose either to always split at commas or never to do it, and either to always allow multiple separate lines or never to do it. Regardless of which path is taken, some (possibly not yet defined) extensions would be handled the wrong way.
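The layering described above can be sketched in Java as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the splitting rule (commas separate values except inside double quotes, with backslash escapes inside quotes) belongs to an imagined extension, not to any real header definition. The "unaware" layer unfolds the line and returns the rest of it as one value; the "aware" layer then applies the extension's own rule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not SailFin code.
class ExtensionAwareSplitter {

    // Unaware layer: unfold continuation lines (CRLF followed by
    // whitespace becomes one space) and return everything after the
    // first colon as a single, otherwise untouched value.
    static String rawValue(String headerLine) {
        String unfolded = headerLine.replaceAll("\\r?\\n[ \\t]+", " ");
        int colon = unfolded.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("Not a header line: " + headerLine);
        }
        return unfolded.substring(colon + 1).trim();
    }

    // Aware layer for an imagined extension: split at commas that are
    // outside double quotes; a backslash inside quotes escapes the next
    // character. Quotes are kept in the returned values.
    static List<String> splitOutsideQuotes(String value) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (inQuotes && c == '\\' && i + 1 < value.length()) {
                current.append(c).append(value.charAt(++i));
            } else if (c == '"') {
                inQuotes = !inQuotes;
                current.append(c);
            } else if (c == ',' && !inQuotes) {
                parts.add(current.toString().trim());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString().trim());
        return parts;
    }
}
```

For the line `My-Header: "Foo,Bar", Baz`, the unaware layer hands back `"Foo,Bar", Baz` as one value, and only the aware layer decides that it holds two values. An extension with a different quoting rule would need a different second step, which is why the API itself cannot do the split.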
Taking the spirit of the above explanation, SailFin treats extension-headers as multi-valued headers by default and provides a configurable mechanism to specify the headers for which a comma should be taken as part of the value.
The system property org.glassfish.sip.commaisnotaseperator can be set to a comma-separated list of header names. For example, with -Dorg.glassfish.sip.commaisnotaseperator=Header1,Header2, both Header1 and Header2 are interpreted as single-valued headers.
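To illustrate how such a setting could drive parsing, here is a minimal Java sketch. It is not SailFin's implementation: the class and method names are hypothetical, and the case-insensitive matching of header names is an assumption (made because SIP header names are case-insensitive). Only the property name itself comes from the text above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch, not SailFin code.
class CommaSeparatorConfig {

    // Property name as given in the post (spelling included).
    static final String PROPERTY = "org.glassfish.sip.commaisnotaseperator";

    // Turn "Header1,Header2" into a set of lower-cased header names.
    // A null or empty property value yields an empty set.
    static Set<String> parseHeaderNames(String propertyValue) {
        Set<String> names = new HashSet<>();
        if (propertyValue == null) {
            return names;
        }
        for (String name : propertyValue.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                names.add(trimmed.toLowerCase(Locale.ROOT));
            }
        }
        return names;
    }

    // Default: split at commas (multi-valued). Headers listed in the
    // property keep the comma as part of their single value.
    static List<String> values(String headerName, String rawValue,
                               Set<String> commaIsNotSeparator) {
        if (commaIsNotSeparator.contains(headerName.toLowerCase(Locale.ROOT))) {
            return Collections.singletonList(rawValue.trim());
        }
        List<String> result = new ArrayList<>();
        for (String part : rawValue.split(",")) {
            result.add(part.trim());
        }
        return result;
    }
}
```

In this sketch the set would be built once at startup with `CommaSeparatorConfig.parseHeaderNames(System.getProperty(CommaSeparatorConfig.PROPERTY))` and then consulted for each extension header as it is read.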