Minimising S2S traffic in XMPP - elaboration, clarifications and corrections

My previous post outlined a brief idea; I will try to flesh it out a bit more in this post.
The main areas of focus are:
  1. What are these 'named lists'? How do you use them? What are their required properties?
  2. When would it be a good idea to use this proposal? When not?
  3. What can be optimised this way?
  4. Some examples of how to implement it, goddamnit!

Hopefully this post will address my thoughts in the same order in which I list them above.

Named lists

Named lists are a construct loosely based on XEP 33.
That is, the idea came to me through that spec, but they don't really share much with it, except that I believe XEP 33 could logically include this as an extension.
In this proposal, the main properties of named lists are:
  1. Each named list has a 'sender jid' and a list of 'n' recipient jids.
  2. Named lists can be removed at any point in time by the hosting server, so remote servers cannot make assumptions about a list's existence.
  3. For an xml stanza received for a named list, the hosting server generates 'n' stanzas, with 'from' set to the 'sender jid' and 'to' set to the ith recipient in the list, and then processes each of these stanzas as though it had been received individually from the remote server.
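
Property 3 can be sketched in a few lines of Python. This is only an illustration of the expansion step, not part of the proposal itself: the function name fan_out is made up, and a real server would do this inside its routing layer rather than with ElementTree.

```python
import copy
import xml.etree.ElementTree as ET


def fan_out(inner_stanza_xml, sender_jid, recipient_jids):
    """Expand one list-addressed stanza into one stanza per recipient.

    fan_out is a hypothetical helper name used only for this sketch.
    """
    template = ET.fromstring(inner_stanza_xml)
    stanzas = []
    for jid in recipient_jids:
        # Copy the payload so that every recipient gets an independent stanza.
        stanza = copy.deepcopy(template)
        stanza.set("from", sender_jid)  # 'from' is the list's sender jid
        stanza.set("to", jid)           # 'to' is the ith recipient
        stanzas.append(stanza)
    return stanzas


expanded = fan_out(
    "<presence><show>away</show></presence>",
    "userA00@domainA/resource",
    ["userB00@domainB", "userB01@domainB"],
)
for stanza in expanded:
    print(ET.tostring(stanza, encoding="unicode"))
```

Each generated stanza would then be handed to the normal inbound processing path, exactly as if it had arrived over the S2S stream on its own.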
Now, given these, we have the following cases:
  • serverA sends a stanza to named list 'routing_list' on serverB and the list exists: the stanzas are delivered -- ideal case, maximum savings!
  • serverA sends a stanza to named list 'routing_list' on serverB and the list does not exist: an error is returned -- serverA either creates a new list or falls back on the more traditional approach to stanza delivery (say, because it finds that serverB is a bit too aggressive in list cleanup and decides against using lists).
In both of these cases, serverA should receive some sort of acknowledgement that the stanza was either received or that an error was encountered.
In XMPP, the request-response pattern is modelled using 'iq' stanzas, so my current thought is to use that for this purpose.
The general structure will be:

  • serverA sending stanza to serverB.
<iq from='domainA' to='named_list@domainB' type='set' id='someId'>
  <!-- a single xml stanza to be delivered to the list, without from and to attributes -->
</iq>

  • Failure at serverB to deliver this stanza.
If this fails for whatever reason, we will have:

<iq from='named_list@domainB' to='domainA' type='error' id='someId' />

The reason is not important; what matters is that it failed and that the sending server should retry.
It could be because the list was removed, or because there was some other error - bottom line: can't deliver, so retry.
Typically, this will trigger serverA to either create a new list and retry against that list name, or stop using lists altogether and fall back on 'traditional' methods (an implementation detail).
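
A rough sketch of how serverA might handle this, in Python. Everything here is an assumption made for illustration: the class name, the three injected callables, and the max_recreates limit are invented, and list creation is shown as a synchronous call even though in practice it is an iq round trip.

```python
class NamedListSender:
    """Sender-side (serverA) bookkeeping for stanzas sent to a named list.

    Hypothetical sketch: the callables are supplied by the caller and stand
    in for whatever the real server uses to write to the S2S stream.
    """

    def __init__(self, recipients, send_to_list, send_direct, create_list,
                 max_recreates=1):
        self.recipients = recipients
        self.send_to_list = send_to_list  # (list_jid, iq_id, stanza) -> None
        self.send_direct = send_direct    # (recipient_jid, stanza) -> None
        self.create_list = create_list    # () -> list jid, or None on refusal
        self.max_recreates = max_recreates
        self.recreates = 0
        self.list_jid = None
        self.pending = {}                 # iq id -> stanza awaiting a reply
        self.next_id = 0

    def send(self, stanza):
        if self.list_jid is None:
            # No usable list: traditional delivery, one stanza per recipient.
            for jid in self.recipients:
                self.send_direct(jid, stanza)
            return
        self.next_id += 1
        iq_id = "id%d" % self.next_id
        # Hold on to the stanza until serverB answers with result or error.
        self.pending[iq_id] = stanza
        self.send_to_list(self.list_jid, iq_id, stanza)

    def on_result(self, iq_id):
        # Delivered to the list: nothing left to retransmit.
        self.pending.pop(iq_id, None)

    def on_error(self, iq_id):
        stanza = self.pending.pop(iq_id, None)
        if stanza is None:
            return
        # The list is unusable; recreate it a bounded number of times,
        # otherwise give up on lists and deliver directly.
        self.list_jid = None
        if self.recreates < self.max_recreates:
            self.recreates += 1
            self.list_jid = self.create_list()
        self.send(stanza)
```

Bounding the number of recreations is one way to cope with a serverB that cleans up lists too aggressively; the actual policy is up to the implementation.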

Note: Here, I have removed the constraint on error responses that I placed yesterday, namely that the response MUST contain the actual stanza to be delivered.
This means that, until the iq response comes back from serverB, serverA will need to hold on to that stanza for retransmission purposes.
I removed this MUST requirement because we can't enforce constraints on the xml stanza contained in the error response, nor can we validate it if the remote server is misbehaving for whatever reason.
This will also reduce the S2S traffic payload.
  • Successful delivery
<iq from='named_list@domainB' to='domainA' type='result' id='someId' />

This indicates that the delivery was successful.
Note that it is delivery to the list that was successful - that is, the named list existed when the stanza was received and serverB could process the stanza for that list.
There might be errors while processing the actual generated stanzas later on - we are not concerned about that now.


  • Creating a named list.
Now that we have established 'how' serverA uses a named list on serverB, let us look at how it can create one.
As stated initially, list creation is a simple enough stanza.

<iq from='domainA' to='domainB' type='set' id='someId'>
  <create_list xmlns='sun:im:namedlist' sender='userA00@domainA'>
    <j>userB00@domainB</j>
    <j>userB01@domainB</j>
    <j>userB02@domainB</j>
    <j>userB03@domainB</j>
    <j>userB04@domainB</j>
    <j>userB05@domainB</j>
    <j>userB06@domainB</j>
    <j>userB07@domainB</j>
    <j>userB08@domainB</j>
    <j>userB09@domainB</j>
  </create_list>
</iq>

That is, we specify the sender JID and the list of jids that form the list.
The response will either be a success (<iq from='domainB' to='domainA' type='result' id='someId' name='named_list@domainB'/>) or an error (<iq from='domainB' to='domainA' type='error' id='someId' />). On error, don't retry; fall back on 'traditional methods' of delivery, such as the methods currently defined in XEP 33 or directed delivery.
Note:
  1. The name of the list is assigned by serverB, not serverA, and serverA MUST NOT associate any meaning with it other than as a jid (that is, no attempts to encode/decode info from the node/resource, etc.: both are opaque to serverA).
  2. serverB CANNOT control the participants of a list - it MUST either create a list with all the jids specified by serverA or return an error (invalid jid, access denied to a jid, other policy constraints, etc.).

  • Removing a named list.
In case the endpoint for which serverA was maintaining this routing info on serverB does not need it anymore, serverA can request serverB to remove the list.

<iq from='domainA' to='domainB' type='set' id='someId'>
  <remove_list xmlns='sun:im:namedlist' sender='userA00@domainA' />
</iq>

The response to this stanza is not really relevant to serverA: it will be an error if the list was already removed, or a result if removal succeeds. In either case, there is nothing serverA can or must do - both responses essentially mean that the named list is no longer present on serverB.
It is a MUST requirement that serverB periodically remove 'old' lists after some internal timeout - so even if serverA 'forgets' to remove a list, serverB MUST do its cleanup.

Note that, even if serverA does not request an explicit list removal, serverB is free to kick a named list out at any point in time without notifying serverA (as part of its cleanup).
It is expected that lists are removed only after a reasonable timeout - but it is still purely at the discretion of the list hosting server.
Similarly, serverB could refuse creation of a list without giving any reason - serverA MUST have an alternate mechanism to deliver stanzas (the current, traditional approach).
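
On the hosting side, the create / lookup / remove / cleanup behaviour could look roughly like the Python sketch below. The class name, the 'listN' naming scheme, the idle timeout and the idea of refreshing the timer on every use are all my assumptions for illustration; the proposal only requires that serverB assigns the name and eventually removes old lists. Removal is keyed by list name here purely to keep the sketch short.

```python
import itertools
import time


class NamedListRegistry:
    """Host-side (serverB) store of named lists with idle-timeout cleanup.

    Hypothetical sketch: names, timeout policy and API are assumptions.
    """

    def __init__(self, domain, ttl_seconds=300.0, clock=time.monotonic):
        self.domain = domain
        self.ttl = ttl_seconds
        self.clock = clock
        self.lists = {}  # list jid -> (sender jid, recipients, last-used time)
        self.counter = itertools.count(1)

    def create(self, sender_jid, recipient_jids):
        # The name is chosen by the hosting server and is opaque to the creator.
        name = "list%d@%s" % (next(self.counter), self.domain)
        self.lists[name] = (sender_jid, tuple(recipient_jids), self.clock())
        return name

    def lookup(self, name):
        entry = self.lists.get(name)
        if entry is None:
            return None  # the caller answers with an iq of type 'error'
        sender, recipients, _ = entry
        # Using a list refreshes its idle timer.
        self.lists[name] = (sender, recipients, self.clock())
        return sender, recipients

    def remove(self, name):
        # True if the list existed (iq result), False otherwise (iq error).
        return self.lists.pop(name, None) is not None

    def expire(self):
        # Periodic cleanup: drop lists that have been idle longer than the ttl.
        now = self.clock()
        stale = [n for n, (_, _, used) in self.lists.items()
                 if now - used > self.ttl]
        for n in stale:
            del self.lists[n]
        return stale
```

Note that nothing in expire() notifies the list creator; serverA only finds out when its next stanza to the list comes back as an error.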

  • How do you advertise this?

As of now, my thought is to advertise this as a stream feature.
The way I look at it, this is a basic enhancement to stream routing - so servers will advertise it as a stream feature, and named lists MUST be used only if this stream feature has been successfully enabled.
Of course, it need not be enabled in both directions - serverB might expose and allow it (so serverA can use it), but not vice versa.

When to use?

A few things are obvious:
  • Use this approach when the number of recipients on serverB is above some minimum (an implementation detail of serverA - but obviously more than 1 :-) ).
  • Use it when you expect to use the list frequently enough - or at least often enough to justify the cost of list creation.
  • Presence broadcasts at the start of a session would be a good use case: you have at least two stanzas to send - one directed presence and a probe - so the cost is 'recovered'.
  • Lists can, and MUST, go out of scope - list hosting implementations MUST NOT depend on list creators to remove a list explicitly, and removals MUST NOT be notified to the list creator.
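
To make the 'cost is recovered' argument concrete, here is a crude back-of-the-envelope count in Python. It counts stanzas crossing the S2S link only, ignores stanza sizes (the creation stanza grows with the number of recipients) and ignores traffic that is identical in both approaches, such as the replies to probes. The function names are made up for this sketch.

```python
def stanzas_without_list(n_recipients, n_sends):
    # Traditional delivery: every send is repeated once per recipient.
    return n_recipients * n_sends


def stanzas_with_list(n_sends):
    # One create request plus its response, then one iq and one
    # result/error acknowledgement per send.
    return 2 + 2 * n_sends


def list_pays_off(n_recipients, n_sends):
    return stanzas_with_list(n_sends) < stanzas_without_list(n_recipients, n_sends)


# The session-start example: 10 contacts on serverB, presence plus probe.
print(stanzas_without_list(10, 2), stanzas_with_list(2), list_pays_off(10, 2))
```

Under this simple count, the session-start case with 10 recipients costs 6 stanzas with a list against 20 without, while a single send to 2 recipients is cheaper without a list.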

What can be optimised?

A rough list would be:
  1. Presence information - both broadcasts and probes.
  2. Multicasting messages: in xmpp, this would typically mean MUC.
  3. All other recurring use cases mentioned in XEP 33.
The MUC use case can become tricky and is implementation dependent - but the basic idea is that the number of messages sent should be higher than the number of list (re)creations (when users on a remote server join or leave). It also requires tighter coupling between the server and the MUC component.

An example.

Let us consider the same example as yesterday - but this time, we look at the packets too!

When userA00 comes online, serverA does not have a named list associated with userA00's contacts on serverB who should receive his presence updates.
Hence, serverA creates that first.

serverA:>

<iq from='domainA' to='domainB' type='set' id='someId1'>
  <create_list xmlns='sun:im:namedlist' sender='userA00@domainA/resource'>
    <j>userB00@domainB</j>
    <j>userB01@domainB</j>
    <j>userB02@domainB</j>
    <j>userB03@domainB</j>
    <j>userB04@domainB</j>
    <j>userB05@domainB</j>
    <j>userB06@domainB</j>
    <j>userB07@domainB</j>
    <j>userB08@domainB</j>
    <j>userB09@domainB</j>
  </create_list>
</iq>

serverB:>
<iq from='domainB' to='domainA' type='result' id='someId1' name='named_list@domainB' />

Now serverA will send the directed presence and probe to serverB.

serverA:>
<iq from='domainA' to='named_list@domainB' type='set' id='someId2'>
  <presence />
</iq>

<iq from='domainA' to='named_list@domainB' type='set' id='someId3'>
  <presence type='probe'/>
</iq>


serverB:>
<iq from='named_list@domainB' to='domainA' type='result' id='someId2' />
<iq from='named_list@domainB' to='domainA' type='result' id='someId3' />

Here I am assuming that the list of users who are subscribed to userA00's presence and the list of users to whose presence userA00 is subscribed are the same - which was a constraint in our scenario.
If they are not the same (for example, when privacy rules are applied), you will end up creating two lists.

For each of the stanzas dispatched to the list, serverB ends up creating these stanzas and processes them as though serverA had sent them directly.

<presence from='userA00@domainA/resource' to='userB00@domainB'/>
<presence from='userA00@domainA/resource' to='userB01@domainB'/>
<presence from='userA00@domainA/resource' to='userB02@domainB'/>
<presence from='userA00@domainA/resource' to='userB03@domainB'/>
<presence from='userA00@domainA/resource' to='userB04@domainB'/>
<presence from='userA00@domainA/resource' to='userB05@domainB'/>
<presence from='userA00@domainA/resource' to='userB06@domainB'/>
<presence from='userA00@domainA/resource' to='userB07@domainB'/>
<presence from='userA00@domainA/resource' to='userB08@domainB'/>
<presence from='userA00@domainA/resource' to='userB09@domainB'/>

Similarly for the probe.
serverB now responds to serverA for the probe requests as though each had been sent individually by serverA.

Let us consider a subsequent presence push, by which time serverB has already removed the list.

serverA:>
<iq from='domainA' to='named_list@domainB' type='set' id='someId4'>
  <presence xml:lang='en'>
    <show>away</show>
    <status>be right back</status>
  </presence>
</iq>

(Note again - no from or to!)

serverB:>
<iq from='named_list@domainB' to='domainA' type='error' id='someId4' />

serverA can now either fall back on the current approach of sending the stanza individually to the recipients (serverA always knows who the recipients, i.e. the participants in the list, are!), or it can recreate the list as above and retry.


Hope this clarifies the proposal a bit more ....
Updates:
  1. The careful reader will notice that the way I am encapsulating a stanza to be sent to a list can result in a schema violation. The fix? Have a wrapper element 'x' in a custom namespace 'ns' - this element just gets discarded and is present only to be conformant with the schema. The mashup above is illustrative, not normative or formal :-)
  2. I do mention it in this post, but let me put it explicitly here - if the presence-out and presence-in lists are different (privacy policy, asymmetrical rosters, etc.) you just end up creating different lists: and if the overhead is deemed high, just don't create a list! There is nothing forcing a server to use this approach in all cases! It should be noted, though, that these are somewhat corner use cases ... so the benefit to a server hosting a large number of users who use it in a 'normal' way will be high enough.
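
For update 1, a small Python sketch of the wrap / discard idea. The wrapper element name 'x' is the placeholder from the update; reusing the 'sun:im:namedlist' namespace for it is purely my assumption for this illustration, as are the helper names wrap and unwrap.

```python
import xml.etree.ElementTree as ET

# Assumption: the wrapper lives in the same namespace as create_list.
WRAPPER_NS = "sun:im:namedlist"
WRAPPER_TAG = "{%s}x" % WRAPPER_NS


def wrap(list_jid, from_domain, iq_id, inner_xml):
    """Build the iq sent to a named list, with the payload inside a wrapper."""
    iq = ET.Element("iq", {"from": from_domain, "to": list_jid,
                           "type": "set", "id": iq_id})
    wrapper = ET.SubElement(iq, WRAPPER_TAG)
    wrapper.append(ET.fromstring(inner_xml))
    return iq


def unwrap(iq):
    """Discard the wrapper and return the single stanza it carries."""
    wrapper = iq.find(WRAPPER_TAG)
    if wrapper is None or len(wrapper) != 1:
        raise ValueError("expected exactly one wrapped stanza")
    return wrapper[0]


iq = wrap("named_list@domainB", "domainA", "someId3", "<presence type='probe'/>")
wire = ET.tostring(iq, encoding="unicode")
inner = unwrap(ET.fromstring(wire))
print(inner.tag, inner.get("type"))
```

The hosting server would call something like unwrap() first and then expand the inner stanza for each recipient in the list.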
Comments:

I believe we do not need to create a new protocol to do what you describe for presence. Most of it makes sense, except for some assumptions in the use cases. In effect, presence notifications and probes are two very different processes, and should be treated separately. You may want to read http://antecipate.blogspot.com/2006/11/streamlining-remote-notifications.html where a solution is presented that does not require any new protocol extension to be developed.

Posted by Jean-Louis Seguineau on November 09, 2006 at 06:59 PM IST #

I did see your blog entry (no comments enabled? :-) ) - hence the clarification as an update to the post. As a small footnote to sending presence probes in the example section, I do mention the possibility of both lists being different - should have made it a bit more obvious I guess! By the way, presence was an example since it is so easy to identify with as a server developer ... a large chunk of the traffic tends to be that - but you could very well use it for other forms of multicasting too.

Posted by Mridul on November 09, 2006 at 07:12 PM IST #
