Minimising S2S traffic in XMPP

Please refer to the next post for further clarifications and corrections to this proposal.

XMPP has federation built into the protocol; it is not an afterthought retrofitted in.
Even though this allows building a mesh of trusted, federated servers, it does have scaling issues when it comes to handling traffic.
Let us look at a typical issue and how we are thinking of solving it (since there is no standard way to 'fix' this right now).
Of course, this would be an extension for now, and so would work only on/between Sun servers if/when we end up supporting it.

A simple problem statement

The scenario :

The scenario is slightly contrived :-) but intentionally so, to illustrate the problem.
Consider the following 'configuration'.
  • serverA and serverB serve domainA and domainB respectively.
  • serverA has users userA00 to userA09, and serverB has a similar user set userB00 to userB09 : twenty users in all.
  • Additionally, all users on serverA are subscribed to each other and to the users on serverB, and similarly for serverB users : that is, every user is subscribed to every other user.
Now consider this runtime situation :
  1. Users userA08, userA09 (domainA on serverA) and users userB08, userB09 (domainB on serverB) are online - all others are offline.
  2. User userA00 connects.
Let us analyze what happens when userA00 sends its initial available presence to serverA (a single XML stanza). Today, this results in the following traffic between the various entities.
Local traffic
  • serverA sends the presence of userA00 to userA08 and userA09.
  • serverA sends the presence of all local users (userA* other than userA00) to userA00 : 9 stanzas in all.
As you can see, this is as optimal as it can get : info is sent only to the subscribers who need it, and only info about subscribers who are available is sent. Now let us consider the traffic over the federated link (over S2S).
S2S traffic
  • For each user hosted on serverB that is in userA00's roster, serverA sends a directed presence on behalf of userA00 over the S2S link to the bare JID of the contact on serverB : that is, to each of userB00 to userB09 (10 stanzas).
  • For each user hosted on serverB that is in userA00's roster, serverA sends a presence probe on behalf of userA00 over the S2S link to query for that contact's presence : again, to each of userB00 to userB09 (10 stanzas).
  • serverB responds to the presence probes with the status of userB00 to userB09 : 10 stanzas in all.
  • serverB sends the presence of userA00 to userB08 and userB09 (not really relevant to this discussion).
If you compare the traffic on the serverA to serverB connection with that on the userA00 to serverA connection, it is immediately evident that it is highly skewed : the traffic between the two servers is suboptimal compared to the traffic within a single server.
You have 30 stanzas over S2S, while the same thing takes only 10 in the user case : that is, 2 * 'number of contacts' more!
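To make the arithmetic above concrete, here is a minimal sketch that tallies the S2S stanzas for one user coming online. The function names are purely illustrative, and the count assumes the worst case described above, where serverB answers every probe.

```python
def s2s_stanzas_today(remote_contacts: int) -> dict[str, int]:
    """Worst-case S2S stanza count when one user comes online.

    Assumption: one presence, one probe and one probe response per
    remote contact, as in the S2S breakdown above.
    """
    counts = {
        "presence": remote_contacts,        # serverA -> serverB, one per contact
        "probe": remote_contacts,           # serverA -> serverB, one per contact
        "probe_response": remote_contacts,  # serverB -> serverA, one per contact
    }
    counts["total"] = sum(counts.values())
    return counts


def per_contact_overhead(remote_contacts: int) -> int:
    # The presence and the probe are each repeated once per remote contact.
    return 2 * remote_contacts


today = s2s_stanzas_today(10)
print(today)  # {'presence': 10, 'probe': 10, 'probe_response': 10, 'total': 30}
print(per_contact_overhead(10))  # 20
```

The cost that grows with the roster is the `2 * remote_contacts` term, which is the part a better scheme would want to turn into a constant.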

Now, to be fair, the reasons for this should be evident.
serverA cannot trust serverB to 'do the right thing' : it cannot trust that the roster of a user on serverA is in sync with the rosters of the users on serverB, etc.
The issues are many, and essentially boil down to who is responsible and accountable to whom : serverA cannot offload its responsibility to serverB, and that is a given.

How can we 'solve' this issue within these restrictions ?
That is :
  1. serverA is accountable for its users and should make a best effort to ensure that there are no presence leaks, and that only contacts who are subscribed to the user's presence according to serverA's roster get the update.
  2. Minimize traffic as much as possible.

A potential solution ?

What we are thinking of doing to solve this problem is to make the following extensions to Extended Stanza Addressing (XEP-0033) :
  • We will not be using the XEP for mailing-list type functionality, but purely for creating ad hoc routing lists.
  • We will be extending it such that we can create a list on a server and reference it by a 'name'.
  • This named list could go out of scope at any time (depending on how long the server keeps it around).
  • In this use case, if the list goes out of scope, serverA will recreate it.
So how does this help ?
Let us consider how we will use these extensions to solve our 'problems' :
How it will work now : S2S traffic
  • serverA will create a named list on serverB, with the owner set to userA00 and the list of 'to' addresses set to all subscribers of userA00 on serverB.
  • Named lists MUST contain only local users when created by a foreign entity (to prevent abuse).
  • serverA will send a single directed presence to this named list.
  • serverA will send a single presence probe to this named list.
  • For each user in this named list, serverB will handle the packet as if it a) had 'from' set to the JID of the list owner and b) was destined to that user.
  • serverB will respond to serverA with 'to' set to the list owner.
In this case, we will have 'list creation overhead' + 2 stanzas (presence and probe) + one response for each JID in the list.
So the traffic comes down to the optimal case + a constant overhead !
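The list handling described above can be sketched roughly as follows. This is only an illustration of the idea, not an implementation of any existing server API : `Stanza`, `ListHost`, `create_list`, `expand` and `s2s_stanzas_with_list` are all hypothetical names, and the list creation overhead is left as a parameter because the proposal does not fix it.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Stanza:
    kind: str    # "presence" or "probe"
    sender: str  # JID the stanza is treated as coming from
    to: str      # bare JID, or the name of a list


class ListHost:
    """Hypothetical serverB-side registry of named routing lists."""

    def __init__(self, domain: str) -> None:
        self.domain = domain
        self.lists: dict[str, tuple[str, tuple[str, ...]]] = {}

    def create_list(self, owner: str, members: list[str]) -> str:
        # A foreign entity may only list users hosted on this domain.
        for jid in members:
            if jid.split("@", 1)[-1] != self.domain:
                raise ValueError(f"{jid} is not local to {self.domain}")
        name = secrets.token_hex(16)  # name is chosen by the host, not the creator
        self.lists[name] = (owner, tuple(members))
        return name

    def expand(self, stanza: Stanza) -> list[Stanza]:
        # One stanza addressed to the list becomes one stanza per member,
        # each handled as if it came from the list owner.
        owner, members = self.lists[stanza.to]
        return [Stanza(stanza.kind, owner, member) for member in members]


def s2s_stanzas_with_list(remote_contacts: int, list_creation_overhead: int) -> int:
    # overhead + 1 presence + 1 probe + one response per JID in the list
    return list_creation_overhead + 2 + remote_contacts


server_b = ListHost("domainB")
contacts = [f"userB{i:02d}@domainB" for i in range(10)]
list_name = server_b.create_list("userA00@domainA", contacts)
delivered = server_b.expand(Stanza("presence", "userA00@domainA", list_name))
print(len(delivered))  # 10
```

With ten remote contacts, the S2S total becomes `list_creation_overhead + 12`, whatever the overhead turns out to be, instead of 30.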
Note :
  1. The actual list name would be assigned by serverB, and would be totally random to prevent collisions : no meaning can be derived from it.
  2. The list can 'expire' at any time, so if serverA sends to a list and finds that it does not exist (serverB returns an error), it MUST recreate the list and send the stanza again to this new list. serverB MUST include the original stanza sent to the nonexistent list in the error response.
  3. Other than the creator of the list (serverA) and the host of the list (serverB), no one else knows the list details, and there should be no way for any other entity to query for this information.
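The expiry rule in note 2 can be sketched as a simple retry on the sending side. Everything here is a stand-in : `RemoteHost` plays the role of serverB, `ListSender` plays the role of serverA, and `ListNotFound` models the error response that carries the original stanza. None of these names come from an existing library.

```python
import secrets


class ListNotFound(Exception):
    """Hypothetical error from the list host; echoes the original stanza."""

    def __init__(self, stanza: str) -> None:
        super().__init__("named list does not exist")
        self.stanza = stanza


class RemoteHost:
    """Stand-in for serverB : lists may be dropped at any time."""

    def __init__(self) -> None:
        self.lists: dict[str, list[str]] = {}
        self.delivered: list[tuple[str, str]] = []

    def create_list(self, members: list[str]) -> str:
        name = secrets.token_urlsafe(16)  # random, carries no meaning
        self.lists[name] = list(members)
        return name

    def send(self, list_name: str, stanza: str) -> None:
        if list_name not in self.lists:
            raise ListNotFound(stanza)
        for member in self.lists[list_name]:
            self.delivered.append((member, stanza))

    def expire(self, list_name: str) -> None:
        self.lists.pop(list_name, None)


class ListSender:
    """Stand-in for serverA : remembers the list name, recreates it on error."""

    def __init__(self, remote: RemoteHost, members: list[str]) -> None:
        self.remote = remote
        self.members = members
        self.list_name = remote.create_list(members)

    def send(self, stanza: str) -> None:
        try:
            self.remote.send(self.list_name, stanza)
        except ListNotFound as err:
            # Rebuild the list from serverA's own roster data, then resend
            # the stanza that came back in the error, once.
            self.list_name = self.remote.create_list(self.members)
            self.remote.send(self.list_name, err.stanza)
```

Because the error carries the original stanza, serverA does not need to keep every outgoing stanza around just in case the list has expired.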
Comments:

This would be quite useful for MUC services with regard to both presence and message as well.

Posted by JD Conley on November 09, 2006 at 05:16 PM IST #

Hi JD, There are two problems with using this for MUC : a) The MUC component should be 'aware' of this feature in the server; if it is a separate component, I am not sure how this can be achieved. b) If users join/leave at a fast enough rate, then the (new) list creation overhead will offset the advantage to some extent. But yes, all in all, if something along these lines gets standardized, it will really help in minimising S2S traffic even while talking to non Sun IM servers ! - Mridul

Posted by Mridul on November 09, 2006 at 06:24 PM IST #
