Wednesday Apr 01, 2009

Setting DPS As Replication Hub - Part 1: a simple tut'


    There may be cases where you would like to keep two environments up to date with the same data, but no replication or synchronization solution fits your particular needs. One example that comes to mind is migrating away from a legacy LDAP server (RACF, OiD, Sun DS 5...) to OpenDS. After initializing your new data store with the contents of the former one, and without a synchronization mechanism, you would have to switch to the new data store right away. That would not be acceptable in production: for one thing, importing the data might take longer than the maintenance window, and more importantly, should something unexpected happen, all real-life deployments want to preserve the option of rolling back to the legacy system (which has proved to work in the past, even if its performance or functionality could use a dust-off).

Enter the DPS "replication" distribution algorithm. The idea is quite simple: route reads to a single data store and duplicate writes across all data stores. I use the term data store here because it need not be LDAP only: any SQL database that has a JDBC driver can be replicated to as well. For this tutorial though, I will use two LDAP stores. We will see a MySQL example in Part 2.
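
As a rough mental model of that routing rule (this is not DPS code; the class and method names are made up for illustration, and plain dictionaries stand in for the two data stores), here is a minimal Python sketch:

```python
class ReplicatingProxy:
    """Toy model: reads go to one store, writes go to every store."""

    def __init__(self, stores, read_store):
        # stores maps a store name to a dict of {dn: entry}
        self.stores = stores
        self.read_store = read_store

    def search(self, dn):
        # Reads are served by the single designated store only.
        return self.stores[self.read_store].get(dn)

    def write(self, dn, entry):
        # Writes are duplicated to every store.
        for store in self.stores.values():
            store[dn] = dict(entry)


store_a, store_b = {}, {}
proxy = ReplicatingProxy({"A": store_a, "B": store_b}, read_store="A")
proxy.write("uid=jdoe,dc=example,dc=com", {"cn": "John Doe"})
```

After the write, both dictionaries hold the entry, while a search only ever consults Store A.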

Bird's Eye View

    Unlike load balancing and failover algorithms, which work across data sources in the same pool, distribution algorithms work across data views. A distribution algorithm is a way to pick the appropriate data view among the eligible data views to process a given client request. In this tutorial, I will show how the "replication" distribution algorithm lets you duplicate write traffic across two distinct data sources.

In the graph below, you can see how this is structured in DPS configuration.

The Meat

We will assume here that we have two existing LDAP servers running locally and serving the same suffix dc=example,dc=com:

  1. Store A: dsA on port 1389
  2. Store B: dsB on port 2389

Let's first go about the mundane task of setting up both stores in DPS:
    For Store A:

#dpconf create-ldap-data-source dsA localhost:1389
#dpconf create-ldap-data-source-pool poolA
#dpconf attach-ldap-data-source poolA dsA
#dpconf set-attached-ldap-data-source-prop poolA dsA add-weight:1 bind-weight:1 delete-weight:1 modify-weight:1 search-weight:1
#dpconf create-ldap-data-view viewA poolA dc=example,dc=com

    For Store B:

#dpconf create-ldap-data-source dsB localhost:2389
#dpconf create-ldap-data-source-pool poolB
#dpconf attach-ldap-data-source poolB dsB
#dpconf set-attached-ldap-data-source-prop poolB dsB add-weight:1 bind-weight:1 delete-weight:1 modify-weight:1 search-weight:1
#dpconf create-ldap-data-view viewB poolB dc=example,dc=com

    Now, the distribution algorithm must be set to replication on both data views:

#dpconf set-ldap-data-view-prop viewA distribution-algorithm:replication replication-role:master
#dpconf set-ldap-data-view-prop viewB distribution-algorithm:replication replication-role:master

  And finally, the catch:

    When you use dpconf to set the replication-role property to master, it effectively writes distributionDataViewType as a single-valued attribute in the data view configuration entry, when in reality the schema allows it to be multi-valued. To see that for yourself, simply do:

#ldapsearch -p <your DPS port> -D "cn=proxy manager" -w password -b "cn=config" "(cn=viewA)"
version: 1
dn: cn=viewA,cn=data views,cn=config
dataSourcePool: poolA
viewBase: dc=example,dc=com
objectClass: top
objectClass: configEntry
objectClass: dataView
objectClass: ldapDataView
cn: viewA
viewAlternateSearchBase: ""
viewAlternateSearchBase: "dc=com"
distributionDataViewType: write

and then try to issue the following command:

#dpconf set-ldap-data-view-prop viewA replication-role+:consumer
The property "replication-role" cannot contain multiple values.
XXX exception-syntax-prop-add-val-invalid


...or just take my word for it. 

The issue is that in order for DPS to process read traffic (bind, search, etc...), one data view needs to have the consumer role, but for replication to work across data views, all of them must have the master role as well. dpconf cannot assign both roles to the same view, which is why you will need to issue the following command on one (and only one) data view:

#ldapmodify -p <your DPS port> -D "cn=proxy manager" -w password
dn: cn=viewA,cn=data views,cn=config
changetype: modify
add: distributionDataViewType
distributionDataViewType: read

That's it!
Wasn't all that hard except it took some insider's knowledge, and now you have it.
Your search traffic will always go to Store A and all write traffic will get duplicated across Store A and B.
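
To make the effect of that extra attribute value concrete, here is a deliberately simplified Python sketch. It assumes that a view is eligible for an operation when its distributionDataViewType values include the matching type; the function name and the operation sets are hypothetical and only model the behaviour described above, not the actual DPS selection code.

```python
READ_OPS = {"bind", "search"}
WRITE_OPS = {"add", "delete", "modify"}


def eligible_views(operation, views):
    """Return the names of the views allowed to process an operation.

    views maps a view name to the set of its distributionDataViewType values.
    """
    if operation in READ_OPS:
        needed = "read"
    elif operation in WRITE_OPS:
        needed = "write"
    else:
        raise ValueError("operation not covered by this toy model")
    return sorted(name for name, types in views.items() if needed in types)


# What dpconf alone produces: both views carry only the "write" value.
views = {"viewA": {"write"}, "viewB": {"write"}}
before = eligible_views("search", views)

# After the ldapmodify above adds the "read" value to viewA.
views["viewA"].add("read")
after_read = eligible_views("search", views)
after_write = eligible_views("modify", views)
```

In this model no view can serve a search before the change; afterwards searches land on viewA only, while a modify is still eligible for both views.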


Note that while this is very useful in a number of situations where nothing else will work, it should only be used for transitions, as there are a number of caveats.
DPS does not store any historical information about traffic, so if one of the underlying stores suffers an outage, the contents of the data stores may diverge. This is especially true since this mode tends to be used precisely where no synchronization solution is available to catch up after an outage.

Store A and Store B will end up out of sync if:

  • either store goes offline
  • either store is unwilling to perform because the machine is outpaced by traffic
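
The first case can be illustrated with a small Python simulation. It is only a sketch under the assumption stated above, namely that writes missed by an unavailable store are never replayed; the function and variable names are invented for the example.

```python
def replicate_write(stores, online, dn, entry):
    """Apply a write to every store that is currently online.

    Nothing is queued for offline stores, mirroring the fact that the
    proxy keeps no history of the traffic it has forwarded.
    """
    for name, store in stores.items():
        if name in online:
            store[dn] = dict(entry)


stores = {"A": {}, "B": {}}
replicate_write(stores, {"A", "B"}, "uid=1,dc=example,dc=com", {"cn": "one"})
# Store B goes offline; the next write only reaches Store A.
replicate_write(stores, {"A"}, "uid=2,dc=example,dc=com", {"cn": "two"})
# Store B comes back, but the missed write is never replayed.
replicate_write(stores, {"A", "B"}, "uid=3,dc=example,dc=com", {"cn": "three"})

missing_on_b = sorted(set(stores["A"]) - set(stores["B"]))
```

Here missing_on_b ends up containing the one entry written while Store B was down, and nothing in the proxy will ever repair that gap.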
