Tuesday Aug 18, 2009

Dispelling Performance Misconceptions About DPS


Rationale

Recently engaged on a project requiring a Directory Server farm capable of delivering 100,000 searches per second, I was faced with skepticism about the ability of DPS to keep up in such a demanding environment. I had to prove ten times over that our proposed architecture would fit the bill and quantify how much headroom it would provide. Here are some of the things we observed along the way.

Bird's Eye View

 In this article I will quickly go over the theoretical performance of DPS under ideal circumstances. You'll see that the setup is quite far from a real deployment, from the hardware to the way the stack is set up.

The Meat

The results

I won't make you read through the whole thing if you do not want to, so here goes:

Throughput

DPS Throughput

Response Time

DPS Response Time

Setup

The box I'm developing on and running tests on these days is somewhat unusual, so the proper disclaimer is necessary: it is a Core i7 (975 EE) very slightly overclocked to 3.9GHz, fitted with 12GB of 1600MHz DDR3 RAM. It runs OpenSolaris 2009.06 with ZFS. There is a 10,000 rpm 320GB drive and an Intel Extreme 32GB SSD. All in all a nice rig, but not a server either. The closest comparable server would be a beefed-up Sun Fire X2270. Keep in mind that my desktop is single socket but about twice as fast on the CPU clock rate.

To load this, I use SLAMD. One server, 7 clients (2 Sun Fire T5120, 4 Sun Fire X2100, 1 Sun Fire X4150). For the record, this client set has generated north of 300,000 operations per second in other tests I ran in the past. Clients in this test campaign were always loaded at less than 15% CPU, so results are not skewed by CPU pressure on the client side.

On the software side (middleware I should say) we have a Sun Ultra 27 running OpenDS with a null back-end, and my rig running my workspace of DPS 7 plus my little sauce...

Why a null back-end on OpenDS? That is not representative of real life!

Well, quite true, but the whole point here is to push DPS to the limit, not the back-end, the network or any other component of the system. So a null back-end is quite helpful here as it simulates a perfect back-end that responds immediately to any request DPS sends. Quite handy if you come to think of it, because what you get as a result is an overview of the overhead of introducing DPS between your clients and your servers under heavy load. The load is actually all the hardware I had could take: the CPU is almost completely used, with idle time varying frantically between 1% and 7%. Keep in mind as well that DPS runs in a JVM and at these rates, garbage collections are almost constant.

Here is what the setup looks like:

That's all I had for you guys tonight! 

Friday Aug 07, 2009

Taking Throttling To The Next Level

Rationale

After releasing the first version of the throughput throttling, most customers seemed interested in at least kicking the tires and wanted to evaluate it. As it turned out, though, because throttling was choking traffic across the board, it would have required deploying an extra instance of DPS in the topology for the sole purpose of choking traffic down to a level acceptable to the business. While some felt that was simple enough to do, others found it to be a show stopper. I therefore wrote this new plug-in, leveraging the distribution algorithm API to narrow the scope of traffic throttling per data view, bringing a whole new dimension of flexibility to this feature.

Bird's Eye View

This new wad not only provides a new, more flexible throttling facility for your LDAP Proxy Server, it also comes with a CLI that makes it trivial to configure and change your settings on the fly. The README has instructions to get you going in no time, but I will provide a quick overview here as well.

The Meat

First things first, you need to unzip the package, which will give you the following files:

$ find Throttling
Throttling
Throttling/throttleadm
Throttling/Throttling.jar
Throttling/README

As you can see, pretty trim.

The CLI mainly eases three things:

  1. Setting up data views to be able to throttle traffic (throttleadm configure)
  2. Configuring the throughput limits to enforce (per data view and operation type, e.g.: throttleadm throttle root\\ data\\ view mod 200 )
  3. Checking the current configuration (throttleadm list)

Here is the complete usage for this little tool:

$ ./throttleadm --help
Checking for binaries presence in path
.
.
.
.
throttleadm usage:
  throttleadm list
  throttleadm configure <dataViewName>
  throttleadm unconfigure <dataViewName>
  throttleadm choke <dataViewName>
  throttleadm unchoke <dataViewName>
  throttleadm throttle <dataViewName> <operation> <rate>
for example:
  To list the data views configured for throttling
  throttleadm list

  To set up a data view to use throttling
  throttleadm configure root\\ data\\ view

  To set up a data view back to its original state
  throttleadm unconfigure root\\ data\\ view

  To enable throttling on the default data view (provided the data view has been properly configured)
  throttleadm choke root\\ data\\ view

  To disable throttling on the default data view
  throttleadm unchoke root\\ data\\ view

  To change or set the maximum search throughput on the default data view to 20
  throttleadm throttle root\\ data\\ view search 20

Finally, when you change the settings, you can see them enforced right away. In the example below, I initially set the bind throughput limit to 200 per second. The left window has an authrate running, and in the right window, while the authrate is running, I lower the throughput limit to 100 for 4 seconds and then set it back to 200. See how that works below:
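
The original screen capture is not reproduced here, but the commands run in the right window boil down to something like the following (a minimal sketch based on the usage shown above; adjust the data view name and any credentials to your setup):

$ ./throttleadm throttle root\\ data\\ view bind 100
$ sleep 4
$ ./throttleadm throttle root\\ data\\ view bind 200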

And here is a quick snapshot of the CLI output for the throttling status.

 $ ./throttleadm -D cn=dpsadmin -w /path/to/password-file list
Checking for binaries presence in path
Will now take action list with the following environmentals:
-host=localhost -port=7777 -user=cn=dpsadmin -password file=/path/to/password-file
VIEW NAME          - THROTTLED -  ADD - BIND -  CMP -  DEL -  MOD - MDN - SRCH
ds-view            -      true -   12 -  200 -   13 -   14 -    1 -  16 -  112

 That's it !

As usual, don't hesitate to come forth with questions, remarks, requests for enhancements.



Thursday Jun 25, 2009

Speed Up Your Server For Free: Using ZFS To Your Advantage For Access Logs

Rationale

Access logs can cause some performance-focused users some discomfort. The main thing usually making logs a performance hog is the fact that entries must be ordered somehow. In our products, the ordering is chronological. Here is an easy way to alleviate the issue if you're on Solaris and have a spare drive.

Bird's Eye View

The ZFS Intent Log (or ZIL) can be configured on a separate disk to help synchronous write performance.

You will find lots of literature  on the matter out there, including Neil and Brendan's blogs for example.

The Meat

So you heard about all the great benefits you can get with SSDs but don't have one yet (Go get one!) or don't have enough that you can dedicate one to your logs?

Worry not!

All you need to do is create a ramdisk that will be used as the ZIL when we create our access-log-dedicated ZFS pool. Here's how:

$ ramdiskadm -a zil-drive 512m
$ zpool create log-pool c8d1 log /dev/ramdisk/zil-drive

For DPS, all you need to do is:

$ dpconf set-access-log-prop log-file-name:/log-pool/access

It's just as simple for DS, do:

$ dsconf set-log-prop access path:/log-pool/access 

And OpenDS is no more complicated to configure, do:

$ dsconfig -n set-log-publisher-prop --publisher-name "File-Based Access Logger" --set log-file:/log-pool/access

Or use the interactive command; simply do:

$ dsconfig

and follow

  • 20) log publisher
  • 3)  View and edit an existing Log Publisher
  • 1)  File-Based Access Logger
  • 3)  log-file              logs/access
  • 2)  Change the value
  • type /log-pool/access, hit return
  • type "f" to finish and apply
  • restart OpenDS bin/stop-ds;bin/start-ds
  • I know it looks like more work, but the nice thing about dsconfig is that it gives you context and you will get familiar with other aspects of the server


Caveats

    In the rare event that a server configured as described here loses power, the ZIL - being on a ramdisk - will be lost. This does not, however, corrupt the data stored on the disk; upon restart, all you would have to do is add the ZIL on a newly created ramdisk again. This can of course be automated at boot time so that you do not need to do it yourself at every power cycle.
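
If you want to automate that, a rough sketch of a boot-time script simply replaying the two commands above (treat it as an outline: depending on your ZFS version you may also need to clear or replace the stale log device before adding the new one):

#!/bin/sh
# Hypothetical /etc/rc3.d/S99zilramdisk: recreate the volatile ZIL after a power cycle.
ramdiskadm -a zil-drive 512m
zpool add log-pool log /dev/ramdisk/zil-drive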


What Was The Best Performance You Ever Had With OpenDS ?

Rationale

    After discussing the article I posted yesterday with someone, they asked me: "What was the best performance you ever had with OpenDS?" and though I couldn't really answer off the top of my head, I dug in my archives from the last benchmark and found what I think was my best run so far.

Bird's Eye View

    To put it bluntly, about 120,000 operations per second @ <2ms. This was done while I was tuning OpenDS for the 10 million entry benchmark on the Intel Nehalem-based Sun Blade x6270, so I had the whole setup: 10M entries, searches spanning the entire DB, and some of the Java tunings are bleeding edge, as I will detail in the next section.

The Meat

Environment

    As I said earlier, this is the same environment as described in my previous entry except for Java.

Java

    The JVM arguments are as follows: -d64 -server -Xbootclasspath/p:/path/to/SecretSauce.jar -XX:+UseNUMA -XX:LargePageSizeInBytes=2m -XX:+UseCompressedOops -XX:+AggressiveOpts -XX:+UseBiasedLocking -Xms6g -Xmx6g -Xmn4g -XX:+UseConcMarkSweepGC -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=85 -XX:MaxTenuringThreshold=1

    It's all pretty much business as usual but some of them call for explanation:

  • -Xbootclasspath/p:/path/to/SecretSauce.jar: One of our engineers, our lead for OpenDS core actually, has found a significant performance improvement in one of the JVM's core classes. This SecretSauce.jar contains his patched and improved version that overrides the JVM's own at run time. This makes a big difference in lowering GC pause times.
  • -XX:+UseNUMA: this is simply because the Sun Blade x6270 is a NUMA architecture and using this switch tells the JVM to be clever about memory and cache locality.
  • -XX:+UseCompressedOops: This allows you to benefit from the 64-bit JVM's larger heap size (not quite as large as a full 64-bit heap, but bigger than that of the 32-bit JVM) while retaining 32-bit-like performance. The best of both worlds. Works beautifully. And it is being improved ...
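
For reference, on OpenDS these arguments typically go in config/java.properties and get applied with bin/dsjavaproperties (a sketch from memory, so double-check the property name against your version; the SecretSauce jar is obviously specific to this experiment):

# config/java.properties (excerpt)
start-ds.java-args=-d64 -server -Xbootclasspath/p:/path/to/SecretSauce.jar \
 -XX:+UseNUMA -XX:LargePageSizeInBytes=2m -XX:+UseCompressedOops -XX:+AggressiveOpts \
 -XX:+UseBiasedLocking -Xms6g -Xmx6g -Xmn4g -XX:+UseConcMarkSweepGC \
 -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=85 -XX:MaxTenuringThreshold=1

$ bin/dsjavaproperties
$ bin/stop-ds; bin/start-ds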

Results 

Searches Completed
Count Avg/Second Avg/Interval Std Dev Corr Coeff
35428814 116160.046 580800.230 10313.696 -0.037
Search Time (ms)
Total Duration Total Count Avg Duration Avg Count/Interval Std Dev Corr Coeff
60861940 35428857 1.718 580800.934 0.119 0.023

Caveats

    So, now that I have told you all my secrets, you're wondering why I didn't use those settings for the benchmark? Because the benchmark is supposed to give me numbers on what could be achieved in a production environment, and in this case, using our patched JVM core class and a somewhat experimental (or at least relatively new) memory addressing mode of the JVM isn't what I would advise to a customer about to go live.

All these bleeding edge settings only give us a 12% boost overall; I don't think it is worth the risk. But this shows that we are paving the way for ever increasing performance on OpenDS. Tomorrow, these settings will all be well proven and safe for production. Just hang in there.


Wednesday Jun 24, 2009

OpenDS Nehalem Benchmark - Sun Blade x6270

Rationale

Long have we heard that the new Nehalem-based Sun x86 systems would bring a significant performance boost over the AMD Opterons still ruling the land to this day. The whole idea of the test was to see, in the particular case of Directory Services, and even more specifically of OpenDS, how this translated into throughput, response time and all the good things we (meaning the seriously loony LDAP geeks) like to look at...

Bird's Eye View

 On this single blade, OpenDS achieves over 93,000 search operations per second and over 17,000 modification operations per second. Under lighter - but still significant - load, with throughput always above 70,000 ops/sec, OpenDS delivers sub-millisecond response time.

Sounds too good to be true? Then read further...

To sum it up as Benoit did in his post, this would give you, in a fully populated 6000 chassis, the ability to process almost A MILLION REQUESTS PER SECOND in a well integrated, highly available and easily manageable package. It does NOT get any better from any vendor out there as of today.

Special thanks to Benoit Chaffanjon and his team for making this equipment available to us on short notice. Their support, reactivity and in-depth knowledge of all things benchmark is what makes them top-notch and an indispensable component of our success.

The Meat

Maybe you have already heard about Benoit's previous benchmark of DSEE (6.3.1) on Nehalem. If you haven't, read it; it'll give you all the background you need to read these results. I tried to stick as much as I could to his bench, and I think I did a pretty good job at that. The main intentional difference between our two benches is that in his, searches only span 1 million entries out of the 10 million entry database. In mine, searches span the whole 10 million entries. In practice, he's right to do his benchmarks the way he does, as it better reflects how most customers end up consuming data, but mine is more stressful on the system.

Setup

Hardware

Software

Tunings

Hardware

None

Software

Solaris
  • Cap the ZFS ARC size to ( SYSTEM MEMORY \* 0.95 ) - OPENDS JVM HEAP SIZE (see the /etc/system sketch after this list)
  • Disable ZFS cache flush since the storage takes care of that for us and has persistent cache (4GB of NVRAM)
  • Put ZFS ZIL on a dedicated SSD
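
As a sketch, the first two items above translate into /etc/system entries along these lines (the ARC value is an illustration only, compute your own from the formula; both settings require a reboot), while the third is a zpool operation with placeholder pool and device names:

* /etc/system excerpt
* ARC cap = ( SYSTEM MEMORY * 0.95 ) - OPENDS JVM HEAP SIZE, expressed in bytes (example value below)
set zfs:zfs_arc_max = 0x600000000
* the storage has battery-backed NVRAM, so skip the synchronous cache flushes
set zfs:zfs_nocacheflush = 1

$ zpool add datapool log c2t1d0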

Other things to consider doing:

    • use jumbo frames if returning whole entries; YMMV depending on your most frequent access patterns. I haven't tried it this time around for lack of time, but it should be interesting in reducing the network overhead. As we'll see later, OpenDS on this blade can max out a gigabit Ethernet connection.
Java

With very high volumes like we are seeing here, say above 80k ops/sec, you will likely want to bump request handlers and worker threads up a notch to cope with the frenzy of the traffic. When you do so, the 32-bit JVM will quickly become too small no matter what tunings you try. Even though the 64-bit JVM is not as space efficient for cache and other aspects of memory access, it will provide an extremely stable environment for OpenDS even under heavy client traffic. I have been able to attach 10,000 hyper-clients (as in clients continuously sending traffic with no pause between requests) to OpenDS without a problem.

To cut to the chase, the settings:

OpenDS

 Worker Threads       32
 Connection Handlers  16
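
If memory serves, these map to the work queue and the LDAP connection handler in the OpenDS configuration. A hedged sketch of how one might apply them with dsconfig (property names from memory, verify against your version):

$ dsconfig -n set-work-queue-prop --set num-worker-threads:32
$ dsconfig -n set-connection-handler-prop --handler-name "LDAP Connection Handler" --set num-request-handlers:16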


As I have said previously, you may want to dial these values depending on a couple of factors:

  • How many clients you have at peak
  • How quickly your client applications open their connections (bursts or ramped up?)
  • How frantic a client is on each connection in average

If you have 5,000 clients opening 100 connections all at once, you will likely want more connection handlers to cope with the suddenness of the pattern. This will however come at a performance cost (that we have yet to appropriately profile) under more normal circumstances.

If you have few but frantic clients, these values will be about right; you may just want to bump up the number of worker threads a bit. This too is suboptimal under normal circumstances.

Note: regardless of the access pattern, these settings will be adequate to serve whatever load you throw at the server; I'm only pointing out ways to improve performance a bit. In particular, this advice will help keep the request backlog on a leash.

Import

Importing our 10M entries took 14'59", which averages at 11,120 entries per second.
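
For reference, the import itself is a plain import-ldif run, something along these lines (the LDIF file name is a placeholder):

$ bin/stop-ds
$ bin/import-ldif --backendID userRoot --ldifFile /path/to/10M-entries.ldif
$ bin/start-ds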

Search Performance

These tests mainly aim at determining the maximum throughput that can be achieved. As such, they tend to load the servers with an artificially high number of concurrent clients, inflating the response time compared to what can be expected under more normal production conditions. In the last section (Lighter Load), I will show what the response time looks like with lighter loads and lower overall throughput.

 Exact Search

 Return 1 Attribute
Heavy Load, Maximum Throughput 
Actual Duration
1839 seconds (30m 39s)
Searches Completed
Count Avg/Second Avg/Interval Std Dev Corr Coeff
169056808 93660.281 468301.407 5590.951 -0.004
Exceptions Caught
Count Avg/Second Avg/Interval Std Dev Corr Coeff
0 0.000 0.000 0.000 0.000
Entries Returned
Total Avg Value Avg/Second Avg/Interval Std Dev Corr Coeff
169056809 1.000 93660.282 468301.410 0.000 0.000
Search Time (ms)
Total Duration Total Count Avg Duration Avg Count/Interval Std Dev Corr Coeff
450590169 169056809 2.665 468301.410 0.189 -0.006
Lighter Load
Searches Completed
Count Avg/Second Avg/Interval Std Dev Corr Coeff
28143684 92274.374 461371.869 3791.935 -0.040
Exceptions Caught
Count Avg/Second Avg/Interval Std Dev Corr Coeff
0 0.000 0.000 0.000 0.000
Entries Returned
Total Avg Value Avg/Second Avg/Interval Std Dev Corr Coeff
28143684 1.000 92274.374 461371.869 0.000 0.000
Search Time (ms)
Total Duration Total Count Avg Duration Avg Count/Interval Std Dev Corr Coeff
30399915 28143685 1.080 461371.885 0.055 0.023

Return whole entry
Heavy Load, Maximum Throughput 
Actual Duration
1839 seconds (30m 39s)
Searches Completed
Count Avg/Second Avg/Interval Std Dev Corr Coeff
151991059 84205.573 421027.864 5264.386 -0.006
Exceptions Caught
Count Avg/Second Avg/Interval Std Dev Corr Coeff
0 0.000 0.000 0.000 0.000
Entries Returned
Total Avg Value Avg/Second Avg/Interval Std Dev Corr Coeff
151991061 1.000 84205.574 421027.870 0.000 0.000
Search Time (ms)
Total Duration Total Count Avg Duration Avg Count/Interval Std Dev Corr Coeff
360407639 151991065 2.371 421027.881 0.183 0.022

Lighter Load
Searches Completed
Count Avg/Second Avg/Interval Std Dev Corr Coeff
21896817 71792.843 358964.213 4125.281 -0.020
Exceptions Caught
Count Avg/Second Avg/Interval Std Dev Corr Coeff
0 0.000 0.000 0.000 0.000
Entries Returned
Total Avg Value Avg/Second Avg/Interval Std Dev Corr Coeff
21896817 1.000 71792.843 358964.213 0.000 0.000
Search Time (ms)
Total Duration Total Count Avg Duration Avg Count/Interval Std Dev Corr Coeff
15177289 21896817 0.693 358964.213 0.047 0.023

Sub Scope Search

Return 1 Attribute
Heavy load, Maximum Throughput 
Actual Duration
1838 seconds (30m 38s)
Searches Completed
Count Avg/Second Avg/Interval Std Dev Corr Coeff
169252464 93768.678 468843.391 6339.082 -0.012
Exceptions Caught
Count Avg/Second Avg/Interval Std Dev Corr Coeff
0 0.000 0.000 0.000 0.000
Entries Returned
Total Avg Value Avg/Second Avg/Interval Std Dev Corr Coeff
169252464 1.000 93768.678 468843.391 0.000 0.000
Search Time (ms)
Total Duration Total Count Avg Duration Avg Count/Interval Std Dev Corr Coeff
270122894 169252465 1.596 468843.393 0.140 0.022
Lighter Load
Searches Completed
Count Avg/Second Avg/Interval Std Dev Corr Coeff
24902860 81648.721 408243.607 4020.767 -0.011
Exceptions Caught
Count Avg/Second Avg/Interval Std Dev Corr Coeff
0 0.000 0.000 0.000 0.000
Entries Returned
Total Avg Value Avg/Second Avg/Interval Std Dev Corr Coeff
24902860 1.000 81648.721 408243.607 0.000 0.000
Search Time (ms)
Total Duration Total Count Avg Duration Avg Count/Interval Std Dev Corr Coeff
15166324 24902860 0.609 408243.607 0.039 0.023

Return Whole Entry
Heavy Load, Maximum Throughput 
Actual Duration
1839 seconds (30m 39s)
Searches Completed
Count Avg/Second Avg/Interval Std Dev Corr Coeff
152888061 84702.527 423512.634 6003.399 -0.008
Exceptions Caught
Count Avg/Second Avg/Interval Std Dev Corr Coeff
0 0.000 0.000 0.000 0.000
Entries Returned
Total Avg Value Avg/Second Avg/Interval Std Dev Corr Coeff
152888064 1.000 84702.529 423512.643 0.000 0.000
Search Time (ms)
Total Duration Total Count Avg Duration Avg Count/Interval Std Dev Corr Coeff
270188257 152888064 1.767 423512.643 0.154 0.013

Lighter Load
Searches Completed
Count Avg/Second Avg/Interval Std Dev Corr Coeff
22151207 72626.908 363134.541 3680.320 -0.007
Exceptions Caught
Count Avg/Second Avg/Interval Std Dev Corr Coeff
0 0.000 0.000 0.000 0.000
Entries Returned
Total Avg Value Avg/Second Avg/Interval Std Dev Corr Coeff
22151207 1.000 72626.908

Tuesday Jun 23, 2009

No Directory Manager - Protect Your LDAP Server

Rationale

    Many institutions, companies and organizations have security policies in place to keep security under control within a homogeneous environment. One of those guidelines mandates that no credentials be shared between any two employees. When that is the case, cn=Directory Manager stands out as what seems like a gaping hole in violation of such policies.

The other fact that bothers regulators about this user is that it is not subjected to Access Controls. It can therefore, by design, bypass any carefully designed access restriction policy. While this can sometimes be useful for performance reasons, it is incompatible with a quest for absolute security.

There are institutions where this is not tolerable.
Here is a painless way to stay compliant.

Bird's Eye View

    The idea behind this tip is to disable "cn=Directory Manager", knowing that a number of things are imperfect about this user, the main one being that one could run a brute force attack against it. Knowing the user name, which is left at its default more often than not, only makes things worse. So the number one thing would be to change the user name to some other value. But that would still allow brute force attacks.

The other thing that can be done is to blank the directory manager password, which, combined with requiring a bind password, effectively renders "cn=Directory Manager" unusable.

The Meat

  1. create a random password and store it in a file protected on the host to be readable only by root.
        e.g. store pd80wu709@w87-3WQJX%mjx097hc&50 in /path/to/cryptic-directory-manager-password.
        Note: Do not use echo or cat or anything of that sort as this could be sniffed. Use an editor like vi, joe or whatever is most convenient.
  2. create a random user name. The only constraint is that it should be a valid DN - see RFC 2253 - and even that rule can be bent a bit...
       e.g. store tr-d7=9gcxf7tu in /path/to/cryptic-directory-manager-dn
       Note: take the same precautions as in step 1
  3. Never use the same user and password between any two Directory Server instances
      e.g. dsadm create -D `echo  /path/to/cryptic-directory-manager-dn` -w /path/to/cryptic-directory-manager-password -p xyz -P zyx </path/to>/instance
  4. delete the cryptic password file
  5. delete the cryptic dn file
  6. edit </path/to>/instance/config/dse.ldif and remove the value of the nsslapd-rootpw so that its contents are blank
    e.g.: nsslapd-rootpw:
  7. start the instance
    e.g. dsadm start </path/to>/instance


Your directory manager is effectively unusable and has little to no chance of having been compromised at any point during creation or startup of the instance. [ if you really want absolute security, use a small program that will quietly output a randomly-generated password to a file with 600 rights ]
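
For example, a one-liner that never echoes the password and creates the file with 600 rights (assuming openssl is available; the path is the one used in step 1):

$ ( umask 077; openssl rand -base64 24 > /path/to/cryptic-directory-manager-password )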

Note that for an already created instance, you can simply do steps 5 & 6, which is nice and easy. The only addition in that case is to check that the require-bind-pwd-enabled property is on.
  e.g.
    $dsconf get-server-prop require-bind-pwd-enabled
    require-bind-pwd-enabled  :  on

Since at this point your directory manager is disabled, you will need to use an account like cn=admin,cn=Administrators,cn=config as your dsconf user.
simply export LDAP_ADMIN_USER=cn=admin,cn=Administrators,cn=config or use dsconf <command> -D cn=admin,cn=Administrators,cn=config ...

Caveats

    When following this procedure, you will end up with a server that only has "regular" users. This is mostly good but has a handful of shortcomings, such as not being able to repair ACIs... Since all your users, including the administration accounts, are now subjected to ACI evaluation, you could end up in a state where all your administration accounts are locked out. Care must be taken to keep an administration account with well calibrated Access Controls. There are also some additional troubleshooting operations that must (per the code) be performed by directory manager.


Monday Jun 15, 2009

Operation Throttling - Protect Your LDAP Servers

Rationale

    Many times - for various maintenance and operational reasons - we need to run batches of updates against an Identity repository. Whether it is a new application being introduced that requires new attributes, or a broad-sweep cleanup for a retired application, the net result is the same: an additional write load is inflicted on the LDAP farm with the ever undesirable performance impact on the "regular" traffic. As a workaround, this used to be done during maintenance windows, at night or over a quiet weekend... which usually leads to stressful early Monday mornings if you had overestimated the absorption capacity of the infrastructure.

Bird's Eye View

    The idea is to allow DPS to throttle traffic in order to be able to "choke" traffic coming from a particular user or host. This allows leaving the regular traffic alone and only applying the limitation to writes coming from, say, the user running the batch job.

The Meat

 The principle is pretty straightforward: traffic fills a queue until the queue is full. When it is, DPS delays the next requests until a slot becomes available in the queue. This is effective because it does not disrupt traffic; it only makes the LDAP infrastructure appear slower to clients. Most throttling solutions I have seen out there return "Server Busy" or something along those lines, which may cause errors on the client side and defeats the purpose of throttling altogether from a client's perspective; it works only from the server's perspective, which indeed sees its traffic decreased.
With this plug-in, all the requests sent by the client will be honored; they will just take longer.

One of the added benefits is that the throughput limit can be changed on the fly without disturbing regular "unthrottled" traffic.

So you could, for example, leave the batch job completely unleashed and let it flood your LDAP farm over the weekend, then strangle the traffic Monday at 4:00am to an acceptable trickle. Since the configuration of DPS can be altered over LDAP, all it takes is an entry in your cron, and you have yourself a nicely controlled environment...
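
Purely as an illustration, borrowing the throttleadm CLI from the Aug 07 entry above (your plug-in's actual knobs may differ), the cron entries could look like this:

# throttle modifications on the default data view to 200/s Monday at 4:00am...
0 4 * * 1 /path/to/Throttling/throttleadm throttle root\\ data\\ view mod 200
# ...and lift the limit again Friday evening
0 20 * * 5 /path/to/Throttling/throttleadm unchoke root\\ data\\ view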

This plug-in for DPS is available through Directory Integration Team via Sun Professional Services (or shoot me an email, arnaud@sun.com)


Performance Impact Of Toggling Access Log

Rationale

    Many a customer needs to favor consistent high performance over traceability, but most want both. As part of our current effort to rewrite the logging subsystem, we took a couple of quick metrics to assess how much headroom we had. This may answer some of the questions you have asked yourself in the past.

Bird's Eye View

    I'll make this quick:

Throughput comparison

    This was just a crude test on a Sun Fire X4150. This is in no way a benchmark of DPS for maximum throughput or optimal response time. The only intent of this article is to illustrate the performance difference the access log makes for DPS.

    How do I know if I hit the access log bottleneck ?

Mainly: your CPU utilization is less than 100% (sometimes far less) and you have reached a throughput ceiling, where throwing any more load at DPS only results in longer response times. If so, there is a strong likelihood that turning off the access log will allow you to squeeze out the extra drops of performance.

The Meat

    I won't go into details about the ins and outs of logging but will simply try to articulate the challenge it poses under stress. In particular, as you have seen in the first graph above, we may process about 10,000 operations per second. For each operation, we may need to log a number of steps, among which:

  • the time the connection was established
  • the time the connection was processed by the default connection handler
  • the time the operation was handed off to the adequate connection handler
  • the time the operation was sent to the back end server
  • the time the operation came back from the back end server
  • the time the operation was sent back to the client

For some back-ends or if custom plug-ins are configured, we may have more.

 We may easily have 5 times as many writes to the log as LDAP operations we actually process. Now imagine that we take a lock anywhere in that process and you will immediately understand how this can become a bottleneck under heavy LDAP load. We are currently in the process of removing this lock, drawing inspiration from the approach we used in OpenDS. OpenDS has a very efficient implementation, costing only between 6 and 8% compared to having no access log at all. The main goal of this reimplementation is to remove the performance bottleneck (some may call it a glass ceiling...) and introduce as little jitter in response time as possible.

Here's another snapshot of the state of affairs with DPS 6.x ...

 I hope this helped shed some light.



Friday Apr 03, 2009

Setting DPS As Replication Hub - Part 2: Replication to SQL and LDAP

Rationale

So you need to maintain some data in both LDAP and an SQL database?

As we've seen in part 1 of this tutorial, the replication distribution algorithm allows duplicating write traffic across data views. Thanks to DPS support for both LDAP data views and JDBC data views, we can do the same as in part 1 but use an SQL database in place of "store B". In this example, I will use MySQL, but this works with IBM DB2, Oracle, hsql or any other database with a JDBC driver.

Bird's Eye View

The Meat

  1. Configure the LDAP Store A. See here
  2. Configure the SQL Store B. See here
  3. Configure the Replication Distribution Algorithm Between Them
If like me you are lazy or unwilling to jump to another page, the whole procedure is also described below:

Store A Back End Setup: Directory Server

$ echo password > /tmp/pwd
$ dsadm create -p 1389 -P 1636 -D cn=dsadmin -w /tmp/pwd ds
Use 'dsadm start 'ds'' to start the instance
$ dsadm start ds
Directory Server instance '/path/to/sun/dsee/6.3/replication2/ds' started: pid=1836
$ dsconf create-suffix -D cn=dsadmin dc=example,dc=com
$ cat admin.ldif
dn: cn=admin,dc=example,dc=com
objectClass: person
cn: admin
sn: administrator
userPassword: password
$ ldapadd -p 1389 -D cn=dsadmin -w password  < admin.ldif
adding new entry cn=admin,dc=example,dc=com
arnaud@nullpointerexception:/path/to/sun/dsee/6.3/replication2$  ldapmodify -p 1389 -D cn=dsadmin -w password
dn: dc=example,dc=com
changetype: modify
add: aci
aci: (targetattr=\*) (version 3.0; acl "allow all";allow(all) userdn="ldap:///anyone";)

modifying entry dc=example,dc=com

\^C

 Store A Configuration In DPS


$ dpconf create-ldap-data-source sourceA localhost:1389
$ dpconf create-ldap-data-source-pool poolA
$ dpconf attach-ldap-data-source poolA sourceA
$ dpconf set-attached-ldap-data-source-prop poolA sourceA add-weight:1 bind-weight:1 delete-weight:1
$ dpconf set-attached-ldap-data-source-prop poolA sourceA add-weight:1 bind-weight:1 delete-weight:1 modify-weight:1 search-weight:1
$ dpconf create-ldap-data-view viewA poolA dc=example,dc=com

 Store B Configuration In DPS: MySQL

For this example I have assumed that we already have a running MySQL instance with a database named "replication" that contains a single table "users" with a single row of data. This row is the admin user entry.
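
If you want to recreate that starting point, a minimal MySQL sketch would be (column types are my assumption, except that id is numeric since sn is mapped to it with sql-syntax:INT below):

mysql> CREATE DATABASE replication;
mysql> USE replication;
mysql> CREATE TABLE users (id INT, name VARCHAR(64), password VARCHAR(64));
mysql> INSERT INTO users VALUES (0, 'admin', 'password');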

$ dpconf create-jdbc-data-source -b replication -B jdbc:mysql:/// -J file:/path/to/apps/mysql-connector-java-5.1.6/mysql-connector-java-5.1.6-bin.jar -S com.mysql.jdbc.Driver sourceB
$ dpconf set-jdbc-data-source-prop sourceB db-user:root db-pwd-file:/tmp/pwd
The proxy server will need to be restarted in order for the changes to take effect
$ dpadm restart dps
Directory Proxy Server instance '/path/to/sun/dsee/6.3/replication2/dps' stopped
Directory Proxy Server instance '/path/to/sun/dsee/6.3/replication2/dps' started: pid=2020
$ dpconf create-jdbc-data-source-pool poolB
$ dpconf attach-jdbc-data-source poolB sourceB
$ dpconf create-jdbc-data-view viewB poolB dc=example,dc=com
$ dpconf create-jdbc-table dpsUsersTable users
$ dpconf add-jdbc-attr dpsUsersTable sn id
$ dpconf add-jdbc-attr dpsUsersTable cn name
$ dpconf add-jdbc-attr dpsUsersTable userPassword password
$ dpconf create-jdbc-object-class viewB person dpsUsersTable cn
$ dpconf set-jdbc-attr-prop dpsUsersTable sn sql-syntax:INT
$ ldapmodify -p 7777 -D cn=dpsadmin -w password
dn: cn=permissive_aci,cn=virtual access controls
changetype: add
objectClass: aciSource
dpsAci: (targetAttr="\*") (version 3.0;acl "Be lenient";allow(all) userdn="ldap:///anyone";)
cn: permissive_aci

adding new entry cn=permissive_aci,cn=virtual access controls

$ dpconf set-connection-handler-prop "default connection handler" aci-source:permissive_aci

Replication Configuration Between Directory Server And MySQL 

$ dpconf set-ldap-data-view-prop viewA distribution-algorithm:replication replication-role:master
The proxy server will need to be restarted in order for the changes to take effect
$ dpconf set-jdbc-data-view-prop viewB distribution-algorithm:replication replication-role:master
The proxy server will need to be restarted in order for the changes to take effect
$ ldapmodify -p 7777 -D cn=dpsadmin -w password
dn: cn=viewA,cn=data views,cn=config
changetype: modify
add: distributionDataViewType
distributionDataViewType: read

modifying entry cn=viewA,cn=data views,cn=config

\^C

$ dpadm restart dps
Directory Proxy Server instance '/path/to/sun/dsee/6.3/replication2/dps' stopped
Directory Proxy Server instance '/path/to/sun/dsee/6.3/replication2/dps' started: pid=2258

Testing Replication To Both Data Stores 

$ cat add.ldif
dn: cn=user,dc=example,dc=com
objectClass: person
cn: user
sn: 1
userPassword: password

$ ldapadd -p 7777 -D cn=admin,dc=example,dc=com -w password < add.ldif
adding new entry cn=user,dc=example,dc=com

$ ldapsearch -p 1389 -b dc=example,dc=com "(cn=user)"
version: 1
dn: cn=user,dc=example,dc=com
objectClass: person
objectClass: top
cn: user
sn: 1
userPassword: {SSHA}6knZSKvWHj5LKwZ5jUmyYVqxQAQKFRd0rziYXA==

$ /usr/mysql/bin/mysql
Welcome to the MySQL monitor.  Commands end with ; or \\g.
Your MySQL connection id is 57
Server version: 5.0.45 Source distribution

Type 'help;' or '\\h' for help. Type '\\c' to clear the buffer.

mysql> use replication;
Database changed
mysql> select \* from users;
+------+-------+----------+
| id   | name  | password |
+------+-------+----------+
|    0 | admin | password |
|    1 | user  | password |
+------+-------+----------+
2 rows in set (0.00 sec)

mysql> 

Caveats

    You need to remember a couple of important things:

  • Authenticate as a user present in both data stores. "cn=Directory Manager" is not going to work for multiple reasons that I won't describe in detail here but heterogeneous data stores come with some constraints.
  • Make sure the user has the proper rights to manipulate data on both data stores.

The Stupid Simple DPS MySQL Example

Rationale

    Even though I use DPS every day, I find myself looking for tips quite frequently.
Here is just a REALLY simple example of how to get started with MySQL as a data store.

There are very good, detailed examples in the documentation, but none that is really dead simple. And that's precisely what this entry aims to be.

Bird's Eye View

Below is a graph trying to depict how we map from SQL to LDAP in DPS. To be honest, SQL is a radically different model and therefore, even in this "stupid simple" example, there are a number of things that DPS cannot guess, namely:

  1. The data source is configured to point to a specific database (through the jdbc url)
  2. The data view is configured to represent an LDAP objectClass from an SQL table
  3. Each column of the SQL table needs to be mapped to an LDAP attribute

 Here's how all this looks from a DPS configuration standpoint:


The Meat

   In this example, we have a database engine containing a single database named "DEADSIMPLE". The DEADSIMPLE database has a single table "USERS" with three columns: "ID", "NAME" and "PASSWORD". The "USERS" table contains a single row, as described in the above figure. This is all to make it as small and as easy as possible.

    We will want to expose this data from the MySQL database as a proper "person" object, containing a "cn" (common name) attribute, an "sn" (surname) attribute and a "userPassword" attribute, in order for us to be able to authenticate as user cn=admin,dc=example,dc=com with password "password". Eventually, we want the entry to look as follows:

dn: cn=admin,dc=example,dc=com
objectclass: top
objectclass: person
userpassword: password
sn: 0
cn: admin
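
If you want to reproduce the starting point on the MySQL side, a minimal sketch would be (column types are my assumption; name the database to match the -b flag passed to create-jdbc-data-source below):

mysql> CREATE DATABASE deadsimple;
mysql> USE deadsimple;
mysql> CREATE TABLE users (id INT, name VARCHAR(64), password VARCHAR(64));
mysql> INSERT INTO users VALUES (0, 'admin', 'password');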

And here is the log of my session. I'll update this article later with more details.

$ echo password > /tmp/pwd
$ dpadm create -p 7777 -P 7778 -D cn=dpsadmin -w /tmp/pwd dps
Use 'dpadm start /path/to/sun/dsee/6.3/dps' to start the instance
$ dpadm start dps
Directory Proxy Server instance '/path/to/sun/dsee/6.3/dps' started: pid=966
$ dpconf create-jdbc-data-source -b replication -B jdbc:mysql:/// -J file:/path/to/mysql-connector-java-5.1.6-bin.jar -S com.mysql.jdbc.Driver sourceA
$ dpconf set-jdbc-data-source-prop sourceA db-user:root db-pwd-file:/tmp/pwd
The proxy server will need to be restarted in order for the changes to take effect
$ dpadm restart dps
Directory Proxy Server instance '/path/to/sun/dsee/6.3/dps' stopped
Directory Proxy Server instance '/path/to/sun/dsee/6.3/dps' started: pid=1065

$ dpconf create-jdbc-data-source-pool poolA
$ dpconf attach-jdbc-data-source poolA sourceA
$ dpconf create-jdbc-data-view viewA poolA dc=example,dc=com
$ dpconf create-jdbc-table dpsUsersTable users
$ dpconf add-jdbc-attr dpsUsersTable sn id
$ dpconf add-jdbc-attr dpsUsersTable cn name
$ dpconf add-jdbc-attr dpsUsersTable userPassword password
$ dpconf create-jdbc-object-class viewA person dpsUsersTable cn
$ldapsearch -p 7777 -D cn=admin,dc=example,dc=com -w password -b dc=example,dc=com "(objectClass=\*)"
version: 1
dn: dc=example,dc=com
objectclass: top
objectclass: extensibleObject
description: Glue entry automatically generated
dc: example

dn: cn=admin,dc=example,dc=com
objectclass: top
objectclass: person
userpassword: password
sn: 0
cn: admin


$ dpconf set-jdbc-attr-prop dpsUsersTable sn sql-syntax:INT

$ cat add.ldif
dn: cn=user,dc=example,dc=com
objectClass: person
cn: user
sn: 1
userPassword: password

$ ldapadd -p 7777 -D cn=admin,dc=example,dc=com -w password < add.ldif
adding new entry cn=user,dc=example,dc=com
ldap_add: Insufficient access
ldap_add: additional info: No aciSource setup in connection handler "default connection handler"


$ ldapmodify -p 7777 -D cn=dpsadmin -w password
dn: cn=mysql_aci,cn=virtual access controls
changetype: add
objectClass: aciSource
dpsAci: (targetAttr="\*") (version 3.0; acl "Allow everything for MySQL"; allow(all) userdn="ldap:///anyone";)
cn: mysql_aci

adding new entry cn=mysql_aci,cn=virtual access controls

$ dpconf set-connection-handler-prop "default connection handler" aci-source:mysql_aci

$ ldapadd -p 7777 -D cn=admin,dc=example,dc=com -w password < add.ldif
adding new entry cn=user,dc=example,dc=com

$ ldapsearch -p 7777 -D cn=admin,dc=example,dc=com -w password -b dc=example,dc=com "(objectClass=\*)"
version: 1
dn: dc=example,dc=com
objectclass: top
objectclass: extensibleObject
description: Glue entry automatically generated
dc: example

dn: cn=admin,dc=example,dc=com
objectclass: top
objectclass: person
userpassword: password
sn: 0
cn: admin

dn: cn=user,dc=example,dc=com
objectclass: top
objectclass: person
userpassword: password
sn: 1
cn: user

$ ldapmodify -p 7777 -D cn=admin,dc=example,dc=com -w password
dn: cn=user,dc=example,dc=com
changetype: modify
replace: userPassword
userPassword: newPassword

modifying entry cn=user,dc=example,dc=com

\^C
$ ldapsearch -p 7777 -D cn=admin,dc=example,dc=com -w password -b dc=example,dc=com "(cn=user)"version: 1
dn: cn=user,dc=example,dc=com
objectclass: top
objectclass: person
userpassword: newPassword
sn: 1
cn: user


Wednesday Apr 01, 2009

Setting DPS As Replication Hub - Part 1: a simple tut'

Rationale

    There may be cases where you would like to keep two environments up to date with the same data but no replication or synchronization solution fits your particular needs. One example that comes to mind is migrating away from a legacy LDAP server (RACF, OiD, Sun DS 5...) to OpenDS. After having initialized your new data store with the former data store's contents, without a synchronization mechanism you would have to switch to the new data store right away. That would not quite be acceptable in production because, for one thing, importing the data might take longer than the maintenance window, and more importantly, should something unexpected happen, all real-life deployments want to preserve the option of rolling back to the legacy system (which has proved to work in the past, even if performance or functionality could use a dust-off).

Enter DPS's "replication" distribution algorithm. The idea is quite simple: route reads to a single data store, duplicate writes across all data stores. I use the term data store here because it need not be LDAP only; any SQL database that has a JDBC driver can be replicated to as well. For this tutorial, though, I will use two LDAP stores. We will see a MySQL example in Part 2.

Bird's Eye View

    Unlike load balancing and failover algorithms, which work across sources in a same pool, distribution algorithms work across data views. A distribution algorithm is a way to pick the appropriate data view among eligible data views to process a given client request. In this tutorial, I will show how the "replication" distribution algorithm allows duplicating write traffic across two distinct data sources.

In the graph below, you can see how this is structured in DPS configuration.

The Meat

We will assume here that we have two existing LDAP servers running locally and serving the same suffix dc=example,dc=com:

  1. Store A: dsA on port 1389
  2. Store B: dsB on port 2389

Let's first go about the mundane task of setting up both stores in DPS:
    For Store A:

#dpconf create-ldap-data-source dsA localhost:1389
#dpconf create-ldap-data-source-pool poolA
#dpconf attach-ldap-data-source poolA dsA
#dpconf set-attached-ldap-data-source-prop poolA dsA add-weight:1 bind-weight:1 delete-weight:1 modify-weight:1 search-weight:1
#dpconf create-ldap-data-view viewA poolA dc=example,dc=com

    For Store B:

#dpconf create-ldap-data-source dsB localhost:2389
#dpconf create-ldap-data-source-pool poolB
#dpconf attach-ldap-data-source poolB dsB
#dpconf set-attached-ldap-data-source-prop poolB dsB add-weight:1 bind-weight:1 delete-weight:1 modify-weight:1 search-weight:1
#dpconf create-ldap-data-view viewB poolB dc=example,dc=com

    Now, the distribution algorithm must be set to replication on both data views:

#dpconf set-ldap-data-view-prop viewA distribution-algorithm:replication replication-role:master
#dpconf set-ldap-data-view-prop viewB distribution-algorithm:replication replication-role:master

  And finally, the catch:

    When using dpconf to set the replication-role property to master, it effectively writes distributionDataViewType as a single-valued attribute in the data view configuration entry, when in reality the schema allows it to be multi-valued. To see that for yourself, simply do:

#ldapsearch -p <your DPS port> -D "cn=proxy manager" -w password "(cn=viewA)"
version: 1
dn: cn=viewA,cn=data views,cn=config
dataSourcePool: poolA
viewBase: dc=example,dc=com
objectClass: top
objectClass: configEntry
objectClass: dataView
objectClass: ldapDataView
cn: viewA
viewAlternateSearchBase: ""
viewAlternateSearchBase: "dc=com"
distributionDataViewType: write
distributionAlgorithm: com.sun.directory.proxy.extensions.ReplicationDistributionAlgoritm


and then try to issue the following command:

#dpconf set-ldap-data-view-prop viewA replication-role+:consumer
The property "replication-role" cannot contain multiple values.
XXX exception-syntax-prop-add-val-invalid

...

...or just take my word for it. 

The issue is that in order for DPS to process read traffic (bind, search, etc...), one data view needs to be a consumer, but for the replication to work across data views, all of them must be masters as well. That is why you will need to issue the following command on one (and only one) data view:

#ldapmodify -p <your DPS port> -D "cn=proxy manager" -w password
dn: cn=viewA,cn=data views,cn=config
changetype: modify
add: distributionDataViewType
distributionDataViewType: read

That's it!
Wasn't all that hard except it took some insider's knowledge, and now you have it.
Your search traffic will always go to Store A and all write traffic will get duplicated across Store A and B.

Caveats

Note that while this is very useful in a number of situations where nothing else will work, it should only be used for transitions, as there are a number of caveats.
DPS does not store any historical information about traffic; therefore, in case of an outage of one of the underlying stores, contents may diverge between data stores. This matters especially since this mode is used precisely where no synchronization solution can catch up after an outage.

Store A and Store B will end up out of synch if:

  • either store comes to be off-line
  • either store is unwilling to perform because the machine is  outpaced by traffic

Wednesday Aug 06, 2008

From An Old Suffix To A New One Live With No Down Time

You currently have your entries in an "old" suffix that we will from here on call dc=old, and you would like to move your entries to a new suffix that we will refer to as dc=new for the purpose of this article. The catch is that you cannot stop your server and migrate your data offline. On top of this, during the transition period, while your client applications get reconfigured to use dc=new, entries need to appear to be in both dc=new and dc=old.

To make this a little simpler to picture in our minds, let's look at the data life cycle:

  1. Before the migration starts, all data resides under dc=old, requests on dc=new would fail but no application is "aware" of dc=new
    • e.g. cn=user.1,dc=old works but cn=user.1,dc=new fails. At this point the DIT looks like this:
      • dc=old
        • cn=user.1
        • cn=user.3
      • dc=new
  2. When the migration starts, all data resides under dc=old, both requests on dc=old and dc=new are honored
    • e.g. cn=user.1,dc=old and cn=user.1,dc=new will work but the entry is actually only stored physically under dc=old. The DIT is unchanged compared to 1. but we have shoehorned DPS in the topology and added a data view to take care of the transformation.
  3. While the migration is ongoing, data yet to be migrated resides under dc=old while already migrated data resides under dc=new
    • this is when it will start to get a little complicated. Here is what the DIT might look like:
      • dc=old
        • cn=user.1
      • dc=new
        • cn=user.3
    • At this point, a request on cn=user.1,dc=old will work along with a request for cn=user.1,dc=new. But both requests for cn=user.3,dc=new and cn=user.3,dc=old will work as well. Basically, to the client application there is a true virtualization of where the data actually resides. This is crucial when you have a heterogeneous environment of applications being reconfigured to use the new suffix while some older applications might take too much work to reconfigure and are simply waiting to be discontinued.
  4. When data migration is complete, all data resides under dc=new but both requests for entries under both suffixes will be served. At this point our DIT would look like this:
    • dc=old
    • dc=new
      • cn=user.1
      • cn=user.3
  5. Once we are confident no application requests entries under dc=old then we can take the virtualization data views down. Only requests to dc=new will be served.

If you want to try it out for yourself, here are the steps to follow to get such a setup with DPS 6. But first, let us agree on the environment:

  • Directory Server bits are installed in /path/to/sun/dsee/6.3/bits/ds6
  • Directory Proxy Server bits installed in /path/to/sun/dsee/6.3/bits/dps6
  • I won't use "cn=Directory Manager" but uid=admin instead with a password file containing the "password" string in /path/to/pwd
  • the data for dc=new and dc=old in our example is as follows

    dn: dc=old
    dc: old
    objectClass: top
    objectClass: domain
    aci: (targetattr=\*) ( version 3.0; acl "allow all anonymous"; allow (all) userdn="ldap:///anyone";)

    dn: cn=user.1,dc=old
    objectClass: person
    objectClass: top
    cn: user.1
    sn: 1
    userPassword: {SSHA}PzAK73RDZIikdI8qRqD7MYubasZ5JyJa/BToMw==

    dn: dc=new
    dc: new

    objectClass: top
    objectClass: domain
    aci: (targetattr=\*) ( version 3.0; acl "allow all anonymous"; allow (all) userdn="ldap:///anyone";)

    dn: cn=user.3,dc=new
    objectClass: person
    objectClass: top
    cn: user.3
    sn: 3
    userPassword: {SSHA}PzAK73RDZIikdI8qRqD7MYubasZ5JyJa/BToMw==

Before we begin, it is very convenient to set the following environment variables to make subsequent calls to the CLIs much easier to read:

export PATH=${PATH}:/path/to/sun/dsee/6.3/bits/ds6/bin:/path/to/sun/dsee/6.3/bits/dps6/bin:/path/to/sun/dsee/6.3/bits/dsrk6/bin
export LDAP_ADMIN_PWF=/path/to/pwd
export LDAP_ADMIN_USER=uid=admin
export DIRSERV_PORT=1389
export DIRSERV_HOST=localhost
export DIRSERV_UNSECURED=TRUE
export DIR_PROXY_HOST=localhost
export DIR_PROXY_PORT=7777
export DIR_PROXY_UNSECURED=TRUE

First off we need to create an instance of Directory Server to store our entries. This is a two step process:

  1. Create and start the instance that we will name master
    >dsadm create -D uid=admin -w /path/to/pwd -p 1389 -P 1636 master
    Use 'dsadm start 'master'' to start the instance
    >dsadm start master
    Directory Server instance '/path/to/sun/dsee/6.3/live-migration/master' started: pid=2968
  2. On to creating a suffix and populating it with data
    >dsconf create-suffix dc=old
    >dsconf import /path/to/sun/dsee/6.3/instances/dc\\=old.ldif dc=old
    New data will override existing data of the suffix "dc=old".
    Initialization will have to be performed on replicated suffixes.
    Do you want to continue [y/n] ?  y
    ## Index buffering enabled with bucket size 40
    ## Beginning import job...
    ## Processing file "/path/to/sun/dsee/6.3/instances/dc=old.ldif"
    ## Finished scanning file "/path/to/sun/dsee/6.3/instances/dc=old.ldif" (3 entries)
    ## Workers finished; cleaning up...
    ## Workers cleaned up.
    ## Cleaning up producer thread...
    ## Indexing complete.
    ## Starting numsubordinates attribute generation. This may take a while, please wait for further activity reports.
    ## Numsubordinates attribute generation complete. Flushing caches...
    ## Closing files...
    ## Import complete.  Processed 3 entries in 4 seconds. (0.75 entries/sec)
    Task completed (slapd exit code: 0).
  3. We can now check the data was successfully loaded with a quick broad sweep search:
    >ldapsearch -p 1389 -b "dc=old" "(objectClass=\*)"
    version: 1
    dn: dc=old
    dc: old
    objectClass: top
    objectClass: domain

    dn: cn=user.1,dc=old
    objectClass: person
    objectClass: top
    cn: user.1
    sn: 1
    userPassword: {SSHA}PzAK73RDZIikdI8qRqD7MYubasZ5JyJa/BToMw==
  4. Repeat these last 3 steps for dc=new

Directory Proxy Server configuration

  1. Create and start an instance of the proxy

    >dpadm create -p 7777 -P7778 -D uid=admin -w /path/to/pwd proxy
    Use 'dpadm start /path/to/sun/dsee/6.3/live-migration/proxy' to start the instance

    >dpadm start proxy
    Directory Proxy Server instance '/path/to/sun/dsee/6.3/live-migration/proxy' started: pid=3061

  2. Connect the proxy to the Directory Server instance

    >dpconf create-ldap-data-source master localhost:1389

    >dpconf create-ldap-data-source-pool master-pool

    >dpconf attach-ldap-data-source master-pool master

    >dpconf set-attached-ldap-data-source-prop master-pool master add-weight:1 bind-weight:1 compare-weight:1 delete-weight:1 modify-dn-weight:1 modify-weight:1 search-weight:1

  3. Create a straight data view to dc=old and verify we can get through to the source

    >dpconf create-ldap-data-view actual-old master-pool dc=old

    >ldapsearch -p 7777 -b dc=old "(objectClass=\*)"
    version: 1
    dn: dc=old
    dc: old
    objectClass: top
    objectClass: domain

    dn: cn=user.1,dc=old
    objectClass: person
    objectClass: top
    cn: user.1
    sn: 1
    userPassword: {SSHA}PzAK73RDZIikdI8qRqD7MYubasZ5JyJa/BToMw==

    >ldapsearch -p 7777 -b dc=old "(cn=user.1)"
    version: 1
    dn: cn=user.1,dc=old
    objectClass: person
    objectClass: top
    cn: user.1
    sn: 1
    userPassword: {SSHA}PzAK73RDZIikdI8qRqD7MYubasZ5JyJa/BToMw==

  4. Create a virtual data view representing the physical entries under dc=old as it were under dc=new

    >dpconf create-ldap-data-view virtual-new master-pool dc=new

    >dpconf set-ldap-data-view-prop virtual-new dn-mapping-source-base-dn:dc=old

    >ldapsearch -p 7777 -b dc=new "(cn=user.1)"
    version: 1
    dn: cn=user.1,dc=new
    objectClass: person
    objectClass: top
    cn: user.1
    sn: 1
    userPassword: {SSHA}PzAK73RDZIikdI8qRqD7MYubasZ5JyJa/BToMw==


  5. Repeat 3 and 4 for the physical dc=new suffix and we now have totally virtualized back end.

    >dpconf create-ldap-data-view actual-new master-pool dc=new

    >dpconf create-ldap-data-view virtual-old master-pool dc=old

    >dpconf set-ldap-data-view-prop virtual-old dn-mapping-source-base-dn:dc=new

    >ldapsearch -p 7777 -b dc=old "(cn=user.3)"
    version: 1
    dn: cn=user.3,dc=old
    objectClass: person
    objectClass: top
    cn: user.3
    sn: 3
    userPassword: {SSHA}5LFqYHLashsY7PFAvFV9pM+C2oedPTdV/AIADQ==

    >ldapsearch -p 7777 -b dc=new "(cn=user.3)"
    version: 1
    dn: cn=user.3,dc=new
    objectClass: person
    objectClass: top
    cn: user.3
    sn: 3
    userPassword: {SSHA}5LFqYHLashsY7PFAvFV9pM+C2oedPTdV/AIADQ==
