By sin on Jul 17, 2007
This article explains how you can gain a performance improvement using the ds-cfg-backend-subtree-delete-batch-size configuration attribute. The recommended default value is 5000; however, the best value may vary from machine to machine. The default seems to yield reasonable results with a Java heap size of 512MB.
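For example, the attribute can be changed over LDAP with an ldapmodify operation against the backend's configuration entry. This is only a sketch: it assumes the JE backend is named userRoot, and you should check the exact entry DN and attribute name against your OpenDS version.

```ldif
# Hypothetical example: set the subtree delete batch size on the userRoot backend.
dn: ds-cfg-backend-id=userRoot,cn=Backends,cn=config
changetype: modify
replace: ds-cfg-backend-subtree-delete-batch-size
ds-cfg-backend-subtree-delete-batch-size: 5000
```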
There is no defined formula for deriving the value that gives the best performance gain. Rather, it should be based largely on the Sleepycat (Berkeley DB JE) utilities, which help determine the DB cache size, the Java heap size settings, and the number of records stored in the different databases.
As per the default OpenDS setting, the assigned DB cache size is 10% of the JVM heap size. So if the JVM heap size for OpenDS is 512MB, the DB cache may grow up to about 53MB.
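As a quick sanity check, the 53MB figure falls out of the arithmetic directly (a rough sketch: the heap is taken as 512 MiB, and the result is expressed in decimal megabytes):

```python
# DB cache size under the default OpenDS setting: 10% of the JVM heap.
heap_bytes = 512 * 1024 * 1024          # 512 MB JVM heap
db_cache_percent = 10                   # default DB cache percentage
cache_bytes = heap_bytes * db_cache_percent // 100

print(cache_bytes)                      # 53687091 bytes
print(cache_bytes // 1_000_000)         # 53 (decimal MB)
```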
In my setup, I imported nearly 100,000 entries into OpenDS, similar to the following one:
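Each imported entry was a simple person entry along these lines. This is a hypothetical reconstruction consistent with the index counts below (distinct cn values, a single shared sn value, and no mail, uid, givenName, or telephoneNumber values); the actual attribute values are illustrative:

```ldif
dn: cn=user.0,dc=example,dc=com
objectClass: top
objectClass: person
cn: user.0
sn: test
```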
After the import, the following databases were populated:
dc_example_dc_com_dn2id: Record Count is 100001
dc_example_dc_com_id2entry: Record Count is 100001
dc_example_dc_com_referral: Record Count is 0
dc_example_dc_com_id2children: Record Count is 1
dc_example_dc_com_id2subtree: Record Count is 1
dc_example_dc_com_givenName.equality: Record Count is 0
dc_example_dc_com_givenName.presence: Record Count is 0
dc_example_dc_com_givenName.substring: Record Count is 0
dc_example_dc_com_objectClass.equality: Record Count is 2
dc_example_dc_com_member.equality: Record Count is 0
dc_example_dc_com_uid.equality: Record Count is 0
dc_example_dc_com_cn.equality: Record Count is 100000
dc_example_dc_com_cn.presence: Record Count is 1
dc_example_dc_com_cn.substring: Record Count is 212220
dc_example_dc_com_telephoneNumber.equality: Record Count is 0
dc_example_dc_com_telephoneNumber.presence: Record Count is 0
dc_example_dc_com_telephoneNumber.substring: Record Count is 0
dc_example_dc_com_sn.equality: Record Count is 1
dc_example_dc_com_sn.presence: Record Count is 1
dc_example_dc_com_sn.substring: Record Count is 6
dc_example_dc_com_ds-sync-hist.ordering: Record Count is 0
dc_example_dc_com_mail.equality: Record Count is 0
dc_example_dc_com_mail.presence: Record Count is 0
dc_example_dc_com_mail.substring: Record Count is 0
dc_example_dc_com_entryUUID.equality: Record Count is 100001
dc_example_dc_com_aci.presence: Record Count is 0
The total number of records across the databases is 612236, and the memory needed by the locks is 612236 * 142 bytes (JE currently uses a 142-byte lock size per record) = 86937512 bytes = 82MB. But it doesn't require 82MB of memory in one go, because not all database locks are acquired together.
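The same figure can be reproduced with a few lines. Only the non-zero record counts from the list above matter for the total; the 142-byte lock size is the value quoted above for this JE version:

```python
# Worst-case lock memory if every record in every database were locked at once.
# Non-zero record counts from the databases listed above.
record_counts = {
    "dn2id": 100001, "id2entry": 100001, "entryUUID.equality": 100001,
    "cn.equality": 100000, "cn.presence": 1, "cn.substring": 212220,
    "sn.equality": 1, "sn.presence": 1, "sn.substring": 6,
    "objectClass.equality": 2, "id2children": 1, "id2subtree": 1,
}
LOCK_BYTES = 142                              # JE lock size per record

total_records = sum(record_counts.values())
lock_memory = total_records * LOCK_BYTES

print(total_records, lock_memory)             # 612236 86937512
print(lock_memory // (1024 * 1024))           # 82 (MB)
```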
I won't go into the details of the lifespan of a database lock. Instead, I'll compare the performance of two different batch sizes, selected by looking at the cache misses and various other debug outputs from Berkeley DB (thanks to Linda for taking so much pain over this). The two tests were run with batch sizes of 5K and 20K respectively. While we saw only a marginal time difference between the two (5K was slightly better), the 5K batch size exhibited a better memory footprint, because most of the memory used during the deletes (such as lock information) is released after a while. This leaves more headroom for future operations than the 20K case does.
Fig. 1: Graph of the cacheTotal, lockBytes, and adminBytes fields from the stats, which gives a picture of how the cache is composed over time.
Fig. 2: Graph of cache misses.
The test runs show that the 5K and 20K runs have the same elapsed time, but JE's overall memory requirements fall in the 5K case. Probably, at the beginning of the deleteLeaf operations, JE has to pull a fair bit of the database into cache in order to do the deletes. As the deletes progress, the btrees are pruned and the overall size of the database falls, so JE actually needs less cache for data. But the 20K batch case uses a bigger lump of memory for the locks, and that lump keeps JE's high-water memory usage close to the limit of the 53MB cache.
In the 5K batch case, the lock lumps are smaller, so JE's high-water memory usage falls, which gives more margin. The 20K batch case also shows bigger cache-miss spikes, while the 5K case stays nice and even (though I'm not really sure why the 5K batch has a bigger spike at the very beginning).
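A back-of-the-envelope sketch suggests why the 20K lock lump is so much bigger. This is an assumption-laden estimate, not JE's actual accounting: it assumes roughly one lock per database record touched while a batch is in flight, and about six records per entry (612236 records / 100001 entries, from the counts above):

```python
LOCK_BYTES = 142                        # JE lock size per record
RECORDS_PER_ENTRY = 612236 / 100001     # ~6.1 DB records touched per entry

def batch_lock_bytes(batch_size):
    """Rough peak lock memory held while one delete batch is in flight."""
    return int(batch_size * RECORDS_PER_ENTRY * LOCK_BYTES)

for batch in (5000, 20000):
    print(batch, batch_lock_bytes(batch) // 1024, "KB")
```

Under these assumptions, a 5K batch holds on the order of 4MB of lock memory, while a 20K batch holds around 17MB, which against a 53MB cache limit is a much bigger lump.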
So, since the elapsed time is the same, I'd go with the 5K batch size.
My sincere thanks go to Linda Lee, because most of the statistics here are based on her analysis of my test results.