Sun Directory compresses data for better performance !

Sun Directory Server Enterprise Edition 7.0 was released last November, and in the December timeframe Brad Diggs and Wajih Ahmed, both Principal Field Technologists and big experts in Directory Services, backed with engineers from the Directory engineering team and Mr Benchmark, put the product on the test bench to evaluate its performance and scalability with Sun new hardware and especially the new F-20 PCIe flash drives (see also what Mr Benchmark says about the F-20).

Brad's first article describes how much Directory Server 7 entry compression rocks, "extending search performance by more than 50% through increased caching potential". Brad provides details of his findings and gives the commands to run to get the benefits of DSEE 7 in your deployment.

The entry compression feature is also available in the technology that will power future versions of Sun Directory Enterprise Edition: the OpenDS project. In OpenDS, there are 2 options to reduce the size of entries stored in the database. The first one is called entry compaction, and it's enabled by default. The entry compaction feature removes all references to attribute names and replace them with small identifiers. The second option is actually entry compression which will use the popular ZLib algorithm. This option is not activated by default, but it's just a command away :

<OPENDS_HOME>/bin/dsconfig -X -p 4444 -h localhost -D cn=Directory\\ manager\\
 -w password -n set-backend-prop \\
 --backend-name userRoot --set entries-compressed:true

Below is the dsconfig usage for disabling entry compaction with OpenDS:

<OPENDS_HOME>/bin/dsconfig -X -p 4444 -h localhost -D cn=directory\\ manager\\
 -w password -n set-backend-prop \\
 --backend-name backend --set compact-encoding:false

Here's a table that compares the size of the databases of OpenDS 2.2.0 with no compat encoding, with it (default settings) and with compression enabled. The table compares the size of the entry record within the database as well as the overall size of the database which also includes indexes (default OpenDS settings).

Entry Count LDIF Entry Size Uncompacted Entry Size Compacted Entry Size Compressed Entry Size Uncompacted DB Size Compacted DB Size Compressed DB Size
100K 599 b 645 b 481 b 361 b 178.8 MB 163.20 MB 151.65 MB
-34% - 25% -9.6% - 7.1%
1M 603 b 649 b 485 b 364 b 1,515 MB 1,358 MB 1,243 MB
-34% - 25% -11.5% - 8.5%
10M 607 b 653 b 490 b 363 b 13,973 MB 12,416 MB 11,188 MB
-33% - 26% -12.5% - 9.9%

The percentages are computed from the reference value which is the default i.e. compacted. A negative value means an increased size, a positive one means a reduced size.

The second table compares the import times for the 3 different modes for storing entries, for the 3 sample data files.

Entry Count Uncompacted Compacted Compressed
100K 21 s 21 s 22 s
1.1% - -3.5%
1M 106 s 107 s 112 s
0.5% - -4.9%
10M 1006 s 1009 s 1101 s
0.2% - -8.9%

Note: in this table, negative numbers represent increase in time required to import compared to the default settings.

Enabling compression does result in a smaller disk use with that sample data (fully random values), but does come with a performance penalty at least at import time, less than 10% but the penalty increases with the amount of entries. If you've read Brad's article on DSEE entry compression, you understand that the smaller the entries in the database, the more can be potentially cached in the Database Cache and the better the overall performances are. So if your entries are quite large, contain values that are strings, you should consider enabling the entry compression with OpenDS.

Changing from the default mode (compacted) to uncompacted mode does not give any real advantage in performance, but does increase the disk space usage, so I do not see the value of changing these settings in OpenDS.

Anyway, the benefits of having compact entries in the database are available today with Sun Directory Server Enterprise Edition 7 and Sun OpenDS Standard Edition 2.2, and are helping customers to reduce the overall cost of ownership of the directory services.

Technorati Tags: , , , ,

Comments:

note that entry compaction in OpenDS was specifically targeted
at improving performance as the main goal of that feature,
disk space optimization is sort of collateral of that.

Posted by anton on January 25, 2010 at 04:22 AM CET #

You cannot decorelate disk space optimization from performance, since reducing disk space occupation does reduce the cost of I/O both writing and reading. But somehow I join you with the idea that OpenDS engineering main goal is to improve performance of LDAP services.

Posted by Ludovic Poitou on January 25, 2010 at 05:18 AM CET #

the primary problem we were trying to solve with compaction were
entry encoding/decoding [ pure processing, not i/o ] costs. i/o
and disk space were not of primary concern tho its sure nice to
see it helps with that too but i digress. the reason i'm posting
these comments is to help your readers understand better how
compaction feature came about and its difference to compression feature because unlike compression compaction actually improves
entry processing performance so you do not end up sacrificing
performance for gaining smaller disk footprint in this case. the
gain would vary depending on the entry content/structure as well.

Posted by anton on January 25, 2010 at 06:44 AM CET #

Post a Comment:
Comments are closed for this entry.
About

This is the blog of a senior software engineer, specialized in LDAP, Directory Server and OpenDS. Ludovic Poitou works in France at the Grenoble Engineering Center, in the Directory Services Engineering team. Outside work, I love skiing and taking photo

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today