Sun Directory compresses data for better performance!
By Ludo on Jan 22, 2010
Sun Directory Server Enterprise Edition 7.0 was released last November. In December, Brad Diggs and Wajih Ahmed, both Principal Field Technologists and Directory Services experts, backed by engineers from the Directory engineering team and Mr Benchmark, put the product on the test bench to evaluate its performance and scalability on Sun's new hardware, especially the new F-20 PCIe flash drives (see also what Mr Benchmark says about the F-20).
Brad's first article describes how much Directory Server 7 entry compression rocks, "extending search performance by more than 50% through increased caching potential". Brad provides details of his findings and gives the commands to run to get the benefits of DSEE 7 in your deployment.
The entry compression feature is also available in the technology that will power future versions of Sun Directory Server Enterprise Edition: the OpenDS project. In OpenDS, there are two options to reduce the size of entries stored in the database. The first one is called entry compaction, and it's enabled by default. Entry compaction removes all references to attribute names and replaces them with small identifiers. The second option is actual entry compression, which uses the popular zlib algorithm. This option is not activated by default, but it's just a command away:
<OPENDS_HOME>/bin/dsconfig -X -p 4444 -h localhost -D cn=Directory\ manager \
-w password -n set-backend-prop \
--backend-name userRoot --set entries-compressed:true
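To make the difference between the two options more concrete, here is a toy Python sketch. It is not the OpenDS on-disk format: the sample entry, the one-byte identifier table and the `encode_*` helpers are all made up for illustration, and the sketch assumes that compression is applied on top of the compacted form.

```python
import zlib

# Hypothetical entry; attribute names and values are made up for illustration.
ENTRY = [
    ("objectClass", "inetOrgPerson"),
    ("uid", "user.0"),
    ("givenName", "Aaccf"),
    ("sn", "Amar"),
    ("cn", "Aaccf Amar"),
    ("mail", "user.0@example.com"),
    ("telephoneNumber", "+1 685 622 6202"),
    ("description", "This is the description for Aaccf Amar."),
]

# Hypothetical shared table mapping each attribute name to a small identifier.
ATTRIBUTE_IDS = {
    name: i for i, name in enumerate(sorted({n for n, _ in ENTRY}))
}


def encode_plain(entry):
    """Uncompacted: every attribute name is spelled out next to its value."""
    return "".join(f"{name}: {value}\n" for name, value in entry).encode("utf-8")


def encode_compact(entry, ids):
    """Compacted: each attribute name is replaced by a one-byte identifier."""
    out = bytearray()
    for name, value in entry:
        raw = value.encode("utf-8")
        out.append(ids[name])  # one-byte attribute identifier
        out.append(len(raw))   # one-byte length (toy format: values < 256 bytes)
        out += raw
    return bytes(out)


def encode_compressed(entry, ids):
    """Compressed: zlib applied to the compacted bytes (an assumption here)."""
    return zlib.compress(encode_compact(entry, ids))


if __name__ == "__main__":
    print("plain     :", len(encode_plain(ENTRY)), "bytes")
    print("compacted :", len(encode_compact(ENTRY, ATTRIBUTE_IDS)), "bytes")
    print("compressed:", len(encode_compressed(ENTRY, ATTRIBUTE_IDS)), "bytes")
```

On an entry this small, zlib's fixed header and checksum overhead can eat most of the gain, which is one reason compression tends to pay off mainly for larger entries with repetitive string values.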
Below is the dsconfig usage for disabling entry compaction with OpenDS:
<OPENDS_HOME>/bin/dsconfig -X -p 4444 -h localhost -D cn=directory\ manager \
-w password -n set-backend-prop \
--backend-name backend --set compact-encoding:false
Here's a table that compares the database sizes of OpenDS 2.2.0 without compact encoding, with it (the default setting), and with compression enabled. It shows the size of the entry record within the database as well as the overall size of the database, which also includes indexes (default OpenDS settings).
| Entry Count | LDIF Entry Size | Uncompacted Entry Size | Compacted Entry Size | Compressed Entry Size | Uncompacted DB Size | Compacted DB Size | Compressed DB Size |
| 100K | 599 bytes | 645 bytes | 481 bytes | 361 bytes | 178.8 MB | 163.20 MB | 151.65 MB |
| 1M | 603 bytes | 649 bytes | 485 bytes | 364 bytes | 1,515 MB | 1,358 MB | 1,243 MB |
| 10M | 607 bytes | 653 bytes | 490 bytes | 363 bytes | 13,973 MB | 12,416 MB | 11,188 MB |
The percentages are computed relative to the reference value, which is the default, i.e. compacted. A negative value means an increased size; a positive one means a reduced size.
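The percentages can be recomputed from the figures in the table. The short Python sketch below does so using the sign convention just described; the function name `reduction_vs_compacted` is only an illustrative choice, and the numbers are copied from the table as (uncompacted, compacted, compressed).

```python
# Values copied from the table above: (uncompacted, compacted, compressed).
ENTRY_SIZE_BYTES = {
    "100K": (645, 481, 361),
    "1M": (649, 485, 364),
    "10M": (653, 490, 363),
}
DB_SIZE_MB = {
    "100K": (178.8, 163.20, 151.65),
    "1M": (1515, 1358, 1243),
    "10M": (13973, 12416, 11188),
}


def reduction_vs_compacted(value, compacted):
    """Percent size reduction relative to the compacted (default) value.

    Positive means smaller than the default, negative means larger.
    """
    return (compacted - value) / compacted * 100.0


def report(title, table):
    """Print the uncompacted and compressed percentages for each data set."""
    print(title)
    for count, (uncompacted, compacted, compressed) in table.items():
        print(
            f"  {count:>4}: uncompacted {reduction_vs_compacted(uncompacted, compacted):+.1f}%"
            f", compressed {reduction_vs_compacted(compressed, compacted):+.1f}%"
        )


if __name__ == "__main__":
    report("Entry size vs. compacted", ENTRY_SIZE_BYTES)
    report("Database size vs. compacted", DB_SIZE_MB)
```

For the 100K data set, for instance, the compressed entry comes out about 25% smaller than the compacted one, while the uncompacted entry is about 34% larger.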
The second table compares the import times for the three different modes of storing entries, for the three sample data files.
| Entry Count | Uncompacted | Compacted | Compressed |
| 100K | 21 s | 21 s | 22 s |
| 1M | 106 s | 107 s | 112 s |
| 10M | 1006 s | 1009 s | 1101 s |
Note: in this table, negative numbers represent an increase in the time required to import compared to the default settings.
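As a rough way to recompute those figures, the Python sketch below derives the relative change in import time and the import throughput from the timings in the table. It assumes the three time columns are in the same order as in the size table (uncompacted, compacted, compressed) and that 100K, 1M and 10M mean exactly 100,000, 1,000,000 and 10,000,000 entries; the function names are illustrative.

```python
# Import times in seconds, assumed order: (uncompacted, compacted, compressed).
IMPORT_SECONDS = {
    100_000: (21, 21, 22),
    1_000_000: (106, 107, 112),
    10_000_000: (1006, 1009, 1101),
}


def time_change_vs_compacted(seconds, compacted_seconds):
    """Percent change relative to the compacted (default) import time.

    Negative means the import took longer than with the default settings.
    """
    return (compacted_seconds - seconds) / compacted_seconds * 100.0


def entries_per_second(entry_count, seconds):
    """Average import throughput for one run."""
    return entry_count / seconds


if __name__ == "__main__":
    for count, (uncompacted, compacted, compressed) in IMPORT_SECONDS.items():
        print(
            f"{count:>10,} entries: "
            f"uncompacted {time_change_vs_compacted(uncompacted, compacted):+.1f}%, "
            f"compressed {time_change_vs_compacted(compressed, compacted):+.1f}%, "
            f"compressed throughput {entries_per_second(count, compressed):,.0f} entries/s"
        )
```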
Enabling compression does reduce disk usage with this sample data (fully random values), but it comes with a performance penalty, at least at import time: less than 10%, though the penalty grows with the number of entries. If you've read Brad's article on DSEE entry compression, you understand that the smaller the entries in the database, the more of them can potentially be cached in the database cache, and the better the overall performance. So if your entries are quite large and contain string values, you should consider enabling entry compression in OpenDS.
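The caching argument can be illustrated with some back-of-the-envelope arithmetic. The sketch below uses the per-entry sizes from the 10M row of the first table and a purely hypothetical 4 GiB database cache; it ignores indexes and any per-record overhead, so the absolute numbers are only indicative.

```python
# Hypothetical cache size, chosen only for illustration.
CACHE_BYTES = 4 * 1024**3  # 4 GiB

# Per-entry record sizes in bytes, from the 10M row of the first table.
ENTRY_SIZE_BYTES = {
    "uncompacted": 653,
    "compacted": 490,
    "compressed": 363,
}


def entries_that_fit(cache_bytes, entry_size_bytes):
    """Upper bound on the number of entry records that fit in the cache."""
    return cache_bytes // entry_size_bytes


if __name__ == "__main__":
    for mode, size in ENTRY_SIZE_BYTES.items():
        fit = entries_that_fit(CACHE_BYTES, size)
        print(f"{mode:<12} {fit:>12,} entries")
```

Whatever cache size is plugged in, the ratio stays the same: with these entry sizes, compression lets roughly a third more entries fit than the default compacted encoding.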
Changing from the default mode (compacted) to uncompacted mode does not give any real advantage in performance, but does increase the disk space usage, so I do not see the value of changing these settings in OpenDS.
Anyway, the benefits of having compact entries in the database are available today with Sun Directory Server Enterprise Edition 7 and Sun OpenDS Standard Edition 2.2, and are helping customers to reduce the overall cost of ownership of the directory services.