Indexing in OpenDS
By sin on May 21, 2009
The OpenDS wiki provides a good introduction to the index databases and their purposes. I'll try to dig further into what a particular index database looks like at the Sleepycat end. For those who are not familiar with Sleepycat, or Berkeley DB, it is the embedded database used by OpenDS. I will try to explain one index in each of my posts. If you come from a relational database background, you are probably already aware of the benefits of indexing. Typically, an index is a B+ tree in which each node stores a key/value pair.
Before we begin with the indexes, it helps to know about the EntryID. An EntryID is a long value that is unique for each entry inside an OpenDS database. The indexes have different keys depending on the index type, but all of them store the same kind of value: EntryIDs. The idea behind an index is to search using a key and get back a list of EntryIDs. This list of EntryIDs is then used to retrieve the corresponding entries from another database called id2entry, and the filter is applied to each of those entries to see whether it matches. The matching entries are returned to the client. OpenDS provides the following indexes:
children index -- system index (not used by users directly)
subtree index -- system index (not used by users directly)
equality index -- user configurable
substring index -- user configurable
ordering index -- user configurable
presence index -- user configurable
extensible index -- user configurable
The system indexes are always maintained. OpenDS needs them to keep track of information related to the scope of the entries. Since they are not attribute-based, users don't have any control over them. The rest of the indexes are defined per attribute and are therefore available for configuration. For more information on how to create or modify the indexes above, have a look at the OpenDS wiki.
When you create an entry under a backend, all the configured indexes are updated with the new entry's information. For example, if you add an entry cn=user,dc=example,dc=com, the equality index for the attribute "cn" will have the normalized value(s) of "cn" as the key(s) and the entry's EntryID (say 2) as the value.
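To make the add and search flow above a bit more concrete, here is a minimal in-memory sketch in Python. It is not OpenDS code: the ToyBackend class, its method names and the normalize() rule (lowercase, collapsed whitespace) are all made up for illustration, and plain dictionaries stand in for the Berkeley DB databases. It only models an equality index and id2entry.

```python
class ToyBackend:
    """Toy model of id2entry plus per-attribute equality indexes (hypothetical)."""

    def __init__(self, indexed_attrs):
        self.next_id = 1
        self.id2entry = {}  # EntryID -> (dn, attributes)
        # attribute name -> {normalized value -> set of EntryIDs}
        self.equality = {attr: {} for attr in indexed_attrs}

    @staticmethod
    def normalize(value):
        # Assumption: case-insensitive match with whitespace collapsed.
        # Real normalization depends on the attribute's matching rule.
        return " ".join(value.lower().split())

    def add_entry(self, dn, attrs):
        """Store the entry and update every configured equality index."""
        entry_id = self.next_id
        self.next_id += 1
        self.id2entry[entry_id] = (dn, attrs)
        for attr, index in self.equality.items():
            for value in attrs.get(attr, []):
                index.setdefault(self.normalize(value), set()).add(entry_id)
        return entry_id

    def search_equal(self, attr, value):
        """Return the DNs of entries whose attr equals value."""
        key = self.normalize(value)
        if attr in self.equality:
            # Indexed: the key lookup yields the candidate EntryIDs.
            candidates = self.equality[attr].get(key, set())
        else:
            # Not indexed: every entry is a candidate.
            candidates = self.id2entry.keys()
        matches = []
        for entry_id in sorted(candidates):
            dn, attrs = self.id2entry[entry_id]
            # Re-apply the filter to the entry fetched from id2entry.
            if any(self.normalize(v) == key for v in attrs.get(attr, [])):
                matches.append(dn)
        return matches


backend = ToyBackend(indexed_attrs=["cn"])
backend.add_entry("dc=example,dc=com", {"dc": ["example"]})
user_id = backend.add_entry("cn=user,dc=example,dc=com", {"cn": ["User"]})
print(user_id, backend.equality["cn"])
print(backend.search_equal("cn", "USER"))
```

In this sketch the base entry gets EntryID 1, so cn=user,dc=example,dc=com ends up with EntryID 2, matching the example above. Searching on an attribute without an index still works, but every entry has to be fetched and checked.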