Responses to the 2 Billion Entry OID Benchmark...
By Clayton on Apr 04, 2008
We got a lot of responses to our benchmark, both on the Web and in our email bag. They came from the press, from customers, and (most vocally) from our competitors.
Regardless of where the responses came from, one thing was obvious: people were excited. Certainly nobody was asking if directories were dead, that's for sure. Dave Kearns actually commented specifically on how infrequently he sees things like this in the directory space.
Not surprisingly, the customer camp loved this benchmark. It confirms what they have already known and experienced, while giving them a fairly specific set of instructions for duplicating this type of result. It also goes hand in hand with our investment in virtual directory technology, showing them that we're serious about having the best-of-breed product in this space.
In fact, the report was SO detailed and complete that it was obvious that the competition was going to try to find ways to discredit the big picture by focusing on the minute details. Since we made no attempt to compare apples to oranges or obscure the way testing was done, we're absolutely confident that this is not only a very valid benchmark, but one that has been more scrutinized than any before it.
Now to address the complaints from the competition. These fell into three major categories:
1. This is too many entries. Nobody needs this many!
I addressed this to some degree in my original blog post on the topic. This is just something our high-end customers want.
2. This isn't enough entries. We can do more!
I found this one particularly interesting given who was most vocal about it: Howard Chu from Symas. What makes it interesting is that the benchmark I'm most often asked to comment on is one of his, which he ran against Sun.
So in that benchmark, a system with 8 cores and 16GB of memory was able to scale to 10m entries with some reasonable performance. Now, that report is from 2006, and there's nothing newer posted on their site. Specifically, there is nothing with any great detail about a 5 billion user benchmark on a quad-processor server. Their own 2006 report also shows that the server isn't exactly scaling in a very linear way as you add entries. If you look at our previous benchmarks, that kind of linear scaling is absolutely the case with OID.
Another note: there's no indication in his University of Michigan mailing list posting about the size of entries or other factors in his benchmark. For example, the entries in one of our earlier 100m benchmarks (also published in great detail) carried 120 attributes each. Scalability is heavily dependent on I/O, and clearly a small entry is easier to scale than a large one.
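To illustrate how much entry size drives the I/O side of a load like this, here is a small sketch that generates LDIF entries with a configurable attribute count and estimates the total data volume. The schema, attribute names, and sizes here are hypothetical illustrations, not the actual benchmark's entry definitions.

```python
# Hypothetical illustration: generate an LDIF entry with a configurable
# attribute count to see how per-entry size scales the total data volume.
def make_entry(uid, n_attrs):
    """Build one LDIF entry with n_attrs filler attributes."""
    lines = [
        f"dn: uid=user{uid},ou=people,dc=example,dc=com",
        "objectClass: inetOrgPerson",
        f"uid: user{uid}",
        f"cn: User {uid}",
        f"sn: {uid}",
    ]
    # Filler attributes standing in for the ~120 attributes per entry
    # mentioned in the 100m benchmark; each one adds to the I/O cost.
    for i in range(n_attrs):
        lines.append(f"description: attribute-{i}-padding-value")
    return "\n".join(lines) + "\n"

def estimated_total_bytes(n_entries, n_attrs):
    """Rough total LDIF size for n_entries of this shape."""
    return len(make_entry(0, n_attrs).encode()) * n_entries

# The same 2 billion entry count means very different I/O at load
# time depending on entry size.
small = estimated_total_bytes(2_000_000_000, 10)
large = estimated_total_bytes(2_000_000_000, 120)
print(f"10 attrs : ~{small / 1e12:.1f} TB")
print(f"120 attrs: ~{large / 1e12:.1f} TB")
```

The point of the sketch is simply that two "10 million entry" benchmarks can differ by an order of magnitude in actual bytes moved, which is why entry size matters when comparing results.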
That said, the key point of our benchmark was not that someone else couldn't do a bigger one, but that we could do it without making the wrong kind of trade-offs. We're the only directory that can scale into the billions while still taking advantage of key enterprise data management features, such as secure backups, transparent data encryption, Database Vault, and other capabilities that come with building on mature data management technology.
In fact, the idea that Oracle Database doesn't scale is pretty funny in itself. While LDAP benchmarks are completely unstandardized and generally rely on tools built by Sun (SLAMD, for example), database benchmarking is highly standardized, and Oracle Database is a clear leader in this area, not only in raw scalability but in TPC-C price/performance as well.
3. The benchmark isn't realistic enough
Jonathan Gershater from Sun's directory team has a very thoughtful blog posting about our results. It's not quite as dramatic and feisty as Howard Chu's response, but it's certainly equally skeptical.
He makes three basic points:
1. Turning off change logs helped our results
Yes, we did, but disabling change logs during data load is actually a best practice for most directories: you load the initial data set directly on each of the main servers and then turn on replication, rather than pushing the entries to the replicas via LDAP replication. The changelog was disabled mainly to avoid a large accumulation of changes across the repeated runs; it may have helped incrementally, but only in the 'modify' tests.
2. Password policy disabled helped our results
Not quite. Disabling password policy helped bulkload incrementally, but it had no impact on any other LDAP operation results, since only failed bind/compare operations and password updates take an incremental hit. It was turned off primarily so that we could use the entries generated by the Sun benchmarking tool.
3. The queries generated weren't realistic
I disagree. In a 2 billion user environment, you're generally talking about groups of active users, groups of less active users, and even some inactive users. A better test would have made 10% of the requests completely random, which I understand a new SLAMD beta is able to generate.
Even if one were to cut the measured numbers in half to account for making every single user truly random, the results would remain outstanding, at hundreds of millions of operations every hour.
The benchmark is out there, and it's a valid one. People will certainly run their own benchmarks, and competitors will always find a flaw in any benchmark. If there's one thing we're looking to do next, it's to publish the typical benchmarks we run on everyday hardware as part of each release. And we're going to continue doing it in an above-board way, with specific details as in this report, to help our customers size and scale this software in the best possible way.