
April 1, 2008 Archives

April 1, 2008

Two Billion Users...Why?

I was reading the comments on Dave Kearns' recent article about the two-billion-user benchmark executed against Oracle Internet Directory and thought I'd answer a few questions and make a few key points.

First, why would we even bother to benchmark a single server to handle such an "unrealistic" number of users in the first place?

The answer is that we get asked to do benchmarks in the 150m range all the time and have many customers in telecommunications, online gaming, and online services that easily exceed 50-100m user entries. Rather than continue to do these benchmarks one at a time, it's simply more efficient to prove that we can blow the doors off just about any number presented.

Don't need 2 billion users? The test shows linear scalability. Get yourself a lower-end box and you're good to go. The exact calculations are very well defined. Can't figure it out for yourself? Send me an email and I'll get someone to do the math for you.

Second, there's a continuing perception that directories are still the read-optimized, hierarchical, specialized data stores that they were 10-15 years ago when the main application for directories was white pages.

I'll be talking about this "need a specialized database" myth in my next post.


Don't Band-Aid Your Identity Infrastructure

My 3-year-old daughter is a big fan of band-aids. The band-aid in the picture is a particular favorite. For every real or perceived injury, a band-aid heals all.

Having delivered virtual directory software for quite some time, I've been asked to help deal with a lot of infrastructure wounds. Some of these are real issues that virtual directories can play a strategic role in solving. Others have been areas that are better suited for other identity management technology.

In these latter cases, a virtual directory would have been a band-aid on a very serious problem, or maybe a band-aid on a problem that never existed in the first place.

My Favorite: My data sources are flakey.

This can be a real infrastructure issue, but it's rarely something a virtual directory is going to solve.

Why? Because you want your identity information fresh and secure. The virtual directory way of solving this problem would be with a cache. However, caches have several drawbacks:


  1. They get old. Identity information needs to be fresh. Do you really want accounts hanging around for hours after someone has been escorted out of the building by security? Some vendors try to solve this by keeping remote repositories in sync, but this just opens up the virtualization layer to all of the problems of meta-directory.
  2. They bypass security at the source. By using a cache, you are keeping a copy of the data in the virtual directory that will be shared by all users instead of understanding that different people may have different data available to them.
  3. They're difficult to manage. A cache adds to the total cost of ownership of your infrastructure. After all, you'll need to back it up, secure it, replicate it, distribute it, and otherwise think about it a lot. By keeping the virtualization layer simpler, it can often sit in a closet without much attention at all.
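The first two drawbacks can be sketched in a few lines of Python. This is a minimal illustration, not how any particular virtual directory implements caching: `SharedTTLCache`, `FakeClock`, the one-hour TTL, and the `enabled` attribute are all hypothetical names and values chosen for the example, and a plain dict stands in for the real data source.

```python
import time


class SharedTTLCache:
    """A naive shared cache in front of a (simulated) identity source."""

    def __init__(self, source, ttl_seconds, clock=time.monotonic):
        self._source = source      # dict standing in for the real directory
        self._ttl = ttl_seconds
        self._clock = clock
        # Keyed by DN only: every caller shares the same copy, so any
        # per-user access rules enforced at the source are not re-applied.
        self._entries = {}         # dn -> (expires_at, copy of entry)

    def lookup(self, dn):
        now = self._clock()
        cached = self._entries.get(dn)
        if cached is not None and cached[0] > now:
            return cached[1]       # served from cache, possibly stale
        entry = self._source.get(dn)
        if entry is not None:
            entry = dict(entry)    # keep a private copy, as a cache would
        self._entries[dn] = (now + self._ttl, entry)
        return entry


class FakeClock:
    """Manually advanced clock so the demo does not need to sleep."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


def demo():
    dn = "uid=jdoe,ou=people,dc=example,dc=com"
    source = {dn: {"uid": "jdoe", "enabled": True}}
    clock = FakeClock()
    cache = SharedTTLCache(source, ttl_seconds=3600, clock=clock)

    before = cache.lookup(dn)["enabled"]   # fetched from the source
    source[dn]["enabled"] = False          # account disabled at the source
    clock.advance(600)                     # ten minutes later
    stale = cache.lookup(dn)["enabled"]    # cache still answers from its copy
    clock.advance(3600)                    # TTL has now expired
    fresh = cache.lookup(dn)["enabled"]    # re-fetched from the source
    return before, stale, fresh
```

Calling `demo()` shows the account still reported as enabled ten minutes after it was disabled at the source, and only corrected once the TTL expires. Shortening the TTL narrows that window but pushes more load back onto the source the cache was meant to protect.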

What are the Alternatives?

In many cases, no alternative is really needed. That band-aid may have added some psychological comfort, but it wasn't what healed the bruise.

How badly does your data source really perform? Why is that performance acceptable today? Do you have a better place to get that information?

I hear the concern about network links a lot, but it often overlooks the fact that when a link is down, people usually lose access to the applications that use the identity data as well.

Often there is a bit of cover-your-rear here, and vendors can play a part. It's a bit like taking your car in for service and hearing "your brakes may be okay, but maybe you should replace them before you get into an accident." You're less likely to question the source of the advice if it sounds like they're keeping you safe.

However, you may in fact have real issues with your data sources.

In these cases, the answer is almost invariably that it's cheaper and better to fix the source. It's much easier to add an extra directory replica or replicate your database than it is to have a long-term solution that involves a product being successfully able to detect changes in all of your infrastructure components.

In cases where this data needs to move around, provisioning is a mature technology with a lot of success in solving exactly this problem.


The "Directory is a Read-Centric Database" Myth

Pop Quiz: Which of the following pieces of data don't belong in a directory server?

A. Username

B. Telephone Number

C. Favorite Color

D. Last Login Time

If you said the answer was D, you've probably read a few good LDAP books from the '90s, when directories were all about white pages and "tuned for reads". This was the same period of time when Java was mostly about applets, if you'll recall (though I still see the old "Java is Slow" myth floating around).

Yes, directories are still used for white pages, but nobody buys them for that anymore.

The real value in directories is the ability to build powerful, user-aware enterprise applications that can share a single source for information about user identity. This means that while directories continue to need to be strong at fetching information quickly, there's also a need to be more flexible and less arbitrary about the kind of information that is stored in a directory.

Last login time, like bad password count and other attributes, is very useful to applications, but violates ancient, arbitrarily established rules for what gets stored in a directory server (reads vs. writes).

So what does this have to do with read vs. write, flat vs. hierarchical, relational vs. embedded, etc...?

A big ding against Oracle Internet Directory back in the early days was that we used Oracle Database under the covers to store our data. The myth was that this was somehow going to underperform with reads and over-perform with writes (eh? over-performing?). Clearly with the recently posted benchmark, the underperform-with-read argument has been buried and attributes that require writes on login, presence, or location can be easily supported.

A second ding was that because directories were hierarchical, you needed an embedded data store in order to represent that hierarchy. I'd like someone to explain why B-Trees are so much more efficient than R-Trees at this kind of thing -- they're not. At the end of the day, nearly every directory represents the distinguished name as a single, normalized string and indexes it. Your performance is likely to be the same either way.
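To make the "single, normalized string" point concrete, here is a minimal Python sketch. It is an assumption-laden illustration, not a description of how Oracle Internet Directory or any other product stores entries: `normalize_dn`, `index_key`, and `DnIndex` are hypothetical names, the normalization is deliberately simplistic (it ignores escaped commas and multi-valued RDNs, and it case-folds every value), and reversing the RDN order to get an index key is just one way to let an ordinary ordered index answer subtree searches.

```python
import bisect


def normalize_dn(dn):
    """Simplified DN normalization: trim, collapse spaces, lowercase."""
    rdns = []
    for rdn in dn.split(","):
        attr, _, value = rdn.partition("=")
        rdns.append(attr.strip().lower() + "=" + " ".join(value.split()).lower())
    return ",".join(rdns)


def index_key(dn):
    """Reverse the RDN order so that children sort right after their parent."""
    return ",".join(reversed(normalize_dn(dn).split(",")))


class DnIndex:
    """Entries keyed by one normalized string; a sorted list plays the index."""

    def __init__(self):
        self._keys = []      # sorted index keys
        self._entries = {}   # index key -> entry

    def add(self, dn, entry):
        key = index_key(dn)
        if key not in self._entries:
            bisect.insort(self._keys, key)
        self._entries[key] = entry

    def get(self, dn):
        """Base-scope lookup: a single keyed read."""
        return self._entries.get(index_key(dn))

    def subtree(self, base_dn):
        """Subtree search: the base entry plus a range scan over its prefix."""
        base = index_key(base_dn)
        found = [base] if base in self._entries else []
        prefix = base + ","
        i = bisect.bisect_left(self._keys, prefix)
        while i < len(self._keys) and self._keys[i].startswith(prefix):
            found.append(self._keys[i])
            i += 1
        return [self._entries[k] for k in found]
```

With entries added under `ou=people,dc=example,dc=com`, `get()` is a single keyed read and `subtree()` is a contiguous range scan over sorted keys. Nothing here needs a hierarchy-aware store, which is the point: the hierarchy lives in the string, and any store with an ordered index can serve it.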

Now that we've moved well beyond the white pages phase, we need to start treating identity information with the same seriousness that we treat transactional information. This includes layering on real data-level security, secure backups, and performance tuning/monitoring. This is the benefit that Oracle Internet Directory provides.


Oracle's New Operating System Security Software

Always exciting to see a successful new product launch. Version 1.0 is always the hardest and I think you'll agree that the team has done a really good job on this one.

Basically we've automated the entire process of Unix and Linux account centralization.

Yes. I know. It's been technically possible to do this to some degree for a while and we in fact are leveraging existing, standard technology as part of this offering.

However, if it was so easy to do already, why do nearly all of the customers I speak with tell me that they've not completed such a project after years of trying? The answer: it's actually pretty hard to do right. Or rather, it was.

If you're using an environment with Linux (RedHat, OEL, SUSE), Solaris, HP-UX, and/or AIX systems, you can follow some simple directions that will:


  • automatically configure a directory server instance to hold your users,

  • migrate your users from files, NIS, or other LDAP directories, and

  • perform client configuration across your managed systems -- including SSL.


This last bit about SSL is particularly important, given that we've seen customers spend months trying to get this to work across all of their platforms due to the myriad of SSL implementations out there.

All of this is pre-certified end-to-end.



About April 2008

This page contains all entries posted to Clayton Donley's Blog in April 2008. They are listed from oldest to newest.
