
April 2008 Archives

April 1, 2008

Two Billion Users...Why?

I was reading the comments area on Dave Kearns' recent article about the two-billion user benchmark executed against Oracle Internet Directory and thought I'd answer a few questions and make a few key points.

First, why would we even bother to benchmark a single server to handle such an "unrealistic" number of users in the first place?

The answer is that we get asked to do benchmarks in the 150m range all the time and have many customers in telecommunications, online gaming, and online services that easily exceed 50-100m user entries. Rather than continue to do these benchmarks one at a time, it's simply more efficient to prove that we can blow the doors off just about any number presented.

Don't need 2 billion users? The test shows linear scalability. Get yourself a lower-end box and you're good to go. The exact calculations are very well defined. Can't figure it out for yourself? Send me an email and I'll get someone to do the math for you.

Second, there's a continuing perception that directories are still the read-optimized, hierarchical, specialized data stores that they were 10-15 years ago when the main application for directories was white pages.

I'll be talking about this "need a specialized database" myth in my next post.


Don't Band-Aid Your Identity Infrastructure

My 3-year-old daughter is a big fan of band-aids. The band-aid in the picture is a particular favorite. For every real or perceived injury, a band-aid heals all.

Having delivered virtual directory software for quite some time, I've been asked to help deal with a lot of infrastructure wounds. Some of these are real issues that virtual directories can play a strategic role in solving. Others have been areas that are better suited for other identity management technology.

In these latter cases, a virtual directory would have been a band-aid on a very serious problem, or maybe a band-aid on a problem that never existed in the first place.

My Favorite: My data sources are flaky.

This can be a real infrastructure issue, but it's rarely something a virtual directory is going to solve.

Why? Because you want your identity information fresh and secure. The virtual directory way of solving this problem would be with a cache. However, caches have several drawbacks:


  1. They get old. Identity information needs to be fresh. Do you really want accounts hanging around for hours after someone has been escorted out of the building by security? Some vendors try to solve this by keeping remote repositories in sync, but this just opens up the virtualization layer to all of the problems of meta-directory.
  2. They bypass security at the source. By using a cache, you are keeping a copy of the data in the virtual directory that will be shared by all users instead of understanding that different people may have different data available to them.
  3. They're difficult to manage. A cache adds to the total cost of ownership of your infrastructure. After all, you'll need to back it up, secure it, replicate it, distribute it, and otherwise think about it a lot. By keeping the virtualization layer simpler, it can often sit in a closet without much attention at all.
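
To make the first drawback concrete, here is a minimal Python sketch of a time-to-live cache sitting in front of an identity source. Everything in it is an assumption for illustration: the `TtlCache` class, the dictionary standing in for the directory, the one-hour TTL, and the `jdoe` account are hypothetical and do not describe any particular product.

```python
import time


class TtlCache:
    """Tiny time-to-live cache in front of a slower identity source (hypothetical)."""

    def __init__(self, source, ttl_seconds, clock=time.monotonic):
        self._source = source      # dict standing in for the real directory
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the demo is deterministic
        self._entries = {}         # uid -> (expires_at, record)

    def lookup(self, uid):
        now = self._clock()
        hit = self._entries.get(uid)
        if hit is not None and hit[0] > now:
            return hit[1]          # served from cache, source is not consulted
        record = self._source.get(uid)
        self._entries[uid] = (now + self._ttl, record)
        return record


def demo():
    fake_now = [0.0]
    source = {"jdoe": {"status": "active"}}
    cache = TtlCache(source, ttl_seconds=3600, clock=lambda: fake_now[0])

    first = cache.lookup("jdoe")   # populates the cache
    del source["jdoe"]             # account removed at the source
    fake_now[0] = 1800.0           # 30 minutes later
    stale = cache.lookup("jdoe")   # still answered from the cache
    fake_now[0] = 3601.0           # TTL has elapsed
    fresh = cache.lookup("jdoe")   # cache finally notices the removal
    return first, stale, fresh
```

In this sketch the removed account keeps resolving until the TTL runs out. Shortening the TTL narrows that window, but only by sending more traffic back to the source the cache was meant to shield.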

What are the Alternatives?

In many cases, no alternative is really needed. That band-aid may have added some psychological comfort, but it wasn't what healed the bruise.

How badly does your data source really perform? Why is that performance acceptable today? Do you have a better place to get that information?

I hear the concern about network links a lot, but in these cases the concern often doesn't reflect the fact that if a link is down, people often lose access to the applications using the identity as well.

Often there is a bit of cover-your-rear here, and vendors can play a part. It's a bit like taking your car in for service and hearing "your brakes may be okay, but maybe you should replace them before you get into an accident". You're less likely to question the source of the advice if it sounds like they're keeping you safe.

However, you may in fact have real issues with your data sources.

In these cases, the answer is almost invariably that it's cheaper and better to fix the source. It's much easier to add an extra directory replica or replicate your database than it is to have a long-term solution that involves a product being successfully able to detect changes in all of your infrastructure components.

In cases where this data needs to move around, provisioning is a mature technology with a lot of success in solving exactly this problem.


The "Directory is a Read-Centric Database" Myth

Pop Quiz: Which of the following pieces of data don't belong in a directory server?

A. Username

B. Telephone Number

C. Favorite Color

D. Last Login Time

If you said the answer was D, you've probably read a few good LDAP books from the 90's, when directories were all about white pages and "tuned for reads". This was the same period of time when Java was mostly about applets, if you'll recall (though I still see the old "Java is Slow" myth floating around).

Yes, directories are still used for white pages, but nobody buys them for that anymore.

The real value in directories is the ability to build powerful, user-aware enterprise applications that can share a single source for information about user identity. This means that while directories continue to need to be strong at fetching information quickly, there's also a need to be more flexible and less arbitrary about the kind of information that is stored in a directory.

Last login time, like bad password count and other attributes, is very useful to applications, but violates ancient, arbitrarily established rules for what gets stored in a directory server (reads vs. writes).

So what does this have to do with read vs. write, flat vs. hierarchical, relational vs. embedded, etc...?

A big ding against Oracle Internet Directory back in the early days was that we used Oracle Database under the covers to store our data. The myth was that this was somehow going to underperform with reads and over-perform with writes (eh? over-performing?). Clearly with the recently posted benchmark, the underperform-with-read argument has been buried and attributes that require writes on login, presence, or location can be easily supported.

A second ding was that because directories were hierarchical, you needed an embedded data store in order to represent that hierarchy. I'd like someone to explain why B-Trees are so much more efficient than R-Trees at this kind of thing -- they're not. At the end of the day, nearly every directory represents the distinguished name as a single, normalized string and indexes it. Your performance is likely to be the same either way.
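
As a rough illustration of the "single, normalized string" idea, the Python sketch below normalizes a distinguished name and uses it as the key of a flat index. It is an assumption-laden toy, not how Oracle Internet Directory or any other server is implemented: `normalize_dn` and `DnIndex` are hypothetical names, escaped commas and multi-valued RDNs are ignored, and every attribute value is treated as case-insensitive.

```python
def normalize_dn(dn):
    """Collapse case and whitespace so equivalent DNs compare equal.

    Simplification: assumes no escaped commas or multi-valued RDNs,
    and treats every attribute value as case-insensitive.
    """
    rdns = []
    for rdn in dn.split(","):
        attr, _, value = rdn.partition("=")
        rdns.append(f"{attr.strip().lower()}={' '.join(value.split()).lower()}")
    return ",".join(rdns)


class DnIndex:
    """Flat index keyed by normalized DN; the hierarchy lives only in the string."""

    def __init__(self):
        self._by_dn = {}

    def add(self, dn, entry):
        self._by_dn[normalize_dn(dn)] = entry

    def get(self, dn):
        return self._by_dn.get(normalize_dn(dn))

    def children(self, base_dn):
        # One-level search: entries whose DN is exactly one RDN plus the base.
        suffix = "," + normalize_dn(base_dn)
        return sorted(
            dn for dn in self._by_dn
            if dn.endswith(suffix) and "," not in dn[: -len(suffix)]
        )
```

The point of the sketch is that nothing here needs a tree-shaped store: an exact lookup is a key lookup, and a one-level search is a suffix match on the same key. A real server would presumably back this with an ordered index rather than a scan, but that choice is independent of whether the store underneath is embedded or relational.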

Now that we've moved well beyond the white pages phase, we need to start treating identity information with the same seriousness that we treat transactional information. This includes layering on real data-level security, secure backups, and performance tuning/monitoring. This is the benefit that Oracle Internet Directory provides.


Oracle's New Operating System Security Software

Always exciting to see a successful new product launch. Version 1.0 is always the hardest and I think you'll agree that the team has done a really good job on this one.

Basically we've automated the entire process of Unix and Linux account centralization.

Yes. I know. It's been technically possible to do this to some degree for a while and we in fact are leveraging existing, standard technology as part of this offering.

However, if it was so easy to do already, why do nearly all of the customers I speak with tell me that they've not completed such a project after years of trying? The answer: it's actually pretty hard to do right. Or rather, it was.

If you're using an environment with Linux (RedHat, OEL, SUSE), Solaris, HP-UX, and/or AIX systems, you can follow some simple directions that will:


  • automatically configure a directory server instance to hold your users,

  • migrate your users from files, NIS, or other LDAP directories, and

  • perform client configuration across your managed systems -- including SSL.


This last bit about SSL is particularly important, given that we've seen customers spend months trying to get this to work across all of their platforms due to the myriad of SSL implementations out there.

All of this is pre-certified end-to-end.

Find out more...


April 4, 2008

Death of the Meta-Directory - Follow-up

Dave Kearns posted this article earlier in March and there have been a number of follow-ups (including my favorite here). I sent him a quick note as part of a follow-up on it at the time, but thought I'd share the same with everyone.

1. Meta-directory really merged into the basic directory services layer and stopped being an independent category several years ago, with the exception of one noted vendor. Meta-directories were being used for provisioning before provisioning became process-centric with workflow and identity auditing requirements. Our product direction on provisioning is certainly focused on the next frontier -- seamless integration with enterprise applications, including separation of duties and role management.

2. Virtual directories are really a more application-centric technology that solves the problem of getting the right identities to the right applications in the right format. It's the difference between transforming existing data vs. building new identity stores. Most customers need both provisioning and virtualization technologies. This is probably why so many Microsoft Active Directory users are using Oracle Virtual Directory, given their admitted gap in this area.


Simple iPhone LDAP Phone Book App

I spend about 2 hours every Saturday morning at a Starbucks in Cupertino waiting for my 5-year-old to finish Chinese language lessons.

The great thing about free time, Internet, a cup of coffee, and close proximity to Apple is that it gives you time and inspiration to play with interesting software. Lately for me, this has been the iPhone SDK.

Since a few of the teams I manage are responsible for Oracle Internet Directory and Oracle Virtual Directory, it only seems logical that one of the first things I did was write some code that can talk to these products natively. It took about 2 hours from start to finish, including getting an LDAP SDK working on Mac OS X w/ an ARM processor.


Above are a few pictures of a simple LDAP phonebook client that I wrote using the SDK over my last few visits. The pics show it connecting to Oracle's corporate directory with a few buttons for dialing or emailing the resulting contact.
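
The app's source isn't shown here, so the following is only a sketch of the piece any phonebook client of this kind needs: turning what the user typed into an LDAP search filter. It is written in Python rather than the iPhone SDK, the function names are hypothetical, and the choice of `cn`, `mail`, and `objectClass=person` is an assumption about a typical corporate directory schema. The escaping follows the rules for special characters in the LDAP string representation of search filters (RFC 4515).

```python
# Characters that must be escaped inside an LDAP filter value.
_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "\x00": r"\00",
}


def escape_filter_value(text):
    """Escape user input so it cannot change the structure of the filter."""
    return "".join(_ESCAPES.get(ch, ch) for ch in text)


def phonebook_filter(name_fragment):
    """Build a filter matching people whose cn contains, or mail starts with, the fragment."""
    safe = escape_filter_value(name_fragment)
    return f"(&(objectClass=person)(|(cn=*{safe}*)(mail={safe}*)))"
```

For example, `phonebook_filter("Smith")` yields `(&(objectClass=person)(|(cn=*Smith*)(mail=Smith*)))`, which would then be handed to whatever LDAP client library the platform provides, along with a search base and the attributes to return (name, phone number, email).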

Certainly this is pretty straightforward and there are many more innovative uses of identity information in a device like this. No immediate plans to release this particular tool, but I'd love to talk to other developers that are looking to do innovative things with identity information on mobile platforms, be it iPhone, Nokia/Symbian, or others...

Clayton

I'll be at RSA Security Conference next week (4/8-4/9). Drop me an email or ping me on twitter if you'll be there and want to meet up.


Responses to the 2 Billion Entry OID Benchmark...

We got a lot of response to our benchmark on the Web and in our email bag. The responses were by the press, customers, and (most vocally) our competitors.

Regardless of where the responses came in from, the one thing that was obvious: people were excited. Certainly nobody was asking if directories were dead, that's for sure. Dave Kearns actually commented specifically about how infrequently he sees things like this in directory.

Not surprisingly, the customer camp loved this benchmark, as it confirms what they have already known and experienced, while giving them a fairly specific set of instructions to duplicate this type of result. It also goes hand in hand with our investment in virtual directory technology to show them that we're serious about having the best-in-breed product in this space.

In fact, the report was SO detailed and complete that it was obvious that the competition was going to try to find ways to discredit the big picture by focusing on the minute details. Since we made no attempt to compare apples to oranges or obscure the way testing was done, we're absolutely confident that this is not only a very valid benchmark, but one that has been more scrutinized than any before it.

Now to address the complaints from the competition. These fell into basically 3 major categories:

1. This is too many entries. Nobody needs this many!

I addressed this to some degree in my original blog post on the topic. This is just something our high-end customers want.

2. This isn't enough entries. We can do more!

I found this interesting, particularly because of who was most vocal about it: Howard Chu from Symas. A benchmark I'm always asked to comment on is this one, which he did against Sun.

So basically, on a system with 8 cores and 16GB of memory, the server in that benchmark was able to scale to 10m entries with some reasonable performance. Now I see that this report is from 2006, but there's nothing newer posted on their site. Specifically, nothing with great detail about a 5 billion user benchmark on a quad processor server. Their own 2006 report also shows that the server isn't exactly scaling in a very linear way as you add entries. If you look at our previous benchmarks, linear scaling is absolutely the case with OID.

Another note is that there's no indication in his University of Michigan mailing list posting about the size of entries or other factors in his benchmark. For example, the entries in one of our earlier 100m benchmarks (also published in great detail) had 120 attributes each. A lot of scale is very dependent on I/O and clearly a small entry is easier to scale than a large one.

That said, the key point being made with our benchmark was not that someone else couldn't do a bigger one, but that we could do it without making the wrong kind of trade-offs. We're the only directory that can scale into the billions while still taking advantage of key enterprise data management features, such as secure backups, transparent data encryption, database vault, and other things that come with building on mature data management technology.

In fact, the idea that Oracle Database doesn't scale is pretty funny in itself. While LDAP benchmarks are completely unstandardized and generally use some tools built by Sun (SLAMD, for example), database benchmarking is very standardized and Oracle Database is a clear leader in this area, not only in scalability, but price per TPC-C.

3. The benchmark isn't realistic enough

Jonathan Gershater from Sun's directory team has a very thoughtful blog posting about our results. Not quite as dramatic and feisty as Howard Chu, but certainly equally skeptical.

He basically has three points to make:

1. Turning off change logs helped our results

Yes, we did, but this is actually a best practice for most directories during data load, as you'd simply load the initial set on each of the main servers and then turn on replication rather than load the entries in replicas via LDAP replication. Changelog was disabled more to avoid a large accumulation of changes during the various repeated runs, but it may have helped incrementally in the 'modify' tests only.

2. Password policy disabled helped our results

Not quite. Disabling password policy helped bulkload incrementally, but had no impact on any other LDAP operation result, since only failed bind/compare operations and password updates take an incremental hit. It was turned off primarily to make sure we could use the entries being generated by the Sun benchmarking tool.

3. The queries generated weren't realistic

I disagree. In a 2 billion user environment, you're generally talking about groups of active users and groups of less active users (and even some inactive users). A better test would have been to make 10% of these requests completely random, which I understand a new SLAMD beta is able to generate.

Even then, if one were to cut the number of responses in half in order to make every single user truly random, the results remain outstanding with hundreds of millions of operations every hour.

In Summary

The benchmark is out there. It's a valid benchmark. Certainly people are going to do their own benchmarks and competitors will always find a flaw in any benchmark. If there's one thing we're looking to do, it's publish some of the typical benchmarks we do on every-day hardware as part of each release. And we're going to continue to do it in an above-board way with specific details as we did with this report in order to help our customers size and scale this software in the best possible way.


April 8, 2008

Will the Real Virtual Directory Please Stand Up?

I was reading this posting from my friend and colleague, Phil Hunt, in which he talks about the ongoing discussion between Dave Kearns and Kim Cameron about the death of meta-directories.

Not only is he correct in pointing out that Kim's definition of Meta 2.0 is exactly what virtual directory has been since 1.0, but it's interesting to see that some virtual directory vendors continue to push something that looks very much like meta-directory 1.0.

This something is persistent cache, a code word for a locally stored meta-verse, which basically contains a roll-up of all the remote data. I can assure you that Kim Cameron knows exactly what a meta-verse is...he probably invented it at ZoomIt.

Regardless of whether you call it a meta-verse or a persistent cache, the same mechanics apply. You're in the business of synchronizing data between repositories, with all the problems that arise from that.

Glad Microsoft now realizes that virtual directory (or as they call it, meta-directory 2.0) is the way to go. Wish we could all agree on some standard terminology for these kinds of things, though. It would really help the customers.


LDAP to NIS Gateway?


April 11, 2008

Service Oriented Security - Oracle @ RSA

For those of you that don't like reading press releases and wonder what all the buzz is about our recently announced Service Oriented Security offerings, Tony Baer from OnStrategies has a great roll-up here.

Nishant Kaushik, one of our key architects, does a fantastic job of boiling down this announcement as follows:

SOS covers the four stages of an application lifecycle - development, deployment, administration and governance. With SOS, organizations can now centralize and externalize security solutions as part of a flexible security architecture. Recent identity related efforts like the Identity Governance Framework are also part of this architecture, providing the ability to deliver privacy-aware applications.

Certainly the key take-away for customers and those building everything from the next bank portal to the next Flickr is that you can get cohesive identity management as a service today, so you're better off crafting your value on top of these services rather than trying to do a better job with identity management fundamentals.

I'm in Seattle for my brother's wedding, so was unable to see Thomas' talk at RSA yesterday. I would love to get some email with first-hand accounts and feedback.


April 15, 2008

Virtual, Meta, and Identity Buses -- Oh My!

Wow. Lots of good discussion lately on both the identity services topic as well as general directory services (meta, virtual, and the like).

I'll start with Jeff Bohren's AD as the elephant in the room post.

I agree with Jeff that Active Directory is almost always a source for internal user information. I also agree that LDAP is everywhere (having written a book on LDAP and one of the early Perl interfaces to it alongside Netscape, I MAY be biased on this point).

What I think his post misses is the fact that most LDAP access in most applications is poorly written, even when using ADSI or ADO to talk natively to Active Directory. I can't count the number of virtual directory deployments that we've sold to help customers in environments that were nearly 100% Microsoft (ADO/ADSI-enabled apps talking to Microsoft AD). Many of these deployments were to get around bad schema assumptions, others were to get around topology issues or forest boundary issues.

While we sell virtual directory technology, we hate making our customers pay money to solve such tactical issues. We want to be layering on higher-order value.

So when Phil Hunt or others talk about the Liberty IGF project, what they're really saying is that we want to let application developers code against identity in a way they understand and can do well, rather than through a native access protocol that requires specialization. So while LDAP isn't going away, and everything from virtual directories to identity buses will need to support native access over LDAP to be successful, not looking at what developers are learning and using every day would be a mistake.

Keep in mind that developers must integrate with a LOT of technologies to build an enterprise application or portal. For example, a portal may be integrating with HR, CRM, and ERP systems. That integration is increasingly happening via web services. Giving these developers a mandate to use a completely different type of technology to integrate identity will only make identity more specialized and less standardized and understood over time. That is a recipe for disaster.

And as for Active Directory itself being the center of the universe? Hardly. While it may be the center for usernames, emails, and passwords of internal users for most enterprises, it is not as popular for extranets and Internet facing users.

As for Kim Cameron's post on second generation meta-directory and the identity bus, he knows better than me what the original ZoomIt product was capable of (and oddly enough, I've heard enough rants in the past from Phil Hunt about how much he loved that product that I'm inclined to agree that it could in fact, do these things).

Here are Kim's three key requirements for an identity bus:

  • By "next generation application" I mean applications based on web service protocols. Our directories need to integrate completely into the web services fabric, and application developers must be able to interact with them without knowing LDAP.
  • Developers and users need places they can go to query for "core attributes". They must be able to use those attributes to "locate" object metadata. Having done so, applications need to be able to understand what the known information content of the object is, and how they can reach it.
  • Applications need to be able to register the information fields they can serve up.

    His first point is exactly the same as my point above. LDAP is great. LDAP is ubiquitous. LDAP is not, however, the future of identity access.

    On the second point, that place today can be a directory, database, web service, or just about anything else -- and usually more than one of these. The big issue for developers and IT organizations is that it's hard to predict where this data will live by the time an app goes into production, so some abstraction service must exist. I'll disagree with Kim here and say that real virtual directories do an EXCELLENT job at navigating these complexities by giving application developers a sort-of identity dial-tone. Make one call and get your full identity. We even have people who do this over web services rather than LDAP for some applications, but there needs to be more standardization here.

    On the final point, I'll go further. It's a two-way street. Applications need to register their identity needs and repositories need a way to have their available attributes (and policies on those attributes) discoverable. Only then will supply and demand be accurately mapped, allowing services (whether based on IGF or an identity bus model) to thrive.
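
One way to picture the supply-and-demand mapping is the small Python sketch below. It is purely illustrative: the function names, the application and repository names, and the attribute lists are all hypothetical, and it says nothing about how IGF or any identity bus actually represents these declarations.

```python
def map_supply_to_demand(app_needs, repo_offers):
    """For each application, list the repositories able to supply each needed attribute."""
    plan = {}
    for app, attrs in app_needs.items():
        plan[app] = {
            attr: sorted(
                repo for repo, offered in repo_offers.items() if attr in offered
            )
            for attr in attrs
        }
    return plan


def unmet(plan):
    """Return (application, attribute) pairs that no repository can satisfy."""
    return sorted(
        (app, attr)
        for app, attrs in plan.items()
        for attr, repos in attrs.items()
        if not repos
    )
```

With applications declaring `{"portal": ["mail", "manager", "loyaltyTier"]}` and repositories publishing `{"ad": {"mail", "manager"}, "hr_db": {"manager"}}`, the plan shows two candidate sources for `manager` and flags `loyaltyTier` as a need nobody supplies. Neither side can see that gap unless both register what they need and what they offer.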



    April 21, 2008

    Secure Coding Practices and Web 2.0 Security

    I'm not sure how I missed Mary Ann Davidson's original blog posting on the subject of fixing security by fixing how developers learn to write software (and much more), but I came across Dennis Howlett's response to it on ZDNet recently. Both postings are on the long side, but are must-reads if you are involved in enterprise software as a creator or consumer.

    By coincidence, I also received an email from a colleague about a short white paper from HP covering common Web 2.0 security flaws. It's more an overview than a guide, but it provides a nice summary of issues, such as cross-site scripting, that may not be familiar to developers who lack knowledge of core security concepts. The white paper is available here (after a very detailed registration process), but to tie back to the articles above, nearly all of these flaws can be avoided with the right developer mindset, training, and processes.

    Some problems will go away as we sediment complexity into lower layers, but the days of developers writing code with obviously poor security will only come to an end when we can fundamentally change the way security is written into applications in the first place.



    The Cuckoo's Egg Revisited

    Ah. The Cuckoo's Egg. The first non-fiction computer security book I ever read. Even saw the author (Cliff Stoll) give a talk at a local college 10+ years ago.

    I was reminded of this book by a great conversation at our Customer Advisory Board last week.

    For those of you who haven't read it, the basic idea is that the author, a part-time IT administrator, finds a 75 cent billing discrepancy between two audit systems. Rather than write this off as computer error and move on, he discovers that a user who is on sabbatical used the system and one of the system accounting records for that access was intentionally deleted. From there, the book reads like a spy novel as the author tracks the hackers "in the early days" before most people thought of this sort of thing.

    Certainly while systems were compromised in the same way that systems are still compromised nearly 20 years later, basic security processes and practices have changed significantly. Identity Management certainly gives much more control over the management of inactive accounts, as well as better enforcement of good password policies that make it more difficult for password cracking tools to be so effective.

    I would love to get email with your IT security war stories that illustrate security then-and-now. I have a few of my own that I'll be sharing as well.


    April 22, 2008

    AmTrust Bank Talks about Centralizing Database Authentication

    AmTrust Bank packed the room at Oracle OpenWorld, but thankfully the web's a little larger and has more comfortable chairs.

    There will be a Webcast on May 1st at 1pm EDT (10am PDT) that reprises the original session and includes some new material.

    Follow this link to register for the webcast, which includes K.P. Singh and Peter Dinin of AmTrust and Forest Yin of Oracle talking about centralizing database accounts using the Oracle Database's Enterprise User Security (EUS) functionality together with Oracle Identity Management.



    Group Accounts and Lab Servers - How a Dating Service Took Out the Network

    Having just mentioned "The Cuckoo's Egg," I thought I'd share my first IT security experience. I started my career at a large enterprise on a team managing networks of servers and workstations from vendors like Sun, HP, Motorola, and the like. The events below took place in the early 90's.

    User accounts were centralized using NIS (in some cases exported to files and distributed to individual machines) with home grown tools for doing everything from adding/removing user accounts to backing up servers.

    Since we were a high-tech manufacturing company, we had many labs that contained specialized servers for testing. These specialized servers were generally wide open, with a large number of people holding privileged accounts (e.g. root). The lab machines were, of course, connected to the main network.

    At the same time, many of the tools used on various servers required shared access, which was done through the use of group accounts. Since many of these tools were run by commands that would remote shell using that group account, it was typical for these accounts to allow direct access (i.e. without using commands like su).

    It should be pretty obvious after the last two paragraphs that we were set up for a train wreck. This train wreck was triggered by something unexpected:

    A Dating Service

    Needless to say, someone at the company had apparently had an extremely bad experience with a dating service called "Heart to Heart". Rather than call the Better Business Bureau or tell his friends to avoid the service, he (or she) decided to send everyone in the company an email with the simple phrase:

    "Avoid Heart to Heart"

    The email was sent using a group account on a Sun server running SunOS 4.0.3. The connection to that server was made from an open HP lab server. The connection to the open lab server came from another open lab server in another city and in another division.

    All of the audit logs were enabled, but all of them simply logged that root or a group user had logged in and done some work. At no point was anything traceable to the user.

    The result?

    Because of the way the email was sent (to large lists, rather than bcc), a large number of vacation mail messages were triggered that went back to the group account, which in fact had mail forwarding set up to the rather large group of people that had access to the account. This in turn triggered lots of other individual vacation mails, autoresponders, "bots", and so forth from every person on that list back to the same wide distribution list.

    Within about 15 minutes, the entire email system was choking and it took hours to get things back to normal.

    It could have been worse!

    Ok, so technically the dating service itself didn't take out the network. We tightened things up significantly from that point on. I had no security responsibilities at the time and was not at fault, but the experience has stayed with me since.

    If the person had been more upset with his or her employer than with a dating service, what untraceable havoc could have been caused? Probably a lot worse.

    So I'll just leave this as a cautionary story to those of you who are in environments where only "the important systems" are under identity management. Lab servers, group accounts, and similar gaps reduce or remove accountability and can compromise the rest of your network.

    Oh, and we can help. :-)



    About April 2008

    This page contains all entries posted to Clayton Donley's Blog in April 2008. They are listed from oldest to newest.
