
Identity Bus Archives

May 10, 2008

Re: Talking about the Identity Bus

Kim Cameron of Microsoft makes a pitch for why Metadirectory is still relevant, or at least why data needs to live in multiple places.

One key element of this argument is that when combining transactional data with identity data, you're not likely to do the required data joining across remote systems.

Compare this to what happens if all the information necessary to respond to a query is present locally in a single database. I just do a "join" across the tables, and the SQL engine understands exactly how to optimize the query so the result involves little computing power and "even less time". Indexes are used and distributions of values well understood: many thousands of really smart people have been working on these optimizations in many companies for the last 40 years.

He's right and this is a really important point. The data used in this example absolutely should live in a repository where it can be locally joined.
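To make the local join concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (employees, purchase_orders, employee_number, and so on) are made up for illustration and do not come from Kim's post; the only point is that once a copy of the identity data sits next to the transactional data, a single indexed join answers the question.

```python
import sqlite3


def build_example_db():
    """Create an in-memory database holding both identity and order data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employees (
            employee_number TEXT PRIMARY KEY,
            name            TEXT NOT NULL,
            department      TEXT NOT NULL
        );
        CREATE TABLE purchase_orders (
            po_id           INTEGER PRIMARY KEY,
            employee_number TEXT NOT NULL REFERENCES employees(employee_number),
            amount          INTEGER NOT NULL
        );
        -- Index the join key so the engine does not have to scan every order.
        CREATE INDEX idx_po_employee ON purchase_orders(employee_number);
    """)
    conn.executemany(
        "INSERT INTO employees VALUES (?, ?, ?)",
        [("E100", "Ann", "Engineering"), ("E200", "Bob", "Sales")],
    )
    conn.executemany(
        "INSERT INTO purchase_orders VALUES (?, ?, ?)",
        [(1, "E100", 250), (2, "E100", 100), (3, "E200", 75)],
    )
    return conn


def orders_by_department(conn):
    """Join purchase orders to the locally stored identity data."""
    query = """
        SELECT e.department, COUNT(*) AS order_count, SUM(po.amount) AS total
        FROM purchase_orders AS po
        JOIN employees AS e ON e.employee_number = po.employee_number
        GROUP BY e.department
        ORDER BY e.department
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for row in orders_by_department(build_example_db()):
        print(row)
```

Note that the join key here is a slow-changing identifier (an employee number), which is exactly the kind of attribute that is cheap to keep copied in more than one place.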

How does the data get from point A to point B in this example? Which of these points is the starting point? Does this data actually originate first in point C? Do these repositories have the same representation for the given data elements?

Certain very widely used data is likely to be in multiple systems and has a relatively low rate of change that doesn't cause much of an issue for any of the usual means of getting it there. Such information might include unique identifiers, names, department, job code, email addresses, and the like.

In Kim's example, a join against an employee number, department, or similar information is quite plausible. By contrast, it is highly unlikely that such a join would be done against a password, data from CRM, or other such data.

The real problem today is that synchronization is so loosely coupled. This is unlike replication, where it has become relatively easy to recover from failure and the mechanism moving the data knows exactly how to deal with both ends of the connection.

As applications get better at pushing their changes, rather than depending on provisioning and meta-directory systems to compute deltas against their databases, much of this problem will be greatly simplified.

At that point, instead of the value being in how tightly you can make your connections and move changes, the value is in what you can do with those changes. Can you use those changes to trigger workflow? Can you apply business policy against those changes? Can you centrally audit and do reporting against those changes?
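As a rough illustration of those three questions, the sketch below takes a pushed change and runs it through an audit step, a policy check, and a workflow trigger. Everything in it is hypothetical: the Change fields, the ChangeProcessor class, and the "Finance" rule are stand-ins, not any product's behavior.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    subject: str    # identifier of the person the change is about
    attribute: str  # which attribute changed
    old: str
    new: str


@dataclass
class ChangeProcessor:
    audit_log: list = field(default_factory=list)
    workflow_queue: list = field(default_factory=list)

    def violates_policy(self, change):
        # Hypothetical business rule: moves into Finance need approval.
        return change.attribute == "department" and change.new == "Finance"

    def handle(self, change):
        # Audit centrally: every change is recorded, accepted or not.
        self.audit_log.append(change)
        if self.violates_policy(change):
            # Policy hit: trigger a workflow instead of applying directly.
            self.workflow_queue.append(("approval_required", change))
            return "pending"
        return "applied"


if __name__ == "__main__":
    processor = ChangeProcessor()
    print(processor.handle(Change("E100", "department", "Sales", "Finance")))
    print(processor.handle(Change("E100", "email", "a@example.com", "b@example.com")))
    print(len(processor.audit_log), "changes audited")
```

None of this cares how the change arrived, which is the point: once applications push changes, the interesting work moves from detecting them to acting on them.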

This higher order value is exactly what customers look for in provisioning.

The identity bus itself will be a mix of common publish/subscribe style data movement and virtualization that will provide the identity views that minimize the overall level of data movement through the system.


May 11, 2008

I'm Sorry Dave, I'm afraid I can't do that...

Dave Kearns has followed up on Kim Cameron's posting from Friday.

  1. Kim says that sometimes you need to copy data in order to join it with other data
  2. Dave says the same thing, except that he indicates you wouldn't copy the data but would just use "certain virtual directory functionality"

Actually, in #2, that functionality would likely be a persistent cache, which, if you look under the covers, is exactly the same as a meta-directory in that it copies data locally. In fact, the data may even be stored (again!) in a relational database (SQL Server in the Radiant Logic example he provides).

Let's use laser focus and only look at Kim's example of joining purchase orders with user identity.

Let's face it. Most applications aren't designed to go to one database when you're dealing solely with transactional data and another database when you're dealing with a combination of transactional data and identities.

If we model this through the virtual directory and say that every time an application joins purchase orders and identities it does so through the virtual directory (even via SQL instead of LDAP), you've now said the following:

  1. You're okay with re-modelling all of these data relationships in a virtual directory -- even those representing purchase order information.
  2. You're okay with moving a lot of identity AND transactional information into a virtual directory's local database.
  3. You're okay with making this environment scalable and available for those applications.

Unfortunately, this doesn't really hold up. There are a lot more issues, but even after just these first three (or even the first one) you begin to realize that while virtual directory makes sense for identity, it may not make sense as the ONLY way to get identity. I think the same thing goes for an identity hub that ONLY thinks in terms of virtualization.

The real solution here is a combination of virtualization with more standardized publish/subscribe for delivery of changes. This gets us away from this ad-hoc change discovery that makes meta-directories miserable, while ensuring that the data gets where it needs to go for transactions within an application.


Dave and Vikas Hop on the Right Bus

I may not agree that doing SQL through your virtual directory to get combined views of transactions and identity information is the right way to go (and I think Dave really wasn't trying to say that anyway), but...

I absolutely DO agree with Dave (and Vikas Mahajan) that there's no reason we should be building additional infrastructure around moving identity around vs. moving any other data around.

Let's keep in mind that a bus can move any arbitrary object from A to B or even A to B, C, and D.

The trick is to make sure that all of these points understand the object being passed between those points.

Just as multiple LDAP-enabled applications need to understand the same schema, multiple parties publishing/subscribing to a queue will need to understand the same messages.
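A minimal in-process sketch of that idea follows. The Bus class, the "identity.changed" topic, and the two subscriber systems are all invented for illustration; a real deployment would sit on an actual message broker. What matters is that both subscribers rely on the same attribute names in the message, even though each one uses different parts of it.

```python
from collections import defaultdict


class Bus:
    """Toy publish/subscribe bus: topics map to lists of callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for callback in self._subscribers[topic]:
            # Hand each subscriber its own copy of the message.
            callback(dict(message))


# Two hypothetical downstream systems, each keeping only what it needs.
mail_directory = {}
badge_system = {}


def update_mail(message):
    mail_directory[message["employee_number"]] = message["email"]


def update_badges(message):
    badge_system[message["employee_number"]] = message["department"]


bus = Bus()
bus.subscribe("identity.changed", update_mail)
bus.subscribe("identity.changed", update_badges)
bus.publish(
    "identity.changed",
    {"employee_number": "E100", "email": "ann@example.com", "department": "Engineering"},
)
```

If the publisher renamed employee_number without telling anyone, both subscribers would break, which is the same problem LDAP applications have when they disagree on schema.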

This is true even though each application may only need a slice of that identity data; what has to be agreed on is the overall structure of what is being shared. The Identity Governance Framework (IGF) actually gives you a standard way of defining the attributes present in a message you could accept/publish. It even provides a place for defining which attributes might be used as keys by your particular application, which helps in the previous discussion re: joins.

If we agree to use IGF's CARML representation to define the attributes that are present in or required by an application, and agree on the representation used to encapsulate those attributes, all we need is a standard message bus.

Of course, the question then becomes: who will take the messages off the bus and send updates to legacy applications, and who will take updates from legacy applications and push them onto the bus in the first place?
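To show the shape of such an agreement, here is a small sketch of an application declaring which attributes it requires, which it merely accepts, and which it treats as keys. This is emphatically not CARML syntax; the AttributeDeclaration class and its fields are hypothetical Python stand-ins for the kind of information such a declaration carries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttributeDeclaration:
    required: frozenset  # attributes the application cannot work without
    optional: frozenset  # attributes it will accept if present
    keys: frozenset      # attributes it may use as join keys

    def check(self, message):
        """Return a list of problems; an empty list means the message is acceptable."""
        present = set(message)
        problems = []
        for name in sorted(self.required - present):
            problems.append(f"missing required attribute: {name}")
        for name in sorted(present - self.required - self.optional):
            problems.append(f"undeclared attribute: {name}")
        return problems


# Hypothetical declaration for a purchasing application.
purchasing = AttributeDeclaration(
    required=frozenset({"employee_number", "department"}),
    optional=frozenset({"email"}),
    keys=frozenset({"employee_number"}),
)

if __name__ == "__main__":
    print(purchasing.check({"employee_number": "E100", "department": "Sales"}))
    print(purchasing.check({"employee_number": "E100", "shoe_size": "9"}))
```

With a declaration like this published alongside the application, anyone putting messages on the bus can tell in advance whether a given consumer will be able to use them.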

This is where identity services come into play. Like virtual directory, they're simply moving data from one context to another so that everyone else doesn't need to adapt to the legacy environment and legacy environments don't have to adapt to each other.
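Until a real-life example turns up, here is a very small sketch of what "moving data from one context to another" can mean in practice: translating between the shared message attributes and a legacy application's own column names. The FIELD_MAP contents and the legacy names (EMPNO, FULLNAME, DEPT_CD) are invented for illustration.

```python
# Hypothetical mapping from shared message attributes to legacy column names.
FIELD_MAP = {
    "employee_number": "EMPNO",
    "name": "FULLNAME",
    "department": "DEPT_CD",
}
REVERSE_MAP = {legacy: shared for shared, legacy in FIELD_MAP.items()}


def to_legacy(message):
    """Translate a bus message into a legacy row, dropping unmapped attributes."""
    return {FIELD_MAP[k]: v for k, v in message.items() if k in FIELD_MAP}


def from_legacy(row):
    """Translate a legacy row into a bus message, dropping unmapped columns."""
    return {REVERSE_MAP[k]: v for k, v in row.items() if k in REVERSE_MAP}


if __name__ == "__main__":
    message = {
        "employee_number": "E100",
        "name": "Ann",
        "department": "Engineering",
        "email": "ann@example.com",  # the legacy system has no column for this
    }
    row = to_legacy(message)
    print(row)
    print(from_legacy(row))
```

The legacy application never sees the bus format and the bus never sees the legacy column names; only the adapter in the middle knows both.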

Maybe I could ask my friend and colleague, Phil Hunt, to spare some time to post a quick example of how this looks in real life.


About Identity Bus

This page contains an archive of all entries posted to Clayton Donley's Blog in the Identity Bus category. They are listed from oldest to newest.
