Re: Talking about the Identity Bus
By Clayton on May 10, 2008
Kim Cameron of Microsoft makes a pitch for why Metadirectory is still relevant, or at least why data needs to live in multiple places.
One key element of this argument is that when combining transactional data with identity data, you're not likely to do the required data joining across remote systems.
Compare this to what happens if all the information necessary to respond to a query is present locally in a single database. I just do a "join" across the tables, and the SQL engine understands exactly how to optimize the query so the result involves little computing power and "even less time". Indexes are used and distributions of values well understood: many thousands of really smart people have been working on these optimizations in many companies for the last 40 years.
He's right and this is a really important point. The data used in this example absolutely should live in a repository where it can be locally joined.
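To make the local join concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns, and sample rows are hypothetical, chosen only to show identity data (employees) and transactional data (orders) sitting in the same database, where the SQL engine can use an index on the join key.

```python
import sqlite3

# Hypothetical schema: identity data and transactional data in ONE local database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT, department TEXT);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, employee_id INTEGER, amount REAL);
-- Index on the join key so the engine can optimize the lookup.
CREATE INDEX idx_orders_employee ON orders(employee_id);
INSERT INTO employees VALUES (1, 'Alice', 'Sales'), (2, 'Bob', 'Support');
INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 100.0), (12, 2, 75.0);
""")

def totals_by_department(connection):
    """Join transactional rows to identity rows and total the amounts per department."""
    return connection.execute("""
        SELECT e.department, SUM(o.amount)
        FROM orders AS o
        JOIN employees AS e ON e.employee_id = o.employee_id
        GROUP BY e.department
        ORDER BY e.department
    """).fetchall()

result = totals_by_department(conn)
print(result)
```

Doing the same thing across two remote systems would mean pulling one side over the network and joining it by hand, which is the cost the quoted passage is pointing at.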
How does the data get from point A to point B in this example? Which of these points is the starting point? Does this data actually originate first in point C? Do these repositories have the same representation for the given data elements?
Certain very widely used data is likely to live in multiple systems. Because it changes relatively rarely, any of the usual means of getting it there works without much trouble. Such information might include unique identifiers, names, department, job code, email addresses, and the like.
In Kim's example, a join against an employee number, department, or similar attribute would be quite likely. By the same token, it is highly unlikely that the join would be done against a password, CRM data, or other data of that kind.
The real problem today is that synchronization is so loosely coupled. Replication is different: recovering from failure has become relatively easy, and the mechanism that moves the data knows exactly how to deal with both ends of the connection.
As applications get better at pushing their own changes, rather than depending on provisioning and metadirectory systems to compute deltas against their databases, much of this problem will become far simpler.
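For readers who have not seen it, the delta approach amounts to comparing a previous snapshot of an application's records with the current one. The sketch below is only an illustration under that assumption. The function name compute_delta and the dictionary-based snapshots are hypothetical, not any product's actual mechanism.

```python
def compute_delta(previous, current):
    """Compare two snapshots keyed by identifier and report what changed.

    Both arguments are dicts mapping a unique identifier to a record.
    Returns (added, removed, changed), where changed maps each identifier
    to an (old_record, new_record) pair.
    """
    added = {key: current[key] for key in current.keys() - previous.keys()}
    removed = {key: previous[key] for key in previous.keys() - current.keys()}
    changed = {
        key: (previous[key], current[key])
        for key in previous.keys() & current.keys()
        if previous[key] != current[key]
    }
    return added, removed, changed

# Hypothetical snapshots taken from an application's user table at two points in time.
yesterday = {"e1": {"dept": "Sales"}, "e2": {"dept": "Support"}}
today = {"e1": {"dept": "Marketing"}, "e3": {"dept": "Sales"}}

added, removed, changed = compute_delta(yesterday, today)
print(added, removed, changed)
```

The external system has to read the whole data set every time to find a handful of changes. An application that pushes its own changes makes this comparison unnecessary.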
At that point, instead of the value being in how tightly you can make your connections and move changes, the value is in what you can do with those changes. Can you use those changes to trigger workflow? Can you apply business policy against those changes? Can you centrally audit and do reporting against those changes?
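As a rough sketch of what "doing something with those changes" could look like, the following toy publish/subscribe bus fans each change out to an audit handler, a workflow trigger, and a policy check. Everything here is an assumption made for illustration: the IdentityBus class, the topic name, the shape of the change record, and the example.com email policy.

```python
from collections import defaultdict

class IdentityBus:
    """Toy in-process publish/subscribe bus (hypothetical, for illustration only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, change):
        # Deliver the change to every handler registered for this topic.
        for handler in self._subscribers.get(topic, []):
            handler(change)

audit_log = []
workflow_queue = []
policy_violations = []

def audit(change):
    # Central audit: record every change that crosses the bus.
    audit_log.append(change)

def start_workflow(change):
    # Trigger workflow: a department move queues an access review.
    if change["attribute"] == "department":
        workflow_queue.append(("review_access", change["employee_id"]))

def check_policy(change):
    # Business policy (assumed): email addresses must stay in the corporate domain.
    if change["attribute"] == "email" and not change["new"].endswith("@example.com"):
        policy_violations.append(change)

bus = IdentityBus()
for handler in (audit, start_workflow, check_policy):
    bus.subscribe("identity.changed", handler)

bus.publish("identity.changed",
            {"employee_id": 1, "attribute": "department", "old": "Sales", "new": "Support"})
bus.publish("identity.changed",
            {"employee_id": 2, "attribute": "email", "old": "bob@example.com", "new": "bob@other.test"})
```

The application that publishes the change does not need to know which handlers exist. New reporting or policy consumers can subscribe without touching the source system.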
This higher order value is exactly what customers look for in provisioning.
The identity bus itself will mix common publish/subscribe-style data movement with virtualization, which will provide identity views that minimize the overall amount of data moving through the system.