Identity oracles and derived attributes
By yvonne on Nov 12, 2007
There has been some great discussion of late about identity oracles, the need to present
derived or obfuscated identity attributes, and whether derived attributes offer greater privacy.
Personally, I am not yet convinced of the business model for a consumer-facing identity oracle. Perhaps I'm a cheapskate, but I can't see paying, say, $5 a month to a service to hand out my identity attributes to save me the trouble of typing them in. Chances are, any given website will have some unusual attribute that my identity oracle didn't think of, and in any case I prefer to type my information into each website myself, because I tend to provide different, erroneous info to each site whenever I can get away with it. I guess that makes me a paranoid cheapskate!
I think the attribute provider concept makes a lot more sense when there is an entity that already has information about a person and which would be authoritative for a particular attribute and authorized (not necessarily by the user) to hand that attribute out, possibly in an obfuscated form. A government might be an authoritative source for whether someone is a citizen, and obliged to respond to another government to that effect. An employer might be an authoritative source for whether someone is employed and obligated to respond to a bank loan officer to that effect. A credit agency might be an authoritative source on someone's credit rating, and obligated to report that to a credit card company. The user doesn't get to intercept, alter or halt any of this, because the entities which need the information trust these authoritative sources more than they trust the user. Of course, this kind of provider would offer up only a limited set of attributes.
For either type of provider, be it identity oracle or limited attribute provider, there are operational and support considerations related to derived or obfuscated attributes. For example, it is important to think about how to troubleshoot issues with derived data when something goes wrong. A deployment I was involved in a couple of years ago is a good example.
A particular employee-facing application was provided by an outsourced application service provider. Federation using Liberty protocols was implemented for authentication between the application service provider and the employer. The authorization requirements for the application stated that some pages were open to all employees, and some pages were restricted to employees who were citizens and either above a certain level within the company, or active shareholders. The service provider didn't need to know a particular user's citizenship status or job level. The service provider really just needed a "YES" or "NO" in terms of eligibility for the special content. So it was determined that the equivalent of a "YES" or "NO" would be made known to the service provider instead of the raw data attributes.
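As a rough illustration only (this is not the code from that deployment), the eligibility rule described above could be sketched in Python as follows. The names `RawAttributes`, `is_eligible` and `derived_attribute`, and the cutoff value in `MIN_JOB_LEVEL`, are all assumptions made up for the example; the actual job-level threshold is not given here.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the real "certain level" is not stated in the post.
MIN_JOB_LEVEL = 5


@dataclass
class RawAttributes:
    """Raw facts that stay on the employer's side (illustrative field names)."""
    is_citizen: bool
    job_level: int
    is_active_shareholder: bool


def is_eligible(attrs: RawAttributes) -> bool:
    # Citizen AND (above the level cutoff OR active shareholder).
    return attrs.is_citizen and (
        attrs.job_level > MIN_JOB_LEVEL or attrs.is_active_shareholder
    )


def derived_attribute(attrs: RawAttributes) -> str:
    # Only this "YES"/"NO" would be released to the service provider.
    return "YES" if is_eligible(attrs) else "NO"
```

The point of the sketch is that the service provider only ever sees the string returned by `derived_attribute`, never the three raw fields behind it.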
Unfortunately, the raw data from which eligibility (the "YES" or "NO") was determined came from several different source systems, at least one of which was another outsourced system. Imagine that a user goes to the outsourced application, doesn't get the "special" page which requires eligibility, and calls support to complain because they feel they've been erroneously prevented from viewing the page. To troubleshoot the problem, the support desk folks would need to a) know where the derived "YES" or "NO" came from, and then b) have the ability to go out to the individual source systems to see which one caused the erroneous "NO" to be derived. It's not feasible to expect either of them.
In this case, we collected the salient facts from the different source systems (each of which, of course, was updated on a different schedule) and stored the raw data attributes in a holding pen "behind" our identity service, to facilitate troubleshooting. Not a great solution perhaps, but better than nothing.
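To give a feel for what such a holding pen buys the support desk, here is a minimal Python sketch, assuming each held attribute is stored with the name of its source system and the time it was last refreshed. The class `HeldAttribute`, the function `explain_no`, the attribute keys and the source-system labels are hypothetical, not what we actually built.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List


@dataclass
class HeldAttribute:
    """One raw value in the holding pen, with where and when it came from."""
    name: str
    value: object
    source_system: str       # hypothetical label for the feeding system
    last_updated: datetime   # each source refreshes on its own schedule


def explain_no(held: Dict[str, HeldAttribute], min_level: int) -> List[str]:
    """List the held attributes that could account for a derived "NO"."""
    citizen = held["is_citizen"]
    level = held["job_level"]
    shareholder = held["is_active_shareholder"]

    suspects = []
    if not citizen.value:
        suspects.append(citizen)
    if not (level.value > min_level or shareholder.value):
        suspects.extend([level, shareholder])

    return [
        f"{a.name}={a.value!r} from {a.source_system}, "
        f"updated {a.last_updated:%Y-%m-%d}"
        for a in suspects
    ]
```

With something like this, a support person with no access to the source systems can at least see which raw value produced the "NO", which system supplied it, and how stale it is.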
Now, imagine that Application Service Provider (ASP) #1 needs a derived value based on source data items x, y, z. ASP #2 needs a derived value based on source data items a, b, c, from other systems. ASP #3 needs a derived value based on source data items x, a, g, m. Keeping all these source data items in order to provide a derived data field to each ASP would start to be a bit of a maintenance headache. I'm not suggesting that the raw attributes should be handed out. On the contrary, I think derived or obfuscated attributes are better, but they may be more work than imagined, when you consider how to support and troubleshoot issues with such systems.
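To make the bookkeeping in the three-ASP example concrete, here is a small Python sketch. The item names are the placeholder letters from the paragraph above; the mapping `ASP_SOURCE_ITEMS` and the two helper functions are hypothetical and only meant to show how quickly the set of held items grows.

```python
# Which raw source items each ASP's derived value depends on
# (letters taken from the example in the text).
ASP_SOURCE_ITEMS = {
    "ASP#1": {"x", "y", "z"},
    "ASP#2": {"a", "b", "c"},
    "ASP#3": {"x", "a", "g", "m"},
}


def items_to_hold(needs):
    """Every raw item that must be collected and kept current."""
    return set().union(*needs.values())


def asps_affected_by(item, needs):
    """ASPs whose derived value may be wrong if this one item is stale."""
    return sorted(asp for asp, items in needs.items() if item in items)


print(len(items_to_hold(ASP_SOURCE_ITEMS)))   # 8 distinct items for 3 ASPs
print(asps_affected_by("x", ASP_SOURCE_ITEMS))  # ['ASP#1', 'ASP#3']
```

Even in this toy case, three derived fields require eight raw items to be held, and a single stale item can affect more than one ASP.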
If the raw data is simple (just a number for age) and stored in the identity service already, and your derived data is simply a quantitative comparison (e.g. is the age number > 21) then things aren't so complicated. However, if the source data comes from multiple places and the derivation requires some logic, then you need to think about how support people, with limited access, could start to troubleshoot things when something goes