Ten Requirements for Achieving Collaboration #3: Usage & Context Patterns
By billy.cripe on Aug 20, 2009
We are in the midst of a series investigating intentional and accidental collaboration:
INTENTIONAL: where we get together to achieve a goal and
ACCIDENTAL: where you interact with something of mine and I am never aware of your interaction
This week we continue the series by investigating requirement #3, which continues the aggregation theme: Usage and context patterns must be able to be automatically created, extracted, enhanced and preserved. It bears repeating that the reason we're spending so much time on aggregation is that it is in the aggregate that patterns and meta-patterns emerge, providing real intelligence that simply cannot be seen when looking at a single object. It is the difference between spotting "striking similarities" in two sets of DNA, then pulling back and seeing that one belongs to a person and the other to a cabbage.
While it should be obvious now that data trapped in content items is ripe for the picking and clustering, there is still more information that should neither be missed nor dismissed. This is the transactional data, the usage data, the patterns of purpose that help us understand where and how and why some items are consumed and others are not. To be sure, traditional business intelligence and data warehousing have been working in this area for quite some time. Whether we are talking about web site analytics or OLAP cube reporting, the analysis of transactional information is mature. That is good for us. I wrote last year in BEye Network magazine that the traditional Business Intelligence focus on transactional and data warehouse data limited it to the "digital dark matter" of the information universe. But that does not mean that we cannot take the sophistication of BI tools and approaches and apply them to unstructured information.
The low-hanging fruit for this kind of application is the transactional information that is related to unstructured information: access reports, search hits, download tallies, re-use information. When we can track who has looked at a document, when they looked at it, and what else they were looking at in the same time period, a picture emerges. We start to understand, by inference, what they wanted. Furthermore, if we look at their user profile, compare it to similar profiles of other users, and then look at what those users accessed in the same time slice, or what they accessed when similar results appeared in their search history, we can begin to create a graph of potentially related items. Providing these items back to the original user has the effect of predictively promoting potentially persuasive information.
If we take the thought experiment a step further, we can see how I can generate a graph of my interests over time. Because this is in an enterprise setting, my interests correlate to my project and task activities. I may be tasked with putting together a sales kit about a new product. My search history, content access reports and IP address in corporate web site tracking engines will all indicate a cluster of activity centered around a general topic (e.g. Portals), a product suite (e.g. WebCenter), and market terms (e.g. "Enterprise 2.0", "Mashups", "Composite Applications"). Taken together and graphed against a baseline of all other activity for a time slice, a series of humps and spikes emerges. The spikes show where I went deep into a topic or subject area by accessing a lot of information that fits the same general category (how do we know it fits the same category...? Now you should be seeing how this week's topic fits in with the previous topics...). The humps in the graph represent where I started to access information that clustered around certain topics, but I never went deep into those areas.
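To make the humps-and-spikes idea concrete, here is a minimal sketch in Python. It assumes each access has already been tagged with a topic (the work of the earlier requirements in this series) and that we have a baseline of how everyone else's activity is spread across topics. The function name `classify_interest`, the `deep_count` threshold and the sample data are all hypothetical, chosen only for illustration.

```python
from collections import Counter


def classify_interest(accesses, baseline_share, deep_count=5):
    """Label each topic in one user's access log as 'spike', 'hump' or 'flat'.

    accesses       -- list of topic labels, one per item the user accessed
    baseline_share -- dict mapping topic -> fraction of everyone else's activity
    deep_count     -- hypothetical cut-off for "went deep" (an assumption)
    """
    counts = Counter(accesses)
    total = sum(counts.values())
    labels = {}
    for topic, n in counts.items():
        my_share = n / total
        # Only topics that stand out against the baseline count as interest.
        if my_share <= baseline_share.get(topic, 0.0):
            labels[topic] = "flat"
        elif n >= deep_count:
            labels[topic] = "spike"   # above baseline and many accesses
        else:
            labels[topic] = "hump"    # above baseline but only a few accesses
    return labels


# Illustrative data only: ten accesses in one time slice.
my_log = ["Portals"] * 6 + ["Mashups"] * 2 + ["HR policy"] * 2
everyone = {"Portals": 0.10, "Mashups": 0.05, "HR policy": 0.40}
labels = classify_interest(my_log, everyone)
```

With this sample data, "Portals" comes out as a spike, "Mashups" as a hump and "HR policy" as flat, since the last is accessed less than the baseline would predict. A real system would pick its thresholds from the data rather than hard-code them.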
The goal would be to combine my topical clusters with the access clusters of other people who are similar to me in interest, access history, defined user profile or even job role. Predictively promoting persuasive content items that I had not yet accessed (presumably because I was not even aware of them) helps to drive those interest humps past the "tipping point" and turn them into interest spikes.
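One simple way to picture this promotion step is sketched below. It is only an illustration under stated assumptions: similarity between people is measured purely by overlap in what they have accessed (Jaccard similarity), not by profile or job role, and the names `jaccard`, `recommend` and the sample history are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of accessed items, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(me, history, top_n=3):
    """Suggest items I have not accessed, weighted by how similar
    the people who did access them are to me.

    history -- dict mapping user name -> set of accessed item ids
    """
    mine = history[me]
    scores = {}
    for user, items in history.items():
        if user == me:
            continue
        sim = jaccard(mine, items)
        if sim == 0:
            continue  # no shared accesses, so no evidence of shared interest
        for item in items - mine:
            scores[item] = scores.get(item, 0.0) + sim
    # Highest score first; ties broken by item id for a stable order.
    return sorted(scores, key=lambda i: (-scores[i], i))[:top_n]


# Illustrative data only.
history = {
    "me": {"portal-overview", "webcenter-datasheet"},
    "ann": {"portal-overview", "webcenter-datasheet", "mashup-whitepaper"},
    "raj": {"portal-overview", "e20-case-study"},
    "lee": {"expense-policy"},
}
suggestions = recommend("me", history)
```

Here the whitepaper ranks ahead of the case study because the colleague who read it overlaps more with my own accesses, and the expense policy is never suggested because its reader shares nothing with me.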
The applications for scenarios such as this in the online retail space are profound. But think of the enterprise space. Employees spend a huge amount of time simply looking for information. That wasted time, and the productivity and efficiency lost with it, accounts for over 50% of staff costs according to a Butler Group study. The enterprise search companies have made a lot of noise about this, but they represent only part of the answer. Search is how you may get there, but that is of little value if you don't know where "there" is or you want to know the "best" way to get there rather than the 200 million different ways you could get there.
What remains true is that knowledge workers want to have the best data available to them in conveniently consumable content items so that they can take that information and apply it as knowledge to solve the problems and tasks they have selected, been assigned or are responsible for.
The need to track, map and graph usage patterns reveals some important considerations when it comes to information technology itself. For the tracking to be successful, the accesses, searches, downloads and transactions must be granular enough to be meaningful. It is no good if an access report cannot distinguish between a search hit, a view and a download. This is where the importance and value of an Enterprise Content Management (ECM) system becomes evident. Even if the content stores are spread throughout the organization and contain heterogeneous information (documents, web pages, images, compound documents, XML, videos, etc.), an ECM system based on a service-oriented architecture (SOA) is imperative. The SOA layer must be where the services are the true transaction brokers, rather than an abstraction or translation layer on top of hidden core processing. Only then will usage patterns emerge that are granular enough to provide the relevant intelligence that can be mapped as an extra dimension on top of previously extracted data and topic clusters.
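As a rough illustration of what "granular enough" means, the sketch below records each usage event with an explicit kind so that a report can tell a search hit from a view from a download. The `EventKind` and `UsageEvent` names, their fields and the sample events are assumptions made for this example, not the schema of any particular ECM product.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from enum import Enum


class EventKind(Enum):
    SEARCH_HIT = "search_hit"   # item appeared in a result list
    VIEW = "view"               # item was opened
    DOWNLOAD = "download"       # item was taken away for re-use


@dataclass(frozen=True)
class UsageEvent:
    user: str
    item: str
    kind: EventKind
    timestamp: float  # seconds since the epoch


def tally_by_kind(events):
    """Build a per-item report that keeps the event kinds separate."""
    report = defaultdict(Counter)
    for event in events:
        report[event.item][event.kind] += 1
    return report


# Illustrative data only.
events = [
    UsageEvent("billy", "sales-kit.doc", EventKind.SEARCH_HIT, 1000.0),
    UsageEvent("billy", "sales-kit.doc", EventKind.VIEW, 1005.0),
    UsageEvent("ann", "sales-kit.doc", EventKind.VIEW, 1010.0),
    UsageEvent("ann", "sales-kit.doc", EventKind.DOWNLOAD, 1020.0),
]
report = tally_by_kind(events)
```

A report that only stored "4 accesses" for the sales kit would hide the fact that it was found once, opened twice and downloaded once, which is exactly the detail the usage graph needs.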
Remember too that the gathering and graphing of topic and usage clusters is the fundamental starting point for accidental collaboration. In order to facilitate the sharing of corporate wisdom across barriers that prevent normal intentional and social collaboration, we must have a solid foundation of data describing what the thing is as well as how the thing is used. When we graph those against our intent, we start to create a wisdom ecosystem where intentional as well as accidental collaboration are the norm.
Next week we will continue the series investigating requirement #4 where the human element is brought back in: Human revision, annotations and classifications of information must be enabled and preserved.