Ten Requirements for Achieving Collaboration #7: Tracking the Change and Evolution of Information

We are in the midst of a series investigating collaboration. We previously wrote about the two types of collaboration, intentional and accidental:
INTENTIONAL: where we get together to achieve a goal, and
ACCIDENTAL: where you interact with something of mine and I am never aware of your interaction.
While intentional collaboration is good, it is not where the bulk of untapped collaborative potential lies. Accidental collaboration is. The challenge is to intentionally facilitate accidental collaboration. For the full list of 10 requirements see the original post. Last time I wrote about requirement #6: Data Accessibility for People and Computers. This time we will talk about the importance of keeping track of how information changes over time.

No information system is static. Information is continuously being added, removed and changed. Even records and governance systems that provide "immutable storage" for information assets are not static when considered from the system view. Accessing the system changes it. If nothing else, a new access record is logged. In many ways such feedback-intensive systems are intrinsic to the human experience. It is no wonder that these cybernetic characteristics penetrate our information systems. But we still need to take advantage of them.
Consider the simple access log example. We do not merely access the system, we access some information in the system. When we track which item was accessed and graph those accesses over time, we change the information in the context of the system. While the binary information object itself may not be altered, the pattern of access over time yields valuable information. It lets us know whether the item is popular or unpopular, important or unimportant.
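
To make the access log example concrete, here is a minimal Python sketch that rolls raw access records up into per-document weekly counts. The log format, the user and document names, and the dates (including the year) are all hypothetical; a real system would read these from its own audit tables.

```python
from collections import Counter
from datetime import date

# Hypothetical access log: (user, document, date of access).
# Names, documents and dates are invented for illustration.
ACCESS_LOG = [
    ("billy", "doc-22", date(2010, 1, 4)),
    ("billy", "doc-22", date(2010, 1, 5)),
    ("sue", "doc-22", date(2010, 1, 6)),
    ("sue", "doc-7", date(2010, 1, 6)),
    ("billy", "doc-22", date(2010, 1, 12)),
]


def weekly_access_counts(log):
    """Count accesses per (document, ISO year, ISO week)."""
    counts = Counter()
    for _user, doc, day in log:
        iso = day.isocalendar()  # (ISO year, ISO week, ISO weekday)
        counts[(doc, iso[0], iso[1])] += 1
    return counts


counts = weekly_access_counts(ACCESS_LOG)
for (doc, year, week), n in sorted(counts.items()):
    print(f"{doc} {year}-W{week:02d}: {n} accesses")
```

The document itself never changes here; the weekly counts are new information that exists only because the accesses were tracked.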

The more robust the tracking, the more information emerges from the feedback pattern of access. If we also track who accessed an item, that information can be mashed up with information from the business identity management system or from other web logs. The emerging pattern of access lets us infer not only first degree conclusions like "Document 22 was important to Billy during the week of January 4" but also secondary inferences. Secondary inferences identify likely patterns: "Document 22 is likely important to other people in Billy's role" and "Document 22 is likely important to other people who have a browsing history similar to Billy's." When such information is tracked and put into a larger context pattern, new intelligence emerges. Acting on that intelligence keeps organizations agile and proactive rather than reactive.

If the record of who accessed what is mashed up not only with identity management systems but also with other corporate information systems, then tertiary and higher order inferences emerge. Examples include patterns such as, "Document 22 is likely materially significant because it was accessed more frequently by people like Billy during the week of January 4, which was a closed communications period for our company".

Delivering this information spurs accidental collaboration. The graph patterns that emerge from the assembly of tracked and logged accesses indicate a sharing of information of which the individual participants were unaware. Predictive delivery systems can use those patterns to deliver that content to others who may be interested, based on their browsing behavior or organizational roles.
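
As a rough illustration of a secondary inference, the following Python sketch joins an access log with role data and suggests documents that a user's role peers have opened but the user has not. The role table, the names and the `suggest_by_role` helper are assumptions made up for this example, not part of any particular ECM or identity management product.

```python
from collections import defaultdict

# Hypothetical identity data (user -> role) and access log (user, document).
ROLES = {"billy": "analyst", "sue": "analyst", "raj": "engineer"}
ACCESSES = [("billy", "doc-22"), ("billy", "doc-22"), ("raj", "doc-9")]


def suggest_by_role(roles, accesses):
    """Suggest documents opened by a user's role peers but not by the user."""
    docs_by_role = defaultdict(set)
    seen_by_user = defaultdict(set)
    for user, doc in accesses:
        seen_by_user[user].add(doc)
        docs_by_role[roles[user]].add(doc)

    suggestions = {}
    for user, role in roles.items():
        unseen = docs_by_role[role] - seen_by_user[user]
        if unseen:
            suggestions[user] = sorted(unseen)
    return suggestions


suggestions = suggest_by_role(ROLES, ACCESSES)
print(suggestions)
```

With this sample data, Sue shares Billy's role and has never opened Document 22, so it is the one suggestion produced. A real predictive delivery system would weight such suggestions statistically rather than treat a single access as a signal.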

At each level of inference the patterns introduce more degrees of probability and fewer degrees of certainty. However, they are strengthened by the statistical correlations that emerge from tracking and then aggregating many disparate points. A single, seemingly insignificant log entry becomes an informant that lets us aggregate many other seemingly insignificant points and reveal a very significant pattern. This is nothing short of making an unstructured information hypercube designed to yield actionable business intelligence.
The key here is the term "actionable". After all, we are after accidental collaboration (that is, collaboration across time despite participants not necessarily being aware of each other), not merely more information. The first key is to realize that actionable information lives not just in documents but also in document management systems, in the interactions of users with those systems.

The second key is to understand the ways in which documents themselves change over time. Most full featured ECM suites like Oracle ECM have the ability to store full versions of content items. Many times, though, these revisions are simply archived. While they are readily accessible, little thought is given to mining them for information. The assumption is that the latest and greatest versions are the ones with the most value. Anything of worth from older revisions would have made it into the newest versions, or so the thinking goes. And this thinking is not without merit. But what often gets overlooked is the way in which documents or web pages or images change over time. Graphing out changes from version 1 to version 5 over time can reveal extremely interesting information. It can show us a maturation in our understanding of a topic, in our communication approaches or in the emphasis that is important in a sequence. Google Wave has an interesting "playback" feature that replays the course of a wave (which, for the uninitiated, is similar to a discussion thread where all kinds of assets like documents, chats, emails, images etc. can be threaded together into a single "wave"). This is a very useful feature because it lets the participants visualize the evolution of a discussion.
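
One simple way to start graphing how a document changes is to measure how much each revision differs from the one before it. The sketch below uses Python's standard `difflib` module on word lists; the three revisions are invented sample text, and the word-level similarity ratio is only one of many possible change measures.

```python
from difflib import SequenceMatcher

# Hypothetical revision history of a single content item, oldest first.
REVISIONS = [
    "Collaboration requires meetings.",
    "Collaboration requires meetings and shared documents.",
    "Collaboration emerges from shared documents and usage patterns.",
]


def change_between_revisions(revisions):
    """Return a change score (1 - similarity) for each consecutive pair."""
    changes = []
    for old, new in zip(revisions, revisions[1:]):
        # Compare word sequences; ratio() is 1.0 when they are identical.
        ratio = SequenceMatcher(None, old.split(), new.split()).ratio()
        changes.append(round(1.0 - ratio, 3))
    return changes


changes = change_between_revisions(REVISIONS)
for number, score in enumerate(changes, start=2):
    print(f"revision {number - 1} -> {number}: change score {score}")
```

Plotting these scores against revision dates would show when a document was merely polished and when its emphasis shifted substantially.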

Similarly, being able to graph the trends of metadata evolution can be very useful. Content metadata is becoming more and more of a driving force behind how content is used in applications, on web sites and for records management purposes. Understanding how those metadata attributes change over time, and which kinds of roles change them, reveals interaction patterns, awareness trends and usage trends in much the same way that the logging analysis described above does. The critical difference with metadata change awareness is that the intentionality of the changes is explicit. This means that we can confidently assert that someone intended to make a change, as opposed to having accidentally stumbled upon a page.

Folksonomies, tagging and tag clouds are also very dynamic change environments where an understanding of that change is incredibly useful but often overlooked. Watching how tag clouds grow and shrink and change over time can be a very revealing exercise. I would love to see something akin to a time-lapse animation of a popular tag cloud like that of Flickr or Digg. While most terms will stay the same for very stable content sets, terms that "pop" in and out of existence like so many sub-atomic particles in a bubble chamber show trends and ideas that are in the popular consciousness at the time. While real-time search is a popular topic of discussion today, with the faceting and tagging capabilities of Twitter, seeing what was popular in last week's real time is very valuable as it will influence and often predict what will be popular this week. For businesses, this is key market and customer intelligence that is available but still largely resting under the surface of their information environment.
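
A time-lapse of a tag cloud boils down to comparing tag counts between snapshots. Here is a small Python sketch that compares two weekly snapshots and reports which tags appeared, which vanished and how each count moved. The tag lists and the `tag_shifts` helper are hypothetical and stand in for whatever tagging data a site actually exposes.

```python
from collections import Counter

# Hypothetical tag snapshots: one entry per tag applied during the week.
LAST_WEEK = ["python", "ecm", "wave", "wave", "ecm"]
THIS_WEEK = ["python", "ecm", "realtime", "realtime", "ecm"]


def tag_shifts(previous_tags, current_tags):
    """Return (appeared, vanished, delta) between two tag snapshots."""
    prev, curr = Counter(previous_tags), Counter(current_tags)
    appeared = sorted(set(curr) - set(prev))  # tags that popped into existence
    vanished = sorted(set(prev) - set(curr))  # tags that popped out
    # Counter returns 0 for missing tags, so the subtraction is safe.
    delta = {tag: curr[tag] - prev[tag] for tag in set(prev) | set(curr)}
    return appeared, vanished, delta


appeared, vanished, delta = tag_shifts(LAST_WEEK, THIS_WEEK)
print("appeared:", appeared)
print("vanished:", vanished)
print("delta:", delta)
```

Run over many consecutive weeks, the same comparison yields the frames of the time-lapse: stable terms show a delta near zero while trending terms stand out.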

I suspect that it will take several big wins from visionary organizations that simply obliterate their competition in pattern analysis and trend spotting before such capabilities go fully mainstream. Still, there is hope. Both the 2009 Forrester Wave for ECM Suites (PDF) and the 2009 Gartner Magic Quadrant for ECM call out business intelligence and analytics capabilities as key factors for consideration. However, even here thinking tends to stay in the box, with BI and analytics meaning large data warehouses and web logging instead of text analytics, entity extraction, pattern analysis and information evolution awareness.

Next time we will continue the series by investigating requirement #8, on the changing patterns of the *relationships* between data to information artifact, information artifact to context and context to behavior. After that come #9, on understanding and leveraging information and data creation patterns, and finally #10, on how all of the above must be made available back to the end users, be they people or computers, in context-sensitive and persuasive ways so that, ultimately, intentional and accidental collaboration are achieved in the organization.


Your article describes a profound change in our understanding of how enterprise content management systems work. For many years, ECM systems have been mere storage and retrieval systems. The data buried in access logs was a side effect of system operations. It was an audit trail used to troubleshoot system problems or security breaches. It was often stored in an offline fashion and accessible only to system administrators. But if we want actionable business intelligence, the usage data stored in these logs becomes vital to the operation of the system. It creates feedback loops that bubble important information to the surface. It also provides insight as to which features of the tools are used and whether the system is delivering ROI. It has huge implications for the way these systems are designed and implemented.

Posted by Dean Thrasher on November 25, 2009 at 12:52 AM CST


Enterprise 2.0 and Content Management

