Ten Requirements for Achieving Collaboration #2: Automatic Aggregation

We are in the midst of a series investigating collaboration. I previously wrote about the two types of collaboration - intentional and accidental.
INTENTIONAL: where we get together to achieve a goal and
ACCIDENTAL: where you interact with something of mine and I am never aware of your interaction

While intentional collaboration is good, it is not where the bulk of untapped collaborative potential lies. Accidental collaboration is. The challenge is to *intentionally* facilitate accidental collaboration. For the full list of 10 requirements, see the original post. Last week I wrote about requirement #1: human-oriented information AND machine-oriented information must be *able to be* seamlessly combined.

This week we move on to Automatic Aggregation: topic, knowledge and concept clusters must be able to be automatically created, extracted, enhanced and preserved in order to ultimately achieve accidental collaboration. The importance of aggregation is not a new idea. In his book, The Wisdom of Crowds, James Surowiecki notes that if you have diverse information, independent actors/creators and decentralization, and you can bring that information together in an aggregation, then you can get astoundingly good decisions/performance/efficiencies/wisdom.

Keep that thought in the back of your mind for a moment.

AIIM found in their recent Collaboration and Enterprise 2.0 report that 71% of respondents thought it easier to find information on the web than in internal systems. While this is a damning indictment of the state of enterprise IT and information management, it more importantly shows that the public web model of determining relevancy works at least in part. This model is largely a combination of non-directed democratic relevancy (weighting inbound links and such), paid weighting and preference targeting.

Aggregations, Groupings, Mobs and Power

Now think about the aggregation again. Most information aggregation efforts are intentionally based. They require some person to do something that, while it may be easy, is not something they would normally do and that they have little to no self-interest in actually doing. Have you ever given a rating to a product, blog post or discussion thread posting? Typically you will click 1-5 stars to indicate how "good" the thing is. Why do you bother? Altruism? Maybe. What about tag clouds? They are incredibly useful ways to aggregate social classifications. I have written about their power extensively in my book, Reshaping Your Business With Web 2.0. And yet what is it that causes people to tag and classify an item? Within business it might be a job requirement, or it might help the individual find information the next time they want it. But what about on the public web? There is no central authority forcing folks to rate items or tag them. The result is that those with a combination of self-interest, employer expectation, and a sense of "good web citizenship" tend to participate, whilst those without do not (or participate less).

The Question
So then, if aggregate classifications, signifiers of business wisdom, organizational knowledge and concept clusters are so useful (they are!) for getting the right (but previously unknown) information to the right (but previously unknown) people, how can organizations produce those aggregates in a consistent way? Relying on a sense of "good web citizenship" will not get very far methinks. Creating rules and requirements to participate flies in the face of the web culture and would likely yield poor and polluted results.

The Answer

The best answer is to automate the aggregations where possible and continue to tap and encourage the voluntary and convivial participation that exists as part of the Web 2.0 culture. That may sound all fine and dandy but it quickly leads to another challenge - how to automatically create the kinds of aggregations that humans create through their participation? And it is here that we get to the thread that binds the posts in this series together.
Last week I wrote:
This is where new generation text analytics, text mining, sentiment analysis and semantic analysis comes in. These are programs and algorithms that are able to get into the information container and programmatically extract, parse, index and store the information *within* the container not just [the human created metadata] next to it.
Text analytics and categorization engines are one important step along the way to our goal of combining the power of intentional with accidental collaboration. Oracle ECM has had lightweight versions of these features for years. With ECM you have a decentralized set of diverse information created by independent actors. What is missing is the automatic aggregation part. This is why enterprise content management is such a ripe system and discipline for information and collaboration evolution.

Newer systems now under development and in beta testing are light-years ahead, using ontology-assisted matching, heuristics-based learning and corpus-driven extraction techniques. The items extracted are raw data, but when that data is linked, patterns, clusters and classifications emerge. The interesting part is that this is based on what is inside the content item (the information in the container) rather than on what people say about what is inside the container.
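
To make the idea of "linking raw extracted data until clusters emerge" a little more concrete, here is a minimal sketch in Python. It is not how any particular product works: the sample documents, the stopword list, the function names and the 0.2 similarity threshold are all assumptions made for illustration, and plain word overlap stands in for the far richer ontology and corpus techniques mentioned above.

```python
import re
from collections import defaultdict
from itertools import combinations

# Hypothetical stopword list; a real system would use a much larger one.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "from"}

# Hypothetical content items: name -> text found *inside* the container.
DOCS = {
    "memo_a": "Quarterly pricing review for the widget product line",
    "memo_b": "Widget pricing changes and product discounts",
    "memo_c": "Office parking permits and building access",
}

def extract_terms(text):
    """Pull candidate terms out of the content itself (no human metadata)."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if len(w) > 2 and w not in STOPWORDS}

def jaccard(a, b):
    """Share of terms two items have in common (0.0 to 1.0)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster(docs, threshold=0.2):
    """Link items whose term overlap meets the threshold, then group them."""
    terms = {name: extract_terms(text) for name, text in docs.items()}
    parent = {name: name for name in docs}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(docs, 2):
        if jaccard(terms[a], terms[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = defaultdict(set)
    for name in docs:
        groups[find(name)].add(name)
    return sorted(sorted(g) for g in groups.values())

if __name__ == "__main__":
    print(cluster(DOCS))
```

Nobody tagged or rated anything here. The two pricing memos end up in the same cluster purely because of what is inside them, while the parking memo stays on its own.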

It is the difference between chemical analysis of the liquid in a container and asking 100 different people what is in the container. It is not that the people will "get it wrong" but rather that when you combine the judgment of the people with the chemical analysis you get a much more accurate description of what is actually in the container. Where 50 people might say "water", 20 might say "liquid", 10 might say "dirty pond water", 10 might say "something slippery", 9 might say "nothing" and 1 might say "milk", the chemical analysis would describe it as H2O + various organic compounds.
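
As a rough illustration of combining the two views, the sketch below tallies the 100 answers from the container example and sets them beside the analysis result. The function name, the 10% cut-off and the shape of the output are assumptions of mine, not a description of any real system.

```python
from collections import Counter

# What 100 people said is in the container (figures from the example above).
human_labels = Counter({
    "water": 50,
    "liquid": 20,
    "dirty pond water": 10,
    "something slippery": 10,
    "nothing": 9,
    "milk": 1,
})

# What the "chemical analysis" (content-based extraction) reported.
machine_terms = ["H2O", "organic compounds"]

def describe(labels, measured, min_share=0.1):
    """Combine measured terms with crowd labels given by at least min_share of people."""
    total = sum(labels.values())
    crowd = [label for label, n in labels.most_common() if n / total >= min_share]
    return {"measured": list(measured), "crowd": crowd}

if __name__ == "__main__":
    print(describe(human_labels, machine_terms))
```

The outlying answers ("nothing", "milk") drop away, and what remains is a description that carries both what was measured and how people actually talk about it.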

The Benefits
Furthermore, research has shown that, just like objects at rest tend to stay at rest and those in motion tend to stay in motion, participation is often more difficult to start than it is to continue. With aggregates it is the same. It takes more work for me to step out of my normal daily business flow and create a linkage between two previously unrelated items than it does for me to modify/enhance/comment on a linkage that someone else has already started. Think of discussion forums. Compare the number of discussions you have participated in (even if it is just reading them) vs. the number of new discussions that you have started. The first number is always bigger - massively so. If we apply this finding to the area of spurring collaboration, the automatic generation of topic, knowledge and concept clusters yields foundations of organic analysis and clustering that are then ripe for enhancement by you and me.

In this way the automatic generation of knowledge aggregates using textual analytics, latent semantic indexing, and ontology-assisted extraction techniques relieves us of the burden of stepping out of our daily workflow (a truly disruptive productivity killer!) and creating those initial aggregates ourselves. The automatically generated clusters tease us with just enough information to allow our supremely powerful brains to quickly evaluate whether the items represented by a particular aggregate are important to us. If they are important, great: we have found new information that we were not originally aware of and are accidentally collaborating with those who toiled over those information artifacts. Additionally, we are as free as we always are to enhance, extend, engage and participate with the clusters that have already been started for us. They become springboards to further participation and accidental collaboration, and that is the goal.

Next week we will continue the series by investigating requirement #3, which is also related to automatic aggregation: usage and context patterns must be able to be automatically created, extracted, enhanced and preserved.

