Ten Requirements for Achieving Collaboration #1: Seamlessly combine human oriented and machine oriented information.
By billy.cripe on Aug 05, 2009
You will recall that I previously wrote about the two types of collaboration -
INTENTIONAL: where we get together to achieve a goal and
ACCIDENTAL: where you interact with something of mine and I am never aware of your interaction.
While intentional collaboration is good it is not where the bulk of untapped collaborative potential lies. Accidental collaboration is. But the challenge is to *intentionally* facilitate accidental collaboration. For the full list of 10 requirements see the previous post.
Requirement #1 is that human oriented information (e.g. documents, images) AND machine oriented information (e.g. indexes, transactions) must be *able to be* seamlessly combined. This first requirement is a bit techy but follow along because it is absolutely foundational to achieving accidental collaboration.
Human oriented information is contained in documents, images, web pages, videos and such. The information container is designed for consumption by humans - we read, we watch, we listen, we look and then our brains do the processing. But before we can read, watch, listen or look we have to locate. Sometimes we might know exactly where to find some information, but more likely we are searching either by querying or drilling or browsing. In order to be successful in our location efforts the computer must do a better job of interpreting our desires and delivering relevant information.
This is where machine oriented information comes in. It turns out that computers have a difficult time understanding information in human oriented containers. Looking at a picture such as Van Gogh's Sunflowers you and I can immediately make some basic claims. "It's about Sunflowers" we might say. A computer has a much more difficult time making such simple claims because it first has to crack the container and parse the information. So we humans resort to describing the information with meta data, tags, ratings and other structured information that is easier for a computer to access (it is in a database not a document) and parse (values are typed, indexable, traversable and code-ready). Add in other kinds of meta data such as web tracking metrics, inbound link data and usage analytics and the computer gets a fairly rich set of data about the content item.
It might seem that the machine has a good set of data about the content. But upon closer examination it is obvious that the computer only knows about the information to the extent that a human intentionally describes it, either by writing additional meta data, applying tags or linking and accessing the information. The machine still has not understood, on its own, what the information is about. This is where new generation text analytics, text mining, sentiment analysis and semantic analysis come in. These are programs and algorithms that are able to get into the information container and programmatically extract, parse, index and store the information *within* the container, not just next to it. The extracted information can then be stored in RDF format in the Oracle Spatial Database. The machine understandable information also contains concept and relationship information that relates terms like "A319" to "Airbus" and "Chest Pain" to "Heart Attack" just like our minds do. Sets of relationships are described in OWL, are also stored in the database, and are used to make basic inferences about related items.
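For the techy among you, here is a rough sketch of what "concept and relationship information" can look like once it is out of the container. It is plain Python, not RDF, OWL or any Oracle API: the `TRIPLES` list, the `relatedTo` predicate, the `related_concepts` function and the "Aircraft Manufacturer" fact are all made up for illustration. The only point is that facts stored as (subject, predicate, object) triples let a program follow relationships it was never told about directly.

```python
from collections import defaultdict

# Each fact is a (subject, predicate, object) triple, the same shape RDF uses.
# The first and third facts come from the examples in this post; the second
# is an invented extra fact so there is something to infer.
TRIPLES = [
    ("A319", "relatedTo", "Airbus"),
    ("Airbus", "relatedTo", "Aircraft Manufacturer"),
    ("Chest Pain", "relatedTo", "Heart Attack"),
]


def related_concepts(term, triples, predicate="relatedTo"):
    """Follow `predicate` links transitively and return every concept reached."""
    edges = defaultdict(set)
    for subject, pred, obj in triples:
        if pred == predicate:
            edges[subject].add(obj)

    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for neighbour in edges[current]:
            if neighbour not in found:
                found.add(neighbour)
                stack.append(neighbour)
    return found


print(sorted(related_concepts("A319", TRIPLES)))
# ['Airbus', 'Aircraft Manufacturer']
```

Nobody stored a fact linking "A319" to "Aircraft Manufacturer"; the program reached it by chaining two facts together. A real inference engine working from an OWL ontology does far more than this, but the basic idea of deriving new relationships from stored ones is the same.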
All this is helping to give the machine access to the information within a document, image, video, or web page container.
With such information access, parsing and indexing capabilities the computer is able to start combining its information with the human oriented information we are *really* interested in. After all, we are not interested in even a search index no matter how robust it may be. We are interested in the computer helping us locate and then consume and use relevant information - to us. But the next step to doing that is helping the computer to better understand what we are after. By only giving the machine meaningful access to meta data and not the actual data within the container we limit its ability to help us find what we are after.
To be certain, I am not talking about throwing out or supplanting the existing data we give the computer today. I am talking about evolving and expanding that data set by giving the computer access to the data within the containers and allowing it to parse and relate those data bits to concepts, documents, web pages, images and videos that we might not be aware of. This relational activity is the joining of the human and machine oriented information. The machine can weed out extraneous information (I'm interested in the care of a rubber plant in my house, not the factory down the street!) Only then do we have an odds-on chance of locating relevant information so that we can reuse it.
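To make the rubber plant example concrete, here is a small sketch of what joining the two kinds of information might look like. Everything in it is hypothetical: the `ContentItem` class, the sample items and the `find_relevant` function are invented for illustration, and the `concepts` field simply stands in for whatever a semantic analysis engine would extract from inside the document.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    title: str
    tags: set = field(default_factory=set)      # meta data written by a human
    concepts: set = field(default_factory=set)  # extracted by the machine from the content itself


# Two items that a human tagged identically but that are about different things.
ITEMS = [
    ContentItem("Caring for your rubber plant", {"rubber plant"}, {"houseplant", "watering"}),
    ContentItem("Rubber plant expands production", {"rubber plant"}, {"factory", "manufacturing"}),
]


def find_relevant(items, keyword, wanted_concepts):
    """Keep items whose human tags match the keyword AND whose
    machine-extracted concepts overlap with what the user is after."""
    matches = []
    for item in items:
        if keyword in item.tags and item.concepts & wanted_concepts:
            matches.append(item.title)
    return matches


print(find_relevant(ITEMS, "rubber plant", {"houseplant"}))
# ['Caring for your rubber plant']
```

A keyword match on the tags alone returns both items. It is only when the machine oriented concepts are combined with the human oriented tags that the factory down the street drops out.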
Remember, all the talk about parsing, extracting, indexing and relating is merely a means to our end. The end is the interaction with information from others of which we were previously unaware. When we interact with that information we "accidentally" collaborate with the creators and influencers of that information. And therein lies the power of accidental collaboration - the ability to tap the wisdom and knowledge of others and put it to our own ends.
I will leave you with this: there is a tsunami of information being created in our organizations and on the public web. There is too much for us as individuals to search through. We must tap computational power to help, but we must direct that computational power so that it parses and finds information that means something to us beyond simple keyword matches or inbound link tallies. Semantic analysis and indexing is the best way today for computers to get into the documents and information containers and return information that they can understand to help us find what we really want.
Next time we continue Ten Requirements for Achieving Collaboration with #2: Topic, knowledge and concept clusters must be able to be automatically created, extracted, enhanced and preserved.