InfoQ looking for users to categorize NoSQL Tech use
By thegreeneman on Jul 26, 2013
Often, when people talk about NoSQL technologies, there is an attempt to categorize the solutions. A new Adoption Trends breakdown by InfoQ takes this tack as well, providing the following categories: Columnar, Document, Graph, In-Memory Grid, Key-Value. I think this definitely has some utility, yet in another respect it misses the main point about this technology segment. These technologies came into existence to deliver one primary and one ancillary core ability. The primary ability, surfaced by Amazon, LinkedIn, Facebook, etc., is the ability to scale and remain available in the face of tremendous data volumes and numbers of concurrent users. The ancillary ability is to provide more agility for the line of business, which is constantly adjusting its solutions as its needs and its understanding of the consumer change. So what considerations should drive the thought process of adopting NoSQL?
Each of the NoSQL technologies in that list of categories has, from a database perspective, a key underlying principle that enables those core NoSQL abilities: a break away from server-side relationship management, moving responsibility for data constraints to the application tier (versus the database) via a key-value based distribution model. It is that key-value paradigm that enables the scale-out architecture through key distribution, and so in some sense they are all key-value stores. In fact, it is for that reason that we've seen, for instance, Cassandra evolve its value implementation several times over the last couple of years, from binary to super column to table structure. Had it not been for the underlying key-value nature of the implementation, it could never have undergone those drastic changes in data storage format in such a short period of time.
This is why the Oracle NoSQL Database (OnDB) was implemented as a key-value store. It provides the ability to layer multiple value abstractions under the core key-based distribution and scale-out model. Today it supports two value abstractions, Binary and JSON, with a third, Table, on the way. Each of the value abstractions provides different utility in application implementation and features the best run-time and storage characteristics for a particular data use. For example, when storing image and video data, the binary abstraction is best suited, especially when it is overlaid with the OnDB streaming interfaces. However, when you want to store nested data structures with internal value inter-dependencies and sub-field updates, JSON is a great value abstraction. Finally, if you need to model data in a format amenable to integrated systems and capable of supporting a richer set of query semantics, the Table abstraction does the job best.
Btw - I might argue that the Graph category of NoSQL database is really an application layer above a NoSQL database. It's the reason we've seen NoSQL databases like Objectivity enter the Graph database category and why you will find that OnDB supports Graph storage and retrieval for Oracle Spatial and Graph ... but this is a different blog topic altogether.
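
To make the idea of key-based distribution with layered value abstractions a bit more concrete, here is a minimal sketch in Python. It is not the OnDB API or any vendor's implementation: the class `ShardedKVStore`, the helper functions, and the `graph/adj/` key prefix are all hypothetical names made up for illustration, and a simple hash-modulo scheme stands in for whatever distribution strategy a real product uses. The core store only knows about keys and opaque bytes; the Binary and JSON abstractions, and even a tiny graph layer, sit on top of it at the application tier.

```python
import hashlib
import json


class ShardedKVStore:
    """Toy key-value core (hypothetical): a key is hashed to pick a shard,
    and the value is stored as opaque bytes."""

    def __init__(self, num_shards=4):
        # Each dict stands in for a separate storage node.
        self.shards = [{} for _ in range(num_shards)]

    def shard_for(self, key):
        # A stable hash of the key alone decides which shard owns it.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self.shards)

    def put(self, key, value):
        if not isinstance(value, bytes):
            raise TypeError("the core store only accepts bytes")
        self.shards[self.shard_for(key)][key] = value

    def get(self, key):
        return self.shards[self.shard_for(key)].get(key)


# Value abstraction: JSON documents, encoded to bytes above the core store.
def put_json(store, key, doc):
    store.put(key, json.dumps(doc).encode("utf-8"))


def get_json(store, key):
    raw = store.get(key)
    return None if raw is None else json.loads(raw.decode("utf-8"))


# Graph as an application layer: adjacency lists kept as JSON values.
def add_edge(store, src, dst):
    key = "graph/adj/" + src
    adjacent = get_json(store, key) or []
    if dst not in adjacent:
        adjacent.append(dst)
    put_json(store, key, adjacent)


def neighbors(store, node):
    return get_json(store, "graph/adj/" + node) or []


# Usage: three kinds of data share one key-distributed store.
store = ShardedKVStore(num_shards=3)
store.put("image/42", b"\x00\x01")           # binary value, stored as-is
put_json(store, "user/7", {"name": "Ann"})   # JSON value
add_edge(store, "a", "b")                    # graph layer on top
```

Note that nothing in `ShardedKVStore` changes when a new value format is added, which is the property the Cassandra example above relies on: the distribution logic depends only on the key.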
Anyway, the point I am trying to make is that companies' use of data will vary greatly. The real category to which all of the NoSQL database implementations belong is the key-value category. Bringing in NoSQL technology that provides a range of value options that can be selected and intermixed to achieve the optimal solution for a given application will provide the greatest flexibility and reduction of risk. The scalability of a key-based distribution architecture should be ever present, but the application of the value abstraction will most surely vary for each solution space. This is something project leads and managers adopting these new technologies should reflect on as they invest their resources and time in learning and adopting a particular product: the repeatability and applicability of that investment for unforeseen future work.
Btw - in the InfoQ Adoption Trends article, there is a survey on current and future use of the many vendor technologies. I encourage everyone to take the time to visit the site and share your position on this important area of data management.