Collapsing the Multiverse

In the web application I work on, the data in question exists in one of about four phases at any given time, depending on how you draw the distinctions. Most of the work I do is in trying to herd the data through these phases, expose it to some interface for consumption and/or manipulation, then herd it back across again. Yah!

Phase is actually the wrong word. Universe is better. The data has to shift in and out of different universes, each defining its own view of reality. RDBs and OOP come readily to mind. Both are built around basic concerns necessitated by fundamental paradigms in modern computer architecture: the data needs to 1) be stored on a disk (RDB) and 2) be pulled into RAM and run through the CPU (OOP).

Since both of these universes have their own internal data model empire (object model, schema, whatever), you have adapters like JDBC to convert between them. Dancing around the adapter, I believe, is where much of the pull-your-hair-out complexity comes from in writing software. The adapter's functionality is simple: acquire a connection, execute a statement, release the connection. The adapter's strengths and weaknesses are easy to understand: connections are expensive, network calls introduce latency, certain types of statements hang the DB server. Surrounding it all you have the ever-tightening noose of increasing drain on the computer's resources as the application gets used.
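The acquire, execute, release sequence can be sketched with plain JDBC. This is only an illustration: the `article` table, its `id` and `title` columns, and the `ArticleTitles.findTitle` helper are hypothetical names, not part of the application described here. The sketch assumes a `DataSource` is already configured elsewhere.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper: looks up one title by id in an assumed "article" table.
final class ArticleTitles {
    static String findTitle(DataSource pool, long id) throws SQLException {
        // Acquire the connection and prepare the statement. try-with-resources
        // releases both in reverse order, even if the query throws.
        try (Connection cnxn = pool.getConnection();
             PreparedStatement stmt =
                 cnxn.prepareStatement("SELECT title FROM article WHERE id = ?")) {
            stmt.setLong(1, id);
            // Execute: this is the network round trip to the DB server.
            try (ResultSet rows = stmt.executeQuery()) {
                // Cross the rift: a relational row becomes a Java value.
                return rows.next() ? rows.getString("title") : null;
            }
        }
    }
}
```

Even in this smallest case, the cost of the connection and the latency of the query are invisible at the call site, which is roughly where the dancing begins.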

My observation is this: coding to this basic set of conditions while navigating the paradigmatic rift between the universes quickly results in Enormous Complexity, similar to how a cellular automaton with simple rules and preconditions propagates into a complex set of arrangements (e.g. chess or the Game of Life). Enormous Complexity necessitates clever exception handling, patterns and antipatterns, performance tuning, persistence frameworks, or pushing as much logic into the DB layer as possible. These techniques help, but they're essentially artifacts built around complexity, and they don't inform the task at hand.

Therefore anything that eliminates the need for these adapters in the first place is a good thing. Case in point: XForms. Besides RDBs and OOP there are the universes of XML and web forms as name/value pairs. XForms replaced the name/value universe with basic XML, which means that, if I used XForms, I could cut out reams of tedious parameter mapping code and let the duty fall to my already-existing XML functions. XForms collapsed the multiverse.
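To make the contrast concrete, here is a rough sketch of the two routes into the XML universe. The `Submission` class and the `contact`, `name` and `email` fields are made up for illustration. The first method is the hand-written parameter mapping that a name/value form requires. The second assumes the request body is already an XML instance, as an XForms submission would provide, so it only needs parsing.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical example: two ways for a form submission to become a DOM document.
final class Submission {
    // Name/value universe: each field is copied into the document by hand.
    // Real forms have far more fields, plus type conversion and validation.
    static Document fromParams(Map<String, String> params) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element contact = doc.createElement("contact");
        doc.appendChild(contact);
        for (String field : new String[] {"name", "email"}) {
            Element el = doc.createElement(field);
            el.setTextContent(params.getOrDefault(field, ""));
            contact.appendChild(el);
        }
        return doc;
    }

    // XML universe: the submitted body is already the document, so parse it and stop.
    static Document fromXml(String body) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(body)));
    }

    // Stand-in for the "already-existing XML functions": evaluate an XPath to a string.
    static String valueAt(Document doc, String xpath) throws Exception {
        return XPathFactory.newInstance().newXPath().evaluate(xpath, doc);
    }
}
```

With either route, `Submission.valueAt(doc, "/contact/name")` reads the same node. The difference is that the mapping loop has to grow with every form, while the parse call does not.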

The question on my mind is, will the RDB/OOP multiverse ever collapse? Is it possible to store data in a form that can be exposed directly as a DOM-like construct, without any under-the-sheets translation between disparate RDB and OOP systems? If so, perhaps the database can be thought of as a virtualized instance of a Big Linked List whose links are more akin to URIs than object references, in that they point to resources within an abstract space that doesn't know about RAM or disk memory. A Container would map these links to real data behind the virtual layer, where this Big Linked List (BLL) would be backed by disk data, while different parts of it are instantiated in physical memory at any given moment according to some intelligent algorithm managed by the Container. (Imagine something remotely akin to swap memory.) Mutations against the BLL's nodes would be backed directly by changes to disk data, or queued in a transaction space. In front of the virtual layer, the BLL would be treated as fully instantiated, so that you could expose pieces of it with XPath- or XQuery-like statements and do work on them, and various nodes throughout the BLL could listen to (or be observed by) other nodes, reacting to conditions and events in useful ways.
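As a toy illustration of the Container idea, and nothing more, the sketch below keeps a few nodes resident in memory and pages the rest in on demand. Everything in it is an assumption made for the example: the `BllNode` and `BllContainer` names, the `bll:/` URI scheme, the in-memory map standing in for disk, least-recently-used eviction as the "intelligent algorithm", and write-back on eviction as the way mutations reach disk. It leaves out the XPath-like querying and the node observers entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// A node links to its successor by URI, never by object reference.
final class BllNode {
    final String uri;
    String value;
    String next; // URI of the following node, or null at the tail

    BllNode(String uri, String value, String next) {
        this.uri = uri;
        this.value = value;
        this.next = next;
    }
}

// Hypothetical Container: resolves URIs to nodes, keeping at most `capacity` in memory.
final class BllContainer {
    private final Map<String, String[]> disk = new HashMap<>(); // stand-in for disk storage
    private final LinkedHashMap<String, BllNode> resident;      // access-ordered, i.e. LRU
    int diskReads = 0;

    BllContainer(final int capacity) {
        resident = new LinkedHashMap<String, BllNode>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, BllNode> eldest) {
                if (size() <= capacity) return false;
                write(eldest.getValue()); // write back before the node leaves memory
                return true;
            }
        };
    }

    // Persist a node's current state to the simulated disk.
    void write(BllNode n) {
        disk.put(n.uri, new String[] {n.value, n.next});
    }

    // Map a URI to a live node, paging it in from disk if it is not resident.
    BllNode resolve(String uri) {
        BllNode n = resident.get(uri);
        if (n == null) {
            String[] row = disk.get(uri);
            if (row == null) return null;
            diskReads++;
            n = new BllNode(uri, row[0], row[1]);
            resident.put(uri, n);
        }
        return n;
    }

    // To the caller the list looks fully instantiated: just follow the links.
    List<String> walk(String headUri) {
        List<String> values = new ArrayList<>();
        BllNode n = resolve(headUri);
        while (n != null) {
            values.add(n.value);
            n = (n.next == null) ? null : resolve(n.next);
        }
        return values;
    }
}
```

A caller would `write` a few nodes, then `walk("bll:/a")` without knowing which of them were in memory at the time. A real system would also need transactions, concurrency control and crash safety, none of which this sketch attempts.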

As an example I'm thinking of a CMS. The either/or distinction between an XML document tree and a collection of relational data (either of which could be considered "content") is one that gets everybody in my org sufficiently jumbled as to cause me major grief; more casualties of the multiverse problem. In my theoretical system it's not either/or, it's both/and. The BLL is directly analogous to a DOM tree that can be serialized as XML or transformed, but its individual nodes—being resources unto themselves—can also be shared, cross-linked and made relational via an RDF-like meta-framework. CMS content is thus stored in a super-normal state that is inherently both document-centric and relational, therefore collapsing the multiverse and avoiding Enormous Complexity and subsequent artifacts.

Now I must disclaim that I'm a speculative nut with these kinds of things. Maybe it's a pipe dream, or maybe it's been done. Maybe it's a dumb idea for reasons I haven't thought of. It just seems that if such a framework could be built, data-driven applications would be an order of magnitude or two easier to write and maintain, and web-based applications would almost fall out of it, especially with XForms in the mix. I've heard interesting things about DOM databases that seem to hold some promise, but I must say I don't know a heck of a lot about them. I'll keep researching. Meanwhile I'd welcome any comments, corrections, hints or pointers about this stuff if anybody feels so inclined.

Well, I suppose I'd better get back to navigating rifts in the multiverse and dealing with Enormous Complexity and said artifacts, meanwhile attempting to mitigate the confusion created by my app's plurality of data representation. Thanks for reading my screed.

Comments:

What about JDO (Java Data Objects, http://access1.sun.com/jdo/)?

Benefits of Using JDO for Application Programming
http://java.sun.com/products/jdo/index.jsp

Applications written with the JDO API are portable, and independent of the underlying database. You can focus on your domain object model and leave the details of persistence (field-by-field storage of objects) to the JDO implementation.

Java Data Objects (JDO) Overview
http://java.sun.com/products/jdo/overview.html

The Java Data Objects (JDO) API is a standard interface-based Java model abstraction of persistence, developed as Java Specification Request 12 under the auspices of the Java Community Process. Application programmers use JDO to directly store their Java domain model instances into the persistent store (database).

Alternatives to JDO include direct file I/O, serialization, JDBC, and Enterprise Java Beans (EJB) Bean Managed Persistence (BMP) or Container Managed Persistence (CMP) Entity Beans.

If you are an application programmer, these are the benefits for you in using JDO:

Portability: Applications written with the JDO API can be run on multiple implementations without recompiling or changing source code.

Database independence: Applications written with the JDO API are independent of the underlying database.

Ease of use: Application programmers can focus on their domain object model and leave the details of persistence (field-by-field storage of objects) to the JDO implementation.

High performance: Application programmers delegate the details of persistence to the JDO implementation, which can optimize data access patterns for optimal performance.

Integration with EJB: Applications can take advantage of EJB features such as remote message processing, automatic distributed transaction coordination, and security, using the same domain object models throughout the enterprise.

Posted by mxt on July 30, 2004 at 09:19 AM MDT #

I'm really interested in something that eliminates the need to write two separate data models: a DB schema and an object model. This is admittedly far-fetched and not immediately useful to Java developers. On the other hand, JDO is a good way to separate the two concerns, and has obvious applicability today, but it ultimately widens the rift between RDBs and OOP.

Posted by greimer on August 02, 2004 at 03:44 AM MDT #

About

My name is Greg Reimer and I'm a web technologist for the Sun.COM web design team.
