Wednesday Nov 02, 2011

Oracle NoSQL Database Slides From HPTS

Here are the slides from my HPTS talk. The performance figures start at slide 28.

Monday Oct 24, 2011

Oracle NoSQL Database Available for Download

Oracle NoSQL Database is available for download on OTN.

Press Release

Friday Oct 21, 2011

Oracle NoSQL Database Indexing

Oracle NoSQL Database's simple K/V pair model utilizes a B+Tree on each node to index by the key of each record.  Is a Key-Value store useful with only primary key indexing?  Absolutely.


  • Click stream logs - indexed by timestamp or ipaddr
  • User Profiles - indexed by UID
  • Sensor/Stats/Network capture - indexed by timestamp
  • Mobile device backup services - indexed by device id or user id
  • Personalization - indexed by user id, with further look-ups within a user id by sub key as needed
  • Authentication services - indexed by user id

In an unstructured or semi-structured environment, primary-key indexing is very often sufficient.  Further, consider the case of Map/Reduce post-processing of NoSQL Database data in any of the above scenarios.  During the M/R steps, secondary indices, sometimes ad-hoc, are effectively generated on-the-fly.
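To make the "secondary index generated on-the-fly" idea concrete, here is a minimal sketch in plain Java. It does not use the Oracle NoSQL Database API or a real Map/Reduce framework; the Profile class, the city attribute, and the indexByCity method are all hypothetical names chosen for illustration. The store is keyed only by user id, and a single map-and-group pass builds an ad-hoc index on a non-key attribute.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SecondaryIndexSketch {

    // Hypothetical record shape: userId is the primary key, city is a non-key attribute.
    static final class Profile {
        final String userId;
        final String city;

        Profile(String userId, String city) {
            this.userId = userId;
            this.city = city;
        }
    }

    // "Map" emits (city, userId) for each record; "reduce" groups the user ids by city.
    static Map<String, List<String>> indexByCity(Map<String, Profile> store) {
        Map<String, List<String>> index = new TreeMap<>();
        for (Profile p : store.values()) {
            index.computeIfAbsent(p.city, c -> new ArrayList<>()).add(p.userId);
        }
        return index;
    }

    public static void main(String[] args) {
        // The store itself is indexed only by primary key (user id).
        Map<String, Profile> store = new TreeMap<>();
        store.put("u1", new Profile("u1", "Boston"));
        store.put("u2", new Profile("u2", "Austin"));
        store.put("u3", new Profile("u3", "Boston"));

        // prints {Austin=[u2], Boston=[u1, u3]}
        System.out.println(indexByCity(store));
    }
}
```

The index exists only for the duration of the job, which is the point: nothing about the stored data had to anticipate the query.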

Tuesday Oct 18, 2011

Oracle NoSQL Database Doc available on OTN

The Oracle NoSQL Database doc is available on OTN at:

Sunday Oct 09, 2011

Cisco and Oracle NoSQL Database

The Oracle NoSQL Database development team has been working closely with the Cisco UCS team.  This is a great partnership: we collaborate on performance and scalability testing using their UCS C-Series Rack-Mount Servers and Cisco Nexus 550 Series Switching, and we have access to Cisco's large cluster to run tests at massive scale and proofs of concept.

I am planning to write some blog entries describing the results. Cisco has produced a solution brief about Oracle NoSQL Database on the UCS platform.

Thursday Oct 06, 2011

Oracle NoSQL Database vs Berkeley DB Java Edition

I've been watching the twitter-sphere for comments about Oracle NoSQL Database.  There are a number of common questions and misconceptions floating around that I'll address here:

Misconception #1: "Oracle NoSQL Database is just Berkeley DB Java Edition rebranded."; "Oracle NoSQL Database sounds like it's just Berkeley DB with extra bits."

When we built NoSQL Database, we recognized that Berkeley DB Java Edition HA provided us with lots of necessary, but not sufficient, elements for a NoSQL store.  For instance, JE/HA gives us:

  • ACID Transactions
  • Persistence
  • High Availability
  • High Throughput
  • Large Capacity
  • Lights out administration

And you could even argue that its key/value data model is already "NoSQL".  But we believe that NoSQL means something more to most people, such as:

  • Data distribution
  • Dynamic partitioning (aka "sharding")
  • Load balancing
  • Monitoring and Administration
  • Predictable latency
  • Multi-node backup

So although NoSQL Database is built using BDB JE/HA as the underlying, battle-tested, storage system (why reinvent the wheel?), NoSQL Database adds a large amount of infrastructure on top of it to bring it into the NoSQL realm.  As my colleague Chao Huang says, "BDB JE is like an engine. NoSQL Database is the car built with the engine."

Misconception #2: "Oracle NoSQL Database has the same API as Berkeley DB Java Edition"

I realize that at the time of this writing we have not released the software so the reader has no way of looking at the javadoc to see the actual NoSQL Database API, but suffice it to say that the API is not the same as BDB JE.  The interface is Java, and it provides CRUD, iteration, and CAS (aka "RMW") capabilities on key/value pairs.  There is also a major/minor key capability.  All key/value pairs with the same major key reside on the same "Rep Group" (a Rep Group is just a BDB JE HA replication group of a master and N replicas).  That way, records can be clustered (e.g. put all records related to "Fred" on the same node).  One other (slight) difference between the BDB JE and NoSQL Database APIs is that the former uses byte[] for keys and the latter uses Strings for keys.  Both use byte[] for the data portion.
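As a rough illustration of the major/minor key idea, here is a small in-memory sketch in plain Java. It is not the NoSQL Database API: the class and method names are hypothetical, the "/" key separator is made up, and picking a Rep Group by hashing the major key is an assumption made only to show the clustering effect. What it does reflect from the description above is String keys, byte[] values, and all records that share a major key landing in the same group.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

class MajorMinorKeySketch {

    // One sorted map per Rep Group, standing in for the per-node index.
    private final List<NavigableMap<String, byte[]>> repGroups = new ArrayList<>();

    MajorMinorKeySketch(int numRepGroups) {
        for (int i = 0; i < numRepGroups; i++) {
            repGroups.add(new TreeMap<>());
        }
    }

    // Assumption: a hash of the major key alone picks the Rep Group.
    int repGroupFor(String majorKey) {
        return Math.floorMod(majorKey.hashCode(), repGroups.size());
    }

    // Keys are Strings, values are byte[]. Assumes key parts contain no '/'.
    void put(String major, String minor, byte[] value) {
        repGroups.get(repGroupFor(major)).put(major + "/" + minor, value);
    }

    byte[] get(String major, String minor) {
        return repGroups.get(repGroupFor(major)).get(major + "/" + minor);
    }

    // A simple compare-and-set flavor: store only if no value exists yet.
    boolean putIfAbsent(String major, String minor, byte[] value) {
        return repGroups.get(repGroupFor(major)).putIfAbsent(major + "/" + minor, value) == null;
    }

    // All records under one major key: a range scan on a single Rep Group.
    // '0' is the character right after '/', so the range covers every "major/..." key.
    SortedMap<String, byte[]> allUnder(String major) {
        return repGroups.get(repGroupFor(major)).subMap(major + "/", major + "0");
    }

    public static void main(String[] args) {
        MajorMinorKeySketch store = new MajorMinorKeySketch(3);
        store.put("Fred", "email", "fred@example.com".getBytes(StandardCharsets.UTF_8));
        store.put("Fred", "phone", "555-0100".getBytes(StandardCharsets.UTF_8));
        store.put("Wilma", "email", "wilma@example.com".getBytes(StandardCharsets.UTF_8));

        // prints [Fred/email, Fred/phone]
        System.out.println(store.allUnder("Fred").keySet());
    }
}
```

Because only the major key feeds the group selection, fetching everything related to "Fred" touches a single group, no matter how many minor keys exist.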

(Non-) Misconception #3: "Oracle is adding network bindings to Berkeley DB Java, branding it Oracle NoSQL. I am curious how easy setup and develoment will be."

Let me address the second question first (ease of setup/development).  Although this isn't a misconception, it is a good question.  In general it is difficult for the average developer who wants to try out a large distributed store to find sufficient hardware to get a reasonably sized cluster going.   Well, maybe it's not difficult for you, but it sure is for all of us -- we have to claw and scratch for every machine we use(*).  So George (one of our developers) put together what we call "kvlite", a single process version of Oracle NoSQL Database.  kvlite is really easy to start up (one simple command line invocation) and gives the user a good way of trying out the API without a lot of muss and fuss.  The "server side" is in no way tuned for performance, but it lets you get things going really quickly so you can kick the tires, try out your application code, etc. while your sysadmins and IT folks scrounge the real hardware for you to use for deployment.

(*) We actually have several large clusters to do development and performance testing at our disposal.

And now the first part of the question (adding network bindings to Berkeley DB Java Edition).  Hmm, that's kind of, sort of true.  Let me try to reframe the statement.  BDB JE HA allows a user to perform operations on either the master (for updates and reads) or the replicas (for reads).  The most common objection that we encounter is that the application has to "know" which nodes are the master and the replicas (for routing updates and read requests appropriately).  There is no network layer in BDB JE/HA to handle this for you.  Oracle NoSQL Database provides this capability.  You link in the kvclient.jar (the "driver") to your application, and presto, you can make your CRUD (or iteration) method calls on your K/V Store.  The kvclient.jar figures out which node to route the request to (it knows which Rep Group holds the key/value pair and which node in that Rep Group is the master).  So in that sense, it adds a network layer to BDB, but the API is different from BDB so I wouldn't exactly call it a network binding.  There's a lot of infrastructure and intelligence (e.g. load balancing) built into the kvclient "driver".
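The routing decision described above can be sketched in a few lines of plain Java. This is not the kvclient implementation: the class names, the node names, the hash-based choice of Rep Group, and the round-robin read policy are all assumptions made for illustration. The only parts taken from the description are that updates go to the master of the Rep Group that owns the key, and reads may go to the master or a replica.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class RequestRouterSketch {

    // A Rep Group: one master plus N replicas (node names are hypothetical).
    static final class RepGroup {
        final String master;
        final List<String> replicas;

        RepGroup(String master, String... replicas) {
            this.master = master;
            this.replicas = Arrays.asList(replicas);
        }

        List<String> allNodes() {
            List<String> nodes = new ArrayList<>();
            nodes.add(master);
            nodes.addAll(replicas);
            return nodes;
        }
    }

    private final List<RepGroup> groups;
    private final AtomicInteger readCounter = new AtomicInteger();

    RequestRouterSketch(List<RepGroup> groups) {
        this.groups = groups;
    }

    // Assumption: hashing the major key selects the Rep Group.
    RepGroup groupFor(String majorKey) {
        return groups.get(Math.floorMod(majorKey.hashCode(), groups.size()));
    }

    // Updates must go to the master of the owning Rep Group.
    String routeWrite(String majorKey) {
        return groupFor(majorKey).master;
    }

    // Reads may go to the master or any replica; round-robin is an assumed policy.
    String routeRead(String majorKey) {
        List<String> nodes = groupFor(majorKey).allNodes();
        return nodes.get(Math.floorMod(readCounter.getAndIncrement(), nodes.size()));
    }

    public static void main(String[] args) {
        RequestRouterSketch router = new RequestRouterSketch(Arrays.asList(
            new RepGroup("rg1-master", "rg1-replica1", "rg1-replica2"),
            new RepGroup("rg2-master", "rg2-replica1", "rg2-replica2")));

        System.out.println("write Fred -> " + router.routeWrite("Fred"));
        System.out.println("read  Fred -> " + router.routeRead("Fred"));
    }
}
```

The application code only ever supplies a key; the knowledge of group membership and mastership stays inside the router, which is the role the driver plays.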

Monday Oct 03, 2011

Oracle NoSQL Database

Today at Oracle OpenWorld, we are announcing Oracle NoSQL Database.  From the datasheet:

Oracle NoSQL Database provides network-accessible multi-terabyte distributed key/value pair storage that offers predictable latency. That is, it services network requests to store and retrieve data which is organized into key-value pairs. It offers full Create, Read, Update and Delete (CRUD) operations, with adjustable durability guarantees.  Oracle NoSQL Database is designed to be a highly available and extremely scalable system, with predictable levels of throughput and latency, while requiring minimal administrative interaction.

My colleagues and I have been working hard to bring this project to fruition and it's truly exciting for all of us to see it roll out the door (as well as to be able to finally talk about it in public).  It will come in two versions, an Open Source Community Edition, and a value-add "Enterprise Edition".  Initially, both Editions will have the same feature set, but in subsequent releases there will be differentiation between the two. My colleague Margo Seltzer has written a fine whitepaper which describes the system.  If you have the time, it's an easy read.

In future posts to this blog I hope to talk about some of the great performance and scaling numbers we're seeing in our tests.  To demonstrate the system's capabilities, we've been working with two very fine corporate partners to run tests on clusters of up to 192 nodes.

We also announced the Oracle Big Data Appliance, an "engineered system" which will run (among other things) Oracle NoSQL Database.


Anything related to Oracle NoSQL Database and/or Berkeley DB Java Edition.

