Tuesday Dec 20, 2011

Oracle NoSQL Database in 5 Minutes

Inspired by some other "Getting started in 5 minutes" guides, we now have a Quick Start Guide for Oracle NoSQL Database.  kvlite, the single-process Oracle NoSQL Database, makes it incredibly easy to get up and running.  I have to give the standard disclaimer: kvlite is only meant for kicking the tires on the API.  It is not meant for any kind of performance evaluation or production use.

Oracle NoSQL Database - A Quickstart In 5 Minutes

Install Oracle NoSQL Database

  • Download the tar.gz file from http://www.oracle.com/technetwork/database/nosqldb/downloads/index.html.
  • gunzip and untar the .tar.gz package (or unzip if you downloaded the .zip package). Oracle NoSQL Database version 1.2.116 Community Edition is used in this example.

    $ gunzip kv-ce-1.2.116.tar.gz
    $ tar xvf kv-ce-1.2.116.tar

Start up KVLite

KVLite is a single process version of Oracle NoSQL Database. KVLite is not tuned for performance, but does give you easy access to a simple Key/Value store so that you can test the API.

  • cd into the kv-1.2.116 directory to start the NoSQL Database server.

    $ cd kv-1.2.116
    $ java -jar lib/kvstore-1.2.116.jar kvlite
    Created new kvlite store with args:
    -root ./kvroot -store kvstore -host myhost -port 5000 -admin 5001
  • In a second shell, cd into the kv-1.2.116 directory and ping your KVLite to test that it's alive.

    $ cd kv-1.2.116
    $ java -jar lib/kvstore-1.2.116.jar ping -port 5000 -host myhost
    Pinging components of store kvstore based upon topology sequence #14
    kvstore comprises 10 partitions and 1 Storage Nodes
    Storage Node [sn1] on myhost:5000    Datacenter: KVLite [dc1]    Status: RUNNING   Ver: 11gR2.1.2.116
            Rep Node [rg1-rn1]      Status: RUNNING,MASTER at sequence number: 31 haPort: 5011
  • Compile and run the Hello World example. This opens the Oracle NoSQL Database and writes a single record.

    $ javac -cp examples:lib/kvclient-1.2.116.jar examples/hello/HelloBigDataWorld.java
    $ java -cp examples:lib/kvclient-1.2.116.jar hello.HelloBigDataWorld
    Hello Big Data World!
  • Peruse the Hello World example code and expand it to experiment more with the Oracle NoSQL Database API.

Learn more about Oracle NoSQL Database

Open the doc landing page (either locally in kv-1.2.116/doc/index.html or on OTN). From there, the Getting Started Guide (HTML | PDF) and Javadoc will introduce you to the NoSQL Database API. The Oracle NoSQL Database Administrator's Guide (HTML | PDF) will help you understand how to plan and deploy a larger installation.

Remember, KVLite should only be used to become familiar with the NoSQL Database API. Any serious evaluation of the system should be done with a multi-process, multi-node configuration.

  • To install a standard, multi-node system, you need to repeat the instructions above on how to unpack the package on any nodes that do not yet have the software accessible. Then follow a few additional steps, described in the Admin Guide Installation chapter. Be sure to run ntp on each node in the system.
  • If you want to get started with a multi-node installation right away, here's a sample script for creating a three-node configuration on a set of nodes named compute01, compute02, and compute03. You can execute it using the NoSQL Database CLI.
    configure "mystore"
    plan -execute deploy-datacenter BurlDC Burlington
    plan -execute deploy-sn 1 compute01 5000 Compute01StorageNode
    plan -execute deploy-admin 1 5001
    addpool mySNPool
    joinpool mySNPool 1
    plan -execute deploy-sn 1 compute02 5000 Compute02StorageNode
    joinpool mySNPool 2
    plan -execute deploy-sn 1 compute03 5000 Compute03StorageNode
    joinpool mySNPool 3
    plan -execute deploy-store mySNPool 3 100
    show plans
    show topology
  • You can access the Administrative Console at http://compute01:5001/ at any time after the plan -execute deploy-admin command to view the status of your store.
  • To evaluate performance, you will want to be sure to set JVM and cache size parameters to values appropriate for your target hosts. See Planning Your Installation for information on how to determine those values. The following commands are sample parameters for target machines that have more than 32GB of memory. These commands would be invoked after the configure "mystore" command.
    set policy "javaMiscParams=-server -d64 -XX:+UseCompressedOops -XX:+AlwaysPreTouch -Xms32000m -Xmx32000m -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/tmp/gc-kv.log"
    set policy "cacheSize=22423814540"
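As a quick sanity check on the sample values above, the following sketch converts the -Xmx32000m heap setting to bytes and compares it with the sample cacheSize. It only restates the arithmetic: the class and method names are made up for illustration, and nothing here is taken from the sizing guidance in Planning Your Installation. With these two sample values the cache comes out at roughly two thirds of the heap.

```java
public class CacheSizing {
    // The JVM's "m" suffix means mebibytes (1024 * 1024 bytes).
    static final long BYTES_PER_MIB = 1024L * 1024L;

    /** Converts a heap size given in JVM "m" units to bytes. */
    static long heapBytes(long mebibytes) {
        return mebibytes * BYTES_PER_MIB;
    }

    /** Returns the share of the heap that the cache would occupy. */
    static double cacheFraction(long cacheBytes, long heapBytes) {
        return (double) cacheBytes / (double) heapBytes;
    }

    public static void main(String[] args) {
        long heap = heapBytes(32000);   // from -Xms32000m -Xmx32000m
        long cache = 22423814540L;      // from cacheSize=22423814540
        System.out.println("heap bytes:  " + heap);
        System.out.println("cache bytes: " + cache);
        System.out.println("cache/heap:  " + cacheFraction(cache, heap));
    }
}
```

If you change the heap size for a different host, rerunning the same arithmetic shows whether your cacheSize still fits comfortably inside the heap.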

You can ask questions, or make comments on the Oracle NoSQL Database OTN forum.

Oracle NoSQL Database 1.2.123 Community and Enterprise Editions Available for Download

Oracle NoSQL Database release 1.2.123, in both Community Edition (new) and Enterprise Edition, is now available for download on OTN.


This release includes some minor bug fixes, a performance improvement to the snapshot function, and the deprecation of kvctl (see the changelog for details).  It is also the first release offered as a Community Edition.  The CE package includes source code, and the license for CE is AGPLv3.  The license for the EE remains the same as before (standard OTN license).

Wednesday Nov 02, 2011

Oracle NoSQL Database Slides From HPTS

Here are my slides from HPTS. There are some slides with performance figures starting at slide 28.

Monday Oct 24, 2011

Oracle NoSQL Database Available for Download

Oracle NoSQL Database is available for download on OTN.

Press Release

Friday Oct 21, 2011

Oracle NoSQL Database Indexing

Oracle NoSQL Database's simple K/V pair model uses a B+Tree on each node to index by the key of each record.  Is a Key-Value store useful with only primary key indexing?  Absolutely.  Consider these use cases:


  • Click stream logs - indexed by timestamp or IP address
  • User Profiles - indexed by UID
  • Sensor/Stats/Network capture - indexed by timestamp
  • Mobile device backup services - indexed by device id or user id
  • Personalization - index these by user id and then do further look-up within the user id by sub key as needed
  • Authentication services - indexed by user id

In an unstructured or semi-structured environment, primary-key indexing is very often sufficient.  Further, consider the case of Map/Reduce post-processing of NoSQL Database data in any of the above scenarios.  During the M/R steps, secondary indices, sometimes ad-hoc, are effectively generated on-the-fly.
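To make the "secondary index on the fly" idea concrete, here is a minimal in-memory sketch. It does not use Hadoop or the NoSQL Database API. It assumes hypothetical user-profile records keyed by user id whose value is simply a city name, and it groups the primary keys by that value, which is in effect what a map step (emit city, user id) followed by a reduce step (collect per city) would produce.

```java
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeMap;
import java.util.TreeSet;

public class AdHocIndex {
    /** Builds an ad-hoc secondary index: record value -> sorted set of primary keys. */
    static Map<String, SortedSet<String>> indexByValue(Map<String, String> recordsByPrimaryKey) {
        Map<String, SortedSet<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> record : recordsByPrimaryKey.entrySet()) {
            String secondaryKey = record.getValue();  // "map": emit (value, primary key)
            String primaryKey = record.getKey();
            // "reduce": collect all primary keys that share the same value
            index.computeIfAbsent(secondaryKey, k -> new TreeSet<>()).add(primaryKey);
        }
        return index;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: user id -> city.
        Map<String, String> cityByUserId = new TreeMap<>();
        cityByUserId.put("u1", "Boston");
        cityByUserId.put("u2", "Austin");
        cityByUserId.put("u3", "Boston");
        System.out.println(indexByValue(cityByUserId));
        // prints {Austin=[u2], Boston=[u1, u3]}
    }
}
```

The store itself only ever needed the primary key (user id). The city index exists only for the duration of the post-processing job.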

Tuesday Oct 18, 2011

Oracle NoSQL Database Doc available on OTN

The Oracle NoSQL Database documentation is available on OTN.


Sunday Oct 09, 2011

Cisco and Oracle NoSQL Database

The Oracle NoSQL Database development team has been working closely with the Cisco UCS team.  This is a great partnership: we collaborate on performance and scalability testing using their UCS C-Series Rack-Mount Servers and Cisco Nexus 550 Series Switching, and we have access to Cisco's large cluster to run tests and proofs of concept at massive scale.

I am planning to write some blog entries describing the results. Cisco has produced a solution brief about Oracle NoSQL Database on the UCS platform.

Thursday Oct 06, 2011

Oracle NoSQL Database vs Berkeley DB Java Edition

I've been watching the twitter-sphere for comments about Oracle NoSQL Database.  There are a number of common questions and misconceptions floating around that I'll address here:

Misconception #1: "Oracle NoSQL Database is just Berkeley DB Java Edition rebranded."; "Oracle NoSQL Database sounds like it's just Berkeley DB with extra bits."

When we built NoSQL Database, we recognized that Berkeley DB Java Edition HA provided us with lots of necessary, but not sufficient, elements for a NoSQL store.  For instance, JE/HA gives us:

  • ACID Transactions
  • Persistence
  • High Availability
  • High Throughput
  • Large Capacity
  • Lights out administration

And you could even argue that its key/value data model is already "NoSQL".  But we believe that NoSQL means something more to most people, such as:

  • Data distribution
  • Dynamic partitioning (aka "sharding")
  • Load balancing
  • Monitoring and Administration
  • Predictable latency
  • Multi-node backup

So although NoSQL Database is built using BDB JE/HA as the underlying, battle-tested, storage system (why reinvent the wheel?), NoSQL Database adds a large amount of infrastructure on top of it to bring it into the NoSQL realm.  As my colleague Chao Huang says, "BDB JE is like an engine. NoSQL Database is the car built with the engine."

Misconception #2: "Oracle NoSQL Database has the same API as Berkeley DB Java Edition"

I realize that at the time of this writing we have not released the software so the reader has no way of looking at the javadoc to see the actual NoSQL Database API, but suffice it to say that the API is not the same as BDB JE.  The interface is Java, and it provides CRUD, iteration, and CAS (aka "RMW") capabilities on key/value pairs.  There is also a major/minor key capability.  All key/value pairs with the same major key reside on the same "Rep Group" (a Rep Group is just a BDB JE HA replication group of a master and N replicas).  That way, records can be clustered (e.g. put all records related to "Fred" on the same node).  One other (slight) difference between the BDB JE and NoSQL Database APIs is that the former uses byte[] for keys and the latter uses Strings for keys.  Both use byte[] for the data portion.
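The clustering property of major keys can be illustrated with a small sketch. This is not the NoSQL Database API and not its actual placement algorithm: the class name, the use of String.hashCode(), and the Rep Group count of 3 are all assumptions made for illustration. The only point it demonstrates is that placement depends on the major key alone, so every record for "Fred" lands on the same Rep Group regardless of its minor key.

```java
public class MajorKeyPlacement {
    /**
     * Picks a Rep Group index from the major key only.
     * The minor key is accepted but deliberately ignored, which is what
     * keeps all records with the same major key together.
     */
    static int repGroupFor(String majorKey, String minorKey, int repGroupCount) {
        // Hypothetical hash-based placement; floorMod keeps the result non-negative.
        return Math.floorMod(majorKey.hashCode(), repGroupCount);
    }

    public static void main(String[] args) {
        int repGroups = 3; // assumed cluster size for this example
        System.out.println("Fred/email  -> rg" + repGroupFor("Fred", "email", repGroups));
        System.out.println("Fred/phone  -> rg" + repGroupFor("Fred", "phone", repGroups));
        System.out.println("Wilma/email -> rg" + repGroupFor("Wilma", "email", repGroups));
    }
}
```

The two "Fred" lines always print the same Rep Group; "Wilma" may or may not share it, depending on the hash.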

(Non-) Misconception #3: "Oracle is adding network bindings to Berkeley DB Java, branding it Oracle NoSQL. I am curious how easy setup and develoment will be."

Let me address the second question first (ease of setup/development).  Although this isn't a misconception, it is a good question.  In general it is difficult for the average developer who wants to try out a large distributed store to find sufficient hardware to get a reasonably sized cluster going.   Well, maybe it's not difficult for you, but it sure is for all of us -- we have to claw and scratch for every machine we use(*).  So George (one of our developers) put together what we call "kvlite", a single-process version of Oracle NoSQL Database.  kvlite is really easy to start up (one simple command line invocation) and gives the user a good way of trying out the API without a lot of muss and fuss.  The "server side" is in no way tuned for performance, but it lets you get things going really quickly so you can kick the tires, try out your application code, etc. while your sysadmins and IT folks scrounge the real hardware for you to use for deployment.

(*) We actually have several large clusters at our disposal for development and performance testing.

And now the first part of the question (adding network bindings to Berkeley DB Java Edition).  Hmm, that's kind of, sort of true.  Let me try to reframe the statement.  BDB JE HA allows a user to perform operations on either the master (for updates and reads) or the replicas (for reads).  The most common objection that we encounter is that the application has to "know" which nodes are the master and the replicas (for routing updates and read requests appropriately).  There is no network layer in BDB JE/HA to handle this for you.  Oracle NoSQL Database provides this capability.  You link in the kvclient.jar (the "driver") to your application, and presto, you can make your CRUD (or iteration) method calls on your K/V Store.  The kvclient.jar figures out which node to route the request to (it knows which Rep Group holds the key value pair and which node in that Rep Group is the master).  So in that sense, it adds a network layer to BDB, but the API is different from BDB so I wouldn't exactly call it a network binding.  There's a lot of infrastructure and intelligence (e.g. load balancing) built into the kvclient "driver".
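The routing rule described above (updates go to the master, reads may go to the master or a replica) can be sketched in a few lines. This is a toy model, not the kvclient implementation: the class names, the round-robin read policy, and the host names are assumptions for illustration, and the real driver's load balancing is not described in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RequestRouter {
    enum Op { READ, UPDATE }

    /** A toy Rep Group: one master plus N replicas. */
    static final class RepGroup {
        private final String master;
        private final List<String> readable = new ArrayList<>();
        private int nextRead = 0;

        RepGroup(String master, List<String> replicas) {
            this.master = master;
            readable.add(master);       // the master can serve reads too
            readable.addAll(replicas);
        }

        /** Updates always go to the master; reads rotate over all nodes. */
        String nodeFor(Op op) {
            if (op == Op.UPDATE) {
                return master;
            }
            String node = readable.get(nextRead);
            nextRead = (nextRead + 1) % readable.size();
            return node;
        }
    }

    public static void main(String[] args) {
        RepGroup rg1 = new RepGroup("compute01", Arrays.asList("compute02", "compute03"));
        System.out.println("update -> " + rg1.nodeFor(Op.UPDATE));
        for (int i = 0; i < 4; i++) {
            System.out.println("read   -> " + rg1.nodeFor(Op.READ));
        }
    }
}
```

The application never sees any of this; it only issues CRUD calls, and the decision about which node receives each request stays inside the driver.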

Steve Jobs, 1955 - 2011

I respectfully contemplate the impact Steve Jobs has had on our industry and the world.

Monday Oct 03, 2011

Oracle NoSQL Database

Today at Oracle OpenWorld, we are announcing Oracle NoSQL Database.  From the datasheet:

Oracle NoSQL Database provides network-accessible multi-terabyte distributed key/value pair storage that offers predictable latency. That is, it services network requests to store and retrieve data which is organized into key-value pairs. It offers full Create, Read, Update and Delete (CRUD) operations, with adjustable durability guarantees.  Oracle NoSQL Database is designed to be a highly available and extremely scalable system, with predictable levels of throughput and latency, while requiring minimal administrative interaction.

My colleagues and I have been working hard to bring this project to fruition and it's truly exciting for all of us to see it roll out the door (as well as to be able to finally talk about it in public).  It will come in two versions, an Open Source Community Edition, and a value-add "Enterprise Edition".  Initially, both Editions will have the same feature set, but in subsequent releases there will be differentiation between the two. My colleague Margo Seltzer has written a fine whitepaper which describes the system.  If you have the time, it's an easy read.

In future posts to this blog I hope to talk about some of the great performance and scaling numbers we're seeing in our tests.  To demonstrate the system's capabilities, we've been working with two very fine corporate partners to run tests on clusters of up to 192 nodes.

We also announced the Oracle Big Data Appliance, an "engineered system" which will run (among other things) Oracle NoSQL Database.

Thursday Jan 06, 2011

Berkeley DB Java Edition 4.1.7


Friday Oct 29, 2010

Berkeley DB Java Edition 4.1 Improvements


Monday May 03, 2010

Berkeley DB Java Edition 4.0.103 Available


Anything related to Oracle NoSQL Database and/or Berkeley DB Java Edition.

