Wednesday Dec 18, 2013

Enterprise vs Consumer HDDs

Backblaze has two interesting blog posts about enterprise vs consumer
drives. Their conclusions are that drives fail either when they're
young, or when they're old, but not when they're in mid-life. They also
found no real difference in failure rates between the two classes.

Thursday Aug 01, 2013

Webinar: NoSQL - Data Center Centric Application Enablement

About the Webinar

The growth of data center infrastructure is trending out of bounds, along with the pace of user activity and data generation in this digital era. However, the nature of the typical application deployment within the data center is changing to accommodate new business needs. Those changes introduce complexities in application deployment architecture and design, which cascade into requirements for a new generation of database technology (NoSQL) destined to ease that complexity. This webcast will discuss the modern data center's data-centric applications, the complexities that must be dealt with, and the common architectures used to describe and prescribe new data-center-aware services. We'll look at the practical issues in implementation and give an overview of the current state of the art in NoSQL database technology for solving the problems of data center awareness in application development.



NOTE! All attendees will be entered to win a guest pass to the NoSQL Now! 2013 Conference & Expo.

About the Speaker Robert Greene, Oracle NoSQL Product Management

Robert Greene is a principal product manager / strategist for Oracle’s NoSQL Database technology. Prior to Oracle he was the V.P. Technology for a NoSQL Database company, Versant Corporation, where he set the strategy for alignment with Big Data technology trends resulting in the acquisition of the company by Actian Corp in 2012. Robert has been an active member of both commercial and open source initiatives in the NoSQL and Object Relational Mapping spaces for the past 18 years, developing software, leading project teams, authoring articles and presenting at major conferences on these topics. In his previous life, Robert was an electronic engineer developing first generation wireless, spread spectrum based security systems.

Sunday Jul 28, 2013

MorphoTrak: "Storing billions of images in a hybrid relational and NoSQL database using Oracle Active Data Guard and Oracle NoSQL Database"

Aris Prassinos of MorphoTrak posted a slide set entitled "Storing billions of images in a hybrid relational and NoSQL database using Oracle Active Data Guard and Oracle NoSQL". In it, he details how they migrated their application from being one that was implemented completely upon Oracle Server, to one that uses Oracle Server + NoSQL.

Friday Jun 14, 2013

Oracle NoSQL Database Contest Announced

As many of you know, the Oracle NoSQL Database was advanced into the pole position of the NoSQL landscape through its underlying use of BerkeleyDB as its core storage engine. Oracle's NoSQL Database is an advanced cluster management, load balancing, parallelization and replication layer that turns BerkeleyDB into a superior scale-out solution. An Oracle partner has now launched a contest to highlight some of the coolest applications built on the solution.

Contest - Show off your NoSQL, win an iPad!!

A partner of ours, OpalSoft Inc, is running a contest to select the coolest "Oracle NoSQL Database Application". It is simple to enter: just go to the contest website and submit information about an application you've built on the Oracle NoSQL Database. If you haven't built one already, you still have time, so go ahead and download the database, create an application or integrate with some cool open source project, then submit your entry. You have until July 8th, 2013 to complete your submission. The chosen winner will receive a new iPad with one of those retina displays, perfect for hanging out by the pool this summer!
For complete details, visit the contest website.

Tuesday Jan 15, 2013

Oracle NoSQL Database Storage Node Capacity Parameter

I noticed in this article about Cassandra 1.2 that they have added the concept of vnodes, which allow you to have multiple nodes on a piece of hardware. This is pretty much the same as Oracle NoSQL Database's capability to place multiple Rep Nodes per Storage Node using the Capacity parameter. In general, the recommended starting point in configuring multiple Replication Nodes per Storage Node is one Rep Node per spindle or IO Channel.

The article also talks about Atomic Batching, which has been available in Oracle NoSQL Database since R1 through the various oracle.kv.KVStore.execute() methods. This capability allows an application to batch multiple operations against multiple records with the same major key in one atomic operation (transaction). Our users have all said that this is an important capability.
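
To make the same-major-key rule concrete, here is a toy in-memory model in Python. It is not the Oracle NoSQL Database API: ToyStore, the tuple-shaped operations, and the (major, minor) key tuples are hypothetical names made up for this sketch. It only illustrates the all-or-nothing behaviour of a batch whose operations share one major key.

```python
class ToyStore:
    """Toy key/value store; keys are (major, minor) tuples. Not the Oracle API."""

    def __init__(self):
        self.data = {}

    def execute(self, ops):
        """Apply a list of (kind, key, value) operations atomically.

        kind is "put" or "delete"; value is ignored for deletes.
        """
        majors = {key[0] for _, key, _ in ops}
        if len(majors) > 1:
            raise ValueError("all operations must share the same major key")
        staged = dict(self.data)          # work on a copy of the store
        for kind, key, value in ops:
            if kind == "put":
                staged[key] = value
            elif kind == "delete":
                if key not in staged:
                    raise KeyError(key)   # abort: nothing has been applied
                del staged[key]
            else:
                raise ValueError(f"unknown operation: {kind}")
        self.data = staged                # commit every change at once


store = ToyStore()
store.execute([
    ("put", ("user/42", "name"), "Ada"),
    ("put", ("user/42", "email"), "ada@example.com"),
])
```

Staging the changes on a copy and swapping it in at the end is the simplest way to get all-or-nothing semantics in a toy; a real store would rely on its transaction machinery instead.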

Thursday Dec 20, 2012

Oracle NoSQL Database R2 Released

It's official: we've shipped Oracle NoSQL Database R2.

Of course there's a press release, but if you want to cut to the chase, the major features this release brings are:

  • Elasticity - the ability to dynamically add more storage nodes and have the system rebalance the data onto the nodes without interrupting operations.
  • Large Object Support - the ability to store large objects without materializing those objects in the NoSQL Database (there's a stream API to them).
  • Avro Schema Support - Data records can be stored using Avro as the schema.
  • Oracle Database External Table Support - A NoSQL Database can act as an Oracle Database External Table.
  • SNMP and JMX Support
  • A C Language API

There are both an open-source Community Edition (CE) licensed under AGPLv3 and an Enterprise Edition (EE) licensed under a standard Oracle EE license. This is the first release where the EE has additional features and functionality.

Congratulations to the team for a fine effort.

Friday Dec 14, 2012

Oracle NoSQL Database: Cleaner Performance

In an earlier post I noted that Berkeley DB Java Edition cleaner performance had improved significantly in release 5.x. From an Oracle NoSQL Database point of view, this is important because Berkeley DB Java Edition is the core storage engine for Oracle NoSQL Database.

Many contemporary NoSQL Databases utilize log-based (i.e. append-only) storage systems, and it is well understood that these architectures also require a "cleaning" or "compaction" mechanism (effectively a garbage collector) to free up unused space. Ten years ago, when we set out to write a new Berkeley DB storage architecture for the BDB Java Edition ("JE"), we knew that the corresponding compaction mechanism would take years to perfect. "Cleaning", or GC, is a hard problem to solve and it has taken all of those years of experience, bug fixes, tuning exercises, user deployment, and user feedback to bring it to the mature point it is at today. Reports like Vinoth Chandar's, where he observes a 20x improvement, validate the maturity of JE's cleaner.
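
Here is a minimal Python sketch of the general idea, with no claim to mirror JE's actual implementation. Records are appended to fixed-size log files, overwrites leave dead entries behind, and a cleaner migrates the live records out of files whose utilization falls below a threshold so those files can be deleted. ToyLog, FILE_CAPACITY, and min_utilization are hypothetical names for this illustration.

```python
FILE_CAPACITY = 4  # records per log file (tiny, for illustration only)


class ToyLog:
    def __init__(self):
        self.files = {0: []}   # file number -> list of (key, value) entries
        self.index = {}        # key -> (file number, slot) of the live version
        self.current = 0       # file currently being appended to

    def put(self, key, value):
        if len(self.files[self.current]) >= FILE_CAPACITY:
            self.current += 1              # roll over to a new log file
            self.files[self.current] = []
        entries = self.files[self.current]
        entries.append((key, value))
        self.index[key] = (self.current, len(entries) - 1)

    def utilization(self, fileno):
        """Fraction of entries in a file that are still the live version."""
        entries = self.files[fileno]
        live = sum(1 for slot, (key, _) in enumerate(entries)
                   if self.index.get(key) == (fileno, slot))
        return live / len(entries) if entries else 1.0

    def clean(self, min_utilization=0.5):
        """Rewrite live records from under-utilized files, then drop the files."""
        cleaned = []
        for fileno in sorted(self.files):
            if fileno == self.current:
                continue                   # never clean the file being written
            if self.utilization(fileno) < min_utilization:
                for slot, (key, value) in enumerate(self.files[fileno]):
                    if self.index.get(key) == (fileno, slot):
                        self.put(key, value)   # migrate the live record
                del self.files[fileno]
                cleaned.append(fileno)
        return cleaned


log = ToyLog()
for i in range(4):
    log.put(f"k{i}", "v1")       # fills file 0
for i in range(3):
    log.put(f"k{i}", "v2")       # overwrites land in file 1
print(log.utilization(0))        # 0.25: only k3 is still live in file 0
print(log.clean())               # [0]
```

The threshold is the tuning knob: raising it reclaims space sooner at the cost of more rewriting, lowering it saves I/O but lets dead entries pile up.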

Cleaner performance has a direct impact on predictability and throughput in Oracle NoSQL Database. A cleaner that is too aggressive will consume too many resources and negatively affect system throughput. A cleaner that is not aggressive enough will allow the disk storage to become inefficient over time. It has to:

  1. Work well out of the box, and
  2. Be configurable so that customers can tune it for their specific workloads and requirements.

The JE Cleaner has been field tested in production for many years managing instances with hundreds of GBs to TBs of data. The maturity of the cleaner and the entire underlying JE storage system is one of the key advantages that Oracle NoSQL Database brings to the table -- we haven't had to reinvent the wheel.

Thursday Nov 29, 2012

Berkeley DB Java Edition 5.x Cleaner Performance Improvements

Berkeley DB Java Edition 5.x has significant performance improvements. One user noted that they are seeing a 20x improvement in cleaner performance.

Tuesday Oct 30, 2012

YCSB Benchmark Results for Other NoSQL DBs

Here's an interesting article on YCSB benchmarks run on Cassandra, HBase, MongoDB, and Riak.  Compare these to the Oracle NoSQL Database YCSB performance test results.

Oracle NoSQL Database Performance Tests

Oracle NoSQL Database Exceeds 1 Million Mixed YCSB Ops/Sec

Friday Oct 19, 2012

Passoker Online Betting Use of Oracle NoSQL Database

Here's an Oracle NoSQL Database customer success story for Passoker, an online betting house.

There are a lot of great points made in the Solutions section, but as a developer the one I like the most is this one:

  • Eliminated daily maintenance related to single-node points-of-failure by moving to Oracle NoSQL Database, which is designed to be resilient and hands-off, thus minimizing IT support costs

Blueprints API for Oracle NoSQL Database

Here's an implementation of the Blueprints API for Oracle NoSQL Database.

Blueprints is a collection of interfaces, implementations, ouplementations, and test suites for the property graph data model. Blueprints is analogous to the JDBC, but for graph databases. As such, it provides a common set of interfaces to allow developers to plug-and-play their graph database backend. Moreover, software written atop Blueprints works over all Blueprints-enabled graph databases. Within the TinkerPop software stack, Blueprints serves as the foundational technology for:

  • Pipes: A lazy, data flow framework
  • Gremlin: A graph traversal language
  • Frames: An object-to-graph mapper
  • Furnace: A graph algorithms package
  • Rexster: A graph server

Friday Sep 21, 2012

Oracle NoSQL Database Exceeds 1 Million Mixed YCSB Ops/Sec

We ran a set of YCSB performance tests on Oracle NoSQL Database using SSD cards and Intel Xeon E5-2690 CPUs with the goal of achieving 1M mixed ops/sec on a 95% read / 5% update workload. We used the standard YCSB parameters: 13 byte keys and 1KB data size (1,102 bytes after serialization). The maximum database size was 2 billion records, or approximately 2 TB of data. We sized the shards to ensure that this was not an "in-memory" test (i.e. the data portion of the B-Trees did not fit into memory). All updates were durable and used the "simple majority" replica ack policy, effectively 'committing to the network'. All read operations used the Consistency.NONE_REQUIRED parameter allowing reads to be performed on any replica.

In the past we have achieved 100K ops/sec using SSD cards on a single shard cluster (replication factor 3) so for this test we used 10 shards on 15 Storage Nodes with each SN carrying 2 Rep Nodes and each RN assigned to its own SSD card. After correcting a scaling problem in YCSB, we blew past the 1M ops/sec mark with 8 shards and proceeded to hit 1.2M ops/sec with 10 shards.
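
The topology arithmetic can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: shard_count is a hypothetical helper, and the replication factor of 3 is carried over from the single-shard setup mentioned above (it is consistent with 30 Rep Nodes forming 10 shards).

```python
def shard_count(storage_nodes, capacity, replication_factor):
    """Shards = total Rep Node slots / replicas per shard."""
    rep_nodes = storage_nodes * capacity
    if rep_nodes % replication_factor:
        raise ValueError("Rep Node slots do not divide evenly into shards")
    return rep_nodes // replication_factor


shards = shard_count(storage_nodes=15, capacity=2, replication_factor=3)
per_shard = 1_200_000 / shards   # peak throughput reported for 10 shards
print(shards)                    # 10
print(per_shard)                 # 120000.0
```

At roughly 120K ops/sec per shard, the 10-shard result is in line with the 100K ops/sec previously seen on a single shard.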

Thursday Jun 21, 2012

Oracle NoSQL Database Using FusionIO ioDrive2

We ran some benchmarks using FusionIO ioDrive2 SSD drives and Oracle NoSQL Database. FusionIO has published a whitepaper with the results of the benchmarks.

"Results of testing showed that using an ioDrive2 for data delivered nearly 30 times more operations per second than a 300GB 10k SAS disk on a 90 percent read and 10 percent write workload and nearly eight times more operations per second on a 50 percent read and 50 percent write workload. Equally impressive, an ioDrive2 reduced latency over 700 percent (seven times) on inserts in a 90 percent read and 10 percent write workload and over 5800 percent (58 times) on reads in a 50 percent read and 50 percent write workload."

Tuesday Jun 12, 2012

Eventual Consistency Explained

Here's a short paper called De-mystifying "Eventual-Consistency" In Distributed Systems by Ashok Joshi.

Recently, there’s been a lot of talk about the notion of eventual consistency, mostly in the context of NoSQL databases and “Big Data”. This short article explains the notion of consistency, and also how it is relevant for building NoSQL applications.

Thursday Jan 12, 2012

Oracle NoSQL Database Performance Tests

Our colleagues at Cisco gave us access to their Unified Computing System (UCS) labs for some Oracle NoSQL Database performance testing. Specifically, they let us use a dozen C210 servers for hosting the Oracle NoSQL Database Rep Nodes and a handful of C200 servers for driving load.

The C210 machines were configured with 96GB RAM, dual Xeon X5670 CPUs (2.93 GHz), and 16 x 7200 rpm SAS drives.  The drives were configured into two sets of 8 drives, each in a RAID-0 array using the hardware controller, and then combined into one large RAID-0 volume using the OS.  The OS was Linux 2.6.32-130.el6.x86_64.

Cisco 10GigE switches were used to connect all the machines (Rep Nodes and load drivers).

We used the Yahoo! Cloud Serving Benchmark (YCSB) as the client for the tests. Our keysize was 13 bytes and the datasize 1108 bytes (that's how our serialization turned out for 1K of data). We ran two phases: a load, and a 50/50 read/update benchmark. Because YCSB only supports a Java integer's worth of records (2.1 billion), we created 400 million records per NoSQL Database Rep Group. The "KVS size" column shows the total number of records in the K/V Store followed by the number of rep groups and replication factor in ()'s. For example, "400m(1x3)" means 400m total records in a K/V Store consisting of 1 Rep Group with a Replication Factor of 3 (3 Replication Nodes total).
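
As an illustration of the "KVS size" notation, the following Python sketch unpacks a label into its parts. parse_kvs_size is a hypothetical helper written for this post, not part of YCSB or Oracle NoSQL Database.

```python
import re

JAVA_INT_MAX = 2**31 - 1   # 2,147,483,647, the YCSB record-count ceiling


def parse_kvs_size(label):
    """Parse a label such as "1600m(4x3)" into its components."""
    match = re.fullmatch(r"(\d+)m\((\d+)x(\d+)\)", label)
    if match is None:
        raise ValueError(f"unrecognized KVS size label: {label!r}")
    millions, rep_groups, rep_factor = map(int, match.groups())
    records = millions * 1_000_000
    return {
        "records": records,
        "rep_groups": rep_groups,
        "replication_factor": rep_factor,
        "rep_nodes": rep_groups * rep_factor,
        "records_per_group": records // rep_groups,
    }


info = parse_kvs_size("1600m(4x3)")
print(info["rep_nodes"], info["records_per_group"])   # 12 400000000
print(info["records"] <= JAVA_INT_MAX)                # True
```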

The clients ran on the C200 nodes, which were configured with dual X5670 Xeon CPUs and 96GB of memory, although really only the CPU speed matters on that side of the equation since they were not memory or IO bound.  Typically, we ran with 90 client threads per YCSB client process.  In the table below, the total number of client processes is shown in the "Clients" column, and at 90 threads/client (in general), the total client threads is shown in the "Total Client Threads" column.

The Oracle NoSQL Database Rep Node cache sizes were configured such that the B+Tree Internal Nodes fit into memory, but the leaf nodes (the data) did not. Specifically, we configured them with 32GB of JVM heap and 22GB of cache. Therefore, the 50/50 Read/Update results are showing a single I/O per YCSB operation. The Durability was the NoSQL Database recommended (and default) value of no_sync, simple_majority, no_sync. The Consistency that we used for the 50/50 read/update test was Consistency.NONE_REQUIRED.
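
A rough Python calculation shows why the leaf data cannot fit in cache. It is a sketch under simplifying assumptions: it counts only raw key and value bytes (no B+Tree or log overhead), treats a gigabyte as 10^9 bytes, and assumes each Rep Node holds a full copy of its Rep Group's 400 million records.

```python
KEY_BYTES = 13
VALUE_BYTES = 1108
RECORDS_PER_REP_GROUP = 400_000_000
CACHE_BYTES = 22 * 10**9          # 22GB cache, assuming decimal gigabytes

data_bytes = RECORDS_PER_REP_GROUP * (KEY_BYTES + VALUE_BYTES)
ratio = data_bytes / CACHE_BYTES

print(data_bytes / 10**9)         # 448.4 (GB of raw data per Rep Node)
print(round(ratio, 1))            # 20.4 (times larger than the cache)
```

With the data roughly twenty times the size of the cache, most reads of a record have to go to disk, which matches the "single I/O per YCSB operation" observation.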

Insert Results

KVS size    Clients  Total Client  Time (sec)  Throughput     Avg Latency  95% Latency  99% Latency
                     Threads                   (inserts/sec)  (ms)         (ms)         (ms)
400m(1x3)   3        90            15,139      26,498         3.3          5            7
1200m(3x3)  3        270           16,738      71,684         3.6          7
1600m(4x3)  4        360           17,053      94,441         3.7          7            18

50/50 Read/Update Results

KVS size  Clients  Total Client  Total Throughput  Read Latency (ms)  Update Latency (ms)
                   Threads       (ops/sec)         Avg   95%   99%    Avg   95%   99%
                   30            5,595             4.8   13    50     5.6   13    52
                   270           17,097            4.0   13    53     5.7   15    57
                   360           24,893            4.0   12    43     5.3   14    51

The results demonstrate excellent scalability, throughput, and latency of Oracle NoSQL Database.
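
One way to quantify the scalability is to compare throughput per Rep Group as groups are added. The short Python sketch below does that for the insert results, using the rep group counts and inserts/sec figures from the Insert Results table. The "share of linear scaling" metric is simply an illustrative choice for this post, with the single-group run as the baseline.

```python
insert_results = [   # (rep groups, inserts/sec) from the Insert Results table
    (1, 26_498),
    (3, 71_684),
    (4, 94_441),
]

baseline = insert_results[0][1]
rows = [
    (groups, throughput / groups, throughput / (groups * baseline))
    for groups, throughput in insert_results
]

for groups, per_group, efficiency in rows:
    print(f"{groups} rep group(s): {per_group:,.0f} inserts/sec per group, "
          f"{efficiency:.0%} of linear scaling")
# 1 rep group(s): 26,498 inserts/sec per group, 100% of linear scaling
# 3 rep group(s): 23,895 inserts/sec per group, 90% of linear scaling
# 4 rep group(s): 23,610 inserts/sec per group, 89% of linear scaling
```

Going from one to four Rep Groups keeps per-group insert throughput within about 11 percent of the single-group figure.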

I want to say "thank you" to my colleagues at Cisco for sharing their extremely capable hardware, lab, and staff with us for these tests.


Anything related to Oracle NoSQL Database and/or Berkeley DB Java Edition.

