Tuesday Nov 29, 2011

Announcing Berkeley DB Java Edition Major Release

Berkeley DB Java Edition 5.0 was just released. There are a number of new features, enhancements, and options in there that our users have been asking for. Chief among them is a new class called DiskOrderedCursor, which greatly increases performance of systems using spinning platter magnetic hard drives. A number of users expressed interest in this feature, including Alex Feinberg of LinkedIn. Berkeley DB Java Edition is part of Project Voldemort, a distributed key/value database used by LinkedIn.
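To see why reading in disk order helps on spinning drives, here is a minimal sketch in Python. It does not use the Berkeley DB Java Edition API. The record layout, the 4096-byte record spacing, and the helper names `make_records` and `total_seek_distance` are all assumptions made for illustration. It compares how far a disk head would travel when records are visited in key order versus in the order they sit on disk.

```python
import random


def total_seek_distance(offsets):
    """Sum of absolute head movements when visiting offsets in the given order."""
    position = 0
    distance = 0
    for off in offsets:
        distance += abs(off - position)
        position = off
    return distance


def make_records(n, seed=0):
    """Hypothetical layout: map each key to the byte offset of its record."""
    rng = random.Random(seed)
    offsets = list(range(0, n * 4096, 4096))
    rng.shuffle(offsets)  # keys were written in arbitrary order over time
    return {f"key{i:05d}": off for i, off in enumerate(offsets)}


records = make_records(1000)

# Key-ordered traversal: what a regular sorted cursor would do.
key_order = [records[k] for k in sorted(records)]

# Disk-ordered traversal: visit records by ascending offset, ignoring key order.
disk_order = sorted(records.values())

key_cost = total_seek_distance(key_order)
disk_cost = total_seek_distance(disk_order)
print(f"key order: {key_cost} bytes of head travel")
print(f"disk order: {disk_cost} bytes of head travel")
```

The trade-off in this sketch is that the disk-ordered pass no longer returns records sorted by key, so it suits full scans where ordering does not matter.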

There have been many other improvements and optimizations. Concurrency is significantly improved, as is the performance of update and delete operations. New and interesting methods include Environment.preload, which allows multiple databases to be preloaded simultaneously. New Cursor methods enable more effective searching through the database.

We continue to enhance Berkeley DB Java Edition’s High Availability as well. One new feature is the ability to open a replicated node read-only when the master is unavailable. This can allow critical systems to continue offering some functionality, even during a network or master node failure.

There’s a lot more in release 5.0. I encourage you to take a look at the extensive changelog yourself. As always, you can download the new release and try it out here:


Thursday Sep 22, 2011

Berkeley DB at OpenWorld/JavaOne 2011

It’s the question on everyone’s mind: what is Berkeley DB bringing to OpenWorld this year? Even if you’re more preoccupied with the latest iPhone 5 rumors (I hear the front-facing camera can look into your eyes and tell you what you had for breakfast), Berkeley DB users (or fans) who are planning to attend Oracle OpenWorld in San Francisco about 10 days from now will want to read this post.

As always, we will have a session where you can learn about the cool and interesting things our customers are doing with a key/value data store. But this year, we have two additional general areas of focus: Embedded, and Mobile Applications.

Let’s take a closer look at the embedded front first. As the cost of components like network hardware, flash memory, and microcontrollers continues to fall, OEMs in many verticals are choosing to enhance their product lineup by adding new applications and internet-enabled features. Networked cash registers? That was only the beginning. This was the year cars started getting on the internet, and I already blogged about that. Who knows what the next years will bring?

Whatever our customers are planning, we’ve got solutions that can help. Our first session with an embedded focus is 15185, BDB and Embedded Java presentation. There you’ll learn how we are extending Java’s “write once, deploy anywhere” mantra. With Berkeley DB and Database Mobile Server, you get worry free data management and sync capabilities after you’ve deployed, as well.

Next, over on the JavaOne side we have 25143, Telemetry and Synchronization with Embedded Java and Berkeley DB. This session will feature Java Architect Greg Bollella talking about how these same technologies are enabling telemetry solutions to plug into the enterprise right alongside your existing data and apps. The ability to manage networks of embedded devices using existing enterprise frameworks could prove to be quite revolutionary. The embedded Java platform, when coupled with Berkeley DB and Database Mobile Server, has the ability to do just that. We’re excited about this, and we think our customers will be too.

On the mobile applications side of things, Tata Consultancy Services (TCS) will be joining us for session 15178, Achieve Ubiquitous Data Access, from Device Databases to Enterprise Repositories. There you’ll learn how TCS is helping their customers deploy mobile applications that maximize the ROI of their existing enterprise infrastructure.

Rounding out the list is our key/value customer highlight session. You thought I forgot, right? No chance! Session 15167 is entitled Transactional Key-Value Storage: Super Simple, Super Fast, Super Flexible. Raghunath Nambiar, an Architect at Cisco, will co-present with us. The topic will be super cool things you can accomplish using the key-value paradigm for data management.

In addition to the sessions, we will have a number of exciting demos for you to check out, both at OpenWorld and JavaOne.

Finally, be on the lookout for an exciting product announcement building on the inherent strengths of the Berkeley DB product family!

Hope to see you there!

Tuesday Jul 12, 2011

Customer Highlight: AdaptiveMobile

Last week AdaptiveMobile announced they signed a 3 year agreement to embed Oracle Berkeley DB in their Network Protection Platform, a key component of their product portfolio.

AdaptiveMobile is the world leader in mobile security, enabling trusted networks for the world’s largest operator groups and protecting one in six subscribers globally. AdaptiveMobile provides operators with the most comprehensive network-based security solutions enabling them to protect their consumer and enterprise customers against the growing threat of mobile abuse. From the announcement: “AdaptiveMobile selected Oracle Berkeley DB for its outstanding data retrieval speeds, reliability, scalability and availability and its integration with the company’s existing application environment.”

Here is a great quote from Gareth Maclachlan, AdaptiveMobile Chief Operating Officer: “Oracle’s Berkeley DB has enabled us to develop a solution that resolves the content and access control challenges of both network operators and their customers without impacting the user experience. Working with Oracle has allowed us to reduce our development time and cut our total cost of ownership for network operators by helping us reduce our hardware and administration costs.”

For more information, you can see the announcement, or learn more about AdaptiveMobile’s products.

Wednesday Jun 22, 2011

New Release of Oracle Berkeley DB

We are pleased to announce that a new release of Oracle Berkeley DB, version, is available today.

Our latest release includes yet more value added features for SQLite users, as well as several performance enhancements and new customer-requested features to the key-value pair API.  We continue to provide technology leadership, features and performance for SQLite applications.  This release introduces additional features that are not available in native SQLite, and adds functionality allowing customers to create richer, more scalable, more concurrent applications using the Berkeley DB SQL API.

This release is compelling to Oracle’s customers and partners because it:

  • delivers a complete, embeddable SQL92 database
  • as a library under 1MB size
  • drop-in API compatible with SQLite version 3
  • no-oversight, zero-touch database administration
  • industrial quality, battle tested Berkeley DB B-TREE for concurrent transactional data storage

New Features Include:

  • MVCC support for even higher concurrency
  • direct SQL support for HA/replication
  • transactionally protected Sequence number generation functions
  • lower memory requirements, shared memory regions and faster/smaller memory on startup
  • easier B-TREE page size configuration with new "db_tuner" utility

New Key-Value API Features Include:

  • HEAP access method for constrained disk-space applications (key-value API)
  • faster QUEUE access method operations for highly concurrent applications -- up to 2-3X faster! (key-value API)
  • new X/Open compliant XA resource manager, easily integrated with Oracle Tuxedo (key-value API)
  • additional HA/replication management and communication options (key-value API)

and a lot more!

BDB is hands-down the best edge, mobile, and embedded database available to developers.

Downloads available today on the Berkeley DB download page

Product Documentation

Wednesday Jun 01, 2011

Embedded Systems Conference San Jose

The annual Embedded Systems Conference in San Jose was held at the beginning of May. This was Oracle’s first year at the conference, and we did get visitors who were surprised to see us there. However some people, myself included, think we're going to see increased convergence between enterprise and embedded in the coming years. I know we're not alone, because a certain other big name in enterprise systems had a booth right next to ours!

Since embedded is not a topic everyone is familiar with, I want to give a little insight into this conference and the embedded space in general, as I think it is potentially an important growth area for products like Berkeley DB. Since I was an embedded developer myself in a past life, this is familiar territory for me.

Wikipedia defines embedded systems as "a computer system designed to do one or a few dedicated and/or specific functions." It is important to note that embedded is generally considered to be distinct from mobile. Mobile platforms are typically derived in some way from desktop platforms such as Linux, Windows, OSX, and are often more general purpose devices. Before the advent of cheap, high-res, general purpose LCD displays, having a display in your device meant a Cathode Ray Tube (CRT). Embedded devices were traditionally ‘headless,’ meaning they had no display and no generic input device such as a keyboard. Because of this, in the early days developers would commonly develop on desktop machines using a cross toolchain. A cross toolchain is a set of tools designed to build software on a target embedded platform, which was a completely different hardware architecture and OS from the desktop development platform. Nowadays, many embedded platforms are powerful enough that they can run their own toolchains. Such was the case with our demo, more on that below.

Another common aspect of an embedded system is “real time” requirements. A simplified definition would be if a given operation does not complete by a certain time, it’s just as bad as not finishing at all. Real Time Operating Systems, or RTOSes, can provide guarantees about when operations will finish. Real time embedded devices are still quite prevalent in some industries, including military, aviation, industrial manufacturing, and networking. The embedded space has certainly been encroached on by the rise of mobile, but as long as we have mission critical devices there will continue to be a requirement for embedded devices.

Now back to the Embedded Systems Conference. Booth traffic was high; we were averaging about 1 visitor per minute nearly the whole time I was there. I attribute this partly to curiosity, but mostly to our great giveaways! They did their job: I talked to a number of people who ended up having a genuine interest in what we were showing, and were initially attracted to the booth by our swag. Also, we held a drawing for an iPad, which brought a ton of people to register.

Our demo was a temperature sensor attached to a small device called a SheevaPlug, which is a general purpose embedded development device from Marvell. By embedded standards, the SheevaPlug is a very powerful device, and we were able to develop directly on it. The idea behind the demo was that the device represented one of many nodes in a sensor network. Some real world examples of this include weather stations, or monitoring conditions inside laboratories or industrial facilities. Our demo showed the system collecting temperature data, which was then uploaded to Oracle Database. All of this was running on top of Java SE Embedded. The demo was well received. Nearly everyone who listened to me present agreed that the sync functionality would be useful to them, or useful in general if they didn’t need it themselves.

The main purpose of our presence at ESC was to showcase the power, ease of use, and versatility of Java Embedded. When you combine that with Berkeley DB and Oracle Database Lite Mobile Server, you get a system that has out of the box capability to move data to and from enterprise storage systems. After a few simple configuration steps, the data stored on the local Berkeley DB or SQLite data store is connected to the enterprise backend. This is a potent combination of features, and one that we feel will be in high demand in the coming years, as M2M and embedded solutions continue to proliferate.

Tuesday Mar 29, 2011

Ubiquitous Mobile Applications

It goes without saying that smart phones are already very popular and enjoying rapid growth. But a few things finally happened in the last year that people have been predicting for a long time, things that could make smartphones and mobile apps an even bigger part of our lives.

First off, in what many consider to be a bold move from a normally conservative company, GM has already released mobile apps that allow many of their cars to be remotely monitored, unlocked, and even started. So as long as you have a high degree of confidence in your smartphone's battery life, you don't need your car keys anymore.

Next, we have Starbucks, who have quietly sidestepped the ongoing mobile payments dispute by launching a barcode based payments app that doesn't require any special technology. Other large retailers are probably watching this closely, and if the big cell phone companies don't get their acts together, they might just end up missing the boat again. In any case, many industry observers feel that the Starbucks system could be the spark that sets off an explosion of mobile payment systems, from McDonald's to Neiman Marcus, and everywhere in between. Here is some more detail on the Starbucks solution, and for an industry analysis click here.

But I saved the biggest one for last. Are you ready? Take a deep breath, and then read this: Smartphones outsold PCs for the first time, Q4 2010. This milestone happened much sooner than anyone expected, thanks mostly to new Android activation numbers so high they're almost hard to believe. At the start of 2011 people were talking about activations in excess of 125k units per day. Currently the number being tossed around is 300k units per day. If Android can keep it up, that platform alone will outsell PCs in 2011. Folks, this could be the end of an era.

When we combine these milestones, the smartphone appears poised to become a sort of ultimate Swiss Army knife: a single device that does everything. The thought of it is exciting; imagine the convenience! Alas, as with all great things in life, there is a downside. Money, privacy, communication, and even physical security could be compromised if the software or data on the device is not secure.

The average person has at least 3 things with them when they leave the house: wallet, keys, and cell phone. Today, if one of the three things were to be lost, that would be bad. It would be a huge inconvenience, in the best case. However, when a cell phone is lost today, the unfortunate person is not normally forced to cancel their credit cards and reprogram their car locks. Worse than that, if we envision the future that the stories listed above point to, there is a potential for any or all of the three to be stolen electronically, by a thief far away, even in a different country. That should be a sobering prospect, both for consumers and the various companies providing services to them. It is easy to tout the benefits of technology convergence; it's a fun and exciting topic. But as technology providers, it is our duty to also consider the potential drawbacks.

In order to keep everyone safe, sensitive mobile data needs to be securely stored on the local device, and transmitted reliably to each company's data vaults. Those same companies need to be able to determine, with 100% accuracy, whether a certain request is coming from an authorized device or an imposter. This could be a mobile purchase, a request to unlock your car, or even your home.

Two critical components of a safe, trustworthy solution for storing and syncing sensitive mobile data are Oracle's Berkeley DB and Database Mobile Server. Most of the world's large enterprises already store their critical data in Oracle Database. Berkeley DB is a stable, mature product with over 10 years of proven history. It is ideally positioned to be the mobile data store to complement Oracle Database. It is secure in the traditional sense, meaning it can protect mobile data from malicious intent, and in the database architecture sense as well, with full transactional guarantees. Database Mobile Server provides the final piece to form a complete solution: the capability to synchronize your data between the mobile devices and your Oracle backend, and manage the mobile application, data, and even the device itself if desired.

If you're considering a mobile application that will access sensitive data, data that you already trust Oracle to store in your backend infrastructure, Berkeley DB and Database Mobile Server are the best choices to handle it.

Sunday Jan 30, 2011

Is Berkeley DB a NoSQL solution?

Berkeley DB is a library. To use it to store data you must link the library into your application. You can use most programming languages to access the API. The calls across these APIs generally mimic the Berkeley DB C API, which makes perfect sense because Berkeley DB is written in C. The inspiration for Berkeley DB was the DBM library, a part of the earliest versions of UNIX written by AT&T's Ken Thompson in 1979. DBM was a simple key/value hashtable-based storage library. In the early 1990s, as BSD UNIX was transitioning from version 4.3 to 4.4 and replacing commercial code owned by AT&T with unencumbered code, it was the future founders of Sleepycat Software who wrote libdb (aka Berkeley DB) as the replacement for DBM. The problem it addressed was fast, reliable local key/value storage.

At that time databases almost always lived on a single node; even the most sophisticated databases only had simple two-node fail-over solutions. If you had a lot of data to store you would choose between one of the few commercial RDBMS solutions and writing your own custom solution. Berkeley DB took the headache out of the custom approach. These basic market forces inspired other DBM implementations. There was the "New DBM" (ndbm) and the "GNU DBM" (GDBM) and a few others, but the theme was the same. Even today TokyoCabinet calls itself "a modern implementation of DBM," mimicking, and improving on, something first created over thirty years ago. In the mid-1990s, DBM was the name for what you needed if you were looking for fast, reliable local storage.
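The DBM programming model is small enough to show in a few lines. The sketch below uses Python's standard-library `dbm.dumb` module, a pure-Python DBM-style store. It is not Berkeley DB, and the file name and keys are made up for the example. What it shares with the libraries described above is the interface: open a file, then store and fetch byte strings by key.

```python
import dbm.dumb
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example_store")  # hypothetical store name

    # "c" opens the store for reading and writing, creating it if needed.
    with dbm.dumb.open(path, "c") as db:
        db[b"user:1"] = b"alice"
        db[b"user:2"] = b"bob"

    # Reopen read-only to show that the data was persisted to local files.
    with dbm.dumb.open(path, "r") as db:
        value = db[b"user:1"]
        missing = db.get(b"user:3")  # absent keys return None via get()
        keys = sorted(db.keys())

print(value, missing, keys)
```

There is no schema, no query language, and no server process: the application links the library and owns the files, which is the same basic arrangement the post describes for Berkeley DB.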

Fast forward to today. What's changed? Systems are connected over fast, very reliable networks. Disks are cheap, fast, and capable of storing huge amounts of data. CPUs continued to follow Moore's Law; processing power that filled a room in 1990 now fits in your pocket. PCs, servers, and other computers proliferated both in business and the personal markets. In addition to the new hardware, entire markets, social systems, and new modes of interpersonal communication moved onto the web and started evolving rapidly. These changes caused a massive explosion of data and a need to analyze and understand that data. Taken together, this resulted in an entirely different landscape for database storage, and new solutions were needed.

A number of novel solutions stepped up and eventually a category called NoSQL emerged. The new market forces inspired the CAP theorem and the heated debate of BASE vs. ACID. But in essence this was simply the market looking at what to trade off to meet these new demands. These new database systems shared many qualities. They were designed to address massive amounts of data, millions of requests per second, and scale out across multiple systems.

The first large-scale and successful solution was Dynamo, Amazon's distributed key/value database. Dynamo essentially took the next logical step and added a twist. Dynamo was to be the database of record, it would be distributed, data would be partitioned across many nodes, and it would tolerate failure by avoiding single points of failure. Amazon did this because they recognized that the majority of the dynamic content they provided to customers visiting their web store front didn't require the services of an RDBMS. The queries were simple, key/value look-ups or simple range queries with only a few queries that required more complex joins. They set about to use relational technology only in places where it was the best solution for the task, places like accounting and order fulfillment, but not in the myriad of other situations.

The success of Dynamo, and its design, inspired the next generation of Non-SQL, distributed database solutions including Cassandra, Riak and Voldemort. The problem their designers set out to solve was "reliability at massive scale," so the first focal point was distributed database algorithms. Underneath Dynamo there is a local transactional database: either Berkeley DB, Berkeley DB Java Edition, MySQL or an in-memory key/value data structure. Dynamo was an evolution of local key/value storage onto networks. Cassandra, Riak, and Voldemort all faced similar design decisions and one, Voldemort, chose Berkeley DB Java Edition for its node-local storage. Riak at first was entirely in-memory, but has recently added write-once, append-only, log-based on-disk storage. This is a similar type of storage to Berkeley DB's, except that it is based on a hash table which must reside entirely in memory rather than a btree which can live in memory or on disk.
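The append-only log plus in-memory hash table design mentioned above can be sketched in a few dozen lines of Python. This is a toy illustration of the general idea, not the actual code of Riak or Berkeley DB. The class name `AppendOnlyStore` and the record format (two 4-byte lengths followed by key and value) are assumptions made for the example.

```python
import os
import struct
import tempfile


class AppendOnlyStore:
    """Toy log-structured store: records are only ever appended to a file,
    and an in-memory dict maps each key to the offset of its latest record."""

    HEADER = struct.Struct(">II")  # key length, value length

    def __init__(self, path):
        self.index = {}
        self.file = open(path, "a+b")
        self._rebuild_index()

    def _rebuild_index(self):
        # Scan the whole log once; later records for a key win.
        self.file.seek(0)
        offset = 0
        while True:
            header = self.file.read(self.HEADER.size)
            if len(header) < self.HEADER.size:
                break
            klen, vlen = self.HEADER.unpack(header)
            key = self.file.read(klen)
            self.file.seek(vlen, os.SEEK_CUR)  # skip the value bytes
            self.index[key] = offset
            offset += self.HEADER.size + klen + vlen

    def put(self, key, value):
        self.file.seek(0, os.SEEK_END)
        offset = self.file.tell()
        self.file.write(self.HEADER.pack(len(key), len(value)) + key + value)
        self.file.flush()
        self.index[key] = offset

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        self.file.seek(offset)
        klen, vlen = self.HEADER.unpack(self.file.read(self.HEADER.size))
        self.file.seek(klen, os.SEEK_CUR)  # skip the key bytes
        return self.file.read(vlen)

    def close(self):
        self.file.close()


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.log")
    store = AppendOnlyStore(path)
    store.put(b"k", b"v1")
    store.put(b"k", b"v2")  # the old record stays in the log; the index moves on
    latest = store.get(b"k")
    log_size = os.path.getsize(path)
    store.close()

    reopened = AppendOnlyStore(path)  # index rebuilt by scanning the log
    recovered = reopened.get(b"k")
    reopened.close()
```

The sketch makes the limitation in the text concrete: every key must fit in the in-memory dict, whereas a btree index can be paged to and from disk.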

Berkeley DB evolved too: we added high availability (HA) and a replication manager that makes it easy to set up replica groups. Berkeley DB's replication doesn't partition the data; every node keeps an entire copy of the database. For consistency, there is a single node where writes are committed first - a master - then those changes are delivered to the replica nodes as log records. Applications can choose to wait until all nodes are consistent, or fire and forget, allowing Berkeley DB to eventually become consistent. Berkeley DB's HA scales out quite well for read-intensive applications and also effectively eliminates the central point of failure by allowing replica nodes to be elected (using a Paxos algorithm) to mastership if the master should fail. This implementation covers a wide variety of use cases. MemcacheDB is a server that implements the Memcache network protocol but uses Berkeley DB for storage and HA to replicate the cache state across all the nodes in the cache group. Google Accounts, the user authentication layer for all Google properties, was until recently running Berkeley DB HA. That scaled to a globally distributed system. That said, most NoSQL solutions try to partition (shard) data across nodes in the replication group, and some allow writes as well as reads at any node; Berkeley DB HA does not.
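The single-master model described above, with its choice between waiting for replicas and fire-and-forget, can be illustrated with a small in-process simulation. This is a sketch only. The `Master` and `Replica` classes and the `wait_for_all` flag are hypothetical names, not Berkeley DB's replication API, and elections and network failures are not modeled.

```python
class Replica:
    """Holds a full copy of the data, built by applying the master's log records."""

    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log records applied so far

    def apply(self, record):
        key, value = record
        self.data[key] = value
        self.applied += 1


class Master:
    """The single node where writes are committed first."""

    def __init__(self, replicas):
        self.data = {}
        self.log = []
        self.replicas = replicas

    def put(self, key, value, wait_for_all=True):
        self.data[key] = value          # commit on the master first
        self.log.append((key, value))   # then record the change in the log
        if wait_for_all:
            self.ship_pending()         # return only once replicas are caught up

    def ship_pending(self):
        # Deliver every log record each replica has not yet applied.
        for replica in self.replicas:
            for record in self.log[replica.applied:]:
                replica.apply(record)


replicas = [Replica(), Replica()]
master = Master(replicas)

master.put("a", 1, wait_for_all=True)
consistent_after_sync = all(r.data == master.data for r in replicas)

master.put("b", 2, wait_for_all=False)          # fire and forget
lagging = [r.data.get("b") for r in replicas]   # replicas have not seen "b" yet

master.ship_pending()                           # later delivery catches them up
consistent_eventually = all(r.data == master.data for r in replicas)
```

Because every replica holds the whole data set, reads can be served by any node, but all writes still pass through the one master.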

So, is Berkeley DB a "NoSQL" solution? Not really, but it certainly is a component of many of the existing NoSQL solutions out there. Forgetting all the noise about how NoSQL solutions are complex distributed databases, when you boil them down to a single node you still have to store the data to some form of stable local storage. DBMs solved that problem a long time ago. NoSQL has more to do with the layers on top of the DBM: the distributed, sometimes-consistent, partitioned, scale-out storage that manages key/value or document sets and generally has some form of simple HTTP/REST-style network API. Does Berkeley DB do that? Not really.

Is Berkeley DB a "NoSQL" solution today? Nope, but it's the most robust solution on which to build such a system. Re-inventing the node-local data storage isn't easy. A lot of people are starting to come to appreciate the sophisticated features found in Berkeley DB, even mimic them in some cases. Could Berkeley DB grow into a NoSQL solution? Absolutely. Our key/value API could be extended over the net using any of a number of existing network protocols such as memcache or HTTP/REST. We could adapt our node-local data partitioning out over replicated nodes. We even have a nice query language and cost-based query optimizer in our BDB XML product that we could reuse were we to build out a document-based NoSQL-style product. XML and JSON are not so different that we couldn't adapt one to work with the other interchangeably. Without too much effort we could add what's missing; we could jump into this NoSQL market within a single product development cycle.

Why isn't Berkeley DB already a NoSQL solution? Why aren't we working on it? Why indeed...

Tuesday Nov 02, 2010

Berkeley DB Java Edition 4.1.6

Yesterday we released a new version of Berkeley DB Java Edition. This new release has some major enhancements for speed. BDB JE has always been as fast as the I/O + stable storage (disk) system for writes due to its write-once, append-only log-based architecture for fully durable commits (semi-durable, those which commit to operating system buffers rather than to the stable storage, operate at in-memory speeds). The issue until now was with random reads. Now, even with modest sized caches (512MB), you can experience predictable latency for random out-of-cache reads even for multi-TB databases.

This is a first in the pure-Java world. BDB JE is the only solution when you need large scale, predictable ACID storage for non-relational data. Imagine configuring your heap to 2GB and BDB JE's cache to 512MB then accessing TBs of data on disk knowing that your application will have 1.5GB of memory in the JVM to use.

Memory management and GC have always been tricky to get right when building large scale Java systems. With this release of Berkeley DB Java Edition we help take you one step closer to a predictable database in pure-Java.

Read more on Charlie Lamb's blog.

Friday Oct 15, 2010

Open SQL Camp, Boston


Information about Berkeley DB products directly from the people who build them.

