January 8, 2010

Reddit

More than one person on the Berkeley DB team visits Reddit daily, so it's nice to see that part of what makes Reddit work is our product. MemcacheDB (built on Berkeley DB, using its high availability (HA) feature to replicate cached data between nodes) is a natural fit for your permacache (a key/value cache used as a database of record), and it seems to be working fairly well.

Scalability, performance, and zero downtime are major targets for our work. We'd be happy to work with you and the engineers behind MemcacheDB to make sure you're getting everything you need from Berkeley DB. Maybe we can shrink your storage requirements by using compression? Maybe we can help you scale using partitioning? Maybe we can work to reduce or eliminate the 71 minutes it took to upgrade to a new MemcacheDB layout? Berkeley DB/HA now supports live migration (even across systems of different endianness). MemcacheDB needs to add this as a feature; good thing we're friends with Steve Chu. While we're at it, maybe we could optimize the log write thread (the trickle call) for cloud-based storage to avoid the lag you experienced to begin with.

We really like a challenge, and it sounds like you've got one. We're here to help.

October 5, 2009

Berkeley DB at Oracle Open World 2009

Where to find Berkeley DB people & sessions at Oracle Open World

If you are interested in hearing about Berkeley DB, learning more about how the Berkeley DB products can be integrated into your applications/appliances/devices, seeing exciting Berkeley DB customer use cases or speaking directly with one of the Berkeley DB product development engineers, you can find us here:

  • Session ID#: S311365, Title: Oracle Berkeley DB: Lightning-Fast Key Value Storage Just Got Faster, Date: Sun, Oct. 11th, Time: 15:45 - 16:45, Venue: Hilton Hotel, Room: Golden Gate 1, Track: Oracle Develop: Database, Speaker: David Segleau (Director Product Management), including customer presentations from Lucas Vogel, Managing Partner at EndPoint Systems and Madhu Bhimaraju, Database Architecture Engineer at Verizon Wireless. We will be covering several of the new features in Berkeley DB 4.8 and discussing how Verizon Wireless uses Berkeley DB to provide services to an ever-growing customer base.
  • Session ID#: S311364, Title: Oracle Berkeley DB Java Edition High Availability: Java Persistence at Network Speeds, Date: Sun, Oct. 11th, Time: 14:30 - 15:30, Venue: Hilton Hotel, Room: Golden Gate 1, Track: Oracle Develop: Database, Speaker: Sam Haradhvala (Senior Engineer on the Berkeley DB Java Edition product). We will be discussing the new High Availability/Replication functionality that is now available in Berkeley DB Java Edition -- how it works, how it can improve your application performance, reliability and throughput, as well as common use cases and configurations.
  • Berkeley DB Applications Demo/Lunch. Date: Mon, Oct. 12th, Meeting Time: 11:00 - 13:00, Venue: Hilton Hotel - Union Square Room 6, 4th Fl. Greg Burd (Senior Product Manager) and several Berkeley DB engineers will be showing several Berkeley DB application demos, discussing how they were implemented and how similar functionality can be part of your application. Join us for lunch, some interesting demos, and an open question and answer session.
  • The Berkeley DB Product booth in the Exhibition Hall in Moscone West. We're at workstation W-035, under the Database Track, in the Embedded Database sub-area, just like last year. The booth is open Monday and Tuesday from 10:30-6:30 and Wednesday from 9:15-5:15. We're always delighted to talk with existing users, potential users, and anyone who is curious about our embedded database libraries.

We hope to see you all there!

September 14, 2009

Berkeley DB XML 2.5

Today is a big day for Berkeley DB products. Not only are we releasing a new version of Berkeley DB, but we are also announcing the availability of a new version of Berkeley DB XML. This release makes use of the new features in Berkeley DB 4.8 and has many exciting features of its own.

New in Berkeley DB XML 2.5:

  • Berkeley DB XML will guess which indexes might speed up your queries based on XML document shape and content, anticipating potential query patterns and pre-indexing for them to speed up query execution and lower developer effort.
  • 30% reduction in the on-disk size of XML database containers, using the new Berkeley DB compression APIs and a custom compression algorithm optimized for XML data.
  • New XQuery extension support enabling integration with XQuery scripting services.

Berkeley DB 4.8

Today we are very happy to announce the availability of an amazing new version of Berkeley DB. This release is full of new features, performance improvements, and much more.

New in Berkeley DB 4.8:

  • Performance has been enhanced for multithreaded and multiprocess applications on CMP/SMP systems.
  • New C# and .NET API providing more support for Microsoft programmers.
  • New support for C++ Standard Template Library (STL) allows developers to use transactions, cache large datasets, and persist data using familiar STL data structures and APIs.
  • New API to support referential integrity constraints, including abort, cascade, and nullify semantics.
  • New B-tree compression APIs let you reduce the disk space required to store your data, using your own compression code or our default implementation. Compression reduces disk I/O and lets more data live in cache, which generally improves performance despite the overhead of compressing and decompressing data. Random reads are one case where performance could suffer with compression, so be sure to understand your data access patterns and test your performance.
  • New support for partitioning database files can improve application throughput and reduce lock contention when data is split across multiple I/O channels and/or disks.
  • New bulk load/delete APIs can significantly improve application performance.
  • New command-line tool called 'db_sql' for bridging the gap between SQL and Berkeley DB databases. This utility translates the majority of SQL-92 DDL into Berkeley DB source code.
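
To make the B-tree compression trade-off in the list above a little more concrete, here is a minimal Python sketch of prefix compression on sorted keys. It is a toy illustration under our own assumptions, not Berkeley DB's implementation and not its API: the function names (prefix_compress, prefix_decompress) and the sample keys are hypothetical.

```python
def shared_prefix_len(a, b):
    """Return how many leading bytes a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_compress(sorted_keys):
    """Encode each key as (bytes shared with the previous key, remaining suffix)."""
    encoded = []
    previous = b""
    for key in sorted_keys:
        shared = shared_prefix_len(previous, key)
        encoded.append((shared, key[shared:]))
        previous = key
    return encoded


def prefix_decompress(encoded):
    """Rebuild the original keys by replaying the entries in order."""
    keys = []
    previous = b""
    for shared, suffix in encoded:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys


if __name__ == "__main__":
    # Hypothetical sample keys; neighbors in sorted order share long prefixes.
    keys = [b"user:1001", b"user:1002", b"user:1010", b"user:2000"]
    encoded = prefix_compress(keys)
    raw_bytes = sum(len(k) for k in keys)
    suffix_bytes = sum(len(suffix) for _, suffix in encoded)
    print("raw key bytes:", raw_bytes)
    print("suffix bytes stored:", suffix_bytes)
    print("round trip ok:", prefix_decompress(encoded) == keys)
```

In this sketch, recovering the last key means replaying every entry before it. That is one way to picture why sequential scans benefit from compression while random reads can pay extra for it.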

September 2, 2009

Go ahead, give it a try - Berkeley DB XML is good, and good for you.

Pierre Lindenbaum, we hear you. Thanks for taking Berkeley DB XML for a spin and writing up your experiences with the product!

A New BDB/HA Replication Whitepaper by Margo Seltzer

Berkeley DB has many sophisticated database features, including the ability to replicate database transactions across multiple independent systems. This is what we call the "HA" or "High Availability" option for Berkeley DB. Software systems have many reasons to replicate the data on which they operate, and Berkeley DB/HA (BDB/HA) is designed to be flexible enough to cover most of these use cases. In this new whitepaper, Margo Seltzer explains how Berkeley DB can serve as the embedded storage manager for your software project, with sophisticated replication features that you don't have to debug. That's our job.

July 22, 2009

Welcome, let's DB->open() a conversation about data storage.

Here you will find articles about Berkeley DB products, interesting use cases, and fun factoids to keep you informed and hopefully interested in Berkeley DB products.

The term "database" conjures up thoughts of SQL, tables, and client/server architecture. This hasn't always been true. Database is not just another way of saying "SQL" and it's not always a synonym for RDBMS (relational database management system). In today's complex, distributed, and diverse software ecosystem there are many new and divergent requirements for data storage. A sizable portion of those cases don't have any need for SQL, but they do need transactions, recovery, concurrent access, replication, fail-over, hot-backup, and all the other core features of an RDBMS. There is a growing awareness that non-relational (aka NoSQL) storage solutions have an important role to play in systems of all shapes and sizes.

Of course, here within Oracle, and especially within the Berkeley DB group, we've known this for a long time. We've been working for over a decade on a database engine that has the qualities found at the core of most modern database products (relational, object, hierarchical, etc.) without any of the fluff or imposed structure layered on top. We've built a data storage component, a library to be incorporated into your application. And your product doesn't have to be a database. Most uses of Berkeley DB are within programs that never export storage services directly to their users. Your product can be a software application, a hardware controller, or a globally networked service.

Berkeley DB products are designed to let you focus on building your application while leaving the complexity of data storage to us.

For now DB->close().