The C++ Standard Template Library as a BDB Database (part 3)

Guest Author

In the second entry I showed that Berkeley DB's C++ STL API can stand in for in-memory data structures such as vectors. In this final installment we'll see Berkeley DB at work storing the object instance data referenced by STL structures.

Berkeley DB's C++ STL API, the dbstl, will automatically store the STL container in an on-disk database, but what about the data referenced by the elements within the STL container? You must choose how to store the values pointed to within the STL data structures so that they too can be reconstituted from the database. This means storing the actual data that the pointers point to, which isn't so hard. Let's get started.

Suppose we have a simple class "Car" (below) with a member named "engine", a pointer that references an instance of the class "Engine".

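class Engine; // defined below

class Car {
public: // members are public here so the free functions below can reach them
  size_t length, width, height;
  Engine *engine;
  string model;
  // Member functions follow ...
};

class Engine {
public:
  size_t horse_power;
  char num_cylinder;
  float displacement;
  // Member functions follow ...
};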

Now we want to record the owner's name for each Car, and we do that with a map container such as std::map. Here is the pseudocode:

typedef std::map<const char *, Car> owner_car_map_t;

owner_car_map_t ocmap;
while (has more owner-car pairs to input) {
  // Accept input data for owner name and car, create car.
  Engine *pEngine = new Engine(hp, ncyl, displ);
  Car car(len, width, height, pEngine);
  ocmap[owner.c_str()] = car;
}

Note that the preceding code keeps everything in memory, so nothing is persisted. With dbstl we can persist the same data, as follows.

typedef dbstl::db_map<const char *, Car> owner_car_map_t; // (1)

owner_car_map_t ocmap;
while (has more owner-car pairs to input) {
  // Accept input data for owner name and car, create car.
  Engine *pEngine = new Engine(hp, ncyl, displ);
  Car car(len, width, height, pEngine);
  ocmap[owner.c_str()] = car; // (2)
}

Now the code stores the owner name strings and Car objects in the underlying Berkeley DB database as key/data pairs in (2).

In (1), although the key type is "const char *", the characters of the key string are stored in the underlying Berkeley DB database. And because the data type is the class Car rather than a primitive type such as int or double, we don't need the ElementHolder template here. If we were storing pairs of char * strings in the map, we would need a type of the form dbstl::db_map<char *, char *, ElementHolder<char *> >.

However, the Engine objects referenced by each "car.engine" member pointer are not yet properly stored in the database. At this point only their memory addresses are stored, which is meaningless for persistence.

This is because, by default, the Berkeley DB STL API (dbstl) simply copies the object to be stored using memcpy, i.e. it makes a shallow copy. Any pointer members are copied as pointers; the objects they refer to are not copied. This is sufficient for classes with no pointer members, i.e. when each instance occupies a single contiguous chunk of memory, but it is not sufficient to completely persist objects that have pointer members.

To store the Engine object of each car object, we need to deep copy such objects, and we register the following callback functions to do so. They work closely together to enable a deep copy of any complex object.

u_int32_t CarSize(const Car &car) // (3)
{
  return (u_int32_t)(sizeof(car.length) * 3 + car.model.length() + 1 +
      sizeof(*car.engine)); // (4)
}

void CopyCar(void *dest, const Car &car) // (5)
{
  // Copy all bytes of the car object into memory chunk M
  // referenced by dest.
  // M is preallocated and just big enough because our CarSize is used
  // to measure the object size. Note the bytes
  // of the Engine object E referenced by car.engine should be copied
  // into this M, rather than the car.engine pointer value,
  // in order to deep copy E.
  char *p = (char *)dest;
  memcpy(p, &car.length, sizeof(size_t));
  p += sizeof(size_t);
  memcpy(p, &car.width, sizeof(size_t));
  p += sizeof(size_t);
  memcpy(p, &car.height, sizeof(size_t));
  p += sizeof(size_t);
  memcpy(p, car.model.c_str(), car.model.length() + 1); // (6)
  p += car.model.length() + 1;
  memcpy(p, car.engine, sizeof(*car.engine)); // (7)
}

void RestoreCar(Car &car, const void *src) // (8)
{
  // src references the memory chunk M which contains the bytes of a Car
  // object previously marshalled by CopyCar function, so we know
  // the data structure and composition of M. Thus here we can
  // un-marshal bytes stored in M to assign to each member of car.
  // Since we have data of the Engine member in M, we should create an
  // Engine object E using 'new' operator, and assign to its members
  // using bytes in M, and assign E's pointer to car.engine.
  const char *p = (const char *)src;
  memcpy(&car.length, p, sizeof(size_t));
  p += sizeof(size_t);
  memcpy(&car.width, p, sizeof(size_t));
  p += sizeof(size_t);
  memcpy(&car.height, p, sizeof(size_t));
  p += sizeof(size_t);
  car.model = p; // (9)
  p += car.model.length() + 1;
  car.engine = new Engine(); // assumes Engine has a default constructor
  memcpy(car.engine, p, sizeof(*car.engine));
}

dbstl::DbstlElemTraits<Car> *pCarTraits =
    dbstl::DbstlElemTraits<Car>::instance(); // (10)
pCarTraits->set_size_function(CarSize); // (11)
pCarTraits->set_copy_function(CopyCar); // (12)
pCarTraits->set_restore_function(RestoreCar); // (13)

In (3), this function measures a Car's size in bytes. dbstl uses it to allocate just enough space to store an object's bytes, so it must account for every member we want to store.

In (4), since we want to deep copy the Engine object "E" referenced by "car.engine", we add its size too, using sizeof(*car.engine). This works because Engine is a simple class whose instances each occupy one contiguous chunk of memory.

We also want to store the model string's trailing '\0' character so that it is easy to unmarshal, so we add 1 to model.length().

The function returns a size that is exactly big enough for all the bytes placed into M, with no unused bytes left over. The size therefore depends on which bytes we choose to store, i.e. on the CopyCar function.

In (5), this function does the marshalling work. See the comments in the function body for more information.

In (6), note how the string is copied: we copy only its characters and ignore the string object's other members. We also copy the trailing '\0' for easier handling later.

In (7), note that we copy the Engine object rather than the car.engine pointer, in order to persist the car object completely. Since CarSize already allowed for the Engine object, we have just enough space here.

In (8), this function does the unmarshalling work. See the comments in the function body for more information.

In (9), assigning p to car.model is safe because we copied the trailing '\0'.

In (10), this is the global singleton for class Car, where we register the callback functions that dbstl uses internally to store and retrieve Car objects.

In (11), (12) and (13), we register the callback functions. This way we get a deep copy for instances of the Car class. Note that steps (10), (11), (12) and (13) must be done before (1).

dbstl also has features that go beyond the standard C++ STL classes and let you use Berkeley DB's advanced features, such as secondary indexes, bulk retrieval (to speed up reading chunks of consecutive data items), transactions and replication. Please refer to the dbstl API documentation and reference manual for details.

The full code for this example can be found here.

The download package contains a lot of example code: $(db)/examples_stl and $(db)/test_stl/base demonstrate these advanced features, while $(db)/test_stl/ms_examples and $(db)/test_stl/stlport show standard dbstl usage. These examples are a good place to start.
