Oracle NoSQL Database Exceeds 1 Million Mixed YCSB Ops/Sec

We ran a set of YCSB performance tests on Oracle NoSQL Database using SSD cards and Intel Xeon E5-2690 CPUs with the goal of achieving 1M mixed ops/sec on a 95% read / 5% update workload. We used the standard YCSB parameters: 13 byte keys and 1KB data size (1,102 bytes after serialization). The maximum database size was 2 billion records, or approximately 2 TB of data. We sized the shards to ensure that this was not an "in-memory" test (i.e. the data portion of the B-Trees did not fit into memory). All updates were durable and used the "simple majority" replica ack policy, effectively 'committing to the network'. All read operations used the Consistency.NONE_REQUIRED parameter allowing reads to be performed on any replica.

In the past we have achieved 100K ops/sec using SSD cards on a single shard cluster (replication factor 3) so for this test we used 10 shards on 15 Storage Nodes with each SN carrying 2 Rep Nodes and each RN assigned to its own SSD card. After correcting a scaling problem in YCSB, we blew past the 1M ops/sec mark with 8 shards and proceeded to hit 1.2M ops/sec with 10 shards. 

Hardware Configuration

We used 15 servers, each configured with two 335 GB SSD cards. We did not have homogeneous CPUs across all 15 servers available to us so 12 of the 15 were Xeon E5-2690, 2.9 GHz, 2 sockets, 32 threads, 193 GB RAM, and the other 3 were Xeon E5-2680, 2.7 GHz, 2 sockets, 32 threads, 193 GB RAM.  There might have been some upside in having all 15 machines configured with the faster CPU, but since CPU was not the limiting factor we don't believe the improvement would be significant.

The client machines were Xeon X5670, 2.93 GHz, 2 sockets, 24 threads, 96 GB RAM. Although the clients had 96 GB of RAM, neither the NoSQL Database or YCSB clients require anywhere near that amount of memory and the test could have just easily been run with much less.

Networking was all 10GigE.

YCSB Scaling Problem

We made three modifications to the YCSB benchmark. The first was to allow the test to accommodate more than 2 billion records (effectively int's vs long's). To keep the key size constant, we changed the code to use base 32 for the user ids.

The second change involved to the way we run the YCSB client in order to make the test itself horizontally scalable.The basic problem has to do with the way the YCSB test creates its Zipfian distribution of keys which is intended to model "real" loads by generating clusters of key collisions. Unfortunately, the percentage of collisions on the most contentious keys remains the same even as the number of keys in the database increases. As we scale up the load, the number of collisions on those keys increases as well, eventually exceeding the capacity of the single server used for a given key.This is not a workload that is realistic or amenable to horizontal scaling. YCSB does provide alternate key distribution algorithms so this is not a shortcoming of YCSB in general.

We decided that a better model would be for the key collisions to be limited to a given YCSB client process. That way, as additional YCSB client processes (i.e. additional load) are added, they each maintain the same number of collisions they encounter themselves, but do not increase the number of collisions on a single key in the entire store. We added client processes proportionally to the number of records in the database (and therefore the number of shards).

This change to the use of YCSB better models a use case where new groups of users are likely to access either just their own entries, or entries within their own subgroups, rather than all users showing the same interest in a single global collection of keys. If an application finds every user having the same likelihood of wanting to modify a single global key, that application has no real hope of getting horizontal scaling.

Finally, we used read/modify/write (also known as "Compare And Set") style updates during the mixed phase. This uses versioned operations to make sure that no updates are lost. This mode of operation provides better application behavior than the way we have typically run YCSB in the past, and is only practical at scale because we eliminated the shared key collision hotspots.It is also a more realistic testing scenario. To reiterate, all updates used a simple majority replica ack policy making them durable.

Scalability Results

In the table below, the "KVS Size" column is the number of records with the number of shards and the replication factor. Hence, the first row indicates 400m total records in the NoSQL Database (KV Store), 2 shards, and a replication factor of 3. The "Clients" column indicates the number of YCSB client processes. "Threads" is the number of threads per process with the total number of threads. Hence, 90 threads per YCSB process for a total of 360 threads. The client processes were distributed across 10 client machines.

Shards KVS Size




Clients





Mixed





(records)















Threads




Overall
Throughput
(ops/sec)



Read Latency
av/95%/99%
(ms)

Write Latency
av/95%/99%
(ms)





2
400m(2x3)




4








90(360)




302,152


0.76/1/3
3.08/8/35




4
800m(4x3)




8








90(720)




558,569


0.79/1/4
3.82/16/45




8
1600m(8x3)




16








90(1440)




1,028,868


0.85/2/5
4.29/21/51




10
2000m(10x3)




20








90(1800)




1,244,550


0.88/2/6
4.47/23/53




Comments:

Very exciting and awesome. Thanks for sharing.

Posted by Amin on September 22, 2012 at 12:25 PM EDT #

Several people have asked what hardware we used for these tests. We used Cisco C240 machines with FusionIO cards.

Posted by Charles Lamb on October 15, 2012 at 02:52 PM EDT #

How many client machines were used for this effort.

Posted by guest on March 17, 2013 at 10:06 PM EDT #

Hi Charles,
you have mentioned .."12 of the 15 were Xeon E5-2690, 2.9 GHz, 2 sockets, 32 threads, 193 GB RAM".

Can you explain if each of those 12 machine had 193 GB RAM? that is a ton of RAM that would cost a ton of dough.

Posted by guest on November 15, 2013 at 04:57 PM EST #

Yes, each machine had 193GB of RAM.

Posted by Charles Lamb on November 18, 2013 at 03:45 PM EST #

Post a Comment:
  • HTML Syntax: NOT allowed
About

Anything related to Oracle NoSQL Database and/or Berkeley DB Java Edition.

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today