New Enhancements for InnoDB Memcached

In MySQL 5.6, we continued our development on InnoDB Memcached and completed a few widely desirable features that make InnoDB Memcached a competitive feature in more scenario. Notablely, they are

1) Support multiple table mapping

2) Added background thread to auto-commit long running transactions

3) Enhancement in binlog performance

 Let’s go over each of these features one by one. And in the last section, we will go over a couple of internally performed performance tests.

Support multiple table mapping

In our earlier release, all InnoDB Memcached operations are mapped to a single InnoDB table. In the real life, user might want to use this InnoDB Memcached features on different tables. Thus being able to support access to different table at run time, and having different mapping for different connections becomes a very desirable feature. And in this GA release, we allow user just be able to do both. We will discuss the key concepts and key steps in using this feature.

1) "mapping name" in the "get" and "set" command

In order to allow InnoDB Memcached map to a new table, the user (DBA) would still require to "pre-register" table(s) in InnoDB Memcached “containers” table (there is security consideration for this requirement). If you would like to know about “containers” table, please refer to my earlier blogs in blogs.innodb.com. Once registered, the InnoDB Memcached will then be able to look for such table when they are referred.

Each of such registered table will have a unique "registration name" (or mapping_name) corresponding to the “name” field in the “containers” table.. To access these tables, user will include such "registration name" in their get or set commands, in the form of "get @@new_mapping_name.key", prefix "@@" is required for signaling a mapped table change. The key and the "mapping name" are separated by a configurable delimiter, by default, it is ".". So the syntax is:

get [@@mapping_name.]key_name

set [@@mapping_name.]key_name

 or

 get @@mapping_name

set @@mapping_name

Here is an example:

Let's set up three tables in the "containers" table:

The first is a map to InnoDB table "test/demo_test" table with mapping name "setup_1"

INSERT INTO containers VALUES ("setup_1", "test", "demo_test",

"c1", "c2", "c3", "c4", "c5", "PRIMARY");

 Similarly, we set up table mappings for table "test/new_demo" with name "setup_2" and that to table "mydatabase/my_demo" with name "setup_3":

INSERT INTO containers VALUES ("setup_2", "test", "new_demo", "c1", "c2",

"c3", "c4", "c5", "secondary_index_x");

INSERT INTO containers VALUES ("setup_3", "my_database", "my_demo",

"c1", "c2", "c3", "c4", "c5", "idx");

To switch to table "my_database/my_demo", and get the value corresponding to “key_a”, user will do:

get @@setup_3.key_a

(this will also output the value that corresponding to key "key_a"

or simply

get @@setup_3

Once this is done, this connection will switch to "my_database/my_demo" table until another table mapping switch is requested. so it can continue issue regular command like:

get key_b

 set key_c 0 0 7

These DMLs will all be directed to "my_database/my_demo" table.

And this also implies that different connections can have different bindings (to different table).

2) Delimiter:

For the delimiter "." that separates the "mapping name" and key value, we also added a configure option in the "config_options" system table with name of "table_map_delimiter":

INSERT INTO config_options VALUES("table_map_delimiter", ".");

So if user wants to change to a different delimiter, they can change it in the config_option table.

3) Default mapping:

Once we have multiple table mapping, there should be always a "default" map setting. For this, we decided if there exists a mapping name of "default", then this will be chosen as default mapping. Otherwise, the first row of the containers table will chosen as default setting.

Please note, user tables can be repeated in the "containers" table (for example, user wants to access different columns of the table in different settings), as long as they are using different mapping/configure names in the first column, which is enforced by a unique index.

4) bind command

In addition, we also extend the protocol and added a bind command, its usage is fairly straightforward. To switch to "setup_3" mapping above, you simply issue:

bind setup_3

This will switch this connection's InnoDB table to "my_database/my_demo"

In summary, with this feature, you now can direct access to difference tables with difference session. And even a single connection, you can query into difference tables.

Background thread to auto-commit long running transactions

This is a feature related to the “batch” concept we discussed in earlier blogs. This “batch” feature allows us batch the read and write operations, and commit them only after certain calls. The “batch” size is controlled by the configure parameter “daemon_memcached_w_batch_size” and “daemon_memcached_r_batch_size”. This could significantly boost performance.

However, it also comes with some disadvantages, for example, you will not be able to view “uncommitted” operations from SQL end unless you set transaction isolation level to read_uncommitted, and in addition, this will held certain row locks for extend period of time that might reduce the concurrency.

To deal with this, we introduce a background thread that “auto-commits” the transaction if they are idle for certain amount of time (default is 5 seconds). The background thread will wake up every second and loop through every “connections” opened by Memcached, and check for idle transactions. And if such transaction is idle longer than certain limit and not being used, it will commit such transactions. This limit is configurable by change “innodb_api_bk_commit_interval”. Its default value is 5 seconds, and minimum is 1 second, and maximum is 1073741824 seconds.

With the help of such background thread, you will not need to worry about long running uncommitted transactions when set daemon_memcached_w_batch_size and daemon_memcached_r_batch_size to a large number. This also reduces the number of locks that could be held due to long running transactions, and thus further increase the concurrency.

Enhancement in binlog performance

As you might all know, binlog operation is not done by InnoDB storage engine, rather it is handled in the MySQL layer. In order to support binlog operation through InnoDB Memcached, we would have to artificially create some MySQL constructs in order to access binlog handler APIs. In previous lab release, for simplicity consideration, we open and destroy these MySQL constructs (such as THD) for each operations. This required us to set the “batch” size always to 1 when binlog is on, no matter what “daemon_memcached_w_batch_size” and “daemon_memcached_r_batch_size” are configured to. This put a big restriction on our capability to scale, and also there are quite a bit overhead in creating destroying such constructs that bogs the performance down.

With this release, we made necessary change that would keep MySQL constructs as long as they are valid for a particular connection. So there will not be repeated and redundant open and close (table) calls. And now even with binlog option is enabled (with innodb_api_enable_binlog,), we still can batch the transactions with daemon_memcached_w_batch_size and daemon_memcached_r_batch_size, thus scale the write/read performance.

Although there are still overheads that makes InnoDB Memcached cannot perform as fast as when binlog is turned off. It is much better off comparing to previous release. And we are continuing optimize the solution is this area to improve the performance as much as possible.

Performance Study:

Amerandra of our System QA team have conducted some performance studies on queries through our InnoDB Memcached connection and plain SQL end. And it shows some interesting results.

The test is conducted on a “Linux 2.6.32-300.7.1.el6uek.x86_64 ix86 (64)” machine with 16 GB Memory, Intel Xeon 2.0 GHz CPU X86_64 2 CPUs- 4 Core Each, 2 RAID DISKS (1027 GB,733.9GB). Results are described in following tables:

Table 1: Performance comparison on Set operations

Connections

5.6.7-RC-Memcached-plugin ( TPS / Qps) with memcached-threads=8***

5.6.7-RC*

X faster


Set (QPS)

Set**


8

30,000

5,600

5.36

32

59,000

13,000

4.54

128

68,000

8,000

8.50

512

63,000

6.800

9.23

* mysql-5.6.7-rc-linux2.6-x86_64

** The “set” operation when implemented in InnoDB Memcached involves a couple of DMLs: it first query the table to see whether the “key” exists, if it does not, the new key/value pair will be inserted. If it does exist, the “value” field of matching row (by key) will be updated. So when used in above query, it is a precompiled store procedure, and query will just execute such procedures.

*** added “–daemon_memcached_option=-t8” (default is 4 threads)

So we can see with this “set” query, InnoDB Memcached can run 4.5 to 9 time faster than MySQL server.

Table 2: Performance comparison on Get operations

Connections

5.6.7-RC-Memcached-plugin ( TPS / Qps) with memcached-threads=8

5.6.7-RC*

X faster


Get (QPS)

Get


8

42,000

27,000

1.56

32

101,000

55.000

1.83

128

117,000

52,000

2.25

512

109,000

52,000

2.10

With the “get” query (or the select query), memcached performs 1.5 to 2 times faster than normal SQL.

Summary:

In summary, we added several much-desired features to InnoDB Memcached in this release, allowing user to operate on different tables with this Memcached interface. We also now provide a background commit thread to commit long running idle transactions, thus allow user to configure large batch write/read without worrying about large number of rows held or not being able to see (uncommit) data. We also greatly enhanced the performance when Binlog is enabled. We will continue making efforts in both performance enhancement and functionality areas to make InnoDB Memcached a good demo case for our InnoDB APIs.

Jimmy Yang, September 29, 2012

Comments:

Why are sets 4X to 9X faster? If it is because the SQL writes are auto-commit and the memcache sets are doing a background commit every 5 seconds then this is a very misleading comparison.

Posted by Mark Callaghan on November 06, 2012 at 11:30 AM EST #

Mark, the background commit (every 5 sec) kicks in only if there are uncommitted transactions with some "daemon_memcached_w_batch_size" value, and number of transactions not yet cross the batch size. By default daemon_memcached_w_batch_size is 1, which is the same as auto-commit. And the background commit will not do any commits (since all are committed). But you have a valid point that the test harness needs to make sure the commit batch size in both case.

Posted by guest on November 09, 2012 at 03:36 AM EST #

Post a Comment:
Comments are closed for this entry.
About

This is the InnoDB team blog.

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today