Tuesday Apr 10, 2012

Benchmarking MySQL Replication with Multi-Threaded Slaves

The objective of this benchmark is to measure the performance improvement achieved when enabling the Multi-Threaded Slave enhancement delivered as part of MySQL 5.6.

As the results demonstrate, Multi-Threaded Slaves deliver 5x higher replication performance based on a configuration with 10 databases/schemas. For real-world deployments, higher replication performance directly translates to:

· Improved consistency of reads from slaves (i.e. reduced risk of reading "stale" data)

· Reduced risk of data loss should the master fail before replicating all events in its binary log (binlog)

The multi-threaded slave splits processing between worker threads based on schema, allowing updates to be applied in parallel, rather than sequentially. This delivers benefits to those workloads that isolate application data using databases - e.g. multi-tenant systems deployed in cloud environments.

Multi-Threaded Slaves are just one of many enhancements to replication previewed as part of the MySQL 5.6 Development Release, which include:

· Global Transaction Identifiers coupled with MySQL utilities for automatic failover / switchover and slave promotion

· Crash Safe Slaves and Binlog

· Optimized Row Based Replication

· Replication Event Checksums

· Time Delayed Replication

These and many more are discussed in the “MySQL 5.6 Replication: Enabling the Next Generation of Web & Cloud Services” Developer Zone article.

Back to the benchmark - details are as follows.


Environment
The test environment consisted of two Linux servers:

· one running the replication master

· one running the replication slave.

Only the slave was involved in the actual measurements, and was based on the following configuration:

- Hardware: Oracle Sun Fire X4170 M2 Server

- CPU: 2 sockets, 6 cores per socket with hyper-threading, 2930 MHz.

- OS: 64-bit Oracle Enterprise Linux 6.1
- Memory: 48 GB

Test Procedure
Initial Setup:

Two MySQL servers were started on two different hosts, configured as replication master and slave.

10 sysbench schemas were created, each with a single table:

CREATE TABLE `sbtest` (
   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
   `k` int(10) unsigned NOT NULL DEFAULT '0',
   `c` char(120) NOT NULL DEFAULT '',
   `pad` char(60) NOT NULL DEFAULT '',
   PRIMARY KEY (`id`),
   KEY `k` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

10,000 rows were inserted into each of the 10 tables, for a total of 100,000 rows. When the inserts had replicated to the slave, the slave threads were stopped. The slave data directory was copied to a backup location, and the slave threads' position in the master binlog was noted.

10 sysbench clients, each configured with 10 threads, were spawned at the same time to generate load against each of the 10 schemas on the master. Each sysbench client executed 10,000 "update key" statements:

UPDATE sbtest set k=k+1 WHERE id = <random row>

In total, this generated 100,000 update statements to later replicate during the test itself.

Test Methodology:
The number of slave workers to test with was configured using:

SET GLOBAL slave_parallel_workers=<workers>

Then the slave IO thread was started and the test waited for all the update queries to be copied over to the relay log on the slave.

The benchmark clock was started and then the slave SQL thread was started. The test waited for the slave SQL thread to finish executing the 100k update queries by issuing SELECT MASTER_POS_WAIT(). When MASTER_POS_WAIT() returned, the benchmark clock was stopped and the duration calculated.

The calculated duration from the benchmark clock should be close to the time it took for the SQL thread to execute the 100,000 update queries. The 100k queries divided by this duration gave the benchmark metric, reported as Queries Per Second (QPS).
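
For reference, a minimal sketch of the per-run measurement sequence on the slave; the binlog file name and position shown are placeholders for the values recorded during setup:

SET GLOBAL slave_parallel_workers = 10;   -- worker count under test

START SLAVE IO_THREAD;                    -- copy the pending events into the relay log
-- ... wait until the relay log has caught up with the master ...

-- start the benchmark clock, then:
START SLAVE SQL_THREAD;

-- block until the SQL thread has applied everything up to the recorded
-- master position ('master-bin.000002' and 120 are placeholder values)
SELECT MASTER_POS_WAIT('master-bin.000002', 120);

-- stop the benchmark clock; QPS = 100,000 / elapsed seconds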

Test Reset:

The test-reset cycle was implemented as follows:

· the slave was stopped

· the slave data directory replaced with the previous backup

· the slave was restarted with its replication position repointed to the point in the master binlog just before the update queries.

The test could then be repeated with an identical set of queries but a different number of slave worker threads, enabling a fair comparison.
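
The repositioning step could be expressed roughly as follows (a sketch only; the binlog file name and position are placeholders for the coordinates recorded during setup):

STOP SLAVE;

-- point the slave back at the binlog coordinates recorded before the updates
CHANGE MASTER TO
  MASTER_LOG_FILE = 'master-bin.000002',
  MASTER_LOG_POS  = 120;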

The test-reset cycle was repeated 3 times for each worker count from 0 to 24, and the QPS metric was calculated and averaged for each worker count.

MySQL Configuration
The relevant configuration settings used for MySQL are as follows:

binlog-format=STATEMENT
relay-log-info-repository=TABLE
master-info-repository=TABLE

As described in the test procedure, the slave_parallel_workers setting was modified as part of the test logic. The effect of this setting is as follows:

0 worker threads:
   - current (i.e. single threaded) sequential mode
   - 1 x IO thread and 1 x SQL thread
   - SQL thread both reads and executes the events

1 worker thread:
   - sequential mode
   - 1 x IO thread, 1 x Coordinator SQL thread and 1 x Worker thread
   - coordinator reads the event and hands it to the worker who executes

2+ worker threads:
   - parallel execution
   - 1 x IO thread, 1 x Coordinator SQL thread and 2+ Worker threads
   - coordinator reads events and hands them to the workers who execute them

Results
Figure 1 below shows that Multi-Threaded Slaves deliver ~5x higher replication performance when configured with 10 worker threads, with the load evenly distributed across our 10 schemas. This result is compared to the current replication implementation, which is based on a single SQL thread only (i.e. zero worker threads).

Figure 1: 5x Higher Performance with Multi-Threaded Slaves

The following figure shows more detailed results, with QPS sampled and reported as the worker threads are incremented.

The raw numbers behind this graph are reported in the Appendix section of this post.



Figure 2: Detailed Results

As the results above show, the configuration does not scale noticeably from 5 to 9 worker threads. When configured with 10 worker threads, however, scalability increases significantly. The conclusion, therefore, is that it is desirable to configure the same number of worker threads as schemas.

Other conclusions from the results:

· Running with 1 worker compared to zero workers just introduces overhead without the benefit of parallel execution.

· As expected, having more workers than schemas adds no visible benefit.

Aside from what is shown in the results above, testing also demonstrated that the following settings had a very positive effect on slave performance:


relay-log-info-repository=TABLE
master-info-repository=TABLE

For 5+ workers, it was up to 2.3 times as fast to run with TABLE compared to FILE.
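
With the TABLE repositories, the slave's state is kept in the mysql.slave_relay_log_info and mysql.slave_master_info tables rather than in flat files, so it can be inspected with ordinary queries, for example:

-- slave position state when relay-log-info-repository=TABLE
SELECT * FROM mysql.slave_relay_log_info;

-- master connection state when master-info-repository=TABLE
SELECT * FROM mysql.slave_master_info;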

Conclusion

As the results demonstrate, Multi-Threaded Slaves deliver significant performance increases to MySQL replication when handling multiple schemas.

This and the other replication enhancements introduced in MySQL 5.6 are fully available for you to download and evaluate now from the MySQL Developer site (select the Development Release tab).

You can learn more about MySQL 5.6 from the documentation.

Please don’t hesitate to comment on this or other replication blogs with feedback and questions.

Appendix – Detailed Results

Tuesday Dec 20, 2011

MySQL 5.6.4 Development Milestone Now Available!

I am pleased to announce that the MySQL Database 5.6.4 development milestone release ("DMR") is now available for download (select the Development Release tab). MySQL 5.6.4 includes all 5.5 production-ready features and provides an aggregation of all of the new features that have been released in earlier 5.6 DMRs. 5.6.4 adds many bug fixes and more new "early and often" enhancements that are development and system QA complete and ready for Community evaluation and feedback. You can get the complete rundown of all the new 5.6.4-specific features here.

For those following the progression of the 5.6 DMRs as the trains leave the station, you should bookmark these MySQL Engineering development team specific blogs:

You can also track the thought and innovation leaders on the MySQL Optimizer and the new Optimizer specific improvements in 5.6.4 by following the MySQL Optimizer Team member blogs:

And of course you can follow others on the Optimizer team and all of MySQL Engineering teams by bookmarking/subscribing to PlanetMySQL.

We look forward to your feedback on MySQL 5.6.4, so please download your copy now and help us make a better MySQL. 

As always, a sincere thanks for your continued support of MySQL!   


Monday Oct 03, 2011

More Early Access Features in the MySQL 5.6.3 Development Milestone!

For those with an interest in MySQL, this week at Oracle OpenWorld has gotten off to a great start.  Demonstrating how Oracle drives MySQL innovation, Tomas' "State of the Dolphin" keynote on Monday gave a great overview of the new MySQL products that have recently been delivered:

Of these announcements I am most excited about the new early access features ready for community evaluation in the MySQL 5.6.3 DMR.  Since 5.5 was released last December, we have gotten community and customer feedback that it is the best, highest-quality release to date.  5.6 builds on 5.5 by improving:

The MySQL Optimizer for better query performance, scale, diagnostics
Oracle owns the MySQL optimizer and is making a huge investment in engineering and re-factoring to ensure the optimizer evolves to meet the most demanding and complex query requirements.  To this end, the 5.6.3 optimizer now includes:

File sort optimizations for queries with SELECT *, ORDER BY on a non-indexed column, and small LIMIT values

Currently, such queries that ORDER BY a non-indexed column perform a full table scan, create sort operations for the entire table, potentially create temp tables, toss out the unneeded rows and merge the result set.  With the new default behavior the whole table is still scanned, but only the needed rows are sorted and returned in the result set.  Early internal tests show a 3x improvement in execution times when using the default sort buffer (of course, your mileage may vary).
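
A query of the shape that benefits looks something like this (hypothetical table; created_at is assumed to be non-indexed):

-- full SELECT *, ORDER BY on a non-indexed column, small LIMIT
SELECT * FROM orders ORDER BY created_at DESC LIMIT 10;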

Index Condition Pushdown ("ICP")

By default, the optimizer now pushes conditions on indexed columns in the WHERE clause down to the storage engine for evaluation.  For example, for this table and related query on the secondary key of postalcode+age:

CREATE TABLE person (
      personid INTEGER PRIMARY KEY,
      firstname CHAR(20),
      lastname CHAR(20),
      postalcode INTEGER,
      age INTEGER,
      address CHAR(50),
      KEY k1 (postalcode,age)
   ) ENGINE=InnoDB;

SELECT lastname, firstname FROM person
   WHERE postalcode BETWEEN 5000 AND 5500 AND age BETWEEN 21 AND 25; 


In this case the optimizer will use a range scan on postalcode from 5000-5500 (on the index) and push the evaluation of age down to InnoDB, which will evaluate age "BETWEEN 21 AND 25".  For each match, InnoDB will return a row back to the server, which will continue requesting/receiving rows until all rows have been evaluated and returned.  Early non-scientific testing shows:

With 5 million rows in the table, random values, running on commodity, desktop-grade hardware:
RESULTS:
ICP disabled, disk bound (default 128 MB buffer pool): the query takes 15 seconds
ICP disabled, all data in memory (1.5 GB buffer pool): the query takes 1.4 seconds
ICP enabled: execution time reduced to 90 ms in both cases
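
ICP can be switched on and off per session through optimizer_switch, which makes before/after comparisons easy; when ICP is used, EXPLAIN reports "Using index condition" in the Extra column. A sketch:

-- disable ICP to reproduce the old behavior, then re-enable it
SET optimizer_switch = 'index_condition_pushdown=off';
SET optimizer_switch = 'index_condition_pushdown=on';

EXPLAIN SELECT lastname, firstname FROM person
   WHERE postalcode BETWEEN 5000 AND 5500 AND age BETWEEN 21 AND 25;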

Batched Key Access ("BKA") and Multi-Range Read ("MRR")

Multi-Range Read was introduced in the 5.6.2 DMR. MRR improves the performance of queries that use secondary indexes, by scanning one or more index ranges used in a query, sorting the associated disk blocks for the row data, then reading those disk blocks using larger sequential I/O requests. The speedup benefits operations such as range index scans and equi-joins on indexed columns.

BKA builds on MRR by improving on the performance of disk-bound join queries.  With BKA the server fills up the join buffer with "join keys".  The server then sends batches of join keys to the storage engine layer (InnoDB or MyISAM, and soon NDB) for processing.  The storage engine then uses MRR to sort indexes according to the data sequence so the data pages can be accessed in sequence for efficiency.

Early testing of DBT3 Q3 (the Customer Distribution query) on InnoDB, with a 50 MB buffer pool, a 4 GB InnoDB data file and a disk/memory ratio of ~2% in memory, has shown execution time drop from 2000 seconds to 10 seconds with a sufficiently large join buffer.
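
As a sketch, MRR and BKA are controlled through optimizer_switch, and BKA benefits from a larger join buffer (the 32 MB value below is purely illustrative):

-- enable MRR unconditionally and turn on Batched Key Access
SET optimizer_switch = 'mrr=on,mrr_cost_based=off,batched_key_access=on';

-- give BKA a larger join buffer so more keys are batched per round trip
SET SESSION join_buffer_size = 32 * 1024 * 1024;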

EXPLAIN for INSERT/UPDATE/DELETE

EXPLAIN for SELECT has been available since MySQL 5.0.  This new feature is a long-standing request from our community and customer base.  Best to get straight to an example of how it is used:

CREATE TABLE t1(c1 INT, c2 INT, KEY(c2), KEY(c2, c1));

EXPLAIN UPDATE t1 SET c1 = 10 WHERE c2 = 1;

mysql> EXPLAIN UPDATE t1 SET c1 = 10 WHERE c2 = 1;

+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | t1    | range | c2,c2_2       | c2   | 5       | NULL |    1 | Using where |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+

1 row in set (0.00 sec)

In this example the optimizer will use a type=range (scan) on key=c2 to identify and UPDATE the affected row(s).

Optimizer Traces
This feature allows DBAs and developers to trace the decision-making steps of the optimizer.  The TRACE output shows how the EXPLAIN plan was generated and the decision points resolved along the way.  An example would be:

SET SESSION optimizer_trace="enabled=on";

SELECT (SELECT 1 FROM t6 WHERE d = c) AS RESULT FROM t5 ;

select * from information_schema.OPTIMIZER_TRACE;

With output snippet:
...

"records_estimation": [
  {
    "database": "test",
    "table": "t6",
    "range_analysis": {
      "table_scan": {
         "records": 2,
         "cost": 4.5034
      },
      "potential_range_indices": [
        {
          "index": "d",
          "usable": true,
          "key_parts": [
            "d"
          ]
        }
      ],
      "best_covering_index_scan": {
        "index": "d",
        "cost": 1.4233,
        "chosen": true
       },


...

Optimizer tracing works for SELECT, EXPLAIN SELECT, INSERT or REPLACE (with VALUES or SELECT), UPDATE/DELETE, SET, DO, DECLARE/CASE/IF/RETURN.

Other Optimizer related improvements in 5.6.3 include:

Late materialization of views/subqueries in the FROM clause
This improvement enables fast EXPLAINs for views/subqueries and defers or avoids materialization when possible, enabling a quicker bail-out if the query is abandoned or canceled. Internal tests show a 240x improvement in execution time (dropping from 8 minutes to 2 seconds).
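
The kind of statement that benefits is an EXPLAIN over a derived table or view, for example (table and column names are hypothetical):

-- with late materialization, EXPLAIN no longer has to materialize the derived table
EXPLAIN
SELECT dt.customer_id, dt.total
FROM (SELECT customer_id, SUM(amount) AS total
      FROM orders GROUP BY customer_id) AS dt
WHERE dt.total > 1000;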

Persistent Optimizer Statistics for InnoDB
This allows for more stable, accurate InnoDB statistics and minimizes the overhead of recalculating statistics on startup or crash recovery.  Statistics can be manually updated when needed.
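
With persistent statistics, a manual refresh is simply an ANALYZE TABLE, for example against the person table shown earlier:

-- recompute (and persist) the InnoDB statistics for this table
ANALYZE TABLE person;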

The 5.6.3 DMR also includes new features and improvements to the following:

Performance Schema now includes instrumentation for:

  • Statements/execution stages - What are my most resource intensive queries?
  • Table and Index I/O - Which app tables/indexes cause the most load?
  • Table Locks - Which tables cause the most contention?
  • Users/Hosts/Accounts level resource consumption - Who are the most resource intensive users, hosts, accounts?
  • Network I/O - Is it the network or my app?  How long do sessions idle?
  • Aggregated summaries by thread, user, host, account, object
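
For example, the table I/O instrumentation can be queried directly from the Performance Schema summary tables (a sketch, assuming the relevant instruments and consumers are enabled):

-- which tables cause the most I/O load?
SELECT OBJECT_SCHEMA, OBJECT_NAME, COUNT_STAR, SUM_TIMER_WAIT
FROM performance_schema.table_io_waits_summary_by_table
ORDER BY SUM_TIMER_WAIT DESC
LIMIT 10;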

InnoDB

  • New INFORMATION_SCHEMA tables (metrics, system and buffer pool information) for better diagnostics
  • Dump and restore buffer pool for quicker startup and recovery
  • Reduce contention during file extension for better concurrency

Replication

  • Crash-safe slaves and binlog, replication checksums for better data integrity
  • Multi-threaded slaves for better performance
  • time-delayed replication for better testing, data integrity
  • Informational log events for easier troubleshooting

And more...

You can learn more about the 5.6.3 DMR and other early access features by visiting the MySQL Developers Zone.  Here you will find technical articles, the MySQL Database 5.6.3 and MySQL Cluster 7.2 DMR downloads, supporting documentation and all related PlanetMySQL blogs.

If you are attending Oracle Open World this week please plan to attend as many of the 47 MySQL sessions as you can, including my sessions on the "MySQL Roadmap" (Tuesday at 5:00 pm) and "Using MySQL with Other Oracle Products" (Wednesday at 11:45 am PT).  Both are in the Marriott Marquis (just up the block from the Moscone Center), Golden Gate C2.  Also, be sure to join us at the MySQL Community Reception.  There will be many familiar faces and friends there, and it is open to all, even if you are not attending OOW.

As always, thanks for your continued support of MySQL!

 





Thursday Sep 29, 2011

MySQL HA Solutions: New Guide Available

Databases are the center of today’s web, enterprise and embedded applications, storing and protecting an organization’s most valuable assets and supporting business-critical applications. Just minutes of downtime can result in significant lost revenue and dissatisfied customers. Ensuring database high availability is therefore a top priority for any organization.

The new MySQL Guide to High Availability solutions is designed to navigate users through the HA maze, discussing:

- The causes, effects and impacts of downtime;

- Methodologies to select the right HA solution;

- Different approaches to delivering highly available MySQL services;

- Operational best practices to meet Service Level Agreements (SLAs).

As discussed in the new Guide, selecting the high availability solution that is appropriate for your application depends upon 3 core principles:

- The level of availability required to meet business objectives, within budgetary constraints;

- The profile of application being deployed (i.e. concurrent users, requests per second, etc.);

- Operational standards within each data center.

Recognizing that each application or service has different operational and availability requirements, the guide discusses the range of certified and supported High Availability (HA) solutions – from internal departmental applications all the way through to geographically redundant, multi-data center systems delivering 99.999% availability (i.e. less than 5 ½ minutes of downtime per year) supporting transactional web services, communications networks, cloud and hosting environments, etc.

By combining the right technology with the right skills and processes, users can achieve business continuity, while developers and DBAs can sleep tight at night! Download the guide to learn more.

Monday Aug 01, 2011

More New MySQL 5.6 Early Access Features

Last week was a banner week for MySQL at OSCON. We had many MySQL developers meeting with the MySQL community, conducting technical sessions, leading BOF sessions, working the exhibit hall, and confirming Oracle's leadership in the technical evolution of MySQL.  The highlight of the week was the unveiling of even more 5.6 early access InnoDB and Replication features that are now available for early adopters to download, evaluate and shape via labs.mysql.com.  

InnoDB is one of MySQL's "crown jewels" and, beginning in 5.5, is now the default storage engine.  The following 5.6 feature improvements are in direct response to community and customer feedback and requests.  The new 5.6 early access features include:


  • Full-text search
  • REDO log files max size extended to 2 TB
  • UNDO logs on their own tablespace
  • Buffer Pool options for pre-loading/warming on re-start
  • Improved auto-extension of .ibd files
  • Support for smaller 4k, 8k page sizes

Replication is by far the most popular and widely used MySQL feature. The following feature improvements in 5.6 are also in direct response to community and customer feedback and requests.  The new 5.6 early access features include:
  • New Binlog API
  • Binlog group commit (completes InnoDB group commit implemented in MySQL 5.5)
  • Durable Slave Reads
  • Enhanced multi-threaded slaves

You can read about the details, including development blogs on how to get started with each, in this new DevZone article.  My sincere thanks and appreciation to the InnoDB and Replication development teams for their leadership in technical innovation and mind share, and for their dedicated work in providing these and other new features "early and often" to the MySQL community.  Stay tuned for more to come!

We can't say this "early and often" enough...thanks for your continued support of MySQL!

MySQL 5.6 Replication – New Early Access Features

At OSCON 2011 last week, Oracle delivered more early access (labs) features for MySQL 5.6 replication. These features are focused on better integration, performance and data integrity, and are summarized in this blog with links to resources enabling users to download, configure and evaluate them.

Thursday Jul 14, 2011

New “Meet The MySQL Experts” Podcast Series

During the past year, we’ve put a lot of emphasis at Oracle on sharing information with the MySQL Community. Actions included PR and communications not only about GA products but also about development milestone releases and features available in labs.mysql.com, as well as more technical articles, more blogs, participation in conferences, etc. We very much value the feedback and discussions such communications generate.

We therefore plan to keep communicating this way, and to do even more. This is why we’re now very pleased to announce a new Oracle podcast series entitled “Meet the MySQL Experts”! Once a month, we plan to interview an Oracle MySQL engineer who will share information about his/her domain of expertise and what he/she is currently working on.

For the first edition, it was my pleasure to interview Mats Kindahl about MySQL Replication.

We hope you enjoy the new podcast series, which we think will be a good complement to the great “OurSQL” podcast that Sheeri and Sarah are running.

Feedback welcome, including about topics that you would like us to cover in subsequent editions.

Tuesday Dec 15, 2009

A new MySQL Milestone Release (5.5.0-M2) has been published

Following our new Milestone-based release model, we've now published MySQL 5.5.0-M2. Please see the "What's New in MySQL 5.5" section of the reference manual for a summary of the most notable changes. The MySQL 5.5.0 Changelog provides a much more detailed list of changes, bug fixes and improvements.

Highlights in this release include:

The InnoDB plugin (currently at version 1.0.5) replaces the built-in InnoDB storage engine. This version includes a number of important performance improvements, especially when running on multi-core CPUs with many concurrent transactions.

Support for "semisynchronous" replication. This is a very useful feature to check out, if you use MySQL replication in a high-availability scenario and you want to ensure that at least one replication slave has received the latest transaction that was committed on the master. The master waits for confirmation from the slave after a commit before it returns control to the application. See the chapter "Semisynchronous replication" in the reference manual for more details. Mark Callaghan and Giuseppe Maxia also provide some more background information about this feature in their blog posts.
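
As a sketch, semisynchronous replication is enabled by installing the plugins and switching them on (the .so library names below assume a Linux build):

-- on the master
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;

-- on each slave
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;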

You can now use SIGNAL and RESIGNAL statements in stored routines and triggers, as defined by the SQL standard. These statements can be used to raise an error inside of your routines and can define appropriate actions, e.g. displaying an error message or issuing warnings. Check out the reference manual or take a look at Roland Bouman's blog post that provides a quick overview by giving a practical example.
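
A minimal illustration of SIGNAL inside a trigger (the person table and age column are hypothetical):

DELIMITER //
CREATE TRIGGER check_age BEFORE INSERT ON person
FOR EACH ROW
BEGIN
  IF NEW.age < 0 THEN
    -- raise a user-defined error with a custom message
    SIGNAL SQLSTATE '45000' SET MESSAGE_TEXT = 'age cannot be negative';
  END IF;
END//
DELIMITER ;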

The table partitioning functionality that was introduced with MySQL 5.1 received a large number of bug fixes and improvements. For example, you can now use two new types of user-defined partitioning: RANGE COLUMNS and LIST COLUMNS, providing more flexibility when it comes to defining how a table should be partitioned (e.g. by defining ranges or lists based on DATE, DATETIME or strings). Ranges based on multiple columns are now possible, too. For more information, please see the chapters "RANGE Partitioning", and "LIST Partitioning" in the reference manual.
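
A sketch of the new RANGE COLUMNS syntax, partitioning directly on a DATE column (table and partition names are hypothetical):

CREATE TABLE orders_archive (
    order_id   INT NOT NULL,
    order_date DATE NOT NULL
)
PARTITION BY RANGE COLUMNS (order_date) (
    PARTITION p2008 VALUES LESS THAN ('2009-01-01'),
    PARTITION p2009 VALUES LESS THAN ('2010-01-01'),
    PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);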

Other improvements include the new LOAD XML statement, which allows you to read data from an XML file into a table, a new TO_SECONDS() date function to calculate the number of seconds since the year 0, and a ton of bug fixes.
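
For example (the file path, table and XML row tag are hypothetical):

-- read rows from an XML file into a table
LOAD XML LOCAL INFILE '/tmp/people.xml' INTO TABLE person ROWS IDENTIFIED BY '<person>';

-- seconds elapsed since year 0 for a given datetime
SELECT TO_SECONDS('2009-12-15 12:00:00');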

Binary packages and sources are now available from the MySQL 5.5 download pages. The Bazaar source tree of this milestone release is available on Launchpad, too, in case you want to take a closer look at the ongoing development work in this branch. So please download and toy around with this release – we are looking forward to your feedback!
