Monday Jun 24, 2013

Beginning on MySQL 5.6? Take the New MySQL for Beginners Training

The MySQL for Beginners training course is a great way for you to learn about the world's more popular open source database. During this 4 day course, epxert instructors will teach you how to use MySQL Server 5.6 and the latest tools while helping you develop deeper knowledge of using relational databases.

You can take this live-instructor course as a:

  • Live-Virtual event: Take this course from your own desk, choosing from a selection of events on the schedule to suit different time-zones.
  • In-Class Event: Travel to an education center to follow this course. Below is a selection of events already on the schedule.

 Location

 Date

 Delivery Language

 Brussels, Belgium

 8 September 2013

 English

 London, England

 1 July 2013

 English

 Berlin, Germany

 2 September 2013

 German

 Stuttgart, Germany

 28 October 2013

 German

 Riga, Latvia

 26 August 2013

 Latvian

Utrecht, Netherlands 

9 September 2013 

English 

 Warsaw, Poland

 15 July 2013

 Polish

 Cape Town, South Africa

 22 July 2013

 English

 Petaling Jaya, Malaysia

 22 July 2013

 English

 Sao Paulo, Brazil

 7 October 2013

 Brazilian Portugese

To register for this course or to learn more about the authentic MySQL curriculum, go to http://oracle.com/education/mysql.

Wednesday Apr 10, 2013

MySQL Replication: Self-Healing Recovery with GTIDs and MySQL Utilities

MySQL 5.6 includes a host of enhancements to replication, enabling DevOps teams to reliably scale-out their MySQL infrastructure across commodity hardware, on-premise or in the cloud.

One of the most significant enhancements is the introduction of Global Transaction Identifiers (GTIDs) where the primary development motivation was:

- enabling seamless failover or switchover from a replication master to slave

- promoting that slave to the new master

- without manual intervention and with minimal service disruption.

You can download the new MySQL Replication High Availability Guide to learn more.  The following sections provide an overview of how GTIDs and new MySQL utilities work together to enable self-healing replication clusters.  

GTIDs

To understand the implementation and capabilities of GTIDs, we will use an example illustrated below:

- Server “A” crashes

- We need to failover to one of the slaves, promoting it to master

- The remaining server becomes a slave of that new master

Figure 1: MySQL Master Failover and Slave Promotion

As MySQL replication is asynchronous by default, servers B and C may not have both replicated and executed the same number of transactions, i.e. one may be ahead of the other. Consider the following scenarios:

Scenario #1

- Server B is ahead of C and B is chosen as the new master;

- [Then] Server C needs to start to replicate from the first transaction in server B that it is yet to receive;

Scenario #2

- Server C has executed transactions that have so far not been received by Server B, but the administrator designates Server B as the new master (for example, it is configured with higher performing hardware than Server C).

- Server B therefore needs to execute later transactions from Server C, before assuming the role of the master, otherwise lost transactions and conflicts can ensue.

GTIDs apply a unique identifier to each transaction, so it is becomes easy to track when it is executed on each slave. When the master commits a transaction, a GTID is generated which is written to the binary log prior to the transaction. The GTID and the transaction are replicated to the slave.

If the slave is configured to write changes to its own binary log, the slave ensures that the GTID and transaction are preserved and written after the transaction has been committed.

The set of GTIDs executed on each slave is exposed to the user in a new, read-only, global server variable, gtid_executed. The variable can be used in conjunction with GTID_SUBTRACT() to determine if a slave is up to date with a master, and if not, which transactions are missing.

A new replication protocol was created to make this process automatic. When a slave connects to the master, the new protocol ensures it sends the range of GTIDs that the slave has executed and committed and requests any missing transactions. The master then sends all other transactions, i.e. those that the slave has not yet executed. This is illustrated in the following example (note that the GTID numbering and binlog format is simplified for clarity):

Figure 2: Automatically Synchronizing a New Master with it's Slaves

In this example, Server B has executed all transactions before Server A crashed. Using the MySQL replication protocol, Server C will send “id1” to Server B, and then B will send “id 2” and “id3” to Server C, before then replicating new transactions as they are committed. If the roles were reversed and Server C is ahead of Server B, this same process also ensures that Server B receives any transactions it has so far not executed, before being promoted to the role of the new master.

We therefore have a foundation for reliable slave promotion, ensuring that any transactions executed on a slave are not lost in the event of an outage to the master.

MySQL Utilities for GTIDs

Implemented as Python-based command-line scripts, the Replication utilities are available as a standalone download as well as packaged with MySQL Workbench. The source code is available from the MySQL Utilities Launchpad page.

The utilities supporting failover and recovery are components of a broader suite of MySQL utilities that simplify the maintenance and administration of MySQL servers, including the provisioning and verification of replication, comparing and cloning databases, diagnostics, etc. The utilities are available under the GPLv2 license, and are extendable using a supplied library. They are designed to work with Python 2.6 and above.

Replication Utility: mysqlfailover

While providing continuous monitoring of the replication topology, mysqlfailover enables automatic or manual failover to a slave in the event of an outage to the master. Its default behavior is to promote the most viable slave, as defined by the following slave election criteria.

• The slave is running and reachable;

• GTIDs are enabled;

• The slaves replication filters do not conflict;

• The replication user exists;

• Binary logging is enabled.

Once a viable slave is elected (called the candidate), the process to retrieve all transactions active in the replication cluster is initiated. This is done by connecting the candidate slave to all of the remaining slaves thereby gathering and executing all transactions in the cluster on the candidate slave. This ensures no replicated transactions are lost, even if the candidate is not the most current slave when failover is initiated.

The election process is configurable. The administrator can use a list to nominate a specific set of candidate slaves to become the new master (e.g. because they have better performing hardware).

At pre-configured intervals, the utility will check to see if the server is alive via a ping operation, followed by a check of the connector to detect if the server is still reachable. If the master is found to be offline or unreachable, the utility will execute one of the following actions based on the value of the failover mode option, which enables the user to define failover policies:

• The auto mode tells the utility to failover to the list of specified candidates first and if none are viable, search the list of remaining slaves for a candidate.

• The elect mode limits election to the candidate slave list and if none are viable, automatic failover is aborted.

• The fail mode tells the utility to not perform failover and instead stop execution, awaiting further manual recovery actions from the DevOps team.

Via a series of Extension Points users can also bind in their own scripts to trigger failover actions beyond the database (such as Virtual IP failover). Review the mysqlfailover documentation for more detail on configuration and options of this utility.

Replication Utility: mysqlrpladmin

If a user needs to take a master offline for scheduled maintenance, mysqlrpladmin can perform a switchover to a specific slave (called the new master). When performing a switchover, the original master is locked and all slaves are allowed to catch up. Once the slaves have read all events from the original master, the original master is shutdown and control is switched to the new master.

There are many Operations teams that prefer to take failover decisions themselves, and so mysqlrpladmin provides a mechanism for manual failover after an outage to the master has been identified, either by using the health reporting provided by the utilities, or by alerting provided by a tool such as the MySQL Enterprise Monitor.

It is also possible to easily re-introduce recovered masters back to the replication topology by using the demote-master command and initiating a switchover to the recovered master.

Review the mysqlrpladmin documentation for more detail on configuration and options of the utility.

Resources to Get Started

In addition to the documentation, two additional resources are useful to enable you to learn more:

- MySQL replication utilities tutorial video

- MySQL replication guide: building a self healing replication topology

Summary

The combination of options to control failover or switchover and the ability to inform applications of the failover event are powerful features that enable self-healing for critical applications.

Thursday Apr 04, 2013

MySQL 5.6 Replication: All That Is New, On-Demand

The new MySQL 5.6 GA release delivers a host of new capabilities to support developers releasing new services faster, with more agility, performance and security .

One of the areas with the most far-reaching set of enhancements is MySQL replication used by the largest web, mobile and social properties to horizontally scale highly-available MySQL databases across distributed clusters of low cost, commodity servers.

A new on-demand MySQL 5.6 replication webinar takes you on a guided tour through all of those enhancements, including:

- 5x higher master and slave performance with Binary Log Group Commit, Multi-Threaded Slaves and Optimized Row-based replication

- Self healing replication with Global Transaction Identifiers (GTIDs), utilities and crash-safe masters and slaves

- Replication event checksums for rock-solid data integrity across a replication cluster

- DevOps features with Time-Delayed Replication, Remote binary log backup and new CLI utilities

We had some great questions during the live session - here are a few:

Question: When multi-threaded slave is configured, how do you handle multi-database DML? 

Answer: When that happens the slave switches to single-threaded mode for that transaction. After it is applied, the slave switches back to multi-threaded mode.

Question: We have seen in past that replication would sometime break when complex queries get fired or alter statements take place. Neither row based or Statement based replication guaranteed 100% fail safe replication. Does MySQL5.6 address this issue ?

Answer: Replication can fail for several specific reasons. There is no general algorithm enabling "100% fail safe" replication. But we made sure that all that are unsafe are either converted to row based replication or issue warnings.

Question: We are looking for a HA cluster solution where most of the websites are Wordpress based. Do you have any recommendations on what kind of replication solution should we choose? The tables are using innodb

Answer: Using MySQL 5.6 with the master/slave replication we are discussing today, configured with GTIDs and using the mysqlfailover utility will provide you High Availability for your Wordpress cluster and InnoDB tables. Nothing you need to add - it is all part of MySQL you download today

Question: I have question about data loss and translating my-bin.000132 to my-bin.00101 when a master fails?

Answer: That's exactly the problem GTIDs address. Without GITDs it's hard to know the correct position on my-bin.000101 that corresponds to my-bin.000132.

With GTIDs we don't need to know positions, only what the slave has already applied, knowing that it's simple to send to the new master any missing transactions that are on other slaves. The data that was already on my-bin.00132 can be ensured that it was sent to a slave using Semi-syncronous replication.

Question: How do the new utilities work with a Pacemaker/Corosync setup?

Answer: The utilities integrate with Pacemaker/Corosync via extension points. The MySQL Pacemaker agent can call rpladmin to perform a switchover.

Question: Where can I learn more about supported HA solutions for MySQL?

Answer: From the Guide to MySQL HA solutions. I hope you find this useful

Question: Are there any plans to move replication from host-based to either 'database' or 'table' based?

Answer: Current MySQL replication supports replication of specific databases or tables from a Host using Replication filters. This means, we support Replication of a host (all data which could be replicated), replication of specific databases per host and specific tables per host. (You can ignore databases and tables using filters too).

For these and other questions, along with the full replay of the webinar, register here


Friday Mar 29, 2013

“Let’s Celebrate MySQL 5.6 GA!” - MySQL Community Reception by Oracle

Join Oracle’s MySQL Team on April 22, 2013, as we celebrate the general availability of MySQL 5.6. With product demos and fun activities in a relaxing atmosphere, this is the party for the MySQL community to get together and have a toast on the work all of us did to make MySQL 5.6 the best release ever. Whether you are an attendee at Percona Live, a member of local MySQL user groups, a MySQL user in the San Francisco Bay Area, or simply interested in MySQL technology, you’re all invited to Oracle’s MySQL Community Reception.
    •    Mingle with your peers and learn from real-world experiences.
    •    Meet MySQL engineers and get the first-hand information on the latest product development.
    •    Have lots of fun!

Date: Monday, April 22, 2013

6:30 PM – 8:30 PM

Location: TechMart Santa Clara
5201 Great America Parkway
Santa Clara, CA, USA
(1-minute walk from Hyatt Regency Santa Clara)

Registration will be open soon. Mark your calendar today so you don't miss the opportunity to enjoy an evening of casual but informative conversation, with complimentary food and refreshments from Oracle!

Friday Mar 08, 2013

MySQL Web Reference Architectures - Your Guide to Innovating on the Web

MySQL is deployed in 9 of the top 10 most trafficked sites on the web including Facebook, Twitter, eBay and YouTube, as well as in some of the fastest growing services such as Tumblr, Pinterest and box.com

Working with these companies has given MySQL developers, consultants and support engineers unique insight into how to design database-driven web architectures – whether deployed on-premise or in the cloud.

The MySQL Web Reference Architectures are a set of documented and repeatable best practices for building infrastructure that deliver the highest levels of scalability, agility and availability with the lowest levels of cost, risk and complexity. 

Four components common to most web and mobile properties are sized, with optimum deployment architectures for each:

User authentication and session management

Content management

Ecommerce

Analytics and big data integration

The sizing is defined by database size and load, as shown below

For each reference architecture, strategies for scaling the service and ensuring high availability are discussed, along with approaches to secure, audit and backup user data, and tools to monitor and manage the environment.

The Reference Architectures cover the core underlying technologies supporting today’s most successful web services including:

- MySQL Database

- MySQL Cluster

- MySQL Replication

- Caching with Memcached and Redis

- Big Data with Hadoop

- NoSQL APIs

- Geographic Redundancy

- Hardware Recommendations

- Operational Best Practices

An example of the "Large" reference architecture is shown below

To learn more:

- Download the MySQL Web Reference Architectures Guide

- View the MySQL Web Reference Architectures slides

The Reference Architecture are designed as a starting point which we hope will enable you build the next web and mobile phenomenon!

Thursday Feb 07, 2013

MySQL 5.6 Replication Performance

With data volumes and user populations growing, its no wonder that database performance is a hot topic in developer and DBA circles.  

Its also no surprise that continued performance improvements were one of the top design goals of the new MySQL 5.6 release which was declared GA on February 5th (note: GA means “Generally Available”, not “Gypsy Approved” @mysqlborat)

And the performance gains haven’t disappointed:

- Dimitri Kravtchuk’s Sysbench tests showed MySQL delivering up to 4x higher performance than the previous 5.5 release.

- Mikael Ronstrom’s testing showed up to 4x better scalability as thread counts rose to 48 and 60 threads (not so uncommon in commodity systems today)

Of course, YMMV (Your Mileage May Vary) in real-world workloads, but you can be reasonably certain you will see performance gains by upgrading to MySQL 5.6.  It is up to you whether you invest these gains to serve more applications and users from your MySQL databases, or you consolidate to fewer instances to reduce operational costs.

How about if you are using MySQL replication in order to scale out your database across commodity nodes, and as a foundation for High Availability (HA)? Performance here has also been improved through a combination of:

- Binary Log Group Commit

- Multi-Threaded Slaves

- Optimized Row-Based Replication

As well as giving you the benefits above; higher replication performance directly translates to:

- Reduced risk of losing data in the event of a failure on the master

- Improved read consistency from slaves

- Resource-efficient binlogs traversing the replication cluster

We will look at each of these in more detail.

Binlog Group Commit

Rather than applying writes one at a time, Binary Log Group Commit batches writes to the Binlog, significantly reducing overhead to the master. This is demonstrated by the benchmark results below.


Using the SysBench RW tests, enabling the binlog and using the default sync_binlog=0 reduces the throughput of the master by around 10%. With sync_binlog=1 (where the MySQL server synchronizes its binary log to disk after every write to the binlog, thereby giving maximum data safety), throughput is reduced by a further 5%.

To understand the difference this makes, the tests were repeated comparing MySQL 5.6 to MySQL 5.5 


Even with sync_binlog=1, MySQL 5.6 was still up to 4.5x faster than 5.5 with sync_binlog=0

Gone are the days when configuring replication resulted in a 50% or more hit to performance of your master.

The result of Binary Log Group Commit is that MySQL replication is much better able to keep pace with the demands of write-intensive workloads, imposing much less overhead on the master, even when the binlog is flushed to disk after each commit!

Details of the configurations used for the benchmark are in the Appendix at the end of this post.

You can learn more about the implementation of Binlog Group Commit from Mats Kindahl’s blog.

Multi-Threaded Slave

Looking beyond the replication master, it is also necessary to bring performance enhancements to the replication slaves.

Using Multi-Threaded Slaves, processing is split between worker threads based on schema, allowing updates to be applied in parallel, rather than sequentially. This delivers benefits to those workloads that isolate application data using databases - e.g. multi-tenant systems deployed in cloud environments.

Benchmarks demonstrate that Multi-Threaded Slaves increase performance by 5x.  


This performance enhancement translates to improved read consistency for clients accessing the replication cluster. Slaves are better able to keep up with the master, and so users are much less likely to need to throttle the sustained throughput of writes, just so that the slaves don't indefinitely fall further and further behind (previously some users had to reduce the capacity of their systems in order to reduce slave lag).

You can get all of the details on this benchmark and the configurations used in this earlier blog posting

Optimized Row-Based Replication

The final piece in improving replication performance is to introduce efficiencies to the binary log itself.

By only replicating those elements of the row image that have changed following INSERT, UPDATE and DELETE operations, replication throughput for both the master and slave(s) can be increased while binary log disk space, network resource and server memory footprint are all reduced.

This is especially useful when replicating across datacenters or cloud availability zones.

Another performance enhancements is the was Row-Based Replication events are handled on the slave against tables without Primary Keys. Updates would be made via multiple table scans. In MySQL 5.6, no matter size of the event, only one table scan is performed, significantly reducing the apply time.

You can learn more about Optimized Row Based Replication from the MySQL documentation.  

Crash-Safe Slaves and Binlog

Aka “Transactional Replication” this is more of an availability than a performance feature, but there is a nice performance angle to it.

The key concept behind crash safe replication is that replication positions are stored in tables rather than files and can therefore be updated transactionally, together with the data. This enables the slave or master to automatically roll back replication to the last committed event before a crash, and resume replication without administrator intervention. Not only does this reduce operational overhead, it also eliminates the risk of data loss or corruption.

From a performance perspective, it also means there is one less disk sync, since we leverage the storage engine's fsync and don't need an additional fsync for the file, giving users a performance gain.

Wrapping Up

So with a faster MySQL Server, InnoDB storage engine and replication, developers and DBAs can get ahead of performance demands. Bear in mind there is more to MySQL 5.6 replication than performance – for example Global Transaction Identifiers (GTIDs), replication event checksums, time-delayed replication and more.

To learn more about all of the new replication features in MySQL 5.6, download the Replication Introduction guide

To get up and running with MySQL replication, download the new Replication Tutorial Guide 


Appendix – Binary Log Group Commit Benchmarks

System Under Test:

5 x 12 thread Intel Xeon E7540 CPUs @2.00 GHz

512 GB memory

Oracle Linux 6.1


Configuration file:

1) --innodb_purge_threads=1

2) --innodb_file_format=barracuda

3) --innodb-buffer-pool-size=8192M

4) --innodb-support-xa=FALSE

5) --innodb-thread-concurrency=0

6) --innodb-flush-log-at-trx-commit=2

7) --innodb-log-file-size=8000M

8) --innodb-log-buffer-size=256M

9) --innodb-io-capacity=2000

10) --innodb-io-capacity-max=4000

11) --innodb-flush-neighbors=0

12) --skip-innodb-adaptive-hash-index

13) --innodb-read-io-threads=8

14) --innodb-write-io-threads=8

15) --innodb_change_buffering=all

16) --innodb-spin-wait-delay=48

17) --innodb-stats-on-metadata=off

18) --innodb-buffer-pool-instances=12

19) --innodb-monitor-enable='%'

20) --max-tmp-tables=100

21) --performance-schema

22) --performance-schema-instrument='%=on'

23) --query-cache-size=0

24) --query-cache-type=0

25) --max-connections=4000

26) --max-prepared-stmt-count=1048576

27) --sort-buffer-size=32768

28) --table-open-cache=4000

29) --table-definition-cache=4000

30) --table-open-cache-instances=16

31) --tmp-table-size=100M

32) --max-heap-table-size=1000M

33) --key-buffer-size=50M

34) --join-buffer-size=1000000



Tuesday Feb 05, 2013

The Best MySQL Release Ever - MySQL 5.6 is now GA

MySQL 5.6 is now generally available. Read the press release.

Chock-full of new enhancements and features around performance, scalability and availability, MySQL 5.6 is the best MySQL release ever. Read Rob Young's blog article on the key enhancements in MySQL 5.6.

This is open source goodness all around.

Congratulations to the MySQL Engineering team on delivering a stellar product release yet again for the MySQL community and users!

Thursday Jan 24, 2013

MySQL 5.6: What's New in Performance, Scalability, Availability

With the MySQL 5.6 production-ready GA set for release in the coming days, it’s good to re-cap the key features that make 5.6 the best release of the database ever.  At a glance, MySQL 5.6 is simply a better MySQL with improvements that enhance every functional area of the database kernel, including:
  • Improved Security for worry-free application deployments
  • Better Performance and Scalability
    • Improved InnoDB storage engine for better transactional throughput
    • Improved Optimizer for better query execution times and diagnostics
  • Better Application Availability with Online DDL/Schema changes
  • Better Developer Agility with NoSQL Access with Memcached API to InnoDB
  • Improved Replication for high performance, self-healing distributed deployments
  • Improved Performance Schema for better instrumentation
  • And other Important Enhancements


Improved Security for worry-free deployments
Security is near and dear to every DBA and Sys Admin's heart.  With this in mind, MySQL 5.6 introduces a major overhaul to how passwords are internally handled and encrypted.  The new options and features include:

New alternative to password in master.info – MySQL 5.6 extends the replication START SLAVE command to enable DBAs to specify master user and password as part of the replication slave options and to authenticate the account used to connect to the master through an external authentication plugin (user defined or those provided under MySQL Enterprise Edition).  With these options the user and password no longer need to be exposed in plain text in the master.info file.
New encryption for passwords in general query log, slow query log, and binary log – Passwords in statements written to these logs are no longer recorded in plain text.
New password hashing with appropriate strength – Default password hashing for internal MySQL server authentication via PASSWORD() is now done using the SHA-256 password hashing algorithm using a random salt value.
New options for passwords on the command line – MySQL 5.6 introduces a new “scrambled” option/config file (.mylogin.cnf) that can be used to securely store user passwords that are used for command line operations.
New change password at next login – DBAs and developers can now control when account passwords must be changed via a new password_expired flag in the mysql.user table.
New policy-based Password validations – Passwords can now be validated for appropriate strength, length, mixed case, special chars, and other user defined policies based on LOW, MEDIUM and STRONG designation settings. 

Learn about these and all of MySQL 5.6 Security improvements and features, along with all technical documentation, in the MySQL docs

Better Performance and Scalability: Improved InnoDB Storage Engine

From an operational standpoint MySQL 5.6 provides better sustained linear performance and scale on systems supporting multi-processors and high CPU thread concurrency.  Key to this are improvements to Oracle’s InnoDB storage engine efficiency and concurrency that remove legacy thread contention and mutex locking within the InnoDB kernel.  These improvements enable MySQL to fully exploit the advanced multi-threaded processing power of today’s x86-based commodity-off-the-shelf hardware.

Internal benchmarks for SysBench Read/Write and Read Only workloads show a marked improvement in sustained scale over the most current version of MySQL 5.5.  The following shows that MySQL 5.6 provides “up and to the right” linear read/write transactions per second (“TPS”) scale on systems that support upwards of 48 concurrent CPU threads. 


Read only TPS workload sustained scale is also improved as demonstrated here:


Better Transactional Throughput

MySQL 5.6 improves InnoDB for better performance and scalability on highly concurrent, transactional and read intensive workloads.  In these cases performance gains are best measured by how an application performs and scales as concurrent user workloads grow.  In support of these use cases, InnoDB has a new re-factored architecture that minimizes mutex contentions and bottlenecks and provides a more consistent access path to underlying data.  Improvements include:

  • Kernel mutex split to remove a single point of contention
  • New thread for flushing operations
  • New multi-threaded purge
  • New adaptive hashing algorithm
  • Less buffer pool contention
  • Better, more consistent query execution via persistent optimizer statistics that are collected at more regular, predictable intervals

The net result of these improvements is reflected in the SysBench read/write benchmarks shown here: 


For Linux, MySQL 5.6 shows up to a 150% improvement in transactional TPS throughput over MySQL 5.5, while similar tests run on Windows 2008 reveal a 47% performance gain. 

Better Read Only Workload Throughput
New optimizations have been made to InnoDB for read only transactions that greatly improve the performance of high concurrency web-based lookups and report-generating applications.  These optimizations bypass transactional overhead and are enabled by default when autocommit = 1, or can be atomically controlled by the developer using the new START_TRANSACTION_READ_ONLY syntax:

SET autocommit = 0;
START_TRANSACTION_READ_ONLY;
SELECT c FROM T1 WHERE id=N;
COMMIT;

The results of these optimizations are shown here:


For Linux, MySQL 5.6 shows up to a 230% improvement in read only TPS throughput over MySQL 5.5, while similar tests run on Windows 2008 show a 65% performance gain.

For context, all benchmarks shown above were run on the following platform configuration:

  • Oracle Linux 6
  • Intel(R) Xeon(R) E7540 x86_64
  • MySQL leveraging:
    • 48 of 96 available CPU threads
    • 2 GHz, 512GB RAM

The SysBench benchmark tool is freely available for application use-case specific benchmarks and can be downloaded here.

You can also get in depth MySQL 5.6 performance and feature specific benchmarks by following related blogs by Mikael Ronstrom and Dimitri Kravtchuk.  Both share the test cases and configurations they use to arrive at the conclusions drawn above.

Better Performance with Solid State Drives (SSD)
Spinning disks are among the most common bottlenecks on any system, simply because they have mechanical parts that physically limiit the ability to scale as concurrency grows.  As a result, many MySQL applications are being deployed on SSD enabled systems which provide the memory-based speed and reliability required to support the highest levels of concurrency on today’s web-based systems.  With this in mind, MySQL 5.6 includes several key enhancements designed specifically for use with SSD, including:

  • Support for smaller 4k and 8k page sizes to better fit the standard storage algorithm of SSD.
  • Portable .ibd (InnoDB data) files that allow “hot” InnoDB tables to be easily moved from the default data directory to SSD or network storage devices.
  • Separate tablespaces for the InnoDB unlog log that optionally moves the undo log out of the system tablespace into one or more separate tablespaces.  The read-intensive I/O patterns for the undo log make these new tablespaces good candidates to move to SSD storage, while keeping the system tablespace on hard drive storage.

Learn about all supporting SSD optimizations here.

Better Query Execution Times and Diagnostics: Improved Optimizer
The MySQL 5.6 Optimizer has been re-factored for better efficiency and performance and provides an improved feature set for better query execution times and diagnostics.  They key 5.6 optimizer improvements include:

Subquery Optimizations – Using semi-JOINs and materialization, the MySQL Optimizer delivers greatly improved subquery performance, simplifying how developers construct queries.  Specifically, the optimizer is now more efficient in handling subqueries in the FROM clause; materialization of subqueries in the FROM clause is now postponed until their contents are needed during execution, greatly improving performance.  Additionally, the optimizer may add an index to derived tables during execution to speed up row retrieval.  Tests run using the DBT-3 benchmark Query #13, shown below, demonstrate an order of magnitude improvement in execution times (from days to seconds) over previous versions.

select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
from customer, orders, lineitem
where o_orderkey in (
                select l_orderkey
                from lineitem
                group by l_orderkey
                having sum(l_quantity) > 313
  )
  and c_custkey = o_custkey
  and o_orderkey = l_orderkey
group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
order by o_totalprice desc, o_orderdate
LIMIT 100;

File Sort Optimizations with Small Limit – For queries with ORDER BY and small LIMIT values, the optimizer now produces an ordered result set using a single table scan.  These queries are common in web applications that display only a few rows from a large result set such as:

SELECT col1, ... FROM t1 ... ORDER BY name LIMIT 10;

Internal benchmarks have shown up to a 4x improvement in query execution times which helps improve overall user experience and response times. 

Index Condition Pushdown (ICP) – By default, the optimizer now pushes WHERE conditions down to the storage engine for evaluation, table scan and return of ordered result set to the MySQL server. 

CREATE TABLE person (
      personid INTEGER PRIMARY KEY,
      firstname CHAR(20),
      lastname CHAR(20),
      postalcode INTEGER,
      age INTEGER,
      address CHAR(50),
      KEY k1 (postalcode,age)‏
   ) ENGINE=InnoDB;

SELECT lastname, firstname FROM person
   WHERE postalcode BETWEEN 5000 AND 5500 AND age BETWEEN 21 AND 22; 


Internal benchmarks on this type of table and query have shown up to 15x improved execution times with the ICP default behavior. 

Batched Key Access (BKA) and Multi-Range Read (MRR) – The optimizer now provides the storage engine with all primary keys in batches and enables the storage engine to access, order and return the data more efficiently greatly improving query execution times. 



Together, BKA and MRR show up to 280x improvement in query execution times for DBT-3 Query 13 and other disk-bound query benchmarks.

Better Optimizer Diagnostics – The MySQL 5.6 optimizer also provides better diagnostics and debugging with:

  • EXPLAIN for INSERT, UPDATE, and DELETE operations,
  • EXPLAIN plan output in JSON format with more precise optimizer metrics and better readability
  • Optimizer Traces for tracking the optimizer decision-making process.


Learn about all of MySQL 5.6 Optimizer improvements and features, in the MySQL docs.

For a deep technical dive into the implementation, how to enable/disable where applicable, related benchmarks and the use case specific performance improvements you can expect with each of these new features check out the MySQL Optimizer Engineering team blog.

Better Application Availability: Online DDL/Schema Changes
Today's web-based applications are designed to rapidly evolve and adapt to meet business and revenue-generation requirements. As a result, development SLAs are now most often measured in minutes vs days or weeks. So when an application must quickly support new product lines or new products within existing product lines, the backend database schema must adapt in kind, most commonly while the application remains available for normal business operations.  MySQL 5.6 supports this level of online schema flexibility and agility by providing the following new ALTER TABLE DDL syntax additions:

CREATE INDEX
DROP INDEX
Change AUTO_INCREMENT value for a column
ADD/DROP FOREIGN KEY
Rename COLUMN
Change ROW FORMAT, KEY_BLOCK_SIZE for a table
Change COLUMN NULL, NOT_NULL
Add, drop, reorder COLUMN


DBAs and Developers can add indexes and perform standard InnoDB table alterations while the database remains available for application updates. This is especially beneficial for rapidly evolving applications where developers need schema flexibility to accommodate changing business requirements.

Learn about all of MySQL 5.6 InnoDB online DDL improvements and features, in the MySQL docs.

Better Developer Agility: NoSQL Access to InnoDB
MySQL 5.6 provides simple, key-value interaction with InnoDB data via the familiar Memcached API.  Implemented via a new Memcached daemon plug-in to mysqld, the new Memcached protocol is mapped directly to the native InnoDB API and enables developers to use existing Memcached clients to bypass the expense of query parsing and go directly to InnoDB data for lookups and transactional compliant updates.  The API makes it possible to re-use standard Memcached libraries and clients, while extending Memcached functionality by integrating a persistent, crash-safe, transactional database back-end.  The implementation is shown here:


So does this option provide a performance benefit over SQL?  Internal performance benchmarks using a customized Java application and test harness show some very promising results with a 9X improvement in overall throughput for SET/INSERT operations:


Not only do developers and DBAs get more performance and flexibility, they also reduce complexity as it is possible to compress previously separate caching and database layers into a single data management tier, as well as eliminate the overhead of maintaining cache consistency.

You can follow the InnoDB team blog for the methodology, implementation and internal test cases that generated the above results.

Learn more about the details and how to get started with the new Memcached API to InnoDB in the MySQL docs.

Better Developer Agility: Extended InnoDB Use Cases
New MySQL 5.6 optimizations and features extend InnoDB into more use cases so developers can simplify applications by standardizing on a single storage engine.

New Full Text Search (FTS) – Provided as a better alternative to MyISAM FTS, InnoDB now enables developers to build FULLTEXT indexes on InnoDB tables to represent text-based content and speed up application searches for words and phrases.  InnoDB full-text search supports Natural language/Boolean modes, proximity search and relevance ranking.  A simple use case example looks like: 

CREATE TABLE quotes
(id int unsigned auto_increment primary key
, author varchar(64)
, quote varchar(4000)
, source varchar(64)
, fulltext(quote)
) engine=innodb;

SELECT author AS “Apple" FROM quotes
    WHERE match(quote) against (‘apple' in natural language mode);


New Transportable Tablespaces – InnoDB .ibd files created in file-per-table mode are now transportable between physical storage devices and database servers; when creating a table developers can now designate a storage location for the .idb file outside of the MySQL data directory.  This enables “hot” or busy tables to be easily moved to an external network storage device (SSD, HDD) that does not compete with application or database overhead. This new feature also enables quick, seamless application scale by allowing users to easily export/import InnoDB tables between running MySQL servers, as shown here:

Example Export:
CREATE TABLE t(c1 INT) engine=InnoDB;
FLUSH TABLE t FOR EXPORT; -- quiesce the table and create the meta data file
$innodb_data_home_dir/test/t.cfg
UNLOCK TABLES;


Corresponding Import:
CREATE TABLE t(c1 INT) engine=InnoDB; -- if it doesn't already exist
ALTER TABLE t DISCARD TABLESPACE;
-- The user must stop all updates on the tables, prior to the IMPORT
ALTER TABLE t IMPORT TABLESPACE;


The InnoDB improvements noted here are by no means exhaustive.  The complete accounting of all MySQL 5.6 features, along with all technical documentation, is available in the MySQL docs.

For a deep technical dive into the implementation, how to enable/disable where applicable and the use case specific improvements you can expect with each of these new features follow the MySQL InnoDB Engineering team blog.

Improved Replication and High Availability
Replication is the most widely used MySQL feature for scale-out and High Availability (HA) and MySQL 5.6 includes new features designed to enable developers building next generation web, cloud, social and mobile applications and services with self-healing replication topologies and high performance master and slaves.  The key features include:

New Global Transactions Identifiers (GTIDs) – GTIDs enable replication transactional integrity to be tracked through a replication master/slave topology, providing a foundation for self-healing recovery, and enabling DBAs and developers to easily identify the most up to date slave in the event of a master failure.  Built directly into the Binlog stream, GTIDs eliminate the need for complex third-party add-ons to provide this same level of tracking intelligence.



New MySQL Replication utilities – A new set of Python Utilities are designed to leverage the new replication GTIDs to provide replication administration and monitoring with automatic fail-over in the event of a failed master, or switchover in the event of maintenance to the master. This eliminates the need for additional third party High-Availability solutions, protecting web and cloud-based services against both planned and unplanned downtime without operator intervention.

New Multi-threaded Slaves - Splits processing between worker threads based on schema, allowing updates to be applied in parallel, rather than sequentially. This delivers benefits to those workloads that isolate application data using databases - e.g. multi-tenant systems. 



SysBench benchmarks using a graduated number of worker threads across 10 schemas show up to 5x in performance gain with multi-threading enabled. 

New Binary Log Group Commit (BGC) – In MySQL 5.6 replication masters now group writes to the Binlog rather than committing them one at a time, significantly improving performance on the master side of the topology.  BGC also enables finer grained locking which reduces lock waits, again, adding to the performance gain, shown here:



MySQL 5.6 shows up to a 180% performance gain over 5.5 in master server throughput with replication enabled (Binlog=1). BGC largely eliminates the trade-off users had to make between performance overhead to the master and the scale-out, HA benefits offered by MySQL replication.

New Optimized Row-based Replication – MySQL 5.6 provides a new option variable binlog-row-image=minimal that enables applications to replicate only data elements of the row image that have changed following DML operations.  This improves replication throughput for both the master and slave(s) and minimizes binary log disk space, network resource and server memory footprint.
New Crash-Safe Slaves – MySQL 5.6 stores Binlog positional data within tables so slaves can automatically roll back replication to the last committed event before a failure, and resume replication without administrator intervention. Not only does this reduce operational overhead, it also eliminates the risk of data loss caused by a slave attempting to recover from a corrupted data file.  Further, if a crash to the master causes corruption of the binary log, the server will automatically recover it to a position where it can be read correctly.
New Replication Checksums – MySQL 5.6 ensure the integrity of data being replicated to a slave by detecting data corruption and returning an error before corrupt events are applied to the slave, preventing the slave itself from becoming corrupt.
New Time-delayed Replication – MySQL 5.6 provides protection against operational errors made on the master from propagating to attached slaves by allowing developers to add defined delays in the replication stream.  With configurable master to slave time delays, in the event of failure or mishap, slaves can be promoted to the new master in order to restore the database to its previous state. It also becomes possible to inspect the state of a database before an error or outage without the need to reload a back up.

Learn about these and all of MySQL 5.6 Replication and High Availability improvements and features, along with all technical documentation, in the MySQL docs
For a rundown of the details, use cases and related benchmarks of all of these features check out Mat Keep’s Developer Zone article.

Improved Performance Schema
The MySQL Performance Schema was introduced in MySQL 5.5 and is designed to provide point in time metrics for key performance indicators.  MySQL 5.6 improves the Performance Schema in answer to the most common DBA and developer problems.  New instrumentation includes:

Statements/Stages -  What are my most resource intensive queries? Where do they spend time?
Table/Index I/O, Table Locks - Which application tables/indexes cause the most load or contention?
Users/Hosts/Accounts - Which application users, hosts, accounts are consuming the most resources?
Network I/O - What is the network load like? How long do sessions idle?
Summaries - Aggregated statistics grouped by statement, thread, user, host, account or object.

The MySQL 5.6 Performance Schema is now enabled by default in the my.cnf file with optimized and auto-tune settings that minimize overhead (< 5%, but mileage will vary), so using the Performance Schema a production server to monitor the most common application use cases is less of an issue.  In addition, new atomic levels of instrumentation enable the capture of granular levels of resource consumption by users, hosts, accounts, applications, etc. for billing and chargeback purposes in cloud computing environments.

MySQL Engineering has several champions behind the 5.6 Performance Schema, and many have published excellent blogs that you can reference for technical and practical details.  To get started see blogs by Mark Leith and Marc Alff.

The MySQL docs are also an excellent resource for all that is available and that can be done with the 5.6 Performance Schema. 


Other Important Enhancements
New default configuration optimizations – MySQL 5.6 introduces changes to the server defaults that provide better out-of-the-box performance on today’s system architectures.  These new defaults are designed to minimize the upfront time spent on changing the most commonly updated variables and configuration options.   Many configuration options are now auto sized based on environment, and can also be set and controlled at server start up.

Improved TIME/TIMESTAMP/DATETIME Data Types:

  • TIME/TIMESTAMP/DATETIME – Now allow microsecond level precision for more precise time/date comparisons and data selection.
  • TIMESTAMP/DATETIME – Improves on 5.5. by allowing developers to assign the current timestamp, an auto-update value, or both, as the default value for TIMESTAMP and DATETIME columns, the auto-update value, or both.
  • TIMESTAMP - Columns are now nullable by default.  TIMESTAMP columns no longer get DEFAULT NOW() or ON UPDATE NOW() attributes automatically without them being explicitly specified and non-NULLable TIMESTAMP columns without explicit default value treated as having no default value.

Better Condition Handling – GET DIAGNOSTICS
MySQL 5.6 enables developers to easily check for error conditions and code for exceptions by introducing the new MySQL Diagnostics Area and corresponding GET DIAGNOSTICS interface command. The Diagnostic Area can be populated via multiple options and provides 2 kinds of information:

  • Statement - which provides affected row count and number of conditions that occurred
  • Condition - which provides error codes and messages for all conditions that were returned by a previous operation

The addressable items for each are:



The new GET DIAGNOSTICS command provides a standard interface into the Diagnostics Area and can be used via the CLI or from within application code to easily retrieve and handle the results of the most recent statement execution:

mysql> DROP TABLE test.no_such_table;
ERROR 1051 (42S02): Unknown table 'test.no_such_table'
mysql> GET DIAGNOSTICS CONDITION 1
-> @p1 = RETURNED_SQLSTATE, @p2 = MESSAGE_TEXT;
mysql> SELECT @p1, @p2;
+-------+------------------------------------+
| @p1   | @p2                                |
+-------+------------------------------------+
| 42S02 | Unknown table 'test.no_such_table' |
+-------+------------------------------------+


Options for leveraging the MySQL Diagnostics Area are detailed here. You can learn more about GET DIAGNOSTICS here.  

Improved IPv6 Support

  • MySQL 5.6 improves INET_ATON() to convert and store string-based IPv6 addresses as binary data for minimal space consumption.
  • MySQL 5.6 changes the default value for the bind-address option from “0.0.0.0” to “0::0” so the MySQL server accepts connections for all IPv4 and IPv6 addresses.  You can learn more here.

Improved Partitioning

  • Improved performance for tables with large number of partitions – MySQL 5.6 now performs and scales on highly partitioned systems, specifically for INSERT operations that span upwards of hundreds of partitions.
  • Import/export tables to/from partitioned tables - MySQL 5.6 enables users to exchange a table partition or sub-partition with a table using the ALTER TABLE ... EXCHANGE PARTITION statement; existing rows in a partition or subpartition can be moved to a non-partitioned table, and conversely, any existing rows in a non-partitioned table can be moved to an existing table partition or sub-partition. 
  • Explicit partition selection - MySQL 5.6 supports explicit selection of partitions and subpartitions that are checked for rows matching a given WHERE condition. Similar to automatic partition pruning, the partitions to be checked are specified/controlled by the issuer of the statement, and is supported for both queries and a number of DML statements (SELECT, DELETE, INSERT, REPLACE, UPDATE, LOAD DATA, LOAD XML). 


Improved GIS: Precise spatial operations - MySQL 5.6 provides geometric operations via precise object shapes that conform to the OpenGIS standard for testing the relationship between two geometric values. 

Conclusion

MySQL 5.5 has been called the best release of MySQL ever.  MySQL 5.6 builds on this by providing across the board improvements in performance, scalability, transactional throughput, availability and performance related instrumentation all designed to keep pace with requirements of the most demanding web, cloud and embedded use cases. The MySQL 5.6 Release Candidate is now available for download for early adopter and development purposes.

Next Steps

As always, thanks for reading, and thanks for your continued support of MySQL!


Tuesday Jan 08, 2013

Deep Dive into GTIDs and MySQL 5.6 - What, Why and How

Global Transaction Identifiers (GTIDs) are one of the key replication enhancements in MySQL 5.6. GTIDs make it simple to track and compare replication across a master - slave topology. This enables:

- Much simpler recovery from failures of the master,

- Introduces great flexibility in the provisioning and on-going management of multi-tier or ring (circular) replication topologies.

A new on-demand MySQL 5.6 GTID webinar delivered by the replication engineering team is now available, providing deep insight into the design and implementation of GTIDs, and how they enable users to simplify MySQL scaling and HA. The webinar covers:

- Concepts: What is a GTID? How does the server generate GTIDs? What is the life cycle of GTIDs? How are GTIDs used to connect to a master?

- Handling conflicts

- How to skip transactions using GTIDs

- What happens when binary logs are purged

- How to provision a new slave or restore from a backup

- MySQL utilities for automated failover and controlled switchover

To whet your appetite, an extract of the Q&A from the webinar is as follows. These, and many other questions were answered during the session:

Q. Which versions of MySQL support GTIDs?

A. MySQL 5.6.5 and above

Q. Is GTID ON by default in 5.6?

A. It is OFF by default

Q. What does the GTID contain?

A. It is made up of a unique ID for the server followed by an ever-increasing counter that's specific to that server

Q: Do GTIDs introduce any increased space requirements?

A: Yes, since GTIDs are stored in the binary log, the binary logs will be larger. However, we expect the overhead to be relatively small. GTIDs are written to the binary log in two places:

(1) A small header is stored in the beginning of the binary log. This contains the variable @@gtid_purged, i.e., a list of previously logged GTIDS. Since the list is range-compressed, this is expected to be small: a small fixed-size header plus 40 bytes times the number of master servers in your topology.

(2) A fixed size header is added before each transaction in the binary log. This is 44 bytes, so will typically be small compared to the transaction itself.

Q. I understand GTID's are associated with Transactions. How do they map to the events within each transaction, or do GTID's map as an event itself in a binlog file?

A. Yes, GTIDs are associated with transactions. In the binary log, the GTID is realized as an event written prior to the events that constitute the transaction. The event is called a Gtid_log_event.

Q What if a transaction spans a filtered out table and a non-filtered out table? How does it get recorded on the slave?

A. If the filters are on the master, then a partly logged transaction will be replicated with its GTID.

If filtering on the slave side, a partial image will be processed on the slave and the original GTID is logged (to the slave's binlog) with the processed transaction.

Q. Prior to GTID, to build a new slave, we use mysqldump --master-data=1 to get the slave starting sync point in the dump. With GTID enabled, does it set the gtid_executed / purged in the dump instead?

A. Yes, mysqldump will detect that the server uses GTIDs and output a SET GTID_PURGED statement. (And there is an option to turn off that, e.g., in case you want to execute the output on an old server).

Q. How do GTIDs enable failover and recovery?

A. GTIDs are using in combination with the MySQL utilities. The mysqlfailover and rpladmin utilities provide administration of GTID-enabled slaves, enabling monitoring with automatic failover and on-demand switchover, coupled with slave promotion. GTIDs make it straightforward to reliably failover from the master to the most current slave automatically in the event of a failure. DBAs no longer need to manually analyze the status of each of their slaves to identify the most current when seeking a target to promote to the new master.

Resources to Get Started

In addition to the webinar, here are some other key resources that will give you the detail you need to take advantage of GTIDs in your most important MySQL workloads:

- Engineering blog: Global Transaction Identifiers – why, what, and how

- Engineering blog: Advanced use of GTIDs

- Documentation: Setting up replication with GTIDs 

- Video Tutorial: MySQL replication utilities for auto-failover and switchover 

- Engineering Blog: Controlling read consistency with GTIDs 

If you have any comments, questions or feature requests, don't hesitate to leave a comment on this blog

Wednesday Dec 05, 2012

Top 5 Developer Enabling Nuggets in MySQL 5.6

MySQL 5.6 is truly a better MySQL and reflects Oracle's commitment to the evolution of the most popular and widely
used open source database on the planet.  The feature-complete 5.6 release candidate was announced at MySQL Connect in late September and the production-ready, generally available ("GA") product should be available in early 2013.  

While the message around 5.6 has been focused mainly on mass appeal, advanced topics like performance/scale, high availability, and self-healing replication clusters, MySQL 5.6 also provides many developer-friendly nuggets that
are designed to enable those who are building the next generation of web-based and embedded applications and services. Boiling down the 5.6 feature set into a smaller set, of simple, easy to use goodies designed with developer agility in mind, these things deserve a quick look:

Subquery Optimizations

Using semi-JOINs and late materialization, the MySQL 5.6 Optimizer delivers greatly improved subquery performance. Specifically, the optimizer is now more efficient in handling subqueries in the FROM clause; materialization of subqueries in the FROM clause is now postponed until their contents are needed during execution. Additionally, the optimizer may add an index to derived tables during execution to speed up row retrieval. Internal tests run using the DBT-3 benchmark Query #13, shown below, demonstrate an order of magnitude improvement in execution times (from days to seconds) over previous versions.

select c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice, sum(l_quantity)
from customer, orders, lineitem
where o_orderkey in (
                select l_orderkey
                from lineitem
                group by l_orderkey
                having sum(l_quantity) > 313
  )
  and c_custkey = o_custkey
  and o_orderkey = l_orderkey
group by c_name, c_custkey, o_orderkey, o_orderdate, o_totalprice
order by o_totalprice desc, o_orderdate
LIMIT 100;


What does this mean for developers?  For starters, simplified subqueries can now be coded instead of complex joins for cross table lookups:

SELECT title FROM film WHERE film_id IN

(SELECT film_id FROM film_actor

GROUP BY film_id HAVING count(*) > 12);

And even more importantly subqueries embedded in packaged applications no longer need to be re-written into joins.  This is good news for both ISVs and their customers who have access to the underlying queries and who have spent development cycles writing, testing and maintaining their own versions of re-written queries across updated versions of a packaged app.

The details are in the MySQL 5.6 docs.

Online DDL Operations

Today's web-based applications are designed to rapidly evolve and adapt to meet business and revenue-generation
requirements. As a result, development SLAs are now most often measured in minutes vs days or weeks. For example, when an application must quickly support new product lines or new products within existing product lines, the backend database schema must adapt in kind, and most commonly while the application remains available for normal business operations.  MySQL 5.6 supports this level of online schema flexibility and agility by providing the following new ALTER TABLE online DDL syntax additions: 

  • CREATE INDEX
  • DROP INDEX
  • Change AUTO_INCREMENT value for a column
  • ADD/DROP FOREIGN KEY
  • Rename COLUMN
  • Change ROW FORMAT, KEY_BLOCK_SIZE for a table
  • Change COLUMN NULL, NOT_NULL
  • Add, drop, reorder COLUMN

Again, the details are in the MySQL 5.6 docs.

Key-value access to InnoDB via Memcached API

Many of the next generation of web, cloud, social and mobile applications require fast operations against simple
Key/Value pairs. At the same time, they must retain the ability to run complex queries against the same data, as
well as ensure the data is protected with ACID guarantees. With the new NoSQL API for InnoDB, developers have all
the benefits of a transactional RDBMS, coupled with the performance capabilities of Key/Value store.

MySQL 5.6 provides simple, key-value interaction with InnoDB data via the familiar Memcached API.  Implemented via a new Memcached daemon plug-in to mysqld, the new Memcached protocol is mapped directly to the native InnoDB API and enables developers to use existing Memcached clients to bypass the expense of query parsing and go directly to InnoDB data for lookups and transactional compliant updates.  The API makes it possible to re-use standard Memcached libraries and clients, while extending Memcached functionality by integrating a persistent, crash-safe, transactional database back-end.  The implementation is shown here:



So does this option provide a performance benefit over SQL?  Internal performance benchmarks using a customized
Java application and test harness show some very promising results with a 9X improvement in overall throughput for SET/INSERT operations:


You can follow the InnoDB team blog for the methodology, implementation and internal test cases that generated these results here.

How to get started with Memcached API to InnoDB is here.

New Instrumentation in Performance Schema

The MySQL Performance Schema was introduced in MySQL 5.5 and is designed to provide point in time metrics for key performance indicators.  MySQL 5.6 improves the Performance Schema in answer to the most common DBA and Developer problems.  New instrumentations include:

  • Statements/Stages
    • What are my most resource intensive queries? Where do they spend time?
  • Table/Index I/O, Table Locks
    • Which application tables/indexes cause the most load or contention?
  • Users/Hosts/Accounts
    • Which application users, hosts, accounts are consuming the most resources?
  • Network I/O
    • What is the network load like? How long do sessions idle?
  • Summaries
    • Aggregated statistics grouped by statement, thread, user, host, account or object.

The MySQL 5.6 Performance Schema is now enabled by default in the my.cnf file with optimized and auto-tune settings that minimize overhead (< 5%, but mileage will vary), so using the Performance Schema on
a production server to monitor the most common application use cases is less of an issue.  In addition, new atomic levels of instrumentation enable the capture of granular levels of resource consumption by users, hosts, accounts, applications, etc. for billing and chargeback purposes in cloud computing environments.

The MySQL docs are an excellent resource for all that is available and that can be done with the 5.6 Performance Schema.

Better Condition Handling - GET DIAGNOSTICS

MySQL 5.6 enables developers to easily check for error conditions and code for exceptions by introducing the new
MySQL Diagnostics Area and corresponding GET DIAGNOSTICS interface command. The Diagnostic Area can be populated via multiple options and provides 2 kinds of information:

Statement - which provides affected row count and number of conditions that occurred
Condition - which provides error codes and messages for all conditions that were returned by a previous operation

The addressable items for each are:


The new GET DIAGNOSTICS command provides a standard interface into the Diagnostics Area and can be used via the CLI or from within application code to easily retrieve and handle the results of the most recent statement execution.  An example of how it is used might be:

mysql> DROP TABLE test.no_such_table;
ERROR 1051 (42S02): Unknown table 'test.no_such_table'
mysql> GET DIAGNOSTICS CONDITION 1
-> @p1 = RETURNED_SQLSTATE, @p2 = MESSAGE_TEXT;
mysql> SELECT @p1, @p2;
+-------+------------------------------------+
| @p1   | @p2                                |
+-------+------------------------------------+
| 42S02 | Unknown table 'test.no_such_table' |
+-------+------------------------------------+


Options for leveraging the MySQL Diagnotics Area and GET DIAGNOSTICS are detailed in the MySQL Docs.

While the above is a summary of some of the key developer enabling 5.6 features, it is by no means exhaustive. You can dig deeper into what MySQL 5.6 has to offer by reading this developer zone article or checking out "What's New in MySQL 5.6" in the MySQL docs.

BONUS ALERT!  If you are developing on Windows or are considering MySQL as an alternative to SQL Server for your next project, application or shipping product, you should check out the MySQL Installer for Windows.  The installer includes the MySQL 5.6 RC database, all drivers, Visual Studio and Excel plugins, tray monitor and development tools all a single download and GUI installer.  

So what are your next steps?

  • Register for Dec. 12 "MySQL Replication: Simplifying Scaling and HA with GTIDs" live web event to learn about the new Replication features in MySQL 5.6.
  • Register for Dec. 13 "MySQL 5.6: Building the Next Generation of Web-Based Applications and Services" live web event.  Hurry!  Seats are limited.
  • Download the MySQL 5.6 Release Candidate (look under the Development Releases tab)
  • Provide Feedback <link to http://bugs.mysql.com/>
  • Join the Developer discussion on the MySQL Forums
  • Explore all MySQL Products and Developer Tools

As always, thanks for your continued support of MySQL!

Friday Sep 21, 2012

MySQL Connect 8 Days Away - Replication Sessions

Following on from my post about MySQL Cluster sessions at the forthcoming Connect conference, its now the turn of MySQL Replication - another technology at the heart of scaling and high availability for MySQL.

Unless you've only just returned from a 6-month alien abduction, you will know that MySQL 5.6 includes the largest set of replication enhancements ever packaged into a single new release:

- Global Transaction IDs + HA utilities for self-healing cluster..(yes both automatic failover and manual switchover available!)

- Crash-safe slaves and binlog

- Binlog Group Commit and Multi-Threaded Slaves for high performance

- Replication Event Checksums and Time-Delayed replication

- and many more

There are a number of sessions dedicated to learn more about these important new enhancements, delivered by the same engineers who developed them. Here is a summary

Saturday 29th, 13.00

Replication Tips and Tricks, Mats Kindahl

In this session, the developers of MySQL Replication present a bag of useful tips and tricks related to the MySQL 5.5 GA and MySQL 5.6 development milestone releases, including multisource replication, using logs for auditing, handling filtering, examining the binary log, using relay slaves, splitting the replication stream, and handling failover.


Saturday 29th, 17.30

Enabling the New Generation of Web and Cloud Services with MySQL 5.6 Replication, Lars Thalmann

This session showcases the new replication features, including

• High performance (group commit, multithreaded slave)

• High availability (crash-safe slaves, failover utilities)

• Flexibility and usability (global transaction identifiers, annotated row-based replication [RBR])

• Data integrity (event checksums)


Saturday 29th, 1900

MySQL Replication Birds of a Feather

In this session, the MySQL Replication engineers discuss all the goodies, including global transaction identifiers (GTIDs) with autofailover; multithreaded, crash-safe slaves; checksums; and more. The team discusses the design behind these enhancements and how to get started with them. You will get the opportunity to present your feedback on how these can be further enhanced and can share any additional replication requirements you have to further scale your critical MySQL-based workloads.


Sunday 30th, 10.15

Hands-On Lab, MySQL Replication, Luis Soares and Sven Sandberg

But how do you get started, how does it work, and what are the best practices and tools? During this hands-on lab, you will learn how to get started with replication, how it works, architecture, replication prerequisites, setting up a simple topology, and advanced replication configurations. The session also covers some of the new features in the MySQL 5.6 development milestone releases.


Sunday 30th, 13.15

Hands-On Lab, MySQL Utilities, Chuck Bell

Would you like to learn how to more effectively manage a host of MySQL servers and manage high-availability features such as replication? This hands-on lab addresses these areas and more. Participants will get familiar with all of the MySQL utilities, using each of them with a variety of options to configure and manage MySQL servers.


Sunday 30th, 14.45

Eliminating Downtime with MySQL Replication, Luis Soares

The presentation takes a deep dive into new replication features such as global transaction identifiers and crash-safe slaves. It also showcases a range of Python utilities that, combined with the Release 5.6 feature set, results in a self-healing data infrastructure. By the end of the session, attendees will be familiar with the new high-availability features in the whole MySQL 5.6 release and how to make use of them to protect and grow their business.


Sunday 30th, 17.45

Scaling for the Web and the Cloud with MySQL Replication, Luis Soares

In a Replication topology, high performance directly translates into improving read consistency from slaves and reducing the risk of data loss if a master fails. MySQL 5.6 introduces several new replication features to enhance performance. In this session, you will learn about these new features, how they work, and how you can leverage them in your applications. In addition, you will learn about some other best practices that can be used to improve performance.

So how can you make sure you don't miss out - the good news is that registration is still open ;-)

And just to whet your appetite, listen to the On-Demand webinar that presents an overview of MySQL 5.6 Replication.  

Wednesday May 23, 2012

MySQL 5.6 Replication: FAQ

On Wednesday May 16th, we ran a webinar to provide an overview of all of the new replication features and enhancements that are previewed in the MySQL 5.6 Development Release – including Global Transaction IDs, auto-failover and self-healing, multi-threaded, crash-safe slaves and more.

Collectively, these new capabilities enable MySQL users to scale for next generation web and cloud applications.

Attendees posted a number of great questions to the MySQL developers, serving to provide additional insights into how these new features are implemented. So I thought it would be useful to post those below, for the benefit of those unable to attend the live webinar (note, you can listen to the On-Demand replay which is available now).

Before getting to the Q&A, there are a couple of other resources that maybe useful to those wanting to learn more about Replication in MySQL 5.6

On-Demand webinar

Slides used during the webinar

For more detail on any of the features discussed below, be sure to check out the Developer Zone article: Replication developments in MySQL 5.6

So here is the Q&A from the event 

Multi-Threaded Slaves

The results from recent benchmarking of the Multi-Threaded Slave enhancement were discussed, prompting the following questions

Q. Going from 0 - 10 threads, did you notice any increase in IOwait on CPU?

A. In this case, yes. The actual amount depends on a number of factors: your hardware, query/transaction distribution across databases, server general and InnoDB specific configuration parameters.

Q. Will the multiple threads work on different transactions on the same database, or each thread works on a separate database?

A. Each worker thread works on separate databases (schemas).

Q. If I set the slave-parallel-workers to less than the number of databases, can I configure which databases use which worker threads?

A. There is no such configuration option to assign a certain Worker thread to a database. Configuring slave-parallel-workers to less than the number of databases is a good setup. A Worker can handle multiple databases.

Q. If I create 4 threads and I have 100 databases, can I configure which databases use which threads?

A. There won't be 1 worker to a mapping of exactly 25 databases, but it will be very close to that type of distribution.

Q. Thank You. I ask as we have about 8 databases that have a lot of transactions and about 30 that are used less frequently. It would be nice to have the ability to create 9 workers, 1 for each if the heavy databases and 1 for all the others

A. The current design enables relatively equal distribution of transactions across your databases, but it could be that 9 workers will fit anyway.

Q. Does multi-thread slave still work if there is foreign key across database?

A. MTS preserves slave consistency *within* a database, but not necessarily *between* databases. With MTS enabled and replication ongoing, updates to one database can be executed before updates to another causing them to be temporarily out of sync with each other.

Q. How is auto-increment managed with the multi-thread slave?

A. MTS makes sure Worker threads do not execute concurrently on the same table. Auto-increment is guaranteed to be handled in the same way as the single threaded "standard" slave handles auto-increment

Q. Can you use semi-synchronous replication on one of the slaves when using MTS?

A. Semi-sync is friendly to MTS. MTS is about parallel execution - so they cooperate well.

Optimized Row Based Replication

Q. If you only store the PK column in the Before Image for updates, does this mean you don't care about the slave's data potentially being out-of-sync? Will we be able to control how much data is stored in the binary logs?

A. The rule is that we ship a *PK equivalent* so that the slave is always able to find a row. This means:
  1. if master table has PK, then we ship the PK only
  2. if master table does not have PK, then we ship the entire row

Global Transaction IDs and HA Utilities

Q. Would the failover utility need to sit on a 3rd party host to allow arbitration?

A. The utility would typically run on a client, not on the hosts it is monitoring

Q. Can you explain the upgrade process to move to MySQL 5.6? I am assuming that the slave(s) are upgraded first and that replication would be backwards compatible. And after the slave(s) are upgraded, the master would be next. But then how do you turn on the GTID?

A. Right: the slave(s) are upgraded first. After upgrading Slaves they would be started with --gtid-mode=on (technically a couple of other options are needed as well). And then the same process would be followed for the upgraded Master.

Q. For failover functionality, if I had a setup like this: Server1 replicates to Server2, Server2 replicates to Server3, Server4, and Server5. If Server2 were to fail, can I have it configured so that Server1 can become the new master for Server3/4/5?

A. Yes - you can either configure the failover to the most recent slave based on GTID, or list specific candidates. What will happen is that Slave 1 will temporarily become a slave of 3, 4 and 5 to ensure it replicates any more recent transactions those slaves may have, and then it will become the master

Replication Event Checksums

Q. What is the overhead of replication checksums?

A. It is minimal - we plan to publish benchmarks over the summer to better characterize any overhead

Q. Are you exposing the checksum to the plugin api so we can add our own checksum types?

A. We prepared some of interfaces, i.e. checksum methods are identified by a specific code byte (1-127) values. But it's not really a plugin at this point.

Q. Do checksums verify both replicated data and the data on slave?

A. Yes - checksums are implemented across the entire path - so you can check for issues in replication itself, or in the hardware or network

Q. I think, it is better to turn on checksums at the relaylog to avoid overhead on the master, but if we do that and checksum fails (i.e. not matching the master's data) then what happens – will the slave throw an error

A. I agree, it's better to relax the Master, which verifies the checksum only optionally when the Dump Thread reads from the binlog prior to the replication event being sent out to the Slave. The slave mandatorily checks the checksummed events when they are sent across the network, and optionally when they are read from Relay-log. In either case, an error is thrown.

Q. Are checksums optional? In some cases we don't care for huge data loads

A. Yes, checksums are optional.

Time Delayed Replication

Q. Is Time delayed replication applied at the database level only and not for the entire slave?

A. Applied for the slave as execution is global.

Informational Log Events

Q. When we configure information log event, does it show meaningful query log for any binlog format? (Row based especially)

A. When using row based replication, you get the original query in human readable format... obviously you don’t want to see all the rows modified in a table of size, which can be huge

Q. Will the binlog include user id?

A. User id is replicated in some cases for Query-log-event - namely user id of the invoker when a stored routine is called.

Remote Binlog Backup

Q. What level of access does the remote binlog backup need?

A. Replication slave

Summary

As you can see, our Engineering team was kept busy with questions over the course of the webinar. Be sure to check out the MySQL 5.6 Replication webinar replay and if you have further questions, please don’t hesitate to use the comments below!

Tuesday Apr 10, 2012

Benchmarking MySQL Replication with Multi-Threaded Slaves

The objective of this benchmark is to measure the performance improvement achieved when enabling the Multi-Threaded Slave enhancement delivered as a part MySQL 5.6.

As the results demonstrate, Multi-Threaded Slaves delivers 5x higher replication performance based on a configuration with 10 databases/schemas. For real-world deployments, higher replication performance directly translates to:

· Improved consistency of reads from slaves (i.e. reduced risk of reading "stale" data)

· Reduced risk of data loss should the master fail before replicating all events in its binary log (binlog)

The multi-threaded slave splits processing between worker threads based on schema, allowing updates to be applied in parallel, rather than sequentially. This delivers benefits to those workloads that isolate application data using databases - e.g. multi-tenant systems deployed in cloud environments.

Multi-Threaded Slaves are just one of many enhancements to replication previewed as part of the MySQL 5.6 Development Release, which include:

· Global Transaction Identifiers coupled with MySQL utilities for automatic failover / switchover and slave promotion

· Crash Safe Slaves and Binlog

· Optimized Row Based Replication

· Replication Event Checksums

· Time Delayed Replication

These and many more are discussed in the “MySQL 5.6 Replication: Enabling the Next Generation of Web & Cloud Services” Developer Zone article 

Back to the benchmark - details are as follows.


Environment
The test environment consisted of two Linux servers:

· one running the replication master

· one running the replication slave.

Only the slave was involved in the actual measurements, and was based on the following configuration:

- Hardware: Oracle Sun Fire X4170 M2 Server

- CPU: 2 sockets, 6 cores with hyper-threading, 2930 MHz.

- OS: 64-bit Oracle Enterprise Linux 6.1
- Memory: 48 GB

Test Procedure
Initial Setup:

Two MySQL servers were started on two different hosts, configured as replication master and slave.

10 sysbench schemas were created, each with a single table:

CREATE TABLE `sbtest` (
   `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
   `k` int(10) unsigned NOT NULL DEFAULT '0',
   `c` char(120) NOT NULL DEFAULT '',
   `pad` char(60) NOT NULL DEFAULT '',
   PRIMARY KEY (`id`),
   KEY `k` (`k`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1

10,000 rows were inserted in each of the 10 tables, for a total of 100,000 rows. When the inserts had replicated to the slave, the slave threads were stopped. The slave data directory was copied to a backup location and the slave threads position in the master binlog noted.

10 sysbench clients, each configured with 10 threads, were spawned at the same time to generate a random schema load against each of the 10 schemas on the master. Each sysbench client executed 10,000 "update key" statements:

UPDATE sbtest set k=k+1 WHERE id = <random row>

In total, this generated 100,000 update statements to later replicate during the test itself.

Test Methodology:
The number of slave workers to test with was configured using:

SET GLOBAL slave_parallel_workers=<workers>

Then the slave IO thread was started and the test waited for all the update queries to be copied over to the relay log on the slave.

The benchmark clock was started and then the slave SQL thread was started. The test waited for the slave SQL thread to finish executing the 100k update queries, doing "select master_pos_wait()". When master_pos_wait() returned, the benchmark clock was stopped and the duration calculated.

The calculated duration from the benchmark clock should be close to the time it took for the SQL thread to execute the 100,000 update queries. The 100k queries divided by this duration gave the benchmark metric, reported as Queries Per Second (QPS).

Test Reset:

The test-reset cycle was implemented as follows:

· the slave was stopped

· the slave data directory replaced with the previous backup

· the slave restarted with the slave threads replication pointer repositioned to the point before the update queries in the binlog.

The test could then be repeated with identical set of queries but a different number of slave worker threads, enabling a fair comparison.

The Test-Reset cycle was repeated 3 times for 0-24 number of workers and the QPS metric calculated and averaged for each worker count.

MySQL Configuration
The relevant configuration settings used for MySQL are as follows:

binlog-format=STATEMENT
relay-log-info-repository=TABLE
master-info-repository=TABLE

As described in the test procedure, the
slave_parallel_workers setting was modified as part of the test logic. The consequence of changing this setting is:

0 worker threads:
   - current (i.e. single threaded) sequential mode
   - 1 x IO thread and 1 x SQL thread
   - SQL thread both reads and executes the events

1 worker thread:
   - sequential mode
   - 1 x IO thread, 1 x Coordinator SQL thread and 1 x Worker thread
   - coordinator reads the event and hands it to the worker who executes

2+ worker threads:
   - parallel execution
   - 1 x IO thread, 1 x Coordinator SQL thread and 2+ Worker threads
   - coordinator reads events and hands them to the workers who execute them

Results
Figure 1 below shows that Multi-Threaded Slaves deliver ~5x higher replication performance when configured with 10 worker threads, with the load evenly distributed across our 10 x schemas. This result is compared to the current replication implementation which is based on a single SQL thread only (i.e. zero worker threads).

Figure 1: 5x Higher Performance with Multi-Threaded Slaves

The following figure shows more detailed results, with QPS sampled and reported as the worker threads are incremented.

The raw numbers behind this graph are reported in the Appendix section of this post.



Figure 2: Detailed Results

As the results above show, the configuration does not scale noticably from 5 to 9 worker threads. When configured with 10 worker threads however, scalability increases significantly. The conclusion therefore is that it is desirable to configure the same number of worker threads as schemas.

Other conclusions from the results:

· Running with 1 worker compared to zero workers just introduces overhead without the benefit of parallel execution.

· As expected, having more workers than schemas adds no visible benefit.

Aside from what is shown in the results above, testing also demonstrated that the following settings had a very positive effect on slave performance:


relay-log-info-repository=TABLE
master-info-repository=TABLE

For 5+ workers, it was up to 2.3 times as fast to run with TABLE compared to FILE.

Conclusion

As the results demonstrate, Multi-Threaded Slaves deliver significant performance increases to MySQL replication when handling multiple schemas.

This, and the other replication enhancements introduced in MySQL 5.6 are fully available for you to download and evaluate now from the MySQL Developer site (select Development Release tab).

You can learn more about MySQL 5.6 from the documentation 

Please don’t hesitate to comment on this or other replication blogs with feedback and questions.

Appendix – Detailed Results

Tuesday Dec 20, 2011

MySQL 5.6.4 Development Milestone Now Available!

I am pleased to announce that the MySQL Database 5.6.4 development milestone release ("DMR") is now available for download (select the Development Release tab). MySQL 5.6.4 includes all 5.5 production-ready features and provides an aggreation of all of the new features that have been released in earlier 5.6 DMRs.  5.6.4 adds many bug fixes and more new "early and often" enhancements that are development and system QA complete and ready for Community evaluation and feedback.  You can get the complete rundown of all the new 5.6.4 specific features here.

For those following the progression of the 5.6 DMRs as the trains leave the station, you should bookmark these MySQL Engineering development team specific blogs:

You can also track the thought and innovation leaders on the MySQL Optimizer and the new Optimizer specific improvements in 5.6.4 by following the MySQL Optimizer Team member blogs:

And of course you can follow others on the Optimizer team and all of MySQL Engineering teams by bookmarking/subscribing to PlanetMySQL.

We look forward to your feedback on MySQL 5.6.4, so please download your copy now and help us make a better MySQL. 

As always, a sincere thanks for your continued support of MySQL!   


Monday Aug 01, 2011

MySQL 5.6 Replication – New Early Access Features

At OSCON 2011 last week, Oracle delivered more early access (labs) features for MySQL 5.6 replication. These features are focused on better integration, performance and data integrity, and are summarized in this blog with links to resources enabling users to download, configure and evaluate them[Read More]
About

Get the latest updates on products, technology, news, events, webcasts, customers and more.

Twitter


Facebook

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
2
5
6
9
10
11
12
13
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today