Monday Oct 08, 2012

New Options for MySQL High Availability

Data is the currency of today’s web, mobile, social, enterprise and cloud applications. Ensuring data is always available is a top priority for any organization – minutes of downtime will result in significant loss of revenue and reputation.

There is not a “one size fits all” approach to delivering High Availability (HA). Unique application attributes, business requirements, operational capabilities and legacy infrastructure can all influence HA technology selection. And then technology is only one element in delivering HA – “People and Processes” are just as critical as the technology itself.

For this reason, MySQL Enterprise Edition supports a range of HA solutions, fully certified and supported by Oracle. MySQL Enterprise HA is not an expensive add-on; it is included within the core Enterprise Edition offering, along with the management tools, consulting and 24x7 support needed to deliver true HA.

At the recent MySQL Connect conference, we announced new HA options for MySQL users running on both Linux and Solaris:

- DRBD for MySQL

- Oracle Solaris Clustering for MySQL

DRBD (Distributed Replicated Block Device) is an open source Linux kernel module that uses synchronous replication over local storage to deliver highly available database applications. DRBD synchronizes database changes by mirroring data from an active node to a standby node, and supports automatic failover and recovery. Together, Linux, DRBD, Corosync and Pacemaker provide an integrated stack of mature and proven open source technologies.

DRBD Stack: Providing Synchronous Replication for the MySQL Database with InnoDB

Download the DRBD for MySQL whitepaper to learn more, including step-by-step instructions to install, configure and provision DRBD with MySQL.

Oracle Solaris Cluster provides high availability and load balancing to mission-critical applications and services in physical or virtualized environments. With Oracle Solaris Cluster, organizations have a scalable and flexible solution that is suited equally to small clusters in local datacenters or larger multi-site, multi-cluster deployments that are part of enterprise disaster recovery implementations. The Oracle Solaris Cluster MySQL agent integrates seamlessly with MySQL offering a selection of configuration options in the various Oracle Solaris Cluster topologies.

Putting it All Together

When you add MySQL Replication and MySQL Cluster into the HA mix, along with 3rd party solutions, users have an extensive choice (and decisions to make) in delivering HA services built on MySQL.

To make the decision process simpler, we have also published a new MySQL HA Solutions Guide.

Exploring beyond just the technology, the guide presents a methodology to select the best HA solution for your new web, cloud and mobile services, while also discussing the importance of people and process in ensuring service continuity.

This subject was recently presented at Oracle OpenWorld, and the slides are available here.

Whatever your uptime requirements, you can be sure MySQL has an HA solution for your needs.

Please don't hesitate to let us know of your HA requirements in the comments section of this blog. You can also contact MySQL consulting to learn more about their HA Jumpstart offering which will help you scope out your scaling and HA requirements.

Friday Oct 05, 2012

New MySQL Cluster 7.3 Previews: Foreign Keys, NoSQL Node.js API and Auto-Tuned Clusters

At this week's MySQL Connect conference, Oracle previewed an exciting new wave of developments for MySQL Cluster, further extending its simplicity and flexibility by expanding the range of use-cases, adding new NoSQL options, and automating configuration.

What’s new:

  • Development Release 1: MySQL Cluster 7.3 with Foreign Keys
  • Early Access “Labs” Preview: MySQL Cluster NoSQL API for Node.js
  • Early Access “Labs” Preview: MySQL Cluster GUI-Based Auto-Installer

In this blog, I'll introduce you to the features being previewed.

Review the blogs listed below for more detail on each of the specific features discussed.

Save the date! A live webinar is scheduled for Thursday 25th October at 0900 Pacific Time / 1600 UTC, where we will discuss each of these enhancements in more detail. Registration will open soon and be published to the MySQL webinars page.

MySQL Cluster 7.3: Development Release 1

The first MySQL Cluster 7.3 Development Milestone Release (DMR) previews Foreign Keys, bringing powerful new functionality to MySQL Cluster while reducing development complexity.

Foreign Key support has been one of the most requested enhancements to MySQL Cluster – enabling users to simplify their data models and application logic – while extending the range of use-cases for both custom projects requiring referential integrity and packaged applications, such as eCommerce, CRM, CMS, etc.


The Foreign Key functionality is implemented directly within the MySQL Cluster data nodes, allowing any client API accessing the cluster to benefit from them – whether they are SQL or one of the NoSQL interfaces (Memcached, C++, Java, JPA, HTTP/REST or the new Node.js API - discussed later.)

The core referential actions defined in the SQL:2003 standard are implemented:


In addition, the MySQL Cluster implementation supports the online adding and dropping of Foreign Keys, ensuring the Cluster continues to serve both read and write requests during the operation. This represents a further enhancement to MySQL Cluster's support for online schema changes, i.e. adding and dropping indexes, adding columns, etc.

Read this blog for a demonstration of using Foreign Keys with MySQL Cluster. 

Getting Started with MySQL Cluster 7.3 DMR1:

Users can download either the source or binary and evaluate the MySQL Cluster 7.3 DMR with Foreign Keys now! (Select the Development Release tab).

MySQL Cluster NoSQL API for Node.js

Node.js is hot! In a little over 3 years, it has become one of the most popular environments for developing next generation web, cloud, mobile and social applications. Bringing JavaScript from the browser to the server, the design goal of Node.js is to build new real-time applications supporting millions of client connections, serviced by a single CPU core.

Making it simple to further extend the flexibility and power of Node.js to the database layer, we are previewing the Node.js JavaScript API for MySQL Cluster as an Early Access release, available for download now. Select the following build:


Alternatively, you can clone the project at the MySQL GitHub page

Implemented as a module for the V8 engine, the new API provides Node.js with a native, asynchronous JavaScript interface that can be used to both query and receive results sets directly from MySQL Cluster, without transformations to SQL.

Figure 1: MySQL Cluster NoSQL API for Node.js enables end-to-end JavaScript development

Rather than just presenting a simple interface to the database, the Node.js module integrates the MySQL Cluster native API library directly within the web application itself, enabling developers to seamlessly couple their high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

The new Node.js API joins a rich array of NoSQL interfaces available for MySQL Cluster. Whichever API is chosen for an application, SQL and NoSQL can be used concurrently across the same data set, providing the ultimate in developer flexibility. 

Get started with the MySQL Cluster NoSQL API for Node.js tutorial.

MySQL Cluster GUI-Based Auto-Installer

Compatible with both MySQL Cluster 7.2 and 7.3, the Auto-Installer makes it simple for DevOps teams to quickly configure and provision highly optimized MySQL Cluster deployments – whether on-premise or in the cloud.

Implemented with a standard HTML GUI and Python-based web server back-end, the Auto-Installer intelligently configures MySQL Cluster based on application requirements and auto-discovered hardware resources.

Figure 2: Automated Tuning and Configuration of MySQL Cluster

Developed by the same engineering team responsible for the MySQL Cluster database, the installer provides standardized configurations that make it simple, quick and easy to build stable and high performance clustered environments.

The Auto-Installer is previewed as an Early Access release, available for download now; select the MySQL-Cluster-Auto-Installer build.

You can read more about getting started with the MySQL Cluster auto-installer here.

Watch the YouTube video for a demonstration of using the MySQL Cluster auto-installer

Getting Started with MySQL Cluster

If you are new to MySQL Cluster, the Getting Started guide will walk you through installing an evaluation cluster on a single host (these guides reflect MySQL Cluster 7.2, but apply equally well to 7.3 and the Early Access previews). Or use the new MySQL Cluster Auto-Installer!

Download the Guide to Scaling Web Databases with MySQL Cluster (to learn more about its architecture, design and ideal use-cases).

Post any questions to the MySQL Cluster forum where our Engineering team and the MySQL Cluster community will attempt to assist you.

Post any bugs you find to the MySQL bug tracking system (select MySQL Cluster from the Category drop-down menu).

And if you have any feedback, please post it to the Comments section here or in the blogs referenced in this article.


MySQL Cluster 7.2 is the GA, production-ready release of MySQL Cluster. The first Development Release of MySQL Cluster 7.3 and the Early Access previews give you the opportunity to preview and evaluate future developments in the MySQL Cluster database, and we are very excited to be able to share that with you.

Let us know how you get along with MySQL Cluster 7.3, and other features that you want to see in future releases, by using the comments of this blog.

Saturday Sep 29, 2012

Tutorial: Getting Started with the NoSQL JavaScript / Node.js API for MySQL Cluster

Tutorial authored by Craig Russell and JD Duncan 

The MySQL Cluster team are working on a new NoSQL JavaScript connector for MySQL. The objectives are simplicity and high performance for JavaScript users:

- allows end-to-end JavaScript development, from the browser to the server and now to the world's most popular open source database

- native "NoSQL" access to the storage layer without going first through SQL transformations and parsing.

Node.js is a complete web platform built around JavaScript designed to deliver millions of client connections on commodity hardware. With the MySQL NoSQL Connector for JavaScript, Node.js users can easily add data access and persistence to their web, cloud, social and mobile applications.

While the initial implementation is designed to plug and play with Node.js, the actual implementation doesn't depend heavily on Node, potentially enabling wider platform support in the future.


The architecture and user interface of this connector differ from other MySQL connectors in a major way: it is an asynchronous interface that follows the event model built into Node.js.

To make it as easy as possible, we decided to use a domain object model to store the data. This allows for users to query data from the database and have a fully-instantiated object to work with, instead of having to deal with rows and columns of the database. The domain object model can have any user behavior that is desired, with the NoSQL connector providing the data from the database.

To make it as fast as possible, we use a direct connection from the user's address space to the database. This approach means that no SQL (pun intended) is needed to get to the data, and no SQL server is between the user and the data.

The connector is being developed to be extensible to multiple underlying database technologies, including direct, native access to both the MySQL Cluster "ndb" and InnoDB storage engines.

The connector integrates the MySQL Cluster native API library directly within the Node.js platform itself, enabling developers to seamlessly couple their high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

The following sections take you through how to connect to MySQL, query the data and how to get started.

Connecting to the database

A Session is the main user access path to the database. You can get a Session object directly from the connector using the openSession function:

var nosql = require("mysql-js");

var dbProperties = {
    "implementation" : "ndb",
    "database" : "test"
};

nosql.openSession(dbProperties, null, onSession);

The openSession function calls back into the application upon creating a Session. The Session is then used to create, delete, update, and read objects.

Reading data

The Session can read data from the database in a number of ways. If you simply want the data from the database, you provide a table name and the key of the row that you want. For example, consider this schema:

create table employee (

  id int not null primary key,

  name varchar(32),

  salary float

) ENGINE=ndbcluster;

Since the primary key is a number, you can provide the key as a number to the find function.

var onSession = function(err, session) {
  if (err) {
    // ... error handling
    return;
  }
  session.find('employee', 0, onData);
};

var onData = function(err, data) {
  if (err) {
    // ... error handling
    return;
  }
  console.log('Found: ', JSON.stringify(data));
  // ... use data in application
};


If you want to have the data stored in your own domain model, you tell the connector which table your domain model uses, by specifying an annotation, and pass your domain model to the find function.

var annotations = new nosql.Annotations();

var Employee = function(id, name, salary) {
  this.id = id;
  this.name = name;
  this.salary = salary;
  this.giveRaise = function(percent) {
    this.salary *= (1 + percent); // e.g. 0.12 is a 12% raise
  };
};

annotations.mapClass(Employee, {'table' : 'employee'});

var onSession = function(err, session) {
  if (err) {
    // ... error handling
    return;
  }
  session.find(Employee, 0, onData);
};


Updating data

You can update the emp instance in memory, but to make the raise persistent, you need to write it back to the database, using the update function.

var onData = function(err, emp) {
  if (err) {
    // ... error handling
    return;
  }
  console.log('Found: ', JSON.stringify(emp));
  emp.giveRaise(0.12); // gee, thanks!
  session.update(emp); // oops, session is out of scope here
};


Using JavaScript can be tricky because it does not have the concept of block scope for variables. You can create a closure to handle these variables, or use a feature of the connector to remember your variables.
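The closure approach looks like this; it is a sketch against the preview mysql-js session API used in this tutorial:

```javascript
// A closure-based alternative: define the callbacks inside onSession so
// they capture the 'session' variable from the enclosing scope.
var onSession = function(err, session) {
  if (err) {
    // ... error handling
    return;
  }
  var onUpdate = function(err) {
    if (err) {
      // ... error handling
    }
  };
  var onData = function(err, emp) {
    if (err) {
      // ... error handling
      return;
    }
    emp.salary *= 1.12; // a 12% raise, applied to the row data
    session.update(emp, onUpdate); // 'session' captured by the closure
  };
  session.find('employee', 0, onData);
};
```

Because onData is created fresh each time onSession runs, it always sees the session that produced it, with no extra bookkeeping.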

The connector api takes a fixed number of parameters and returns a fixed number of result parameters to the callback function. But the connector will keep track of variables for you and return them to the callback. So in the above example, change the onSession function to remember the session variable, and you can refer to it in the onData function:

var onSession = function(err, session) {
  if (err) {
    // ... error handling
    return;
  }
  session.find(Employee, 0, onData, session);
};

var onData = function(err, emp, session) {
  if (err) {
    // ... error handling
    return;
  }
  console.log('Found: ', JSON.stringify(emp));
  emp.giveRaise(0.12); // gee, thanks!
  session.update(emp, onUpdate); // session is now in scope
};

var onUpdate = function(err, emp) {
  if (err) {
    // ... error handling
  }
};


Inserting data

Inserting data requires a mapped JavaScript user function (constructor) and a session. Create a variable and persist it:

var onSession = function(err, session) {
  var data = new Employee(999, 'Mat Keep', 20000000);
  session.persist(data, onInsert);
};



Deleting data

To remove data from the database, use the session remove function. You use an instance of the domain object to identify the row you want to remove. Only the key field is relevant.

var onSession = function(err, session) {
  var key = new Employee(999);
  session.remove(key, onDelete);
};



More extensive queries

We are working on the implementation of more extensive queries along the lines of the criteria query api. Stay tuned.

How to evaluate

The MySQL Connector for JavaScript is available for download now. Select the build:


You can also clone the project on GitHub

Since it is still early in development, feedback is especially valuable (so don't hesitate to leave comments on this blog, or head to the MySQL Cluster forum). Try it out and see how easy (and fast) it is to integrate MySQL Cluster into your Node.js platforms.

You can learn more about other previewed functionality of MySQL Cluster 7.3 here

Wednesday Sep 12, 2012

MySQL Connect: What to Expect From the Wondrous Land of MySQL Cluster

The MySQL Connect conference is only a couple of weeks away, with MySQL engineers, support teams, consultants and community aces busy putting the final touches to their talks.

There will be many exciting new announcements and sharing of best practices at the conference, covering the range of MySQL technologies.

MySQL Cluster will be a big part of this, so I wanted to share some key sessions for those of you who plan on attending, as well as some resources for those who are not lucky enough to be able to make the trip, but who can't afford to miss the key news. Of course, this is no substitute for actually being there….and the good news is that registration is still open ;-)


What's New in MySQL Cluster, Saturday 29th, 1300-1400, in Golden Gate room 5. Bernd Ocklin, director of MySQL Cluster development, and I will be taking a look at what follows the latest MySQL Cluster 7.2 release. I don't want to give too much away – let's just say it's not often you can add powerful new functionality to a product while at the same time making life radically simpler for its users.

For those not making it to the conference, a live webinar repeating the talk is scheduled for Thursday 25th October at 0900 Pacific Time. Hold the date; registration will open soon and be published to our MySQL Webinars page.

Best Practices

Getting Started with MySQL Cluster, Hands-On Lab, Saturday 29th, 1600-1700, in Plaza Room A. Santo Leto, one of our lead MySQL Cluster support engineers, regularly works with users new to MySQL Cluster, assisting them with installation, configuration, scaling, etc. In this lab, Santo will share best practices for getting started.

Delivering Breakthrough Performance with MySQL Cluster Saturday 29th, 1730-1830, in Golden Gate room 5.

Frazer Clement, lead MySQL Cluster software engineer, will demonstrate how to translate the awesome Cluster benchmarks (remember 1 BILLION UPDATEs per minute ?!) into real-world performance.

You can also get some best practices from our new MySQL Cluster performance guide 

MySQL Cluster BoF, Saturday 29th, 1900-2000, in Golden Gate room 5. Come and get a demonstration of new tools for the installation and configuration of MySQL Cluster, and spend time with the engineering team discussing any questions or issues you may have.

Developing High-Throughput Services with NoSQL APIs to InnoDB and MySQL Cluster Sunday 30th, 1145 - 1245, in Golden Gate room 7.  

In this session, JD Duncan and Andrew Morgan will present how to get started with both Memcached and new NoSQL APIs.

JD and I recently ran a webinar demonstrating how to build simple Twitter-like services with Memcached and MySQL Cluster. The replay is available for download

Case Studies:

MySQL Cluster @ El Chavo, Latin America’s #1 Facebook Game, Sunday 30th, 1745-1845, in Golden Gate room 4. Playful Play deployed MySQL Cluster CGE to power their market leading social game. This session will discuss the challenges they faced, why they selected MySQL Cluster and their experiences to date.

You can read more about Playful Play and MySQL Cluster here 

A Journey into NoSQLand: MySQL’s NoSQL Implementation, Sunday 30th, 1345-1445, in Golden Gate room 4. Lig Turmelle, web DBA at Kaplan Professional and esteemed Oracle ACE, will discuss her experiences working with the NoSQL interfaces for both MySQL Cluster and InnoDB.

Evaluating MySQL HA Alternatives, Saturday 29th, 1430-1530, in Golden Gate room 5. Henrik Ingo, former member of the MySQL sales engineering team, will provide an overview of various HA technologies for MySQL, starting with replication and progressing to InnoDB, Galera and MySQL Cluster.

What about the other stuff?

Of course MySQL Connect has much, much more than MySQL Cluster. There will be lots on replication (which I'll blog about soon), MySQL 5.6, InnoDB, cloud, etc, etc. Take a look at the full Content Catalog to see more.

If you are attending, I hope to see you at one of the Cluster sessions...and remember, registration is still open

Monday Aug 20, 2012

Learn about the Real-Time Performance and High Availability of MySQL Cluster

If you need 99.999% availability, real-time performance, auto-sharding, write scalability and geographic replication – with both SQL and NoSQL access – then you should learn more about MySQL Cluster.

The authentic MySQL Cluster course brings you key conceptual and configuration information in a 3 day instructor led class.

Examples of class events already on the schedule:



(Delivery language noted in parentheses where it is not English.)

- Prague, Czech Republic: 10 December 2012

- Warsaw, Poland: 3 September 2012

- London, England: 12 December 2012

- Lisbon, Portugal: 3 December 2012 (European Portuguese)

- Nice, France: 8 October 2012

- Barcelona, Spain: 25 September 2012

- Madrid, Spain: 6 November 2012

- Denver, Colorado: 17 October 2012

- Petaling Jaya, Malaysia: 10 October 2012

- 19 September 2012

- São Paulo, Brazil: 24 September 2012 (Brazilian Portuguese)

To see the full schedule and course details, or to register interest in an additional event, see the MySQL Training pages.

Tuesday Jul 31, 2012

MySQL Cluster Performance Best Practices: Q & A

With its distributed, shared-nothing, real-time design, MySQL Cluster has attracted a lot of attention from developers who need to scale both read and write traffic with ultra-low latency and fault-tolerance, using commodity hardware. With many proven deployments in web, gaming, telecoms and mobile use-cases, MySQL Cluster is certainly able to meet these sorts of requirements.

But, as a distributed database, developers do need to think a little differently about data access patterns along with schema and query optimisations in order to get the best possible performance.

Sharing best practices developed by working with MySQL Cluster's largest users, we recently ran a Performance Essentials webinar, and the replay is now available, on-demand, for you to listen to in the comfort of your own office.

The webinar also accompanies a newly published Guide to optimizing the performance of MySQL Cluster.

We received a number of great questions over the course of the webinar, and I thought it would be useful to share a selection of those:

Q. How do I calculate and then monitor memory usage with MySQL Cluster?

A. If designing a completely new database, the following calculations can be used to help determine the approximate memory sizing requirements for the data nodes:

(in memory) Data Size * Replicas * 1.25 = Total Database Memory Requirements

Example: 50 GB * 2 * 1.25 = 125 GB

(Data Size * Replicas * 1.25)/Nodes = RAM Per Node

Example: (50 GB * 2 * 1.25)/4 = 31.25 GB

To see how much of the configured memory is currently in use by the database, you can query the ndbinfo.memoryusage table.

If using MySQL Cluster CGE then you can view this information over time in a MySQL Enterprise Monitor graph.
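As a quick sanity check, the sizing arithmetic above can be scripted; this is a throwaway capacity-planning helper, and the function names are mine, not part of any MySQL tooling:

```javascript
// Rule-of-thumb memory sizing for MySQL Cluster data nodes:
// in-memory data size * replicas * 1.25 overhead factor.
function totalDatabaseMemoryGB(dataSizeGB, replicas) {
  return dataSizeGB * replicas * 1.25;
}

// Divide the total across the data nodes to get RAM per node.
function ramPerNodeGB(dataSizeGB, replicas, dataNodes) {
  return totalDatabaseMemoryGB(dataSizeGB, replicas) / dataNodes;
}

console.log(totalDatabaseMemoryGB(50, 2)); // 125 (GB across the cluster)
console.log(ramPerNodeGB(50, 2, 4));       // 31.25 (GB per data node)
```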

Q. Would enabling disk-based table spaces impact query performance?

A. It can. The only reason to use disk-based table spaces is when you do not have sufficient memory to store all data in memory. Some of your disk-based data will therefore be uncached at times, and reads or writes that access this data will stall while the necessary pages are read into the page buffer. This can reduce throughput.

Q. I've seen that MySQL Cluster 7.2 can speed up JOIN operations by 70x. How does it do this?

A. There are two new features in MySQL Cluster 7.2, which when combined, can significantly improve the performance of joins over previous versions of MySQL Cluster:

- The Index Statistics function enables the SQL optimizer to build a better execution plan for each query. In the past, non-optimal query plans required a manual enforcement of indexes via USE INDEX or FORCE INDEX to alter the execution plan. ANALYZE TABLE must first be run on each table to take advantage of this.

- Adaptive Query Localization (AQL) allows the work of the join to be distributed across the data nodes (local to the data it’s working with) rather than up in the MySQL Server; this allows more computing power to be applied to calculating the join as well as dramatically reducing the number of messages being passed around the system.

You can learn more about AQL and a sample query here

Q. Can all JOINs use AQL?

A. In order for a join to be able to exploit AQL (in other words be “pushed down” to the data nodes), it must meet the following conditions:

1. Any columns to be joined must use exactly the same data type. (For example, if an INT and a BIGINT column are joined, the join cannot be pushed down). This includes the lengths of any VARCHAR columns.

2. Joins referencing BLOB or TEXT columns will not be pushed down.

3. Explicit locking is not supported; however, the NDB (MySQL Cluster) storage engine's characteristic implicit row-based locking is enforced.

4. In order for a join to be pushed down, child tables in the Join must be accessed using one of the ref, eq_ref, or const access methods, or some combination of these methods. These access methods are described in the documentation

5. Joins referencing tables explicitly partitioned by [LINEAR] HASH, LIST, or RANGE currently cannot be pushed down

6. If the query plan uses a join buffer ('Using join buffer') for a candidate child table, that table cannot be pushed as a child. However, it might be the root of another set of pushed tables.

7. If the root of the pushed Join is an eq_ref or const, only child tables joined by eq_ref can be appended. (A ref joined table will then likely become a root of another pushed Join)

These conditions should be considered when designing your schema and application queries – for example, to comply with constraint 4, attempt to make any table scan that is part of the Join be the first clause.

Where a query involves multiple levels of Joins, it is perfectly possible for some levels to be pushed down while others continue to be executed within the MySQL Server.

If your application consists of many of these types of JOIN operations which cannot be made to exploit AQL, other MySQL storage engines such as InnoDB will present a better option for your workload.

Q. What are best practices for data model and query design?

A. The data model and queries should be designed to minimize network roundtrips between hosts. Ensuring that joins meet the requirements for AQL and avoiding full table scans can help with this.

Looking up data in a hash table is a constant time operation, unaffected by the size of the data set.

Looking up data in a tree (T-tree, B-tree, etc.) structure is logarithmic (O(log n)).

For a database designer this means it is very important to choose the right index structure and access method to retrieve data. We strongly recommend application requests with high requirements on performance be designed as primary key lookups. This is because looking up data in a hash structure is faster than from a tree structure and can be satisfied by a single data node. Therefore, it is very important that the data model takes this into account. It also follows that choosing a good primary key definition is extremely important.

If ordered index lookups are required then tables should be partitioned such that only one data node will be scanned.

The distributed nature of the Cluster and the ability to exploit multiple CPUs, cores or threads within nodes means that the maximum performance will be achieved if the application is architected to run many transactions in parallel. Alternatively you should run many instances of the application simultaneously to ensure that the Cluster is always able to work on many transactions in parallel.

Take a look at the Guide to optimizing the performance of MySQL Cluster for more detail

Q. What are best practices for parallelising my application and access to MySQL Cluster?

A. As mentioned, MySQL Cluster is a distributed, auto-sharded database. This means that there is often more than one Data Node that can work in parallel to satisfy application requests.

Additionally, MySQL Cluster 7.2 enhances multi-threading so data nodes can now effectively exploit multiple threads / cores. To use this functionality, the data nodes should be started using the ndbmtd binary rather than ndbd, and config.ini should be configured appropriately.
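For example, a config.ini fragment along these lines enables the multi-threaded kernel; the parameter values here are illustrative only, so size them for your own hardware:

```ini
# config.ini fragment: settings applied to all data nodes
[ndbd default]
NoOfReplicas=2
# Let ndbmtd use multiple execution threads (requires the ndbmtd binary)
MaxNoOfExecutionThreads=8
DataMemory=4G
IndexMemory=1G
```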

Parallelization can be achieved in MySQL Cluster in several ways:

- Adding more Application Nodes

- Use of multi-threaded data nodes

- Batching of requests

- Parallelizing work on different Application Nodes connected to the Data Nodes

- Utilizing multiple connections between each Application Node and the Data Nodes (connection pooling)

How many threads and how many applications are needed to drive the desired load has to be studied by benchmarks. One approach of doing this is to connect one Application Node at a time and increment the number of threads. When one Application Node cannot generate any more load, add another one. It is advisable to start studying this on a two Data Node cluster, and then grow the number of Data Nodes to understand how your system is scaling.

If you have designed your application queries, and data model according to best practices presented in the Performance Guide, you can expect close to double the throughput on a four Data Node system compared to a two Data Node system, given that the application can generate the load.

Try to multi-thread whenever possible and load balance over more MySQL servers.

In MySQL Cluster you have access to additional performance enhancements that allow better utilization on multi-core / thread CPUs, including:

- Reduced lock contention by having multiple connections from one MySQL Server to the Data Nodes (--ndb-cluster-connection-pool=X)

- Setting threads to real-time priority

- Locking Data Node threads (kernel threads and maintenance threads) to a CPU

Q. Does MySQL Cluster’s data distribution add complexity to my application and limit the types of queries I can run?

A. No, it doesn't. By default, tables are automatically partitioned (sharded) across data nodes by hashing the primary key. Other partitioning methods are supported, but in most instances the default is acceptable.

As the sharding is automatic and implemented at the database layer, application developers do not need to modify their applications to control data distribution – which significantly simplifies scaling.

In addition, applications are free to run complex queries such as JOIN operations across the shards, therefore users do not need to trade functionality for scalability.

Q. What hardware would you recommend to get the best performance from MySQL Cluster?

A. It varies by node type. For data nodes:

- Up to 32 x86-64 CPU cores. Use as high a clock frequency as possible, as this enables faster processing of messages between nodes;

- Large CPU caches assist in delivering optimal performance;

- 64-bit hosts with enough RAM to store your in-memory data set;

- Linux, Solaris or Windows operating systems;

- 2 x Network Interface Cards and 2 x Power Supply Units for hardware redundancy.

It is important to ensure systems are configured to reduce swapping to disk whenever possible.

As a rule of thumb, configure 7x the amount of DataMemory as disk space for each data node. This space is needed to store 2 Local Checkpoints (LCPs), the Redo log and 3 backups. You will also want to allocate space for tablespaces if you are making use of disk-based data – including extra space for the backups.
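As a quick sketch of the arithmetic behind that rule of thumb (the 64 GB DataMemory figure is just an example):

```python
def data_node_disk_gb(data_memory_gb: float, factor: float = 7.0) -> float:
    """Rule-of-thumb disk allocation per data node: roughly 7x
    DataMemory, covering 2 LCPs, the Redo log and 3 backups."""
    return data_memory_gb * factor

print(data_node_disk_gb(64))  # 448.0 GB, before any disk-based tablespaces
```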

Having a fast, low-latency disk subsystem is very important, as it affects checkpointing and backups.

Download the MySQL Cluster Evaluation Guide for more recommendations 

The hardware requirements for MySQL Servers are a little lower:

- 4 - 32 x86-64 bit CPU cores

- Minimum 4GB of RAM. Memory is not as critical at this layer, and requirements will be influenced by connections and buffers.

- 2 x Network Interface Cards and 2 x Power Supply Units for hardware redundancy.

Q. I heard that MySQL Cluster doesn't support Foreign Keys, how can I get around that?

A. Foreign keys are previewed in the MySQL Cluster 7.3 Early Access release, which you can download and evaluate now. In MySQL Cluster 7.2 and earlier, you can emulate foreign keys programmatically via triggers.


If you are thinking about using MySQL Cluster for your next project, it is worth investing a little bit of time to get familiar with these performance best practices. The Webinar replay, the MySQL Cluster Performance Guide and the MySQL Cluster Evaluation Guide will give you pretty much everything you need to build high performance, high availability services with MySQL Cluster. 

Tuesday Jul 17, 2012

MySQL Cluster Powers El Chavo from Playful Play, Latin America’s Most Popular Facebook Game


Attracting over 3m subscribers in just 6 months and growing by 30,000 new users per day, Playful Play needed a database that was able to keep pace with the massive scalability and high availability demands of the wildly successful La Vecindad de El Chavo Facebook game.

Playful Play selected MySQL Cluster CGE running on a public cloud to power their gaming platform, providing:

  • 45% improvement in performance;
  • 99.999% uptime;
  • 80% reduction in DBA overhead;
  • Local language support, 24x7.

As a result, Playful Play has been able to maintain user growth rates and attract new advertisers to the platform, while enhancing agility and reducing cost.

Company Overview

Based out of Monterrey, Mexico, Playful Play has created Latin America’s #1 Facebook game based on "El Chavo del 8", the cultural and TV phenomenon that has been running across Latin America for the past four decades. The show is also extremely popular in Spain and the United States.

El Chavo (The Kid) appeals to a broad demographic, having attracted over 3M subscribers in its first 6 months, and adding 30,000 new users per day. The game has also proved popular with advertisers who are keen to integrate social gaming into their marketing strategies in order to raise awareness and build brand loyalty.

Through partnerships with Televisa Home Entertainment and Grupo Chespirito, Playful Play has aggressive plans to grow their business through the development of new social games targeting Latin America, Brazil, the United States and Spain.

The Challenges of Rapid Growth

As a start-up business, fast time to market at the lowest possible cost was the leading development priority after Playful Play secured the rights to develop the El Chavo game.

As a result, Playful Play developed the first release of the game on the proven LAMP (Linux, Apache, MySQL, PHP / Perl / Python) stack.

To meet both the scalability and availability requirements of the game, Playful Play initially deployed MySQL in a replicated, multi-master configuration.

The database is core to the game, responsible for managing:

  • User profiles and avatars
  • Gaming session data;
  • In-app (application) purchases;
  • Advertising and digital marketing event data.

As La Vecindad de El Chavo spread virally across Facebook, subscriptions rapidly exceeded one million users, leading Playful Play to consider how best to architect their gaming platform for long-term growth. They had heard about the release of MySQL Cluster 7.2, and decided to initiate an evaluation to determine whether it could meet their requirements.

The Route to Internet-Scale with MySQL Cluster

In addition to growing user volumes, the El Chavo game also added new features that changed the profile of the database. Operations became more write-intensive, with INSERTs and UPDATEs accounting for up to 70% of the database load.

The game’s popularity also attracted advertisers, who demanded strict SLAs for both performance (predictable throughput with low latency) as well as uptime.

Through their evaluation, the developers at Playful Play identified that MySQL Cluster was able to meet their needs.

Write Performance with Real-Time Responsiveness

MySQL Cluster’s multi-master architecture, coupled with the ability to auto-shard (partition) tables across nodes, enabled Playful Play to meet the performance and scalability demands of the El Chavo game, without changes to their applications.

The ability to store data in memory and persist to disk delivered the sub-millisecond responsiveness and durability demanded by the game’s users.

In addition, Playful Play was able to horizontally scale the database across low-cost commodity nodes running in the cloud, reducing their TCO and enhancing agility.

Continuous Availability

The shared-nothing, distributed design of MySQL Cluster coupled with integrated replication and self-healing recovery ensured high availability of the gaming platform, without DevOps intervention.

MySQL Cluster was able to protect against downtime resulting from both failures and planned upgrades. For example, Playful Play has been able to scale the cluster on-line to support growing user volumes, without service interruption.

Data Integrity Supporting Monetization

As the game evolved to support in-app purchases and digital marketing, the ACID-compliance offered by MySQL Cluster afforded the transactional integrity demanded by users and advertisers.

The Value of MySQL Cluster Consulting, Support and Management Tools

Following their successful evaluation, Playful Play deployed MySQL Cluster with their public cloud provider.

Playful Play wanted to ensure they were getting the most out of the deployment, so Oracle’s MySQL consulting team reviewed their Architecture and Design. The consultants were able to further optimize the database to deliver even higher levels of performance, and demonstrated how Playful Play could automatically scale the database using MySQL Cluster Manager.

The MySQL Cluster deployment at Playful Play is growing rapidly, and currently comprised of over 25 nodes running Linux on commodity x86 servers, each configured with 24-cores and 64GB of RAM:

  • 12 x MySQL Cluster data nodes;
  • 12 x MySQL Server SQL nodes;
  • 2 x MySQL Cluster management nodes.

Each data node and SQL node pair is co-located on a single physical instance.

 Playful Play Deployment Architecture, built for rapid scale

Using the architecture above, MySQL Cluster is currently supporting:

  • 3 million subscribers, with 30,000 new additions per day across Latin America, Europe and U.S;
  • 10,000 concurrent users;
  • 10,000 Transactions Per Second;
  • 99.999% uptime.

"The MySQL support service has been essential in helping us troubleshoot and providing recommendations for the production cluster. Thanks!" – Carlos Morales, DBA, México

Playful Play has deployed MySQL Cluster Carrier-Grade Edition, providing access to technical and consultative support, in addition to the MySQL Enterprise Monitor and MySQL Cluster Manager tools.

MySQL Enterprise Monitor Enables Continuous Service Availability

MySQL Enterprise Monitor enables Playful Play to proactively monitor the entire cluster from a single GUI. DBAs are automatically notified if any environment variables begin to exceed pre-defined thresholds, and presented with the instructions necessary to take corrective action, before users are impacted.

MySQL Cluster Manager is used to automate cluster configuration changes, eliminating the risk of manual errors and significantly enhancing DevOps productivity.

The subscription has delivered terrific value to Playful Play, enabling them to:

  • Improve performance by 45% while reducing CPU utilization by 35%, providing significant headroom for continued growth;
  • Reduce DBA overhead by 80%, providing substantial cost savings;
  • Access local language support, 24x7.

The Future with MySQL Cluster

Playful Play has aggressive growth plans, seeking to attract over 50M subscribers within 5 years. El Chavo del 8 is continuing to gain widespread popularity in Latin regions around the world, and will be joined by other social networking games currently in development. The next major target is Brazil, presenting a very attractive, emerging market in social gaming.

MySQL Cluster CGE has been selected as the database platform powering this growth.

Playful Play will be presenting their experiences and best practices with MySQL Cluster at the MySQL Connect conference, September 29th and 30th 2012 in San Francisco.

Further Resources

MySQL Cluster at Playful Play On-Demand Webinar (Spanish)

MySQL Cluster Demonstration

Guide to Scaling Web Databases with MySQL Cluster

Monday Jul 09, 2012

Top Reasons to Take the MySQL Cluster Training

Here are the top reasons to take the authorized MySQL Cluster training course:

  • Take training which was developed by MySQL Cluster product team and delivered by the MySQL Cluster experts at Oracle
  • Learn how to develop, deploy, manage and scale your MySQL Cluster applications more efficiently
  • Keep your mission-critical applications and essential services up and running 24x7
  • Deliver the highest performance and scalability using MySQL Cluster best practices

In this 3-day course, experienced database users learn how to get started with MySQL Cluster, install the different node types, and properly configure and manage the cluster nodes to ensure high availability, while gaining a better understanding of the cluster's internals.

To see the schedule for this course, go to the Oracle University Portal (click on MySQL). Should you not see an event for a location/date that suits you, register your interest in additional events.

Here is a small sample of the events already on the schedule for the MySQL Cluster course:



 Location                    Date                Delivery Language

 Prague, Czech Republic      17 September 2012
 Warsaw, Poland              1 August 2012
 London, United Kingdom      18 July 2012
 Lisbon, Portugal            3 December 2012     European Portuguese
 Nice, France                8 October 2012
 Barcelona, Spain            25 September 2012
 Madrid, Spain               20 August 2012
 Denver, United States       17 October 2012
 Chicago, United States      22 August 2012
 Petaling Jaya, Malaysia     10 October 2012
                             21 August 2012
 Mexico City, Mexico         23 July 2012


Monday Jul 02, 2012

NoSQL Java API for MySQL Cluster: Questions & Answers

The MySQL Cluster engineering team recently ran a live webinar, available now on-demand, demonstrating the ClusterJ and ClusterJPA NoSQL APIs for MySQL Cluster, and how these can be used to build real-time, highly scalable Java-based services that require continuous availability.

Attendees asked a number of great questions during the webinar, and I thought it would be useful to share those here, so others are also able to learn more about the Java NoSQL APIs.

First, a little bit about why we developed these APIs and why they are interesting to Java developers.

ClusterJ and ClusterJPA

ClusterJ is a Java interface to MySQL Cluster that provides either a static or dynamic domain object model, similar to the data model used by JDO, JPA, and Hibernate. A simple API gives users extremely high performance for common operations: insert, delete, update, and query.

ClusterJPA works with ClusterJ to extend functionality, including:

- Persistent classes

- Relationships

- Joins in queries

- Lazy loading

- Table and index creation from object model

By eliminating data transformations via SQL, users get lower data access latency and higher throughput. In addition, Java developers have a more natural programming method to directly manage their data, with a complete, feature-rich solution for Object/Relational Mapping. As a result, the development of Java applications is simplified with faster development cycles resulting in accelerated time to market for new services.

MySQL Cluster offers multiple NoSQL APIs alongside Java:

  • Memcached for a persistent, high performance, write-scalable Key/Value store;
  • HTTP/REST via an Apache module;
  • C++ via the NDB API for the lowest absolute latency.

Developers can use SQL as well as NoSQL APIs to access the same data set via multiple query patterns – from simple Primary Key lookups or inserts to complex cross-shard JOINs using Adaptive Query Localization.

Marrying NoSQL and SQL access to an ACID-compliant database offers developers a number of benefits. MySQL Cluster’s distributed, shared-nothing architecture with auto-sharding and real time performance makes it a great fit for workloads requiring high volume OLTP. Users also get the added flexibility of being able to run real-time analytics across the same OLTP data set for real-time business insight.

OK – hopefully you now have a better idea of why ClusterJ and JPA are available. Now, for the Q&A.

Q & A

Q. Why would I use Connector/J vs. ClusterJ?

A. Partly it's a question of whether you prefer to work with SQL (Connector/J) or objects (ClusterJ). Performance of ClusterJ will be better, as there is no need to pass through the MySQL Server. A ClusterJ operation can only act on a single table (e.g. no joins) – ClusterJPA extends that capability.

Q. Can I mix different APIs (e.g. ClusterJ, Connector/J) in the same application for different query types?

A. Yes. You can mix and match all of the API types, SQL, JDBC, ODBC, ClusterJ, Memcached, REST, C++. They all access the exact same data in the data nodes. Update through one API and new data is instantly visible to all of the others.

Q. How many TCP connections would a SessionFactory instance create for a cluster of 8 data nodes?

A. SessionFactory has a connection to the mgmd (management node) but otherwise is just a vehicle to create Sessions. Without using connection pooling, a SessionFactory will have one connection open with each data node. Using optional connection pooling allows multiple connections from the SessionFactory to increase throughput.

Q. Can you give details of how ClusterJ optimizes sharding to enhance performance of distributed query processing?

A. Each data node in a cluster runs a Transaction Coordinator (TC), which begins and ends the transaction, but also serves as a resource to operate on the result rows. While an API node (such as a ClusterJ process) can send queries to any TC/data node, there are performance gains if the TC is where most of the result data is stored. ClusterJ computes the shard (partition) key to choose the data node where the row resides as the TC.

Q. What happens if we perform two primary key lookups within the same transaction? Are they sent to the data node in one transaction?

A. ClusterJ will send identical PK lookups to the same data node.

Q. How is distributed query processing handled by MySQL Cluster ?

A. If the data is split between data nodes then all of the information will be transparently combined and passed back to the application. The session will connect to a data node - typically by hashing the primary key - which then interacts with its neighboring nodes to collect the data needed to fulfil the query.

Q. Can I use Foreign Keys with MySQL Cluster?

A. Support for Foreign Keys is included in the MySQL Cluster 7.3 Early Access release.


The NoSQL Java APIs are packaged with MySQL Cluster, available for download here so feel free to take them for a spin today!

Key Resources

MySQL Cluster on-line demo 

MySQL ClusterJ and JPA On-demand webinar 

MySQL ClusterJ and JPA documentation

MySQL ClusterJ and JPA whitepaper and tutorial

Monday Jun 11, 2012

Official MySQL Cluster Training Available Near You!

Oracle is the official provider of MySQL Training.

To learn more about MySQL Cluster, you can register for the MySQL Cluster training at a large selection of locations and often you will find the course delivery in your local language! For example:



 Location                    Date                Delivery Language

 Prague, Czech Republic      17 September 2012
 Warsaw, Poland              1 August 2012
 Wien, Austria               27 August 2012
 London, United Kingdom      18 July 2012
 Lisbon, Portugal            3 December 2012     European Portuguese
 Nice, France                8 October 2012
 Barcelona, Spain            25 September 2012
 Madrid, Spain               20 August 2012
 Denver, United States       17 October 2012
 Chicago, United States      22 August 2012
 New York, United States     20 June 2012
 Petaling Jaya, Malaysia     18 July 2012
                             21 August 2012
 Melbourne, Australia        13 June 2012
 Mexico City, Mexico         23 July 2012


To learn more or register your interest in another course, location, or date, go to Oracle University's official portal.

Tuesday Jun 05, 2012

MySQL Cluster 7.3 Labs Release – Foreign Keys Are In!

Summary (aka TL/DR):

Support for Foreign Key constraints has been one of the most requested feature enhancements for MySQL Cluster. We are therefore extremely excited to announce that Foreign Keys are part of the first Labs Release of MySQL Cluster 7.3 – available for download, evaluation and feedback now! (Select the mysql-cluster-7.3-labs-June-2012 build)

In this blog, I will attempt to discuss the design rationale, implementation, configuration and steps to get started in evaluating the first MySQL Cluster 7.3 Labs Release.

Pace of Innovation

It was only a couple of months ago that we announced the General Availability (GA) of MySQL Cluster 7.2, delivering 1 billion Queries per Minute, with 70x higher cross-shard JOIN performance, Memcached NoSQL key-value API and cross-data center replication.  This release has been a huge hit, with downloads and deployments quickly reaching record levels.

The announcement of the first MySQL Cluster 7.3 Early Access lab release at today's MySQL Innovation Day event demonstrates the continued pace in Cluster development, and provides an opportunity for the community to evaluate and feedback on new features they want to see.

What’s the Plan for MySQL Cluster 7.3?

Well, Foreign Keys, as you may have gathered by now (!), and this is the focus of this first Labs Release.

As with MySQL Cluster 7.2, we plan to publish a series of preview releases for 7.3 that will incrementally add new candidate features for a final GA release (subject to usual safe harbor statement below*), including:

- New NoSQL APIs;

- Features to automate the configuration and provisioning of multi-node clusters, on premise or in the cloud;

- Performance and scalability enhancements;

- Taking advantage of features in the latest MySQL 5.x Server GA.

Design Rationale

MySQL Cluster is designed as a “Not-Only-SQL” database. It combines attributes that enable users to blend the best of both relational and NoSQL technologies into solutions that deliver web scalability with 99.999% availability and real-time performance, including:

  • Concurrent NoSQL and SQL access to the database;
  • Auto-sharding with simple scale-out across commodity hardware;
  • Multi-master replication with failover and recovery both within and across data centers;
  • Shared-nothing architecture with no single point of failure;
  • Online scaling and schema changes;
  • ACID compliance and support for complex queries, across shards.

Native support for Foreign Key constraints enables users to extend the benefits of MySQL Cluster into a broader range of use-cases, including:

- Packaged applications in areas such as eCommerce and Web Content Management that prescribe databases with Foreign Key support.

- In-house developments benefiting from Foreign Key constraints to simplify data models and eliminate the additional application logic needed to maintain data consistency and integrity between tables.


The Foreign Key functionality is implemented directly within MySQL Cluster’s data nodes, allowing any client API accessing the cluster to benefit from them – whether using SQL or one of the NoSQL interfaces (Memcached, C++, Java, JPA or HTTP/REST.)

The core referential actions defined in the SQL:2003 standard are implemented.


In addition, the MySQL Cluster implementation supports the online adding and dropping of Foreign Keys, ensuring the Cluster continues to serve both read and write requests during the operation.

An important difference to note versus the Foreign Key implementation in InnoDB is that MySQL Cluster does not support updating Primary Keys from within the Data Nodes themselves – instead, the UPDATE is emulated as a DELETE followed by an INSERT. An UPDATE will therefore return an error if the parent reference uses a Primary Key, unless the CASCADE action is specified, in which case the delete will also remove the corresponding rows in the child table. The Engineering team plans to change this behavior in a subsequent preview release.

Also note that in InnoDB, "NO ACTION" is identical to "RESTRICT". In MySQL Cluster, "NO ACTION" means "deferred check": the constraint is checked just before commit, allowing user-defined triggers to make changes that satisfy the Foreign Key constraints.


There is nothing special you have to do to enable Foreign Keys – constraint checking is enabled by default.

If you intend to migrate existing tables from another database or storage engine, for example from InnoDB, there are a couple of best practices to observe:

1. Analyze the structure of the Foreign Key graph and run the ALTER TABLE ... ENGINE=NDB statements in the correct sequence, to ensure constraints can be enforced.

2. Alternatively, drop the Foreign Key constraints prior to the import process and then recreate them when complete.
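The ordering in step 1 amounts to a topological sort of the Foreign Key graph, converting parents before children. A sketch with hypothetical table names:

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical schema: map each table to the parent tables its
# Foreign Keys reference.
fk_parents = {
    "customers": set(),
    "products": set(),
    "orders": {"customers"},
    "order_items": {"orders", "products"},
}

# Convert parents to NDB first, so each subsequent
# ALTER TABLE ... ENGINE=NDB finds its referenced tables in place.
for table in TopologicalSorter(fk_parents).static_order():
    print(f"ALTER TABLE {table} ENGINE=NDB;")
```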

Getting Started

Read this blog for a demonstration of using Foreign Keys with MySQL Cluster. 

You can download MySQL Cluster 7.3 Labs Release with Foreign Keys today - (select the mysql-cluster-7.3-labs-June-2012 build)

If you are new to MySQL Cluster, the Getting Started guide will walk you through installing an evaluation cluster on a single host (these guides reflect MySQL Cluster 7.2, but apply equally well to 7.3)

Post any questions to the MySQL Cluster forum where our Engineering team will attempt to assist you.

Post any bugs you find to the MySQL bug tracking system (select MySQL Cluster from the Category drop-down menu)

And if you have any feedback, please post it in the Comments section of this blog.


MySQL Cluster 7.2 is the GA, production-ready release of MySQL Cluster. This first Labs Release of MySQL Cluster 7.3 gives you the opportunity to preview and evaluate future developments in the MySQL Cluster database, and we are very excited to be able to share that with you.

Let us know how you get along with MySQL Cluster 7.3, and other features that you want to see in future releases.

* Safe Harbor Statement

This information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

Friday Jun 01, 2012

Configuring MySQL Cluster Data Nodes

In my previous blog post, I discussed the enhanced performance and scalability delivered by extensions to the multi-threaded data nodes in MySQL Cluster 7.2. In this post, I’ll share best practices on the configuration of data nodes to achieve optimum performance on the latest generations of multi-core, multi-thread CPU designs.

Configuring the Data Nodes

The configuration of data node threads can be managed in two ways via the config.ini file:

- Simply set MaxNoOfExecutionThreads to the appropriate number of threads to be run in the data node, based on the number of threads presented by the processors used in the host or VM.

- Use the new ThreadConfig variable, which enables users to configure both the number of threads of each type and the CPUs to bind them to.

The flexible configuration afforded by the multi-threaded data node enhancements means that it is possible to optimise data nodes to use anything from a single CPU/thread up to a 48 CPU/thread server. Co-locating the MySQL Server with a single data node can fully utilize servers with 64 – 80 CPU/threads. It is also possible to co-locate multiple data nodes per server, but this is now only required for very large servers with 4+ CPU sockets of dense multi-core processors.

24 Threads and Beyond!

An example of how to make best use of a 24 CPU/thread server box is to configure the following:

- 8 ldm threads

- 4 tc threads

- 3 recv threads

- 3 send threads

- 1 rep thread for asynchronous replication.

Each of those threads should be bound to a CPU. It is possible to bind the main thread (schema management domain) and the IO threads to the same CPU in most installations.
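A sketch of how this layout might be expressed with ThreadConfig in config.ini. The CPU numbers are illustrative only, and you should verify which thread types your MySQL Cluster version's ThreadConfig accepts:

```ini
[ndbd default]
# Illustrative binding for a 24 CPU/thread host:
# 8 ldm, 4 tc, 3 recv, 3 send, 1 rep; main and io share CPU 19
ThreadConfig=ldm={count=8,cpubind=0,1,2,3,4,5,6,7},tc={count=4,cpubind=8,9,10,11},recv={count=3,cpubind=12,13,14},send={count=3,cpubind=15,16,17},rep={count=1,cpubind=18},main={count=1,cpubind=19},io={count=1,cpubind=19}
```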

In the configuration above, we have bound threads to 20 different CPUs. We should also protect these 20 CPUs from interrupts by using the IRQBALANCE_BANNED_CPUS configuration variable in /etc/sysconfig/irqbalance and setting it to 0x0FFFFF.
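The mask simply sets one bit per banned CPU; a quick sketch of the derivation:

```python
def irq_ban_mask(num_cpus: int) -> str:
    """Hex bitmask with bits 0..num_cpus-1 set, banning those
    CPUs from IRQ balancing."""
    return hex((1 << num_cpus) - 1)

print(irq_ban_mask(20))  # 0xfffff, i.e. the 0x0FFFFF value used above
```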

The reason for doing this is that MySQL Cluster generates a lot of interrupt and OS kernel processing, and so it is recommended to separate activity across CPUs to ensure conflicts with the MySQL Cluster threads are eliminated.

When booting a Linux kernel it is also possible to provide an option isolcpus=0-19 in grub.conf. The result is that the Linux scheduler won't use these CPUs for any task. Only by using CPU affinity syscalls can a process be made to run on those CPUs.

By using this approach – binding MySQL Cluster threads to specific CPUs and banning IRQ processing on those CPUs – a very stable performance environment is created for a MySQL Cluster data node.

On a 32 CPU/Thread server:

- Increase the number of ldm threads to 12

- Increase tc threads to 6

- Provide 2 more CPUs for the OS and interrupts.

- The number of send and receive threads should, in most cases, still be sufficient.

On a 40 CPU/Thread server, increase ldm threads to 16, tc threads to 8 and increment send and receive threads to 4.

On a 48 CPU/Thread server it is possible to optimize further by using:

- 12 tc threads

- 2 more CPUs for the OS and interrupts

- Avoid using IO threads and main thread on same CPU

- Add 1 more receive thread.


As both this and the previous post seek to demonstrate, the multi-threaded data node extensions not only serve to increase performance of MySQL Cluster, they also enable users to achieve significantly improved levels of utilization from current and future generations of massively multi-core, multi-thread processor designs.

A big thanks to Mikael Ronstrom, Senior MySQL Architect at Oracle, for his work in developing these enhancements and best practices.

You can download MySQL Cluster 7.2 today and try out all of these enhancements. The Getting Started guides are an invaluable aid to quickly building a Proof of Concept

Don’t forget to check out the MySQL Cluster 7.2 New Features whitepaper to discover everything that is new in the latest GA release

Wednesday May 30, 2012

MySQL Cluster 7.2: Over 8x Higher Performance than Cluster 7.1


The scalability enhancements delivered by extensions to multi-threaded data nodes enable MySQL Cluster 7.2 to deliver over 8x higher performance than the previous MySQL Cluster 7.1 release on a recent benchmark.

What’s New in MySQL Cluster 7.2

MySQL Cluster 7.2 was released as GA (Generally Available) in February 2012, delivering many enhancements to performance on complex queries, new NoSQL Key / Value API, cross-data center replication and ease-of-use. These enhancements are summarized in the Figure below, and detailed in the MySQL Cluster New Features whitepaper

Figure 1: Next Generation Web Services, Cross Data Center Replication and Ease-of-Use

One of the key enhancements delivered in MySQL Cluster 7.2 is the set of extensions made to the multi-threading processes of the data nodes.

Multi-Threaded Data Node Extensions
The MySQL Cluster 7.2 data node is now functionally divided into seven thread types:
1) Local Data Manager threads (ldm). Note – these are sometimes also called LQH threads.
2) Transaction Coordinator threads (tc)
3) Asynchronous Replication threads (rep)
4) Schema Management threads (main)
5) Network receiver threads (recv)
6) Network send threads (send)
7) IO threads

Each of these thread types is discussed in more detail below.

MySQL Cluster 7.2 increases the maximum number of LDM threads from 4 to 16. The LDM contains the actual data, which means that when using 16 threads the data is more heavily partitioned (this is automatic in MySQL Cluster). Each LDM thread maintains its own set of data partitions, index partitions and REDO log. The number of LDM partitions per data node is not dynamically configurable, but it is possible to map more than one partition onto each LDM thread, providing flexibility in modifying the number of LDM threads.

The TC domain stores the state of in-flight transactions. This means that every new transaction can easily be assigned to a new TC thread. Testing has shown that in most cases 1 TC thread per 2 LDM threads is sufficient, and in many cases even 1 TC thread per 4 LDM threads is also acceptable. Testing also demonstrated that in some instances where the workload needed to sustain very high update loads it is necessary to configure 3 to 4 TC threads per 4 LDM threads. In the previous MySQL Cluster 7.1 release, only one TC thread was available. This limit has been increased to 16 TC threads in MySQL Cluster 7.2. The TC domain also manages the Adaptive Query Localization functionality introduced in MySQL Cluster 7.2 that significantly enhanced complex query performance by pushing JOIN operations down to the data nodes.

Asynchronous Replication was separated into its own thread with the release of MySQL Cluster 7.1, and has not been modified in the latest 7.2 release.

To scale the number of TC threads, it was necessary to separate the Schema Management domain from the TC domain. The schema management thread has little load, so is implemented with a single thread.

The Network receiver domain was bound to 1 thread in MySQL Cluster 7.1. With the increase in thread counts in MySQL Cluster 7.2, it is also necessary to increase the number of recv threads, up to a maximum of 8. This enables each receive thread to service one or more sockets used to communicate with other nodes in the Cluster.

The Network send thread is a new thread type introduced in MySQL Cluster 7.2. Previously, other threads handled send operations themselves, which can provide lower latency. To achieve the highest throughput, however, it has been necessary to create dedicated send threads, of which 8 can be configured. It is still possible to configure MySQL Cluster 7.2 in a legacy mode that does not use any send threads – useful for those workloads that are most sensitive to latency.

The IO Thread is the final thread type and there have been no changes to this domain in MySQL Cluster 7.2. Multiple IO threads were already available, which could be configured to either one thread per open file, or to a fixed number of IO threads that handle the IO traffic. Except when using compression on disk, the IO threads typically have a very light load.
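As a sketch, these thread domains are set per data node with the ThreadConfig parameter of the multi-threaded data node (ndbmtd). The counts below are illustrative only, mirroring the maximums discussed above; the exact thread-type names and limits supported vary by 7.2 maintenance release, so check the ThreadConfig documentation for your version before using this:

```ini
# config.ini fragment -- illustrative thread counts only, tune to your
# workload and the core count of the data node hosts
[ndbd default]
ThreadConfig=ldm={count=16},tc={count=8},send={count=8},recv={count=8},main={count=1}
```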

Benchmarking the Scalability Enhancements

The scalability enhancements discussed above make it possible to scale the CPU usage of each data node to more than 5x that possible in MySQL Cluster 7.1. In addition, a number of bottlenecks have been removed, making it possible to scale data node performance by even more than 5x.

Figure 2: MySQL Cluster 7.2 Delivers 8.4x Higher Performance than 7.1

The flexAsynch benchmark was used to compare MySQL Cluster 7.2 performance to 7.1 across an 8-node Intel Xeon X5670-based cluster of dual-socket commodity servers (6 cores each).

As the results demonstrate, MySQL Cluster 7.2 delivers over 8x higher performance per data node than MySQL Cluster 7.1.

More details of this and other benchmarks will be published in a new whitepaper – coming soon, so stay tuned!

In a follow-up blog post, I'll provide recommendations on optimum thread configurations for different types of server processor. You can also learn more from the Best Practices Guide to Optimizing Performance of MySQL Cluster.


MySQL Cluster has achieved a range of impressive benchmark results, and, set in context against the previous 7.1 release, delivers over 8x higher performance per node.

As a result, the multi-threaded data node extensions not only serve to increase performance of MySQL Cluster, they also enable users to achieve significantly improved levels of utilization from current and future generations of massively multi-core, multi-thread processor designs.

Tuesday May 29, 2012

Performance Testing of MySQL Cluster: The flexAsynch Benchmark

Following the release of MySQL Cluster 7.2, the Engineering team has been busy publishing a range of new performance benchmarks, most recently delivering 1.2 billion UPDATE operations per minute across a cluster of 30 commodity Intel Xeon E5-based servers.

Figure 1: Linear Scaling of Write Operations

These performance tests were run with the flexAsynch benchmark, so in this blog I wanted to provide a little more detail on that benchmark, and guidance on how you can use it in your own performance evaluations.

FlexAsynch is an open source, highly adaptable test suite that can be downloaded as part of the MySQL Cluster source tarball, under the storage/ndb/test/ndbapi directory.

An automated tool is available to run the benchmark, with full instructions documented in the README file packaged in the dbt2-0.37.50 tarball. The tarball also includes the scripts to run the benchmark, which are further described on the MySQL benchmark page.

The benchmark reads or updates an entire row from the database as part of its test operation. All UPDATE operations are fully transactional. As part of these tests, each row in this benchmark is 100 bytes total, comprising 25 columns, each 4 bytes in size, though the size and number of columns are fully configurable.
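For illustration, the default record layout described above can be sketched as follows (a hypothetical helper; the column count and width are benchmark parameters, not fixed values):

```python
def row_size_bytes(num_columns: int = 25, column_bytes: int = 4) -> int:
    """Total row size for the default flexAsynch record layout."""
    return num_columns * column_bytes

# 25 columns x 4 bytes = 100 bytes per row
print(row_size_bytes())
```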

Database access from the application tier is via the C++ NDB API, one of the NoSQL interfaces implemented by MySQL Cluster that bypass the SQL layer to communicate directly with the data nodes. Other NoSQL interfaces include the Memcached API, Java, JPA and HTTP/REST.

flexAsynch makes it possible to generate a variety of loads for MySQL Cluster benchmarking, enabling users to simulate their own environment.

Using the scripts in the dbt2-0.37.50 tarball and flexAsynch it is possible to configure a range of parameters, including:

  • The number of benchmark drivers and the number of threads per benchmark driver;
  • The number of simultaneous transactions executed per thread;
  • The size and number of records;
  • The type of operation performed (INSERT, UPDATE, DELETE, SELECT);
  • The size of the database.

By using flexAsynch, users can test the limits of MySQL Cluster performance. The size of the database used for testing is configurable, but given that flexAsynch can load more than 1GB of data per second into the MySQL Cluster database, users should ensure the database is sufficiently large in order to achieve consistent benchmark numbers.
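As a rough sizing sketch (the overhead multiplier here is an assumption to account for hash index and page overhead, not an NDB-documented figure):

```python
def database_size_bytes(rows: int, row_bytes: int = 100,
                        overhead: float = 1.3) -> int:
    """Approximate in-memory footprint of a flexAsynch data set.

    overhead is an assumed multiplier for index and page overhead --
    tune it for your own schema rather than relying on this value.
    """
    return round(rows * row_bytes * overhead)

# At a load rate of over 1 GB/s, even a data set of this size can be
# populated in a couple of minutes.
print(database_size_bytes(1_000_000_000))  # 1 billion 100-byte rows
```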

We will shortly publish a whitepaper that discusses MySQL Cluster benchmarking in more detail – so stay tuned for that. In the meantime, flexAsynch is fully available today for you to run your own tests. Also check out our Best Practices guide for optimizing the performance of MySQL Cluster.

Friday Mar 30, 2012

Guide to MySQL & NoSQL, Webinar Q&A

Yesterday we ran a webinar discussing the demands of next generation web services and how blending the best of relational and NoSQL technologies enables developers and architects to deliver the agility, performance and availability needed to be successful.

Attendees posted a number of great questions to the MySQL developers, serving to provide additional insights into areas like auto-sharding and cross-shard JOINs, replication, performance, client libraries, etc. So I thought it would be useful to post those below, for the benefit of those unable to attend the webinar.

Before getting to the Q&A, there are a couple of other resources that may be useful to those looking at NoSQL capabilities within MySQL:

- On-Demand webinar

- Slides used during the webinar

- Guide to MySQL and NoSQL whitepaper 

- MySQL Cluster demo, including NoSQL interfaces, auto-sharding, high availability, etc. 

So here is the Q&A from the event:

Q. Where does MySQL Cluster fit in to the CAP theorem?

A. MySQL Cluster is flexible. A single Cluster will prefer consistency over availability in the presence of network partitions. A pair of Clusters can be configured to prefer availability over consistency. A full explanation can be found on the MySQL Cluster & CAP Theorem blog post. 

Q. Can you configure the number of replicas? (the slide used a replication factor of 1)

A. Yes. A cluster is configured by an .ini file. The NoOfReplicas option sets the total number of copies of each data fragment: 1 = no data redundancy, 2 = two copies, etc. Usually there is no benefit in setting it higher than 2.
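For example, a config.ini fragment using the standard NDB parameter name:

```ini
# config.ini fragment: keep two copies of every data fragment
[ndbd default]
NoOfReplicas=2
```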

Q. Interestingly, most (if not all) of the NoSQL databases recommend having 3 copies of the data (the replication factor).

A. Yes, with configurable quorum-based reads and writes. MySQL Cluster does not need a quorum of replicas online to provide service. Systems that require a quorum need more than 2 replicas to be able to tolerate a single failure. Additionally, many NoSQL systems take liberal inspiration from the original GFS paper, which described a 3-replica configuration. MySQL Cluster avoids the need for a quorum by using a lightweight arbitrator. You can configure more than 2 replicas, but this is a tradeoff between incrementally improved availability and linearly increased cost.

Q. Can you have cross node group JOINS? Wouldn't that run into the risk of flooding the network?

A. MySQL Cluster 7.2 supports cross node-group JOINs. A full cross-join can require a large amount of data transfer, which may bottleneck on network bandwidth. However, for more selective JOINs, typically seen with OLTP and light analytic applications, cross node-group JOINs give a great performance boost and a network bandwidth saving over having the MySQL Server perform the JOIN.

Q. Are the details of the benchmark available anywhere? According to my calculations it results in approx. 350k ops/sec per processor, which is the largest number I've seen lately.

A. The details are linked from Mikael Ronstrom's blog.

The benchmark uses a benchmarking tool we call flexAsynch, which runs parallel asynchronous transactions. It involved 100-byte reads of 25 columns each. Regarding the per-processor ops/s, MySQL Cluster is particularly efficient in terms of throughput per node. It uses lock-free, minimal-copy message passing internally and maximizes ID cache reuse. Note also that these are in-memory tables; there is no need to read anything from disk.

Q. Is access control (such as table-level) planned to be supported for the NoSQL access modes?

A. Currently we have not seen much need for full SQL-like access control (which has always been overkill for web apps and telco apps), so we have no plans, though with memcached in particular it is certainly possible to turn on connection-level access control. Table-level controls specifically are not planned.

Q. How does the performance of the memcached API with MySQL compare to memcached+MySQL, or to any other object cache such as Ehcache with a MySQL database?

A. With the memcached API we generally see a memcached response in less than 1 ms, and a small cluster with one memcached server can handle tens of thousands of operations per second.

Q. Can .NET access the memcached API?

A. Yes, just use a .NET memcached client such as the enyim or BeIT memcached libraries.

Q. Is row-level locking applicable when you update a column through the memcached API?

A. An update that comes through memcached takes a row lock and then releases it immediately. Memcached operations like INCREMENT are actually pushed down to the data nodes, so in most cases the locks are not even held long enough for a network round trip.
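Since the Cluster memcached API speaks the standard memcached ASCII protocol, it is easy to see what such an operation looks like on the wire. A minimal sketch (the helper names are hypothetical; the command framing is the standard memcached text protocol):

```python
def build_set(key: str, value: bytes, flags: int = 0,
              exptime: int = 0) -> bytes:
    """Frame a memcached ASCII-protocol 'set' command."""
    header = f"set {key} {flags} {exptime} {len(value)}\r\n".encode()
    return header + value + b"\r\n"

def build_incr(key: str, delta: int) -> bytes:
    """Frame an 'incr' command -- with MySQL Cluster this operation is
    executed directly on the data node holding the row."""
    return f"incr {key} {delta}\r\n".encode()

print(build_incr("page_views", 1))  # b'incr page_views 1\r\n'
```

Any standard memcached client library produces these same commands, which is why existing clients work unchanged against the Cluster memcached API.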

Q. Has anyone published an example using something like PHP? I am assuming that you just use the PHP memcached extension to hook into the memcached API. Is that correct?

A. Not that I'm aware of, but you can absolutely use it with PHP or any of the other memcached client drivers.

Q. For beginners, we need more examples.

A. Take a look here for a fully worked example.

Q. Can I access MySQL using Cobol (Open Cobol) or C and if so where can I find the coding libraries etc?

A. There is a COBOL implementation that works well with MySQL, but I do not think it is Open Cobol. There is also a MySQL C client library that is a standard part of every MySQL distribution.

Q. Is there a place to go to find help when testing and/or implementing the NoSQL access?

A. If using Cluster, then you can use the alias or post on the MySQL Cluster forum.

Q. Are there any white papers on this? 

A. Yes – there is more detail in the MySQL Guide to NoSQL whitepaper.

If you have further questions, please don’t hesitate to use the comments below!

