Thursday Nov 29, 2012

MySQL and Hadoop Integration - Unlocking New Insight

“Big Data” offers the potential for organizations to revolutionize their operations. With the volume of business data doubling every 1.2 years, analysts and business users are discovering very real benefits when integrating and analyzing data from multiple sources, enabling deeper insight into their customers, partners, and business processes.

As the world’s most popular open source database, and the most deployed database in the web and cloud, MySQL is a key component of many big data platforms, with Hadoop vendors estimating 80% of deployments are integrated with MySQL.

The new Guide to MySQL and Hadoop presents the tools enabling integration between the two data platforms, supporting the data lifecycle from acquisition and organisation to analysis and visualisation / decision, as shown in the figure below.

The Guide details each of these stages and the technologies supporting them:

Acquire: Through new NoSQL APIs, MySQL is able to ingest high volume, high velocity data, without sacrificing ACID guarantees, thereby ensuring data quality. Real-time analytics can also be run against newly acquired data, enabling immediate business insight, before data is loaded into Hadoop. In addition, sensitive data can be pre-processed, for example healthcare or financial services records can be anonymized, before transfer to Hadoop.

Organize: Data is transferred from MySQL tables to Hadoop using Apache Sqoop. With the MySQL Binlog (Binary Log) API, users can also invoke real-time change data capture processes to stream updates to HDFS.

Analyze: Multi-structured data ingested from multiple sources is consolidated and processed within the Hadoop platform.

Decide: The results of the analysis are loaded back to MySQL via Apache Sqoop where they inform real-time operational processes or provide source data for BI analytics tools.
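As an illustration of the pre-processing mentioned in the Acquire stage, the following is a minimal sketch of masking sensitive fields before rows are transferred to Hadoop. It is not taken from the Guide: the function names, the secret key, and the sample record are all hypothetical, and it uses a keyed hash (strictly speaking pseudonymization rather than full anonymization) from Node's built-in crypto module.

```javascript
// Sketch only: pseudonymize sensitive columns before rows are exported.
const crypto = require("crypto");

// Hypothetical secret; in practice this would come from secure configuration.
const SECRET_KEY = "replace-with-a-real-secret";

// Replace a sensitive value with a keyed SHA-256 digest (HMAC), so the same
// input always maps to the same token but cannot be read back directly.
function pseudonymize(value) {
  return crypto.createHmac("sha256", SECRET_KEY)
    .update(String(value))
    .digest("hex");
}

// Return a copy of the row with the named fields pseudonymized.
function anonymizeRow(row, sensitiveFields) {
  const out = Object.assign({}, row);
  sensitiveFields.forEach(function (field) {
    if (out[field] !== undefined && out[field] !== null) {
      out[field] = pseudonymize(out[field]);
    }
  });
  return out;
}

// Hypothetical patient record, invented for illustration.
const record = { id: 42, name: "Jane Doe", ssn: "123-45-6789", diagnosis: "J45" };
const safe = anonymizeRow(record, ["name", "ssn"]);
console.log(safe);
```

Because the digest is deterministic, records belonging to the same person can still be joined after export, while the original values stay in MySQL.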

So how are companies taking advantage of this today? As an example, on-line retailers can use big data from their web properties to better understand site visitors’ activities, such as paths through the site, pages viewed, and comments posted. This knowledge can be combined with user profiles and purchasing history to gain a better understanding of customers, and the delivery of highly targeted offers.

Of course, it is not just in the web that big data can make a difference. Every business activity can benefit, with other common use cases including:

- Sentiment analysis;

- Marketing campaign analysis;

- Customer churn modeling;

- Fraud detection;

- Research and Development;

- Risk Modeling;

- And more.

As the guide discusses, Big Data is promising a significant transformation of the way organizations leverage data to run their businesses. MySQL can be seamlessly integrated within a Big Data lifecycle, enabling the unification of multi-structured data into common data platforms, taking advantage of all new data sources and yielding more insight than was ever previously imaginable.

Download the guide to MySQL and Hadoop integration to learn more. I'd also be interested in hearing about how you are integrating MySQL with Hadoop today, and your requirements for the future, so please use the comments on this blog to share your insights.

Thursday Nov 01, 2012

MySQL Cluster 7.3: On-Demand Webinar and Q&A Available

The on-demand webinar for the MySQL Cluster 7.3 Development Release is now available.

You can learn more about the design, implementation and getting started with all of the new MySQL Cluster 7.3 features from the comfort and convenience of your own device, including:

- Foreign Key constraints in MySQL Cluster

- Node.js NoSQL API 

- Auto-installation of higher-performance distributed clusters

We received some great questions over the course of the webinar, and I wanted to share those for the benefit of a broader audience.

Q. What Foreign Key actions are supported?

A. The core referential actions defined in the SQL:2003 standard are implemented:





Q. Where are Foreign Keys implemented, i.e. data nodes or SQL nodes?

A. They are implemented in the data nodes, and can therefore be enforced for both the SQL and NoSQL APIs.

Q. Are they compatible with the InnoDB Foreign Key implementation?

A. Yes, with the following exceptions:

- InnoDB doesn’t support “No Action” constraints; MySQL Cluster does.

- You can choose to suspend FK constraint enforcement with InnoDB using the FOREIGN_KEY_CHECKS parameter; at the moment, MySQL Cluster ignores that parameter.

- You cannot set up FKs between 2 tables where one is stored using MySQL Cluster and the other InnoDB.

- You cannot change primary keys through the NDB API, which means that the MySQL Server actually has to simulate such operations by deleting and re-adding the row. If the PK in the parent table has a FK constraint on it then this causes non-ideal behaviour. With Restrict or No Action constraints, the change will result in an error. With Cascaded constraints, you’d want the rows in the child table to be updated with the new FK value, but the implicit delete of the row from the parent table would remove the associated rows from the child table, and the subsequent implicit insert into the parent wouldn’t reinstate the child rows. For this reason, an attempt to add an ON UPDATE CASCADE where the parent column is a primary key will be rejected.
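To make the primary key limitation easier to picture, here is a toy in-memory model of the delete-then-insert sequence described above. It is only a sketch: it does not use the NDB API or MySQL at all, and the table contents and function names are invented for illustration.

```javascript
// Toy in-memory model of a parent/child pair with cascading deletes.
// This is not the NDB API; it only mimics the delete-then-insert sequence.
function createTables() {
  return {
    parent: new Map([[1, { id: 1, name: "order-1" }]]),
    child: [
      { id: 10, parentId: 1 },
      { id: 11, parentId: 1 }
    ]
  };
}

// Deleting a parent row cascades to the child rows that reference it.
function deleteParent(db, id) {
  db.parent.delete(id);
  db.child = db.child.filter(function (row) { return row.parentId !== id; });
}

function insertParent(db, row) {
  db.parent.set(row.id, row);
}

// A primary key "update" simulated as an implicit delete plus insert.
function updateParentKey(db, oldId, newId) {
  const row = Object.assign({}, db.parent.get(oldId), { id: newId });
  deleteParent(db, oldId);   // cascade removes the child rows here
  insertParent(db, row);     // nothing reinstates them
}

const db = createTables();
updateParentKey(db, 1, 2);
console.log(db.parent.has(2), db.child.length); // prints: true 0
```

The parent row survives under its new key, but both child rows are gone, which is the outcome the restriction is there to prevent.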

Q. Does adding or dropping Foreign Keys cause downtime due to a schema change?

A. Nope, this is an online operation. MySQL Cluster supports a number of on-line schema changes, i.e. adding and dropping indexes, adding columns, etc.

Q. Where can I see an example of node.js with MySQL Cluster?

A. Check out the tutorial and download the code from GitHub.

Q. Can I use the auto-installer to support remote deployments? How about setting up MySQL Cluster 7.2?

A. Yes to both!

Q. Can I get a demo?

A. Check out the tutorial. You can download the code by going to the Select Build drop-down box.

Q. What is the minimum network speed required for a geo-distributed cluster with synchronous replication?

A. If you're splitting your cluster between sites then we recommend a network latency of 20ms or less. Alternatively, use MySQL asynchronous replication, where the latency of your WAN doesn't impact the latency of your reads/writes.

Q. Where can one learn more about the PayPal project with MySQL Cluster?

A. Take a look at the following - you'll find press coverage, a video and slides from their keynote presentation.

So, if you want to learn more, listen to the new MySQL Cluster 7.3 on-demand webinar.

MySQL Cluster 7.3 is still in the development phase, so it would be great to get your feedback on these new features, and things you want to see!

Monday Oct 22, 2012

MySQL Cluster 7.3 - Join This Week's Webinar to Learn What's New

The first Development Milestone and Early Access releases of MySQL Cluster 7.3 were announced just a few weeks ago. To provide more detail and demonstrate the new features, Andrew Morgan and I will be hosting a live webinar this coming Thursday 25th October at 0900 Pacific Time / 1600 UTC.

Even if you can't make the live webinar, it is still worth registering for the event, as you will receive a notification when the replay is available to view on-demand at your convenience.

In the webinar, we will discuss the enhancements being previewed as part of MySQL Cluster 7.3, including:

- Foreign Key Constraints: Yes, we've looked into the future and decided Foreign Keys are it ;-)

You can read more about the implementation of Foreign Keys in MySQL Cluster 7.3 here

- Node.js NoSQL API: Allowing web, mobile and cloud services to query and receive result sets from MySQL Cluster, natively in JavaScript, enables developers to seamlessly couple high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

You can study the Node.js / MySQL Cluster tutorial here

- Auto-Installer: This new web-based GUI makes it simple for DevOps teams to quickly configure and provision highly optimized MySQL Cluster deployments on-premise or in the cloud.

You can view a YouTube tutorial on the MySQL Cluster Auto-Installer here 

So we have a lot to cover in our 45 minute session. It will be time well spent if you want to know more about the future direction of MySQL Cluster and how it can help you innovate faster, with greater simplicity.

Registration is open 

Jumpstart your MySQL Cluster Knowledge

Join companies in the web, gaming, telecoms and mobile areas by learning about MySQL Cluster's distributed, shared-nothing, real-time design.

The 3-day MySQL Cluster course teaches you how to configure and manage the cluster nodes to ensure high availability. Learn how to install different nodes and understand cluster internals. Here is a sample of some events on the schedule for this course:



Location: date (delivery language, where listed)

- Wien, Austria: 4 February, 2013
- Prague, Czech Republic: 10 December, 2012
- London, England: 12 December, 2012
- Hamburg, Germany: 21 January, 2013
- Stuttgart, Germany: 26 March, 2013
- Budapest, Hungary: 4 December, 2012
- Warsaw, Poland: 10 December, 2012
- Lisbon, Portugal: 3 December, 2012 (European Portuguese)
- Barcelona, Spain: 19 November, 2012
- Madrid, Spain: 25 February, 2013
- Jakarta, Indonesia: 21 January, 2013
- (location not given): 29 October, 2012
- Chicago, United States: 27 March, 2013
- Reston, United States: 6 February, 2013


For more information on the authentic MySQL curriculum go to

Tuesday Oct 09, 2012

EmblaCom Oy Maximizes Database Availability and Reduces Costs with MySQL Cluster

Headquartered in Finland, EmblaCom Oy provides turnkey and cloud-hosted voice solutions to mobile operators around the globe. Since launching the original mobile private branch exchange (PBX) in 1998, the company has focused on helping its partners provide efficient voice communications to their key business customers. The company’s voice solutions are used by millions of subscribers, worldwide.

EmblaCom Oy needed to replace several database engines with a standardized, scalable, development-friendly database solution to maximize availability and cut costs. The company chose MySQL Cluster Carrier Grade Edition, which has maximized accessibility to EmblaCom’s services for its clients and their hundreds of thousands of subscribers. The initiative has also reduced, by half, the cost of the database solution installation for customers, as well as lowered maintenance and customer service costs.

Read the entire case study here.

Monday Oct 08, 2012

New Options for MySQL High Availability

Data is the currency of today’s web, mobile, social, enterprise and cloud applications. Ensuring data is always available is a top priority for any organization – minutes of downtime will result in significant loss of revenue and reputation.

There is not a “one size fits all” approach to delivering High Availability (HA). Unique application attributes, business requirements, operational capabilities and legacy infrastructure can all influence HA technology selection. And then technology is only one element in delivering HA – “People and Processes” are just as critical as the technology itself.

For this reason, MySQL Enterprise Edition supports a range of HA solutions, fully certified and supported by Oracle. MySQL Enterprise HA is not some expensive add-on, but is included within the core Enterprise Edition offering, along with the management tools, consulting and 24x7 support needed to deliver true HA.

At the recent MySQL Connect conference, we announced new HA options for MySQL users running on both Linux and Solaris:

- DRBD for MySQL

- Oracle Solaris Clustering for MySQL

DRBD (Distributed Replicated Block Device) is an open source Linux kernel module which leverages synchronous replication to deliver high availability database applications across local storage. DRBD synchronizes database changes by mirroring data from an active node to a standby node and supports automatic failover and recovery. Linux, DRBD, Corosync and Pacemaker provide an integrated stack of mature and proven open source technologies.

DRBD Stack: Providing Synchronous Replication for the MySQL Database with InnoDB

Download the DRBD for MySQL whitepaper to learn more, including step-by-step instructions to install, configure and provision DRBD with MySQL.

Oracle Solaris Cluster provides high availability and load balancing to mission-critical applications and services in physical or virtualized environments. With Oracle Solaris Cluster, organizations have a scalable and flexible solution that is suited equally to small clusters in local datacenters or larger multi-site, multi-cluster deployments that are part of enterprise disaster recovery implementations. The Oracle Solaris Cluster MySQL agent integrates seamlessly with MySQL offering a selection of configuration options in the various Oracle Solaris Cluster topologies.

Putting it All Together

When you add MySQL Replication and MySQL Cluster into the HA mix, along with 3rd party solutions, users have extensive choice (and decisions to make) to deliver HA services built on MySQL.

To make the decision process simpler, we have also published a new MySQL HA Solutions Guide.

Exploring beyond just the technology, the guide presents a methodology to select the best HA solution for your new web, cloud and mobile services, while also discussing the importance of people and process in ensuring service continuity.

This subject was recently presented at Oracle OpenWorld, and the slides are available here.

Whatever your uptime requirements, you can be sure MySQL has an HA solution for your needs.

Please don't hesitate to let us know of your HA requirements in the comments section of this blog. You can also contact MySQL consulting to learn more about their HA Jumpstart offering which will help you scope out your scaling and HA requirements.

Friday Oct 05, 2012

New MySQL Cluster 7.3 Previews: Foreign Keys, NoSQL Node.js API and Auto-Tuned Clusters

At this week's MySQL Connect conference, Oracle previewed an exciting new wave of developments for MySQL Cluster, further extending its simplicity and flexibility by expanding the range of use-cases, adding new NoSQL options, and automating configuration.

What’s new:

  • Development Release 1: MySQL Cluster 7.3 with Foreign Keys
  • Early Access “Labs” Preview: MySQL Cluster NoSQL API for Node.js
  • Early Access “Labs” Preview: MySQL Cluster GUI-Based Auto-Installer

In this blog, I'll introduce you to the features being previewed.

Review the blogs listed below for more detail on each of the specific features discussed.

Save the date!: A live webinar is scheduled for Thursday 25th October at 0900 Pacific Time / 1600 UTC, where we will discuss each of these enhancements in more detail. Registration will open soon and will be published to the MySQL webinars page.

MySQL Cluster 7.3: Development Release 1

The first MySQL Cluster 7.3 Development Milestone Release (DMR) previews Foreign Keys, bringing powerful new functionality to MySQL Cluster while reducing development complexity.

Foreign Key support has been one of the most requested enhancements to MySQL Cluster – enabling users to simplify their data models and application logic – while extending the range of use-cases for both custom projects requiring referential integrity and packaged applications, such as eCommerce, CRM, CMS, etc.


The Foreign Key functionality is implemented directly within the MySQL Cluster data nodes, allowing any client API accessing the cluster to benefit from them – whether they are SQL or one of the NoSQL interfaces (Memcached, C++, Java, JPA, HTTP/REST or the new Node.js API - discussed later.)

The core referential actions defined in the SQL:2003 standard are implemented:


In addition, the MySQL Cluster implementation supports the online adding and dropping of Foreign Keys, ensuring the Cluster continues to serve both read and write requests during the operation. This represents a further enhancement to MySQL Cluster's support for on-line schema changes, i.e. adding and dropping indexes, adding columns, etc.

Read this blog for a demonstration of using Foreign Keys with MySQL Cluster. 

Getting Started with MySQL Cluster 7.3 DMR1:

Users can download either the source or binary and evaluate the MySQL Cluster 7.3 DMR with Foreign Keys now! (Select the Development Release tab).

MySQL Cluster NoSQL API for Node.js

Node.js is hot! In a little over 3 years, it has become one of the most popular environments for developing next generation web, cloud, mobile and social applications. Bringing JavaScript from the browser to the server, the design goal of Node.js is to build new real-time applications supporting millions of client connections, serviced by a single CPU core.

Making it simple to further extend the flexibility and power of Node.js to the database layer, we are previewing the Node.js JavaScript API for MySQL Cluster as an Early Access release, available for download now from Select the following build:


Alternatively, you can clone the project at the MySQL GitHub page

Implemented as a module for the V8 engine, the new API provides Node.js with a native, asynchronous JavaScript interface that can be used to both query and receive result sets directly from MySQL Cluster, without transformations to SQL.

Figure 1: MySQL Cluster NoSQL API for Node.js enables end-to-end JavaScript development

Rather than just presenting a simple interface to the database, the Node.js module integrates the MySQL Cluster native API library directly within the web application itself, enabling developers to seamlessly couple their high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

The new Node.js API joins a rich array of NoSQL interfaces available for MySQL Cluster. Whichever API is chosen for an application, SQL and NoSQL can be used concurrently across the same data set, providing the ultimate in developer flexibility. 

Get started with the MySQL Cluster NoSQL API for Node.js tutorial.

MySQL Cluster GUI-Based Auto-Installer

Compatible with both MySQL Cluster 7.2 and 7.3, the Auto-Installer makes it simple for DevOps teams to quickly configure and provision highly optimized MySQL Cluster deployments – whether on-premise or in the cloud.

Implemented with a standard HTML GUI and Python-based web server back-end, the Auto-Installer intelligently configures MySQL Cluster based on application requirements and auto-discovered hardware resources.

Figure 2: Automated Tuning and Configuration of MySQL Cluster

Developed by the same engineering team responsible for the MySQL Cluster database, the installer provides standardized configurations that make it simple, quick and easy to build stable and high performance clustered environments.

The auto-installer is previewed as an Early Access release, available for download now from, by selecting the MySQL-Cluster-Auto-Installer build.

You can read more about getting started with the MySQL Cluster auto-installer here.

Watch the YouTube video for a demonstration of using the MySQL Cluster auto-installer

Getting Started with MySQL Cluster

If you are new to MySQL Cluster, the Getting Started guide will walk you through installing an evaluation cluster on a single host (these guides reflect MySQL Cluster 7.2, but apply equally well to 7.3 and the Early Access previews). Or use the new MySQL Cluster Auto-Installer!

Download the Guide to Scaling Web Databases with MySQL Cluster (to learn more about its architecture, design and ideal use-cases).

Post any questions to the MySQL Cluster forum where our Engineering team and the MySQL Cluster community will attempt to assist you.

Post any bugs you find to the MySQL bug tracking system (select MySQL Cluster from the Category drop-down menu)

And if you have any feedback, please post it to the Comments section here or in the blogs referenced in this article.


MySQL Cluster 7.2 is the GA, production-ready release of MySQL Cluster. The first Development Release of MySQL Cluster 7.3 and the Early Access previews give you the opportunity to preview and evaluate future developments in the MySQL Cluster database, and we are very excited to be able to share that with you.

Let us know how you get along with MySQL Cluster 7.3, and other features that you want to see in future releases, by using the comments of this blog.

Saturday Sep 29, 2012

Tutorial: Getting Started with the NoSQL JavaScript / Node.js API for MySQL Cluster

Tutorial authored by Craig Russell and JD Duncan 

The MySQL Cluster team are working on a new NoSQL JavaScript connector for MySQL. The objectives are simplicity and high performance for JavaScript users:

- allows end-to-end JavaScript development, from the browser to the server and now to the world's most popular open source database

- native "NoSQL" access to the storage layer without going first through SQL transformations and parsing.

Node.js is a complete web platform built around JavaScript designed to deliver millions of client connections on commodity hardware. With the MySQL NoSQL Connector for JavaScript, Node.js users can easily add data access and persistence to their web, cloud, social and mobile applications.

While the initial implementation is designed to plug and play with Node.js, the actual implementation doesn't depend heavily on Node, potentially enabling wider platform support in the future.


The architecture and user interface of this connector are very different from other MySQL connectors in a major way: it is an asynchronous interface that follows the event model built into Node.js.

To make it as easy as possible, we decided to use a domain object model to store the data. This allows for users to query data from the database and have a fully-instantiated object to work with, instead of having to deal with rows and columns of the database. The domain object model can have any user behavior that is desired, with the NoSQL connector providing the data from the database.
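The domain object idea can be sketched in plain JavaScript without the connector. This is only an illustration of the concept, not how the connector is implemented: the toDomainObject helper and the sample row are hypothetical.

```javascript
// Plain-JavaScript sketch of a domain object model; no connector involved.
function Employee(id, name, salary) {
  this.id = id;
  this.name = name;
  this.salary = salary;
}

// User-defined behavior lives on the prototype.
Employee.prototype.giveRaise = function (percent) {
  this.salary *= 1 + percent;
};

// Hypothetical helper: build an instance of the given constructor and copy
// the column values from a raw row onto it.
function toDomainObject(Constructor, row) {
  const instance = Object.create(Constructor.prototype);
  Object.keys(row).forEach(function (column) {
    instance[column] = row[column];
  });
  return instance;
}

// A raw row as it might come back from a hypothetical employee table.
const row = { id: 0, name: "Ada", salary: 1000 };
const emp = toDomainObject(Employee, row);
emp.giveRaise(0.5);
console.log(emp instanceof Employee, emp.salary); // prints: true 1500
```

The application then works with an Employee that carries its own behavior, rather than with a bare set of column values.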

To make it as fast as possible, we use a direct connection from the user's address space to the database. This approach means that no SQL (pun intended) is needed to get to the data, and no SQL server is between the user and the data.

The connector is being developed to be extensible to multiple underlying database technologies, including direct, native access to both the MySQL Cluster "ndb" and InnoDB storage engines.

The connector integrates the MySQL Cluster native API library directly within the Node.js platform itself, enabling developers to seamlessly couple their high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

The following sections take you through how to connect to MySQL, query the data and how to get started.

Connecting to the database

A Session is the main user access path to the database. You can get a Session object directly from the connector using the openSession function:

var nosql = require("mysql-js");

var dbProperties = {
    "implementation" : "ndb",
    "database" : "test"
};

nosql.openSession(dbProperties, null, onSession);

The openSession function calls back into the application upon creating a Session. The Session is then used to create, delete, update, and read objects.

Reading data

The Session can read data from the database in a number of ways. If you simply want the data from the database, you provide a table name and the key of the row that you want. For example, consider this schema:

create table employee (

  id int not null primary key,

  name varchar(32),

  salary float

) ENGINE=ndbcluster;

Since the primary key is a number, you can provide the key as a number to the find function.

function onSession(err, session) {
  if (err) {
    console.log(err);
    // ... error handling
    return;
  }
  session.find('employee', 0, onData);
}

function onData(err, data) {
  if (err) {
    console.log(err);
    // ... error handling
    return;
  }
  console.log('Found: ', JSON.stringify(data));
  // ... use data in application
}


If you want to have the data stored in your own domain model, you tell the connector which table your domain model uses, by specifying an annotation, and pass your domain model to the find function.

var annotations = new nosql.Annotations();

function Employee(id, name, salary) {
  this.id = id;
  this.name = name;
  this.salary = salary;
  // percent is a fraction, e.g. 0.12 for a 12% raise
  this.giveRaise = function(percent) {
    this.salary *= 1 + percent;
  };
}

annotations.mapClass(Employee, {'table' : 'employee'});

function onSession(err, session) {
  if (err) {
    console.log(err);
    // ... error handling
    return;
  }
  session.find(Employee, 0, onData);
}


Updating data

You can update the emp instance in memory, but to make the raise persistent, you need to write it back to the database, using the update function.

function onData(err, emp) {
  if (err) {
    console.log(err);
    // ... error handling
    return;
  }
  console.log('Found: ', JSON.stringify(emp));
  emp.giveRaise(0.12); // gee, thanks!
  session.update(emp); // oops, session is out of scope here
}


Using JavaScript can be tricky because variables are scoped to the function that declares them: session is a parameter of onSession, so it is not visible inside onData. You can create a closure to handle these variables, or use a feature of the connector to remember your variables.

The connector API takes a fixed number of parameters and returns a fixed number of result parameters to the callback function. But the connector will keep track of any extra variables you pass and return them to the callback. So in the above example, change the onSession function to pass the session variable as an extra argument, and you can refer to it in the onData function:

function onSession(err, session) {
  if (err) {
    console.log(err);
    // ... error handling
    return;
  }
  session.find(Employee, 0, onData, session);
}

function onData(err, emp, session) {
  if (err) {
    console.log(err);
    // ... error handling
    return;
  }
  console.log('Found: ', JSON.stringify(emp));
  emp.giveRaise(0.12); // gee, thanks!
  session.update(emp, onUpdate); // session is now in scope
}

function onUpdate(err, emp) {
  if (err) {
    console.log(err);
    // ... error handling
  }
}
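The closure alternative mentioned above might look like the following sketch. To keep it runnable without a database, it uses a hypothetical in-memory stand-in for the Session (createFakeSession is invented here and calls back synchronously, whereas the real connector is asynchronous); only the find and update call shapes are borrowed from the examples in this tutorial.

```javascript
// Hypothetical in-memory stand-in for a connector Session, so the sketch runs
// without a database. It calls back synchronously; the real connector is
// asynchronous.
function createFakeSession(rows) {
  return {
    find: function (table, key, callback) {
      callback(null, rows[key]);
    },
    update: function (obj, callback) {
      rows[obj.id] = obj;
      callback(null, obj);
    }
  };
}

const rows = { 0: { id: 0, name: "Ada", salary: 1000 } };

function onSession(err, session) {
  if (err) {
    console.log(err);
    return;
  }
  // onData is defined inside onSession, so it closes over session.
  session.find("employee", 0, function onData(err, emp) {
    if (err) {
      console.log(err);
      return;
    }
    emp.salary *= 1.5;
    session.update(emp, function onUpdate(err) {
      if (err) {
        console.log(err);
      }
    });
  });
}

onSession(null, createFakeSession(rows));
console.log(rows[0].salary); // prints: 1500
```

Nesting the callbacks keeps session in scope at the cost of deeper indentation; passing it as an extra argument keeps the callbacks flat.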


Inserting data

Inserting data requires a mapped JavaScript user function (constructor) and a session. Create a variable and persist it:

function onSession(err, session) {
  var data = new Employee(999, 'Mat Keep', 20000000);
  session.persist(data, onInsert);
}



Deleting data

To remove data from the database, use the session remove function. You use an instance of the domain object to identify the row you want to remove. Only the key field is relevant.

function onSession(err, session) {
  var key = new Employee(999);
  // pass the instance that identifies the row, not the constructor
  session.remove(key, onDelete);
}



More extensive queries

We are working on the implementation of more extensive queries along the lines of the criteria query API. Stay tuned.

How to evaluate

The MySQL Connector for JavaScript is available for download from Select the build:


You can also clone the project on GitHub

Since it is still early in development, feedback is especially valuable (so don't hesitate to leave comments on this blog, or head to the MySQL Cluster forum). Try it out and see how easy (and fast) it is to integrate MySQL Cluster into your Node.js platforms.

You can learn more about other previewed functionality of MySQL Cluster 7.3 here

Wednesday Sep 12, 2012

MySQL Connect: What to Expect From the Wondrous Land of MySQL Cluster

The MySQL Connect conference is only a couple of weeks away, with MySQL engineers, support teams, consultants and community aces busy putting the final touches to their talks.

There will be many exciting new announcements and sharing of best practices at the conference, covering the range of MySQL technologies.

MySQL Cluster will be a big part of this, so I wanted to share some key sessions for those of you who plan on attending, as well as some resources for those who are not lucky enough to be able to make the trip, but who can't afford to miss the key news. Of course, this is no substitute for actually being there... and the good news is that registration is still open ;-)


What's New in MySQL Cluster Saturday 29th, 1300-1400, in Golden Gate room 5.

Bernd Ocklin, director of MySQL Cluster development, and I will be taking a look at what follows the latest MySQL Cluster 7.2 release. I don't want to give too much away - let's just say it's not often you can add powerful new functionality to a product while at the same time making life radically simpler for its users.

For those not making it to the conference, a live webinar repeating the talk is scheduled for Thursday 25th October at 0900 Pacific Time. Hold the date; registration will open soon and will be published to our MySQL Webinars page.

Best Practices

Getting Started with MySQL Cluster, Hands-On Lab Saturday 29th, 1600-1700, in Plaza Room A.

Santo Leto, one of our lead MySQL Cluster support engineers, regularly works with users new to MySQL Cluster, assisting them in installation, configuration, scaling, etc. In this lab, Santo will share best practices in getting started.

Delivering Breakthrough Performance with MySQL Cluster Saturday 29th, 1730-1830, in Golden Gate room 5.

Frazer Clement, lead MySQL Cluster software engineer, will demonstrate how to translate the awesome Cluster benchmarks (remember 1 BILLION UPDATEs per minute ?!) into real-world performance.

You can also get some best practices from our new MySQL Cluster performance guide.

MySQL Cluster BoF Saturday 29th, 1900-2000, room Golden Gate 5.

Come and get a demonstration of new tools for the installation and configuration of MySQL Cluster, and spend time with the engineering team discussing any questions or issues you may have.

Developing High-Throughput Services with NoSQL APIs to InnoDB and MySQL Cluster Sunday 30th, 1145 - 1245, in Golden Gate room 7.  

In this session, JD Duncan and Andrew Morgan will present how to get started with both Memcached and new NoSQL APIs.

JD and I recently ran a webinar demonstrating how to build simple Twitter-like services with Memcached and MySQL Cluster. The replay is available for download.

Case Studies:

MySQL Cluster @ El Chavo, Latin America’s #1 Facebook Game Sunday 30th, 1745 - 1845, in Golden Gate room 4.

Playful Play deployed MySQL Cluster CGE to power their market-leading social game. This session will discuss the challenges they faced, why they selected MySQL Cluster and their experiences to date.

You can read more about Playful Play and MySQL Cluster here.

A Journey into NoSQLand: MySQL’s NoSQL Implementation Sunday 30th, 1345 - 1445, in Golden Gate room 4.

Lig Turmelle, web DBA at Kaplan Professional and esteemed Oracle Ace, will discuss her experiences working with the NoSQL interfaces for both MySQL Cluster and InnoDB.

Evaluating MySQL HA Alternatives Saturday 29th, 1430-1530, room Golden Gate 5.

Henrik Ingo, former member of the MySQL sales engineering team, will provide an overview of various HA technologies for MySQL, starting with replication, progressing to InnoDB, Galera and MySQL Cluster.

What about the other stuff?

Of course MySQL Connect has much, much more than MySQL Cluster. There will be lots on replication (which I'll blog about soon), MySQL 5.6, InnoDB, cloud, etc, etc. Take a look at the full Content Catalog to see more.

If you are attending, I hope to see you at one of the Cluster sessions... and remember, registration is still open.

Monday Aug 20, 2012

Learn about the Real-Time Performance and High Availability of MySQL Cluster

If you need 99.999% availability, real-time performance, auto-sharding with write scalability, and geographic replication, accessed from both SQL and NoSQL applications, then you should learn more about MySQL Cluster.

The authentic MySQL Cluster course brings you key conceptual and configuration information in a 3-day instructor-led class.

Examples of class events already on the schedule:



Location | Date | Delivery Language
Prague, Czech Republic | 10 December 2012 |
Warsaw, Poland | 3 September 2012 |
London, England | 12 December 2012 |
Lisbon, Portugal | 3 December 2012 | European Portuguese
Nice, France | 8 October 2012 |
Barcelona, Spain | 25 September 2012 |
Madrid, Spain | 6 November 2012 |
Denver, Colorado | 17 October 2012 |
Petaling Jaya, Malaysia | 10 October 2012 |
 | 19 September 2012 |
São Paulo, Brazil | 24 September 2012 | Brazilian Portuguese

To see the full schedule and course details or to register interest in an additional event, go to

Tuesday Jul 31, 2012

MySQL Cluster Performance Best Practices: Q & A

With its distributed, shared-nothing, real-time design, MySQL Cluster has attracted a lot of attention from developers who need to scale both read and write traffic with ultra-low latency and fault-tolerance, using commodity hardware. With many proven deployments in web, gaming, telecoms and mobile use-cases, MySQL Cluster is certainly able to meet these sorts of requirements.

But, as a distributed database, developers do need to think a little differently about data access patterns along with schema and query optimisations in order to get the best possible performance.

Sharing best practices developed by working with MySQL Cluster's largest users, we recently ran a Performance Essentials webinar, and the replay is now available, on-demand, for you to listen to in the comfort of your own office.

The webinar also accompanies a newly published Guide to optimizing the performance of MySQL Cluster.

We received a number of great questions over the course of the webinar, and I thought it would be useful to share a selection of those:

Q. How do I calculate and then monitor memory usage with MySQL Cluster?

A. If designing a completely new database, the following calculations can be used to help determine the approximate memory sizing requirements for the data nodes:

(in memory) Data Size * Replicas * 1.25 = Total Database Memory Requirements

Example: 50 GB * 2 * 1.25 = 125 GB

(Data Size * Replicas * 1.25)/Nodes = RAM Per Node

Example: (50 GB * 2 * 1.25)/4 = 31.25 GB
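To make the arithmetic easy to repeat for your own data set, here is a minimal Python sketch of the two formulas above. The function names are purely illustrative, and the 1.25 factor is simply the overhead multiplier quoted in the formulas; treat the results as rough estimates rather than exact sizing.

```python
def total_memory_gb(data_size_gb, replicas, overhead=1.25):
    """(in memory) Data Size * Replicas * 1.25 = total database memory."""
    return data_size_gb * replicas * overhead


def ram_per_node_gb(data_size_gb, replicas, data_nodes, overhead=1.25):
    """Total database memory divided evenly across the data nodes."""
    return total_memory_gb(data_size_gb, replicas, overhead) / data_nodes


# The worked example from the text: 50 GB of data, 2 replicas, 4 data nodes.
total = total_memory_gb(50, 2)
per_node = ram_per_node_gb(50, 2, 4)
print(f"Total: {total} GB, per node: {per_node} GB")
```

Changing the number of data nodes in the call shows how adding nodes reduces the RAM needed on each host.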

To see how much of the configured memory is currently in use by the database, you can query the ndbinfo.memoryusage table.

If using MySQL Cluster CGE then you can view this information over time in a MySQL Enterprise Monitor graph.

Q. Would enabling disk-based tablespaces have an impact on query performance?

A. It can do. The only reason to use Disk based table spaces is when you do not have sufficient memory to store all data in-memory. Therefore some of your disk based data will be uncached at some time, and reads or writes which access this data will stall while the necessary pages are read into the page buffer. This can reduce throughput.

Q. I've seen that MySQL Cluster 7.2 can speed up JOIN operations by 70x. How does it do this?

A. There are two new features in MySQL Cluster 7.2, which when combined, can significantly improve the performance of joins over previous versions of MySQL Cluster:

- The Index Statistics function enables the SQL optimizer to build a better execution plan for each query. In the past, non-optimal query plans required a manual enforcement of indexes via USE INDEX or FORCE INDEX to alter the execution plan. ANALYZE TABLE must first be run on each table to take advantage of this.

- Adaptive Query Localization (AQL) allows the work of the join to be distributed across the data nodes (local to the data it’s working with) rather than up in the MySQL Server; this allows more computing power to be applied to calculating the join as well as dramatically reducing the number of messages being passed around the system.

You can learn more about AQL and a sample query here

Q. Can all JOINs use AQL?

A. In order for a join to be able to exploit AQL (in other words be “pushed down” to the data nodes), it must meet the following conditions:

1. Any columns to be joined must use exactly the same data type. (For example, if an INT and a BIGINT column are joined, the join cannot be pushed down). This includes the lengths of any VARCHAR columns.

2. Joins referencing BLOB or TEXT columns will not be pushed down.

3. Explicit locking is not supported; however, the NDB (MySQL Cluster) storage engine's characteristic implicit row-based locking is enforced.

4. In order for a join to be pushed down, child tables in the Join must be accessed using one of the ref, eq_ref, or const access methods, or some combination of these methods. These access methods are described in the documentation.

5. Joins referencing tables explicitly partitioned by [LINEAR] HASH, LIST, or RANGE currently cannot be pushed down.

6. If the query plan decides to 'Using join buffer' for a candidate child table, that table cannot be pushed as child. However, it might be the root of another set of pushed tables.

7. If the root of the pushed Join is an eq_ref or const, only child tables joined by eq_ref can be appended. (A ref joined table will then likely become a root of another pushed Join)
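As a way of reasoning about these rules while designing a schema, the sketch below encodes conditions 1, 2, 4, 5 and 6 as a simple Python checklist. It is not part of MySQL Cluster and does not inspect a real query plan: the ChildJoin class, its fields and the "DEFAULT" partitioning label are hypothetical, and condition 2 is simplified to look only at the joined columns. In practice you would fill in these values by hand from your table definitions and EXPLAIN output.

```python
from dataclasses import dataclass

PUSHABLE_ACCESS_METHODS = {"ref", "eq_ref", "const"}
EXPLICIT_PARTITIONING = {"HASH", "LINEAR HASH", "LIST", "RANGE"}


@dataclass
class ChildJoin:
    """Hypothetical description of one child table in a join."""
    parent_column_type: str       # e.g. "INT" or "VARCHAR(32)"
    child_column_type: str
    access_method: str            # access type reported for the child table
    partitioning: str = "DEFAULT" # illustrative label for default partitioning
    uses_join_buffer: bool = False


def pushdown_blockers(join):
    """Return the conditions from the list above that this join violates."""
    blockers = []
    parent = join.parent_column_type.strip().upper()
    child = join.child_column_type.strip().upper()
    if parent != child:
        blockers.append("condition 1: joined columns differ in type or length")
    if any(lob in t for t in (parent, child) for lob in ("BLOB", "TEXT")):
        blockers.append("condition 2: join references a BLOB or TEXT column")
    if join.access_method not in PUSHABLE_ACCESS_METHODS:
        blockers.append("condition 4: child not accessed via ref, eq_ref or const")
    if join.partitioning.upper() in EXPLICIT_PARTITIONING:
        blockers.append("condition 5: table is explicitly partitioned")
    if join.uses_join_buffer:
        blockers.append("condition 6: plan uses a join buffer for this child")
    return blockers


for reason in pushdown_blockers(ChildJoin("INT", "BIGINT", "ALL")):
    print(reason)
```

An empty list means none of the encoded conditions rule the join out; it does not guarantee that the join will actually be pushed down.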

These conditions should be considered when designing your schema and application queries – for example, to comply with constraint 4, attempt to make any table scan that is part of the Join be the first clause.

Where a query involves multiple levels of Joins, it is perfectly possible for some levels to be pushed down while others continue to be executed within the MySQL Server.

If your application consists of many of these types of JOIN operations which cannot be made to exploit AQL, other MySQL storage engines such as InnoDB will present a better option for your workload.

Q. What are best practices for data model and query design?

A. The data model and queries should be designed to minimize network roundtrips between hosts. Ensuring that joins meet the requirements for AQL and avoiding full table scans can help with this.

Looking up data in a hash table is a constant time operation, unaffected by the size of the data set.

Looking up data in a tree (T-tree, B-tree etc) structure is logarithmic (O(log n)).

For a database designer this means it is very important to choose the right index structure and access method to retrieve data. We strongly recommend application requests with high requirements on performance be designed as primary key lookups. This is because looking up data in a hash structure is faster than from a tree structure and can be satisfied by a single data node. Therefore, it is very important that the data model takes this into account. It also follows that choosing a good primary key definition is extremely important.
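To illustrate the difference in growth rates, the following Python sketch counts the steps a binary search takes over sorted keys of increasing size. Binary search is used here only as a stand-in for a tree lookup, and a Python dict as a stand-in for a hash index; neither reflects how MySQL Cluster implements its indexes internally.

```python
def binary_search_steps(sorted_keys, target):
    """Count the comparisons needed to find target in sorted_keys."""
    lo, hi, steps = 0, len(sorted_keys) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_keys[mid] == target:
            return steps
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps


for n in (1_000, 1_000_000):
    keys = range(n)  # range objects are sorted and indexable
    steps = binary_search_steps(keys, n - 1)
    print(f"{n} keys: {steps} steps for the ordered lookup")

# A hash lookup goes straight to the entry, whatever the size of the table.
hash_index = {key: f"row-{key}" for key in range(1_000)}
print(hash_index[999])
```

The step count for the ordered lookup grows slowly as the data set grows, whereas the hash lookup stays a single probe.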

If ordered index lookups are required then tables should be partitioned such that only one data node will be scanned.

The distributed nature of the Cluster and the ability to exploit multiple CPUs, cores or threads within nodes means that the maximum performance will be achieved if the application is architected to run many transactions in parallel. Alternatively you should run many instances of the application simultaneously to ensure that the Cluster is always able to work on many transactions in parallel.

Take a look at the Guide to optimizing the performance of MySQL Cluster for more detail

Q. What are best practices for parallelising my application and access to MySQL Cluster?

A. As mentioned MySQL Cluster is a distributed, auto-sharded database. This means that there is often more than one Data Node that can work in parallel to satisfy application requests.

Additionally, MySQL Cluster 7.2 enhances multi-threading so data nodes can now effectively exploit multiple threads / cores. To use this functionality, the data nodes should be started using the ndbmtd binary rather than ndbd, and config.ini should be configured correctly.

Parallelization can be achieved in MySQL Cluster in several ways:

- Adding more Application Nodes

- Use of multi-threaded data nodes

- Batching of requests

- Parallelizing work on different Application Nodes connected to the Data Nodes

- Utilizing multiple connections between each Application Node and the Data Nodes (connection pooling)

How many threads and how many applications are needed to drive the desired load must be determined by benchmarking. One approach is to connect one Application Node at a time and increment the number of threads. When one Application Node cannot generate any more load, add another one. It is advisable to start studying this on a two Data Node cluster, and then grow the number of Data Nodes to understand how your system is scaling.
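The ramp-up procedure described above can be sketched as a small Python loop. The measure_tps callback is an assumption: in a real benchmark it would run your workload with the given number of threads and return the measured transactions per second. Here it is replaced by fake_measure, a made-up stand-in that saturates at 8 threads, and the 5% minimum gain threshold is likewise an arbitrary illustrative choice.

```python
def find_saturation_point(measure_tps, max_threads, min_gain=0.05):
    """Add one thread at a time until throughput stops improving.

    Returns (threads, tps) for the last step that still gave a
    worthwhile gain, i.e. the point at which one Application Node
    cannot usefully generate any more load.
    """
    best_threads = 1
    best_tps = measure_tps(1)
    for threads in range(2, max_threads + 1):
        tps = measure_tps(threads)
        if tps < best_tps * (1 + min_gain):
            break  # no meaningful gain: this node is saturated
        best_threads, best_tps = threads, tps
    return best_threads, best_tps


def fake_measure(threads):
    """Stand-in for a real benchmark run: scales linearly up to 8 threads."""
    return min(threads, 8) * 1000


threads, tps = find_saturation_point(fake_measure, max_threads=32)
print(f"Saturated at {threads} threads, {tps} TPS; add another Application Node")
```

Repeating the same loop after adding each Application Node, and again after growing the number of Data Nodes, gives a picture of how the system scales.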

If you have designed your application queries, and data model according to best practices presented in the Performance Guide, you can expect close to double the throughput on a four Data Node system compared to a two Data Node system, given that the application can generate the load.

Try to multi-thread whenever possible and load balance over more MySQL servers.

In MySQL Cluster you have access to additional performance enhancements that allow better utilization on multi-core / thread CPUs, including:

- Reduced lock contention by having multiple connections from one MySQL Server to the Data Nodes (--ndb-cluster-connection-pool=X);

- Setting threads to real-time priority;

- Locking Data Node threads (kernel thread and maintenance threads) to a CPU.

Q. Does MySQL Cluster’s data distribution add complexity to my application and limit the types of queries I can run?

A. No, it doesn't. By default, tables are automatically partitioned (sharded) across data nodes by hashing the primary key. Other partitioning methods are supported, but in most instances the default is acceptable.
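To give a feel for what hashing the primary key means, here is a minimal Python sketch that assigns keys to partitions. The SHA-256 hash and the modulo mapping are assumptions chosen for illustration only; they are not the hash function or partition-to-node mapping that MySQL Cluster uses internally.

```python
import hashlib
from collections import Counter


def partition_for(primary_key, num_partitions):
    """Map a primary key to a partition by hashing it (illustrative only)."""
    digest = hashlib.sha256(str(primary_key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


# Spread 10,000 sequential keys over 4 partitions and count the rows in each.
counts = Counter(partition_for(key, 4) for key in range(10_000))
for partition in sorted(counts):
    print(f"partition {partition}: {counts[partition]} rows")
```

The same key always maps to the same partition, and sequential keys end up spread roughly evenly, which is why the application does not need to manage data placement itself.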

As the sharding is automatic and implemented at the database layer, application developers do not need to modify their applications to control data distribution – which significantly simplifies scaling.

In addition, applications are free to run complex queries such as JOIN operations across the shards, therefore users do not need to trade functionality for scalability.

Q. What hardware would you recommend to get the best performance from MySQL Cluster?

A. It varies by node type. For data nodes:

- Up to 32 x86-64 CPU cores. Use as high a clock frequency as possible, as this will enable faster processing of messages between nodes;

- Large CPU caches assist in delivering optimal performance;

- 64-bit hosts with enough RAM to store your in-memory data set;

- Linux, Solaris or Windows operating systems;

- 2 x Network Interface Cards and 2 x Power Supply Units for hardware redundancy.

It is important to ensure systems are configured to reduce swapping to disk whenever possible.

As a rule of thumb, have 7 times the amount of DataMemory configured for disk space for each data node. This space is needed for storing 2 Local Checkpoints (LCPs), the Redo log and 3 backups. You will also want to allocate space for table spaces if you are making use of disk-based data – including allowing extra space for the backups.
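As a quick sketch of this rule of thumb, the Python function below multiplies the configured DataMemory by 7 and adds any space you plan to set aside for disk-based table spaces. The function and parameter names are illustrative, and the extra backup space needed for disk-based data is not modeled, so add that on top.

```python
def disk_space_per_node_gb(data_memory_gb, tablespace_gb=0, multiplier=7):
    """Rule-of-thumb disk space for one data node, in GB.

    multiplier=7 is the rule of thumb above (covering 2 LCPs, the Redo
    log and 3 backups); tablespace_gb is whatever you plan to allocate
    for disk-based table spaces on this node.
    """
    return data_memory_gb * multiplier + tablespace_gb


# Example: a data node with 32 GB of DataMemory and 100 GB of table spaces.
print(disk_space_per_node_gb(32))        # in-memory data only
print(disk_space_per_node_gb(32, 100))   # plus disk-based table spaces
```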

Having a fast, low-latency disk subsystem is very important and will affect check pointing and backups.

Download the MySQL Cluster Evaluation Guide for more recommendations 

The hardware requirements for MySQL Servers would be a little less:

- 4 to 32 x86-64 CPU cores

- Minimum 4GB of RAM. Memory is not as critical at this layer, and requirements will be influenced by connections and buffers.

- 2 x Network Interface Cards and 2 x Power Supply Units for hardware redundancy.

Q. I heard that MySQL Cluster doesn't support Foreign Keys, how can I get around that?

A. Foreign keys are previewed in the MySQL Cluster 7.3 Early Access release, which you can download and evaluate now. In MySQL Cluster 7.2 and earlier, you can emulate foreign keys programmatically via triggers.


If you are thinking about using MySQL Cluster for your next project, it is worth investing a little bit of time to get familiar with these performance best practices. The Webinar replay, the MySQL Cluster Performance Guide and the MySQL Cluster Evaluation Guide will give you pretty much everything you need to build high performance, high availability services with MySQL Cluster. 

Tuesday Jul 17, 2012

MySQL Cluster Powers El Chavo from Playful Play, Latin America’s Most Popular Facebook Game


Attracting over 3m subscribers in just 6 months and growing by 30,000 new users per day, Playful Play needed a database that was able to keep pace with the massive scalability and high availability demands of the wildly successful La Vecindad de El Chavo Facebook game.

Playful Play selected MySQL Cluster CGE running on a public cloud to power their gaming platform, providing:

  • 45% improvement in performance;
  • 99.999% uptime;
  • 80% reduction in DBA overhead;
  • Local language support, 24x7.

As a result, Playful Play has been able to maintain user growth rates and attract new advertisers to the platform, while enhancing agility and reducing cost.

Company Overview

Based out of Monterrey, Mexico, Playful Play has created Latin America’s #1 Facebook game based on "El Chavo del 8", the cultural and TV phenomenon that has been running across Latin America for the past four decades. The show is also extremely popular in Spain and the United States.

El Chavo (The Kid) appeals to a broad demographic, having attracted over 3M subscribers in its first 6 months, and adding 30,000 new users per day. The game has also proved popular with advertisers who are keen to integrate social gaming into their marketing strategies in order to raise awareness and build brand loyalty.

Through partnerships with Televisa Home Entertainment and Grupo Chespirito, Playful Play has aggressive plans to grow their business through the development of new social games targeting Latin America, Brazil, the United States and Spain.

The Challenges of Rapid Growth

As a start-up business, fast time to market at the lowest possible cost was the leading development priority after Playful Play secured the rights to develop the El Chavo game.

As a result, Playful Play developed the first release of the game on the proven LAMP (Linux, Apache, MySQL, PHP / Perl / Python) stack.

To meet both the scalability and availability requirements of the game, Playful Play initially deployed MySQL in a replicated, multi-master configuration.

The database is core to the game, responsible for managing:

  • User profiles and avatars
  • Gaming session data;
  • In-app (application) purchases;
  • Advertising and digital marketing event data.

As La Vecindad de El Chavo spread virally across Facebook, subscriptions rapidly exceeded one million users, leading Playful Play to consider how to best architect their gaming platforms for long-term growth. They had heard about the release of MySQL Cluster 7.2, and so decided to initiate an evaluation to determine if it could meet their requirements.

The Route to Internet-Scale with MySQL Cluster

In addition to growing user volumes, the El Chavo game also added new features that changed the profile of the database. Operations became more write-intensive, with INSERTs and UPDATEs accounting for up to 70% of the database load.

The game’s popularity also attracted advertisers, who demanded strict SLAs for both performance (predictable throughput with low latency) as well as uptime.

Through their evaluation, the developers at Playful Play identified that MySQL Cluster was able to meet their needs.

Write Performance with Real-Time Responsiveness

MySQL Cluster’s multi-master architecture, coupled with the ability to auto-shard (partition) tables across nodes, enabled Playful Play to meet the performance and scalability demands of the El Chavo game, without changes to their applications.

The ability to store data in memory and persist to disk delivered the sub-millisecond responsiveness and durability demanded by the game’s users.

In addition, Playful Play was able to horizontally scale the database across low-cost commodity nodes running in the cloud, reducing their TCO and enhancing agility.

Continuous Availability

The shared-nothing, distributed design of MySQL Cluster coupled with integrated replication and self-healing recovery ensured high availability of the gaming platform, without DevOps intervention.

MySQL Cluster was able to protect against downtime resulting from both failures and planned upgrades. For example, Playful Play has been able to scale the cluster on-line to support growing user volumes, without service interruption.

Data Integrity Supporting Monetization

As the game evolved to support in-app purchases and digital marketing, the ACID-compliance offered by MySQL Cluster afforded the transactional integrity demanded by users and advertisers.

The Value of MySQL Cluster Consulting, Support and Management Tools

Following their successful evaluation, Playful Play deployed MySQL Cluster with their public cloud provider.

Playful Play wanted to ensure they were getting the most out of the deployment, so Oracle’s MySQL consulting team reviewed their Architecture and Design. The consultants were able to further optimize the database to deliver even higher levels of performance, and demonstrated how Playful Play could automatically scale the database using MySQL Cluster Manager.

The MySQL Cluster deployment at Playful Play is growing rapidly, and currently comprises over 25 nodes running Linux on commodity x86 servers, each configured with 24 cores and 64GB of RAM:

  • 12 x MySQL Cluster data nodes;
  • 12 x MySQL Server SQL nodes;
  • 2 x MySQL Cluster management nodes.

Each data node is co-located with an SQL node on a single physical instance.

 Playful Play Deployment Architecture, built for rapid scale

Using the architecture above, MySQL Cluster is currently supporting:

  • 3 million subscribers, with 30,000 new additions per day across Latin America, Europe and the U.S.;
  • 10,000 concurrent users;
  • 10,000 Transactions Per Second;
  • 99.999% uptime.

"The MySQL support service has been essential in helping us for troubleshooting and giving recommendations for the production cluster, Thanks" Carlos Morales – DBA, México

Playful Play has deployed MySQL Cluster Carrier-Grade Edition, providing access to technical and consultative support, in addition to the MySQL Enterprise Monitor and MySQL Cluster Manager tools.

MySQL Enterprise Monitor Enables Continuous Service Availability

MySQL Enterprise Monitor enables Playful Play to proactively monitor the entire cluster from a single GUI. DBAs are automatically notified if any environment variables begin to exceed pre-defined thresholds, and presented with the instructions necessary to take corrective action, before users are impacted.

MySQL Cluster Manager is used to automate cluster configuration changes, eliminating the risk of manual errors and significantly enhancing DevOps productivity.

The subscription has delivered terrific value to Playful Play, enabling them to:

  • Improve performance by 45% while reducing CPU utilization by 35%, providing significant headroom for continued growth;
  • Reduce DBA overhead by 80%, providing substantial cost savings;
  • Access local language support, 24x7.

The Future with MySQL Cluster

Playful Play has aggressive growth plans, seeking to attract over 50M subscribers within 5 years. El Chavo del 8 is continuing to gain widespread popularity in Latin regions around the world, and will be joined by other social networking games currently in development. The next major target is Brazil, presenting a very attractive, emerging market in social gaming.

MySQL Cluster CGE has been selected as the database platform powering this growth.

Playful Play will be presenting their experiences and best practices with MySQL Cluster at the MySQL Connect conference, September 29th and 30th 2012 in San Francisco.

Further Resources

MySQL Cluster at Playful Play On-Demand Webinar (Spanish)

MySQL Cluster Demonstration

Guide to Scaling Web Databases with MySQL Cluster

Monday Jul 09, 2012

Top Reasons to Take the MySQL Cluster Training

Here are the top reasons to take the authorized MySQL Cluster training course:

  • Take training developed by the MySQL Cluster product team and delivered by the MySQL Cluster experts at Oracle
  • Learn how to develop, deploy, manage and scale your MySQL Cluster applications more efficiently
  • Keep your mission-critical applications and essential services up and running 24x7
  • Deliver the highest performance and scalability using MySQL Cluster best practices

In this 3-day course, experienced database users learn the important details of clustering necessary to get started with MySQL Cluster: how to install the different nodes, how to properly configure and manage the cluster nodes to ensure high availability, and how the internals of the cluster work.

To see the schedule for this course, go to the Oracle University Portal (click on MySQL). Should you not see an event for a location/date that suits you, register your interest in additional events.

Here is a small sample of the events already on the schedule for the MySQL Cluster course:



Location | Date | Delivery Language
Prague, Czech Republic | 17 September 2012 |
Warsaw, Poland | 1 August 2012 |
London, United Kingdom | 18 July 2012 |
Lisbon, Portugal | 3 December 2012 | European Portuguese
Nice, France | 8 October 2012 |
Barcelona, Spain | 25 September 2012 |
Madrid, Spain | 20 August 2012 |
Denver, United States | 17 October 2012 |
Chicago, United States | 22 August 2012 |
Petaling Jaya, Malaysia | 10 October 2012 |
 | 21 August 2012 |
Mexico City, Mexico | 23 July 2012 |


Monday Jul 02, 2012

NoSQL Java API for MySQL Cluster: Questions & Answers

The MySQL Cluster engineering team recently ran a live webinar, now available on-demand, demonstrating the ClusterJ and ClusterJPA NoSQL APIs for MySQL Cluster, and how these can be used in building real-time, high scale Java-based services that require continuous availability.

Attendees asked a number of great questions during the webinar, and I thought it would be useful to share those here, so others are also able to learn more about the Java NoSQL APIs.

First, a little bit about why we developed these APIs and why they are interesting to Java developers.

ClusterJ and ClusterJPA

ClusterJ is a Java interface to MySQL Cluster that provides either a static or dynamic domain object model, similar to the data model used by JDO, JPA, and Hibernate. A simple API gives users extremely high performance for common operations: insert, delete, update, and query.

ClusterJPA works with ClusterJ to extend functionality, including

- Persistent classes

- Relationships

- Joins in queries

- Lazy loading

- Table and index creation from object model

By eliminating data transformations via SQL, users get lower data access latency and higher throughput. In addition, Java developers have a more natural programming method to directly manage their data, with a complete, feature-rich solution for Object/Relational Mapping. As a result, the development of Java applications is simplified with faster development cycles resulting in accelerated time to market for new services.

MySQL Cluster offers multiple NoSQL APIs alongside Java:

  • Memcached for a persistent, high performance, write-scalable Key/Value store;
  • HTTP/REST via an Apache module;
  • C++ via the NDB API for the lowest absolute latency.

Developers can use SQL as well as NoSQL APIs for access to the same data set via multiple query patterns – from simple Primary Key lookups or inserts to complex cross-shard JOINs using Adaptive Query Localization.

Marrying NoSQL and SQL access to an ACID-compliant database offers developers a number of benefits. MySQL Cluster’s distributed, shared-nothing architecture with auto-sharding and real time performance makes it a great fit for workloads requiring high volume OLTP. Users also get the added flexibility of being able to run real-time analytics across the same OLTP data set for real-time business insight.

OK – hopefully you now have a better idea of why ClusterJ and JPA are available. Now, for the Q&A.

Q & A

Q. Why would I use Connector/J vs. ClusterJ?

A. Partly it's a question of whether you prefer to work with SQL (Connector/J) or objects (ClusterJ). Performance of ClusterJ will be better as there is no need to pass through the MySQL Server. A ClusterJ operation can only act on a single table (e.g. no joins); ClusterJPA extends that capability.

Q. Can I mix different APIs (i.e. ClusterJ, Connector/J) in our application for different query types?

A. Yes. You can mix and match all of the API types, SQL, JDBC, ODBC, ClusterJ, Memcached, REST, C++. They all access the exact same data in the data nodes. Update through one API and new data is instantly visible to all of the others.

Q. How many TCP connections would a SessionFactory instance create for a cluster of 8 data nodes?

A. SessionFactory has a connection to the mgmd (management node) but otherwise is just a vehicle to create Sessions. Without using connection pooling, a SessionFactory will have one connection open with each data node. Using optional connection pooling allows multiple connections from the SessionFactory to increase throughput.

Q. Can you give details of how ClusterJ optimizes sharding to enhance performance of distributed query processing?

A. Each data node in a cluster runs a Transaction Coordinator (TC), which begins and ends the transaction, but also serves as a resource to operate on the result rows. While an API node (such as a ClusterJ process) can send queries to any TC/data node, there are performance gains if the TC is where most of the result data is stored. ClusterJ computes the shard (partition) key to choose the data node where the row resides as the TC.

Q. What happens if we perform two primary key lookups within the same transaction? Are they sent to the data node in one transaction?

A. ClusterJ will send identical PK lookups to the same data node.

Q. How is distributed query processing handled by MySQL Cluster?

A. If the data is split between data nodes then all of the information will be transparently combined and passed back to the application. The session will connect to a data node - typically by hashing the primary key - which then interacts with its neighboring nodes to collect the data needed to fulfil the query.

Q. Can I use Foreign Keys with MySQL Cluster?

A. Support for Foreign Keys is included in the MySQL Cluster 7.3 Early Access release.


The NoSQL Java APIs are packaged with MySQL Cluster, available for download here so feel free to take them for a spin today!

Key Resources

MySQL Cluster on-line demo 

MySQL ClusterJ and JPA On-demand webinar 

MySQL ClusterJ and JPA documentation

MySQL ClusterJ and JPA whitepaper and tutorial

Monday Jun 11, 2012

Official MySQL Cluster Training Available Near You!

Oracle is the official provider of MySQL Training.

To learn more about MySQL Cluster, you can register for the MySQL Cluster training at a large selection of locations and often you will find the course delivery in your local language! For example:



Location | Date | Delivery Language
Prague, Czech Republic | 17 September 2012 |
Warsaw, Poland | 1 August 2012 |
Wien, Austria | 27 August 2012 |
London, United Kingdom | 18 July 2012 |
Lisbon, Portugal | 3 December 2012 | European Portuguese
Nice, France | 8 October 2012 |
Barcelona, Spain | 25 September 2012 |
Madrid, Spain | 20 August 2012 |
Denver, United States | 17 October 2012 |
Chicago, United States | 22 August 2012 |
New York, United States | 20 June 2012 |
Petaling Jaya, Malaysia | 18 July 2012 |
 | 21 August 2012 |
Melbourne, Australia | 13 June 2012 |
Mexico City, Mexico | 23 July 2012 |


To learn more or register your interest in another course, location, or date, go to Oracle University's official portal.

