Tuesday Apr 30, 2013

Big Fish Selects MySQL Cluster for Real-Time Web Recommendations

The world's largest producer of casual games has selected MySQL Cluster to power its real-time recommendations platform.

High velocity data ingestion, low latency reads, on-line scaling and the operational simplicity delivered by MySQL Cluster has enabled Big Fish to increase customer engagement and deliver targeted marketing, providing a more personalized experience to its users.

You can read the full Big Fish Games and MySQL Cluster case study here - and a summary below

BUSINESS NEED

The global video gaming market is experiencing explosive growth. Competition is intense, and so to differentiate services and engage users, progressive gaming companies such as Big Fish are seeking solutions to more fully personalize the customer experience.

Using Business Intelligence (BI) and predictive analytics Big Fish can segment customers based on a range of demographic and behavioural indicators. This enables Big Fish to serve highly targeted recommendations and marketing, precisely personalized to a user's individual preferences.

Big Fish's Marketing Management Service platform, powered by MySQL Cluster, is used across all of the company's customer management systems, including customer support and the company's "Game Manager", to provide a unique customer experience to each of its millions of monthly users whenever they come in contact with Big Fish.

TECHNOLOGY SELECTION

Big Fish already has an extensive deployment of MySQL databases powering web applications, including the storefront. They knew MySQL could power the recommendations database, but would require additional engineering efforts to implement database sharding to support data ingest and future scaling needs, coupled with a Memcached layer for low-latency reads.

As a result, they began evaluations of MySQL Cluster, in addition to other database technologies. Using MySQL Cluster, the Engineering teams were able to leverage their existing MySQL skills, enabling them to reduce operational complexity when compared to introducing a new database to the Big Fish environment.

At the same time, they knew MySQL Cluster, backed by Oracle, provided the long-term investment protection they needed for the MMS recommendations platform.

Through their evaluation, the Big Fish engineering team identified MySQL Cluster was best able to meet their technical requirements, based on:

Write performance to support high velocity data ingest

Low latency access with in-memory tables 

On-line scalability, adding nodes to a running cluster

Continuous availability with its shared-nothing architecture

SQL and NoSQL APIs to the cluster supporting both fast data loading and complex queries

PROJECT IMPLEMENTATION

As illustrated in the figure below:

  • User data is replicated from the MySQL databases powering the gaming storefront to the Big Fish BI platform;
  • User data is analyzed and segmented within the BI platform;
  • Recommendations are loaded as user records into MySQL Cluster using the NoSQL Cluster_J (Java) Connector;
  • The SQL interface presented by the MySQL Servers then delivers personalized content to gamers in real-time, initially serving over 15m sessions per day.


Big Fish has subscribed to MySQL Cluster CGE providing the Engineering team with access to 24x7 Oracle Premier Support and MySQL Cluster Manager, which reduces operational overhead by providing:

  • Automated configuration and reconfiguration of MySQL Cluster;
  • Automated on-line node addition for on-demand scaling.

The online scalability of MySQL CGE can help Big Fish to meet future requirements, as it expands its use of MySQL CGE to include all new website developments, channels and gaming platforms. 

LEARN MORE

Read the full Big Fish Games and MySQL Cluster case study

Read the press release 


Wednesday Apr 17, 2013

MySQL Cluster 7.3 DMR2: Increasing Developer Flexibility and Simplicity

Highlights: Foreign Keys, NoSQL JavaScript Connector, MySQL 5.6 and Auto-Tuned Clustering

The MySQL team at Oracle are excited to announce the immediate availability of the second MySQL Cluster 7.3 Development Milestone Release (DMR)

Some might call MySQL Cluster 7.3 “the foreign keys release” – and sure enough it is a major engineering achievement to build a distributed database that enforces referential integrity across a shared-nothing cluster, while maintaining ACID compliance and cross-shard JOINs. But MySQL Cluster 7.3 is a lot more as well. 

The design focus has been on enabling developer agility – making it simpler and faster than ever to enhance new services with a highly scalable, fault tolerant, real-time database – with minimum development or operational effort. 

The key enhancements delivered by MySQL Cluster 7.3 are summarized below.

Foreign Keys: Strengthens data modeling and simplifies application logic by automatically enforcing referential integrity between different tables distributed on different shards, on different nodes…..even in different data centers 

NoSQL JavaScript Connector for Node.js: Enables a single programming language and a single tool-chain by extending JavaScript from the client to the server, all the way through to the database, bypassing the SQL layer to deliver lower latency and reduced development cycles.  

MySQL 5.6 Support: Developers can combine the InnoDB and MySQL Cluster NDB storage engines within a single database, using the very latest MySQL 5.6 release.

Connection Thread Scalability: Increases cluster performance and capacity by improving the throughput of each connection to the data nodes, thus reducing the number of connections that need to be provisioned, and enabling greater scale-out headroom.  Current testing is showing up to 3x higher throughput per connection, enabling more client threads to use each connection. 

Auto-Installer: Get it all up and running in minutes! Graphically configure and provision a production-grade cluster, automatically tuned for your workload and environment, without ever resorting to “RTFM”.  

MySQL Cluster: Automated Tuning and Configuration  

Read the MySQL Cluster 7.3 DMR2 DevZone article for more detail on all of the enhancements discussed above.

You can download the source and binaries for MySQL Cluster 7.3 DMR2 today (select the Development Releases tab).  


MySQL Cluster Tutorial: NoSQL JavaScript Connector for Node.js

This tutorial has been authored by Craig Russell and JD Duncan

The MySQL Cluster team are working on a new NoSQL JavaScript connector for MySQL. The objectives are simplicity and high performance for JavaScript users:

- allows end-to-end JavaScript development, from the browser to the server and now to the world's most popular open source database

- native "NoSQL" access to the storage layer without going first through SQL transformations and parsing.

Node.js is a complete web platform built around JavaScript designed to deliver millions of client connections on commodity hardware. With the MySQL NoSQL Connector for JavaScript, Node.js users can easily add data access and persistence to their web, cloud, social and mobile applications.

While the initial implementation is designed to plug and play with Node.js, the actual implementation doesn't depend heavily on Node, potentially enabling wider platform support in the future.

Changes since the previous blog:

- InnoDB is now supported via the mysql adapter (accesses data via the mysqld server)

- Auto-increment columns are now supported

- Default values in columns are now supported

- Multiple databases are now supported

- Column converters for JavaScript types that need special (user-written) mapping to database types are now supported

- Queries that specify all columns of a primary or unique key index are now supported

- When acquiring a connection or session, specific table or class metadata can be provided in order to pre-load database metadata and signal an error if not all metadata can be loaded

- Users can now get metadata for tables by using the session.getMetadata function

- The user interface to map JavaScript domain objects to database tables has been significantly simplified

Implementation

The architecture and user interface of this connector are very different from other MySQL connectors in a major way: it is an asynchronous interface that follows the event model built into Node.js.

To make it as easy as possible, we decided to use a domain object model to store the data. This allows for users to query data from the database and have a fully-instantiated object to work with, instead of having to deal with rows and columns of the database. The domain object model can have any user behavior that is desired, with the NoSQL connector providing the data from the database.

To make it as fast as possible, we use a direct connection from the user's address space to the database. This approach means that no SQL (pun intended) is needed to get to the data, and no SQL (and again) server is between the user and the data.

The connector is being developed to be extensible to multiple underlying database technologies, including direct, native access to both the MySQL Cluster "ndb" and InnoDB storage engines. The current release supports ndb via both native access and mysqld; and supports InnoDB via mysqld.

The connector integrates the MySQL Cluster native API library directly within the Node.js platform itself, enabling developers to seamlessly couple their high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

The following sections take you through how to connect to MySQL, query the data and how to get started.


Connecting to the database

A Session is the main user access path to the database. You can get a Session object directly from the connector using the openSession function:

var nosql = require("mysql-js");

var dbProperties = {

    "implementation" : "ndb",

    "database" : "test"

};

nosql.openSession(dbProperties, null, onSession);

The openSession function calls back into the application upon creating a Session. The Session is then used to create, delete, update, and read objects.

Default database

Every session and connection to the database has a default database associated with it. When mapping domain objects to the database, or using a table name to identify a table, users can specify the database by using the explicit form for the table name: 'tableName.databaseName'. If users omit the databaseName, the default database associated with the session is used.

This feature supports multi-tenancy by allowing the database name to be specified during connection, while allowing the table to dynamically refer to the specific database in use.

Pre-load metadata for tables or domain objects

If your application requires specific tables or domain objects to be available, you can specify the tables and domain objects in your connect or openSession function. For example, if you need the table 't_basic' and the domain object 'Employee' to run the application, you can specify these during the connect or openSession functions.

nosql.openSession(dbProperties, ['t_basic', Employee], onSession);

If the t_basic table or the mapped Employee domain object are not able to be used by the session, then an error will be signaled and the onSession callback will report the specific error.

Getting Table metadata

If getting metadata associated with your application's tables is important, you can get the information by using the session function getMetadata. This function will return information about the specified table in the same format as used in mapTable. You can get the names and types of the table's columns and use the information to dynamically access tables and columns.

Reading data

The Session can read data from the database in a number of ways. If you simply want the data from the database, you provide a table name and the key of the row that you want. For example, consider this schema:

create table employee (

  id int not null primary key,

  name varchar(32),

  salary float

) ENGINE=ndbcluster;

Since the primary key is a number, you can provide the key as a number to the find function.

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find('employee', 0, onData);

};

function onData = function(err, data) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(data));

  ... use data in application

};

If you want to have the data stored in your own domain model, you tell the connector which table your domain model uses, by specifying an annotation, and pass your domain model to the find function.

function Employee = function(id, name, salary) {

  this.id = id;

  this.name = name;

  this.salary = salary;

  this.giveRaise = function(percent) {

    this.salary *= percent;

  }

};

annotations.mapClass(Employee, {'table' : 'employee'});

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find(Employee, 0, onData);

};

Special Domain Object Property Types

If your domain object uses types that do not map directly to database types, you can use column converters to transform domain types to database types.

For example, if you have a domain type such as a MaritalStatus type that contains only values of type MARITAL_STATUS, you can define a conversion that translates domain object values into database values.

var MARITAL_STATUS = {

NEVER_MARRIED: {value: 0, code: 'N', name: 'NEVER_MARRIED'},

MARRIED: {value: 1, code: 'M', name: 'MARRIED'},

DIVORCED: {value: 2, code: 'D', name: 'DIVORCED'},

lookup: function(value) {

switch (value) {

case 0: return this.NEVER_MARRIED; break;

case 1: return this.MARRIED; break;

case 2: return this.DIVORCED; break;

default: return null; break;

}

}

};


// column converter for status

var statusConverter = {

toDB: function toDB(status) {

return status.value;

},

fromDB: function fromDB(value) {

return MARITAL_STATUS.lookup(value);

}

};

Updating data

You can update the emp instance in memory, but to make the changes persistent, you need to write it back to the database, using the update function.

function onData = function(err, emp) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(emp));

  emp.giveRaise(0.12); // gee, thanks!

  session.update(emp); // oops, session is out of scope here

};

Using JavaScript can be tricky because it does not have the concept of block scope for variables. You can create a closure to handle these variables, or use a feature of the connector to remember your variables.

The connector api takes a fixed number of parameters and returns a fixed number of result parameters to the callback function. But the connector will keep track of variables for you and return them to the callback. So in the above example, change the onSession function to remember the session variable, and you can refer to it in the onData function:

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find(Employee, 0, onData, session);

};

function onData = function(err, emp, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(emp));

  emp.giveRaise(0.12); // gee, thanks!

  session.update(emp, onUpdate); // session is now in scope

};

function onUpdate = function(err, emp) {

  if (err) {

    console.log(err);

    ... error handling

  }

Inserting data

Inserting data requires a mapped JavaScript user function (constructor) and a session. Create a variable and persist it:

function onSession = function(err, session) {

  var data = new Employee(999, 'Mat Keep', 20000000);

  session.persist(data, onInsert);

  }

};

Autoincrement Columns

Columns allow but do not require users to specify the values for autoincrement columns. If users want to specify values for autoincrement columns, for example to reset the autoincrement value for the table, the insert function allows specification of values for these columns.

But if users want to exploit the autoincrement functionality, they must avoid setting a value for autoincrement columns. When mysql-js detects that the user has not specified a value, the next value in sequence is used. In the callback for the insert operation, mysql-js has filled in the missing values.

Default Values

Columns that specify a default value allow but do not require users to specify the values for these columns. If users want to specify values for these columns, the insert function allows specification of 'undefined' for these columns. In these cases, mysql-js will use the default values for these columns.

Deleting data

To remove data from the database, use the session remove function. You use an instance of the domain object to identify the row you want to remove. Only the key field is relevant.

function onSession = function(err, session) {

  var key = new Employee(999);

  session.remove(Employee, onDelete);

  }

};

More extensive queries

Queries are defined using a builder pattern and then executed with parameters that can specialize the query and control the operation of the query.

To define a query, use the createQuery function of Session. Provide a constructor function of a mapped domain object or a table name. The resulting QueryDomainType is returned in the callback. The QueryDomainType is a specialized object that has a property for each property in the user's domain object, or a property for each column in the table. These properties are of type QueryField, and they implement functions that allow you to compare column values of database rows to parameters supplied when you execute the query.

session.createQuery(Employee, function(err, qdt) {

// build and execute the query using qdt

});

To build the query, use the query domain type to filter the results. If nothing else is specified, executing the query will return all rows in the table mapped by Employee as an array of instances of Employee.

To filter the results, similar to using a WHERE clause in SQL, specify a query predicate using the where function of the query domain type. To build a query predicate, you can compare fields in the query domain type to values provided as parameters, using common comparison functions such as equal, greater than, etc. To compare fields, use the query field functions that are created in the query domain type. You can combine predicates using AND and OR functions. For example,

var salaryLowerBound = qdt.param('floor'); // define the formal parameter floor

var compareSalaryLowerBound = qdt.salary.ge(salaryLowerBound); // compare the field salary to the floor

var salaryUpperBound = qdt.param('ceiling'); // define the formal parameter ceiling

var compareSalaryUpperBound = qdt.salary.le(salaryUpperBound); // compare the field salary to the ceiling

 

var combinedComparisons = compareSalaryLowerBound.and(compareSalaryUpperBound);

qdt.where(combinedComparisons); // specify the entire filter for the query

The query api supports a fluent style of query composition. The query can be written as:

qdt.where(qdt.salary.ge(qdt.param('floor')).and(qdt.salary.le(qdt.param('ceiling'))));

The above query filter compares the salary greater or equal to parameter floor and less or equal to ceiling.

Executing Queries

Once the query has been built, you can execute the query, providing the actual parameters to be used in the query. The results of the query are returned as an array in the callback.

qdt.execute({'floor': 40000, 'ceiling': 80000}, function(err, results) {

if (err) throw Error('query failed');

results.forEach(function(result) {

console.out('Employee', result.name, 'has salary', result.salary);

});

}

How to evaluate

The MySQL Connector for JavaScript is available for download and forking from GitHub

Since we are still in the development phase, feedback is especially valuable (so don't hesitate to leave comments on this blog, or head to the MySQL Cluster forum). Try it out and see how easy (and fast) it is to integrate MySQL Cluster into your Node.js platforms.

You can also learn more about other previewed functionality of MySQL Cluster 7.3 Development Milestone Release DevZone article.

Thursday Nov 01, 2012

MySQL Cluster 7.3: On-Demand Webinar and Q&A Available

The on-demand webinar for the MySQL Cluster 7.3 Development Release is now available.

You can learn more about the design, implementation and getting started with all of the new MySQL Cluster 7.3 features from the comfort and convenience of your own device, including:

- Foreign Key constraints in MySQL Cluster

- Node.js NoSQL API 

- Auto-installation of higher performance distributed, clusters

We received some great questions over the course of the webinar, and I wanted to share those for the benefit of a broader audience.

Q. What Foreign Key actions are supported:

A. The core referential actions defined in the SQL:2003 standard are implemented:

CASCADE

RESTRICT

NO ACTION

SET NULL

Q. Where are Foreign Keys implemented, ie data nodes or SQL nodes?

A. They are implemented in the data nodes, therefore can be enforced for both the SQL and NoSQL APIs

Q. Are they compatible with the InnoDB Foreign Key implementation?

A. Yes, with the following exceptions:

- InnoDB doesn’t support “No Action” constraints, MySQL Cluster does

- You can choose to suspend FK constraint enforcement with InnoDB using the FOREIGN_KEY_CHECKS parameter; at the moment, MySQL Cluster ignores that parameter.

- You cannot set up FKs between 2 tables where one is stored using MySQL Cluster and the other InnoDB.

- You cannot change primary keys through the NDB API which means that the MySQL Server actually has to simulate such operations by deleting and re-adding the row. If the PK in the parent table has a FK constraint on it then this causes non-ideal behaviour. With Restrict or No Action constraints, the change will result in an error. With Cascaded constraints, you’d want the rows in the child table to be updated with the new FK value but, the implicit delete of the row from the parent table would remove the associated rows from the child table and the subsequent implicit insert into the parent wouldn’t reinstate the child rows. For this reason, an attempt to add an ON UPDATE CASCADE where the parent column is a primary key will be rejected.

Q. Does adding or dropping Foreign Keys cause downtime due to a schema change?

A. Nope, this is an online operation. MySQL Cluster supports a number of on-line schema changes, ie adding and dropping indexes, adding columns, etc.

Q. Where can I see an example of node.js with MySQL Cluster?

A. Check out the tutorial and download the code from GitHub

Q. Can I use the auto-installer to support remote deployments? How about setting up MySQL Cluster 7.2?

A. Yes to both!

Q. Can I get a demo

Check out the tutorial. You can download the code from http://labs.mysql.com/ Go to Select Build drop-down box

Q. What is be minimum internet speen required for Geo distributed cluster with synchronous replication?

A. if you're splitting you cluster between sites then we recommend a network latency of 20ms or less. Alternatively, use MySQL asynchronous replication where the latency of your WAN doesn't impact the latency of your reads/writes.

Q. Where you can one learn more about the PayPal project with MySQL Cluster?

A. Take a look at the following - you'll find press coverage, a video and slides from their keynote presentation 

So, if you want to learn more, listen to the new MySQL Cluster 7.3 on-demand webinar 

MySQL Cluster 7.3 is still in the development phase, so it would be great to get your feedback on these new features, and things you want to see!


Monday Oct 22, 2012

MySQL Cluster 7.3 - Join This Week's Webinar to Learn What's New

The first Development Milestone and Early Access releases of MySQL Cluster 7.3 were announced just several weeks ago. To provide more detail and demonstrate the new features, Andrew Morgan and I will be hosting a live webinar this coming Thursday 25th October at 0900 Pacific Time / 16.00 UTC

Even if you can't make the live webinar, it is still worth registering for the event as you will receive a notification when the replay will be available, to view on-demand at your convenience

In the webinar, we will discuss the enhancements being previewed as part of MySQL Cluster 7.3, including:

- Foreign Key Constraints: Yes, we've looked into the future and decided Foreign Keys are it ;-)

You can read more about the implementation of Foreign Keys in MySQL Cluster 7.3 here

- Node.js NoSQL API: Allowing web, mobile and cloud services to query and receive results sets from MySQL Cluster, natively in JavaScript, enables developers to seamlessly couple high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

You can study the Node.js / MySQL Cluster tutorial here

- Auto-Installer: This new web-based GUI makes it simple for DevOps teams to quickly configure and provision highly optimized MySQL Cluster deployments on-premise or in the cloud

You can view a YouTube tutorial on the MySQL Cluster Auto-Installer here 

So we have a lot to cover in our 45 minute session. It will be time well spent if you want to know more about the future direction of MySQL Cluster and how it can help you innovate faster, with greater simplicity.

Registration is open 

Friday Oct 05, 2012

New MySQL Cluster 7.3 Previews: Foreign Keys, NoSQL Node.js API and Auto-Tuned Clusters

At this weeks MySQL Connect conference, Oracle previewed an exciting new wave of developments for MySQL Cluster, further extending its simplicity and flexibility by expanding the range of use-cases, adding new NoSQL options, and automating configuration.

What’s new:

  • Development Release 1: MySQL Cluster 7.3 with Foreign Keys
  • Early Access “Labs” Preview: MySQL Cluster NoSQL API for Node.js
  • Early Access “Labs” Preview: MySQL Cluster GUI-Based Auto-Installer

In this blog, I'll introduce you to the features being previewed.

Review the blogs listed below for more detail on each of the specific features discussed.

Save the date!: A live webinar is scheduled for Thursday 25th October at 0900 Pacific Time / 1600UTC where we will discuss each of these enhancements in more detail. Registration will be open soon and published to the MySQL webinars page

MySQL Cluster 7.3: Development Release 1

The first MySQL Cluster 7.3 Development Milestone Release (DMR) previews Foreign Keys, bringing powerful new functionality to MySQL Cluster while reducing development complexity.

Foreign Key support has been one of the most requested enhancements to MySQL Cluster – enabling users to simplify their data models and application logic – while extending the range of use-cases for both custom projects requiring referential integrity and packaged applications, such as eCommerce, CRM, CMS, etc.

Implementation

The Foreign Key functionality is implemented directly within the MySQL Cluster data nodes, allowing any client API accessing the cluster to benefit from them – whether they are SQL or one of the NoSQL interfaces (Memcached, C++, Java, JPA, HTTP/REST or the new Node.js API - discussed later.)

The core referential actions defined in the SQL:2003 standard are implemented:

  • CASCADE
  • RESTRICT
  • NO ACTION
  • SET NULL

In addition, the MySQL Cluster implementation supports the online adding and dropping of Foreign Keys, ensuring the Cluster continues to serve both read and write requests during the operation.  This represents a further enhancement to MySQL Cluster's support for on0line schema changes, ie adding and dropping indexes, adding columns, etc. 

Read this blog for a demonstration of using Foreign Keys with MySQL Cluster. 

Getting Started with MySQL Cluster 7.3 DMR1:

Users can download either the source or binary and evaluate the MySQL Cluster 7.3 DMR with Foreign Keys now! (Select the Development Release tab).


MySQL Cluster NoSQL API for Node.js

Node.js is hot! In a little over 3 years, it has become one of the most popular environments for developing next generation web, cloud, mobile and social applications. Bringing JavaScript from the browser to the server, the design goal of Node.js is to build new real-time applications supporting millions of client connections, serviced by a single CPU core.

Making it simple to further extend the flexibility and power of Node.js to the database layer, we are previewing the Node.js Javascript API for MySQL Cluster as an Early Access release, available for download now from http://labs.mysql.com/. Select the following build:

MySQL-Cluster-NoSQL-Connector-for-Node-js

Alternatively, you can clone the project at the MySQL GitHub page

Implemented as a module for the V8 engine, the new API provides Node.js with a native, asynchronous JavaScript interface that can be used to both query and receive results sets directly from MySQL Cluster, without transformations to SQL.


Figure 1: MySQL Cluster NoSQL API for Node.js enables end-to-end JavaScript development

Rather than just presenting a simple interface to the database, the Node.js module integrates the MySQL Cluster native API library directly within the web application itself, enabling developers to seamlessly couple their high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

The new Node.js API joins a rich array of NoSQL interfaces available for MySQL Cluster. Whichever API is chosen for an application, SQL and NoSQL can be used concurrently across the same data set, providing the ultimate in developer flexibility. 

Get started with MySQL Cluster NoSQL API for Node.js tutorial


MySQL Cluster GUI-Based Auto-Installer

Compatible with both MySQL Cluster 7.2 and 7.3, the Auto-Installer makes it simple for DevOps teams to quickly configure and provision highly optimized MySQL Cluster deployments – whether on-premise or in the cloud.

Implemented with a standard HTML GUI and Python-based web server back-end, the Auto-Installer intelligently configures MySQL Cluster based on application requirements and auto-discovered hardware resources


Figure 2: Automated Tuning and Configuration of MySQL Cluster

Developed by the same engineering team responsible for the MySQL Cluster database, the installer provides standardized configurations that make it simple, quick and easy to build stable and high performance clustered environments.

The auto-installer is previewed as an Early Access release, available for download now from http://labs.mysql.com/, by selecting the MySQL-Cluster-Auto-Installer build.

You can read more about getting started with the MySQL Cluster auto-installer here.

Watch the YouTube video for a demonstration of using the MySQL Cluster auto-installer


Getting Started with MySQL Cluster

If you are new to MySQL Cluster, the Getting Started guide will walk you through installing an evaluation cluster on a singe host (these guides reflect MySQL Cluster 7.2, but apply equally well to 7.3 and the Early Access previews). Or use the new MySQL Cluster Auto-Installer!

Download the Guide to Scaling Web Databases with MySQL Cluster (to learn more about its architecture, design and ideal use-cases).

Post any questions to the MySQL Cluster forum where our Engineering team and the MySQL Cluster community will attempt to assist you.

Post any bugs you find to the MySQL bug tracking system (select MySQL Cluster from the Category drop-down menu)

And if you have any feedback, please post them to the Comments section here or in the blogs referenced in this article.


Summary

MySQL Cluster 7.2 is the GA, production-ready release of MySQL Cluster. The first Development Release of MySQL Cluster 7.3 and the Early Access previews give you the opportunity to preview and evaluate future developments in the MySQL Cluster database, and we are very excited to be able to share that with you.

Let us know how you get along with MySQL Cluster 7.3, and other features that you want to see in future releases, by using the comments of this blog.

Saturday Sep 29, 2012

Tutorial: Getting Started with the NoSQL JavaScript / Node.js API for MySQL Cluster

Tutorial authored by Craig Russell and JD Duncan 

The MySQL Cluster team are working on a new NoSQL JavaScript connector for MySQL. The objectives are simplicity and high performance for JavaScript users:

- allows end-to-end JavaScript development, from the browser to the server and now to the world's most popular open source database

- native "NoSQL" access to the storage layer without going first through SQL transformations and parsing.

Node.js is a complete web platform built around JavaScript designed to deliver millions of client connections on commodity hardware. With the MySQL NoSQL Connector for JavaScript, Node.js users can easily add data access and persistence to their web, cloud, social and mobile applications.

While the initial implementation is designed to plug and play with Node.js, the actual implementation doesn't depend heavily on Node, potentially enabling wider platform support in the future.

Implementation

The architecture and user interface of this connector are very different from other MySQL connectors in a major way: it is an asynchronous interface that follows the event model built into Node.js.

To make it as easy as possible, we decided to use a domain object model to store the data. This allows for users to query data from the database and have a fully-instantiated object to work with, instead of having to deal with rows and columns of the database. The domain object model can have any user behavior that is desired, with the NoSQL connector providing the data from the database.

To make it as fast as possible, we use a direct connection from the user's address space to the database. This approach means that no SQL (pun intended) is needed to get to the data, and no SQL server is between the user and the data.

The connector is being developed to be extensible to multiple underlying database technologies, including direct, native access to both the MySQL Cluster "ndb" and InnoDB storage engines.

The connector integrates the MySQL Cluster native API library directly within the Node.js platform itself, enabling developers to seamlessly couple their high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

The following sections take you through how to connect to MySQL, query the data and how to get started.


Connecting to the database

A Session is the main user access path to the database. You can get a Session object directly from the connector using the openSession function:

var nosql = require("mysql-js");

var dbProperties = {

    "implementation" : "ndb",

    "database" : "test"

};

nosql.openSession(dbProperties, null, onSession);

The openSession function calls back into the application upon creating a Session. The Session is then used to create, delete, update, and read objects.


Reading data

The Session can read data from the database in a number of ways. If you simply want the data from the database, you provide a table name and the key of the row that you want. For example, consider this schema:

create table employee (

  id int not null primary key,

  name varchar(32),

  salary float

) ENGINE=ndbcluster;

Since the primary key is a number, you can provide the key as a number to the find function.

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find('employee', 0, onData);

};

function onData = function(err, data) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(data));

  ... use data in application

};

If you want to have the data stored in your own domain model, you tell the connector which table your domain model uses, by specifying an annotation, and pass your domain model to the find function.

var annotations = new nosql.Annotations();

function Employee = function(id, name, salary) {

  this.id = id;

  this.name = name;

  this.salary = salary;

  this.giveRaise = function(percent) {

    this.salary *= percent;

  }

};

annotations.mapClass(Employee, {'table' : 'employee'});

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find(Employee, 0, onData);

};


Updating data

You can update the emp instance in memory, but to make the raise persistent, you need to write it back to the database, using the update function.

function onData = function(err, emp) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(emp));

  emp.giveRaise(0.12); // gee, thanks!

  session.update(emp); // oops, session is out of scope here

};

Using JavaScript can be tricky because it does not have the concept of block scope for variables. You can create a closure to handle these variables, or use a feature of the connector to remember your variables.

The connector api takes a fixed number of parameters and returns a fixed number of result parameters to the callback function. But the connector will keep track of variables for you and return them to the callback. So in the above example, change the onSession function to remember the session variable, and you can refer to it in the onData function:

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find(Employee, 0, onData, session);

};

function onData = function(err, emp, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(emp));

  emp.giveRaise(0.12); // gee, thanks!

  session.update(emp, onUpdate); // session is now in scope

};

function onUpdate = function(err, emp) {

  if (err) {

    console.log(err);

    ... error handling

  }


Inserting data

Inserting data requires a mapped JavaScript user function (constructor) and a session. Create a variable and persist it:

function onSession = function(err, session) {

  var data = new Employee(999, 'Mat Keep', 20000000);

  session.persist(data, onInsert);

  }

};


Deleting data

To remove data from the database, use the session remove function. You use an instance of the domain object to identify the row you want to remove. Only the key field is relevant.

function onSession = function(err, session) {

  var key = new Employee(999);

  session.remove(Employee, onDelete);

  }

};


More extensive queries

We are working on the implementation of more extensive queries along the lines of the criteria query api. Stay tuned.

How to evaluate

The MySQL Connector for JavaScript is available for download from labs.mysql.com. Select the build:

MySQL-Cluster-NoSQL-Connector-for-Node-js

You can also clone the project on GitHub

Since it is still early in development, feedback is especially valuable (so don't hesitate to leave comments on this blog, or head to the MySQL Cluster forum). Try it out and see how easy (and fast) it is to integrate MySQL Cluster into your Node.js platforms.

You can learn more about other previewed functionality of MySQL Cluster 7.3 here

Wednesday Sep 12, 2012

MySQL Connect: What to Expect From the Wondrous Land of MySQL Cluster

The MySQL Connect conference is only a couple of weeks away, with MySQL engineers, support teams, consultants and community aces busy putting the final touches to their talks.

There will be many exciting new announcements and sharing of best practices at the conference, covering the range of MySQL technologies.

MySQL Cluster will a big part of this, so I wanted to share some key sessions for those of you who plan on attending, as well as some resources for those who are not lucky enough to be able to make the trip, but who can't afford to miss the key news. Of course, this is no substitute to actually being there….and the good news is that registration is still open ;-)

Roadmap:

Whats New in MySQL Cluster Saturday 29th, 1300-1400, in Golden Gate room 5.                                                                                        Bernd Ocklin, director of MySQL Cluster development, and myself will be taking a look at what follows the latest MySQL Cluster 7.2 release. I don't want to give to much away - lets just say its not often you can add powerful new functionality to a product while at the same time making life radically simpler for its users.

For those not making it to the Conference, a live webinar repeating the talk is scheduled for Thursday 25th October at 09.00 pacific time. Hold the date, registration will be open for that soon and published to our MySQL Webinars page

Best Practices

Getting Started with MySQL Cluster, Hands-On Lab Saturday 29th, 1600-1700, in Plaza Room A.                                                              Santo Leto, one of our lead MySQL Cluster support engineers, regularly works with users new to MySQL Cluster, assisting them in installation, configuration, scaling, etc. In this lab, Santo will share best-practices in getting started.

Delivering Breakthrough Performance with MySQL Cluster Saturday 29th, 1730-1830, in Golden Gate room 5.

Frazer Clement, lead MySQL Cluster software engineer, will demonstrate how to translate the awesome Cluster benchmarks (remember 1 BILLION UPDATEs per minute ?!) into real-world performance.

You can also get some best practices from our new MySQL Cluster performance guide 

MySQL Cluster BoF Saturday 29th, 1900-2000, room Golden Gate 5.                                                                                                           Come and get a demonstration of new tools for the installation and configuration of MySQL Cluster, and spend time with the engineering team discussing any questions or issues you may have.

Developing High-Throughput Services with NoSQL APIs to InnoDB and MySQL Cluster Sunday 30th, 1145 - 1245, in Golden Gate room 7.  

In this session, JD Duncan and Andrew Morgan will present how to get started with both Memcached and new NoSQL APIs.

JD and I recently ran a webinar demonstrating how to build simple Twitter-like services with Memcached and MySQL Cluster. The replay is available for download

Case Studies:

MySQL Cluster @ El Chavo, Latin America’s #1 Facebook Game Sunday 30th, 1745 - 1845, in Golden Gate room 4.                             Playful Play deployed MySQL Cluster CGE to power their market leading social game. This session will discuss the challenges they faced, why they selected MySQL Cluster and their experiences to date.

You can read more about Playful Play and MySQL Cluster here 

A Journey into NoSQLand: MySQL’s NoSQL Implementation Sunday 30th, 1345 - 1445, in Golden Gate room 4.                                          Lig Turmelle, web DBA at Kaplan Professional and esteemed Oracle Ace, will discuss her experiences working with the NoSQL interfaces for both MySQL Cluster and InnoDB

Evaluating MySQL HA Alternatives Saturday 29th, 1430-1530, room Golden Gate 5                                                                                   Henrik Ingo, former member of the MySQL sales engineering team, will provide an overview of various HA technologies for MySQL, starting with replication, progressing to InnoDB, Galera and MySQL Cluster

What about the other stuff?

Of course MySQL Connect has much, much more than MySQL Cluster. There will be lots on replication (which I'll blog about soon), MySQL 5.6, InnoDB, cloud, etc, etc. Take a look at the full Content Catalog to see more.

If you are attending, I hope to see you at one of the Cluster sessions...and remember, registration is still open




Tuesday Jul 31, 2012

MySQL Cluster Performance Best Practices: Q & A

With its distributed, shared-nothing, real-time design, MySQL Cluster has attracted a lot of attention from developers who need to scale both read and write traffic with ultra-low latency and fault-tolerance, using commodity hardware. With many proven deployments in web, gaming, telecoms and mobile use-cases, MySQL Cluster is certainly able to meet these sorts of requirements.

But, as a distributed database, developers do need to think a little differently about data access patterns along with schema and query optimisations in order to get the best possible performance.

Sharing best practices developed by working with MySQL Cluster's largest users, we recently ran a Performance Essentials webinar, and the replay is now available, on-demand, for you to listen to in the comfort of your own office.

The webinar also accompanies a newly published Guide to optimizing the performance of MySQL Cluster.

We received a number of great questions over the course of the webinar, and I thought it would be useful to share a selection of those:

Q. How do I calculate and then monitor memory usage with MySQL Cluster?

A. If designing a completely new database, the following calculations can be used to help determine the approximate memory sizing requirements for the data nodes:

(in memory) Data Size * Replicas * 1.25 = Total Database Memory Requirements

Example: 50 GB * 2 * 1.25 = 125 GB

(Data Size * Replicas * 1.25)/Nodes = RAM Per Node

Example: (2 GB * 2 * 1.25)/4 = 31.25 GB

To see how much of the configured memory is currently in use by the database, you can query the ndbinfo.memory usage table

If using MySQL Cluster CGE then you can view this information over time in a MySQL Enterprise Monitor graph.


Q. Would enabling Disk space Table Space be an impact on the Query Performance ?

A. It can do. The only reason to use Disk based table spaces is when you do not have sufficient memory to store all data in-memory. Therefore some of your disk based data will be uncached at some time, and reads or writes which access this data will stall while the necessary pages are read into the page buffer. This can reduce throughput.


Q. I've seen that MySQL Cluster 7.2 can speed up JOIN operations by 70x. How does it do this?

A. There are two new features in MySQL Cluster 7.2, which when combined, can significantly improve the performance of joins over previous versions of MySQL Cluster:

- The Index Statistics function enables the SQL optimizer to build a better execution plan for each query. In the past, non-optimal query plans required a manual enforcement of indexes via USE INDEX or FORCE INDEX to alter the execution plan. ANALYZE TABLE must first be run on each table to take advantage of this.

- Adaptive Query Localization (AQL) allows the work of the join to be distributed across the data nodes (local to the data it’s working with) rather than up in the MySQL Server; this allows more computing power to be applied to calculating the join as well as dramatically reducing the number of messages being passed around the system.

You can learn more about AQL and a sample query here


Q. Can all JOINs use AQL?

A. In order for a join to be able to exploit AQL (in other words be “pushed down” to the data nodes), it must meet the following conditions:

1. Any columns to be joined must use exactly the same data type. (For example, if an INT and a BIGINT column are joined, the join cannot be pushed down). This includes the lengths of any VARCHAR columns.

2. Joins referencing BLOB or TEXT columns will not be pushed down.

3. Explicit locking is not supported; however, the NDB (MySQL Cluster) storage engine's characteristic implicit row-based locking is enforced.

4. In order for a join to be pushed down, child tables in the Join must be accessed using one of the ref, eq_ref, or const access methods, or some combination of these methods. These access methods are described in the documentation

5. Joins referencing tables explicitly partitioned by [LINEAR] HASH, LIST, or RANGE currently cannot be pushed down

6. If the query plan decides to 'Using join buffer' for a candidate child table, that table cannot be pushed as child. However, it might be the root of another set of pushed tables.

7. If the root of the pushed Join is an eq_ref or const, only child tables joined by eq_ref can be appended. (A ref joined table will then likely become a root of another pushed Join)

These conditions should be considered when designing your schema and application queries – for example, to comply with constraint 4, attempt to make any table scan that is part of the Join be the first clause.

Where a query involves multiple levels of Joins, it is perfectly possible for some levels to be pushed down while others continue to be executed within the MySQL Server.

If your application consists of many of these types of JOIN operations which cannot be made to exploit AQL, other MySQL storage engines such as InnoDB will present a better option for your workload.


Q. What are best practices for data model and query design?

A. The data model and queries should be designed to minimize network roundtrips between hosts. Ensuring that joins meet the requirements for

AQL and avoiding full table scans can help with this.

Looking up data in a hash table is a constant time operation, unaffected by the size of the data set

Looking up data in a tree (T-tree, B-tree etc) structure is logarithmic (O (log n)).

For a database designer this means it is very important to choose the right index structure and access method to retrieve data. We strongly recommend application requests with high requirements on performance be designed as primary key lookups. This is because looking up data in a hash structure is faster than from a tree structure and can be satisfied by a single data node. Therefore, it is very important that the data model takes this into account. It also follows that choosing a good primary key definition is extremely important.

If ordered index lookups are required then tables should be partitioned such that only one data node will be scanned.

The distributed nature of the Cluster and the ability to exploit multiple CPUs, cores or threads within nodes means that the maximum performance will be achieved if the application is architected to run many transactions in parallel. Alternatively you should run many instances of the application simultaneously to ensure that the Cluster is always able to work on many transactions in parallel.

Take a look at the Guide to optimizing the performance of MySQL Cluster for more detail


Q. What are best practices for parallelising my application and access to MySQL Cluster?

A. As mentioned MySQL Cluster is a distributed, auto-sharded database. This means that there is often more than one Data Node that can work in parallel to satisfy application requests.

Additionally, MySQL Cluster 7.2 enhances multi-threading so data nodes can now effectively exploit multiple threads / cores. To use this functionality, the data nodes should be started using the ndbmtd binary rather than ndb and config.ini should be configured correctly

Parallelization can be achieved in MySQL Cluster in several ways:

- Adding more Application Nodes

- Use of multi-threaded data nodes

- Batching of requests

- Parallelizing work on different Application Nodes connected to the Data Nodes

- Utilizing multiple connections between each Application Node and the Data Nodes (connection pooling)

How many threads and how many applications are needed to drive the desired load has to be studied by benchmarks. One approach of doing this is to connect one Application Node at a time and increment the number of threads. When one Application Node cannot generate any more load, add another one. It is advisable to start studying this on a two Data Node cluster, and then grow the number of Data Nodes to understand how your system is scaling.

If you have designed your application queries, and data model according to best practices presented in the Performance Guide, you can expect close to double the throughput on a four Data Node system compared to a two Data Node system, given that the application can generate the load.

Try to multi-thread whenever possible and load balance over more MySQL servers.

In MySQL Cluster you have access to additional performance enhancements that allow better utilization on multi-core / thread CPUs, including:

- Reduced lock contention by having multiple connections from one MySQL Server to the Data Nodes (--ndb-cluster-connection-pool=X):

- Setting threads to real-time priority

- Locking Data Node threads (kernel thread and maintenance threads to a CPU)


Q. Does MySQL Cluster’s data distribution add complexity to my application and limit the types of queries I can run?

A. No, it doesn't. By default, tables are automatically partitioned (sharded) across data nodes by hashing the primary key. Other partitioning methods are supported, but in most instances the default is acceptable.

As the sharding is automatic and implemented at the database layer, application developers do not need to modify their applications to control data distribution – which significantly simplifies scaling.

In addition, applications are free to run complex queries such as JOIN operations across the shards, therefore users do not need to trade functionality for scalability.


Q. What hardware would you recommend to get the best performance from MySQL Cluster?

A. It varies by node type. For data nodes:

- Up to 32 x x86-64 bit CPU cores. Use as high a frequency as possible as this will enable faster processing of messages between nodes;

- Large CPU caches assist in delivering optimal performance;

- 64-bit hosts with enough RAM to store your in-memory data set

- Linux, Solaris or Windows operating systems.

- 2 x Network Interface Cards and 2 x Power Supply Units for hardware redundancy.

It is important to ensure systems are configured to reduce swapping to disk whenever possible.

As a rule of thumb, have 7x times the amount of DataMemory configured for disk space for each data node. This space is needed for storing 2 Local Checkpoints (LCPs), the Redo log and 3 backups. You will also want to allocate space for table spaces if you are making use of disk-based data – including allowing extra space for the backups.

Having a fast, low-latency disk subsystem is very important and will affect check pointing and backups.

Download the MySQL Cluster Evaluation Guide for more recommendations 

The hardware requirements for MySQL Servers would be a little less:

- 4 - 32 x86-64 bit CPU cores

- Minimum 4GB of RAM. Memory is not as critical at this layer, and requirements will be influenced by connections and buffers.

- 2 x Network Interface Cards and 2 x Power Supply Units for hardware redundancy.


Q. I heard that MySQL Cluster doesn't support Foreign Keys, how can I get around that?

A. Foreign keys are previewed in MySQL Cluster 7.3 Early Access release which you can download and evaluate now. In MySQL Cluster 7.2 and earlier, you can emulate foreign keys programmatically via triggers.


Summary

If you are thinking about using MySQL Cluster for your next project, it is worth investing a little bit of time to get familiar with these performance best practices. The Webinar replay, the MySQL Cluster Performance Guide and the MySQL Cluster Evaluation Guide will give you pretty much everything you need to build high performance, high availability services with MySQL Cluster. 




Tuesday Jul 17, 2012

MySQL Cluster Powers El Chavo from Playful Play, Latin America’s Most Popular Facebook Game

Introduction

Attracting over 3m subscribers in just 6 months and growing by 30,000 new users per day, Playful Play needed a database that was able to keep pace with the massive scalability and high availability demands of the wildly successful La Vecindad de El Chavo Facebook game.

Playful Play selected MySQL Cluster CGE running on a public cloud to power their gaming platform, providing:

  • 45% improvement in performance;
  • 99.999% uptime;
  • 80% reduction in DBA overhead;
  • Local language support, 24x7.

As a result, Playful Play has been able to maintain user growth rates and attract new advertisers to the platform, while enhancing agility and reducing cost.

Company Overview

Based out of Monterrey, Mexico, Playful Play has created Latin America’s #1 Facebook game based on "El Chavo del 8", the cultural and TV phenomenon that has been running across Latin America for the past four decades. The show is also extremely popular in Spain and the United States.

El Chavo (The Kid) appeals to a broad demographic, having attracted over 3M subscribers in its first 6 months, and adding 30,000 new users per day. The game has also proved popular with advertisers who are keen to integrate social gaming into their marketing strategies in order to raise awareness and build brand loyalty.

Through partnerships with Televisa Home Entertainment and Grupo Chespirito, Playful Play has aggressive plans to grow their business through the development of new social games targeting Latin America, Brazil, the United States and Spain.


The Challenges of Rapid Growth

As a start-up business, fast time to market at the lowest possible case was the leading development priority after Playful Play secured the rights to develop the El Chavo game.

As a result, Playful Play developed the first release of the game on the proven LAMP (Linux, Apache, MySQL, PHP / Perl / Python) stack.

To meet both the scalability and availability requirements of the game, Playful Play initially deployed MySQL in a replicated, multi-master configuration.

The database is core to the game, responsible for managing:

  • User profiles and avatars
  • Gaming session data;
  • In-app (application) purchases;
  • Advertising and digital marketing event data.

As La Vecidad de El Chavo spread virally across Facebook, subscriptions rapidly exceeded one million users, leading Playful Play to consider how to best architect their gaming platforms for long-term growth. They had heard about the release of MySQL Cluster 7.2, and so decided to initiate an evaluation to determine if it could meet their requirements.


The Route to Internet-Scale with MySQL Cluster

In addition to growing user volumes, the El Chavo game also added new features that changed the profile of the database. Operations became more write-intensive, with INSERTs and UPDATEs accounting for up to 70% of the database load.

The game’s popularity also attracted advertisers, who demanded strict SLAs for both performance (predictable throughput with low latency) as well as uptime.

Through their evaluation, the developers at Playful Play identified that MySQL Cluster was able to meet their needs.

Write Performance with Real-Time Responsiveness

MySQL Cluster’s multi-master architecture, coupled with the ability to auto-shard (partition) tables across nodes, enabled Playful Play to meet the performance and scalability demands of the El Chavo game, without changes to their applications.

The ability to store data in memory and persist to disk delivered the sub-millisecond responsiveness and durability demanded by the game’s users.

In addition, Playful Play was able to horizontally scale the database across low-cost commodity nodes running in the cloud, reducing their TCO and enhancing agility.

Continuous Availability

The shared-nothing, distributed design of MySQL Cluster coupled with integrated replication and self-healing recovery ensured high availability of the gaming platform, without DevOps intervention.

MySQL Cluster was able to protect against downtime resulting from both failures and planned upgrades. For example, Playful Play has been able to scale the cluster on-line to support growing user volumes, without service interruption.

Data Integrity Supporting Monetization

As the game evolved to support in-app purchases and digital marketing, the ACID-compliance offered by MySQL Cluster afforded the transactional integrity demanded by users and advertisers.


The Value of MySQL Cluster Consulting, Support and Management Tools

Following their successful evaluation, Playful Play deployed MySQL Cluster with their public cloud provider.

Playful Play wanted to ensure they were getting the most out of the deployment, so Oracle’s MySQL consulting team reviewed their Architecture and Design. The consultants were able to further optimize the database to deliver even higher levels of performance, and demonstrated how Playful Play could automatically scale the database using MySQL Cluster Manager.

The MySQL Cluster deployment at Playful Play is growing rapidly, and currently comprised of over 25 nodes running Linux on commodity x86 servers, each configured with 24-cores and 64GB of RAM:

  • 12 x MySQL Cluster data nodes;
  • 12 x MySQL Server SQL nodes;
  • 2 x MySQL Cluster management nodes.

Each data and SQL node is co-located to a single physical instance.

 Playful Play Deployment Architecture, built for rapid scale

Using the architecture above, MySQL Cluster is currently supporting:

  • 3 million subscribers, with 30,000 new additions per day across Latin America, Europe and U.S;
  • 10,000 concurrent users;
  • 10,000 Transactions Per Second;
  • 99.999% uptime.

"The MySQL support service has been essential in helping us for troubleshooting and giving recommendations for the production cluster, Thanks" Carlos Morales – DBA, Playfulplay.com México

Playful Play has deployed MySQL Cluster Carrier-Grade Edition, providing access to technical and consultative support, in addition to the MySQL Enterprise Monitor and MySQL Cluster Manager tools.


MySQL Enterprise Monitor Enables Continuous Service Availability

MySQL Enterprise Monitor enables Playful Play to proactively monitor the entire cluster from a single GUI. DBAs are automatically notified if any environment variables begin to exceed pre-defined thresholds, and presented with the instructions necessary to take corrective action, before users are impacted.

MySQL Cluster Manager is used to automate cluster configuration changes, eliminating the risk of manual errors and significantly enhancing DevOps productivity.

The subscription has delivered terrific value to Playful Play, enabling them to:

  • Improve performance by 45% while reducing CPU utilization by 35%, providing significant headroom for continued growth;
  • Reduced DBA overhead by 80%, providing substantial cost savings;
  • Access to local language support, 24x7.


The Future with MySQL Cluster

Playful Play has aggressive growth plans, seeking to attract over 50M subscribers within 5 years. El Chavo del 8 is continuing to gain widespread popularity in Latin regions around the world, and will be joined by other social networking games currently in development. The next major target is Brazil, presenting a very attractive, emerging market in social gaming.

MySQL Cluster CGE has been selected as the database platform powering this growth.

Playful Play will be presenting their experiences and best practices with MySQL Cluster at the MySQL Connect conference, September 29th and 30th 2012 in San Francisco.


Further Resources

MySQL Cluster at Playful Play On-Demand Webinar (Spanish)

MySQL Cluster Demonstration

Guide to Scaling Web Databases with MySQL Cluster



Monday Jul 02, 2012

NoSQL Java API for MySQL Cluster: Questions & Answers

The MySQL Cluster engineering team recently ran a live webinar, available now on-demand demonstrating the ClusterJ and ClusterJPA NoSQL APIs for MySQL Cluster, and how these can be used in building real-time, high scale Java-based services that require continuous availability.

Attendees asked a number of great questions during the webinar, and I thought it would be useful to share those here, so others are also able to learn more about the Java NoSQL APIs.

First, a little bit about why we developed these APIs and why they are interesting to Java developers.

ClusterJ and Cluster JPA

ClusterJ is a Java interface to MySQL Cluster that provides either a static or dynamic domain object model, similar to the data model used by JDO, JPA, and Hibernate. A simple API gives users extremely high performance for common operations: insert, delete, update, and query.

ClusterJPA works with ClusterJ to extend functionality, including

- Persistent classes

- Relationships

- Joins in queries

- Lazy loading

- Table and index creation from object model

By eliminating data transformations via SQL, users get lower data access latency and higher throughput. In addition, Java developers have a more natural programming method to directly manage their data, with a complete, feature-rich solution for Object/Relational Mapping. As a result, the development of Java applications is simplified with faster development cycles resulting in accelerated time to market for new services.

MySQL Cluster offers multiple NoSQL APIs alongside Java:

  • - Memcached for a persistent, high performance, write-scalable Key/Value store,
  • - HTTP/REST via an Apache module
  • - C++ via the NDB API for the lowest absolute latency.

Developers can use SQL as well as NoSQL APIs for access to the same data set via multiple query patterns – from simple Primary Key lookups or inserts to complex cross-shard JOINs using Adaptive Query Localization

Marrying NoSQL and SQL access to an ACID-compliant database offers developers a number of benefits. MySQL Cluster’s distributed, shared-nothing architecture with auto-sharding and real time performance makes it a great fit for workloads requiring high volume OLTP. Users also get the added flexibility of being able to run real-time analytics across the same OLTP data set for real-time business insight.

OK – hopefully you now have a better idea of why ClusterJ and JPA are available. Now, for the Q&A.

Q & A

Q. Why would I use Connector/J vs. ClusterJ?

A. Partly it's a question of whether you prefer to work with SQL (Connector/J) or objects (ClusterJ). Performance of ClusterJ will be better as there is no need to pass through the MySQL Server. A ClusterJ operation can only act on a single table (e.g. no joins) - ClusterJPA extends that capability

Q. Can I mix different APIs (ie ClusterJ, Connector/J) in our application for different query types?

A. Yes. You can mix and match all of the API types, SQL, JDBC, ODBC, ClusterJ, Memcached, REST, C++. They all access the exact same data in the data nodes. Update through one API and new data is instantly visible to all of the others.

Q. How many TCP connections would a SessionFactory instance create for a cluster of 8 data nodes?

A. SessionFactory has a connection to the mgmd (management node) but otherwise is just a vehicle to create Sessions. Without using connection pooling, a SessionFactory will have one connection open with each data node. Using optional connection pooling allows multiple connections from the SessionFactory to increase throughput.

Q. Can you give details of how Cluster J optimizes sharding to enhance performance of distributed query processing?

A. Each data node in a cluster runs a Transaction Coordinator (TC), which begins and ends the transaction, but also serves as a resource to operate on the result rows. While an API node (such as a ClusterJ process) can send queries to any TC/data node, there are performance gains if the TC is where most of the result data is stored. ClusterJ computes the shard (partition) key to choose the data node where the row resides as the TC.

Q. What happens if we perform two primary key lookups within the same transaction? Are they sent to the data node in one transaction?

A. ClusterJ will send identical PK lookups to the same data node.

Q. How is distributed query processing handled by MySQL Cluster ?

A. If the data is split between data nodes then all of the information will be transparently combined and passed back to the application. The session will connect to a data node - typically by hashing the primary key - which then interacts with its neighboring nodes to collect the data needed to fulfil the query.

Q. Can I use Foreign Keys with MySQL Cluster

A. Support for Foreign Keys is included in the MySQL Cluster 7.3 Early Access release

Summary

The NoSQL Java APIs are packaged with MySQL Cluster, available for download here so feel free to take them for a spin today!

Key Resources

MySQL Cluster on-line demo 

MySQL ClusterJ and JPA On-demand webinar 

MySQL ClusterJ and JPA documentation

MySQL ClusterJ and JPA whitepaper and tutorial

Tuesday Jun 05, 2012

MySQL Cluster 7.3 Labs Release – Foreign Keys Are In!

Summary (aka TL/DR):

Support for Foreign Key constraints has been one of the most requested feature enhancements for MySQL Cluster. We are therefore extremely excited to announce that Foreign Keys are part of the first Labs Release of MySQL Cluster 7.3 – available for download, evaluation and feedback now! (Select the mysql-cluster-7.3-labs-June-2012 build)

In this blog, I will attempt to discuss the design rationale, implementation, configuration and steps to get started in evaluating the first MySQL Cluster 7.3 Labs Release.

Pace of Innovation

It was only a couple of months ago that we announced the General Availability (GA) of MySQL Cluster 7.2, delivering 1 billion Queries per Minute, with 70x higher cross-shard JOIN performance, Memcached NoSQL key-value API and cross-data center replication.  This release has been a huge hit, with downloads and deployments quickly reaching record levels.

The announcement of the first MySQL Cluster 7.3 Early Access lab release at today's MySQL Innovation Day event demonstrates the continued pace in Cluster development, and provides an opportunity for the community to evaluate and feedback on new features they want to see.

What’s the Plan for MySQL Cluster 7.3?

Well, Foreign Keys, as you may have gathered by now (!), and this is the focus of this first Labs Release.

As with MySQL Cluster 7.2, we plan to publish a series of preview releases for 7.3 that will incrementally add new candidate features for a final GA release (subject to usual safe harbor statement below*), including:

- New NoSQL APIs;

- Features to automate the configuration and provisioning of multi-node clusters, on premise or in the cloud;

- Performance and scalability enhancements;

- Taking advantage of features in the latest MySQL 5.x Server GA.

Design Rationale

MySQL Cluster is designed as a “Not-Only-SQL” database. It combines attributes that enable users to blend the best of both relational and NoSQL technologies into solutions that deliver web scalability with 99.999% availability and real-time performance, including:

  • Concurrent NoSQL and SQL access to the database;
  • Auto-sharding with simple scale-out across commodity hardware;
  • Multi-master replication with failover and recovery both within and across data centers;
  • Shared-nothing architecture with no single point of failure;
  • Online scaling and schema changes;
  • ACID compliance and support for complex queries, across shards.

Native support for Foreign Key constraints enables users to extend the benefits of MySQL Cluster into a broader range of use-cases, including:

- Packaged applications in areas such as eCommerce and Web Content Management that prescribe databases with Foreign Key support.

- In-house developments benefiting from Foreign Key constraints to simplify data models and eliminate the additional application logic needed to maintain data consistency and integrity between tables.

Implementation

The Foreign Key functionality is implemented directly within MySQL Cluster’s data nodes, allowing any client API accessing the cluster to benefit from them – whether using SQL or one of the NoSQL interfaces (Memcached, C++, Java, JPA or HTTP/REST.)

The core referential actions defined in the SQL:2003 standard are implemented:

  • CASCADE
  • RESTRICT
  • NO ACTION
  • SET NULL

In addition, the MySQL Cluster implementation supports the online adding and dropping of Foreign Keys, ensuring the Cluster continues to serve both read and write requests during the operation.

An important difference to note with the Foreign Key implementation in InnoDB is that MySQL Cluster does not support the updating of Primary Keys from within the Data Nodes themselves - instead the UPDATE is emulated with a DELETE followed by an INSERT operation. Therefore an UPDATE operation will return an error if the parent reference is using a Primary Key, unless using CASCADE action, in which case the delete operation will result in the corresponding rows in the child table being deleted. The Engineering team plans to change this behavior in a subsequent preview release.

Also note that when using InnoDB "NO ACTION" is identical to "RESTRICT". In the case of MySQL Cluster “NO ACTION” means “deferred check”, i.e. the constraint is checked before commit, allowing user-defined triggers to automatically make changes in order to satisfy the Foreign Key constraints.

Configuration

There is nothing special you have to do here – Foreign Key constraint checking is enabled by default.

If you intend to migrate existing tables from another database or storage engine, for example from InnoDB, there are a couple of best practices to observe:

1. Analyze the structure of the Foreign Key graph and run the ALTER TABLE ENGINE=NDB in the correct sequence to ensure constraints are enforced

2. Alternatively drop the Foreign Key constraints prior to the import process and then recreate when complete.

Getting Started

Read this blog for a demonstration of using Foreign Keys with MySQL Cluster. 

You can download MySQL Cluster 7.3 Labs Release with Foreign Keys today - (select the mysql-cluster-7.3-labs-June-2012 build)

If you are new to MySQL Cluster, the Getting Started guide will walk you through installing an evaluation cluster on a singe host (these guides reflect MySQL Cluster 7.2, but apply equally well to 7.3)

Post any questions to the MySQL Cluster forum where our Engineering team will attempt to assist you.

Post any bugs you find to the MySQL bug tracking system (select MySQL Cluster from the Category drop-down menu)

And if you have any feedback, please post them to the Comments section of this blog.

Summary

MySQL Cluster 7.2 is the GA, production-ready release of MySQL Cluster. This first Labs Release of MySQL Cluster 7.3 gives you the opportunity to preview and evaluate future developments in the MySQL Cluster database, and we are very excited to be able to share that with you.

Let us know how you get along with MySQL Cluster 7.3, and other features that you want to see in future releases.

* Safe Harbor Statement

This information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracles products remains at the sole discretion of Oracle.

Friday Jun 01, 2012

Configuring MySQL Cluster Data Nodes

In my previous blog post, I discussed the enhanced performance and scalability delivered by extensions to the multi-threaded data nodes in MySQL Cluster 7.2. In this post, I’ll share best practices on the configuration of data nodes to achieve optimum performance on the latest generations of multi-core, multi-thread CPU designs.

Configuring the Data Nodes

The configuration of data node threads can be managed in two ways via the config.ini file:

- Simply set MaxNoOfExecutionThreads to the appropriate number of threads to be run in the data node, based on the number of threads presented by the processors used in the host or VM.

- Use the new ThreadConfig variable that enables users to configure both the number of each thread type to use and also which CPUs to bind them too.

The flexible configuration afforded by the multi-threaded data node enhancements means that it is possible to optimise data nodes to use anything from a single CPU/thread up to a 48 CPU/thread server. Co-locating the MySQL Server with a single data node can fully utilize servers with 64 – 80 CPU/threads. It is also possible to co-locate multiple data nodes per server, but this is now only required for very large servers with 4+ CPU sockets dense multi-core processors.

24 Threads and Beyond!

An example of how to make best use of a 24 CPU/thread server box is to configure the following:

- 8 ldm threads

- 4 tc threads

- 3 recv threads

- 3 send threads

- 1 rep thread for asynchronous replication.

Each of those threads should be bound to a CPU. It is possible to bind the main thread (schema management domain) and the IO threads to the same CPU in most installations.

In the configuration above, we have bound threads to 20 different CPUs. We should also protect these 20 CPUs from interrupts by using the IRQBALANCE_BANNED_CPUS configuration variable in /etc/sysconfig/irqbalance and setting it to 0x0FFFFF.

The reason for doing this is that MySQL Cluster generates a lot of interrupt and OS kernel processing, and so it is recommended to separate activity across CPUs to ensure conflicts with the MySQL Cluster threads are eliminated.

When booting a Linux kernel it is also possible to provide an option isolcpus=0-19 in grub.conf. The result is that the Linux scheduler won't use these CPUs for any task. Only by using CPU affinity syscalls can a process be made to run on those CPUs.

By using this approach, together with binding MySQL Cluster threads to specific CPUs and banning CPUs IRQ processing on these tasks, a very stable performance environment is created for a MySQL Cluster data node.

On a 32 CPU/Thread server:

- Increase the number of ldm threads to 12

- Increase tc threads to 6

- Provide 2 more CPUs for the OS and interrupts.

- The number of send and receive threads should, in most cases, still be sufficient.

On a 40 CPU/Thread server, increase ldm threads to 16, tc threads to 8 and increment send and receive threads to 4.

On a 48 CPU/Thread server it is possible to optimize further by using:

- 12 tc threads

- 2 more CPUs for the OS and interrupts

- Avoid using IO threads and main thread on same CPU

- Add 1 more receive thread.

Summary

As both this and the previous post seek to demonstrate, the multi-threaded data node extensions not only serve to increase performance of MySQL Cluster, they also enable users to achieve significantly improved levels of utilization from current and future generations of massively multi-core, multi-thread processor designs.

A big thanks to Mikael Ronstrom, Senior MySQL Architect at Oracle, for his work in developing these enhancements and best practices.

You can download MySQL Cluster 7.2 today and try out all of these enhancements. The Getting Started guides are an invaluable aid to quickly building a Proof of Concept

Don’t forget to check out the MySQL Cluster 7.2 New Features whitepaper to discover everything that is new in the latest GA release

Wednesday May 30, 2012

MySQL Cluster 7.2: Over 8x Higher Performance than Cluster 7.1

Summary

The scalability enhancements delivered by extensions to multi-threaded data nodes enables MySQL Cluster 7.2 to deliver over 8x higher performance than the previous MySQL Cluster 7.1 release on a recent benchmark

What’s New in MySQL Cluster 7.2

MySQL Cluster 7.2 was released as GA (Generally Available) in February 2012, delivering many enhancements to performance on complex queries, new NoSQL Key / Value API, cross-data center replication and ease-of-use. These enhancements are summarized in the Figure below, and detailed in the MySQL Cluster New Features whitepaper

Figure 1: Next Generation Web Services, Cross Data Center Replication and Ease-of-Use

Once of the key enhancements delivered in MySQL Cluster 7.2 is extensions made to the multi-threading processes of the data nodes.

Multi-Threaded Data Node Extensions
The MySQL Cluster 7.2 data node is now functionally divided into seven thread types:
1) Local Data Manager threads (ldm). Note – these are sometimes also called LQH threads.
2) Transaction Coordinator threads (tc)
3) Asynchronous Replication threads (rep)
4) Schema Management threads (main)
5) Network receiver threads (recv)
6) Network send threads (send)
7) IO threads

Each of these thread types are discussed in more detail below.

MySQL Cluster 7.2 increases the maximum number of LDM threads from 4 to 16. The LDM contains the actual data, which means that when using 16 threads the data is more heavily partitioned (this is automatic in MySQL Cluster). Each LDM thread maintains its own set of data partitions, index partitions and REDO log. The number of LDM partitions per data node is not dynamically configurable, but it is possible, however, to map more than one partition onto each LDM thread, providing flexibility in modifying the number of LDM threads.

The TC domain stores the state of in-flight transactions. This means that every new transaction can easily be assigned to a new TC thread. Testing has shown that in most cases 1 TC thread per 2 LDM threads is sufficient, and in many cases even 1 TC thread per 4 LDM threads is also acceptable. Testing also demonstrated that in some instances where the workload needed to sustain very high update loads it is necessary to configure 3 to 4 TC threads per 4 LDM threads. In the previous MySQL Cluster 7.1 release, only one TC thread was available. This limit has been increased to 16 TC threads in MySQL Cluster 7.2. The TC domain also manages the Adaptive Query Localization functionality introduced in MySQL Cluster 7.2 that significantly enhanced complex query performance by pushing JOIN operations down to the data nodes.

Asynchronous Replication was separated into its own thread with the release of MySQL Cluster 7.1, and has not been modified in the latest 7.2 release.

To scale the number of TC threads, it was necessary to separate the Schema Management domain from the TC domain. The schema management thread has little load, so is implemented with a single thread.

The Network receiver domain was bound to 1 thread in MySQL Cluster 7.1. With the increase of threads in MySQL Cluster 7.2 it is also necessary to increase the number of recv threads to 8. This enables each receive thread to service one or more sockets used to communicate with other nodes the Cluster.

The Network send thread is a new thread type introduced in MySQL Cluster 7.2. Previously other threads handled the sending operations themselves, which can provide for lower latency. To achieve highest throughput however, it has been necessary to create dedicated send threads, of which 8 can be configured. It is still possible to configure MySQL Cluster 7.2 to a legacy mode that does not use any of the send threads – useful for those workloads that are most sensitive to latency.

The IO Thread is the final thread type and there have been no changes to this domain in MySQL Cluster 7.2. Multiple IO threads were already available, which could be configured to either one thread per open file, or to a fixed number of IO threads that handle the IO traffic. Except when using compression on disk, the IO threads typically have a very light load.

Benchmarking the Scalability Enhancements

The scalability enhancements discussed above have made it possible to scale CPU usage of each data node to more than 5x of that possible in MySQL Cluster 7.1. In addition, a number of bottlenecks have been removed, making it possible to scale data node performance by even more than 5x.

Figure 2: MySQL Cluster 7.2 Delivers 8.4x Higher Performance than 7.1

The flexAsynch benchmark was used to compare MySQL Cluster 7.2 performance to 7.1 across an 8-node Intel Xeon x5670-based cluster of dual socket commodity servers (6 cores each).

As the results demonstrate, MySQL Cluster 7.2 delivers over 8x higher performance per data nodes than MySQL Cluster 7.1.

More details of this and other benchmarks will be published in a new whitepaper – coming soon, so stay tuned!

In a following blog post, I’ll provide recommendations on optimum thread configurations for different types of server processor. You can also learn more from the Best Practices Guide to Optimizing Performance of MySQL Cluster

Conclusion

MySQL Cluster has achieved a range of impressive benchmark results, and set in context with the previous 7.1 release, is able to deliver over 8x higher performance per node.

As a result, the multi-threaded data node extensions not only serve to increase performance of MySQL Cluster, they also enable users to achieve significantly improved levels of utilization from current and future generations of massively multi-core, multi-thread processor designs.

Tuesday May 29, 2012

Performance Testing of MySQL Cluster: The flexAsynch Benchmark

Following the release of MySQL Cluster 7.2, the Engineering has been busy publishing a range of new performance benchmarks, most recently delivering 1.2 Billion UPDATE operations per Minute across a cluster of 30 x commodity Intel Xeon E5-based servers.

Figure 1: Linear Scaling of Write Operations

These performance tests have been run on the flexAsynch benchmark, so in the this blog, I wanted to provide a little more detail on that benchmark, and provide guidance on how you can use it in your own performance evaluations.

FlexAsynch is an open source, highly adaptable test suite that can be downloaded as part of the MySQL Cluster source tarball under the <storage/ndb/test/ndbapi> directory.

An automated tool is available to run the benchmark, with full instructions documented in the README file packaged in the dbt2-0.37.50 tarball. The tarball also includes the scripts to run the benchmark, and are further described on the MySQL benchmark page.

The benchmark reads or updates an entire row from the database as part of its test operation. All UPDATE operations are fully transactional. As part of these tests, each row in this benchmark is 100 bytes total, comprising 25 columns, each 4 bytes in size, though the size and number of columns are fully configurable.

Database access from the application tier is via the C++ NDB API, one of the NoSQL interfaces implemented by MySQL Cluster that bypasses the SQL layer to communicate directly with the data nodes. Other NoSQL interfaces include the Memcached API, Java, JPA and HTTP/REST

flexAsynch makes it possible to generate a variety of loads for MySQL Cluster benchmarking, enabling users to simulate their own environment.

Using the scripts in the dbt2-0.37.50 tarball and flexAsynch it is possible to configure a range of parameters, including:

  • The number of benchmark drivers and the number of threads per benchmark driver;
  • The number of simultaneous transactions executed per thread;
  • The size and number of records;
  • The type of operation performed (INSERT, UPDATE, DELETE, SELECT);
  • The size of the database.

By using flexAsync, users can test the limits of MySQL Cluster performance. The size of the database used for testing is configurable, but given that flexAsynch can load more than 1GB of data per second into the MySQL Cluster database, users should ensure the database is large in order to achieve consistent benchmark numbers.

We will shortly publish a whitepaper that discusses MySQL Cluster benchmarking in more detail – so stay tuned for that. In the meantime, flexAsynch is fully available today for you to run your own tests. Also check out our Best Practices guide for optimizing the performance of MySQL Cluster 

About

Get the latest updates on products, technology, news, events, webcasts, customers and more.

Twitter


Facebook

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
2
5
6
9
10
11
12
13
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today