Monday Apr 14, 2014

Developing with MySQL and NoSQL

MySQL adopts a very different approach to 'NoSQL' than other databases. With the memcached plugin, MySQL provides the speed and high availability benefits of a standard 'NoSQL' database solution, while mitigating many of the drawbacks to this approach.

A traditional memcached application bypasses the SQL layer entirely, and stores all its data in memory. This makes data access extremely fast, but there is a risk that the data will disappear in the event of a system problem. 

The MySQL memcached plugin for InnoDB also bypasses the SQL and optimization layers, resulting in excellent performance. It goes further, writing key-value data directly to  InnoDB tables. The result is fast data access while retaining the advantages provided by the existing relational database infrastructure, such as the ability to run complex queries with SQL, maintain data integrity, provide real-time analytics to the business, and work with big data platforms such as Hadoop.

MySQL Developers can learn about topics such as using NoSQL and memcached, as well as much more in the MySQL for Developers course.

This live 5-day instructor-led course is available as a:

  • Live-Virtual Event: Take this course from your own desk, no travel required. Choose from a selection of events on the schedule to suit different timezones.
  • In-Class Event: Travel to an education center to take this course. Below is a selection of the events already on the schedule:

 Location  Date  Delivery Language
 Vienna, Austria
 17 November 2014
 German
 Brussels, Belgium
 12 May 2014
 English
 Sao Paolo, Brazil
 12 May 2014
 Brazilian Portuguese
 London, England
 29 September 2014
 English
 Rome, Italy
 19 May 2014
 Italian
 Nairobi, Kenya
 19 May 2014
 English
 Petaling Jaya, Malaysia
 14 April 2014
 English
 Utrecht, Netherlands
 12 May 2014
 English
 Makati City, Philippines
 29 September 2014
 English
 Lisbon, Portugal
 26 May 2014
 European Portuguese
 Porto, Portugal
 26 May 2014
 European Portuguese
 Madrid, Spain
 3 November 2014
 Spanish
 Valencia, Spain
 24 November 2014
 Spanish
 Basel, Switzerland
 4 August 2014
 German
 Bern, Switzerland
 4 August 2014
 German
 Zurick, Switzerland
 4 August 2014
 German

To register for an event, request an additional event, or learn more about the authentic MySQL Curriculum, go to http://education.oracle.com/mysql.

The MySQL for Developers course is the recommended training for the MySQL 5.6 Developers OCP certification exam.

Wednesday Apr 17, 2013

MySQL Cluster Tutorial: NoSQL JavaScript Connector for Node.js

This tutorial has been authored by Craig Russell and JD Duncan

The MySQL Cluster team are working on a new NoSQL JavaScript connector for MySQL. The objectives are simplicity and high performance for JavaScript users:

- allows end-to-end JavaScript development, from the browser to the server and now to the world's most popular open source database

- native "NoSQL" access to the storage layer without going first through SQL transformations and parsing.

Node.js is a complete web platform built around JavaScript designed to deliver millions of client connections on commodity hardware. With the MySQL NoSQL Connector for JavaScript, Node.js users can easily add data access and persistence to their web, cloud, social and mobile applications.

While the initial implementation is designed to plug and play with Node.js, the actual implementation doesn't depend heavily on Node, potentially enabling wider platform support in the future.

Changes since the previous blog:

- InnoDB is now supported via the mysql adapter (accesses data via the mysqld server)

- Auto-increment columns are now supported

- Default values in columns are now supported

- Multiple databases are now supported

- Column converters for JavaScript types that need special (user-written) mapping to database types are now supported

- Queries that specify all columns of a primary or unique key index are now supported

- When acquiring a connection or session, specific table or class metadata can be provided in order to pre-load database metadata and signal an error if not all metadata can be loaded

- Users can now get metadata for tables by using the session.getMetadata function

- The user interface to map JavaScript domain objects to database tables has been significantly simplified

Implementation

The architecture and user interface of this connector are very different from other MySQL connectors in a major way: it is an asynchronous interface that follows the event model built into Node.js.

To make it as easy as possible, we decided to use a domain object model to store the data. This allows for users to query data from the database and have a fully-instantiated object to work with, instead of having to deal with rows and columns of the database. The domain object model can have any user behavior that is desired, with the NoSQL connector providing the data from the database.

To make it as fast as possible, we use a direct connection from the user's address space to the database. This approach means that no SQL (pun intended) is needed to get to the data, and no SQL (and again) server is between the user and the data.

The connector is being developed to be extensible to multiple underlying database technologies, including direct, native access to both the MySQL Cluster "ndb" and InnoDB storage engines. The current release supports ndb via both native access and mysqld; and supports InnoDB via mysqld.

The connector integrates the MySQL Cluster native API library directly within the Node.js platform itself, enabling developers to seamlessly couple their high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

The following sections take you through how to connect to MySQL, query the data and how to get started.


Connecting to the database

A Session is the main user access path to the database. You can get a Session object directly from the connector using the openSession function:

var nosql = require("mysql-js");

var dbProperties = {

    "implementation" : "ndb",

    "database" : "test"

};

nosql.openSession(dbProperties, null, onSession);

The openSession function calls back into the application upon creating a Session. The Session is then used to create, delete, update, and read objects.

Default database

Every session and connection to the database has a default database associated with it. When mapping domain objects to the database, or using a table name to identify a table, users can specify the database by using the explicit form for the table name: 'tableName.databaseName'. If users omit the databaseName, the default database associated with the session is used.

This feature supports multi-tenancy by allowing the database name to be specified during connection, while allowing the table to dynamically refer to the specific database in use.

Pre-load metadata for tables or domain objects

If your application requires specific tables or domain objects to be available, you can specify the tables and domain objects in your connect or openSession function. For example, if you need the table 't_basic' and the domain object 'Employee' to run the application, you can specify these during the connect or openSession functions.

nosql.openSession(dbProperties, ['t_basic', Employee], onSession);

If the t_basic table or the mapped Employee domain object are not able to be used by the session, then an error will be signaled and the onSession callback will report the specific error.

Getting Table metadata

If getting metadata associated with your application's tables is important, you can get the information by using the session function getMetadata. This function will return information about the specified table in the same format as used in mapTable. You can get the names and types of the table's columns and use the information to dynamically access tables and columns.

Reading data

The Session can read data from the database in a number of ways. If you simply want the data from the database, you provide a table name and the key of the row that you want. For example, consider this schema:

create table employee (

  id int not null primary key,

  name varchar(32),

  salary float

) ENGINE=ndbcluster;

Since the primary key is a number, you can provide the key as a number to the find function.

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find('employee', 0, onData);

};

function onData = function(err, data) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(data));

  ... use data in application

};

If you want to have the data stored in your own domain model, you tell the connector which table your domain model uses, by specifying an annotation, and pass your domain model to the find function.

function Employee = function(id, name, salary) {

  this.id = id;

  this.name = name;

  this.salary = salary;

  this.giveRaise = function(percent) {

    this.salary *= percent;

  }

};

annotations.mapClass(Employee, {'table' : 'employee'});

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find(Employee, 0, onData);

};

Special Domain Object Property Types

If your domain object uses types that do not map directly to database types, you can use column converters to transform domain types to database types.

For example, if you have a domain type such as a MaritalStatus type that contains only values of type MARITAL_STATUS, you can define a conversion that translates domain object values into database values.

var MARITAL_STATUS = {

NEVER_MARRIED: {value: 0, code: 'N', name: 'NEVER_MARRIED'},

MARRIED: {value: 1, code: 'M', name: 'MARRIED'},

DIVORCED: {value: 2, code: 'D', name: 'DIVORCED'},

lookup: function(value) {

switch (value) {

case 0: return this.NEVER_MARRIED; break;

case 1: return this.MARRIED; break;

case 2: return this.DIVORCED; break;

default: return null; break;

}

}

};


// column converter for status

var statusConverter = {

toDB: function toDB(status) {

return status.value;

},

fromDB: function fromDB(value) {

return MARITAL_STATUS.lookup(value);

}

};

Updating data

You can update the emp instance in memory, but to make the changes persistent, you need to write it back to the database, using the update function.

function onData = function(err, emp) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(emp));

  emp.giveRaise(0.12); // gee, thanks!

  session.update(emp); // oops, session is out of scope here

};

Using JavaScript can be tricky because it does not have the concept of block scope for variables. You can create a closure to handle these variables, or use a feature of the connector to remember your variables.

The connector api takes a fixed number of parameters and returns a fixed number of result parameters to the callback function. But the connector will keep track of variables for you and return them to the callback. So in the above example, change the onSession function to remember the session variable, and you can refer to it in the onData function:

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find(Employee, 0, onData, session);

};

function onData = function(err, emp, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(emp));

  emp.giveRaise(0.12); // gee, thanks!

  session.update(emp, onUpdate); // session is now in scope

};

function onUpdate = function(err, emp) {

  if (err) {

    console.log(err);

    ... error handling

  }

Inserting data

Inserting data requires a mapped JavaScript user function (constructor) and a session. Create a variable and persist it:

function onSession = function(err, session) {

  var data = new Employee(999, 'Mat Keep', 20000000);

  session.persist(data, onInsert);

  }

};

Autoincrement Columns

Columns allow but do not require users to specify the values for autoincrement columns. If users want to specify values for autoincrement columns, for example to reset the autoincrement value for the table, the insert function allows specification of values for these columns.

But if users want to exploit the autoincrement functionality, they must avoid setting a value for autoincrement columns. When mysql-js detects that the user has not specified a value, the next value in sequence is used. In the callback for the insert operation, mysql-js has filled in the missing values.

Default Values

Columns that specify a default value allow but do not require users to specify the values for these columns. If users want to specify values for these columns, the insert function allows specification of 'undefined' for these columns. In these cases, mysql-js will use the default values for these columns.

Deleting data

To remove data from the database, use the session remove function. You use an instance of the domain object to identify the row you want to remove. Only the key field is relevant.

function onSession = function(err, session) {

  var key = new Employee(999);

  session.remove(Employee, onDelete);

  }

};

More extensive queries

Queries are defined using a builder pattern and then executed with parameters that can specialize the query and control the operation of the query.

To define a query, use the createQuery function of Session. Provide a constructor function of a mapped domain object or a table name. The resulting QueryDomainType is returned in the callback. The QueryDomainType is a specialized object that has a property for each property in the user's domain object, or a property for each column in the table. These properties are of type QueryField, and they implement functions that allow you to compare column values of database rows to parameters supplied when you execute the query.

session.createQuery(Employee, function(err, qdt) {

// build and execute the query using qdt

});

To build the query, use the query domain type to filter the results. If nothing else is specified, executing the query will return all rows in the table mapped by Employee as an array of instances of Employee.

To filter the results, similar to using a WHERE clause in SQL, specify a query predicate using the where function of the query domain type. To build a query predicate, you can compare fields in the query domain type to values provided as parameters, using common comparison functions such as equal, greater than, etc. To compare fields, use the query field functions that are created in the query domain type. You can combine predicates using AND and OR functions. For example,

var salaryLowerBound = qdt.param('floor'); // define the formal parameter floor

var compareSalaryLowerBound = qdt.salary.ge(salaryLowerBound); // compare the field salary to the floor

var salaryUpperBound = qdt.param('ceiling'); // define the formal parameter ceiling

var compareSalaryUpperBound = qdt.salary.le(salaryUpperBound); // compare the field salary to the ceiling

 

var combinedComparisons = compareSalaryLowerBound.and(compareSalaryUpperBound);

qdt.where(combinedComparisons); // specify the entire filter for the query

The query api supports a fluent style of query composition. The query can be written as:

qdt.where(qdt.salary.ge(qdt.param('floor')).and(qdt.salary.le(qdt.param('ceiling'))));

The above query filter compares the salary greater or equal to parameter floor and less or equal to ceiling.

Executing Queries

Once the query has been built, you can execute the query, providing the actual parameters to be used in the query. The results of the query are returned as an array in the callback.

qdt.execute({'floor': 40000, 'ceiling': 80000}, function(err, results) {

if (err) throw Error('query failed');

results.forEach(function(result) {

console.out('Employee', result.name, 'has salary', result.salary);

});

}

How to evaluate

The MySQL Connector for JavaScript is available for download and forking from GitHub

Since we are still in the development phase, feedback is especially valuable (so don't hesitate to leave comments on this blog, or head to the MySQL Cluster forum). Try it out and see how easy (and fast) it is to integrate MySQL Cluster into your Node.js platforms.

You can also learn more about other previewed functionality of MySQL Cluster 7.3 Development Milestone Release DevZone article.

Thursday Dec 13, 2012

NoSQL Memcached API for MySQL: Latest Updates

With data volumes exploding, it is vital to be able to ingest and query data at high speed. For this reason, MySQL has implemented NoSQL interfaces directly to the InnoDB and MySQL Cluster (NDB) storage engines, which bypass the SQL layer completely. Without SQL parsing and optimization, Key-Value data can be written directly to MySQL tables up to 9x faster, while maintaining ACID guarantees.

In addition, users can continue to run complex queries with SQL across the same data set, providing real-time analytics to the business or anonymizing sensitive data before loading to big data platforms such as Hadoop, while still maintaining all of the advantages of their existing relational database infrastructure.

This and more is discussed in the latest Guide to MySQL and NoSQL where you can learn more about using the APIs to scale new generations of web, cloud, mobile and social applications on the world's most widely deployed open source database

The native Memcached API is part of the MySQL 5.6 Release Candidate, and is already available in the GA release of MySQL Cluster. By using the ubiquitous Memcached API for writing and reading data, developers can preserve their investments in Memcached infrastructure by re-using existing Memcached clients, while also eliminating the need for application changes.

Speed, when combined with flexibility, is essential in the world of growing data volumes and variability. Complementing NoSQL access, support for on-line DDL (Data Definition Language) operations in MySQL 5.6 and MySQL Cluster enables DevOps teams to dynamically update their database schema to accommodate rapidly changing requirements, such as the need to capture additional data generated by their applications. These changes can be made without database downtime.

Using the Memcached interface, developers do not need to define a schema at all when using MySQL Cluster.

Lets look a little more closely at the Memcached implementations for both InnoDB and MySQL Cluster.

Memcached Implementation for InnoDB

The Memcached API for InnoDB is previewed as part of the MySQL 5.6 Release Candidate.

As illustrated in the following figure, Memcached for InnoDB is implemented via a Memcached daemon plug-in to the mysqld process, with the Memcached protocol mapped to the native InnoDB API.


Figure 1: Memcached API Implementation for InnoDB

With the Memcached daemon running in the same process space, users get very low latency access to their data while also leveraging the scalability enhancements delivered with InnoDB and a simple deployment and management model. Multiple web / application servers can remotely access the Memcached / InnoDB server to get direct access to a shared data set.

With simultaneous SQL access, users can maintain all the advanced functionality offered by InnoDB including support for Foreign Keys, XA transactions and complex JOIN operations.

Benchmarks demonstrate that the NoSQL Memcached API for InnoDB delivers up to 9x higher performance than the SQL interface when inserting new key/value pairs, with a single low-end commodity server supporting nearly 70,000 Transactions per Second.

Figure 2: Over 9x Faster INSERT Operations

The delivered performance demonstrates MySQL with the native Memcached NoSQL interface is well suited for high-speed inserts with the added assurance of transactional guarantees.

You can check out the latest Memcached / InnoDB developments and benchmarks here

You can learn how to configure the Memcached API for InnoDB here

Memcached Implementation for MySQL Cluster

Memcached API support for MySQL Cluster was introduced with General Availability (GA) of the 7.2 release, and joins an extensive range of NoSQL interfaces that are already available for MySQL Cluster

Like Memcached, MySQL Cluster provides a distributed hash table with in-memory performance. MySQL Cluster extends Memcached functionality by adding support for write-intensive workloads, a full relational model with ACID compliance (including persistence), rich query support, auto-sharding and 99.999% availability, with extensive management and monitoring capabilities.

All writes are committed directly to MySQL Cluster, eliminating cache invalidation and the overhead of data consistency checking to ensure complete synchronization between the database and cache.


Figure 3: Memcached API Implementation with MySQL Cluster

Implementation is simple:

1. The application sends reads and writes to the Memcached process (using the standard Memcached API).

2. This invokes the Memcached Driver for NDB (which is part of the same process)

3. The NDB API is called, providing for very quick access to the data held in MySQL Cluster’s data nodes.

The solution has been designed to be very flexible, allowing the application architect to find a configuration that best fits their needs. It is possible to co-locate the Memcached API in either the data nodes or application nodes, or alternatively within a dedicated Memcached layer.

The benefit of this flexible approach to deployment is that users can configure behavior on a per-key-prefix basis (through tables in MySQL Cluster) and the application doesn’t have to care – it just uses the Memcached API and relies on the software to store data in the right place(s) and to keep everything synchronized.

Using Memcached for Schema-less Data

By default, every Key / Value is written to the same table with each Key / Value pair stored in a single row – thus allowing schema-less data storage. Alternatively, the developer can define a key-prefix so that each value is linked to a pre-defined column in a specific table.

Of course if the application needs to access the same data through SQL then developers can map key prefixes to existing table columns, enabling Memcached access to schema-structured data already stored in MySQL Cluster.

Conclusion

Download the Guide to MySQL and NoSQL to learn more about NoSQL APIs and how you can use them to scale new generations of web, cloud, mobile and social applications on the world's most widely deployed open source database

See how to build a social app with MySQL Cluster and the Memcached API from our on-demand webinar or take a look at the docs

Don't hesitate to use the comments section below for any questions you may have 

Thursday Nov 29, 2012

MySQL and Hadoop Integration - Unlocking New Insight

“Big Data” offers the potential for organizations to revolutionize their operations. With the volume of business data doubling every 1.2 years, analysts and business users are discovering very real benefits when integrating and analyzing data from multiple sources, enabling deeper insight into their customers, partners, and business processes.

As the world’s most popular open source database, and the most deployed database in the web and cloud, MySQL is a key component of many big data platforms, with Hadoop vendors estimating 80% of deployments are integrated with MySQL.

The new Guide to MySQL and Hadoop presents the tools enabling integration between the two data platforms, supporting the data lifecycle from acquisition and organisation to analysis and visualisation / decision, as shown in the figure below


The Guide details each of these stages and the technologies supporting them:

Acquire: Through new NoSQL APIs, MySQL is able to ingest high volume, high velocity data, without sacrificing ACID guarantees, thereby ensuring data quality. Real-time analytics can also be run against newly acquired data, enabling immediate business insight, before data is loaded into Hadoop. In addition, sensitive data can be pre-processed, for example healthcare or financial services records can be anonymized, before transfer to Hadoop.

Organize: Data is transferred from MySQL tables to Hadoop using Apache Sqoop. With the MySQL Binlog (Binary Log) API, users can also invoke real-time change data capture processes to stream updates to HDFS.

Analyze: Multi-structured data ingested from multiple sources is consolidated and processed within the Hadoop platform.

Decide: The results of the analysis are loaded back to MySQL via Apache Sqoop where they inform real-time operational processes or provide source data for BI analytics tools.

So how are companies taking advantage of this today? As an example, on-line retailers can use big data from their web properties to better understand site visitors’ activities, such as paths through the site, pages viewed, and comments posted. This knowledge can be combined with user profiles and purchasing history to gain a better understanding of customers, and the delivery of highly targeted offers.

Of course, it is not just in the web that big data can make a difference. Every business activity can benefit, with other common use cases including:

- Sentiment analysis;

- Marketing campaign analysis;

- Customer churn modeling;

- Fraud detection;

- Research and Development;

- Risk Modeling;

- And more.

As the guide discusses, Big Data is promising a significant transformation of the way organizations leverage data to run their businesses. MySQL can be seamlessly integrated within a Big Data lifecycle, enabling the unification of multi-structured data into common data platforms, taking advantage of all new data sources and yielding more insight than was ever previously imaginable.

Download the guide to MySQL and Hadoop integration to learn more. I'd also be interested in hearing about how you are integrating MySQL with Hadoop today, and your requirements for the future, so please use the comments on this blog to share your insights.




Thursday Nov 08, 2012

Election 2012: Twitter Breaks Records with MySQL

Twitter VP of Infrastructure Operations Engineering Mazen Rawashdeh shared news and numbers yesterday on his blog:

"Last night, the world tuned in to Twitter to share the election results as U.S. voters chose a president and settled many other campaigns. Throughout the day, people sent more than 31 million election-related Tweets (which contained certain key terms and relevant hashtags). And as results rolled in, we tracked the surge in election-related Tweets at 327,452 Tweets per minute (TPM). These numbers reflect the largest election-related Twitter conversation during our 6 years of existence, though they don’t capture the total volume of all Tweets yesterday."

"Last night, Twitter averaged about 9,965 TPS from 8:11pm to 9:11pm PT, with a one-second peak of 15,107 TPS at 8:20pm PT and a one-minute peak of 874,560 TPM. Seeing a sustained peak over the course of an entire event is a change from the way people have previously turned to Twitter during live events. Now, rather than brief spikes, we are seeing sustained peaks for hours."

Congrats to Jeremy Cole, Davi Arnaut and the rest of the team at Twitter for their excellent work!

Jeremy recently held a keynote presentation at MySQL Connect describing how MySQL powers Twitter, and why they chose and continue to rely on MySQL for their operations. You can watch the presentation here. He also went into more details during another presentation later that day and you can access the slides here.

Below a couple of tweets from Jeremy after what have surely been hectic days...

 Keep up the good work guys!




Thursday Nov 01, 2012

MySQL Cluster 7.3: On-Demand Webinar and Q&A Available

The on-demand webinar for the MySQL Cluster 7.3 Development Release is now available.

You can learn more about the design, implementation and getting started with all of the new MySQL Cluster 7.3 features from the comfort and convenience of your own device, including:

- Foreign Key constraints in MySQL Cluster

- Node.js NoSQL API 

- Auto-installation of higher performance distributed, clusters

We received some great questions over the course of the webinar, and I wanted to share those for the benefit of a broader audience.

Q. What Foreign Key actions are supported:

A. The core referential actions defined in the SQL:2003 standard are implemented:

CASCADE

RESTRICT

NO ACTION

SET NULL

Q. Where are Foreign Keys implemented, ie data nodes or SQL nodes?

A. They are implemented in the data nodes, therefore can be enforced for both the SQL and NoSQL APIs

Q. Are they compatible with the InnoDB Foreign Key implementation?

A. Yes, with the following exceptions:

- InnoDB doesn’t support “No Action” constraints, MySQL Cluster does

- You can choose to suspend FK constraint enforcement with InnoDB using the FOREIGN_KEY_CHECKS parameter; at the moment, MySQL Cluster ignores that parameter.

- You cannot set up FKs between 2 tables where one is stored using MySQL Cluster and the other InnoDB.

- You cannot change primary keys through the NDB API which means that the MySQL Server actually has to simulate such operations by deleting and re-adding the row. If the PK in the parent table has a FK constraint on it then this causes non-ideal behaviour. With Restrict or No Action constraints, the change will result in an error. With Cascaded constraints, you’d want the rows in the child table to be updated with the new FK value but, the implicit delete of the row from the parent table would remove the associated rows from the child table and the subsequent implicit insert into the parent wouldn’t reinstate the child rows. For this reason, an attempt to add an ON UPDATE CASCADE where the parent column is a primary key will be rejected.

Q. Does adding or dropping Foreign Keys cause downtime due to a schema change?

A. Nope, this is an online operation. MySQL Cluster supports a number of on-line schema changes, ie adding and dropping indexes, adding columns, etc.

Q. Where can I see an example of node.js with MySQL Cluster?

A. Check out the tutorial and download the code from GitHub

Q. Can I use the auto-installer to support remote deployments? How about setting up MySQL Cluster 7.2?

A. Yes to both!

Q. Can I get a demo

Check out the tutorial. You can download the code from http://labs.mysql.com/ Go to Select Build drop-down box

Q. What is be minimum internet speen required for Geo distributed cluster with synchronous replication?

A. if you're splitting you cluster between sites then we recommend a network latency of 20ms or less. Alternatively, use MySQL asynchronous replication where the latency of your WAN doesn't impact the latency of your reads/writes.

Q. Where you can one learn more about the PayPal project with MySQL Cluster?

A. Take a look at the following - you'll find press coverage, a video and slides from their keynote presentation 

So, if you want to learn more, listen to the new MySQL Cluster 7.3 on-demand webinar 

MySQL Cluster 7.3 is still in the development phase, so it would be great to get your feedback on these new features, and things you want to see!


Monday Oct 22, 2012

MySQL Cluster 7.3 - Join This Week's Webinar to Learn What's New

The first Development Milestone and Early Access releases of MySQL Cluster 7.3 were announced just several weeks ago. To provide more detail and demonstrate the new features, Andrew Morgan and I will be hosting a live webinar this coming Thursday 25th October at 0900 Pacific Time / 16.00 UTC

Even if you can't make the live webinar, it is still worth registering for the event as you will receive a notification when the replay will be available, to view on-demand at your convenience

In the webinar, we will discuss the enhancements being previewed as part of MySQL Cluster 7.3, including:

- Foreign Key Constraints: Yes, we've looked into the future and decided Foreign Keys are it ;-)

You can read more about the implementation of Foreign Keys in MySQL Cluster 7.3 here

- Node.js NoSQL API: Allowing web, mobile and cloud services to query and receive results sets from MySQL Cluster, natively in JavaScript, enables developers to seamlessly couple high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

You can study the Node.js / MySQL Cluster tutorial here

- Auto-Installer: This new web-based GUI makes it simple for DevOps teams to quickly configure and provision highly optimized MySQL Cluster deployments on-premise or in the cloud

You can view a YouTube tutorial on the MySQL Cluster Auto-Installer here 

So we have a lot to cover in our 45 minute session. It will be time well spent if you want to know more about the future direction of MySQL Cluster and how it can help you innovate faster, with greater simplicity.

Registration is open 

Friday Oct 05, 2012

New MySQL Cluster 7.3 Previews: Foreign Keys, NoSQL Node.js API and Auto-Tuned Clusters

At this weeks MySQL Connect conference, Oracle previewed an exciting new wave of developments for MySQL Cluster, further extending its simplicity and flexibility by expanding the range of use-cases, adding new NoSQL options, and automating configuration.

What’s new:

  • Development Release 1: MySQL Cluster 7.3 with Foreign Keys
  • Early Access “Labs” Preview: MySQL Cluster NoSQL API for Node.js
  • Early Access “Labs” Preview: MySQL Cluster GUI-Based Auto-Installer

In this blog, I'll introduce you to the features being previewed.

Review the blogs listed below for more detail on each of the specific features discussed.

Save the date!: A live webinar is scheduled for Thursday 25th October at 0900 Pacific Time / 1600UTC where we will discuss each of these enhancements in more detail. Registration will be open soon and published to the MySQL webinars page

MySQL Cluster 7.3: Development Release 1

The first MySQL Cluster 7.3 Development Milestone Release (DMR) previews Foreign Keys, bringing powerful new functionality to MySQL Cluster while reducing development complexity.

Foreign Key support has been one of the most requested enhancements to MySQL Cluster – enabling users to simplify their data models and application logic – while extending the range of use-cases for both custom projects requiring referential integrity and packaged applications, such as eCommerce, CRM, CMS, etc.

Implementation

The Foreign Key functionality is implemented directly within the MySQL Cluster data nodes, allowing any client API accessing the cluster to benefit from them – whether they are SQL or one of the NoSQL interfaces (Memcached, C++, Java, JPA, HTTP/REST or the new Node.js API - discussed later.)

The core referential actions defined in the SQL:2003 standard are implemented:

  • CASCADE
  • RESTRICT
  • NO ACTION
  • SET NULL

In addition, the MySQL Cluster implementation supports the online adding and dropping of Foreign Keys, ensuring the Cluster continues to serve both read and write requests during the operation.  This represents a further enhancement to MySQL Cluster's support for on0line schema changes, ie adding and dropping indexes, adding columns, etc. 

Read this blog for a demonstration of using Foreign Keys with MySQL Cluster. 

Getting Started with MySQL Cluster 7.3 DMR1:

Users can download either the source or binary and evaluate the MySQL Cluster 7.3 DMR with Foreign Keys now! (Select the Development Release tab).


MySQL Cluster NoSQL API for Node.js

Node.js is hot! In a little over 3 years, it has become one of the most popular environments for developing next generation web, cloud, mobile and social applications. Bringing JavaScript from the browser to the server, the design goal of Node.js is to build new real-time applications supporting millions of client connections, serviced by a single CPU core.

Making it simple to further extend the flexibility and power of Node.js to the database layer, we are previewing the Node.js Javascript API for MySQL Cluster as an Early Access release, available for download now from http://labs.mysql.com/. Select the following build:

MySQL-Cluster-NoSQL-Connector-for-Node-js

Alternatively, you can clone the project at the MySQL GitHub page

Implemented as a module for the V8 engine, the new API provides Node.js with a native, asynchronous JavaScript interface that can be used to both query and receive results sets directly from MySQL Cluster, without transformations to SQL.


Figure 1: MySQL Cluster NoSQL API for Node.js enables end-to-end JavaScript development

Rather than just presenting a simple interface to the database, the Node.js module integrates the MySQL Cluster native API library directly within the web application itself, enabling developers to seamlessly couple their high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

The new Node.js API joins a rich array of NoSQL interfaces available for MySQL Cluster. Whichever API is chosen for an application, SQL and NoSQL can be used concurrently across the same data set, providing the ultimate in developer flexibility. 

Get started with MySQL Cluster NoSQL API for Node.js tutorial


MySQL Cluster GUI-Based Auto-Installer

Compatible with both MySQL Cluster 7.2 and 7.3, the Auto-Installer makes it simple for DevOps teams to quickly configure and provision highly optimized MySQL Cluster deployments – whether on-premise or in the cloud.

Implemented with a standard HTML GUI and Python-based web server back-end, the Auto-Installer intelligently configures MySQL Cluster based on application requirements and auto-discovered hardware resources


Figure 2: Automated Tuning and Configuration of MySQL Cluster

Developed by the same engineering team responsible for the MySQL Cluster database, the installer provides standardized configurations that make it simple, quick and easy to build stable and high performance clustered environments.

The auto-installer is previewed as an Early Access release, available for download now from http://labs.mysql.com/, by selecting the MySQL-Cluster-Auto-Installer build.

You can read more about getting started with the MySQL Cluster auto-installer here.

Watch the YouTube video for a demonstration of using the MySQL Cluster auto-installer


Getting Started with MySQL Cluster

If you are new to MySQL Cluster, the Getting Started guide will walk you through installing an evaluation cluster on a singe host (these guides reflect MySQL Cluster 7.2, but apply equally well to 7.3 and the Early Access previews). Or use the new MySQL Cluster Auto-Installer!

Download the Guide to Scaling Web Databases with MySQL Cluster (to learn more about its architecture, design and ideal use-cases).

Post any questions to the MySQL Cluster forum where our Engineering team and the MySQL Cluster community will attempt to assist you.

Post any bugs you find to the MySQL bug tracking system (select MySQL Cluster from the Category drop-down menu)

And if you have any feedback, please post them to the Comments section here or in the blogs referenced in this article.


Summary

MySQL Cluster 7.2 is the GA, production-ready release of MySQL Cluster. The first Development Release of MySQL Cluster 7.3 and the Early Access previews give you the opportunity to preview and evaluate future developments in the MySQL Cluster database, and we are very excited to be able to share that with you.

Let us know how you get along with MySQL Cluster 7.3, and other features that you want to see in future releases, by using the comments of this blog.

Saturday Sep 29, 2012

Tutorial: Getting Started with the NoSQL JavaScript / Node.js API for MySQL Cluster

Tutorial authored by Craig Russell and JD Duncan 

The MySQL Cluster team are working on a new NoSQL JavaScript connector for MySQL. The objectives are simplicity and high performance for JavaScript users:

- allows end-to-end JavaScript development, from the browser to the server and now to the world's most popular open source database

- native "NoSQL" access to the storage layer without going first through SQL transformations and parsing.

Node.js is a complete web platform built around JavaScript designed to deliver millions of client connections on commodity hardware. With the MySQL NoSQL Connector for JavaScript, Node.js users can easily add data access and persistence to their web, cloud, social and mobile applications.

While the initial implementation is designed to plug and play with Node.js, the actual implementation doesn't depend heavily on Node, potentially enabling wider platform support in the future.

Implementation

The architecture and user interface of this connector are very different from other MySQL connectors in a major way: it is an asynchronous interface that follows the event model built into Node.js.

To make it as easy as possible, we decided to use a domain object model to store the data. This allows for users to query data from the database and have a fully-instantiated object to work with, instead of having to deal with rows and columns of the database. The domain object model can have any user behavior that is desired, with the NoSQL connector providing the data from the database.

To make it as fast as possible, we use a direct connection from the user's address space to the database. This approach means that no SQL (pun intended) is needed to get to the data, and no SQL server is between the user and the data.

The connector is being developed to be extensible to multiple underlying database technologies, including direct, native access to both the MySQL Cluster "ndb" and InnoDB storage engines.

The connector integrates the MySQL Cluster native API library directly within the Node.js platform itself, enabling developers to seamlessly couple their high performance, distributed applications with a high performance, distributed, persistence layer delivering 99.999% availability.

The following sections take you through how to connect to MySQL, query the data and how to get started.


Connecting to the database

A Session is the main user access path to the database. You can get a Session object directly from the connector using the openSession function:

var nosql = require("mysql-js");

var dbProperties = {

    "implementation" : "ndb",

    "database" : "test"

};

nosql.openSession(dbProperties, null, onSession);

The openSession function calls back into the application upon creating a Session. The Session is then used to create, delete, update, and read objects.


Reading data

The Session can read data from the database in a number of ways. If you simply want the data from the database, you provide a table name and the key of the row that you want. For example, consider this schema:

create table employee (

  id int not null primary key,

  name varchar(32),

  salary float

) ENGINE=ndbcluster;

Since the primary key is a number, you can provide the key as a number to the find function.

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find('employee', 0, onData);

};

function onData = function(err, data) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(data));

  ... use data in application

};

If you want to have the data stored in your own domain model, you tell the connector which table your domain model uses, by specifying an annotation, and pass your domain model to the find function.

var annotations = new nosql.Annotations();

function Employee = function(id, name, salary) {

  this.id = id;

  this.name = name;

  this.salary = salary;

  this.giveRaise = function(percent) {

    this.salary *= percent;

  }

};

annotations.mapClass(Employee, {'table' : 'employee'});

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find(Employee, 0, onData);

};


Updating data

You can update the emp instance in memory, but to make the raise persistent, you need to write it back to the database, using the update function.

function onData = function(err, emp) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(emp));

  emp.giveRaise(0.12); // gee, thanks!

  session.update(emp); // oops, session is out of scope here

};

Using JavaScript can be tricky because it does not have the concept of block scope for variables. You can create a closure to handle these variables, or use a feature of the connector to remember your variables.

The connector api takes a fixed number of parameters and returns a fixed number of result parameters to the callback function. But the connector will keep track of variables for you and return them to the callback. So in the above example, change the onSession function to remember the session variable, and you can refer to it in the onData function:

function onSession = function(err, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  session.find(Employee, 0, onData, session);

};

function onData = function(err, emp, session) {

  if (err) {

    console.log(err);

    ... error handling

  }

  console.log('Found: ', JSON.stringify(emp));

  emp.giveRaise(0.12); // gee, thanks!

  session.update(emp, onUpdate); // session is now in scope

};

function onUpdate = function(err, emp) {

  if (err) {

    console.log(err);

    ... error handling

  }


Inserting data

Inserting data requires a mapped JavaScript user function (constructor) and a session. Create a variable and persist it:

function onSession = function(err, session) {

  var data = new Employee(999, 'Mat Keep', 20000000);

  session.persist(data, onInsert);

  }

};


Deleting data

To remove data from the database, use the session remove function. You use an instance of the domain object to identify the row you want to remove. Only the key field is relevant.

function onSession = function(err, session) {

  var key = new Employee(999);

  session.remove(Employee, onDelete);

  }

};


More extensive queries

We are working on the implementation of more extensive queries along the lines of the criteria query api. Stay tuned.

How to evaluate

The MySQL Connector for JavaScript is available for download from labs.mysql.com. Select the build:

MySQL-Cluster-NoSQL-Connector-for-Node-js

You can also clone the project on GitHub

Since it is still early in development, feedback is especially valuable (so don't hesitate to leave comments on this blog, or head to the MySQL Cluster forum). Try it out and see how easy (and fast) it is to integrate MySQL Cluster into your Node.js platforms.

You can learn more about other previewed functionality of MySQL Cluster 7.3 here

Monday Jul 02, 2012

NoSQL Java API for MySQL Cluster: Questions & Answers

The MySQL Cluster engineering team recently ran a live webinar, available now on-demand demonstrating the ClusterJ and ClusterJPA NoSQL APIs for MySQL Cluster, and how these can be used in building real-time, high scale Java-based services that require continuous availability.

Attendees asked a number of great questions during the webinar, and I thought it would be useful to share those here, so others are also able to learn more about the Java NoSQL APIs.

First, a little bit about why we developed these APIs and why they are interesting to Java developers.

ClusterJ and Cluster JPA

ClusterJ is a Java interface to MySQL Cluster that provides either a static or dynamic domain object model, similar to the data model used by JDO, JPA, and Hibernate. A simple API gives users extremely high performance for common operations: insert, delete, update, and query.

ClusterJPA works with ClusterJ to extend functionality, including

- Persistent classes

- Relationships

- Joins in queries

- Lazy loading

- Table and index creation from object model

By eliminating data transformations via SQL, users get lower data access latency and higher throughput. In addition, Java developers have a more natural programming method to directly manage their data, with a complete, feature-rich solution for Object/Relational Mapping. As a result, the development of Java applications is simplified with faster development cycles resulting in accelerated time to market for new services.

MySQL Cluster offers multiple NoSQL APIs alongside Java:

  • - Memcached for a persistent, high performance, write-scalable Key/Value store,
  • - HTTP/REST via an Apache module
  • - C++ via the NDB API for the lowest absolute latency.

Developers can use SQL as well as NoSQL APIs for access to the same data set via multiple query patterns – from simple Primary Key lookups or inserts to complex cross-shard JOINs using Adaptive Query Localization

Marrying NoSQL and SQL access to an ACID-compliant database offers developers a number of benefits. MySQL Cluster’s distributed, shared-nothing architecture with auto-sharding and real time performance makes it a great fit for workloads requiring high volume OLTP. Users also get the added flexibility of being able to run real-time analytics across the same OLTP data set for real-time business insight.

OK – hopefully you now have a better idea of why ClusterJ and JPA are available. Now, for the Q&A.

Q & A

Q. Why would I use Connector/J vs. ClusterJ?

A. Partly it's a question of whether you prefer to work with SQL (Connector/J) or objects (ClusterJ). Performance of ClusterJ will be better as there is no need to pass through the MySQL Server. A ClusterJ operation can only act on a single table (e.g. no joins) - ClusterJPA extends that capability

Q. Can I mix different APIs (ie ClusterJ, Connector/J) in our application for different query types?

A. Yes. You can mix and match all of the API types, SQL, JDBC, ODBC, ClusterJ, Memcached, REST, C++. They all access the exact same data in the data nodes. Update through one API and new data is instantly visible to all of the others.

Q. How many TCP connections would a SessionFactory instance create for a cluster of 8 data nodes?

A. SessionFactory has a connection to the mgmd (management node) but otherwise is just a vehicle to create Sessions. Without using connection pooling, a SessionFactory will have one connection open with each data node. Using optional connection pooling allows multiple connections from the SessionFactory to increase throughput.

Q. Can you give details of how Cluster J optimizes sharding to enhance performance of distributed query processing?

A. Each data node in a cluster runs a Transaction Coordinator (TC), which begins and ends the transaction, but also serves as a resource to operate on the result rows. While an API node (such as a ClusterJ process) can send queries to any TC/data node, there are performance gains if the TC is where most of the result data is stored. ClusterJ computes the shard (partition) key to choose the data node where the row resides as the TC.

Q. What happens if we perform two primary key lookups within the same transaction? Are they sent to the data node in one transaction?

A. ClusterJ will send identical PK lookups to the same data node.

Q. How is distributed query processing handled by MySQL Cluster ?

A. If the data is split between data nodes then all of the information will be transparently combined and passed back to the application. The session will connect to a data node - typically by hashing the primary key - which then interacts with its neighboring nodes to collect the data needed to fulfil the query.

Q. Can I use Foreign Keys with MySQL Cluster

A. Support for Foreign Keys is included in the MySQL Cluster 7.3 Early Access release

Summary

The NoSQL Java APIs are packaged with MySQL Cluster, available for download here so feel free to take them for a spin today!

Key Resources

MySQL Cluster on-line demo 

MySQL ClusterJ and JPA On-demand webinar 

MySQL ClusterJ and JPA documentation

MySQL ClusterJ and JPA whitepaper and tutorial

Friday Mar 30, 2012

Guide to MySQL & NoSQL, Webinar Q&A

Yesterday we ran a webinar discussing the demands of next generation web services and how blending the best of relational and NoSQL technologies enables developers and architects to deliver the agility, performance and availability needed to be successful.

Attendees posted a number of great questions to the MySQL developers, serving to provide additional insights into areas like auto-sharding and cross-shard JOINs, replication, performance, client libraries, etc. So I thought it would be useful to post those below, for the benefit of those unable to attend the webinar.

Before getting to the Q&A, there are a couple of other resources that maybe useful to those looking at NoSQL capabilities within MySQL:

- On-Demand webinar

- Slides used during the webinar

- Guide to MySQL and NoSQL whitepaper 

- MySQL Cluster demo, including NoSQL interfaces, auto-sharing, high availability, etc. 

So here is the Q&A from the event 

Q. Where does MySQL Cluster fit in to the CAP theorem?

A. MySQL Cluster is flexible. A single Cluster will prefer consistency over availability in the presence of network partitions. A pair of Clusters can be configured to prefer availability over consistency. A full explanation can be found on the MySQL Cluster & CAP Theorem blog post. 

Q. Can you configure the number of replicas? (the slide used a replication factor of 1)

Yes. A cluster is configured by an .ini file. The option NoOfReplicas sets the number of originals and replicas: 1 = no data redundancy, 2 = one copy etc. Usually there's no benefit in setting it >2.

Q. Interestingly most (if not all) of the NoSQL databases recommend having 3 copies of data (the replication factor).   

Yes, with configurable quorum based Reads and writes. MySQL Cluster does not need a quorum of replicas online to provide service. Systems that require a quorum need > 2 replicas to be able to tolerate a single failure. Additionally, many NoSQL systems take liberal inspiration from the original GFS paper which described a 3 replica configuration. MySQL Cluster avoids the need for a quorum by using a lightweight arbitrator. You can configure more than 2 replicas, but this is a tradeoff between incrementally improved availability, and linearly increased cost.

Q. Can you have cross node group JOINS? Wouldn't that run into the risk of flooding the network?

MySQL Cluster 7.2 supports cross nodegroup joins. A full cross-join can require a large amount of data transfer, which may bottleneck on network bandwidth. However, for more selective joins, typically seen with OLTP and light analytic applications, cross node-group joins give a great performance boost and network bandwidth saving over having the MySQL Server perform the join.

Q. Are the details of the benchmark available anywhere? According to my calculations it results in approx. 350k ops/sec per processor which is the largest number I've seen lately

The details are linked from Mikael Ronstrom's blog

The benchmark uses a benchmarking tool we call flexAsynch which runs parallel asynchronous transactions. It involved 100 byte reads, of 25 columns each. Regarding the per-processor ops/s, MySQL Cluster is particularly efficient in terms of throughput/node. It uses lock-free minimal copy message passing internally, and maximizes ID cache reuse. Note also that these are in-memory tables, there is no need to read anything from disk.

Q. Is access control (like table) planned to be supported for NoSQL access mode?

Currently we have not seen much need for full SQL-like access control (which has always been overkill for web apps and telco apps). So we have no plans, though especially with memcached it is certainly possible to turn-on connection-level access control. But specifically table level controls are not planned.

Q. How is the performance of memcached APi with MySQL against memcached+MySQL or any other Object Cache like Ecache with MySQL DB?

With the memcache API we generally see a memcached response in less than 1 ms. and a small cluster with one memcached server can handle tens of thousands of operations per second.

Q. Can .NET can access MemcachedAPI?

Yes, just use a .Net memcache client such as the enyim or BeIT memcache libraries.

Q. Is the row level locking applicable when you update a column through memcached API?

An update that comes through memcached uses a row lock and then releases it immediately. Memcached operations like "INCREMENT" are actually pushed down to the data nodes. In most cases the locks are not even held long enough for a network round trip.

Q. Has anyone published an example using something like PHP? I am assuming that you just use the PHP memcached extension to hook into the memcached API. Is that correct?

Not that I'm aware of but absolutely you can use it with php or any of the other drivers

Q. For beginner we need more examples.

Take a look here for a fully worked example

Q. Can I access MySQL using Cobol (Open Cobol) or C and if so where can I find the coding libraries etc?

A. There is a cobol implementation that works well with MySQL, but I do not think it is Open Cobol. Also there is a MySQL C client library that is a standard part of every mysql distribution

Q. Is there a place to go to find help when testing and/implementing the NoSQL access?

If using Cluster then you can use the cluster@lists.mysql.com alias or post on the MySQL Cluster forum

Q. Are there any white papers on this? 

Yes - there is more detail in the MySQL Guide to NoSQL whitepaper

If you have further questions, please don’t hesitate to use the comments below!

Wednesday Feb 15, 2012

MySQL Cluster 7.2 GA Released, Delivers 1 BILLION Queries per Minute

70x Higher JOIN Performance, NoSQL Key-Value API & Cross Data Center Sharding with Synchronous Replication 

Oracle is delighted to announce the immediate availability of the production-ready, GA release of MySQL Cluster 7.2, available for download under the GPL, and as part of the commercial MySQL Cluster Carrier Grade Edition, including management tools, product certifications and 24x7 global support.

1 Billion Queries per Minute

MySQL Cluster delivered 1 billion queries per minute (17.6m million queries per second), scaled-out across 8x commodity Intel x86 server nodes, accessed by the NoSQL C++ NDB API.

It did this while maintaining 99.999% availability and complete data consistency across the cluster, demonstrating MySQL Cluster is a great choice for the most demanding web and telecoms services, whether deployed on-premise or in the cloud

New Feature Overview

The MySQL Cluster 7.2 GA release builds upon the Development Milestones published over the past 9 months, which provided the community with an opportunity to test and provide feedback on the latest features.

MySQL Cluster 7.2 offers a range of new capabilities designed to enable the delivery of next generation web services, enhance cross data center scalability and improve ease-of-use:

- Enabling next generation web services:

o 70x higher complex query performance

o Native Memcached API

o 4x higher data node scalability

o Integration with the latest MySQL 5.5 server

o Support for Virtual Machine (VM) environments

- Enhancing cross data scalability:

o New multi-site clustering with auto-sharding and synchronous replication between datacenters

o Improved active/active replication between data centers with eventual consistency

- Improved Ease-of-Use:

o Consolidated user privileges

o MySQL Cluster Manager 1.1.4

Read the MySQL Cluster 7.2 Developer Zone article to get the detail on all of the new features.

You can download the MySQL Cluster 7.2 New Features whitepaper for implementation details and how to get started or join a forthcoming MySQL Cluster 7.2 webinar for your Time Zone to learn more:

Summary

MySQL Cluster 7.2 is the best release to date, enabling projects and applications to benefit from web-scalability with carrier-grade availability and developer agility.

You can review the MySQL Cluster 7.2 documentation, and also ask questions to the development team and community via our the MySQL Cluster forum

We look forward to helping you in your new projects, and working with you to continue evolving MySQL Cluster to serve an even broader set of requirements in the future.

Wednesday Nov 02, 2011

MySQL Cluster, and NoSQL

Those are the topics we cover in the latest episode of our “Meet The MySQL Experts” podcast.

Mat Keep and Bernd Ocklin talk about new database requirements, and walk us through what's new in the second Development Milestone Release of MySQL Cluster 7.2, including impressive performance improvements, new NoSQL access via memcached, cross data center scalability, and more...

Enjoy the podcast!

Friday Oct 07, 2011

MySQL Cluster 7.2 (DMR2): NoSQL, Key/Value, Memcached

70x Higher Performance, Cross Data Center Scalability and New NoSQL Interface

Its been an exciting week for all involved with MySQL Cluster, with the announcement of the second Development Milestone Release (7.2.1) at Oracle Open World. Highlights include:

- Enabling next generation web services: 70x higher complex query performance, native memcached API and integration with the latest MySQL 5.5 server

- Enhancing cross data scalability: new multi-site clustering and enhanced active/active replication

- Simplified provisioning: consolidated user privileges.

You can download the DMR for evaluation now from: http://dev.mysql.com/downloads/cluster/ (select Development Milestone Release tab).

You can also read up on the detail of each of these features in the new article posted at the MySQL Developer Zone. In this blog, I’ll summarize the main parts of the announcement.

70x Higher Performance with Adaptive Query Localization (AQL)

Previewed as part of the first MySQL Cluster DMR, AQL is enabled by a new Index Statistics function that allows the SQL optimizer to build a better execution plan for each query.

As a result, JOIN operations are pushed down to the data nodes where the query executes in parallel on local copies of the data. A merged result set is then sent back to the MySQL Server, significantly enhancing performance by reducing network trips.

Take a look at how this is used by a web-based content management to increase performance by 70x

Adaptive Query Localization enables MySQL Cluster to better serve those use-cases that have the need to run real-time analytics across live data sets, along with high throughput OLTP operations. Examples include recommendations engines and clickstream analysis in web applications, pre-pay billing promotions in mobile telecoms networks or fraud detection in payment systems.

New NoSQL Interface and Schema-less Storage with the memcached API

The memcached interface released as an Early Access project with the first MySQL Cluster DMR is now integrated directly into the MySQL Cluster 7.2.1 trunk, enabling simpler evaluation.

The popularity of Key/Value stores has increased dramatically. With MySQL Cluster and the new memcached API, you have all the benefits of an ACID RDBMS, combined with the performance capabilities of Key/Value store.

By default, every Key / Value is written to the same table with each Key / Value pair stored in a single row – thus allowing schema-less data storage. Alternatively, the developer can define a key-prefix so that each value is linked to a pre-defined column in a specific table.

Of course if the application needs to access the same data through SQL then developers can map key prefixes to existing table columns, enabling Memcached access to schema-structured data already stored in MySQL Cluster.

You can read more about the design goals and implementation of the memcached API for MySQL Cluster here.

Integration with MySQL 5.5

MySQL Cluster 7.2.1 is integrated with MySQL Server 5.5, providing binary compatibility to existing MySQL Server deployments. Users can now fully exploit the latest capabilities of both the InnoDB and MySQL Cluster storage engines within a single application.

Users simply install the new MySQL Cluster binary including the MySQL 5.5 release, restart the server and immediate have access to both InnoDB and MySQL Cluster!

Enhancing Cross Data Center Scalability: Simplified Active / Active Replication

MySQL Cluster has long offered Geographic Replication, distributing clusters to remote data centers to reduce the affects of geographic latency by pushing data closer to the user, as well as providing a capability for disaster recovery.

Geographic replication has always been designed around an Active / Active technology, so if applications are attempting to update the same row on different clusters at the same time, the conflict can be detected and resolved. With the release of MySQL Cluster 7.2.1, implementing Active / Active replication has become a whole lot simpler. Developers no longer need to implement and manage timestamp columns within their applications. Also rollbacks can be made to whole transactions rather than just individual operations.

You can learn more here.

Enhancing Cross Data Center Scalability: Multi-Site Clustering

MySQL Cluster 7.2.1 DMR provides a new option for cross data center scalability – multi-site clustering. For the first time splitting data nodes across data centers is a supported deployment option.

Improvements to MySQL Cluster’s heartbeating mechanism with a new “ConnectivityCheckPeriod” parameter enables greater resilience to temporary latency spikes on a WAN, thereby maintaining operation of the cluster.

With this deployment model, users can synchronously replicate updates between data centers without needing conflict detection and resolution, and automatically failover between those sites in the event of a node failure.

Users need to characterize their network bandwidth and latencies, and observe best practices in configuring both their network environment and Cluster. More guidance is available here.

User Privilege Consolidation

User privilege tables are now consolidated into the data nodes and centrally accessible by all MySQL servers accessing the cluster.

Previously the privilege tables were local to each MySQL server, meaning users and their associated privileges had to be managed separately on each server. By consolidating privilege data, users need only be defined once and managed centrally, saving Systems Administrators significant effort and reducing cost of operations.

Summary

The MySQL Cluster 7.2.1 DMR enables new classes of use-cases to benefit from web-scale performance with carrier-grade availability.  We also have a great webinar coming up on Wednesday October 19th  where the engineering and product management team will discuss the enhancements in more detail, and how you can use them today. You can sign up here.

You can download the DMR for evaluation now from: http://dev.mysql.com/downloads/cluster/ (select Development Milestone Release tab).

You can learn more about the MySQL Cluster architecture from our Guide to scaling web databases

Let us know what you think of these enhancements directly in comments of this or the associated blogs. We look forward to working with the community to perfect these new features.

Friday Aug 05, 2011

Scaling Web Databases, Part 3: SQL & NoSQL Data Access

Supporting successful services on the web means scaling your back-end databases across multiple dimensions. This blog focuses on scaling access methods to your data using SQL and/or NoSQL interfaces.

In Part 1 of the blog series , I discussed scaling database performance using auto-sharding and active/active geographic replication in MySQL Cluster to enable applications to scale both within and across data centers.  

In Part 2, I discussed the need to scale operational agility to keep pace with demand, which includes being able to add capacity and performance to the database, and to evolve the schema – all without downtime.

So in this blog I want to explore another dimension to scalability -  how multiple interfaces can be used to scale access to the database, enabling users to simultaneously serve multiple applications, each with distinct access requirements.

Data Access Interfaces to MySQL Cluster

MySQL Cluster automatically shards tables across pools of commodity data nodes, rather than store those tables in a single MySQL Server. It is therefore able to present multiple interfaces to the database, giving developers a choice between:

- S    -  SQL for complex reporting-type queries;

- S    -  Simple Key/Value interfaces bypassing the SQL layer for blazing fast reads & writes;

- S    -  Real-time interfaces for micro-second latency, again bypassing the SQL layer

With this choice of interfaces, developers are free to work in their own preferred environments, enhancing productivity and agility and enabling them to innovate faster.

SQL or NoSQL - Selecting the Right Interface

The following chart shows all of the access methods available to the database. The native API for MySQL Cluster is the C++ based NDB API. All other interfaces access the data through the NDB API.

At the extreme right hand side of the chart, an application has embedded the NDB API library enabling it to make native C++ calls to the database, and therefore delivering the lowest possible latency.

On the extreme left hand side of the chart, MySQL presents a standard SQL interface to the data nodes, and provides connectivity to all of the standard MySQL connectors including:

- Common web development languages and frameworks, i.e. PHP, Perl, Python, Ruby, Ruby on Rails, Spring, Django, etc;

- JDBC (for additional connectivity into ORMs including EclipseLink, Hibernate, etc)

- .NET

- ODBC

Whichever API is chosen for an application, it is important to emphasize that all of these SQL and NoSQL access methods can be used simultaneously, across the same data set, to provide the ultimate in developer flexibility. Therefore, MySQL Cluster maybe supporting any combination of the following services, in real-time:

- Relational queries using the SQL API;

- Key/Value-based web services using the REST/JSON and memcached APIs;

- Enterprise applications with the ClusterJ and JPA APIs;

- Real-time web services (i.e. presence and location based) using the NDB API.

The following figure aims to summarize the capabilities and use-cases for each API.

Schema-less Data Store with the memcached API

As part of the MySQL Cluster 7.2 Development Milestone Release , Oracle announced the preview of native memcached Key/Value API support for MySQL Cluster enabling direct access to the database from the memcached API without passing through the SQL layer. You can read more about the implementation and how to get going with it in this excellent post from Andrew Morgan.

The following image shows the implementation of the memcached API for MySQL Cluster 


Implementation is simple - the application sends read and write requests to the memcached process (using the standard memcached API). This in turn invokes the Memcached Driver for NDB (which is part of the same process), which in turn calls the NDB API for very quick access to the data held in MySQL Cluster’s data nodes.

The solution has been designed to be very flexible, allowing the application architect to find a configuration that best fits their needs. It is possible to co-locate the memcached API in either the data nodes or application nodes, or alternatively within a dedicated memcached layer.

The benefit of this approach is that users can configure behavior on a per-key-prefix basis (through tables in MySQL Cluster) and the application doesn’t have to care – it just uses the memcached API and relies on the software to store data in the right place(s) and to keep everything synchronized.

By default, every Key / Value is written to the same table with each Key / Value pair stored in a single row – thus allowing schema-less data storage. Alternatively, the developer can define a key-prefix so that each value is linked to a pre-defined column in a specific table.

Of course if the application needs to access the same data through SQL then developers can map key prefixes to existing table columns, enabling Memcached access to schema-structured data already stored in MySQL Cluster.

Summary

MySQL Cluster provides developers and architects with a huge amount of flexibility in accessing their persistent data stores - a reflection that one size no longer fits all in the world of web services and databases.

You can learn more about this, and the other dimensions to scaling web databases in our new Guide. 

As ever, let me know your thoughts in the comments below. 


About

Get the latest updates on products, technology, news, events, webcasts, customers and more.

Twitter


Facebook

Search

Archives
« April 2014
SunMonTueWedThuFriSat
  
2
5
6
9
10
11
12
13
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today