are the center of today’s web, enterprise and embedded applications, storing
and protecting an organization’s most valuable assets and supporting business-critical
applications. Just minutes of downtime can result in significant lost revenue
and dissatisfiedcustomers. Ensuring database highly
availability is therefore a top priority for any organization.
approaches to delivering highly available MySQL services;
best practices to meet Service Level Agreements (SLAs).
As discussed in
the new Guide, selecting the high availability solution that is appropriate for
your application depends upon 3 core principles:
- The level of availability required to meet business objectives,
within budgetary constraints;
- The profile of application being deployed (i.e. concurrent users,
requests per second, etc.);
- Operational standards within each data center.
each application or service has different operational and availability
requirements, the guide discusses the range of certified and supported High
Availability (HA) solutions – from internal departmental applications all the
way through to geographically redundant, multi-data center systems delivering
99.999% availability (i.e. less than 5 ½ minutes of downtime per year)
supporting transactional web services, communications networks, cloud and hosting environments, etc.
By combining the
right technology with the right skills and processes, users can achieve
business continuity, while developers and DBAs can sleep tight at night!Download the guide to learn
of today’s successful web services are creating new demands that many legacy
databases were just not designed to handle:
-The need to scale writes, as well as reads, both within and across
geographically dispersed data centers;
-The need to scale operational agility to keep pace with database
load and application requirements. This means being able to add capacity and
performance to the database, and to evolve the schema – all without downtime;
-The need to scale queries by having flexibility in the APIs used to
access the database;
-The need to scale the database while maintaining continuous
availability for both failures as well as scheduled maintenance events.
the requirements above warrant their own dedicated blog, which I’ll find time
to write over the next few weeks.
get started, I wanted to discuss how the MySQL Cluster database addresses the
first point – scaling writes to the database with automatic sharding and
MySQL Cluster is implemented as a distributed,
multi-master database with no single point of failure.Tables are automatically sharded across a
pool of low cost commodity nodes, enabling the database to scale horizontally
to serve read and write-intensive workloads, accessed both from SQL and directly
via NoSQL APIs (memcached, REST/HTTP, C++, Java, JPA and LDAP).Up to 255 nodes are supported, of which 48
are data nodes.You can read more about
the different types of nodes here.
By automatically sharding tables in the
database, MySQL Cluster eliminates the need to shard at the application layer,
greatly simplifying application development and maintenance.
Sharding is based on the hashing of the
primary key, though users can override this by telling MySQL Cluster which
fields from the primary key should be used in the hashing algorithm. Hashing on
the primary key generally leads to a more even distribution of data and queries
across the cluster than alternative approached such as range partitioning.
Figure 1 demonstrates how MySQL Cluster
shards tables across data nodes of the cluster.
1: Auto-Sharding in MySQL Cluster
see from the figure above that MySQL Cluster automatically creates “node
groups” from the number of replicas and data nodes specified by the user.Updates are synchronously replicated between
members of the node group to protect against data loss and enable sub-second
failover in the event of a node failure.
shows how MySQL Cluster creates primary and secondary fragments of each shard.
2: Eliminating Data Loss with Cross-Shard Fragments
Cluster is an active/active architecture with multi-master replication, so
updates made by any application or SQL node accessing the cluster are instantly
available to all of the other nodes accessing the cluster.
Unlike other distributed databases,
users do not lose the ability to perform JOIN operations or sacrifice
ACID-guarantees. In the Development Release of MySQL Cluster (7.2), Adaptive
Query Localization pushes JOIN operations down to the data nodes where they are
executed locally and in parallel. We've seen 20-40x higher throughput from the
community members that have tested it.
Of course, web services are global
and so developers will want to ensure their databases can scale-out across
regions.MySQL Cluster offers Geographic
Replication which distributes clusters to remote data centers, serving to
reduce the affects of geographic latency as well as provide a facility for
Figure 3: Geographic Replication
with MySQL Cluster
Geographic Replication is
asynchronous and based on standard MySQL replication – with one important
difference – it is active/active so supports the detection and resolution of
conflicts when the same row is updated across different clusters.This does currently require the addition of a
timestamp column in the application, but that is expected to be eliminated in
Where the Rubber Meets the Road
Auto-sharding and geographic
replication are all great technologies, but what do they mean in terms of delivered
The MySQL Cluster development team
recently ran a series of benchmarks that characterized performance across 8 x dual socket 2.93GHz, 6 core commodity
Intel servers, each equipped with 24GB of RAM.As seen in the figure below, MySQL Cluster delivered just under 2.5
million updates per second with 2 x data nodes configured per server.
Figure 4: MySQL Cluster performance
scaling-out on commodity nodes.
Across 16 Intel servers, MySQL
Cluster achieved just under 7 million read operations per second.We ran out of time in the test cluster before
being able to complete the test of write performance, but will return to those
So what does all of this mean ?There is an ever-growing array of options for
developers to choose from when scaling out new generations of web
applications.Don’t assume that
relational databases can’t scale, or offer the kind of operational agility
demanded by today’s highly dynamic services.MySQL Cluster is already proven as one such option….and you don’t have
to throw away ACID guarantees or the ability to run complex queries to get
scalability or schema agility.
You can learn about how MySQL
Cluster implements auto-sharding, along with other key features for web
services such as online schema updates and NoSQL interfaces from a new
And of course MySQL Cluster is open
source, so you are free to download, develop and deploy with it.The latest GA release is here.
The MySQL Cluster 7.2 Development
Milestone Release including Adaptive Query Localization is here (select the
Development Release tab):
Finally, if you wanted to try out
MySQL Cluster with the memcached API, you can get it from the latest build on
the MySQL labs site.
As ever, let us know how these
technologies work for you, either in the comments below or via the MySQL