Using MySQL Cluster to Protect & Scale the HDFS Namenode

Guest Author

The MySQL Cluster product team is always interested in seeing new and innovative uses of the database. Last week, a team of students at the KTH Royal Institute of Technology in Sweden blogged about their use of MySQL Cluster in creating a scalable and highly available HDFS Namenode. The blog has received some pretty wide coverage, but was first picked up by Alex Popescu at the myNoSQL site.

There are many established use cases of MySQL Cluster in web, cloud/SaaS, telecoms and even flight control systems, though we are only allowed to talk publicly about some of them.

The KTH team has been working on a project to move all of the metadata from the HDFS / Hadoop namenode to MySQL Cluster. Why did they want to do this, you may ask? Well:

- The namenode is a single point of failure. If it goes down, so too does the file system.

- As a single server, the namenode becomes a bottleneck within heavily loaded HDFS / Hadoop deployments. As server resources are consumed and write volumes increase, the system can grind to a halt. (And with data volumes growing around 40% per year, this will only become more common!)

So KTH decided to move metadata storage to MySQL Cluster. Why MySQL Cluster, you may ask? Well:

- MySQL Cluster already offered them a replicated, shared-nothing database, distributed across commodity hardware.

- MySQL Cluster is widely deployed, with proven stability.

- The metadata can be distributed across nodes to scale out capacity, while retaining complete consistency to the clients and eliminating any single point of failure.

- Operations per second scale linearly across the cluster as new namenodes are added.
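To make the idea of distributing metadata across nodes without a single point of failure more concrete, here is a minimal, purely illustrative sketch in Java. It is a toy in-memory model and not MySQL Cluster code or the KTH implementation: the class `ReplicatedMetadataStore`, its methods, and the use of a numeric block ID as the stored value are all assumptions made for the example. Each file path is hashed to one group of nodes, every live node in that group holds a copy of the entry, and a read succeeds as long as at least one node in the group is still up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Toy model only; this is not MySQL Cluster code or its API.
// Entries are hash-partitioned across groups, and every live node
// in a group keeps a full copy of that group's partition.
class ReplicatedMetadataStore {
    private final int groups;
    private final int replicas;
    // copies.get(g).get(r) is replica r of partition g
    private final List<List<Map<String, Long>>> copies = new ArrayList<>();
    private final boolean[][] alive;

    ReplicatedMetadataStore(int groups, int replicas) {
        if (groups < 1 || replicas < 1) {
            throw new IllegalArgumentException("groups and replicas must be >= 1");
        }
        this.groups = groups;
        this.replicas = replicas;
        this.alive = new boolean[groups][replicas];
        for (int g = 0; g < groups; g++) {
            List<Map<String, Long>> group = new ArrayList<>();
            for (int r = 0; r < replicas; r++) {
                group.add(new HashMap<>());
                alive[g][r] = true;
            }
            copies.add(group);
        }
    }

    // Pick the group that owns a path by hashing the path.
    int groupFor(String path) {
        return Math.floorMod(path.hashCode(), groups);
    }

    // Write the entry to every live replica of the owning group.
    void put(String path, long blockId) {
        int g = groupFor(path);
        boolean written = false;
        for (int r = 0; r < replicas; r++) {
            if (alive[g][r]) {
                copies.get(g).get(r).put(path, blockId);
                written = true;
            }
        }
        if (!written) {
            throw new IllegalStateException("no live replica in group " + g);
        }
    }

    // Read from the first live replica of the owning group.
    Optional<Long> get(String path) {
        int g = groupFor(path);
        for (int r = 0; r < replicas; r++) {
            if (alive[g][r]) {
                return Optional.ofNullable(copies.get(g).get(r).get(path));
            }
        }
        throw new IllegalStateException("no live replica in group " + g);
    }

    // Simulate a node failure; node recovery is not modelled here.
    void failNode(int group, int replica) {
        alive[group][replica] = false;
    }

    public static void main(String[] args) {
        ReplicatedMetadataStore store = new ReplicatedMetadataStore(2, 2);
        String path = "/user/alice/part-00000"; // hypothetical file path
        store.put(path, 1001L);
        store.failNode(store.groupFor(path), 0); // lose one copy
        System.out.println(store.get(path));     // still served by the other copy
    }
}
```

The sketch leaves out everything that makes a real distributed database hard (transactions, node recovery, rebalancing), but it shows why adding groups adds capacity while replicas within a group remove the single point of failure.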

Access to the cluster is via the MySQL Cluster Connector for Java (ClusterJ), which provides a NoSQL, Java-based ORM with very low latency.
You can learn more about the ClusterJ API in the MySQL Cluster documentation.

Of course, the work at KTH is ongoing, with future optimizations planned, and we will follow it with interest.

So how can you determine if MySQL Cluster is the right choice for your new project?
We have just updated our MySQL Cluster Evaluation Guide.

This update is based around the latest MySQL Cluster 7.2 Development Release, which includes a series of enhancements to further broaden the use cases of MySQL Cluster, including:

- 70x higher JOIN performance, with Adaptive Query Localization pushing JOIN operations down to MySQL Cluster's data nodes.

- Native Key-Value Memcached interface to the cluster, allowing schema-based and schemaless storage.

- New cross-data center scalability enhancements.
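For readers unfamiliar with key-value access over Memcached, the following Java sketch shows the shape of the requests a client sends in the standard memcached text protocol: `set <key> <flags> <exptime> <bytes>` followed by the data block, and `get <key>`. It assumes the Memcached interface mentioned above accepts this standard protocol. The class `MemcachedTextCommands` and the example key `inode:42` are hypothetical, and the code only builds the request bytes; it does not open a connection or show how keys map onto tables in the cluster.

```java
import java.nio.charset.StandardCharsets;

// Builds request bytes for the standard memcached text protocol.
// Illustrative only: no networking, no response parsing.
class MemcachedTextCommands {

    // "set <key> <flags> <exptime> <bytes>\r\n<data block>\r\n"
    static byte[] set(String key, int flags, int exptimeSeconds, byte[] value) {
        checkKey(key);
        if (flags < 0 || exptimeSeconds < 0) {
            throw new IllegalArgumentException("flags and exptime must be non-negative");
        }
        byte[] header = ("set " + key + " " + flags + " " + exptimeSeconds + " "
                + value.length + "\r\n").getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[header.length + value.length + 2];
        System.arraycopy(header, 0, out, 0, header.length);
        System.arraycopy(value, 0, out, header.length, value.length);
        out[out.length - 2] = '\r';
        out[out.length - 1] = '\n';
        return out;
    }

    // "get <key>\r\n"
    static byte[] get(String key) {
        checkKey(key);
        return ("get " + key + "\r\n").getBytes(StandardCharsets.UTF_8);
    }

    // The protocol limits keys to 250 bytes with no whitespace or control characters.
    private static void checkKey(String key) {
        int length = key.getBytes(StandardCharsets.UTF_8).length;
        if (length == 0 || length > 250) {
            throw new IllegalArgumentException("key must be 1..250 bytes");
        }
        for (int i = 0; i < key.length(); i++) {
            char c = key.charAt(i);
            if (c <= ' ' || c == 127) {
                throw new IllegalArgumentException(
                        "key must not contain whitespace or control characters");
            }
        }
    }

    public static void main(String[] args) {
        byte[] value = "block-1001".getBytes(StandardCharsets.UTF_8);
        // Print the raw requests; a real client would write them to a socket.
        System.out.print(new String(set("inode:42", 0, 0, value), StandardCharsets.UTF_8));
        System.out.print(new String(get("inode:42"), StandardCharsets.UTF_8));
    }
}
```

In practice you would use an existing memcached client library rather than hand-building requests; the point here is only that the interface is a simple key-to-value exchange with no SQL involved.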

MySQL Cluster is not a fit for every use-case, but by
downloading the Evaluation Guide, you’ll get a clear picture of where MySQL
Cluster can be useful to you, and best practices in planning and executing your
evaluation.

Let us know of other interesting use-cases in the comments below.

Comments ( 1 )
  • Michela Wednesday, January 25, 2012

    We face some issues while working on this cluster. This article helped us to resolve some of issue from those problems

