Monday Dec 19, 2011

Using MySQL Cluster to Protect & Scale the HDFS Namenode

The MySQL Cluster product team is always interested to see new and innovative uses of the database. Last week, a team of students at the KTH Royal Institute of Technology in Sweden blogged about their use of MySQL Cluster in creating a scalable and highly available HDFS Namenode. The blog has received some pretty wide coverage, and was first picked up by Alex Popescu at the myNoSQL site.

There are many established use cases of MySQL Cluster in the web, cloud/SaaS, telecoms and even flight control systems, although we are only allowed to talk about some of them publicly.

The KTH team has been working on a project to move all of the metadata from the HDFS / Hadoop namenode to MySQL Cluster. Why did they want to do this, you may ask? Well:

- The namenode is a single point of failure. If it goes down, so too does the file system.

- As a single server, the namenode becomes a bottleneck within heavily loaded HDFS / Hadoop deployments. As server resources are consumed and write volumes increase, the system can grind to a halt. (And with data volumes growing around 40% per year, this will only become more common!)

So KTH decided to move metadata storage to MySQL Cluster. Why MySQL Cluster in particular?

- MySQL Cluster already offered them a replicated, shared-nothing database, distributed across commodity hardware.

- MySQL Cluster is widely deployed, with proven stability.

- The metadata can be distributed across nodes to scale out capacity, while retaining complete consistency to the clients and eliminating any single point of failure.

- Operations per second scale linearly across the cluster as new namenodes are added.

Access to the cluster is via the MySQL Cluster Connector for Java (the ClusterJ API), which provides a NoSQL, Java-based ORM with very low latency.
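To make the idea of distributed, replicated metadata a little more concrete, here is a minimal sketch in plain Java. It is not the KTH implementation and it does not use the ClusterJ API: the class names (`MetadataStoreSketch`, `InodeRow`, `DataNode`), the row fields and the hashing scheme are all hypothetical, and the "data nodes" are just in-memory maps. It only illustrates the general pattern of hashing each metadata row to a group of replicas, so that a lookup still succeeds when one replica in the group is lost.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch: file metadata rows hashed across replicated node groups. */
public class MetadataStoreSketch {

    /** One metadata row; the fields are illustrative, not the real HDFS schema. */
    public static final class InodeRow {
        public final long id;
        public final String path;
        public final long sizeBytes;

        public InodeRow(long id, String path, long sizeBytes) {
            this.id = id;
            this.path = path;
            this.sizeBytes = sizeBytes;
        }
    }

    /** In-memory stand-in for a data node that can be marked as failed. */
    public static final class DataNode {
        private final Map<Long, InodeRow> rows = new HashMap<>();
        private boolean up = true;

        public void fail() { up = false; }
        public boolean isUp() { return up; }
    }

    /** Each group holds the replicas of one partition of the metadata. */
    private final List<List<DataNode>> nodeGroups = new ArrayList<>();

    public MetadataStoreSketch(int groups, int replicasPerGroup) {
        for (int g = 0; g < groups; g++) {
            List<DataNode> replicas = new ArrayList<>();
            for (int r = 0; r < replicasPerGroup; r++) {
                replicas.add(new DataNode());
            }
            nodeGroups.add(replicas);
        }
    }

    /** Picks the node group for a row by hashing its primary key (assumed scheme). */
    public List<DataNode> groupFor(long id) {
        return nodeGroups.get(Math.floorMod(Long.hashCode(id), nodeGroups.size()));
    }

    /** Writes the row to every live replica in its group. */
    public void put(InodeRow row) {
        for (DataNode node : groupFor(row.id)) {
            if (node.isUp()) {
                node.rows.put(row.id, row);
            }
        }
    }

    /** Reads the row from the first live replica; empty if the whole group is down. */
    public Optional<InodeRow> find(long id) {
        for (DataNode node : groupFor(id)) {
            if (node.isUp()) {
                return Optional.ofNullable(node.rows.get(id));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        MetadataStoreSketch store = new MetadataStoreSketch(2, 2);
        store.put(new InodeRow(7L, "/data/a.log", 1024L));

        // Lose one replica of the group that owns row 7; the read still succeeds.
        store.groupFor(7L).get(0).fail();
        System.out.println(store.find(7L).map(row -> row.path).orElse("unavailable"));
    }
}
```

The sketch leaves out everything that makes a real cluster hard (transactions, node recovery, re-synchronising a replica that comes back), so treat it as a picture of the data layout rather than a design.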

Of course, the work at KTH is ongoing, with future optimizations planned, which we will follow with interest.

So how can you determine if MySQL Cluster is the right choice for your new project? We have just updated our MySQL Cluster Evaluation Guide.

This update is based around the latest MySQL Cluster 7.2 Development Release, which includes a series of enhancements to further broaden the range of use cases for MySQL Cluster, including:

- 70x higher JOIN performance with Adaptive Query Localization, which pushes JOIN operations down to MySQL Cluster’s data nodes.

- Native Key-Value Memcached interface to the cluster, allowing both schema and schemaless storage.

- New cross-data center scalability enhancements.

MySQL Cluster is not a fit for every use case, but by downloading the Evaluation Guide, you’ll get a clear picture of where MySQL Cluster can be useful to you, along with best practices for planning and executing your evaluation.

Let us know of other interesting use cases in the comments below.
