Using MySQL Cluster to Protect & Scale the HDFS Namenode
By Mat Keep on Dec 19, 2011
The MySQL Cluster product team is always interested to see new and innovative uses of the database. Last week, a team of students at the KTH Royal Institute of Technology in Sweden blogged about their use of MySQL Cluster in creating a scalable and highly available HDFS Namenode. The blog has received some pretty wide coverage, but was first picked up by Alex Popescu at the myNoSQL site.
There are many established use cases of MySQL Cluster in the web, cloud/SaaS, telecoms and even flight control systems – you can see those we are allowed to talk about publicly here.
- The namenode is a single point of failure. If it goes down, so too does the file system.
- As a single server, the namenode becomes a bottleneck within heavily loaded HDFS / Hadoop deployments. As server resources are consumed and write volumes increase, the system can grind to a halt. (And with data volumes growing around 40% per year, this will only become more common!)
- MySQL Cluster already offered them a replicated, shared-nothing database, distributed across commodity hardware.
- MySQL Cluster is widely deployed with proven stability.
- The metadata can be distributed across nodes to scale out capacity, while retaining complete consistency to the clients and eliminating any Single Point of Failure
- Linear scaling of operations per second across the cluster, as new namenodes are added.
- 70x higher JOIN performance with Adaptive Query Localization pushing JOIN operations down to MySQL Cluster’s data nodes
- Native Key-Value Memcached interface to the cluster allowing schema and schemaless storage
- New cross-data center scalability enhancements
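The idea behind the namenode bullets above, moving file system metadata out of a single server and into a shared, replicated store, can be pictured with a small toy model. This is only an illustrative sketch, not the KTH team's implementation: the class and function names (`SharedMetadataStore`, `StatelessNamenode`, `lookup_with_failover`) are hypothetical, and a plain Python dict stands in for the database.

```python
class SharedMetadataStore:
    """Hypothetical stand-in for a replicated database; a plain dict here."""

    def __init__(self):
        self._files = {}

    def put(self, path, blocks):
        # Store a copy so callers cannot mutate the metadata afterwards.
        self._files[path] = list(blocks)

    def get(self, path):
        return self._files.get(path)


class StatelessNamenode:
    """Hypothetical namenode that keeps no metadata of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store
        self.alive = True

    def create_file(self, path, blocks):
        self.store.put(path, blocks)

    def lookup(self, path):
        return self.store.get(path)


def lookup_with_failover(namenodes, path):
    """Ask the first namenode that is still alive."""
    for namenode in namenodes:
        if namenode.alive:
            return namenode.lookup(path)
    raise RuntimeError("no namenode available")


store = SharedMetadataStore()
nn1 = StatelessNamenode("nn1", store)
nn2 = StatelessNamenode("nn2", store)

nn1.create_file("/logs/app.log", ["blk_1", "blk_2"])
nn1.alive = False  # simulate losing the namenode that handled the write

# The metadata survives because it lives in the shared store, not in nn1.
blocks = lookup_with_failover([nn1, nn2], "/logs/app.log")
```

Because the namenodes in this sketch hold no state, losing one does not lose metadata, and further namenodes can be added in front of the same store to spread the load.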
MySQL Cluster is not a fit for every use-case, but by downloading the Evaluation Guide, you’ll get a clear picture of where MySQL Cluster can be useful to you, and best practices in planning and executing your evaluation.
Let us know of other interesting use-cases in the comments below.