Tuesday Jun 05, 2012

MySQL Cluster 7.3 Labs Release – Foreign Keys Are In!

Summary (aka TL;DR):

Support for Foreign Key constraints has been one of the most requested feature enhancements for MySQL Cluster. We are therefore extremely excited to announce that Foreign Keys are part of the first Labs Release of MySQL Cluster 7.3 – available for download, evaluation and feedback now! (Select the mysql-cluster-7.3-labs-June-2012 build)

In this blog, I will attempt to discuss the design rationale, implementation, configuration and steps to get started in evaluating the first MySQL Cluster 7.3 Labs Release.

Pace of Innovation

It was only a couple of months ago that we announced the General Availability (GA) of MySQL Cluster 7.2, delivering 1 billion Queries per Minute, with 70x higher cross-shard JOIN performance, Memcached NoSQL key-value API and cross-data center replication.  This release has been a huge hit, with downloads and deployments quickly reaching record levels.

The announcement of the first MySQL Cluster 7.3 Early Access lab release at today's MySQL Innovation Day event demonstrates the continued pace in Cluster development, and provides an opportunity for the community to evaluate and provide feedback on the new features they want to see.

What’s the Plan for MySQL Cluster 7.3?

Well, Foreign Keys, as you may have gathered by now (!), and this is the focus of this first Labs Release.

As with MySQL Cluster 7.2, we plan to publish a series of preview releases for 7.3 that will incrementally add new candidate features for a final GA release (subject to usual safe harbor statement below*), including:

- New NoSQL APIs;

- Features to automate the configuration and provisioning of multi-node clusters, on premise or in the cloud;

- Performance and scalability enhancements;

- Taking advantage of features in the latest MySQL 5.x Server GA.

Design Rationale

MySQL Cluster is designed as a “Not-Only-SQL” database. It combines attributes that enable users to blend the best of both relational and NoSQL technologies into solutions that deliver web scalability with 99.999% availability and real-time performance, including:

  • Concurrent NoSQL and SQL access to the database;
  • Auto-sharding with simple scale-out across commodity hardware;
  • Multi-master replication with failover and recovery both within and across data centers;
  • Shared-nothing architecture with no single point of failure;
  • Online scaling and schema changes;
  • ACID compliance and support for complex queries, across shards.

Native support for Foreign Key constraints enables users to extend the benefits of MySQL Cluster into a broader range of use-cases, including:

- Packaged applications in areas such as eCommerce and Web Content Management that prescribe databases with Foreign Key support.

- In-house developments benefiting from Foreign Key constraints to simplify data models and eliminate the additional application logic needed to maintain data consistency and integrity between tables.

Implementation

The Foreign Key functionality is implemented directly within MySQL Cluster’s data nodes, allowing any client API accessing the cluster to benefit from them – whether using SQL or one of the NoSQL interfaces (Memcached, C++, Java, JPA or HTTP/REST.)

The core referential actions defined in the SQL:2003 standard are implemented:

  • CASCADE
  • RESTRICT
  • NO ACTION
  • SET NULL

In addition, the MySQL Cluster implementation supports the online adding and dropping of Foreign Keys, ensuring the Cluster continues to serve both read and write requests during the operation.

An important difference to note with the Foreign Key implementation in InnoDB is that MySQL Cluster does not support the updating of Primary Keys from within the Data Nodes themselves - instead the UPDATE is emulated with a DELETE followed by an INSERT operation. Therefore an UPDATE operation will return an error if the parent reference is using a Primary Key, unless using CASCADE action, in which case the delete operation will result in the corresponding rows in the child table being deleted. The Engineering team plans to change this behavior in a subsequent preview release.

Also note that when using InnoDB "NO ACTION" is identical to "RESTRICT". In the case of MySQL Cluster “NO ACTION” means “deferred check”, i.e. the constraint is checked before commit, allowing user-defined triggers to automatically make changes in order to satisfy the Foreign Key constraints.
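
The difference between an immediate check (RESTRICT) and a deferred check (NO ACTION in MySQL Cluster) can be illustrated with a toy in-memory model. This is only a sketch in Python, not MySQL Cluster code: the `Transaction` class, its method names and the sample rows are all hypothetical and exist purely for illustration.

```python
class ForeignKeyError(Exception):
    """Raised when a child row would be left without its parent."""


class Transaction:
    """Toy transaction over one parent table and one child table.

    parents  : set of parent primary keys
    children : dict mapping child id -> referenced parent key
    mode     : "RESTRICT" (check on every statement) or
               "NO ACTION" (check once, just before commit)
    """

    def __init__(self, parents, children, mode):
        self.parents = set(parents)
        self.children = dict(children)
        self.mode = mode

    def delete_parent(self, key):
        # Immediate check: refuse the statement if any child still points here.
        if self.mode == "RESTRICT" and key in self.children.values():
            raise ForeignKeyError(f"parent {key} is still referenced")
        self.parents.discard(key)

    def repoint_child(self, child_id, new_parent):
        if self.mode == "RESTRICT" and new_parent not in self.parents:
            raise ForeignKeyError(f"parent {new_parent} does not exist")
        self.children[child_id] = new_parent

    def commit(self):
        # Deferred check: every child must reference an existing parent.
        orphans = sorted(c for c, p in self.children.items()
                         if p not in self.parents)
        if orphans:
            raise ForeignKeyError(f"orphaned child rows: {orphans}")
        return self.parents, self.children


def run(mode):
    """Delete a referenced parent, then repair the child before commit."""
    tx = Transaction(parents={1, 2}, children={"a": 1}, mode=mode)
    try:
        tx.delete_parent(1)        # child "a" dangles for a moment
        tx.repoint_child("a", 2)   # a later statement (or trigger) repairs it
        tx.commit()
        return "committed"
    except ForeignKeyError:
        return "rejected"
```

In this model `run("RESTRICT")` is rejected at the first statement, while `run("NO ACTION")` commits because the constraint is only evaluated once the repair has been made.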

Configuration

There is nothing special you have to do here – Foreign Key constraint checking is enabled by default.

If you intend to migrate existing tables from another database or storage engine, for example from InnoDB, there are a couple of best practices to observe:

1. Analyze the structure of the Foreign Key graph and run the ALTER TABLE ENGINE=NDB in the correct sequence to ensure constraints are enforced.

2. Alternatively drop the Foreign Key constraints prior to the import process and then recreate when complete.
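
As a rough illustration of the first option, the sequence can be derived with a topological sort of the Foreign Key graph. The sketch below is in Python and assumes that the "correct sequence" means converting each referenced (parent) table before the tables that reference it. The table names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical schema: each table maps to the set of tables it references.
references = {
    "customers": set(),
    "products": set(),
    "orders": {"customers"},
    "order_items": {"orders", "products"},
}


def migration_order(refs):
    """Return table names with every referenced (parent) table first."""
    return list(TopologicalSorter(refs).static_order())


def alter_statements(refs):
    """Build one ALTER TABLE statement per table, in migration order."""
    return [f"ALTER TABLE {table} ENGINE=NDB;" for table in migration_order(refs)]


if __name__ == "__main__":
    for statement in alter_statements(references):
        print(statement)
```

If the graph contains a cycle, `static_order()` raises `graphlib.CycleError`; in that case the second option (drop the constraints, import, then recreate them) is the simpler route.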

Getting Started

Read this blog for a demonstration of using Foreign Keys with MySQL Cluster. 

You can download MySQL Cluster 7.3 Labs Release with Foreign Keys today - (select the mysql-cluster-7.3-labs-June-2012 build)

If you are new to MySQL Cluster, the Getting Started guide will walk you through installing an evaluation cluster on a single host (these guides reflect MySQL Cluster 7.2, but apply equally well to 7.3).

Post any questions to the MySQL Cluster forum where our Engineering team will attempt to assist you.

Post any bugs you find to the MySQL bug tracking system (select MySQL Cluster from the Category drop-down menu)

And if you have any feedback, please post it to the Comments section of this blog.

Summary

MySQL Cluster 7.2 is the GA, production-ready release of MySQL Cluster. This first Labs Release of MySQL Cluster 7.3 gives you the opportunity to preview and evaluate future developments in the MySQL Cluster database, and we are very excited to be able to share that with you.

Let us know how you get along with MySQL Cluster 7.3, and other features that you want to see in future releases.

* Safe Harbor Statement

This information is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

Friday Jun 01, 2012

Configuring MySQL Cluster Data Nodes

In my previous blog post, I discussed the enhanced performance and scalability delivered by extensions to the multi-threaded data nodes in MySQL Cluster 7.2. In this post, I’ll share best practices on the configuration of data nodes to achieve optimum performance on the latest generations of multi-core, multi-thread CPU designs.

Configuring the Data Nodes

The configuration of data node threads can be managed in two ways via the config.ini file:

- Simply set MaxNoOfExecutionThreads to the appropriate number of threads to be run in the data node, based on the number of threads presented by the processors used in the host or VM.

- Use the new ThreadConfig variable that enables users to configure both the number of each thread type to use and also which CPUs to bind them to.

The flexible configuration afforded by the multi-threaded data node enhancements means that it is possible to optimize data nodes to use anything from a single CPU/thread up to a 48 CPU/thread server. Co-locating the MySQL Server with a single data node can fully utilize servers with 64 – 80 CPU/threads. It is also possible to co-locate multiple data nodes per server, but this is now only required for very large servers with 4+ CPU sockets of dense multi-core processors.

24 Threads and Beyond!

An example of how to make best use of a 24 CPU/thread server box is to configure the following:

- 8 ldm threads

- 4 tc threads

- 3 recv threads

- 3 send threads

- 1 rep thread for asynchronous replication.

Each of those threads should be bound to a CPU. It is possible to bind the main thread (schema management domain) and the IO threads to the same CPU in most installations.
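
To make the layout concrete, here is a small Python sketch that renders a ThreadConfig line for config.ini from the thread counts above. The CPU numbering (ldm on CPUs 0-7, tc on 8-11, and so on, with main and io sharing CPU 19) is an assumption chosen for illustration, and the exact ThreadConfig grammar should be checked against the reference manual for your release.

```python
# (thread type, CPUs to bind, whether to emit an explicit count)
# The CPU assignments are illustrative assumptions, not a recommendation.
LAYOUT_24 = [
    ("ldm",  list(range(0, 8)),   True),
    ("tc",   list(range(8, 12)),  True),
    ("recv", list(range(12, 15)), True),
    ("send", list(range(15, 18)), True),
    ("rep",  [18],                False),
    ("main", [19],                False),  # main and io share one CPU
    ("io",   [19],                False),
]


def render_thread_config(layout):
    """Render a single ThreadConfig line from the layout description."""
    parts = []
    for name, cpus, with_count in layout:
        bind = "cpubind=" + ",".join(str(cpu) for cpu in cpus)
        body = f"count={len(cpus)},{bind}" if with_count else bind
        parts.append(f"{name}={{{body}}}")
    return "ThreadConfig=" + ",".join(parts)


def bound_cpus(layout):
    """Return the set of distinct CPUs used by the layout."""
    return {cpu for _, cpus, _ in layout for cpu in cpus}


if __name__ == "__main__":
    print(render_thread_config(LAYOUT_24))
    print(len(bound_cpus(LAYOUT_24)), "CPUs bound")
```

With this numbering the layout occupies 20 distinct CPUs, leaving the remaining 4 CPUs of the 24 for the operating system and interrupt handling.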

In the configuration above, we have bound threads to 20 different CPUs. We should also protect these 20 CPUs from interrupts by using the IRQBALANCE_BANNED_CPUS configuration variable in /etc/sysconfig/irqbalance and setting it to 0x0FFFFF.
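
The mask value is simply one bit per protected CPU. A short Python sketch of the arithmetic, assuming the 20 bound CPUs are numbered 0 to 19 (the helper name `cpu_mask` is hypothetical):

```python
def cpu_mask(cpus, width=6):
    """Return a hexadecimal bitmask with one bit set per CPU number."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return f"0x{mask:0{width}X}"


if __name__ == "__main__":
    # CPUs 0-19 give twenty set bits, i.e. five hex digits of F.
    print(cpu_mask(range(20)))
```

If the threads are bound to a different set of CPUs, the same function gives the matching mask; for example CPUs 4 to 23 would produce a mask with the low four bits cleared.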

The reason for doing this is that MySQL Cluster generates a lot of interrupt and OS kernel processing, and so it is recommended to separate activity across CPUs to ensure conflicts with the MySQL Cluster threads are eliminated.

When booting a Linux kernel it is also possible to provide an option isolcpus=0-19 in grub.conf. The result is that the Linux scheduler won't use these CPUs for any task. Only by using CPU affinity syscalls can a process be made to run on those CPUs.
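
For readers unfamiliar with the affinity syscalls, the Python standard library exposes them on Linux as `os.sched_setaffinity` and `os.sched_getaffinity`. The sketch below pins the calling process to one CPU. It is only an illustration of the mechanism: the data node binds its own threads through its configuration, and the helper name `pin_to_cpu` is hypothetical.

```python
import os


def pin_to_cpu(cpu):
    """Pin the calling process to a single CPU (Linux only).

    Returns the resulting affinity set so the caller can verify it.
    """
    os.sched_setaffinity(0, {cpu})   # pid 0 means "the calling process"
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    if hasattr(os, "sched_setaffinity"):
        # For a safe demonstration, pick a CPU the process may already use.
        target = min(os.sched_getaffinity(0))
        print("now restricted to:", pin_to_cpu(target))
    else:
        print("CPU affinity calls are not available on this platform")
```

On a host booted with isolcpus, one would pass the number of an isolated CPU instead of a CPU taken from the current affinity set.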

By using this approach, together with binding MySQL Cluster threads to specific CPUs and banning IRQ processing on those CPUs, a very stable performance environment is created for a MySQL Cluster data node.

On a 32 CPU/Thread server:

- Increase the number of ldm threads to 12

- Increase tc threads to 6

- Provide 2 more CPUs for the OS and interrupts.

- The number of send and receive threads should, in most cases, still be sufficient.

On a 40 CPU/Thread server, increase ldm threads to 16, tc threads to 8 and increment send and receive threads to 4.

On a 48 CPU/Thread server it is possible to optimize further by using:

- 12 tc threads

- 2 more CPUs for the OS and interrupts

- Avoid using IO threads and main thread on same CPU

- Add 1 more receive thread.
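
The recommendations for each server size can be cross-checked with a little arithmetic. The Python sketch below tabulates the thread counts and computes how many CPUs remain for the OS and interrupts. Several values are my reading of the text rather than stated figures: settings not mentioned for a size are carried over from the previous size, "increment send and receive threads to 4" is read as 4 of each, and `main_io` counts the CPUs used by the main and IO threads (1 when shared, 2 when separated).

```python
# Thread counts per server size; see the assumptions in the text above.
PROFILES = {
    24: {"ldm": 8,  "tc": 4,  "recv": 3, "send": 3, "rep": 1, "main_io": 1},
    32: {"ldm": 12, "tc": 6,  "recv": 3, "send": 3, "rep": 1, "main_io": 1},
    40: {"ldm": 16, "tc": 8,  "recv": 4, "send": 4, "rep": 1, "main_io": 1},
    48: {"ldm": 16, "tc": 12, "recv": 5, "send": 4, "rep": 1, "main_io": 2},
}


def cpus_bound(total_cpus):
    """CPUs occupied by data node threads for the given server size."""
    return sum(PROFILES[total_cpus].values())


def cpus_left_for_os(total_cpus):
    """CPUs remaining for the operating system and interrupt handling."""
    return total_cpus - cpus_bound(total_cpus)


if __name__ == "__main__":
    for size in sorted(PROFILES):
        print(size, "CPUs:", cpus_bound(size), "bound,",
              cpus_left_for_os(size), "left for OS and interrupts")
```

Under these assumptions the numbers are consistent with the text: 4 CPUs are left on the 24 CPU server, 6 on the 32 and 40 CPU servers (2 more), and 8 on the 48 CPU server (2 more again).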

Summary

As both this and the previous post seek to demonstrate, the multi-threaded data node extensions not only serve to increase performance of MySQL Cluster, they also enable users to achieve significantly improved levels of utilization from current and future generations of massively multi-core, multi-thread processor designs.

A big thanks to Mikael Ronstrom, Senior MySQL Architect at Oracle, for his work in developing these enhancements and best practices.

You can download MySQL Cluster 7.2 today and try out all of these enhancements. The Getting Started guides are an invaluable aid to quickly building a Proof of Concept.

Don’t forget to check out the MySQL Cluster 7.2 New Features whitepaper to discover everything that is new in the latest GA release.

Wednesday May 30, 2012

MySQL Cluster 7.2: Over 8x Higher Performance than Cluster 7.1

Summary

The scalability enhancements delivered by extensions to multi-threaded data nodes enable MySQL Cluster 7.2 to deliver over 8x higher performance than the previous MySQL Cluster 7.1 release on a recent benchmark.

What’s New in MySQL Cluster 7.2

MySQL Cluster 7.2 was released as GA (Generally Available) in February 2012, delivering many enhancements to performance on complex queries, a new NoSQL Key / Value API, cross-data center replication and ease-of-use. These enhancements are summarized in the Figure below, and detailed in the MySQL Cluster New Features whitepaper.

Figure 1: Next Generation Web Services, Cross Data Center Replication and Ease-of-Use

One of the key enhancements delivered in MySQL Cluster 7.2 is the set of extensions made to the multi-threading processes of the data nodes.

Multi-Threaded Data Node Extensions
The MySQL Cluster 7.2 data node is now functionally divided into seven thread types:
1) Local Data Manager threads (ldm). Note – these are sometimes also called LQH threads.
2) Transaction Coordinator threads (tc)
3) Asynchronous Replication threads (rep)
4) Schema Management threads (main)
5) Network receiver threads (recv)
6) Network send threads (send)
7) IO threads

Each of these thread types is discussed in more detail below.

MySQL Cluster 7.2 increases the maximum number of LDM threads from 4 to 16. The LDM contains the actual data, which means that when using 16 threads the data is more heavily partitioned (this is automatic in MySQL Cluster). Each LDM thread maintains its own set of data partitions, index partitions and REDO log. The number of LDM partitions per data node is not dynamically configurable, but it is possible, however, to map more than one partition onto each LDM thread, providing flexibility in modifying the number of LDM threads.

The TC domain stores the state of in-flight transactions. This means that every new transaction can easily be assigned to a new TC thread. Testing has shown that in most cases 1 TC thread per 2 LDM threads is sufficient, and in many cases even 1 TC thread per 4 LDM threads is also acceptable. Testing also demonstrated that in some instances where the workload needed to sustain very high update loads it is necessary to configure 3 to 4 TC threads per 4 LDM threads. In the previous MySQL Cluster 7.1 release, only one TC thread was available. This limit has been increased to 16 TC threads in MySQL Cluster 7.2. The TC domain also manages the Adaptive Query Localization functionality introduced in MySQL Cluster 7.2 that significantly enhanced complex query performance by pushing JOIN operations down to the data nodes.

Asynchronous Replication was separated into its own thread with the release of MySQL Cluster 7.1, and has not been modified in the latest 7.2 release.

To scale the number of TC threads, it was necessary to separate the Schema Management domain from the TC domain. The schema management thread has little load, so is implemented with a single thread.

The Network receiver domain was bound to 1 thread in MySQL Cluster 7.1. With the increase of threads in MySQL Cluster 7.2 it was also necessary to increase the maximum number of recv threads to 8. This enables each receive thread to service one or more sockets used to communicate with other nodes in the Cluster.

The Network send thread is a new thread type introduced in MySQL Cluster 7.2. Previously other threads handled the sending operations themselves, which can provide for lower latency. To achieve highest throughput however, it has been necessary to create dedicated send threads, of which 8 can be configured. It is still possible to configure MySQL Cluster 7.2 to a legacy mode that does not use any of the send threads – useful for those workloads that are most sensitive to latency.

The IO Thread is the final thread type and there have been no changes to this domain in MySQL Cluster 7.2. Multiple IO threads were already available, which could be configured to either one thread per open file, or to a fixed number of IO threads that handle the IO traffic. Except when using compression on disk, the IO threads typically have a very light load.

Benchmarking the Scalability Enhancements

The scalability enhancements discussed above have made it possible to scale CPU usage of each data node to more than 5x of that possible in MySQL Cluster 7.1. In addition, a number of bottlenecks have been removed, making it possible to scale data node performance by even more than 5x.

Figure 2: MySQL Cluster 7.2 Delivers 8.4x Higher Performance than 7.1

The flexAsynch benchmark was used to compare MySQL Cluster 7.2 performance to 7.1 across an 8-node Intel Xeon x5670-based cluster of dual socket commodity servers (6 cores each).

As the results demonstrate, MySQL Cluster 7.2 delivers over 8x higher performance per data node than MySQL Cluster 7.1.

More details of this and other benchmarks will be published in a new whitepaper – coming soon, so stay tuned!

In a following blog post, I’ll provide recommendations on optimum thread configurations for different types of server processor. You can also learn more from the Best Practices Guide to Optimizing Performance of MySQL Cluster.

Conclusion

MySQL Cluster has achieved a range of impressive benchmark results, and set in context with the previous 7.1 release, is able to deliver over 8x higher performance per node.

As a result, the multi-threaded data node extensions not only serve to increase performance of MySQL Cluster, they also enable users to achieve significantly improved levels of utilization from current and future generations of massively multi-core, multi-thread processor designs.

Tuesday May 29, 2012

Performance Testing of MySQL Cluster: The flexAsynch Benchmark

Following the release of MySQL Cluster 7.2, the Engineering team has been busy publishing a range of new performance benchmarks, most recently delivering 1.2 Billion UPDATE operations per Minute across a cluster of 30 x commodity Intel Xeon E5-based servers.

Figure 1: Linear Scaling of Write Operations

These performance tests have been run on the flexAsynch benchmark, so in this blog, I wanted to provide a little more detail on that benchmark, and provide guidance on how you can use it in your own performance evaluations.

FlexAsynch is an open source, highly adaptable test suite that can be downloaded as part of the MySQL Cluster source tarball under the <storage/ndb/test/ndbapi> directory.

An automated tool is available to run the benchmark, with full instructions documented in the README file packaged in the dbt2-0.37.50 tarball. The tarball also includes the scripts to run the benchmark, which are further described on the MySQL benchmark page.

The benchmark reads or updates an entire row from the database as part of its test operation. All UPDATE operations are fully transactional. As part of these tests, each row in this benchmark is 100 bytes total, comprising 25 columns, each 4 bytes in size, though the size and number of columns are fully configurable.

Database access from the application tier is via the C++ NDB API, one of the NoSQL interfaces implemented by MySQL Cluster that bypasses the SQL layer to communicate directly with the data nodes. Other NoSQL interfaces include the Memcached API, Java, JPA and HTTP/REST.

flexAsynch makes it possible to generate a variety of loads for MySQL Cluster benchmarking, enabling users to simulate their own environment.

Using the scripts in the dbt2-0.37.50 tarball and flexAsynch it is possible to configure a range of parameters, including:

  • The number of benchmark drivers and the number of threads per benchmark driver;
  • The number of simultaneous transactions executed per thread;
  • The size and number of records;
  • The type of operation performed (INSERT, UPDATE, DELETE, SELECT);
  • The size of the database.

By using flexAsynch, users can test the limits of MySQL Cluster performance. The size of the database used for testing is configurable, but given that flexAsynch can load more than 1GB of data per second into the MySQL Cluster database, users should ensure the database is large in order to achieve consistent benchmark numbers.
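
A quick back-of-the-envelope calculation shows why the database needs to be large. The Python sketch below uses the row layout described earlier (25 columns of 4 bytes) and assumes that 1GB means 10^9 bytes and that only row payload is counted, ignoring any storage overhead. The 10GB database size is a hypothetical example.

```python
COLUMNS = 25
BYTES_PER_COLUMN = 4
ROW_BYTES = COLUMNS * BYTES_PER_COLUMN  # 100 bytes per row

GB = 10**9  # assumption: decimal gigabytes


def rows_per_second(bytes_per_second, row_bytes=ROW_BYTES):
    """Rows loaded per second at a given payload rate."""
    return bytes_per_second // row_bytes


def seconds_to_load(db_bytes, bytes_per_second):
    """Time needed to load a database of db_bytes at the given rate."""
    return db_bytes / bytes_per_second


if __name__ == "__main__":
    print("row size:", ROW_BYTES, "bytes")
    print("rows per second at 1GB/s:", rows_per_second(1 * GB))
    print("seconds to load 10GB:", seconds_to_load(10 * GB, 1 * GB))
```

At that rate a 10GB database would be loaded in roughly ten seconds, so a small data set gives the benchmark very little time to reach a steady state.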

We will shortly publish a whitepaper that discusses MySQL Cluster benchmarking in more detail – so stay tuned for that. In the meantime, flexAsynch is fully available today for you to run your own tests. Also check out our Best Practices guide for optimizing the performance of MySQL Cluster.

Friday Mar 30, 2012

Guide to MySQL & NoSQL, Webinar Q&A

Yesterday we ran a webinar discussing the demands of next generation web services and how blending the best of relational and NoSQL technologies enables developers and architects to deliver the agility, performance and availability needed to be successful.

Attendees posted a number of great questions to the MySQL developers, serving to provide additional insights into areas like auto-sharding and cross-shard JOINs, replication, performance, client libraries, etc. So I thought it would be useful to post those below, for the benefit of those unable to attend the webinar.

Before getting to the Q&A, there are a couple of other resources that may be useful to those looking at NoSQL capabilities within MySQL:

- On-Demand webinar

- Slides used during the webinar

- Guide to MySQL and NoSQL whitepaper 

- MySQL Cluster demo, including NoSQL interfaces, auto-sharding, high availability, etc.

So here is the Q&A from the event 

Q. Where does MySQL Cluster fit in to the CAP theorem?

A. MySQL Cluster is flexible. A single Cluster will prefer consistency over availability in the presence of network partitions. A pair of Clusters can be configured to prefer availability over consistency. A full explanation can be found on the MySQL Cluster & CAP Theorem blog post. 

Q. Can you configure the number of replicas? (the slide used a replication factor of 1)

Yes. A cluster is configured by an .ini file. The option NoOfReplicas sets the number of originals and replicas: 1 = no data redundancy, 2 = one copy etc. Usually there's no benefit in setting it >2.

Q. Interestingly most (if not all) of the NoSQL databases recommend having 3 copies of data (the replication factor).   

Yes, with configurable quorum-based reads and writes. MySQL Cluster does not need a quorum of replicas online to provide service. Systems that require a quorum need > 2 replicas to be able to tolerate a single failure. Additionally, many NoSQL systems take liberal inspiration from the original GFS paper which described a 3 replica configuration. MySQL Cluster avoids the need for a quorum by using a lightweight arbitrator. You can configure more than 2 replicas, but this is a tradeoff between incrementally improved availability, and linearly increased cost.
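
The quorum arithmetic behind this answer can be sketched in a few lines of Python. The majority-quorum functions are standard arithmetic; the arbitrator function is a deliberately simplified model of a single node group (service continues while at least one replica survives and the arbitrator resolves any split), and all function names are hypothetical.

```python
def majority(replicas):
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


def failures_tolerated_with_quorum(replicas):
    """Replica failures a majority-quorum system can survive."""
    return replicas - majority(replicas)


def failures_tolerated_with_arbitrator(replicas):
    """Simplified model: one surviving replica per node group is enough,
    because the arbitrator decides which side continues after a split."""
    return replicas - 1


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, "replicas:",
              failures_tolerated_with_quorum(n), "failures with quorum,",
              failures_tolerated_with_arbitrator(n), "with arbitrator")
```

With 2 replicas a majority quorum tolerates no failures, which is why quorum-based systems need 3, whereas in the simplified arbitrator model 2 replicas already tolerate one failure.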

Q. Can you have cross node group JOINS? Wouldn't that run into the risk of flooding the network?

MySQL Cluster 7.2 supports cross node group joins. A full cross-join can require a large amount of data transfer, which may bottleneck on network bandwidth. However, for more selective joins, typically seen with OLTP and light analytic applications, cross node group joins give a great performance boost and network bandwidth saving over having the MySQL Server perform the join.

Q. Are the details of the benchmark available anywhere? According to my calculations it results in approx. 350k ops/sec per processor which is the largest number I've seen lately

The details are linked from Mikael Ronstrom's blog

The benchmark uses a benchmarking tool we call flexAsynch which runs parallel asynchronous transactions. It involved 100 byte reads, of 25 columns each. Regarding the per-processor ops/s, MySQL Cluster is particularly efficient in terms of throughput/node. It uses lock-free minimal copy message passing internally, and maximizes ID cache reuse. Note also that these are in-memory tables, there is no need to read anything from disk.

Q. Is access control (like table) planned to be supported for NoSQL access mode?

Currently we have not seen much need for full SQL-like access control (which has always been overkill for web apps and telco apps). So we have no plans, though especially with memcached it is certainly possible to turn-on connection-level access control. But specifically table level controls are not planned.

Q. How is the performance of the memcached API with MySQL against memcached+MySQL or any other Object Cache like Ecache with MySQL DB?

With the memcached API we generally see a memcached response in less than 1 ms, and a small cluster with one memcached server can handle tens of thousands of operations per second.

Q. Can .NET access the Memcached API?

Yes, just use a .NET memcached client such as the enyim or BeIT memcache libraries.

Q. Is the row level locking applicable when you update a column through memcached API?

An update that comes through memcached uses a row lock and then releases it immediately. Memcached operations like "INCREMENT" are actually pushed down to the data nodes. In most cases the locks are not even held long enough for a network round trip.

Q. Has anyone published an example using something like PHP? I am assuming that you just use the PHP memcached extension to hook into the memcached API. Is that correct?

Not that I'm aware of, but absolutely you can use it with PHP or any of the other drivers.
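
Because the API speaks the standard memcached protocol, any memcached client should work. As a language-neutral illustration, the Python sketch below builds raw memcached text-protocol commands (set, get and incr) and can send one over a socket. The host, port 11211 (the conventional memcached default) and the key names are assumptions; adjust them to however your memcached server for Cluster is started.

```python
import socket


def build_set(key, value, flags=0, exptime=0):
    """Build a text-protocol 'set' command for a UTF-8 string value."""
    data = value.encode("utf-8")
    header = f"set {key} {flags} {exptime} {len(data)}\r\n".encode("ascii")
    return header + data + b"\r\n"


def build_get(key):
    """Build a text-protocol 'get' command."""
    return f"get {key}\r\n".encode("ascii")


def build_incr(key, delta=1):
    """Build a text-protocol 'incr' command."""
    return f"incr {key} {delta}\r\n".encode("ascii")


def send_command(command, host="127.0.0.1", port=11211, timeout=2.0):
    """Send one raw command and return the first chunk of the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command)
        return sock.recv(4096)


if __name__ == "__main__":
    # Requires a reachable memcached-protocol server at the assumed address.
    print(send_command(build_set("user:1", "Ann")))
    print(send_command(build_get("user:1")))
```

In practice a client library (for PHP, the memcached extension mentioned in the question) hides these details; the sketch only shows what travels over the wire.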

Q. For beginners, we need more examples.

Take a look here for a fully worked example

Q. Can I access MySQL using Cobol (Open Cobol) or C and if so where can I find the coding libraries etc?

A. There is a COBOL implementation that works well with MySQL, but I do not think it is Open Cobol. Also there is a MySQL C client library that is a standard part of every MySQL distribution.

Q. Is there a place to go to find help when testing and/or implementing the NoSQL access?

If using Cluster then you can use the cluster@lists.mysql.com alias or post on the MySQL Cluster forum

Q. Are there any white papers on this? 

Yes - there is more detail in the MySQL Guide to NoSQL whitepaper

If you have further questions, please don’t hesitate to use the comments below!

Friday Feb 24, 2012

Where Would I Use MySQL Cluster?

MySQL Cluster has long been used in telecommunications network services for Subscriber Data Management (HLR/HSS), Service Delivery Platforms and Value-Added Services, and has also been deployed in certain parts of general web infrastructure.

Following the announcements of MySQL Cluster 7.2 General Availability, including new benchmarks demonstrating MySQL Cluster delivering 1 Billion Queries per Minute, I thought it might be worthwhile to highlight examples of use cases for MySQL Cluster.

Web-Based Payment & Financial Services Platforms

MySQL Cluster can be deployed across a range of applications including payment gateways, trading systems and customer service infrastructure.

Payment Gateways

- These are used by merchants to process customer payments

- The gateways need to integrate with multiple credit and debit card systems

- Multiple payment channels have to be supported, i.e. ePayment, mPayment, In-store, etc.

- MySQL Cluster can be used to record full transaction data, including customer & product information

- This data is persisted for set time periods to enable auditing and fraud detection

Web-Based Trading Systems

- MySQL Cluster can be deployed to support the trading engine, persisting the details of each trade

- MySQL Cluster also provides the storage layer for the store-and-forward messaging system used by traders and customers to track transactions

Customer Service Systems

- MySQL Cluster can be used as a command and control system, providing telephony, web portal and call desk integration

- Inbound calls are routed to customer services representatives and customer account details are retrieved in real-time

- Additional support for Integrated Voice Response systems enabling customer self-service

Core database requirements of these platforms include:

· ACID compliance to support transactional integrity

· Rapid scale-out to support growth in merchants, traders, customers and payment channels

· Very high insert and update rates

· Low, predictable latency to support real-time trading and customer experience

· 99.999% availability to guard against both outages and support on-line maintenance operations needed to seamlessly evolve services (i.e. adding nodes, upgrading schema, etc.)

· Low TCO to maximize trading margins

Session Management and eCommerce

Providing the back-end to on-line retail sites is an area where MySQL Cluster has a strong track record, providing the following services:

- Enabling a seamless experience as users log-in, search and browse products, and then place orders.

- Managing user accounts, storing each new user session, updating customer profiles and maintaining shopping carts

- Recording and tracking user behavior to integrate with merchandizing systems, enabling real-time cross-sell and upsell promotions

Database requirements for eCommerce include

· ACID compliance to support transactional integrity

· Elastic, on-demand scale-out using commodity hardware to support growing user and order volumes, and holiday season peaks

· Low, predictable latency to support a real time user experience

· High availability to avoid downtime resulting in lost sales and compromised customer satisfaction

· On-line schema changes to support the additions of new product categories or customer profiling attributes

Take a look at the MySQL Web Reference Architectures for best practices in scaling highly available, on-line retail sites

On-Line Gaming

With a huge growth in gamers, and gaming platforms, MySQL Cluster can be used to support the core gaming persistence layer:

- MySQL Cluster manages user accounts, gaming entitlements and session state (life-force, weapons, scores, etc.), along with leaderboards, all in real time

- Manages the eCommerce and billing platform (for in-game purchases)

- Command and control system across gaming platforms, integrating multiple services with avatars and devices

Again, the core requirements of the database include:

· Linear, on-demand scalability of both read and write operations to support the ramp in demand when new games gain traction

· High availability

· Low latency for a real-time gaming experience

Event Data and Content Management

Digital Advertising and Customer Relationship Management

MySQL Cluster can be used to capture customer campaign responses in real time

- Campaign responses are consolidated across multiple channels, including web, social media, SMS, and in-store responses.

- Data is replicated in batches to the MySQL InnoDB storage engine for analysis and reporting

Event Data Capture

MySQL Cluster is used to capture real-time data feeds & metadata from environmental sensors, devices and satellites. Data is then replicated to analysis platforms for transformation and processing

Database requirements include:

· The ability to support high volume insert and update rates, with zero data loss

· Scaled-out on commodity hardware

· Flexible replication topologies to other database engines and across data centers

How to Get Started

The above examples illustrate how MySQL Cluster can be used across a range of web-based services deployed on-premise or in the cloud.

If you have workloads that have similar demands, it’s worth taking a look at MySQL Cluster 7.2. The new MySQL Cluster Evaluation Guide provides best practices in quickly provisioning proof-of-concepts and benchmarking MySQL Cluster with your application.

We’d love to hear more about the types of workloads that you think would benefit from MySQL Cluster, so please use the comments section below and provide feedback.

Wednesday Feb 15, 2012

MySQL Cluster 7.2 GA Released, Delivers 1 BILLION Queries per Minute

70x Higher JOIN Performance, NoSQL Key-Value API & Cross Data Center Sharding with Synchronous Replication 

Oracle is delighted to announce the immediate availability of the production-ready, GA release of MySQL Cluster 7.2, available for download under the GPL, and as part of the commercial MySQL Cluster Carrier Grade Edition, including management tools, product certifications and 24x7 global support.

1 Billion Queries per Minute

MySQL Cluster delivered 1 billion queries per minute (17.6 million queries per second), scaled-out across 8x commodity Intel x86 server nodes, accessed by the NoSQL C++ NDB API.

It did this while maintaining 99.999% availability and complete data consistency across the cluster, demonstrating MySQL Cluster is a great choice for the most demanding web and telecoms services, whether deployed on-premise or in the cloud.
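
For readers who want to relate the per-second and per-minute figures, the conversion is simple arithmetic. The Python sketch below uses only the two numbers quoted above (17.6 million queries per second and 8 server nodes) and assumes the load was spread evenly across the nodes.

```python
QUERIES_PER_SECOND = 17_600_000
NODES = 8


def per_minute(queries_per_second):
    """Convert a per-second rate into a per-minute rate."""
    return queries_per_second * 60


def per_node(queries_per_second, nodes):
    """Average per-node rate, assuming an even spread across nodes."""
    return queries_per_second // nodes


if __name__ == "__main__":
    print("queries per minute:", per_minute(QUERIES_PER_SECOND))
    print("queries per second per node:", per_node(QUERIES_PER_SECOND, NODES))
```

The per-minute result comes out slightly above the headline figure of 1 billion, and the average works out to 2.2 million queries per second on each node.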

New Feature Overview

The MySQL Cluster 7.2 GA release builds upon the Development Milestones published over the past 9 months, which provided the community with an opportunity to test and provide feedback on the latest features.

MySQL Cluster 7.2 offers a range of new capabilities designed to enable the delivery of next generation web services, enhance cross data center scalability and improve ease-of-use:

- Enabling next generation web services:

o 70x higher complex query performance

o Native Memcached API

o 4x higher data node scalability

o Integration with the latest MySQL 5.5 server

o Support for Virtual Machine (VM) environments

- Enhancing cross data center scalability:

o New multi-site clustering with auto-sharding and synchronous replication between datacenters

o Improved active/active replication between data centers with eventual consistency

- Improved Ease-of-Use:

o Consolidated user privileges

o MySQL Cluster Manager 1.1.4

Read the MySQL Cluster 7.2 Developer Zone article to get the detail on all of the new features.

You can download the MySQL Cluster 7.2 New Features whitepaper for implementation details and how to get started, or join a forthcoming MySQL Cluster 7.2 webinar for your Time Zone to learn more.

Summary

MySQL Cluster 7.2 is the best release to date, enabling projects and applications to benefit from web-scalability with carrier-grade availability and developer agility.

You can review the MySQL Cluster 7.2 documentation, and also put questions to the development team and community via the MySQL Cluster forum.

We look forward to helping you in your new projects, and working with you to continue evolving MySQL Cluster to serve an even broader set of requirements in the future.

Monday Dec 19, 2011

Using MySQL Cluster to Protect & Scale the HDFS Namenode

The MySQL Cluster product team is always interested to see new and innovative uses of the database. Last week, a team of students at the KTH Royal Institute of Technology in Sweden blogged about their use of MySQL Cluster in creating a scalable and highly available HDFS Namenode. The blog has received some pretty wide coverage, but was first picked up by Alex Popescu at the myNoSQL site.

There are many established use cases of MySQL Cluster in the web, cloud/SaaS, telecoms and even flight control systems – you can see those we are allowed to talk about publicly here.

The KTH team has been working on a project to move all of the metadata from the HDFS / Hadoop namenode to MySQL Cluster. Why did they want to do this, you may ask? Well…:

- The namenode is a single point of failure. If it goes down, so too does the file system.

- As a single server, the namenode becomes a bottleneck within heavily loaded HDFS / Hadoop deployments. As server resources are consumed and write volumes increase, so the system can grind to a halt. (And with data volumes growing around 40% per year, this will only become more common!)

So KTH decided to move metadata storage to MySQL Cluster. Why, you may ask? Well….

- MySQL Cluster already offered them a replicated, shared-nothing database, distributed across commodity hardware.

- MySQL Cluster is widely deployed with proven stability

- The metadata can be distributed across nodes to scale out capacity, while retaining complete consistency to the clients and eliminating any Single Point of Failure

- Linear scaling of operations per second across the cluster, as new namenodes are added.

Access to the cluster is via the MySQL Cluster Connector for Java, providing a NoSQL, Java-based ORM with very low latency. You can learn more about this ClusterJ API here.

Of course, the work at KTH is on-going with future optimizations planned – which we will follow with interest.

So how can you determine if MySQL Cluster is the right choice for your new project? We have just updated our MySQL Cluster Evaluation Guide.

This update is based around the latest MySQL Cluster 7.2 Development Release which includes a series of enhancements to further broaden the use case of MySQL Cluster, including:

- 70x higher JOIN performance with Adaptive Query Localization pushing JOIN operations down to MySQL Cluster’s data nodes

- Native Key-Value Memcached interface to the cluster allowing schema and schemaless storage

- New cross-data center scalability enhancements

MySQL Cluster is not a fit for every use-case, but by downloading the Evaluation Guide, you’ll get a clear picture of where MySQL Cluster can be useful to you, and best practices in planning and executing your evaluation.

Let us know of other interesting use-cases in the comments below.

Wednesday Nov 02, 2011

MySQL Cluster, and NoSQL

Those are the topics we cover in the latest episode of our “Meet The MySQL Experts” podcast.

Mat Keep and Bernd Ocklin talk about new database requirements, and walk us through what's new in the second Development Milestone Release of MySQL Cluster 7.2, including impressive performance improvements, new NoSQL access via memcached, cross data center scalability, and more...

Enjoy the podcast!

Friday Oct 07, 2011

MySQL Cluster 7.2 (DMR2): NoSQL, Key/Value, Memcached

70x Higher Performance, Cross Data Center Scalability and New NoSQL Interface

It’s been an exciting week for all involved with MySQL Cluster, with the announcement of the second Development Milestone Release (7.2.1) at Oracle Open World. Highlights include:

- Enabling next generation web services: 70x higher complex query performance, native memcached API and integration with the latest MySQL 5.5 server

- Enhancing cross data center scalability: new multi-site clustering and enhanced active/active replication

- Simplified provisioning: consolidated user privileges.

You can download the DMR for evaluation now from: http://dev.mysql.com/downloads/cluster/ (select Development Milestone Release tab).

You can also read up on the detail of each of these features in the new article posted at the MySQL Developer Zone. In this blog, I’ll summarize the main parts of the announcement.

70x Higher Performance with Adaptive Query Localization (AQL)

Previewed as part of the first MySQL Cluster DMR, AQL is enabled by a new Index Statistics function that allows the SQL optimizer to build a better execution plan for each query.

As a result, JOIN operations are pushed down to the data nodes where the query executes in parallel on local copies of the data. A merged result set is then sent back to the MySQL Server, significantly enhancing performance by reducing network trips.

Take a look at how this is used by a web-based content management system to increase performance by 70x.

Adaptive Query Localization enables MySQL Cluster to better serve those use-cases that have the need to run real-time analytics across live data sets, along with high throughput OLTP operations. Examples include recommendations engines and clickstream analysis in web applications, pre-pay billing promotions in mobile telecoms networks or fraud detection in payment systems.

New NoSQL Interface and Schema-less Storage with the memcached API

The memcached interface released as an Early Access project with the first MySQL Cluster DMR is now integrated directly into the MySQL Cluster 7.2.1 trunk, enabling simpler evaluation.

The popularity of Key/Value stores has increased dramatically. With MySQL Cluster and the new memcached API, you have all the benefits of an ACID RDBMS, combined with the performance capabilities of a Key/Value store.

By default, every Key / Value is written to the same table with each Key / Value pair stored in a single row – thus allowing schema-less data storage. Alternatively, the developer can define a key-prefix so that each value is linked to a pre-defined column in a specific table.

Of course if the application needs to access the same data through SQL then developers can map key prefixes to existing table columns, enabling Memcached access to schema-structured data already stored in MySQL Cluster.

You can read more about the design goals and implementation of the memcached API for MySQL Cluster here.

Integration with MySQL 5.5

MySQL Cluster 7.2.1 is integrated with MySQL Server 5.5, providing binary compatibility to existing MySQL Server deployments. Users can now fully exploit the latest capabilities of both the InnoDB and MySQL Cluster storage engines within a single application.

Users simply install the new MySQL Cluster binary including the MySQL 5.5 release, restart the server and immediately have access to both InnoDB and MySQL Cluster!

Enhancing Cross Data Center Scalability: Simplified Active / Active Replication

MySQL Cluster has long offered Geographic Replication, distributing clusters to remote data centers to reduce the effects of geographic latency by pushing data closer to the user, as well as providing a capability for disaster recovery.

Geographic replication has always been designed around an Active / Active technology, so if applications are attempting to update the same row on different clusters at the same time, the conflict can be detected and resolved. With the release of MySQL Cluster 7.2.1, implementing Active / Active replication has become a whole lot simpler. Developers no longer need to implement and manage timestamp columns within their applications. Also, whole transactions can now be rolled back, rather than just individual operations.

You can learn more here.

Enhancing Cross Data Center Scalability: Multi-Site Clustering

MySQL Cluster 7.2.1 DMR provides a new option for cross data center scalability – multi-site clustering. For the first time splitting data nodes across data centers is a supported deployment option.

Improvements to MySQL Cluster’s heartbeating mechanism, with a new “ConnectivityCheckPeriod” parameter, enable greater resilience to temporary latency spikes on a WAN, thereby maintaining operation of the cluster.

With this deployment model, users can synchronously replicate updates between data centers without needing conflict detection and resolution, and automatically failover between those sites in the event of a node failure.

Users need to characterize their network bandwidth and latencies, and observe best practices in configuring both their network environment and Cluster. More guidance is available here.

User Privilege Consolidation

User privilege tables are now consolidated into the data nodes and centrally accessible by all MySQL servers accessing the cluster.

Previously the privilege tables were local to each MySQL server, meaning users and their associated privileges had to be managed separately on each server. By consolidating privilege data, users need only be defined once and managed centrally, saving Systems Administrators significant effort and reducing cost of operations.

Summary

The MySQL Cluster 7.2.1 DMR enables new classes of use-cases to benefit from web-scale performance with carrier-grade availability.  We also have a great webinar coming up on Wednesday October 19th  where the engineering and product management team will discuss the enhancements in more detail, and how you can use them today. You can sign up here.

You can download the DMR for evaluation now from: http://dev.mysql.com/downloads/cluster/ (select Development Milestone Release tab).

You can learn more about the MySQL Cluster architecture from our Guide to scaling web databases

Let us know what you think of these enhancements directly in comments of this or the associated blogs. We look forward to working with the community to perfect these new features.

Monday Oct 03, 2011

Synchronously Replicating Databases Across Data Centers – Are you Insane?

 

Well actually….no. The second Development Milestone Release of MySQL Cluster 7.2 introduces support for what we call “Multi-Site Clustering”. In this post, I’ll provide an overview of this new capability, and the factors you need to weigh when considering it as a deployment option to scale geographically dispersed database services.

You can read more about MySQL Cluster 7.2.1 in the article posted on the MySQL Developer Zone.

MySQL Cluster has long offered Geographic Replication, distributing clusters to remote data centers to reduce the effects of geographic latency by pushing data closer to the user, as well as providing a capability for disaster recovery.

Multi-Site Clustering provides a new option for cross data center scalability. For the first time splitting data nodes across data centers is a supported deployment option. With this deployment model, users can synchronously replicate updates between data centers without needing to modify their application or schema for conflict handling, and automatically failover between those sites in the event of a node failure.

MySQL Cluster offers high availability by maintaining a configurable number of data replicas.  All replicas are synchronously maintained by a built-in 2 phase commit protocol.  Data node and communication failures are detected and handled automatically.  On recovery, data nodes automatically rejoin the cluster, synchronize with running nodes, and resume service.

All replicas of a given row are stored in a set of data nodes known as a nodegroup.  To provide service, a cluster must have at least one data node from each nodegroup available at all times.  When the cluster detects that the last node in a nodegroup has failed, the remaining cluster nodes will be gracefully shutdown, to ensure the consistency of the stored databases on recovery.
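
The nodegroup rule above can be expressed as a small check. This is a minimal sketch of the rule as described in this post, not the cluster's actual arbitration logic; the function name and the node numbering are hypothetical.

```python
def cluster_can_serve(nodegroups, alive_nodes):
    """Return True if every nodegroup still has at least one live data node.

    nodegroups:  dict mapping a nodegroup id to the ids of its data nodes
    alive_nodes: ids of the data nodes that are currently running
    """
    alive = set(alive_nodes)
    # The cluster can provide service only if no nodegroup has lost all members.
    return all(set(members) & alive for members in nodegroups.values())


# Hypothetical layout: 4 data nodes, 2 replicas, so 2 nodegroups of 2 nodes.
layout = {0: [1, 2], 1: [3, 4]}

print(cluster_can_serve(layout, [1, 3]))   # one survivor in each nodegroup
print(cluster_can_serve(layout, [1, 2]))   # nodegroup 1 has lost both nodes
```

In a multi-site deployment this suggests placing the members of each nodegroup in different data centers, so that losing one site still leaves a replica of every row available.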

Improvements to the heartbeating mechanism used by MySQL Cluster enable greater resilience to temporary latency spikes on a WAN, thereby maintaining operation of the cluster. A new “ConnectivityCheck” mechanism is introduced, which must be explicitly configured. This extra mechanism adds messaging overheads and failure handling latency, and so is not switched on by default.

When configuring Multi-Site clustering, the following factors must be considered:

Bandwidth
Low bandwidth between data nodes can slow data node recovery.  In normal operation, the available bandwidth can limit the maximum system throughput.  If link saturation causes latency on individual links to increase, then node failures, and potentially cluster failure could occur.

Latency and performance
Synchronously committing transactions over a wide area increases the latency of operation execution and commit, therefore individual operations are slowed. To maintain the same overall throughput, higher client concurrency is required.  With the same client concurrency level, throughput will decrease relative to a lower latency configuration.
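
The relationship between latency, concurrency and throughput can be approximated with Little's Law (throughput = concurrency / latency). The sketch below is a simplified model with illustrative numbers, not measurements from MySQL Cluster; it assumes each client issues one transaction at a time and waits for the commit.

```python
import math


def throughput_tps(concurrency, latency_ms):
    """Approximate transactions per second for a given number of
    concurrent clients and a per-transaction latency in milliseconds."""
    return concurrency * 1000.0 / latency_ms


def concurrency_needed(target_tps, latency_ms):
    """Clients needed to sustain target_tps at the given latency."""
    return math.ceil(target_tps * latency_ms / 1000.0)


# Illustrative figures only: 2 ms commit latency on a LAN, 20 ms over a WAN.
lan_tps = throughput_tps(concurrency=100, latency_ms=2.0)
wan_tps = throughput_tps(concurrency=100, latency_ms=20.0)
clients_for_same_tps = concurrency_needed(target_tps=lan_tps, latency_ms=20.0)

print(lan_tps, wan_tps, clients_for_same_tps)
```

Under these assumptions, a tenfold increase in latency needs ten times the client concurrency to hold throughput constant.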

Latency and stability
Synchronous operation implies that clients wait to hear of the success or failure of each operation before continuing. Loss of communication to a node, and high latency communication to a node are indistinguishable in some cases.  To ensure availability, the Cluster monitors inter-node communication.  If a node experiences high communication latency, then it may be killed by another node, to prevent its high latency causing service loss.

Where inter-node latencies fluctuate, and are in the same range as the node-latency-monitoring trigger levels, node failures can result.  Node failures are expensive to recover from, and endanger Cluster availability. 

To avoid node failures, either the latency should be reduced, or the trigger levels should be raised.  Raising trigger levels can result in a longer time-to-detection of communication problems.

WAN latencies
Latency on an IP WAN may be a function of physical distance, routing hops, protocol layering, link failover times and rerouting times. The maximum expected latency on a link should be characterized as input to the cluster configuration.

Survivability of node failures
MySQL Cluster uses a fail fast mechanism to minimize time-to-recovery. Nodes that are suspected of being unreachable or dead are quickly excluded from the Cluster.  This mechanism is simple and fast, but sometimes takes steps that result in unnecessary cluster failure.  For this reason, latency trigger levels should be configured a safe margin above the maximum latency variation on inter-data node links.
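
One way to reason about a "safe margin" is to derive a minimum trigger level from measured link latencies. The sketch below is only an illustration: the function names, the sample values and the safety factor of 2 are all assumptions, and the result still has to be mapped onto the actual cluster timeout parameters described in the documentation.

```python
def min_trigger_level_ms(latency_samples_ms, safety_factor=2.0):
    """Smallest trigger level (ms) that stays a safety_factor above the
    worst latency observed on the inter-data node links."""
    if not latency_samples_ms:
        raise ValueError("at least one latency sample is required")
    return max(latency_samples_ms) * safety_factor


def trigger_is_safe(trigger_ms, latency_samples_ms, safety_factor=2.0):
    """True if the configured trigger level keeps the chosen margin."""
    return trigger_ms >= min_trigger_level_ms(latency_samples_ms, safety_factor)


# Hypothetical round-trip samples (ms) from a WAN link, including a spike.
samples = [5, 7, 6, 12, 20]

print(min_trigger_level_ms(samples))
print(trigger_is_safe(25, samples))
print(trigger_is_safe(50, samples))
```

A larger safety factor reduces the risk of unnecessary node failures, at the cost of a longer time-to-detection for genuine communication problems.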

Users can configure various MySQL Cluster parameters including heartbeats, the ConnectivityCheckPeriod, GCP timeouts and transaction deadlock timeouts. You can read more about these parameters in the documentation.

Recommendations for Multi-Site Clustering
- Ensure minimal, stable latency;
- Provision the network with sufficient bandwidth for the expected peak load - test with node recovery and system recovery;
- Configure the heartbeat period to ensure a safe margin above latency fluctuations;

- Configure the ConnectivityCheckPeriod to avoid unnecessary node failures;

- Configure other timeouts accordingly including the GCP timeout, transaction deadlock timeout, and transaction inactivity timeout.

Example
The following is a recommendation of latency and bandwidth requirements for applications with high throughput and fast failure detection requirements:
- latency between remote data nodes must not exceed 20 milliseconds;
- bandwidth of the network link must be more than 1 Gigabit per Second.

For applications that do not require this type of stringent operating environment, latency and bandwidth can be relaxed, subject to the testing recommended above.

As the recommendations demonstrate, there are a number of factors that need to be considered before deploying multi-site clustering. For geo-redundancy, Oracle recommends Geographic Replication, but multi-site clustering does present an alternative deployment, subject to the considerations and constraints discussed above.

You can learn more about scaling web databases with MySQL Cluster from our new Guide.  We look forward to hearing your experiences with the new MySQL Cluster 7.2.1 DMR!

Thursday Sep 29, 2011

MySQL HA Solutions: New Guide Available

Databases are the center of today’s web, enterprise and embedded applications, storing and protecting an organization’s most valuable assets and supporting business-critical applications. Just minutes of downtime can result in significant lost revenue and dissatisfied customers. Ensuring database high availability is therefore a top priority for any organization.

The new MySQL Guide to High Availability solutions is designed to navigate users through the HA maze, discussing:

- The causes, effects and impacts of downtime;

- Methodologies to select the right HA solution;

- Different approaches to delivering highly available MySQL services;

- Operational best practices to meet Service Level Agreements (SLAs).

As discussed in the new Guide, selecting the high availability solution that is appropriate for your application depends upon 3 core principles:

- The level of availability required to meet business objectives, within budgetary constraints;

- The profile of application being deployed (i.e. concurrent users, requests per second, etc.);

- Operational standards within each data center.

Recognizing that each application or service has different operational and availability requirements, the guide discusses the range of certified and supported High Availability (HA) solutions – from internal departmental applications all the way through to geographically redundant, multi-data center systems delivering 99.999% availability (i.e. less than 5 ½ minutes of downtime per year) supporting transactional web services, communications networks, cloud and hosting environments, etc.
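
The downtime figure quoted for 99.999% availability follows from simple arithmetic, sketched here using an average year of 365.25 days (an assumption; a 365-day year gives a marginally smaller number).

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # 525,960 minutes in an average year


def downtime_minutes_per_year(availability_percent):
    """Maximum yearly downtime (minutes) allowed by an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100.0)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_minutes_per_year(target):.2f} minutes per year")
```

For "five nines" this works out to roughly 5.26 minutes per year, consistent with the "less than 5 ½ minutes" quoted above.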

By combining the right technology with the right skills and processes, users can achieve business continuity, while developers and DBAs can sleep tight at night! Download the guide to learn more.

Friday Aug 05, 2011

Scaling Web Databases, Part 3: SQL & NoSQL Data Access

Supporting successful services on the web means scaling your back-end databases across multiple dimensions. This blog focuses on scaling access methods to your data using SQL and/or NoSQL interfaces.

In Part 1 of the blog series, I discussed scaling database performance using auto-sharding and active/active geographic replication in MySQL Cluster to enable applications to scale both within and across data centers.

In Part 2, I discussed the need to scale operational agility to keep pace with demand, which includes being able to add capacity and performance to the database, and to evolve the schema – all without downtime.

So in this blog I want to explore another dimension to scalability -  how multiple interfaces can be used to scale access to the database, enabling users to simultaneously serve multiple applications, each with distinct access requirements.

Data Access Interfaces to MySQL Cluster

MySQL Cluster automatically shards tables across pools of commodity data nodes, rather than store those tables in a single MySQL Server. It is therefore able to present multiple interfaces to the database, giving developers a choice between:

- SQL for complex reporting-type queries;

- Simple Key/Value interfaces bypassing the SQL layer for blazing fast reads & writes;

- Real-time interfaces for micro-second latency, again bypassing the SQL layer.

With this choice of interfaces, developers are free to work in their own preferred environments, enhancing productivity and agility and enabling them to innovate faster.

SQL or NoSQL - Selecting the Right Interface

The following chart shows all of the access methods available to the database. The native API for MySQL Cluster is the C++ based NDB API. All other interfaces access the data through the NDB API.

At the extreme right hand side of the chart, an application has embedded the NDB API library enabling it to make native C++ calls to the database, and therefore delivering the lowest possible latency.

On the extreme left hand side of the chart, MySQL presents a standard SQL interface to the data nodes, and provides connectivity to all of the standard MySQL connectors including:

- Common web development languages and frameworks, i.e. PHP, Perl, Python, Ruby, Ruby on Rails, Spring, Django, etc;

- JDBC (for additional connectivity into ORMs including EclipseLink, Hibernate, etc)

- .NET

- ODBC

Whichever API is chosen for an application, it is important to emphasize that all of these SQL and NoSQL access methods can be used simultaneously, across the same data set, to provide the ultimate in developer flexibility. Therefore, MySQL Cluster may be supporting any combination of the following services, in real-time:

- Relational queries using the SQL API;

- Key/Value-based web services using the REST/JSON and memcached APIs;

- Enterprise applications with the ClusterJ and JPA APIs;

- Real-time web services (i.e. presence and location based) using the NDB API.

The following figure aims to summarize the capabilities and use-cases for each API.

Schema-less Data Store with the memcached API

As part of the MySQL Cluster 7.2 Development Milestone Release, Oracle announced the preview of native memcached Key/Value API support for MySQL Cluster enabling direct access to the database from the memcached API without passing through the SQL layer. You can read more about the implementation and how to get going with it in this excellent post from Andrew Morgan.

The following image shows the implementation of the memcached API for MySQL Cluster.


Implementation is simple - the application sends read and write requests to the memcached process (using the standard memcached API). This in turn invokes the Memcached Driver for NDB (which is part of the same process), which in turn calls the NDB API for very quick access to the data held in MySQL Cluster’s data nodes.

The solution has been designed to be very flexible, allowing the application architect to find a configuration that best fits their needs. It is possible to co-locate the memcached API in either the data nodes or application nodes, or alternatively within a dedicated memcached layer.

The benefit of this approach is that users can configure behavior on a per-key-prefix basis (through tables in MySQL Cluster) and the application doesn’t have to care – it just uses the memcached API and relies on the software to store data in the right place(s) and to keep everything synchronized.

By default, every Key / Value is written to the same table with each Key / Value pair stored in a single row – thus allowing schema-less data storage. Alternatively, the developer can define a key-prefix so that each value is linked to a pre-defined column in a specific table.

Of course if the application needs to access the same data through SQL then developers can map key prefixes to existing table columns, enabling Memcached access to schema-structured data already stored in MySQL Cluster.
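
The key-prefix behavior described above can be illustrated with a toy, in-memory model. This is not the Memcached Driver for NDB or its configuration schema: the class, the table and column names, and the choice to strip the prefix before using the remainder as the row key are all assumptions made to show the routing idea.

```python
class ToyKeyPrefixStore:
    """In-memory stand-in for prefix-based routing of memcached keys."""

    def __init__(self, default_table="kv_default", default_column="value"):
        self.default_route = (default_table, default_column)
        self.routes = {}   # key prefix -> (table, column)
        self.tables = {}   # table -> row key -> {column: value}

    def add_route(self, prefix, table, column):
        """Link keys starting with prefix to one column of one table."""
        self.routes[prefix] = (table, column)

    def _resolve(self, key):
        # The longest matching prefix wins; unmatched keys use the default table.
        matches = [p for p in self.routes if key.startswith(p)]
        if not matches:
            return self.default_route, key
        prefix = max(matches, key=len)
        return self.routes[prefix], key[len(prefix):]

    def set(self, key, value):
        (table, column), row_key = self._resolve(key)
        row = self.tables.setdefault(table, {}).setdefault(row_key, {})
        row[column] = value

    def get(self, key):
        (table, column), row_key = self._resolve(key)
        return self.tables.get(table, {}).get(row_key, {}).get(column)


store = ToyKeyPrefixStore()
store.add_route("user:", table="users", column="name")

store.set("session-abc", "opaque blob")   # schema-less: lands in the default table
store.set("user:42", "alice")             # mapped to users.name for row key "42"

print(store.get("user:42"))
print(store.tables)
```

In this model the application only ever calls set and get; where each value is stored is decided entirely by the prefix configuration.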

Summary

MySQL Cluster provides developers and architects with a huge amount of flexibility in accessing their persistent data stores - a reflection that one size no longer fits all in the world of web services and databases.

You can learn more about this, and the other dimensions to scaling web databases in our new Guide. 

As ever, let me know your thoughts in the comments below. 


Thursday Jul 21, 2011

Scaling Web Databases, Part 2: Adding Nodes, Evolving Schema with Zero Downtime

In my previous post, I discussed scaling web database performance in MySQL Cluster using auto-sharding and active/active geographic replication - enabling users to scale both within and across data centers.  

I also mentioned that while scaling write-performance of any web service is critical, it is only one of multiple dimensions to scalability, which include:

- The need to scale operational agility to keep pace with demand. This means being able to add capacity and performance to the database, and to evolve the schema – all without downtime;

- The need to scale queries by having flexibility in the APIs used to access the database – including SQL and NoSQL interfaces;

- The need to scale the database while maintaining continuous availability.

All of these subjects are discussed in more detail in our new Scaling Web Databases guide.

In this posting, we look at scaling operational agility. 

As a web service gains in popularity it is important to be able to evolve the underlying infrastructure seamlessly, without incurring downtime and without having to add lots of additional DBA or developer resource.

Users may need to increase the capacity and performance of the database; enhance their application (and therefore their database schema) to deliver new capabilities; and upgrade their underlying platforms.

MySQL Cluster can perform all of these operations and more on-line – without interrupting service to the application or clients.  

On-Line, On-Demand Scaling

MySQL Cluster allows users to scale both database performance and capacity by adding Application and Data Nodes on-line, enabling users to start with small clusters and then scale them on-demand, without downtime, as a service grows. Scaling could be the result of more users, new application functionality or more applications needing to share the database.

In the following example, the cluster on the left is configured with two application and data nodes and a single management server.  As the service grows, the users are able to scale the database and add management redundancy – all of which can be performed as an online operation.  An added advantage of scaling the Application Nodes is that they provide elasticity in scaling, so can be scaled back down if demand to the database decreases.

When new data nodes and node groups are added, the existing nodes in the cluster initiate a rolling restart to reconfigure for the new resource.  This rolling restart ensures that the cluster remains operational during the addition of new nodes.  Tables are then repartitioned and redundant rows are deleted with the OPTIMIZE TABLE command.  All of these operations are transactional, ensuring that a node failure during the add-node process will not corrupt the database.
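
The repartitioning step can be sketched as a small helper that generates the SQL to run once the new node group is in place. It assumes the ALTER ONLINE TABLE ... REORGANIZE PARTITION syntax documented for MySQL Cluster 7.1; the helper name and the table names are hypothetical, and the statements should be checked against the documentation for the release in use.

```python
def rebalance_statements(tables):
    """Build the SQL that redistributes each table across all node groups
    and then reclaims the space left behind on the original nodes."""
    statements = []
    for table in tables:
        # Repartition on-line so existing rows spread onto the new data nodes.
        statements.append(f"ALTER ONLINE TABLE `{table}` REORGANIZE PARTITION;")
        # Free the space held by rows that moved to the new nodes.
        statements.append(f"OPTIMIZE TABLE `{table}`;")
    return statements


for statement in rebalance_statements(["orders", "customers"]):
    print(statement)
```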

The operations can be performed manually from the command line or automated with MySQL Cluster Manager, part of the commercial MySQL Cluster Carrier Grade Edition.

On-Line Cluster Maintenance

With its shared-nothing architecture, it is possible to avoid database outages by using rolling restarts to not only add but also upgrade nodes within the cluster.  Using this approach, users can:

- Upgrade or patch the underlying hardware and operating system;

- Upgrade or patch MySQL Cluster, with full online upgrades between releases.

MySQL Cluster supports on-line, non-blocking backups, ensuring service interruptions are again avoided during this critical database maintenance task.  Users are able to exercise fine-grained control when restoring a MySQL Cluster from backup using ndb_restore. Users can restore only specified tables or databases, or exclude specific tables or databases from being restored, using ndb_restore options --include-tables, --include-databases, --exclude-tables, and --exclude-databases.
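
A selective restore might be assembled as follows. The four include/exclude options come from the text above; the -n (node id), -b (backup id), -r (restore data) and --backup_path options are the ones commonly shown in the ndb_restore documentation, but treat them, the helper name and the example values as assumptions to verify against your release. The helper only builds the argument list and does not run anything.

```python
def ndb_restore_args(node_id, backup_id, backup_path,
                     include_tables=(), include_databases=(),
                     exclude_tables=(), exclude_databases=()):
    """Build an ndb_restore argument list for one data node's backup files."""
    args = ["ndb_restore", "-n", str(node_id), "-b", str(backup_id), "-r",
            f"--backup_path={backup_path}"]
    filters = [
        ("--include-tables", include_tables),        # db_name.tbl_name entries
        ("--include-databases", include_databases),
        ("--exclude-tables", exclude_tables),
        ("--exclude-databases", exclude_databases),
    ]
    for option, names in filters:
        if names:
            # Each filter takes a comma-separated list of names.
            args.append(f"{option}={','.join(names)}")
    return args


# Hypothetical example: restore only two tables from backup 5 for node 1.
command = ndb_restore_args(1, 5, "/backups/BACKUP-5",
                           include_tables=["shop.orders", "shop.items"])
print(" ".join(command))
```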

On-Line Schema Evolution

As services evolve, developers often want to add new functionality, which in many instances may demand updating the database schema.  

This operation can be very disruptive for many databases, with ALTER TABLE commands taking the database offline for the duration of the operation.  When users have large tables with many millions of rows, downtime can stretch into hours or even days.

MySQL Cluster supports on-line schema changes, enabling users to add new columns and tables and add and remove indexes – all while continuing to serve read and write requests, and without affecting response times.

Unlike other on-line schema update solutions, MySQL Cluster does not need to create temporary tables, therefore avoiding the user having to provision double the usual memory or disk space in order to complete the operation.
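
As a rough illustration, on-line schema changes of this kind might be generated like this. The sketch assumes the ALTER ONLINE TABLE and CREATE ONLINE INDEX syntax of the MySQL Cluster 7.1 era, and that a column added on-line has to be nullable with a dynamic column format; the helper and object names are hypothetical, so check the restrictions for your release before relying on them.

```python
def online_add_column(table, column, sql_type):
    """SQL to add a nullable, dynamic-format column without taking the table offline."""
    return (f"ALTER ONLINE TABLE `{table}` "
            f"ADD COLUMN `{column}` {sql_type} NULL COLUMN_FORMAT DYNAMIC;")


def online_add_index(table, index, columns):
    """SQL to create an index while the table keeps serving reads and writes."""
    column_list = ", ".join(f"`{c}`" for c in columns)
    return f"CREATE ONLINE INDEX `{index}` ON `{table}` ({column_list});"


print(online_add_column("subscribers", "loyalty_tier", "INT"))
print(online_add_index("subscribers", "idx_tier", ["loyalty_tier"]))
```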

Summary

So in addition to scaling write performance, MySQL Cluster can also scale operational agility.  I'll post more on scaling of data access methods and availability levels over the next few weeks.

You can read more about all of these capabilities in the new Scaling Web Databases guide.  

And of course, you can try MySQL Cluster out for yourself - it's available under the GPL:

The GA release is 7.1 which can be downloaded here, but I'd recommend taking a look at the latest Development Milestone Release for MySQL Cluster 7.2 which has some great new capabilities (localized JOIN operations, simpler provisioning, etc) which can be downloaded from here (select the Development Releases tab).

As ever, let me know if there are other dimensions of scalability that I should be discussing.

Monday Jul 18, 2011

Simpler and Safer Clustering: MySQL Cluster Manager Update

Clustered computing brings with it many benefits: high performance, high availability, scalable infrastructure, etc. But it also brings with it more complexity.

Why?

Well, by its very nature, there are more “moving parts” to monitor and manage, from physical, virtual and logical hosts to clustering software to redundant networking components – the list goes on. And a cluster that isn’t effectively provisioned and managed will cause more downtime than the standalone systems it is designed to improve upon.

When it comes to the database industry, analysts already estimate that 50% of a typical database’s Total Cost of Ownership is attributable to staffing and downtime costs. These costs will only increase if a database cluster is not effectively monitored and managed.

Monitoring and management has been a major focus in the development of the MySQL Cluster database, and as part of this focus, the latest release of MySQL Cluster Manager (MCM) hit General Availability last week. You can read all about it in Andrew Morgan's blog.

MySQL Cluster Manager 1.1.1 makes it much simpler to get up and running, to manage the cluster and to allow multiple clusters to be managed from a single process.

MySQL Cluster Manager is part of the commercial Carrier-Grade Edition but anyone is free to download and use MySQL Cluster Manager without obligation for 30 days. This is a great way for those new to MySQL Cluster to rapidly configure and provision their first cluster.

All you need do is:

1. Go to Oracle eDelivery

2. Enter some basic details and click through the agreement

3. Select “MySQL Product Pack”, then your platform, then Go

Not only does MCM make the management of MySQL Cluster simpler, it also makes it safer. One of the largest causes of downtime is administrator error, and here MySQL Cluster Manager can significantly reduce risk.

Consider the task of upgrading from one release of MySQL Cluster to another. This can be performed as an on-line operation, using rolling restarts to apply upgrades while still serving read and write requests. It's just one of the many operations users can perform on-line (such as adding data nodes, upgrading schema and taking backups), all of which enable MySQL Cluster to achieve 99.999% uptime.

Using a manual upgrade method on a cluster configured with 4 x data nodes, 2 x MySQL Server application nodes and 2 x management nodes, the administrator would be typing 46 manual commands in an operation that would take around 2½ hours to complete. The steps are shown below:

1 x preliminary check of cluster state

8 x ssh commands per server

8 x per-process stop commands

4 x scp of configuration files (2 x mgmd & 2 x mysqld)

8 x per-process start commands

8 x checks for started and re-joined processes

8 x process completion verifications

1 x verify completion of the whole cluster.

Excludes manual editing of each configuration file.
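As a quick sanity check on the figure of 46 commands, the steps listed above can be tallied in a few lines of Python. This is only an illustrative sketch: the dictionary and function names are made up for this example, and the counts are simply the ones quoted in the list.

```python
# Tally of the manual upgrade steps quoted above, for a cluster with
# 4 data nodes, 2 MySQL Server nodes and 2 management nodes.
# Names here are illustrative only; the counts come from the list above.
manual_steps = {
    "preliminary check of cluster state": 1,
    "ssh commands": 8,
    "per-process stop commands": 8,
    "scp of configuration files (2 x mgmd & 2 x mysqld)": 4,
    "per-process start commands": 8,
    "checks for started and re-joined processes": 8,
    "process completion verifications": 8,
    "verify completion of the whole cluster": 1,
}


def total_commands(steps):
    """Return the total number of manual commands across all steps."""
    return sum(steps.values())


# Number of processes in the example cluster (data + mysqld + mgmd).
processes = 4 + 2 + 2

if __name__ == "__main__":
    print(f"{processes} processes, {total_commands(manual_steps)} manual commands")
```

Most of the per-step counts equal the number of processes, which is why the manual effort grows as the cluster does.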

Now compare this to using MySQL Cluster Manager:

upgrade cluster --package=7.1 mycluster;

Just one command, then walk away and leave it to run.

Note – both of the processes above exclude the preparation steps of copying the new software package to each host and defining where it's located. The total operation times are based on a DBA restarting 4 x MySQL Cluster Data Nodes, each with 6GB of data, and performing 10,000 operations per second.

You can learn more about MySQL Cluster Manager from our new whitepaper and on-line demo.

We also have an on-demand webinar which covers MySQL Cluster Manager as well as other complementary methods for managing a MySQL Cluster environment:

* NDBINFO: released with MySQL Cluster 7.1, NDBINFO presents real-time status and usage statistics, providing developers and DBAs with a simple means of pro-actively monitoring and optimizing database performance and availability.

* MySQL Cluster Advisors & Graphs: part of the MySQL Enterprise Monitor and available in the commercial MySQL Cluster Carrier Grade Edition, the Enterprise Advisor includes automated best practice rules that alert on key performance and availability metrics from MySQL Cluster data nodes.

While managing clusters will never be easy, it keeps getting a whole lot simpler!
