Monday Oct 03, 2011

Synchronously Replicating Databases Across Data Centers – Are you Insane?

 

Well, actually… no. The second Development Milestone Release of MySQL Cluster 7.2 introduces support for what we call “Multi-Site Clustering”. In this post, I’ll provide an overview of this new capability and the factors you need to weigh when considering it as a deployment option to scale geographically dispersed database services.

You can read more about MySQL Cluster 7.2.1 in the article posted on the MySQL Developer Zone.

MySQL Cluster has long offered Geographic Replication, distributing clusters to remote data centers to reduce the effects of geographic latency by pushing data closer to the user, as well as providing a capability for disaster recovery.

Multi-Site Clustering provides a new option for cross data center scalability. For the first time, splitting data nodes across data centers is a supported deployment option. With this deployment model, users can synchronously replicate updates between data centers without needing to modify their application or schema for conflict handling, and automatically fail over between those sites in the event of a node failure.

MySQL Cluster offers high availability by maintaining a configurable number of data replicas.  All replicas are synchronously maintained by a built-in 2 phase commit protocol.  Data node and communication failures are detected and handled automatically.  On recovery, data nodes automatically rejoin the cluster, synchronize with running nodes, and resume service.

All replicas of a given row are stored in a set of data nodes known as a nodegroup.  To provide service, a cluster must have at least one data node from each nodegroup available at all times.  When the cluster detects that the last node in a nodegroup has failed, the remaining cluster nodes will be gracefully shutdown, to ensure the consistency of the stored databases on recovery.

Improvements to the heartbeating mechanism used by MySQL Cluster enable greater resilience to temporary latency spikes on a WAN, thereby maintaining operation of the cluster. A new “ConnectivityCheck” mechanism is introduced, which must be explicitly configured. This extra mechanism adds messaging overhead and failure-handling latency, and so is not switched on by default.

When configuring Multi-Site clustering, the following factors must be considered:

Bandwidth
Low bandwidth between data nodes can slow data node recovery.  In normal operation, the available bandwidth can limit the maximum system throughput.  If link saturation causes latency on individual links to increase, then node failures, and potentially cluster failure could occur.

Latency and performance
Synchronously committing transactions over a wide area increases the latency of operation execution and commit, therefore individual operations are slowed. To maintain the same overall throughput, higher client concurrency is required.  With the same client concurrency level, throughput will decrease relative to a lower latency configuration.

Latency and stability
Synchronous operation implies that clients wait to hear of the success or failure of each operation before continuing. Loss of communication to a node, and high latency communication to a node are indistinguishable in some cases.  To ensure availability, the Cluster monitors inter-node communication.  If a node experiences high communication latency, then it may be killed by another node, to prevent its high latency causing service loss.

Where inter-node latencies fluctuate, and are in the same range as the node-latency-monitoring trigger levels, node failures can result.  Node failures are expensive to recover from, and endanger Cluster availability. 

To avoid node failures, either the latency should be reduced, or the trigger levels should be raised.  Raising trigger levels can result in a longer time-to-detection of communication problems.
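As a rough back-of-envelope illustration of that trade-off (the four-missed-heartbeats rule below is a simplification of the actual protocol; check the HeartbeatIntervalDbDb documentation for the exact behaviour):

```shell
# A data node is typically declared dead after roughly four consecutive
# missed heartbeats, so time-to-detection scales with the heartbeat interval.
for interval_ms in 1500 5000; do
  echo "HeartbeatIntervalDbDb=${interval_ms} -> ~$((interval_ms * 4)) ms to detect a failed node"
done
```

Raising the interval from 1.5 s to 5 s gives more headroom above WAN latency spikes, but stretches worst-case failure detection from about 6 s to about 20 s.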

WAN latencies
Latency on an IP WAN may be a function of physical distance, routing hops, protocol layering, link failover times and rerouting times. The maximum expected latency on a link should be characterized as input to the cluster configuration.

Survivability of node failures
MySQL Cluster uses a fail fast mechanism to minimize time-to-recovery. Nodes that are suspected of being unreachable or dead are quickly excluded from the Cluster.  This mechanism is simple and fast, but sometimes takes steps that result in unnecessary cluster failure.  For this reason, latency trigger levels should be configured with a safe margin above the maximum latency variation on inter-data node links.

Users can configure various MySQL Cluster parameters, including heartbeats, ConnectivityCheck, GCP timeouts and transaction deadlock timeouts. You can read more about these parameters in the documentation.

Recommendations for Multi-Site Clustering
- Ensure minimal, stable latency;
- Provision the network with sufficient bandwidth for the expected peak load, and test with node recovery and system recovery;
- Configure the heartbeat period to ensure a safe margin above latency fluctuations;
- Configure the ConnectivityCheckPeriod to avoid unnecessary node failures;
- Configure other timeouts accordingly, including the GCP timeout, transaction deadlock timeout, and transaction inactivity timeout.
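As a minimal sketch, these settings live in the [ndbd default] section of the cluster’s config.ini. The parameter names below are taken from the MySQL Cluster documentation (the connectivity check appears there as ConnectCheckIntervalDelay), but the values are purely illustrative and must be tuned against the measured latency of your own links:

```ini
[ndbd default]
# Heartbeat interval between data nodes: set with a safe margin
# above observed WAN latency spikes.
HeartbeatIntervalDbDb=5000
# Extra connectivity check before excluding a suspect node
# (the "ConnectivityCheck" mechanism; off by default).
ConnectCheckIntervalDelay=1500
# Global Checkpoint (GCP) timeout: allow for slower cross-site acknowledgements.
TimeBetweenEpochsTimeout=4000
# Transaction timeouts: raise to accommodate higher round-trip times.
TransactionDeadlockDetectionTimeout=2400
TransactionInactiveTimeout=30000
```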

Example
The following are recommended latency and bandwidth levels for applications with high throughput and fast failure detection requirements:
- latency between remote data nodes must not exceed 20 milliseconds;
- bandwidth of the network link must be more than 1 gigabit per second.

For applications that do not require this type of stringent operating environment, latency and bandwidth can be relaxed, subject to the testing recommended above.

As the recommendations demonstrate, there are a number of factors that need to be considered before deploying multi-site clustering. For geo-redundancy, Oracle recommends Geographic Replication, but multi-site clustering does present an alternative deployment, subject to the considerations and constraints discussed above.

You can learn more about scaling web databases with MySQL Cluster from our new Guide.  We look forward to hearing your experiences with the new MySQL Cluster 7.2.1 DMR!

Thursday Sep 29, 2011

MySQL HA Solutions: New Guide Available

Databases are at the center of today’s web, enterprise and embedded applications, storing and protecting an organization’s most valuable assets and supporting business-critical applications. Just minutes of downtime can result in significant lost revenue and dissatisfied customers. Ensuring database high availability is therefore a top priority for any organization.

The new MySQL Guide to High Availability solutions is designed to navigate users through the HA maze, discussing:

- The causes, effects and impacts of downtime;

- Methodologies to select the right HA solution;

- Different approaches to delivering highly available MySQL services;

- Operational best practices to meet Service Level Agreements (SLAs).

As discussed in the new Guide, selecting the high availability solution that is appropriate for your application depends upon 3 core principles:

- The level of availability required to meet business objectives, within budgetary constraints;

- The profile of application being deployed (i.e. concurrent users, requests per second, etc.);

- Operational standards within each data center.

Recognizing that each application or service has different operational and availability requirements, the guide discusses the range of certified and supported High Availability (HA) solutions – from internal departmental applications all the way through to geographically redundant, multi-data center systems delivering 99.999% availability (i.e. less than 5 ½ minutes of downtime per year) supporting transactional web services, communications networks, cloud and hosting environments, etc.

By combining the right technology with the right skills and processes, users can achieve business continuity, while developers and DBAs can sleep tight at night! Download the guide to learn more.

Monday Jul 18, 2011

Simpler and Safer Clustering: MySQL Cluster Manager Update

Clustered computing brings with it many benefits: high performance, high availability, scalable infrastructure, etc. But it also brings with it more complexity.

Why?

Well, by its very nature, there are more “moving parts” to monitor and manage – from physical, virtual and logical hosts, to clustering software, to redundant networking components – the list goes on. And a cluster that isn’t effectively provisioned and managed will cause more downtime than the standalone systems it is designed to improve upon.

When it comes to the database industry, analysts already estimate that 50% of a typical database’s Total Cost of Ownership is attributable to staffing and downtime costs. These costs will only increase if a database cluster is not effectively monitored and managed.

Monitoring and management has been a major focus in the development of the MySQL Cluster database, and as part of this focus, the latest release of MySQL Cluster Manager (MCM) hit General Availability last week. You can read all about it in Andrew Morgan's blog.

MySQL Cluster Manager 1.1.1 makes it much simpler to get up and running, to manage the cluster and to allow multiple clusters to be managed from a single process.

MySQL Cluster Manager is part of the commercial Carrier-Grade Edition but anyone is free to download and use MySQL Cluster Manager without obligation for 30 days. This is a great way for those new to MySQL Cluster to rapidly configure and provision their first cluster.

All you need do is:

1. Go to Oracle eDelivery

2. Enter some basic details and click through the agreement

3. Select “MySQL Product Pack”, then your platform, then Go

Not only does MCM make the management of MySQL Cluster simpler, it also makes it safer. One of the largest causes of downtime is administrator error, and here MySQL Cluster Manager can significantly reduce risk.

Consider the task of upgrading from one release of MySQL Cluster to another. This can be performed as an on-line operation, using rolling restarts to apply upgrades while still serving read and write requests. It’s just one of the many operations users can perform on-line (i.e. adding data nodes, upgrading schema, backups, etc.), all of which enable MySQL Cluster to achieve 99.999% uptime.

Using a manual upgrade method on a cluster configured with 4 x data nodes, 2 x MySQL Server application nodes and 2 x management nodes, the administrator would be typing 46 x manual commands in an operation that would take around 2 ½ hours to complete. The steps are shown below:

1 x preliminary check of cluster state

8 x ssh commands (1 per server)

8 x per-process stop commands

4 x scp of configuration files (2 x mgmd & 2 x mysqld)

8 x per-process start commands

8 x checks for started and re-joined processes

8 x process completion verifications

1 x verify completion of the whole cluster.

This excludes the manual editing of each configuration file.
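The tally works out as follows:

```shell
# 1 preliminary check + 8 ssh + 8 stops + 4 scp + 8 starts
# + 8 rejoin checks + 8 verifications + 1 final cluster check
echo $((1 + 8 + 8 + 4 + 8 + 8 + 8 + 1))   # prints 46
```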

Now compare this to using MySQL Cluster Manager:

upgrade cluster --package=7.1 mycluster;

Just 1 command – issue it, walk away, and leave it to run.

Note – both of the processes above exclude the preparation steps of copying the new software package to each host and defining where it's located. The total operation times are based on a DBA restarting 4 x MySQL Cluster Data Nodes, each with 6GB of data, and performing 10,000 operations per second.

You can learn more about MySQL Cluster Manager from our new whitepaper and on-line demo.

We also have an on-demand webinar which covers MySQL Cluster Manager, as well as other complementary methods for managing a MySQL Cluster environment:

* NDBINFO: released with MySQL Cluster 7.1, NDBINFO presents real-time status and usage statistics, providing developers and DBAs with a simple means of pro-actively monitoring and optimizing database performance and availability.

* MySQL Cluster Advisors & Graphs: part of the MySQL Enterprise Monitor and available in the commercial MySQL Cluster Carrier Grade Edition, the Enterprise Advisor includes automated best practice rules that alert on key performance and availability metrics from MySQL Cluster data nodes.

While managing clusters will never be easy, it keeps getting a whole lot simpler!

Wednesday Jul 06, 2011

Virtualizing MySQL: 1-Click, Kick Back…and Relax

Virtualizing all parts of today’s software infrastructure has become a priority for many. Creating a more flexible and dynamic environment with improved availability enables organizations to accelerate innovation, reduce time to market, cut costs and deliver higher uptime.

Databases have rarely been the first candidates for virtualization – mainly as a result of fears about consolidating such critical resources, and of I/O overhead that may have degraded service levels. However, with improvements in hypervisor designs, coupled with more powerful commodity server hardware and repeatable best practices, many of these concerns are rapidly diminishing.

It was in this context that we began development of the Oracle VM Template for MySQL Enterprise Edition, making the world’s leading web database radically simpler to deploy, manage, and support in a virtualized environment.

Along with the development team, we will be hosting a live webinar on Wednesday July 13th where we will introduce the Template and demonstrate how to deploy it in production environments.

We will show how, in 1-click, users can download the Oracle VM Template - providing a pre-installed, pre-configured, pre-tested virtualized MySQL instance running on Oracle Linux and Oracle VM, packaged and certified for production deployment.

With the download complete, users simply import the template into the Oracle VM Manager and then configure for their local environment via a self-running first boot script. The image is then provisioned to the Oracle VM Server Pool where it is ready for use. It is simple to clone the image to create multiple MySQL instances in seconds and customize the template with additional software stacks to create new Golden Images.

In addition to rapid deployment, users also benefit from the integrated High Availability technologies that are part of the Template. Oracle VM provides native clustering mechanisms that can detect failures in the underlying server, VM or MySQL instances and automatically fail over to the other nodes in the VM Server Pool. Users can configure the recovery to a specific server, or can allow Oracle VM to load balance the recovery to any server in the pool.

With downtime resulting from scheduled maintenance activities now representing the majority of outages in today’s data centers, Oracle VM also offers live migration. A user can initiate the migration of a running MySQL instance to another server across secure SSL links, without downtime.

And of course, users can receive support for the entire stack – from hypervisor to database – from Oracle, eliminating all that nasty finger pointing that you might find with other VM / Database combinations.

We’ve created a couple of resources you can use to get started with the Oracle VM Template for MySQL Enterprise Edition:

- Register for the live webinar on July 13th. Don’t worry if you can’t make it – by registering you will automatically be notified when the replay is available

- Download the new whitepaper which discusses the components of the template, and then how to download, configure and deploy it.

In summary, integrating MySQL Enterprise Edition with Oracle VM and Oracle Linux, the Oracle VM Template for MySQL is the fastest, easiest and most reliable way to provision virtualized MySQL instances. By using the template, users are able to meet the explosive demand for web-based services with a low Total Cost of Ownership, while providing a foundation for cloud computing. So download, kick back, and relax.....


Tuesday Jun 14, 2011

Scaling Web Databases: Auto-Sharding with MySQL Cluster

The realities of today’s successful web services are creating new demands that many legacy databases were just not designed to handle:

- The need to scale writes, as well as reads, both within and across geographically dispersed data centers;

- The need to scale operational agility to keep pace with database load and application requirements. This means being able to add capacity and performance to the database, and to evolve the schema – all without downtime;

- The need to scale queries by having flexibility in the APIs used to access the database;

- The need to scale the database while maintaining continuous availability for both failures as well as scheduled maintenance events.

Each of the requirements above warrants its own dedicated blog, which I’ll find time to write over the next few weeks.

But to get started, I wanted to discuss how the MySQL Cluster database addresses the first point – scaling writes to the database with automatic sharding and geographic replication.

Auto-Sharding

MySQL Cluster is implemented as a distributed, multi-master database with no single point of failure. Tables are automatically sharded across a pool of low cost commodity nodes, enabling the database to scale horizontally to serve read and write-intensive workloads, accessed both from SQL and directly via NoSQL APIs (memcached, REST/HTTP, C++, Java, JPA and LDAP). Up to 255 nodes are supported, of which up to 48 can be data nodes. You can read more about the different types of nodes here.

By automatically sharding tables in the database, MySQL Cluster eliminates the need to shard at the application layer, greatly simplifying application development and maintenance.

Sharding is based on the hashing of the primary key, though users can override this by telling MySQL Cluster which fields from the primary key should be used in the hashing algorithm. Hashing on the primary key generally leads to a more even distribution of data and queries across the cluster than alternative approaches such as range partitioning.
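As a sketch of that override, the standard MySQL partitioning syntax is used with the NDB storage engine; the table and column names here are hypothetical:

```sql
-- Default: rows are distributed by a hash of the whole primary key.
-- To co-locate all rows for a given user on the same shard, restrict
-- the hash to the user_id part of the composite key:
CREATE TABLE user_events (
  user_id  INT NOT NULL,
  event_id INT NOT NULL,
  payload  VARCHAR(255),
  PRIMARY KEY (user_id, event_id)
) ENGINE=NDBCLUSTER
PARTITION BY KEY (user_id);
```

Co-locating related rows like this keeps lookups for one user on a single node group, avoiding cross-shard round trips.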

Figure 1 demonstrates how MySQL Cluster shards tables across data nodes of the cluster.

Figure 1: Auto-Sharding in MySQL Cluster

You will see from the figure above that MySQL Cluster automatically creates “node groups” from the number of replicas and data nodes specified by the user. Updates are synchronously replicated between members of the node group to protect against data loss and enable sub-second failover in the event of a node failure.
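For illustration, a hypothetical config.ini fragment: with NoOfReplicas=2 and four data nodes, the cluster automatically forms two node groups of two nodes each (hostnames are placeholders):

```ini
[ndbd default]
NoOfReplicas=2      # two synchronous replicas of every fragment

# Four data nodes with two replicas -> two node groups:
# node group 0 = {nodeA, nodeB}, node group 1 = {nodeC, nodeD}
[ndbd]
HostName=nodeA
[ndbd]
HostName=nodeB
[ndbd]
HostName=nodeC
[ndbd]
HostName=nodeD
```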

Figure 2 shows how MySQL Cluster creates primary and secondary fragments of each shard.


Figure 2: Eliminating Data Loss with Cross-Shard Fragments

MySQL Cluster is an active/active architecture with multi-master replication, so updates made by any application or SQL node accessing the cluster are instantly available to all of the other nodes accessing the cluster.

Unlike other distributed databases, users do not lose the ability to perform JOIN operations or sacrifice ACID-guarantees. In the Development Release of MySQL Cluster (7.2), Adaptive Query Localization pushes JOIN operations down to the data nodes where they are executed locally and in parallel. We've seen 20-40x higher throughput from the community members that have tested it.

Geographic Replication

Of course, web services are global and so developers will want to ensure their databases can scale-out across regions. MySQL Cluster offers Geographic Replication, which distributes clusters to remote data centers, serving to reduce the effects of geographic latency as well as provide a facility for disaster recovery.

Figure 3: Geographic Replication with MySQL Cluster

Geographic Replication is asynchronous and based on standard MySQL replication – with one important difference – it is active/active, so it supports the detection and resolution of conflicts when the same row is updated across different clusters. This does currently require the addition of a timestamp column in the application schema, but that requirement is expected to be eliminated in future releases.
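As a sketch of how this looks in 7.2-era releases: conflict handling is declared per table in the mysql.ndb_replication system table, naming a conflict function and the application-maintained timestamp column. The table and column names below are hypothetical, and the binlog_type value should be checked against your version’s manual:

```sql
-- Greatest-timestamp-wins conflict resolution for db1.t1, using the
-- application-maintained column "ts" (an unsigned integer the application
-- increments on every update). binlog_type 7 requests the full row image
-- with updates, as in the documentation's NDB$MAX examples.
INSERT INTO mysql.ndb_replication
  (db, table_name, server_id, binlog_type, conflict_fn)
VALUES
  ('db1', 't1', 0, 7, 'NDB$MAX(ts)');
```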

Where the Rubber Meets the Road

Auto-sharding and geographic replication are all great technologies, but what do they mean in terms of delivered performance?

The MySQL Cluster development team recently ran a series of benchmarks that characterized performance across 8 x dual socket 2.93GHz, 6 core commodity Intel servers, each equipped with 24GB of RAM. As seen in the figure below, MySQL Cluster delivered just under 2.5 million updates per second with 2 x data nodes configured per server.

Figure 4: MySQL Cluster performance scaling-out on commodity nodes.

Across 16 Intel servers, MySQL Cluster achieved just under 7 million read operations per second. We ran out of time in the test cluster before being able to complete the test of write performance, but will return to those efforts soon.

Wrap-Up

So what does all of this mean? There is an ever-growing array of options for developers to choose from when scaling out new generations of web applications. Don’t assume that relational databases can’t scale, or can’t offer the kind of operational agility demanded by today’s highly dynamic services. MySQL Cluster is already proven as one such option… and you don’t have to throw away ACID guarantees or the ability to run complex queries to get scalability or schema agility.

You can learn about how MySQL Cluster implements auto-sharding, along with other key features for web services such as online schema updates and NoSQL interfaces from a new on-demand webinar.

And of course MySQL Cluster is open source, so you are free to download, develop and deploy with it. The latest GA release is here.

The MySQL Cluster 7.2 Development Milestone Release including Adaptive Query Localization is here (select the Development Release tab).

Finally, if you wanted to try out MySQL Cluster with the memcached API, you can get it from the latest build on the MySQL labs site.

As ever, let us know how these technologies work for you, either in the comments below or via the MySQL Cluster forum.

Wednesday May 18, 2011

Unlocking New Value from Web Session Management

Join us for a live webinar and download a new whitepaper where we discuss how to realize new value from data collected during web session management.

Session management has long been a key component of any web infrastructure – enhancing the user browsing experience through improved reliability, reduced latency and tighter security.

Increasingly, organizations are looking to unlock more value from session management to further improve user loyalty (i.e. making the web service more “sticky”) and improve monetization of web services.  There are two distinct developments that offer the promise of unlocking more value from session data:
1. Provide highly personalized browsing experiences by recognizing repeat visitors and making real-time recommendations based on previous browsing behavior;
2. Enhance insight into user behavior through analysis of how they interact with the web service, enabling organizations to optimize web experiences.

There are many approaches to session management, and technology selection has become critical in ensuring the full value of data collected from user sessions can be realized.

For rapidly growing web properties, higher volumes of session data need to be managed and persisted in real-time while also demanding very high levels of availability, coupled with the flexibility of relational data management.

In such cases, it makes sense to evaluate the MySQL Cluster database.

To further discuss the challenges and solutions to session management, we are hosting a live webinar on Tuesday May 31st at 0900 Pacific Time / 1700 UK, covering:

* The demands of session management
* How MySQL Cluster is well placed to meet the demands from session management
* Configuring session management with PHP and MySQL Cluster
* Configuring session management with memcached and MySQL Cluster
* Real Time analysis of session data with MySQL Cluster
* Case studies

You can register for the webinar here

You can download the associated whitepaper here

Let us know your recommendations for unlocking more value from web session data in the comments below.
