Friday Apr 26, 2013

MySQL Web Reference Architectures: On Demand

Last week we ran a live webinar presenting the new MySQL Web Reference Architectures, a set of documented and repeatable best practices for building highly available, scalable and secure database infrastructure to power new generations of web and mobile services.

The webinar replay is now available on-demand so you can listen in from the comfort of your own desk…or commute.

As a taster, we discuss sizing and design patterns; you can see a sample below:




We received some great questions during the Q&A session which I felt would be useful to you, so I've included them below:

Q. You are talking a lot about MySQL 5.6. We are on MySQL 5.1 - what are the reasons for us to upgrade?

A. There is a great summary in the MySQL 5.6 developer / DBA guide. You will see the advantages of 5.6 for web and cloud applications in performance, scaling, agility, security, instrumentation and reliability.

Q. When using replication, does the master send only the log file to the slave?

A. The binary log contains all of the database writes on the master. The master then sends the data from this log file to each of the slaves so that they can apply the same writes. It doesn't send the log file itself but sends the changes as they're written to the file (or, if a slave disconnects and then reconnects, the master will send over all changes that it missed). You can find out more in the introduction to MySQL replication paper.
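To make the flow concrete, here is a toy Python sketch of that idea: a master appends writes to an ordered log, and each slave tracks its own position and applies only the events it has not yet seen. This is purely illustrative - the real server streams individual events over the network rather than handing out the file.

```python
# Toy model of binary-log-based replication. Illustrative only: the
# real protocol streams events over a network connection.

class Master:
    def __init__(self):
        self.binlog = []  # ordered list of write events

    def execute_write(self, statement):
        self.binlog.append(statement)

class Slave:
    def __init__(self):
        self.position = 0  # offset of the next event to apply
        self.applied = []

    def catch_up(self, master):
        # On (re)connect, request every event after our saved position.
        for event in master.binlog[self.position:]:
            self.applied.append(event)
        self.position = len(master.binlog)

master = Master()
slave = Slave()
master.execute_write("INSERT INTO t VALUES (1)")
master.execute_write("INSERT INTO t VALUES (2)")
slave.catch_up(master)   # slave applies both writes
master.execute_write("INSERT INTO t VALUES (3)")
slave.catch_up(master)   # a reconnecting slave only receives what it missed
print(slave.applied)
```

Note how the slave's saved position is what makes reconnection cheap: only the delta is transferred.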

Q. Are there any open-source utilities to provide Virtual IP failover that can be integrated with the mysqlfailover utility? Or should we use Corosync and Pacemaker with mysqlrpladmin?

A. Certainly you could use Corosync and Pacemaker - in which case you're right that mysqlrpladmin might be the best option, as then you only have one brain deciding when to fail everything over. Alternatively, you can bind your own scripts into mysqlfailover and use those to trigger a VIP failover.

Q. What should we do to address any potential replication lag?

A. Fast cores on the slave would help. If you're able to split your tables between different schemas (databases in MySQL terms) then you can exploit multi-threaded slaves (parallel slave apply) in MySQL 5.6. Also take a look at optimized row-based replication in MySQL 5.6.
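As an illustration, a minimal my.cnf sketch enabling the two 5.6 features mentioned above (the values are placeholders, not tuning advice, and each option belongs in the respective server's configuration file):

```ini
# Slave side: one applier thread per schema. In 5.6 this only helps
# when writes are spread across different databases.
[mysqld]
slave_parallel_workers = 4

# Master side: optimized row-based replication - log only the columns
# that actually changed, shrinking the binary log.
binlog_row_image = minimal
```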

Q. Does MySQL 5.6 offer any replication related improvements over 5.5?

A. You bet. Performance, reliability, flexibility. See the introduction paper referenced in the answer above.

Q. Any resources you know of on implementing the Medium-size MySQL web reference architecture with Ruby on Rails apps in the application layer?

A. The general best practices outlined in the MySQL web reference architectures guide will apply. There are more specific MySQL / Ruby on Rails best practices here. You can also use community-developed MySQL Ruby connectors.

Q. How can we integrate MySQL with Hadoop?

A. You have several options which can be used independently or together. Apache Sqoop provides batch transfers between MySQL and Hadoop, and is also fully bi-directional, so you can replicate the results of Hadoop Map Reduce jobs back to MySQL tables. There is also the new MySQL Applier for Hadoop which enables the streaming of events in real-time from MySQL to Hadoop.

You can learn more about all of the options in the MySQL and Hadoop guide to big data.

Q. MySQL Cluster looks interesting, but when we looked at it in the past, it didn't support Foreign Keys, so not sure how you can suggest it as an option for eCommerce apps?

A. There are many eCommerce providers already running MySQL Cluster - Shopatron is a great example. The latest Development Milestone Release of MySQL Cluster 7.3 adds native support for Foreign Keys as well - so it's worth taking another look!

Q. Is it possible to customise auto-sharding when using MySQL Cluster?

A. Yes, you can choose which column within the primary key is to be used as the 'sharding key' - that way you can, for example, make sure that all data for a specific user will be in the same data node; this can help ensure linear scalability. Details are in the MySQL Cluster Performance Optimization guide.
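For example, a table definition along these lines (a hypothetical schema, shown only to illustrate the syntax) keeps all rows for one user on the same data node by distributing on user_id rather than on the whole primary key:

```sql
-- Hypothetical table: user_id is part of the primary key and is
-- chosen as the distribution ('sharding') key, so every row for a
-- given user hashes to the same data node.
CREATE TABLE user_items (
  user_id INT NOT NULL,
  item_id INT NOT NULL,
  payload VARCHAR(255),
  PRIMARY KEY (user_id, item_id)
) ENGINE=NDBCLUSTER
  PARTITION BY KEY (user_id);
```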

Q. Any advice on UserID generation for sharding?

A. One design pattern is to just use an auto-increment column and then hash on that. Another option is to use a UUID - you can then shard by ranges or by hashing.
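Both patterns can be sketched in a few lines of Python (the shard count and modulo-hash are illustrative choices, not a recommendation):

```python
import uuid

NUM_SHARDS = 8  # illustrative shard count

def shard_for_auto_increment(user_id: int) -> int:
    """Pattern 1: hash an auto-increment UserID onto a shard."""
    return user_id % NUM_SHARDS

def shard_for_uuid(user_uuid: uuid.UUID) -> int:
    """Pattern 2: hash a UUID onto a shard."""
    return user_uuid.int % NUM_SHARDS

# The mapping is deterministic, so every row for a given user
# always lands on the same shard:
print(shard_for_auto_increment(42))
print(shard_for_uuid(uuid.uuid4()))
```

The key property in both cases is determinism: any application server can compute the shard for a user without a lookup service.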

Q. Can we choose one: clustering or sharding? Which one is better for HA?

A. It depends on a number of factors such as what level of availability you need, how your application is architected, etc. A good resource to help you weigh up the pros and cons of the different approaches is the MySQL guide to high availability.

Q. My company is using 5.5 with one master and slave for our web application. We are currently looking into a new system infrastructure using VM technology. What would you recommend?

A. There are many options available to you on multiple hypervisors. We have created a VM Template for MySQL running on Oracle VM and Oracle Linux. This is a pre-installed, pre-configured and certified software stack which you download as a single file and then provision to an Oracle VM Server farm. You can learn more about it from the Oracle VM Template for MySQL Enterprise Edition guide

Q. I'm interested in the MySQL monitoring tools and in the audit plug-in - how do I get these?

A. These are available as part of MySQL Enterprise Edition which also packages support, backup, security plug-ins, etc in an annual subscription.

Q. How much would it cost to run MySQL Enterprise Edition on that "perfect server"?

A. Pricing is available in the on-line store. Note that the list price is the same regardless of the number of cores unless you exceed 4 CPU *sockets*.

Q. I am sorry for this very novice question, but could I have some help regarding references that would allow me to digest the "base" of these technologies presented here?

A. Assuming it is case studies you are after, we have a lot of them published at the MySQL customers page. In the "Industry" drop-down box, take a look at Web eCommerce, games, SaaS/Hosting and social networks.


Tuesday Feb 05, 2013

MySQL 5.6 Replication: New Resources for Database Scaling and HA

MySQL 5.6 reached GA (General Availability) today and is packed with a wealth of new features and capabilities.  Exciting stuff!

MySQL 5.6 also introduces the largest set of enhancements to replication ever delivered in a single release, including:
- 5x higher performance to improve consistency across a cluster and reduce the risks of data loss in the event of a master failing
- Self-healing clusters with automatic failover and recovery from outages or planned maintenance
- Assured data integrity with checksums implemented across the replication workflow
- DevOps automation

Of course, getting started with all of these enhancements can be a challenge - whether you are new to MySQL replication or an experienced user. So two new Guides are available to help take advantage of everything replication offers in MySQL 5.6.

The introduction takes you through new 5.6 features, including Global Transaction Identifiers (GTIDs), BinLog Group Commit, Multi-Threaded Slaves, Crash-safe replication, Checksums, etc.
To provide context to these features, the Introduction Guide takes Developers and DBAs through the concepts of replication, including: 

- Use-cases such as scaling out on commodity hardware, high availability, geo-redundancy for DR and data placement, and supporting complex analytics alongside high velocity OLTP operations in the same cluster

- The replication workflow and internals, discussing the roles of different threads and log files 

- Controlling data consistency using different synchronisation models

- Pros and cons of Statement-Based and Row-Based Replication 

- Monitoring and managing replication using tools such as the replication dashboard within MySQL Enterprise Monitor 


Having learned the concepts of replication and what's new in MySQL 5.6, the Tutorial provides step-by-step guides to configuring, provisioning and managing replication, covering:

- Configuring replication with master and slave .cnf files, creating users, introducing replication to an existing application, initialisation and checking proper operation, using GTIDs and other new features

- Migration to semi-synchronous replication for improved data consistency and reduced risk of data loss

- Administration and configuration by checking slave status, suspending replication and viewing binary logs

- Failover and recovery, including pre-requisites, detecting an outage, slave promotion, redirecting the application and recovering a failed master

These new resources provide everything you need to get started in building your next generation web, cloud, social or mobile application with the world's most popular open source database.

You will find more information in the Docs.  Also, look out for a live webinar in March where we will demonstrate all that is new in MySQL 5.6 replication.



Tuesday Jan 08, 2013

Deep Dive into GTIDs and MySQL 5.6 - What, Why and How

Global Transaction Identifiers (GTIDs) are one of the key replication enhancements in MySQL 5.6. GTIDs make it simple to track and compare replication across a master - slave topology. This enables:

- Much simpler recovery from failures of the master;

- Great flexibility in the provisioning and on-going management of multi-tier or ring (circular) replication topologies.

A new on-demand MySQL 5.6 GTID webinar delivered by the replication engineering team is now available, providing deep insight into the design and implementation of GTIDs, and how they enable users to simplify MySQL scaling and HA. The webinar covers:

- Concepts: What is a GTID? How does the server generate GTIDs? What is the life cycle of GTIDs? How are GTIDs used to connect to a master?

- Handling conflicts

- How to skip transactions using GTIDs

- What happens when binary logs are purged

- How to provision a new slave or restore from a backup

- MySQL utilities for automated failover and controlled switchover

To whet your appetite, an extract of the Q&A from the webinar is as follows. These, and many other questions were answered during the session:

Q. Which versions of MySQL support GTIDs?

A. MySQL 5.6.5 and above

Q. Is GTID ON by default in 5.6?

A. It is OFF by default

Q. What does the GTID contain?

A. It is made up of a unique ID for the server followed by an ever-increasing counter that's specific to that server
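In text form a GTID is written as the server's UUID followed by the sequence number, e.g. 3E11FA47-71CA-11E1-9E33-C80AA9429562:23. A small sketch of parsing and comparing that format (illustrative helper, not a MySQL API):

```python
# Split a GTID of the form "server_uuid:sequence_number" into its
# two components. Hypothetical helper for illustration only.

def parse_gtid(gtid: str):
    server_uuid, _, seq = gtid.rpartition(":")
    return server_uuid, int(seq)

uuid_part, seq = parse_gtid("3E11FA47-71CA-11E1-9E33-C80AA9429562:23")
print(uuid_part, seq)

# Two GTIDs from the SAME server are ordered by their counters:
_, a = parse_gtid("3E11FA47-71CA-11E1-9E33-C80AA9429562:23")
_, b = parse_gtid("3E11FA47-71CA-11E1-9E33-C80AA9429562:24")
print(a < b)
```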

Q: Do GTIDs introduce any increased space requirements?

A: Yes, since GTIDs are stored in the binary log, the binary logs will be larger. However, we expect the overhead to be relatively small. GTIDs are written to the binary log in two places:

(1) A small header is stored in the beginning of the binary log. This contains the variable @@gtid_purged, i.e., a list of previously logged GTIDS. Since the list is range-compressed, this is expected to be small: a small fixed-size header plus 40 bytes times the number of master servers in your topology.

(2) A fixed size header is added before each transaction in the binary log. This is 44 bytes, so will typically be small compared to the transaction itself.
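Plugging in the figures from this answer gives a quick back-of-envelope estimate of the extra binlog space (the per-master and per-transaction sizes come from the answer above; the small fixed-size header is left as a parameter since its exact size is not stated):

```python
def gtid_binlog_overhead_bytes(num_masters: int, num_transactions: int,
                               fixed_header: int = 0) -> int:
    # ~40 bytes per master in the range-compressed @@gtid_purged list,
    # plus a 44-byte Gtid_log_event header before each transaction.
    return fixed_header + 40 * num_masters + 44 * num_transactions

# e.g. a topology with 2 masters and a binlog holding 1M transactions:
print(gtid_binlog_overhead_bytes(2, 1_000_000))
```

Even at a million transactions this is on the order of tens of megabytes, which is usually small relative to the transactions themselves.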

Q. I understand GTIDs are associated with transactions. How do they map to the events within each transaction, or does a GTID map as an event itself in a binlog file?

A. Yes, GTIDs are associated with transactions. In the binary log, the GTID is realized as an event written prior to the events that constitute the transaction. The event is called a Gtid_log_event.

Q. What if a transaction spans a filtered-out table and a non-filtered-out table? How does it get recorded on the slave?

A. If the filters are on the master, then a partly logged transaction will be replicated with its GTID.

If filtering on the slave side, a partial image will be processed on the slave and the original GTID is logged (to the slave's binlog) with the processed transaction.

Q. Prior to GTID, to build a new slave, we use mysqldump --master-data=1 to get the slave starting sync point in the dump. With GTID enabled, does it set the gtid_executed / purged in the dump instead?

A. Yes, mysqldump will detect that the server uses GTIDs and output a SET GTID_PURGED statement. (And there is an option to turn off that, e.g., in case you want to execute the output on an old server).

Q. How do GTIDs enable failover and recovery?

A. GTIDs are used in combination with the MySQL utilities. The mysqlfailover and mysqlrpladmin utilities provide administration of GTID-enabled slaves, enabling monitoring with automatic failover and on-demand switchover, coupled with slave promotion. GTIDs make it straightforward to reliably failover from the master to the most current slave automatically in the event of a failure. DBAs no longer need to manually analyze the status of each of their slaves to identify the most current when seeking a target to promote to the new master.

Resources to Get Started

In addition to the webinar, here are some other key resources that will give you the detail you need to take advantage of GTIDs in your most important MySQL workloads:

- Engineering blog: Global Transaction Identifiers – why, what, and how

- Engineering blog: Advanced use of GTIDs

- Documentation: Setting up replication with GTIDs 

- Video Tutorial: MySQL replication utilities for auto-failover and switchover 

- Engineering Blog: Controlling read consistency with GTIDs 

If you have any comments, questions or feature requests, don't hesitate to leave a comment on this blog.

Monday Oct 03, 2011

Synchronously Replicating Databases Across Data Centers – Are you Insane?

 

Well actually….no. The second Development Milestone Release of MySQL Cluster 7.2 introduces support for what we call “Multi-Site Clustering”. In this post, I’ll provide an overview of this new capability, and considerations you need to make when considering it as a deployment option to scale geographically dispersed database services.

You can read more about MySQL Cluster 7.2.1 in the article posted on the MySQL Developer Zone

MySQL Cluster has long offered Geographic Replication, distributing clusters to remote data centers to reduce the effects of geographic latency by pushing data closer to the user, as well as providing a capability for disaster recovery.

Multi-Site Clustering provides a new option for cross data center scalability. For the first time, splitting data nodes across data centers is a supported deployment option. With this deployment model, users can synchronously replicate updates between data centers without needing to modify their application or schema for conflict handling, and automatically fail over between those sites in the event of a node failure.

MySQL Cluster offers high availability by maintaining a configurable number of data replicas.  All replicas are synchronously maintained by a built-in 2 phase commit protocol.  Data node and communication failures are detected and handled automatically.  On recovery, data nodes automatically rejoin the cluster, synchronize with running nodes, and resume service.

All replicas of a given row are stored in a set of data nodes known as a nodegroup.  To provide service, a cluster must have at least one data node from each nodegroup available at all times.  When the cluster detects that the last node in a nodegroup has failed, the remaining cluster nodes will be gracefully shutdown, to ensure the consistency of the stored databases on recovery.

Improvements to the heartbeating mechanism used by MySQL Cluster enables greater resilience to temporary latency spikes on a WAN, thereby maintaining operation of the cluster. A new “ConnectivityCheck” mechanism is introduced, which must be explicitly configured. This extra mechanism adds messaging overheads and failure handling latency, and so is not switched on by default.

When configuring Multi-Site clustering, the following factors must be considered:

Bandwidth
Low bandwidth between data nodes can slow data node recovery.  In normal operation, the available bandwidth can limit the maximum system throughput.  If link saturation causes latency on individual links to increase, then node failures, and potentially cluster failure could occur.

Latency and performance
Synchronously committing transactions over a wide area increases the latency of operation execution and commit, therefore individual operations are slowed. To maintain the same overall throughput, higher client concurrency is required.  With the same client concurrency level, throughput will decrease relative to a lower latency configuration.
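The latency/concurrency/throughput trade-off described above follows directly from Little's Law. A worked example with illustrative numbers (2 ms local commit latency versus a 20 ms WAN round trip added on top):

```python
def throughput_per_sec(concurrent_clients: int, latency_ms: float) -> float:
    # Little's Law: throughput = concurrency / per-operation latency.
    return concurrent_clients / (latency_ms / 1000.0)

# 100 clients committing at 2 ms within a single site:
local = throughput_per_sec(100, 2.0)

# The same 100 clients once a 20 ms WAN round trip is added (22 ms total):
wan = throughput_per_sec(100, 22.0)

# Concurrency needed over the WAN to match the local throughput:
needed = local * (22.0 / 1000.0)

print(local, wan, needed)
```

With the same 100 clients, throughput drops by roughly 11x over the WAN; restoring it requires roughly 11x the client concurrency.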

Latency and stability
Synchronous operation implies that clients wait to hear of the success or failure of each operation before continuing. Loss of communication to a node, and high latency communication to a node are indistinguishable in some cases.  To ensure availability, the Cluster monitors inter-node communication.  If a node experiences high communication latency, then it may be killed by another node, to prevent its high latency causing service loss.

Where inter-node latencies fluctuate, and are in the same range as the node-latency-monitoring trigger levels, node failures can result.  Node failures are expensive to recover from, and endanger Cluster availability. 

To avoid node failures, either the latency should be reduced, or the trigger levels should be raised.  Raising trigger levels can result in a longer time-to-detection of communication problems.

WAN latencies
Latency on an IP WAN may be a function of physical distance, routing hops, protocol layering, link failover times and rerouting times. The maximum expected latency on a link should be characterized as input to the cluster configuration.

Survivability of node failures
MySQL Cluster uses a fail-fast mechanism to minimize time-to-recovery. Nodes that are suspected of being unreachable or dead are quickly excluded from the Cluster.  This mechanism is simple and fast, but sometimes takes steps that result in unnecessary cluster failure.  For this reason, latency trigger levels should be configured with a safe margin above the maximum latency variation on inter-data node links.

Users can configure various MySQL Cluster parameters including heartbeats, ConnectivityCheck, GCP timeouts and transaction deadlock timeouts. You can read more about these parameters in the documentation.

Recommendations for Multi-Site Clustering
- Ensure minimal, stable latency;

- Provision the network with sufficient bandwidth for the expected peak load - test with node recovery and system recovery;

- Configure the heartbeat period to ensure a safe margin above latency fluctuations;

- Configure the ConnectivityCheckPeriod to avoid unnecessary node failures;

- Configure other timeouts accordingly, including the GCP timeout, transaction deadlock timeout, and transaction inactivity timeout.

Example
The following is a recommendation of latency and bandwidth requirements for applications with high throughput and fast failure detection requirements:
- latency between remote data nodes must not exceed 20 milliseconds;
- bandwidth of the network link must be more than 1 Gigabit per Second.

For applications that do not require this type of stringent operating environment, latency and bandwidth can be relaxed, subject to the testing recommended above.

As the recommendations demonstrate, there are a number of factors that need to be considered before deploying multi-site clustering. For geo-redundancy, Oracle recommends Geographic Replication, but multi-site clustering does present an alternative deployment, subject to the considerations and constraints discussed above.

You can learn more about scaling web databases with MySQL Cluster from our new Guide.  We look forward to hearing your experiences with the new MySQL Cluster 7.2.1 DMR!

Friday Aug 05, 2011

Scaling Web Databases, Part 3: SQL & NoSQL Data Access

Supporting successful services on the web means scaling your back-end databases across multiple dimensions. This blog focuses on scaling access methods to your data using SQL and/or NoSQL interfaces.

In Part 1 of the blog series , I discussed scaling database performance using auto-sharding and active/active geographic replication in MySQL Cluster to enable applications to scale both within and across data centers.  

In Part 2, I discussed the need to scale operational agility to keep pace with demand, which includes being able to add capacity and performance to the database, and to evolve the schema – all without downtime.

So in this blog I want to explore another dimension to scalability -  how multiple interfaces can be used to scale access to the database, enabling users to simultaneously serve multiple applications, each with distinct access requirements.

Data Access Interfaces to MySQL Cluster

MySQL Cluster automatically shards tables across pools of commodity data nodes, rather than store those tables in a single MySQL Server. It is therefore able to present multiple interfaces to the database, giving developers a choice between:

- SQL for complex reporting-type queries;

- Simple Key/Value interfaces bypassing the SQL layer for blazing fast reads & writes;

- Real-time interfaces for micro-second latency, again bypassing the SQL layer.

With this choice of interfaces, developers are free to work in their own preferred environments, enhancing productivity and agility and enabling them to innovate faster.

SQL or NoSQL - Selecting the Right Interface

The following chart shows all of the access methods available to the database. The native API for MySQL Cluster is the C++ based NDB API. All other interfaces access the data through the NDB API.

At the extreme right hand side of the chart, an application has embedded the NDB API library enabling it to make native C++ calls to the database, and therefore delivering the lowest possible latency.

On the extreme left hand side of the chart, MySQL presents a standard SQL interface to the data nodes, and provides connectivity to all of the standard MySQL connectors including:

- Common web development languages and frameworks, i.e. PHP, Perl, Python, Ruby, Ruby on Rails, Spring, Django, etc;

- JDBC (for additional connectivity into ORMs including EclipseLink, Hibernate, etc)

- .NET

- ODBC

Whichever API is chosen for an application, it is important to emphasize that all of these SQL and NoSQL access methods can be used simultaneously, across the same data set, to provide the ultimate in developer flexibility. Therefore, MySQL Cluster may be supporting any combination of the following services, in real-time:

- Relational queries using the SQL API;

- Key/Value-based web services using the REST/JSON and memcached APIs;

- Enterprise applications with the ClusterJ and JPA APIs;

- Real-time web services (i.e. presence and location based) using the NDB API.

The following figure aims to summarize the capabilities and use-cases for each API.

Schema-less Data Store with the memcached API

As part of the MySQL Cluster 7.2 Development Milestone Release, Oracle announced the preview of native memcached Key/Value API support for MySQL Cluster enabling direct access to the database from the memcached API without passing through the SQL layer. You can read more about the implementation and how to get going with it in this excellent post from Andrew Morgan.

The following image shows the implementation of the memcached API for MySQL Cluster.


Implementation is simple - the application sends read and write requests to the memcached process (using the standard memcached API). This in turn invokes the Memcached Driver for NDB (which is part of the same process), which in turn calls the NDB API for very quick access to the data held in MySQL Cluster’s data nodes.

The solution has been designed to be very flexible, allowing the application architect to find a configuration that best fits their needs. It is possible to co-locate the memcached API in either the data nodes or application nodes, or alternatively within a dedicated memcached layer.

The benefit of this approach is that users can configure behavior on a per-key-prefix basis (through tables in MySQL Cluster) and the application doesn’t have to care – it just uses the memcached API and relies on the software to store data in the right place(s) and to keep everything synchronized.

By default, every Key / Value is written to the same table with each Key / Value pair stored in a single row – thus allowing schema-less data storage. Alternatively, the developer can define a key-prefix so that each value is linked to a pre-defined column in a specific table.

Of course if the application needs to access the same data through SQL then developers can map key prefixes to existing table columns, enabling Memcached access to schema-structured data already stored in MySQL Cluster.
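As a toy illustration of that per-key-prefix behaviour, here is a plain-Python router (not the actual NDB memcached driver; the prefix, table and column names are invented for the example): a configured prefix maps a key to a table column, and everything else falls back to the generic schema-less key/value table.

```python
# Toy per-key-prefix router. A known prefix maps to a (table, column)
# pair; unknown keys fall back to the default key/value table.
# All names here are hypothetical - for illustration only.

PREFIX_MAP = {
    "usr:": ("users", "last_seen"),
}
DEFAULT_TABLE = ("generic_kv", "value_col")

def route(key: str):
    """Return ((table, column), stored_key) for a memcached-style key."""
    for prefix, target in PREFIX_MAP.items():
        if key.startswith(prefix):
            return target, key[len(prefix):]
    return DEFAULT_TABLE, key

print(route("usr:1234"))   # routed to a schema-structured column
print(route("session9"))   # falls back to the schema-less table
```

The application only ever sees the memcached-style key; where the value physically lives is a configuration concern.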

Summary

MySQL Cluster provides developers and architects with a huge amount of flexibility in accessing their persistent data stores - a reflection that one size no longer fits all in the world of web services and databases.

You can learn more about this, and the other dimensions to scaling web databases in our new Guide. 

As ever, let me know your thoughts in the comments below. 


Thursday Jul 21, 2011

Scaling Web Databases, Part 2: Adding Nodes, Evolving Schema with Zero Downtime

In my previous post, I discussed scaling web database performance in MySQL Cluster using auto-sharding and active/active geographic replication - enabling users to scale both within and across data centers.  

I also mentioned that while scaling write-performance of any web service is critical, it is only 1 of multiple dimensions to scalability, which include:

- The need to scale operational agility to keep pace with demand. This means being able to add capacity and performance to the database, and to evolve the schema – all without downtime;

- The need to scale queries by having flexibility in the APIs used to access the database – including SQL and NoSQL interfaces;

- The need to scale the database while maintaining continuous availability.

All of these subjects are discussed in more detail in our new Scaling Web Databases guide.

In this posting, we look at scaling operational agility. 

As a web service gains in popularity it is important to be able to evolve the underlying infrastructure seamlessly, without incurring downtime and without having to add lots of additional DBA or developer resource.

Users may need to increase the capacity and performance of the database; enhance their application (and therefore their database schema) to deliver new capabilities and upgrade their underlying platforms.

MySQL Cluster can perform all of these operations and more on-line – without interrupting service to the application or clients.  

On-Line, On-Demand Scaling

MySQL Cluster allows users to scale both database performance and capacity by adding Application and Data Nodes on-line, enabling users to start with small clusters and then scale them on-demand, without downtime, as a service grows. Scaling could be the result of more users, new application functionality or more applications needing to share the database.

In the following example, the cluster on the left is configured with two application and data nodes and a single management server.  As the service grows, the users are able to scale the database and add management redundancy – all of which can be performed as an online operation.  An added advantage of scaling the Application Nodes is that they provide elasticity in scaling, so can be scaled back down if demand to the database decreases.

When new data nodes and node groups are added, the existing nodes in the cluster initiate a rolling restart to reconfigure for the new resource.  This rolling restart ensures that the cluster remains operational during the addition of new nodes.  Tables are then repartitioned and redundant rows are deleted with the OPTIMIZE TABLE command.  All of these operations are transactional, ensuring that a node failure during the add-node process will not corrupt the database.

The operations can be performed manually from the command line or automated with MySQL Cluster Manager, part of the commercial MySQL Cluster Carrier Grade Edition.

On-Line Cluster Maintenance

With its shared-nothing architecture, it is possible to avoid database outages by using rolling restarts to not only add but also upgrade nodes within the cluster.  Using this approach, users can:

- Upgrade or patch the underlying hardware and operating system;

- Upgrade or patch MySQL Cluster, with full online upgrades between releases.

MySQL Cluster supports on-line, non-blocking backups, ensuring service interruptions are again avoided during this critical database maintenance task.  Users are able to exercise fine-grained control when restoring a MySQL Cluster from backup using ndb_restore. Users can restore only specified tables or databases, or exclude specific tables or databases from being restored, using ndb_restore options --include-tables, --include-databases, --exclude-tables, and --exclude-databases.

On-Line Schema Evolution

As services evolve, developers often want to add new functionality, which in many instances may demand updating the database schema.  

This operation can be very disruptive for many databases, with ALTER TABLE commands taking the database offline for the duration of the operation.  When users have large tables with many millions of rows, downtime can stretch into hours or even days.

MySQL Cluster supports on-line schema changes, enabling users to add new columns and tables and add and remove indexes – all while continuing to serve read and write requests, and without affecting response times.

Unlike other on-line schema update solutions, MySQL Cluster does not need to create temporary tables, therefore avoiding the user having to provision double the usual memory or disk space in order to complete the operation.

Summary

So in addition to scaling write performance, MySQL Cluster can also scale operational agility.  I'll post more on scaling of data access methods and availability levels over the next few weeks.

You can read more about all of these capabilities in the new Scaling Web Databases guide.  

And of course, you can try MySQL Cluster out for yourself - it's available under the GPL:

The GA release is 7.1, which can be downloaded here, but I'd recommend taking a look at the latest Development Milestone Release for MySQL Cluster 7.2, which has some great new capabilities (localized JOIN operations, simpler provisioning, etc.) and can be downloaded from here (select the Development Releases tab).

As ever, let me know if there are other dimensions of scalability that I should be discussing.
