Thursday Apr 25, 2013

Driving MySQL Innovation

Oracle's VP of MySQL Engineering Tomas Ulin delivered on Tuesday a keynote entitled "Driving MySQL Innovation for Next Generation Applications" at the Percona Live Conference.

If you haven't seen it yet, we highly encourage you to watch it.

Tomas covers:

  • Oracle's Investment in MySQL
  • MySQL 5.6
  • Trends and Product Directions

He makes it very clear that Oracle:

  • Invests in MySQL like Never Before
  • Drives MySQL Innovation
  • Makes MySQL Better for Next Generation Web, Cloud and Big Data Applications


Enjoy the keynote!

Thursday Apr 18, 2013

Adzuna Relies on MySQL to Support Explosive Growth

Adzuna is a fast growing search engine for classified ads specialized in jobs, properties and cars. Headquartered in the UK and launched in 2011, Adzuna searches thousands of sites and tens of millions of ads to make it very easy to find the perfect job, home or car locally. It furthermore provides a wealth of statistics such as salaries trends graphs and comparisons, geographic jobs maps, house prices...and more. Additionally, Adzuna is integrated with Facebook and LinkedIn and shows open vacancies one is connected to through his/her own network. The search engine powers a number of government applications and is integrated into the UK's Prime Minister economic dashboard.

Challenges

  • When Adzuna's founders were selecting the database powering the search engine's architecture, they were planning for scalability and reliability. Not only did they expect fast growth but also unpredictable growth. The number of users could indeed jump from ten thousand to one million in a single day, and any downtime or scalability was simply not an option as it could turn away new users from the site forever, and undermine its reputation form the start.
  • As a Web startup, low Total Cost of Ownership was essential, and the team also desired to implement a database solution they would be able to customize and tailor to their specific needs.


Solution

  • The Adzuna team selected MySQL as the database powering their search engine. They had very positive previous experiences of the world’s most popular open source database, and particularly appreciated its performance and scalability, reliability and ease of use, including the quality of its technical documentation. They were confident it would scale according to their requirements, and were ready to bet their business on the database.
  • Key MySQL strong points included its overall suitability for highly demanding web-based applications, its Geographic Information System (GIS) support and integration with the Perl programming language used by the company.
  • Adzuna's search infrastructure uses MySQL to normalize on a minute by minute basis the data from tens of millions of ads from thousands of websites across four continents, resulting in a database size exceeding 100 GB. The Adzuna engine constantly scrolls the Web and receives XML feeds from its partners, and a key to its success is its ability to turn a massive amount of unstructured data into structured data that can be stored, understood and searched by it users. The team decided to use the Apache Solr open source search platform as front end, leveraging its NoSQL features. Data is subsequently transferred into MySQL. The following diagram describes the Adzuna architecture:
  • In order to help ensure scalability and reliability while focusing its resources on developing its business, The Adzuna team decided to rely on MySQL in the cloud, working with cloud services provider Media Temple.
  • The startup has experienced explosive growth since its launch, currently serving 2 million unique visitors per month, and it recently expanded in Germany, Australia, Canada, South Africa and Brazil.  MySQL totally fulfills the expectations of the Adzuna team, who is confident the database will scale to support their ambition to expand in 30 more countries and become the leading search engine for jobs, homes and cars worldwide.
  • As Taleo, Oracle's talent management cloud service, is increasingly used by Adzuna's partners, the company is currently looking at optimizing data transfer from Taleo to MySQL.

“My advice to young startups is to use MySQL, especially if you have high growth expectations. You’ll need to plan for unpredictability and to have a very robust backbone to support it, and that’s exactly what MySQL provides.” Andrew Hunter, Co-Founder, Adzuna

Wednesday Apr 10, 2013

MySQL Replication: Self-Healing Recovery with GTIDs and MySQL Utilities

MySQL 5.6 includes a host of enhancements to replication, enabling DevOps teams to reliably scale-out their MySQL infrastructure across commodity hardware, on-premise or in the cloud.

One of the most significant enhancements is the introduction of Global Transaction Identifiers (GTIDs) where the primary development motivation was:

- enabling seamless failover or switchover from a replication master to slave

- promoting that slave to the new master

- without manual intervention and with minimal service disruption.

You can download the new MySQL Replication High Availability Guide to learn more.  The following sections provide an overview of how GTIDs and new MySQL utilities work together to enable self-healing replication clusters.  

GTIDs

To understand the implementation and capabilities of GTIDs, we will use an example illustrated below:

- Server “A” crashes

- We need to failover to one of the slaves, promoting it to master

- The remaining server becomes a slave of that new master

Figure 1: MySQL Master Failover and Slave Promotion

As MySQL replication is asynchronous by default, servers B and C may not have both replicated and executed the same number of transactions, i.e. one may be ahead of the other. Consider the following scenarios:

Scenario #1

- Server B is ahead of C and B is chosen as the new master;

- [Then] Server C needs to start to replicate from the first transaction in server B that it is yet to receive;

Scenario #2

- Server C has executed transactions that have so far not been received by Server B, but the administrator designates Server B as the new master (for example, it is configured with higher performing hardware than Server C).

- Server B therefore needs to execute later transactions from Server C, before assuming the role of the master, otherwise lost transactions and conflicts can ensue.

GTIDs apply a unique identifier to each transaction, so it is becomes easy to track when it is executed on each slave. When the master commits a transaction, a GTID is generated which is written to the binary log prior to the transaction. The GTID and the transaction are replicated to the slave.

If the slave is configured to write changes to its own binary log, the slave ensures that the GTID and transaction are preserved and written after the transaction has been committed.

The set of GTIDs executed on each slave is exposed to the user in a new, read-only, global server variable, gtid_executed. The variable can be used in conjunction with GTID_SUBTRACT() to determine if a slave is up to date with a master, and if not, which transactions are missing.

A new replication protocol was created to make this process automatic. When a slave connects to the master, the new protocol ensures it sends the range of GTIDs that the slave has executed and committed and requests any missing transactions. The master then sends all other transactions, i.e. those that the slave has not yet executed. This is illustrated in the following example (note that the GTID numbering and binlog format is simplified for clarity):

Figure 2: Automatically Synchronizing a New Master with it's Slaves

In this example, Server B has executed all transactions before Server A crashed. Using the MySQL replication protocol, Server C will send “id1” to Server B, and then B will send “id 2” and “id3” to Server C, before then replicating new transactions as they are committed. If the roles were reversed and Server C is ahead of Server B, this same process also ensures that Server B receives any transactions it has so far not executed, before being promoted to the role of the new master.

We therefore have a foundation for reliable slave promotion, ensuring that any transactions executed on a slave are not lost in the event of an outage to the master.

MySQL Utilities for GTIDs

Implemented as Python-based command-line scripts, the Replication utilities are available as a standalone download as well as packaged with MySQL Workbench. The source code is available from the MySQL Utilities Launchpad page.

The utilities supporting failover and recovery are components of a broader suite of MySQL utilities that simplify the maintenance and administration of MySQL servers, including the provisioning and verification of replication, comparing and cloning databases, diagnostics, etc. The utilities are available under the GPLv2 license, and are extendable using a supplied library. They are designed to work with Python 2.6 and above.

Replication Utility: mysqlfailover

While providing continuous monitoring of the replication topology, mysqlfailover enables automatic or manual failover to a slave in the event of an outage to the master. Its default behavior is to promote the most viable slave, as defined by the following slave election criteria.

• The slave is running and reachable;

• GTIDs are enabled;

• The slaves replication filters do not conflict;

• The replication user exists;

• Binary logging is enabled.

Once a viable slave is elected (called the candidate), the process to retrieve all transactions active in the replication cluster is initiated. This is done by connecting the candidate slave to all of the remaining slaves thereby gathering and executing all transactions in the cluster on the candidate slave. This ensures no replicated transactions are lost, even if the candidate is not the most current slave when failover is initiated.

The election process is configurable. The administrator can use a list to nominate a specific set of candidate slaves to become the new master (e.g. because they have better performing hardware).

At pre-configured intervals, the utility will check to see if the server is alive via a ping operation, followed by a check of the connector to detect if the server is still reachable. If the master is found to be offline or unreachable, the utility will execute one of the following actions based on the value of the failover mode option, which enables the user to define failover policies:

• The auto mode tells the utility to failover to the list of specified candidates first and if none are viable, search the list of remaining slaves for a candidate.

• The elect mode limits election to the candidate slave list and if none are viable, automatic failover is aborted.

• The fail mode tells the utility to not perform failover and instead stop execution, awaiting further manual recovery actions from the DevOps team.

Via a series of Extension Points users can also bind in their own scripts to trigger failover actions beyond the database (such as Virtual IP failover). Review the mysqlfailover documentation for more detail on configuration and options of this utility.

Replication Utility: mysqlrpladmin

If a user needs to take a master offline for scheduled maintenance, mysqlrpladmin can perform a switchover to a specific slave (called the new master). When performing a switchover, the original master is locked and all slaves are allowed to catch up. Once the slaves have read all events from the original master, the original master is shutdown and control is switched to the new master.

There are many Operations teams that prefer to take failover decisions themselves, and so mysqlrpladmin provides a mechanism for manual failover after an outage to the master has been identified, either by using the health reporting provided by the utilities, or by alerting provided by a tool such as the MySQL Enterprise Monitor.

It is also possible to easily re-introduce recovered masters back to the replication topology by using the demote-master command and initiating a switchover to the recovered master.

Review the mysqlrpladmin documentation for more detail on configuration and options of the utility.

Resources to Get Started

In addition to the documentation, two additional resources are useful to enable you to learn more:

- MySQL replication utilities tutorial video

- MySQL replication guide: building a self healing replication topology

Summary

The combination of options to control failover or switchover and the ability to inform applications of the failover event are powerful features that enable self-healing for critical applications.

Thursday Apr 04, 2013

MySQL 5.6 Replication: All That Is New, On-Demand

The new MySQL 5.6 GA release delivers a host of new capabilities to support developers releasing new services faster, with more agility, performance and security .

One of the areas with the most far-reaching set of enhancements is MySQL replication used by the largest web, mobile and social properties to horizontally scale highly-available MySQL databases across distributed clusters of low cost, commodity servers.

A new on-demand MySQL 5.6 replication webinar takes you on a guided tour through all of those enhancements, including:

- 5x higher master and slave performance with Binary Log Group Commit, Multi-Threaded Slaves and Optimized Row-based replication

- Self healing replication with Global Transaction Identifiers (GTIDs), utilities and crash-safe masters and slaves

- Replication event checksums for rock-solid data integrity across a replication cluster

- DevOps features with Time-Delayed Replication, Remote binary log backup and new CLI utilities

We had some great questions during the live session - here are a few:

Question: When multi-threaded slave is configured, how do you handle multi-database DML? 

Answer: When that happens the slave switches to single-threaded mode for that transaction. After it is applied, the slave switches back to multi-threaded mode.

Question: We have seen in past that replication would sometime break when complex queries get fired or alter statements take place. Neither row based or Statement based replication guaranteed 100% fail safe replication. Does MySQL5.6 address this issue ?

Answer: Replication can fail for several specific reasons. There is no general algorithm enabling "100% fail safe" replication. But we made sure that all that are unsafe are either converted to row based replication or issue warnings.

Question: We are looking for a HA cluster solution where most of the websites are Wordpress based. Do you have any recommendations on what kind of replication solution should we choose? The tables are using innodb

Answer: Using MySQL 5.6 with the master/slave replication we are discussing today, configured with GTIDs and using the mysqlfailover utility will provide you High Availability for your Wordpress cluster and InnoDB tables. Nothing you need to add - it is all part of MySQL you download today

Question: I have question about data loss and translating my-bin.000132 to my-bin.00101 when a master fails?

Answer: That's exactly the problem GTIDs address. Without GITDs it's hard to know the correct position on my-bin.000101 that corresponds to my-bin.000132.

With GTIDs we don't need to know positions, only what the slave has already applied, knowing that it's simple to send to the new master any missing transactions that are on other slaves. The data that was already on my-bin.00132 can be ensured that it was sent to a slave using Semi-syncronous replication.

Question: How do the new utilities work with a Pacemaker/Corosync setup?

Answer: The utilities integrate with Pacemaker/Corosync via extension points. The MySQL Pacemaker agent can call rpladmin to perform a switchover.

Question: Where can I learn more about supported HA solutions for MySQL?

Answer: From the Guide to MySQL HA solutions. I hope you find this useful

Question: Are there any plans to move replication from host-based to either 'database' or 'table' based?

Answer: Current MySQL replication supports replication of specific databases or tables from a Host using Replication filters. This means, we support Replication of a host (all data which could be replicated), replication of specific databases per host and specific tables per host. (You can ignore databases and tables using filters too).

For these and other questions, along with the full replay of the webinar, register here


About

Get the latest updates on products, technology, news, events, webcasts, customers and more.

Twitter


Facebook

Search

Archives
« April 2013 »
SunMonTueWedThuFriSat
 
1
3
5
6
8
11
12
13
15
16
19
20
21
23
24
27
28
29
    
       
Today