Friday Apr 26, 2013

MySQL Web Reference Architectures: On Demand

Last week we ran a live webinar presenting the new MySQL Web Reference Architectures, a set of documented and repeatable best practices for building highly available, scaleable and secure database infrastructure to power new generations of web and mobile services.

The webinar replay is now available on-demand so you can listen in from the comfort of your own desk…or commute.

As a taster - we discuss sizing and design patterns - you can see a sample below:




We received some great questions during the Q&A session which I felt would be useful to you, so I've included them below:

Q. You are talking a lot about MySQL 5.6. We are on MySQL 5.1 - what are the reasons for us to upgrade:

There is a great summary in the MySQL 5.6 developer / DBA guide. You will see the advantage of 5.6 for web and cloud applications in performance, scaling, agility,security, instrumentation and reliability.

Q. When using replication, the master sends only the log file for the slave?

A. The binary log contains all of the database writes on the master. The master then sends the data from this log file to each of the slaves so that they can apply the same writes. It doesn't send the log file itself but sends the changes as they're written to the file (or if a slave disconnects and then reconnects the master will send over all changes that it missed). You can find out more in the introduction to MySQL replication paper 

Q. Are there any open-source utilities to provide Virtual IP failover that can be integrated with mysqlfailover utility? Or should we use corosync and pacemaker with mysqlrpadmin?

Certainly you could use corosync and pacemaker - in which case you're right that mysqlrpladmin might be the best option as then you only have one brain deciding when to fail everything over. Alternatively you can bind your own scripts into mysqlfailover and you could use those to trigger a VIP failover

Q. What should we do to address any potential replication lag?

A. Fast cores on the slave would help. If you're able split your tables between different schemas (databases in MySQL terms) then you can exploit mutil-threaded slaves (parallel slave apply) in MySQL 5.6. Also take a look at optimized row-based replication in MySQL 5.6

Q. Does MySQL 5.6 offer any replication related improvements over 5.5?

A. You bet. Performance, reliability, flexibility. See the introduction paper referenced in the answer above 

Q. Any resources you know of on implementing the Medium-size MySQL web reference architecture with Ruby on Rails apps in the application layer?

A. General best practices outlines in the MySQL web reference architectures guide will apply. There are more specific MySQL / Ruby on Rails best practices here. You can also use community-developed MySQL Ruby connectors

Q. How can we integrate MySQL with Hadoop?

A. You have several options which can be used independently or together. Apache Sqoop provides batch transfers between MySQL and Hadoop, and is also fully bi-directional, so you can replicate the results of Hadoop Map Reduce jobs back to MySQL tables. There is also the new MySQL Applier for Hadoop which enables the streaming of events in real-time from MySQL to Hadoop.

You can learn more about all of the option from the MySQL and Hadoop guide to big data

Q. MySQL Cluster looks interesting, but when we looked at it in the past, it didn't support Foreign Keys, so not sure how you can suggest it as an option for eCommerce apps?

A. There are many eCommerce providers already running MySQL Cluster - Shopatron is a great example. The latest development milestone release of MySQL cluster 7.3 adds native support for Foreign Keys as well - so worth taking another look! 

Q. is it possible to customise auto-sharding when using MySQL Cluster?

A. Yes, you can choose which column within the primary key is to be used as the 'sharding key' - that way you can for example make sure that all data for a specific user will be in the same data node; this can help ensure linear scalability. Details in the MySQL Cluster Performance Optimization guide 

Q. Any advice on UserID generation for sharding?

A. One design pattern is to just use an auto-increment column and then hash on that. Another option is to use a UUID - you can then shard by ranges or by hashing

Q. Can we choose one : clustering or sharding? which one better for HA?

A. It depends on a number of factors such as what level of availability you need, how your application is architected, etc. A good resource to help you weigh up the pros and cons of the different approaches is the MySQL guide to high availability

Q. My company is using 5.5 with one master and slave for our web application. We are currently looking into a new system infrastructure using VM technology. What would you recommend?

A. There are many options available to you on multiple hypervisors. We have created a VM Template for MySQL running on Oracle VM and Oracle Linux. This is a pre-installed, pre-configured and certified software stack which you download as a single file and then provision to an Oracle VM Server farm. You can learn more about it from the Oracle VM Template for MySQL Enterprise Edition guide

Q. I'm interested in the MySQL monitoring tools and in the audit plug-in - how do I get these?

A. These are available as part of MySQL Enterprise Edition which also packages support, backup, security plug-ins, etc in an annual subscription.

Q. How much would it cost to run MySQL Enterprise Edition on that "perfect server"?

A. Pricing is available in the on-line store. Note that the list price is the same regardless of number of cores unless you exceed 4 CPU *sockets*

Q. i am sorry for this very novice question, but could I have some help regarding references that would allow me to digest the" base" of these technologies presented her?

A. Assuming it is case studies you are after, we have a lot of them published at the MySQL customers page. In the "Industry" drop down box, take a look at Web eCommerce, games, SaaS/Hosting and social networks


Monday Apr 22, 2013

Announcing the MySQL Applier for Apache Hadoop

Enabling Real-Time MySQL to HDFS Integration

Batch processing delivered by Map/Reduce remains central to Apache Hadoop, but as the pressure to gain competitive advantage from “speed of thought” analytics grows, so Hadoop itself is undergoing significant evolution. The development of technologies allowing real time queries, such as Apache Drill, Cloudera Impala and the Stinger Initiative are emerging, supported by new generations of resource management with Apache YARN

To support this growing emphasis on real-time operations, we are releasing a new MySQL Applier for Hadoop to enable the replication of events from MySQL to Hadoop / Hive / HDFS (Hadoop Distributed File System) as they happen.  The Applier complements existing batch-based Apache Sqoop connectivity.

Replication via the MySQL Applier for Hadoop is implemented by connecting to the MySQL master and reading binary log events as soon as they are committed, and writing them into a file in HDFS.


The Applier for Hadoop uses an API provided by libhdfs, a C library to manipulate files in HDFS. The library comes precompiled with Hadoop distributions.

It connects to the MySQL master to read the binary log and then:

  • Fetches the row insert events occurring on the master
  • Decodes these events, extracts data inserted into each field of the row, and uses content handlers to get it in the format required
  • Appends it to a text file in HDFS.

Databases are mapped as separate directories, with their tables mapped as sub-directories with a Hive data warehouse directory. Data inserted into each table is written into text files (named as datafile1.txt) in Hive / HDFS. Data can be in comma separated format; or any other, that is configurable by command line arguments. 


You can learn more about the design of the MySQL Applier for Hadoop from this blog

The installation, configuration and implementation are discussed in detail in this Applier for Hadoop blog. Integration with Hive is also documented.

You can also see it in action from this MySQL Hadoop Applier Video Tutorial 

With the growth in big data projects and Hadoop adoption, it would be great to get your feedback on how we can further develop the Applier to meet your real-time integration needs. Please use the comments section to let the MySQL team know your priorities.

About

Get the latest updates on products, technology, news, events, webcasts, customers and more.

Twitter


Facebook

Search

Archives
« April 2013 »
SunMonTueWedThuFriSat
 
1
3
5
6
8
11
12
13
15
16
19
20
21
23
24
27
28
29
    
       
Today