Saturday Mar 14, 2009

Script to start lots of MySQL Cluster nodes on Sparc CMT

I recently had the chance to play with the new T5140 servers. Using the Sparc CMT architecture, these servers present an amazing 128 CPUs for you to use (as a combination of cores and compute threads; there are only 2 sockets).

We are doing some trials with eager Sun customers who want to utilize these babies. The good news is that MySQL Cluster 7.0 (aka 6.4) supports a multi-threaded data node option. The bad news is that one ndbd process still only uses about 8 CPU cores, so to utilize 128 there is some way to go: we still have to launch many ndbd processes to get the full power out of these boxes. But with 7.0 there is at least a point in trying at all.
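For reference, the multi-threaded binary is ndbmtd, and the number of threads it uses is controlled in config.ini. A minimal sketch of the relevant section (the memory figures here are made-up placeholders, so adjust for your own servers):

```ini
[ndbd default]
NoOfReplicas = 2
# Placeholder sizes - tune for the actual host
DataMemory = 4G
IndexMemory = 1G
# In MySQL Cluster 7.0, ndbmtd runs up to 8 threads per data node
MaxNoOfExecutionThreads = 8
```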

I developed a simple script which lets me easily start a varying number of ndbd and mysqld processes on one host (and then copy the script to start the same number of processes on another host). If you have been using Johan's excellent severalnines scripts, I should explain that here I'm trying to do exactly the opposite of what those do. The benefit of the severalnines scripts is that you can comfortably start and manage the whole cluster from one single command line; they will ssh into your other servers for you and execute the needed commands. This script does not do that; indeed, the point is just to make it simple to start 6 ndbd and 6 mysqld processes on the same server I'm logged in to.


# This is a simple script to start many ndbd and mysqld processes on one host.
# It is useful for the newest Sparc CMT architectures where you may want to
# start many processes per physical server.
# Copyright 2000-2008 MySQL AB, 2008 Sun Microsystems, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; version 2 of the License.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program; see the file COPYING. If not, write to the
# Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston
# MA 02110-1301 USA.


# The node-id's to start, odd on one host, even on the other
NUM_NDBD='3 5 7 9 11 13'
#NUM_NDBD='4 6 8 10 12 14'

# I also had some code to circulate ndbd's to bind to 3 different
# NIC's, but this is omitted here for simplicity.

# How many mysqld's to start on each host. In this case we don't need
# to use the node-id for cluster, instead these numbers are used in
# the pathname of mysqldir, in the port and socket to listen to.
NUM_MYSQL='01 02 03 04 05 06'

# Whether to start with --initial or not
INITIAL=''
#INITIAL='--initial'

# Large memory pages Solaris optimization
#export LD_PRELOAD
#export MPSSHEAP

# MySQL Cluster 6.4 management daemon tries to store configuration state,
# which is annoying when you fiddle with config.ini a lot, so delete the
# stored configs.
rm mysql-cluster/ndb_1_config.bin.1
libexec/ndb_mgmd --ndb-nodeid=1 --config-file=/etc/mysql/config-kludge.ini

sleep 4

for i in $NUM_NDBD
do
  libexec/ndbmtd $INITIAL --ndb-nodeid=$i --ndb-connectstring=$HOSTNAME:1186 &
done

sleep 10

# INSTALLDIR is assumed to point at the MySQL basedir (or empty for relative paths)
for i in $NUM_MYSQL
do
  mkdir var/var$i
  bin/mysql_install_db --datadir=var/var$i >/dev/null
  chown -R mysql var/var$i
  bin/mysqld_safe --datadir=${INSTALLDIR}var/var$i/ --port=33$i --socket=/tmp/mysql.$i.sock &
  # Not needed, running with skip-grant-tables instead
  #bin/mysql --socket=/tmp/mysql.$i.sock -u root -e "GRANT ALL ON *.* TO 'root'@'%'"
done

Oh, you want to hear results from my tests? Sorry, maybe later, gotta go now...

Sunday Feb 15, 2009

Date handling in MySQL

I see my colleague Anders wrote some thoughts on date handling in MySQL. I recently had to explain this really down to the fine details for a customer, so Anders, everyone else, here are my conclusions...

All the fine details of the allowed ways to enter dates are explained in the MySQL reference manual.

With dates, just as in many other cases, MySQL is very flexible in how you can input data. But one thing to note is that, among the multitude of formats that are allowed, MySQL does *not* by default allow you to enter a non-existent date, e.g. "31st of February 2009". I know some other DB products allow that - they then store it as a correct date automatically (3rd of March 2009). This is mentioned at the end of that manual page. It can be enabled with an SQL_MODE setting and I have attached a lengthy example about that separately below (Listing 1).

Short explanation in English:

1. Create a table with DATE field, verify that a normal date insert works fine.

2. sql_mode is not set, using the default "liberal" mode

3. Insert with id=2 "succeeds" but with a warning. In fact, only a zero value was inserted. This is consistent with sql_mode.

4. When sql_mode is set to one of the "strict" modes, the same insert (id=3) fails with an error. No new row was inserted.

5. Adding the option ALLOW_INVALID_DATES to sql_mode, makes the same insert work (id=4).

6. Note that when selecting, the date is returned in the same invalid form it was inserted. However, if using it in an operation, it is converted to match a valid date forward or backward in time.
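The forward conversion mentioned in step 6 is ordinary calendar arithmetic. As a sanity check (a hypothetical helper, not part of the original listing), the same rule in Python:

```python
from datetime import date, timedelta

def normalize(year, month, day):
    """Roll an out-of-range day-of-month forward, the way MySQL
    normalizes an ALLOW_INVALID_DATES value used in date arithmetic."""
    return date(year, month, 1) + timedelta(days=day - 1)

# 2008 was a leap year: Feb 31 rolls forward to March 2
print(normalize(2008, 2, 31))   # 2008-03-02
# 2009 was not: Feb 31 rolls forward to March 3
print(normalize(2009, 2, 31))   # 2009-03-03
```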

mysql> CREATE TABLE t (id INT, d1 DATE);
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE "%sql_mode%";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| sql_mode      |       |
+---------------+-------+
1 row in set (0.01 sec)

mysql> INSERT INTO t VALUES (1, "2008-12-21");
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO t VALUES (2, "2008-02-31");
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql> SHOW WARNINGS;
+---------+------+-----------------------------------------+
| Level   | Code | Message                                 |
+---------+------+-----------------------------------------+
| Warning | 1265 | Data truncated for column 'd1' at row 1 |
+---------+------+-----------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM t;
+------+------------+
| id   | d1         |
+------+------------+
|    1 | 2008-12-21 |
|    2 | 0000-00-00 |
+------+------------+
2 rows in set (0.00 sec)

mysql> SET SESSION sql_mode='STRICT_ALL_TABLES';
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO t VALUES (3, "2008-02-31");
ERROR 1292 (22007): Incorrect date value: '2008-02-31' for column 'd1' at row 1

mysql> SELECT * FROM t;
+------+------------+
| id   | d1         |
+------+------------+
|    1 | 2008-12-21 |
|    2 | 0000-00-00 |
+------+------------+
2 rows in set (0.00 sec)

mysql> SHOW VARIABLES LIKE "%sql_mode%";
+---------------+-------------------+
| Variable_name | Value             |
+---------------+-------------------+
| sql_mode      | STRICT_ALL_TABLES |
+---------------+-------------------+
1 row in set (0.00 sec)

mysql> SET SESSION sql_mode='ALLOW_INVALID_DATES';
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE "%sql_mode%";
+---------------+---------------------+
| Variable_name | Value               |
+---------------+---------------------+
| sql_mode      | ALLOW_INVALID_DATES |
+---------------+---------------------+
1 row in set (0.00 sec)

mysql> INSERT INTO t VALUES (4, "2008-02-31");
Query OK, 1 row affected (0.00 sec)

mysql> SELECT * FROM t;
+------+------------+
| id   | d1         |
+------+------------+
|    1 | 2008-12-21 |
|    2 | 0000-00-00 |
|    4 | 2008-02-31 |
+------+------------+
3 rows in set (0.01 sec)

mysql> SELECT d1 + INTERVAL 0 DAY FROM t;
+---------------------+
| d1 + INTERVAL 0 DAY |
+---------------------+
| 2008-12-21          |
| NULL                |
| 2008-03-02          |
+---------------------+
3 rows in set (0.01 sec)

mysql> -- Note: 2008 was a leap year so above is correct!


Lessons learned from the example:
1. By default MySQL will often continue operations without stopping with an error - this happens in many other cases besides this one. In the old days, without the ETL tools we have today, this was probably a very convenient feature if you had to insert "dirty" data into your Data Warehouse.

2. You may/should consider setting sql_mode='TRADITIONAL' (or another strict mode) to stop and get an error to catch such mistakes.

3. Set sql_mode="ALLOW_INVALID_DATES" to allow out-of-range dates, as is usual in some other DB products.


Another thing that may also come up is that Oracle allows dates to be expressed in textual forms, such as "31-jan-2009".

This hasn't been supported in MySQL, and the situation is still the same in 5.1. There actually is a "date_format" variable in place, but in practice you can only set it to values that are equivalent to the default; you cannot really change it:

mysql> show variables like "%date%";
+-----------------+-------------------+
| Variable_name   | Value             |
+-----------------+-------------------+
| date_format     | %Y-%m-%d          |
| datetime_format | %Y-%m-%d %H:%i:%s |
+-----------------+-------------------+

mysql> SET SESSION date_format="%Y-%M-%d";
ERROR 1231 (42000): Variable 'date_format' can't be set to the value of '%Y-%M-%d'

mysql> SET SESSION date_format="%Y-%m-%d";
Query OK, 0 rows affected (0.00 sec)

mysql> SET SESSION date_format="%Y/%m/%d";
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT * FROM t;
+------+------------+
| id   | d1         |
+------+------------+
|    1 | 2008-12-21 |
|    2 | 0000-00-00 |
|    4 | 2008-02-31 |
+------+------------+
3 rows in set (0.00 sec)


You can of course always use the DATE_FORMAT() and STR_TO_DATE() functions explicitly if you need to handle textual date formats.
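For instance (a hypothetical session; '%b' and '%M' are the abbreviated and full month-name specifiers from the manual):

```sql
-- Parse an Oracle-style textual date into a proper DATE value
SELECT STR_TO_DATE('31-jan-2009', '%d-%b-%Y');   -- returns 2009-01-31

-- ...and render a DATE back out in a textual form
SELECT DATE_FORMAT('2009-01-31', '%d-%b-%Y');    -- returns 31-Jan-2009
```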

Thursday Jan 08, 2009

Tips for bulk loading

A colleague at Sun asked me for tips on how to tune MySQL to get fast bulk loads of csv files. (The use case being: insert data into a data warehouse during a nightly load window.) Considering that I spend most of my time working with MySQL Cluster, I was amazed at how many tips I could already come up with both for MyISAM and InnoDB. So I thought it might be interesting to share, and also: Do you have any more tips to add?

[A Sun partner] has requested a POC to test MySQL's bulk loading capabilities from CSV files. They have about 20GB of compressed CSV files, and they want to see how long it takes to load them.

They haven't specified which storage engine they intend to use yet.

Good start, well defined PoC. The storage engine question actually is significant here.

- MyISAM typically may be up to twice as fast for bulk loads compared to InnoDB. (But there is some tuning that can bring InnoDB closer to MyISAM, see below.) MyISAM should also be faster on reads.

- On the downside, MyISAM doesn't offer transactions which also implies no crash recovery!

I would recommend MyISAM if:

- Customer/Partner understands the points below, i.e. is not completely ignorant of DB/MySQL concepts. If he is or appears ignorant, then using InnoDB is simpler and safer.

- Apart from the bulk load, there is no transactional activity on the database. (not at all, or very little)

- It is feasible for the customer to load the data for instance at night time with something like LOCK TABLE...; LOAD DATA INFILE...; UNLOCK TABLE;. In other words, nobody will read data or generate reports during the load period, e.g. it could be a nightly window or something.

- Crash recovery is not an issue. This basically means that in case MySQL (or the whole server) crashes and the DB cannot be repaired, there is always a backup, ZFS snapshot or other means available to get back to the state after the most recent successful bulk load. And there were no transactions after the bulk load, or it is ok for them to be lost. (Note that if there is no write activity, then a crash is *not* a problem; problems only happen if there is a crash during LOAD/INSERT/UPDATE.)

Note that InnoDB is not a bad choice for this task either, and we do have big data warehouses running on InnoDB.

I've googled, and have found the following parameters:

MyISAM: bulk_insert_buffer_size, myisam_sort_buffer_size, key_buffer_size

InnoDB: innodb_buffer_pool_size, innodb_log_file_size, innodb_flush_log_at_trx_commit, innodb_flush_method, innodb_doublewrite

Besides the parameters above, are there any others that I should be looking at? Also, do you have any pointers to how I can derive the optimum values of the parameters above?

Actually, I'm not an expert on InnoDB and MyISAM parameters. What you've found seem to be what people generally tune, yes. But I might be missing something myself.

In addition, I would think about the following:

- For both, but especially for MyISAM, try to find ways how to make the inserts happen concurrently. Obviously you should have each table in a separate csv file, but also data going into the same table can be split into multiple parts. Then you can LOAD DATA INFILE... simultaneously for each csv file.

- For MyISAM, use the CONCURRENT keyword to enable inserts to happen concurrently, and if needed use "SET GLOBAL concurrent_insert=2".
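A sketch of what that looks like (file, table and column names are made up for illustration):

```sql
SET GLOBAL concurrent_insert = 2;

-- CONCURRENT lets other sessions keep reading the MyISAM table
-- while the load is in progress
LOAD DATA CONCURRENT INFILE '/data/facts-part1.csv'
  INTO TABLE facts
  FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
  LINES TERMINATED BY '\n';
```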

- If you don't use replication (you probably don't), turn off the binary log by commenting out log-bin in the my.cnf file.

For InnoDB:

- It is a big advantage if the data in the CSV files is already ordered by primary key. (Because the InnoDB primary key is a clustered index, so it will organize the table physically to be in primary key order anyway.)

- For the bulk insert, you should consider turning off foreign key checking and unique index checking.

- Using InnoDB plugin, you can speed things up by inserting data into a table without indexes (only define primary key, of course), and then create the indexes separately with alter table. (on an existing table you can also consider dropping existing indexes, the benefit of this would depend case by case).
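Putting the InnoDB points together, a load session could look something like this (table, file and index names are placeholders; remember to re-enable the checks afterwards):

```sql
SET foreign_key_checks = 0;
SET unique_checks = 0;

-- Ideally the CSV is already sorted by primary key
LOAD DATA INFILE '/data/facts.csv'
  INTO TABLE facts
  FIELDS TERMINATED BY ',';

-- Create secondary indexes only after the data is in
-- (fast index creation with the InnoDB plugin)
ALTER TABLE facts ADD INDEX idx_created (created_date);

SET foreign_key_checks = 1;
SET unique_checks = 1;
```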

The InnoDB plugin is not supplied by Sun but directly from Oracle. So we cannot sell it with the commercial (OEM) license, but for customers who use it for internal use we do support it on a best effort basis (as we do for other non-standard MySQL uses too).


Monday Dec 15, 2008

How much DataMemory+IndexMemory do you need for disk data?

One thing we were guessing at with Massimo yesterday is: if you store large blobs as disk data, how much DataMemory and IndexMemory will they consume (the primary key, each "chunk" having a hidden primary key, the first 256 bytes of each blob being stored in memory...)?

My empirical test showed that about 2% of the total size of the blobs is needed as RAM (25% of that is IndexMemory).

IMHO this is close to negligible, but in some situations it is not negligible at all (you may have close to a TB of disk data -> 20GB of RAM needed for disk data).

Also note that this is a minimum figure. If you actually have something
else than the blob (like other indexes) you of course use much more RAM.

The test was:
CREATE TABLE `jpgtest` (
  `id` int(11) NOT NULL,
  `jpg` blob,
  PRIMARY KEY (`id`)
) TABLESPACE ts_1 STORAGE DISK ENGINE=NDB;  -- ts_1 is a placeholder, the original listing was truncated

and inserting 100k blobs into that table (7+ GB in total).

Details below.

PS: Note that Johan just posted several excellent posts on using MySQL Cluster disk based data:

****** Loading 100k files as blobs into an NDB disk data table. *****
(Simple test, one datafile, one insert thread, etc...)

GRANT ALL ON *.* TO 'root'@'';

-- tablespace and undo log (the lg_1/ts_1 names are placeholders;
-- the original listing was truncated)
CREATE LOGFILE GROUP lg_1
  ADD UNDOFILE 'undo_1.dat'
  INITIAL_SIZE 128M
  ENGINE NDB;

CREATE TABLESPACE ts_1
  ADD DATAFILE 'data_1.dat'
  USE LOGFILE GROUP lg_1
  INITIAL_SIZE 10G
  ENGINE NDB;

use test;

CREATE TABLE `jpgtest` (
  `id` int(11) NOT NULL,
  `jpg` blob,
  PRIMARY KEY (`id`)
) TABLESPACE ts_1 STORAGE DISK ENGINE=NDB;

-bash-3.2$ cat

use DBI;
use DBI qw(:sql_types);

$hostname = "ndb05";
$database = "test";
$tablename = "jpgtest";
$user = "root";
$pw = "";

$dsn = "DBI:mysql:database=$database;host=$hostname;port=3306";

$dbh = DBI->connect($dsn, $user, $pw);
$drh = DBI->install_driver("mysql");

$n = 100000;
open FH, 'fakepic.jpg';
{ local $/; $jpg = <FH>; }   # slurp the whole file
close FH;

$sth = $dbh->prepare("INSERT INTO jpgtest VALUES (?, ?);");
for $i (1..$n) {
    $sth->bind_param(1, $i, {TYPE => SQL_INTEGER});
    $sth->bind_param(2, $jpg, {TYPE => SQL_BLOB});
    $sth->execute();
    print "$i\n";
}

-bash-3.2$ ls -lFh
total 108K
-rw-r--r-- 1 hingo hingo 100K 2008-12-09 16:26 fakepic.jpg
-rw-r--r-- 1 hingo hingo 634 2008-12-09 20:48

Load speed:
real 24m36.396s
user 1m24.002s
sys 0m13.382s

mysql> select count(*) from jpgtest;

77831 records
1476,4 seconds
52,72 rows/sec <<<<<<<<<<<
102404 bytes/row
5398420,02 bytes/sec
5,15 MB/sec <<<<<<<<<<

Data usage:
7970205724 bytes total
7,42 GB total (data inserted) <<<<<<<<<<<

[root@ndb05 mysql]# ls -laiFh /data1/mysqlcluster/ndb_2_fs
total 11G
21200937 -rw-r--r-- 1 root sroot 11G 2008-12-10 12:09 data_1.dat
21200936 -rw-r--r-- 1 root sroot 128M 2008-12-10 12:09 undo_1.dat
(The above means nothing, the sizes are as specified when created. However, it is interesting to note that 10GB in MySQL becomes 11GB in the filesystem...)

(This query from Johan's blog)
mysql> select free_extents, total_extents from information_schema.files where file_type='datafile';
| free_extents | total_extents |
| 5284 | 10240 |
| 5284 | 10240 |
(Interesting... Why are more than half of my extents still free,
even if I inserted 7 GB into a 10 GB file? Something in this is not right...)

-bash-3.2$ tail ndb_1_cluster.log
2008-12-10 12:11:40 [MgmSrvr] INFO -- Node 3: Data usage is 11%(3704 32K pages of total 32768)
2008-12-10 12:11:40 [MgmSrvr] INFO -- Node 3: Index usage is 28%(4725 8K pages of total 16416)

3704 pages
32 KB/page
118528 KB
115,75 MB

4725 pages
8 KB/page
37800 KB
36,91 MB

RAM vs datafile ratio:
7,42 GB of largish blobs will use about
152,66 MB of RAM (indexes, hidden "chunk" indexes, beginning of each blob is in RAM...)
Conclusion: Allocate
2,01 %
of the size of your disk data blobs for RAM!

24,18 % of that is IndexMemory
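The arithmetic above is easy to re-check; a small Python script, with the numbers taken straight from the log output above:

```python
# DataMemory and IndexMemory usage reported by the management server
data_pages, data_page_kb = 3704, 32     # 32K pages
index_pages, index_page_kb = 4725, 8    # 8K pages
blob_bytes = 7970205724                 # total size of inserted blobs

ram_kb = data_pages * data_page_kb + index_pages * index_page_kb
ram_gb = ram_kb / 1024.0 / 1024.0
blob_gb = blob_bytes / 1024.0 ** 3

print("RAM used: %.2f MB" % (ram_kb / 1024.0))                  # 152.66 MB
print("RAM/blob ratio: %.2f %%" % (100 * ram_gb / blob_gb))     # 2.01 %
print("IndexMemory share: %.2f %%"
      % (100.0 * index_pages * index_page_kb / ram_kb))         # 24.18 %
```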

Friday Nov 28, 2008

How to use JDBC (Connector/J) with MySQL Cluster

Last week I helped a customer set up a JBoss application against MySQL Cluster. It turns out it is not immediately obvious how you should set up our JDBC connector to do load balancing and failover. For instance, setting the connector up for a Master-Slave setup (with MySQL Enterprise) is well documented, but doing the same with MySQL Cluster is not.

It's not really properly documented in the manual, but I found in the changelogs, and confirmed on IRC, that to do load-balancing across the SQL nodes in MySQL Cluster, you use a different JDBC connection string with the "loadbalance" keyword added; with two SQL nodes it looks something like this (host names are placeholders):

jdbc:mysql:loadbalance://sqlnode1:3306,sqlnode2:3306/test

That does indeed loadbalance, however it didn't properly address failover. When an SQL node is killed, it still tries to round-robin queries to all specified hosts, resulting in exceptions 50% of the time (with 2 nodes, that is).

After further digging (in fact, my colleague got this from Mark Matthews himself) I finally found out that the correct string to use adds the loadBalanceBlacklistTimeout parameter, something like this (host names again placeholders):

jdbc:mysql:loadbalance://sqlnode1:3306,sqlnode2:3306/test?loadBalanceBlacklistTimeout=5000

The "loadBalanceBlacklistTimeout" adds the needed feature that failed connections in a connection pool are put aside for the specified time, and only working connections are utilized.

That's all that is needed. It is simple and beautiful once you get it to work!

Update: I should add that transactions that are running when a node crashes will still roll back and return an exception. This is by design, and it is then up to the application to decide whether to give up or retry. If you retry the transaction, the JDBC driver will pick a working SQL node for the second try.

Wednesday Oct 29, 2008

Because we can: MySQL talks with Johan Wikman, Father of MySQL on Symbian/S60. (part 3 of 3)

Continued from Part 2

Q: So what are some applications or prototypes you are actually working on? Which do you see as the most interesting ones? Can I do something useful with this today already?

In general, what I find most interesting are the use-cases that utilize the aspects that make a web-server on a mobile personal device unique. Use-cases that take advantage of the fact that the context - location, surrounding devices and people, etc. - constantly changes, and the fact that the web-site "administrator" is always there.

And I get all worked up when I think of the implications - even if I obviously don't know what they all might be - if all mobile phones were equipped with a globally accessible web-server (I ignore all technical challenges). For instance, we already have an implementation of "linking by proximity". That is, having reached one mobile web server, you can reach other mobile web servers in the proximity of the first. The servers are related because they happen to be in the same place. A completely new way of linking sites, which I think is very cool, but it does not work before the likelihood that two mobile web servers accidentally are in the proximity of each other is significantly greater than zero:)

I am also really excited about the idea of building social networks - similar to Facebook, for instance - without any kind of centralized server. The devices just talk to each other directly and events and notifications are routed following the social network. That is, they reach people important to you, before they reach people you saw last time in school 25 years ago. But it's a long way before anything like that can be realized.

Q: You make a strong case, I have to admit!

And yes, you can already do useful things with this:) Just install the version available from Nokia Beta Labs and you'll get some ready-to-use web-applications.

Q: Thinking "outside the box", there are of course some use cases where you discard the traditional notion of a phone, where this could be interesting. For instance, with this software you could just leave your phone at some place, your summer cottage in Nagu for instance, go home and point your browser to that phone, and you have an instant surveillance camera! Is there any research in this direction, or are you really working mostly on the assumption that people want to have a web server in their pocket?

If you by "web server" think of, say, Amazon, then no, that's not what you want to have in your pocket. In fact, in the beginning we actually stopped talking for a while about "web server on your mobile" because far too many made that Amazon association, and talked about "HTTP access to your mobile phone from the Internet" instead. Same thing, but the latter does not give the same associations as the former:)

A good litmus test for the feasibility of running a particular web application or service on your mobile is "so why don't you upload the data to a regular server and share it from there"? If you don't have a good answer to that question, then perhaps you should not try hosting it on your mobile.

Your instant surveillance camera case above is one that would be hard to replicate using a regular server. With the mobile web server it's trivial to do; it's just a few lines of Python or PHP code. And it actually illustrates one of the interesting aspects of having a website on a mobile device - the web server moves, and hence the content it returns may depend on the location and other context of the device. But this is a little contrived use case - how often do you really need a surveillance camera, and when you do, would you want to leave your mobile behind:) Incidentally, this exists already, so all you need to get this functionality is to install the system on your mobile.

Q: So you proved that a) it has already been done and b) still it's maybe not useful... perhaps I need to come up with a better idea :-)

A more interesting alternative is to involve the phone owner in the scenario. You know your spouse is in Paris and you wonder what she is up to, so you browse to her mobile site, click on a link that causes the phone to beep and display a message "Henrik would like to know what you are up to!" at which point she takes a picture. The page you get contains a map with her location, the image she just took and perhaps clickable icons representing other mobile web sites in her proximity.

There are a couple of interesting points in this scenario; part of the content was generated interactively, it was generated on demand, and the content did not exist before it was asked for and will not continue to exist after it has been returned. Further, it illustrates the personal touch of it all. The whole scenario doesn't really make sense unless the one browsing knows the owner of the mobile web server.

This last point is quite relevant really, because it means that scenarios that on first thought might not make sense actually may do. For instance, one obvious web application is a dynamic gallery of the images you have on the phone. A counter argument for that idea is: "why don't you simply upload the images to some image-sharing site. The mobile will never be able to handle the load". Well, that's very true, if the intention is to share the images with the world. However, what if the only one consuming those images is your spouse, for whom you take pictures when you are on your business trip? In this case the load is not a problem and you do not need to create an account on any 3rd-party site, thus leaving the images firmly in your control.

And you should not make the mistake of assuming that having a web-server on a mobile necessarily means that it must expose a "regular" web-site. For instance, it would be trivial to provide SMS and MMS kind of functionality using a mobile web server, in which case the functionality would also become a seamless part of the web.

Further, the situation is not either/or as you could easily utilize the mobile web server for providing real-time data that is mashed up with other data on a regular site. An example of this is the Facebook integration we have made. Using data fetched from the mobile, we can display in Facebook a map-view that shows your current location. And from within Facebook you can type a message that is delivered directly to the inbox of your mobile. That is, you can be present on the social network without actually being logged on.

And now I've only covered the what happens when you put a web-server on a personal mobile device. The way you use a mobile phone also changes once there is http access to it. For instance, it is trivial to create a web-interface to all core applications of the phone, so that whenever you are next to a PC - any PC, not just your own - you can use the big screen and proper keyboard of the PC for reading and sending SMSs, managing your contacts, interacting with the calendar, etc.

Q: You know I'm a big fan of using real keyboards! In fact this was my first thought on how to use this.

And with the added bonus of being able to do that from the office when you have forgotten your phone at home. Not for making phone calls obviously, but you will be aware if someone tries to call you and be able to send him or her an SMS telling on which number you can be reached.

Q: Amazing, hadn't thought of it that way. Since I work from a home office I guess it is unlikely to happen to me, but in my previous career I did feel really lost when I had forgotten my phone at home.

Q: Is there some other question I should have asked but didn't?

Q: Where can I download this thing?

The consumer oriented version - MWS - is available from Nokia Beta Labs. The entrypoint for our experimental PAMP stack, which then provides you with PHP and MySQL, is a separate download. PAMP can be used stand-alone or be installed on top of MWS.

Q: It was also a surprise to hear that suddenly the whole Symbian platform is going to be Open Source. What do you think this tells us about where the world is going and where the telecom world is going? Or even just where Nokia is going?

I can't speak for Nokia, but in my opinion it tells that there in general are fewer and fewer reasons for keeping code closed, if there ever were any good ones.

Open sourcing something is surely no guarantee that a vibrant development community will appear, but not doing it is a sure guarantee that it will not happen.

One thing that open sourcing always brings is ease of communication. It is so much easier to communicate and share ideas etc. if you don't have to think about NDAs and the like, but can in all fora speak openly about issues related to the code.

The open sourcing of Symbian actually touches my work rather closely. Since I have been working with open source software, I have taken great care to only use public Symbian/S60 information. For instance, I only use the public SDKs and never internal ones where I would have had access to all the source. This is in order to reduce the risk that Nokia or Symbian IPR would "leak" via me into the open source. Obviously this has caused problems at times, when something has not worked and a peek at the source would immediately have revealed the problem. Now that issue is about to disappear.

Q: I hear you there! When I was working on S60 projects myself, it was the first time I really understood the value of having source code available. It is not that you'd want to touch the source code - mostly you don't - but you often do want to read it! (Especially with the S60 documentation we had a few years back :-)

Exactly. Without the source it might take days or weeks to figure out the cause of a problem and the workaround for it; with the source at hand it might take 15 minutes.

Q: Final question: As most employees of Nokia Research Center I see you have patents or patent applications pending. What would be your opinion on software patents? And is there a way to answer this question that will satisfy both yourself, your employer and the MySQL community?

First, I must stress that I am not a lawyer, that I really don't know too much about software patents and patents in general, and that I here especially speak for myself.

In my opinion, a large number of software patents are bogus, in the sense that they describe a solution that any engineer, when given the same problem, would come up with. In other words, the novelty and the non-obviousness that should be there, is not there.

Furthermore, software patents often seem to be the patenting of an idea. That is, there is not really any significant investment behind it. If you invest a great deal into the research of something and come up with a novel result, then I think it is right that you can protect that result so that somebody else simply can't go ahead and use the result for reaping the harvest of your investment. But if it's just a bright idea that anyone could have had, then the protection in terms of a patent should not be there. In any case, given the pace of the software industry, at least the lifetime of software patents should be significantly shortened.

Generally speaking I don't think software patents work and that it would be in the interest of everyone if they simply were abandoned. However, from a practical perspective software patents are a fact of life and there are patent sharks and others out there that are more than willing to go after companies who don't play the patent game. So until the global rules are changed, unilateral disarmament would not be without significant risk.

Q: Johan, thanks for a very interesting chat on an interesting technical topic!

Thank you Henrik for interesting questions!

Monday Oct 27, 2008

Because we can: MySQL talks with Johan Wikman, Father of MySQL on Symbian/S60. (part 2 of 3)

Continued from Part 1

Q: But, we digress... so let me instead ask you the question everyone asks me when they hear about Apache and MySQL on a mobile phone: Why on earth would anyone want to do THAT?

Because we can:)

No seriously, there are good reasons. If we assume that it makes sense to run a web server on your mobile (see further down for reasons for that) and the web-server you use is Apache, then it's quite obvious that you also want to provide both PHP and MySQL. After all, some 40% of all web-sites in the world are powered by (L)AMP, so if you provide the same environment on the mobile, you have hundreds of thousands of developers who are familiar with the stack.

But, in my mind, there are also compelling reasons to have a proper database on the mobile. Currently, the way applications store their data varies a lot; many even use crude binary files. This means that there in many cases is no way to manipulate the application data, without detailed knowledge about the structure of the data or with the cooperation of the application.

Think how different it would be if all applications stored their data in a DBMS. Making backups would be trivial, as you could just interact directly with the DBMS, without having to know anything about the applications storing their data there. And the only thing an application would have to do in order to be included in the backup, would be to store its data in the DBMS.

Q: Clearly you have no idea how weird backup setups people use on MySQL too! Well, I guess for an environment like a mobile phone it would be relatively trivial, yes. Being able to see and modify your data with SQL is a cool idea, I have to admit!

And since my hammer is a web server on a mobile phone, the use of a DBMS would also make it trivial to expose the core data of the phone via a REST API that could be generated on the fly. Again, the only thing an application would have to do in order to publish its data would be to store its data in the DBMS.

Q: Hmmm... This reminds me, I should ask about security. I mean, how do people react to the great news that their photos, SMS and calendar could be automatically available on the Internet?

That's really a good question. The simple answer is that all security mechanisms available on regular websites are just as applicable on mobile ones. If you are certain that others cannot access, for instance, your Hotmail or Amazon account, then you should also be certain that you can prevent others from accessing your SMSs on your mobile website.

However, in practice it is more complex than that. Firstly, it seems that quite a few people have a very personal relationship with their mobile phone. The mere thought that someone might be poking around in it is scary. So, the security mechanisms must be such that not only is the data safe, but people must also be really certain of it. Secondly, while regular websites are managed by professionals or semi-professionals, mobile websites will typically not be. That is, the security arrangement must be such that a non-professional can understand it and thus be able to make informed decisions.

One thing I am certain of is that a straightforward per-mobile-website account/password arrangement will not do. The situation becomes totally unmanageable if you need a specific account for each mobile website you intend to access, each with a distinct password. That would also turn the phone owner into a regular administrator who must manage accounts and "answer support calls" when someone has forgotten his password. Some single sign-on mechanism is needed, where the bulk of the account management is moved away from the mobile website itself. The phone owner should at most have to decide, for instance, which of his contacts are allowed to see his gallery.

I also think that it is important that the phone owner can put the same amount of trust in a "message", regardless of how it has reached his phone. For instance, if you get an SMS from a certain number, you implicitly assume it really was sent by the owner of that number. Now, if, for instance, an instant message delivered over HTTP is displayed on the screen of your phone and you are told it is from your contact John, you should be able to trust that information as much as if you had gotten that message as an SMS from John.

Q: Going back to the specialities of Symbian and embedded programming, isn't it contrary to the whole idea of saving battery and sleeping as much as possible, if you suddenly start to run servers on your phone?

Well, sending and receiving SMSs consumes energy as well, as does making a phone call. So it's a question of utility - is the functionality you get, worth the energy it consumes?

The primary problem is that with most other energy consuming activities on the mobile, the user is in control. If the battery is low, you can choose not to surf the web in order to be able to make a phone call later. If you have a server, especially one that is accessible from the Internet, on the mobile, in principle you are no longer in control. There can be activity of which you are not even aware.

And that's a significant difference between stationary and mobile web servers. If you have a regular website, as long as you are not slashdotted, you don't really care who visits it. In the mobile case, the situation is different as all access costs, if not in monetary terms then in terms of energy consumption. With the advent of flat-fee operator agreements the monetary issue is going away, but the energy issue is not. And even if you have access control on your site preventing access by anyone but the ones explicitly allowed, the mere delivery of the rejected request consumes energy.

Now, this is only a problem if direct access to the mobile is possible. In most operator networks it's not, and in order to be able to reach the mobile from the Internet, the access must go through our transparent gateway. And that, of course, can and indeed does perform access control on behalf of the mobile. That is, an unauthorized browser cannot cause any cost to you.

And that's actually an interesting aspect of the whole thing. You want gateway-based access control, not necessarily in order to protect the data, but in order to control who can cause cost to you. That is, if you are at home, connected over WLAN and with the phone connected to a charger, the need for that kind of access control disappears. So, not only may the content be context dependent, as pointed out further down, but the access control may have to be as well.

Q: So maybe Drizzle could be interesting to you, if it becomes a leaner version of MySQL? I remember Symbian is all-Unicode too, so they would fit well together.

I really don't know much about Drizzle, so I can't say anything specific. In principle the leanness is quite attractive, but then again, the mobile phones of today really have quite a lot of memory and other resources. The high-end Nokia phones today have 128MB of RAM and if you compare the mobiles from a few years ago to the mobiles of today and use that for extrapolating where the mobiles will be a few years from now, it is not unrealistic to assume that they will have gigabytes of RAM and hundreds of gigabytes of storage. I suppose that's a lot more than what the servers had that were used for running many MySQL servers just a few years ago:)

Q: I know this but it is still hard to grasp... So if we plug the phone into a charger, so we have electricity and can disregard thinking about battery all the time, you're saying my Nokia Communicator is more powerful than the servers I was administering in the 90's?

We all know that comparing raw CPU speed does not tell you that much, but I'll throw in some numbers nonetheless. For instance, the N95, which is no longer top of the line, has an ARM11 that, according to ARM's website, delivers up to 740 Dhrystone MIPS. I googled around and, according to one site, the 450 MHz Intel Pentium II, which was introduced in the fall of 1998, delivered 813 Dhrystone MIPS. So that's roughly where we are in terms of CPU power.

And as for memory... In the early nineties I worked on an HP-UX workstation that had 256MB of memory. In those days, that was considered an almost obscene amount of memory. All recent Nokia smartphones have 128MB of RAM.

Click here for Part 3 of the interview...

Sunday Oct 26, 2008

Because we can: MySQL talks with Johan Wikman, Father of MySQL on Symbian/S60. (part 1 of 3)

By the end of 2007, to the surprise of many of us, a guy at Nokia Research Center announced that they had ported and were about to publish the full LAMP stack running on the Symbian/S60 platform of Nokia mobile phones. They dubbed this the Personal AMP stack: PAMP, and you can run most of the popular PHP apps like WordPress, Drupal, phpMyAdmin... out of the box on a Nokia phone now.

Today we had the opportunity to have a chat with Johan Wikman, the man leading the efforts of porting the AMP stack to Symbian. Johan works as Principal Research Engineer at Nokia Research Center and as such has also previously participated in porting other interesting things to Nokia phones, such as the Linux kernel, eventually leading to what maemo is today.

There is an interesting "it's a small world" aspect in that Johan used to study at Helsinki University of Technology about the same time as Monty, Mårten and Kaj were there.

Q: Hi Johan. I was thinking, for our US readers it might be necessary to start with the basic questions: What is Nokia? What is Symbian? And what is S60?

Hi Henrik.

Well, Nokia is the world's biggest mobile phone manufacturer with a global market share of just below 40%. In the US, however, the market share is significantly smaller. Nokia has its roots in Finland and a great deal of research & development is done over here, and the headquarters are here as well, but otherwise Nokia really is a global company with activities all over the globe.

Nokia manufactures phones in all categories, from low-end phones sold in developing countries to high-end smartphones containing all the goodies you can think of, and then some. And these high-end phones run the Symbian operating system. So, just like you can talk about a Windows phone or a Linux phone, you can talk about a Symbian phone. S60 is basically a UI and look & feel layer on top of Symbian, but at times it is a bit difficult to say where Symbian ends and S60 begins.

Q: Also, maybe it is worth talking about development on Symbian a little more. For those who have never done it, could you explain how it differs from, say, creating a Windows app or web development?

Symbian as a programming environment is basically quite different from pretty much everything else. It has a C++ API that takes time getting used to, and although it has processes and threads like most other OSs, you usually handle concurrent processing within an application using so-called active objects and an active scheduler. Basically it's cooperative multitasking between different activities of the application, where you must ensure that you give control back to the scheduler frequently enough, so that it can let other active objects run.

It may sound primitive, but it actually works quite well, since the operating system itself is well suited for this model, and in practice you don't really have to worry about keeping the CPU for too long. Most OS functionality can be invoked asynchronously, so that instead of making a call and then blocking while waiting for it to finish, you just initiate the call and are later called back when the operation has finished. While the asynchronous call is executing, other active objects can run. But obviously this model has an effect on how you structure your program.
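The control flow Johan describes is easiest to see in code. Symbian's real API is C++ (the CActive/CActiveScheduler classes), so the following is only a minimal, hypothetical Python analogy of the pattern: objects register with a scheduler, the scheduler dispatches a callback to whichever object's asynchronous request has completed, and each callback must do a bounded amount of work and return promptly.

```python
# A minimal, hypothetical analogy of Symbian's active-object pattern.
# Real Symbian code is C++ (CActive/CActiveScheduler); this only
# illustrates the cooperative, callback-driven control flow.

class ActiveObject:
    def __init__(self, name):
        self.name = name
        self.ready = False          # set when an async request completes
        self.runs = 0

    def run(self):                  # analogous to CActive::RunL()
        # Must do a bounded amount of work and return promptly,
        # otherwise no other active object gets to run.
        self.runs += 1
        self.ready = False

class ActiveScheduler:              # analogous to CActiveScheduler
    def __init__(self):
        self.objects = []

    def add(self, obj):
        self.objects.append(obj)

    def start(self, completions):
        # 'completions' simulates async requests finishing in some order.
        for name in completions:
            for obj in self.objects:
                if obj.name == name:
                    obj.ready = True
            # Dispatch: run every object whose request has completed.
            for obj in self.objects:
                if obj.ready:
                    obj.run()

scheduler = ActiveScheduler()
timer, socket_ao = ActiveObject("timer"), ActiveObject("socket")
scheduler.add(timer)
scheduler.add(socket_ao)
scheduler.start(["timer", "socket", "timer"])
print(timer.runs, socket_ao.runs)   # prints: 2 1
```

Note how nothing preempts a running callback: if one callback loops for too long, the whole application stalls, which is exactly why control must be handed back to the scheduler frequently.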

However, from a high enough vantage point, creating an application in Symbian is similar to creating an application in other environments. You just have to learn and get used to the required API frameworks. Admittedly, the initial threshold in native Symbian development is higher than in other environments. But nowadays you can also do application development using Python, and that really makes things easy.

Q: And I haven't tried it myself, but I've read that you can now program on Symbian pretty much like a Linux system? That is, any POSIX-like system, to be precise.

Yes, recently, with the introduction of the Open-C libraries, the situation has changed significantly. Open-C brings a very large set of functions, typically available in a Linux/Unix environment, to Symbian. So now it is possible to program for Symbian the way you program for Linux. That applies mostly to non-UI stuff, but usually, regardless of the environment, you anyway have to learn the particular UI framework you intend to use. But just a short time ago it was announced that the Qt framework will be made available for S60. That is, provided you are familiar with Qt, you will be able to create software for Nokia smartphones without learning the native APIs.

Although Open-C makes the porting of existing open source software to Symbian much easier than it used to be, there are still a couple of things that can cause headaches. First of all, the security model of Symbian is completely different from that of Linux. In Linux everything revolves around the concept of a user (not necessarily a real one); files and other resources are owned by a user, and processes are executed using the credentials of a particular user. In Symbian there is no concept of a user, and prior to Symbian version 9 there was basically no security. At runtime processes were protected from each other, but any process could access any file anywhere and also use any system resource, for instance, initiate a phone call. You could say that all processes ran as root.

In Symbian 9, platform security was introduced and now the situation is quite different. At build time, binaries are assigned capabilities, and those capabilities decide what the process can do at runtime. What capabilities you can assign a binary depends upon what kind of certificate you are going to sign the installation package with. If you manage with a limited set of capabilities, then a self-signed certificate - one that you create yourself - is sufficient. However, if you need a wider set of capabilities, then a proper certificate is needed. Anyway, trust is effectively given to applications, not to the person who runs them.

But because of the signing process there is traceability, so if some malicious software turns up, you know who is responsible for it. Except for the self-signed applications, of course, but they can't do that much harm anyway and there are lots of warnings when you install them. Further, every application now also has a private directory that no other application, except one with a very hefty set of capabilities, can access.

A more practical problem is that on Symbian all symbols of a DLL (shared library) are exported by ordinal and not by name. So, loading a function explicitly by name from a DLL simply is not possible. If your open source software needs that functionality, then you need to work around the limitation. A closely related problem is that it is not possible to export data, only functions, from a DLL. Since a lot of open source software exports data from DLLs, that's something that also must be worked around, and it may not always be trivial to do that without touching the code too much.

Q: Which brings us nicely into what we really wanted to know... Was it a lot of work to port MySQL? What about Apache and PHP? Did you use the Open-C library for this, or maybe that's the whole point of the exercise?

Apache was the first component that was ported. And that port was originally made when Open-C was not available. At that point there was only a rather limited and buggy C library, which originally (to the best of my knowledge) was created in order to port the Java VM to Symbian. So, a fair amount of modifications were necessary and, for instance, the dynamic loading of Apache modules was not supported; everything had to be built into the main executable.

Then came the already mentioned Symbian 9 and Open-C. At that point I decided to throw the existing port away, take the latest Apache httpd version into use and redo the port. Obviously that was much simpler this time, not only could a number of workarounds be thrown out but a lot of things just worked. However, when Apache loads a dynamic module, it looks up a data structure by name from the DLL, and that, as I earlier mentioned, is not possible on Symbian. But I could work around that in such a way that now only a single line must be added to a module in order to make it dynamically loadable.

The next component that was ported was PHP. Initially it was made as an experiment by Hannu Kankaanpää, a smart summer trainee we had. Later, when we really started working on our PAMP stack, the PHP port was updated by Petteri Muilu, Markus Heikelä and Markus Grönholm from Futurice. From a Symbian perspective PHP is quite problematic, because there are loads of global variables that not only are used by other DLLs, but are stored in static structures in other DLLs. Namely, the usual way of working around the data export limitation is to export a function that returns the address of the variable, and then define a macro that calls that function and dereferences it. That way client code need not be modified. However, when the addresses of those variables are stored in static structures, that approach no longer works; the structures must instead be explicitly initialized at runtime.

Primarily for this reason, PHP has been built as a monolith where all extensions are linked in. However, I'm currently working on redoing this so that dynamic loading of PHP extensions will be possible. That way you don't have to link in every PHP extension just in case.

Q: And how about MySQL then?

MySQL was also initially ported by Petteri from Futurice. For the sake of simplicity everything is built as static libraries, which currently makes PHP very large, since a lot of MySQL code is linked into PHP. I'm currently also working on changing this so that the client library of MySQL is exposed as a DLL. But it was quite straightforward to port MySQL; if memory serves me, the first version ran after just a few weeks, which says something about the portability of MySQL and also about the "Linuxness" of Open-C.

Getting the first version of an open source package to run on top of Open-C would be even faster if it were only possible to run the configure script that open source software typically uses to configure itself according to what the current environment provides. Unfortunately, that is not possible at the moment, so what configure usually does must be done manually on Symbian/Open-C.

But the devil is in the details. Even if you can make the software run quickly, there will be rough edges that it may take a lot more time to fix. Especially if the edges are related to some strange problem that you first must be able to pinpoint.

Q: Did you get - or even need - any help from MySQL employees or community members, or our online resources?

There is some information about porting MySQL, but not much when the target is a completely new environment. So the port was made without any external help. After we had released the PAMP stack there was a report that it was not possible to create indexes after a table had been created, only as part of the table creation. So I posted a question in the MySQL forum, but before I got a reply (which said that the MySQL developers hang out on IRC), I had already figured out what the problem was.

The problem was caused by one fundamental difference between Unix and Symbian filesystems. On Unix you can open a file and then remove it, but still continue reading from and writing to it. Only when you close the file is it lost. This works because in Unix a file and its name are actually quite distinct. If that sounds strange, just think about Unix hard links, which allow you to have the very same file in two or more places under different names. You can "remove" the file in one place but it continues to exist in all the others. So "remove" just removes the directory entry, and only when there are no entries left is the file itself lost.

Anyway, on Symbian you simply do not have this behaviour. If a file has been opened, it cannot be removed, not by another process and not by you. And because of this, the index creation fails. And this is quite hard to fix without modifying the code extensively, which you don't want to do. So for the time being you have to create the index at table creation time or live without it.
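The Unix behaviour is easy to demonstrate; here is a small sketch relying on POSIX semantics (on Symbian, the unlink step is exactly what would fail while the file is open):

```python
import os
import tempfile

# On POSIX, a file stays usable after its last directory entry is
# removed; the storage is only released when the last open file
# descriptor is closed.
path = os.path.join(tempfile.mkdtemp(), "scratch")
fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)

os.unlink(path)                    # remove the directory entry...
assert not os.path.exists(path)    # ...the name is gone immediately

os.write(fd, b"still here")        # ...but the open descriptor still works
os.lseek(fd, 0, os.SEEK_SET)
data = os.read(fd, 10)
print(data)                        # prints: b'still here'
os.close(fd)                       # now the file's storage is released
```

MySQL's index creation relies on this open-then-remove pattern internally, which is why it breaks on a filesystem where an open file cannot be removed.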

Q: I believe Windows has this same behavior! At least I remember a lot of times I tried to delete a file or folder and couldn't do it, and it is more or less impossible to trace down which application might be holding some files open. And Symbian also has C:, D: and E: drives :-)

Yes, Windows is somewhere in between, but still closer to Unix. You can delete a file that someone else has opened, but the directory entry will not disappear until all handles to that file have been closed.

Click here for Part 2 of the interview...

Sunday Sep 28, 2008

Accessing your MySQL data whatever way you want it (Part 2, InnoDB)

In the previous post we had a look at the MySQL Cluster NDB API and how it enables direct access to the MySQL Cluster data nodes, and therefore also enables access through other protocols than SQL.

I've often asked myself: since NDB is so great for MySQL Cluster, is there anything similar for the MySQL Server (the not-cluster version...)? A couple of months ago Kazuho Oku did something like that and wrote about it in his blog.

The context for Kazuho's work is the social network use case: 1) You have users. 2) Some users are linked to each other as friends. 3) When a user logs in, he should see a timeline of events/messages from his friends. In a previous post he had already tested the difference between a "pull" and a "push" approach. (Taking a small sidetrack here, imho both approaches are wrong: the messages/events should first be submitted only to the originating user's table, then copied to each recipient by an asynchronous background process. This would give you the best of both worlds: the submission speed of the pull model and the retrieval speed of the push model. Anyway...)

For the test we are talking about now, Kazuho is exploring the pull model: when a user logs in, a query is executed to fetch messages/events from all of the user's friends. Kazuho then compared 3 ways to do this: by SQL from the app, by a stored procedure that does everything at once, and by a MySQL User Defined Function. (A UDF is something you write and compile in C and install as a plugin to the server. It can then be called simply as "SELECT functionname(...);".) The UDF accesses the InnoDB data structures directly using MySQL and InnoDB internal functions, so it is reminiscent of using the NDB API to bypass SQL in MySQL Cluster.

Kazuho's results are clear:

Building Timelines on MySQL (timelines / sec.):

  Stored Procedure           136
  UDF using Direct Access    1,710

1) This is a good example of a use case where using a stored procedure gives you an advantage over raw SQL. Never mind whether you think MySQL stored procedures are inefficient or not, these numbers are clear: the stored procedure approach is 2.5 times more efficient.

2) The UDF rocks! Accessing InnoDB structures directly, it is 10+ times faster than the stored procedure.

There is one drawback though. Accessing the storage engine directly inside MySQL is a bit dangerous. There is no well-defined API, so there is no guarantee that the next version will not break your UDF. Well, I guess it wouldn't, but in theory at least it could. And a UDF is something you have to install in the server; it is not a client API in that sense. But getting 10 times better performance is something to think about, if you're ready to get your elbows dirty.

PS. I hear the social networking problem is an especially interesting one from this point of view, in that it doesn't map easily to plain old relational databases and SQL. Getting what you want is a bit inefficient with SQL. Kazuho's UDFs show that it can be done tenfold more efficiently, by accessing the data in a more optimal way. This is of course exactly the point of native data access.

Monday Sep 01, 2008

Accessing your MySQL data whatever way you want it

One way to look at a database is that

  1. there is data, and

  2. there are ways to access data.

This dichotomy was actually coined (whether intentionally or not) by Vinay in the MySQL telecom team when discussing the MySQL Cluster vision some months ago.

Even if you typically think of MySQL Cluster as just a clustered version of the plain old MySQL server, it is actually more like the opposite that is true, if you consider the architecture and history of MySQL Cluster. The original cluster was just the data store called Network DataBase, or NDB as we familiarly know it still. Then MySQL Server was integrated on top of that to provide an SQL interface. The original and "native" NDB interface is still there though, and many prefer to use direct C++ or Java access to their NDB-clustered data. It is faster in general, but especially applications with real-time requirements will benefit from omitting the MySQL Server: no additional network hop and no parsing of SQL, just direct access to your data. Sometimes you might also benefit from being able to do things with the NDB API that cannot be efficiently expressed in SQL at all.

But did you know that in addition to the SQL and NDB APIs, there are actually several more ways to interface with MySQL Cluster:

  • LDAP was presented at a MySQL User Conference BOF last spring and is actually available now as an option in the Carrier Grade Edition. The LDAP interface is actually an OpenLDAP server node, using a native NDB backend in slapd.

  • mod_ndb is a REST Web Services API for MySQL Cluster. This one is (obviously) implemented as an Apache module. Although produced by our own John 'JD' Duncan, it is not a Sun supported product.

  • We also know of a case where MySQL Cluster is accessed through the RADIUS protocol, although I don't think this implementation is publicly available.

  • And someone also does it with DIAMETER, a successor to RADIUS.

I don't know the details of the last two, but at least the first two use NDB directly. That is much more efficient and convenient than, for instance, doing LDAP->SQL conversions when SQL really isn't needed in the first place. Moreover, did you realize that all these interfaces are equal citizens with mysqld - they are all just API nodes. Meaning, you could have one big cluster and access that same data with SQL, LDAP, HTTP, RADIUS and DIAMETER, and of course directly from your application code with the NDB C++ or Java API. Which brings us back to the title of this blog post: you have data and you have ways to access the data. Whichever way suits you best.

Then of course the interesting question: are there more protocols/APIs out there for MySQL Cluster that we don't know about? (Or that I omitted by mistake?) Are there some protocols that would be interesting to implement? Let us know at hingo at mysql dot com (or comment below)!

Wednesday Aug 13, 2008

MySQL perspectives from a SQL Server guru

Ben Kaufman at SQLServerCentral introduces MySQL to the SQL Server DBA crowd. All in all his views seem to be fairly positive, in particular the MySQL Cluster experience:

NDB is the gem of MySQL. Originally developed by Ericsson to track cell phone calls, this is a shared-nothing cluster engine stored in memory. This is a true cluster that supports both high availability and load balancing. [...]
This engine is similar to synchronous mirroring in SQL Server in that it is a 2-phase commit, the difference being the commit is done in memory at the data layer not the log. Logs are hardened at a later time, with the theory being that since data is committed on multiple nodes the data is safe and doesn't require a log flush as part of the transaction. [...]
For pure performance the cluster is comparable with a single instance of SQL Server. I've found on selects it beats SQL Server slightly as long as the data on SQL Server is in memory. This gap widens as more connections are made. Writes depend on whether SQL Server is using write back cache on its controller, in the case it is, it beats NDB, due to NDBs 2-phase commit. Without the controller cache NDB beat SQL. However this is not apples to apples. When compared to SQL Server synchronous mirroring NDB wins hands down. The cost associated with NDB is that it resides in memory (5.1 allows you to keep non indexed data on disk), and therefore your dataset is limited by the amount of memory you use. [...]
With all the negatives put aside if you have an application that requires redundancy, and fast inserts and selects on a single table, NDB is the best product out there. We've been running it for almost 18 months and it's been rock solid. Compared with other SQL Server and other MySQL implementations this has required the least amount of attention. One final note this will not run on Windows.

Also other storage engines get a good evaluation:

Myisam whether intentional or not is built and optimized for read-only datasets. Some of the features that make this the case is the fact that it doesn't support transactions, so taking user updates would be dangerous, but you don't incur the overhead of the transactional log. It performs locking at the table level not the row or page, which is not the best for active OLTP systems, unless the system is only doing inserts. On an application only performing inserts it performs well because it has the ability to perform concurrent inserts and selects. This is because data is appended to the end of the file, and a table lock is not issued, allowing for selects of the data not to be blocked.

Wow, even I had missed this fact about MyISAM. There's always so much talk about MyISAM performing badly on "concurrent writes" that I didn't realize pure INSERTs are not a problem, only UPDATEs are. I immediately tested this, and on a 2-core system (my laptop) the performance of bulk insert doubled when using 10 threads instead of 1.

Thanks Ben, this is useful info. There are many networking use cases where you have to collect lots of data from network elements into some kind of aggregator or mediator, where the data is then queried, forwarded and later deleted, but never UPDATEd. It seems MyISAM is a good fit for this kind of use case after all, not just pure data warehousing. (Especially with PARTITIONing, where you can drop full tables/partitions instead of deleting a bunch of rows.)

Tuesday Aug 05, 2008

MySQL Cluster basics: Why using a primary key is even more important

While we were mostly excited to become part of Sun, there was at least one drawback too. MySQL had a worldwide policy of "Scandinavian style" 5 week vacations per year (4 Summer, 1 Winter week). Sun doesn't have such a worldwide policy, so most of us lost some vacation days there. But there is one group of employees that still enjoy this benefit... those of us that actually live in Scandinavia!

Those 4 weeks are however now gone for my part :-( So I thought I'd make a soft start to working by a) yesterday doing almost nothing and b) today blogging. A long long time ago, before I left for my vacation, Mark Callaghan requested more blog posts about MySQL Cluster, so I thought why not start with one right now. As previously noted, there are many good MySQL Cluster bloggers out there, but admittedly some of their blog postings can be very deeply technical. With this one, however, I just want to highlight some basic aspects of how MySQL Cluster works and in fact how its performance differs from InnoDB and MyISAM based databases. (But it must be added, if you want "getting started" level material, then Johan Andersson's blog, with posts like A great config.ini and The RTFM, is a must read.)

Some time in June - this was before my vacation, a long time ago, I almost cannot remember - I was getting a MySQL Cluster setup up to speed with a customer. Before getting to the real data, the customer had decided to "benchmark" his setup by simply inserting 100 000 integers in a batch:

CREATE TABLE t (i INT) engine=ndbcluster;
... (100000 times)

This test came out with MySQL Cluster looking very bad: more than 2 minutes. The same on InnoDB took 3.7s. (The InnoDB result also looked unrealistic at first; on a standard PC you wouldn't get anywhere near that fast, but since those RAID controllers have a DDR RAM chip as write cache, this basically made it an in-memory operation of some 400kB.)

So what to do?

MySQL Cluster lesson 1: Batching, batching!

The major cause of bad performance in MySQL Cluster is often the fact that cluster nodes are connected to each other over a network. This introduces latency and overhead into every single transaction you do - typically at least 3 nodes need to communicate with each other, send the data, agree that it can be committed, etc. The way to minimize this problem is batching: if you send more data in one batch, the effect of the TCP/IP overhead is smaller.

In this particular example the effect of TCP/IP overhead was at its worst: we were sending only one INT at a time! So we turned on transaction_allow_batching and turned off ndb_force_send. Also remember to turn off AUTOCOMMIT, so there actually is a transaction to be batched!
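To make the client-side half of batching concrete, here is a hedged sketch (plain Python string-building; the table name and batch size are invented for illustration) of turning 100,000 single-row inserts into 100 multi-row INSERT statements, so each statement costs one round trip instead of a thousand:

```python
def batched_inserts(table, values, batch_size=1000):
    """Group single-row inserts into multi-row INSERT statements,
    so each round trip carries batch_size rows instead of one."""
    statements = []
    for start in range(0, len(values), batch_size):
        chunk = values[start:start + batch_size]
        rows = ", ".join(f"({v})" for v in chunk)
        statements.append(f"INSERT INTO {table} VALUES {rows};")
    return statements

stmts = batched_inserts("t", range(100000))
print(len(stmts))    # prints: 100  (statements, instead of 100000)
```

Run inside a single transaction (AUTOCOMMIT off, explicit COMMIT at the end), this is the shape of workload the batching settings above can actually exploit.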

MySQL Cluster lesson 2: Parallelize, parallelize!

The great thing about MySQL Cluster is its write scalability, which (for "suitable applications") is practically infinite. With normal MySQL you can easily get read scalability with things like replication, but write scalability is a tougher nut to crack. In MySQL Cluster, you add more data nodes and you get more write throughput. The partitioning is internal to the cluster, so you don't have to mess with shards or anything in your application; one mysql client connection is all you need. Live installations of MySQL Cluster (such as an HLR, the heart of a mobile phone network) will do over 100k writes per second.

But to actually get there, you need to also parallelize your operations. Inserting 100k INTEGERs in one batch should be fast, but separating it into 100 batches should be lightning fast, since they can all happen at the same time and MySQL Cluster can absorb that load.
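To make the parallelization concrete, here is a minimal client-side sketch: split the 100k rows into 100 batches and hand them to worker threads. The `insert_batch` function is a stand-in (an assumption, not part of any MySQL API) for whatever actually executes the multi-row INSERT through your client library against one of the mysqld nodes.

```python
# Sketch: parallel batched inserts. Assumes a hypothetical execution layer;
# only the batching/fan-out logic is shown here.
from concurrent.futures import ThreadPoolExecutor

def make_batches(n_rows, n_batches):
    """Split row values 1..n_rows into n_batches contiguous chunks."""
    size = n_rows // n_batches
    return [range(i * size + 1, (i + 1) * size + 1) for i in range(n_batches)]

def insert_batch(rows):
    # Build one multi-row INSERT so the whole chunk travels to the
    # cluster in a single statement.
    values = ",".join("(%d)" % r for r in rows)
    sql = "INSERT INTO t VALUES %s" % values
    # connection.execute(sql)  # hypothetical - depends on your client library
    return len(rows)

def parallel_insert(n_rows=100_000, n_batches=100, workers=16):
    # Each worker sends its own batch; MySQL Cluster can absorb the
    # concurrent load across its data nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        inserted = sum(pool.map(insert_batch, make_batches(n_rows, n_batches)))
    return inserted
```

The batch count and worker count are tuning knobs: more, smaller batches increase parallelism but bring back some per-round-trip overhead.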

So with these changes... we got from 2 minutes to 50 seconds. Still not 1 second like our marketing material claims!

MySQL Cluster lesson 3: Why PRIMARY KEY is important

With all of my own tricks used, I actually had to turn to my more senior colleagues for advice on what to do next. One reply was the jackpot:

The thing is, this simple example didn't have a primary key defined. In real-life situations you of course always have a primary key, which is also why I had forgotten this important piece of advice: with MySQL Cluster you must always always always remember to use one, even for silly examples like this. The reason is that MySQL Cluster (by default) uses the primary key for its internal partitioning between data nodes. So if there is no primary key, it will automatically add a hidden AUTOINCREMENT column to act as one. This leads to 2 problems:

1) Since you are not using an explicit primary key, the mysqld node will end up randomly connecting to whatever ndbd node it happens to choose, from which the transaction is re-routed to the right data nodes. This unnecessarily adds to (doubles!) the network latency.

2) When inserting, fetching the unique autoincrement value acts like a global lock for all inserts. You can parallelize all you want, but it doesn't get any faster. (There is ndb_autoincrement_prefetch_sz to prefetch a number of keys in a batch, but still, always always use primary key.)

In this example, parallelizing actually made performance much worse than the 50 seconds we were at, since the multiple threads were now all competing for the autoincrement lock.

Adding an explicit primary key to the table and the inserts solved the problem and got the time down to 10 seconds, and with some further tuning to 5 seconds. This is still not lightning fast, but at this point the customer was happy and chose to focus on their real application instead. (After the tuning that led to 5 seconds, the next significant improvements would probably have come from omitting SQL and using the native C++ or Java NDB API, and from switching from Ethernet to the proprietary Dolphin SCI interconnects. Both of these further remove latency.)
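Sketched as DDL (reusing the toy table from the start of this post), the fix is a one-line schema change:

```sql
-- With an explicit primary key, NDB partitions the table on it by default,
-- the mysqld node can route each operation directly to the data node that
-- owns the row, and no hidden autoincrement column (and its lock) is needed.
CREATE TABLE t (
  i INT NOT NULL,
  PRIMARY KEY (i)
) ENGINE=ndbcluster;

INSERT INTO t VALUES (1);  -- the client supplies the key value itself
```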

So to summarize: Batch, parallelize and always always define the primary key.

While this has been a trivial example of something not relevant for the real world, I hope it has helped highlight some behavioural characteristics of the MySQL NDB Cluster.

PS. In June there was still a lot of confusion around the new version numbering with MySQL NDB Cluster. Are you all clear on that now, or do we need to go through it once more in another blog post?

Wednesday Jul 02, 2008

Dublin sales meeting and group photo

The MySQL Telecom sales team is a global team, but it happens to be organized under the EMEA region of the MySQL sales department. Since the EMEA region is having a staff meeting in Dublin this week, we in the telecom team are gathered here too. A good opportunity to meet your globally dispersed colleagues.

Zack Urlocker was the special guest star from headquarters. He presented a Freudian-slip slide saying that MySQL 5.1 will be released GA in July 2009. After lots of laughter he conceded that maybe engineering can pull it off faster, like maybe July 2008.

Another hot topic Zack covered was reviewing how the integration with Sun is going...


...we concluded at least some of us are adapting just fine :-)

Friday Jun 06, 2008

Family of MySQL Cluster bloggers

While this blog is co-authored by the whole MySQL Telecom team, many members in or around the team also write personal blogs, which you will find very useful. So please follow me on a tour of the absolute top MySQL Cluster blogs in the world:

Johan Andersson is the MySQL Cluster Principal Consultant and has been with MySQL Cluster since the Ericsson days. He travels around the world to our most demanding customers and shares his guru advice. Rumor has it that recently on a training gig the students made him sign their MySQL t-shirts - can you get closer to living like a rock star than that? Occasionally he also shares great tips and status info on his blog. Right now you can find a set of handy scripts to manage your whole MySQL Cluster from one command line, definitely recommended to try!

Jonas Oreland is the Architect of MySQL Cluster. Now, we mere mortals may not always understand everything he writes about in his blog, but if you want to know what is happening in the MySQL Cluster codebase right now, this is the place to go. And this is really cutting edge: the stuff he writes about in his latest post may not appear in a GA release until next year.

Speaking of architects, it is natural to next introduce Mikael Ronström, the Father of MySQL Cluster. Yes, Mikael is the one who came up with the whole architecture of MySQL Cluster; we have him to thank for 100,000 writes/sec and linear scalability. (Thank you!) Mikael actually isn't on the Cluster team anymore; he has for some time been working on the general MySQL Server, on things like replication and performance improvements. For the newest benchmarks of MySQL Server and MySQL Cluster, go to Mikael's blog.

Having dealt with the old and honourable Ericsson alumni, the next blog I want you to follow is Jon Stephens'. Jon is a technical writer, meaning he writes the MySQL Cluster manual. He is very diligent, constantly pestering the developers to divulge information on how Cluster works, to the benefit of all of us Cluster lovers. In short, if you want to know how Cluster works, ask Jon, he will know.

Like a cousin to Jon (in Cluster trivia, at least :-) is Ramon Graham, Product Manager for MySQL Cluster. His is a relatively new blog, but it appeared right in time to answer the worrying question: where did MySQL Cluster disappear to? In general, if you are ever wondering where MySQL Cluster is going... talk to Ramon.

MySQL Cluster may have been born in Sweden, but there is a strong Australian angle to it... So let me finish by introducing our 2 Australian Cluster heroes. (I have a theory why these 2 hyperactive guys get so much done every day: Australia is in such an early timezone that they simply have more hours in a day than the rest of us do!)

First up is Stewart Smith, a developer in the Cluster team and a beloved lecturer about MySQL Cluster. Apart from Jonas' blog, this is the other blog to follow if you want to keep up with the development of MySQL Cluster. But Stewart writes actively about many things; in fact he is the president of the team organising the next - one of the most popular Linux developer conferences in the world.

We started with a consultant, and we will end with one: the MySQL-HA blog is where Monty Taylor writes together with Alexander Rubin about Cluster, High Availability and performance. Monty is also one of those consultants traveling around the world, from the Amazon basin in Brazil to various European capitals. He recently mentioned that he has never been to China, so if you are in China and considering hiring a MySQL Cluster consultant, be sure to contact our Sales department! Monty is also the man behind the NDB connectors... hmmm, NDB bindings project, which provides python/php/perl/ruby etc. bindings to the native NDB API of MySQL Cluster. If you want to qualify as a MySQL Cluster geek, get familiar with one of the NDB bindings! (Of course, the native C++ NDB API is also an option.)

I believe those are all the blogs of the Cluster team & friends that I know about. But if there are more, let me know and we'll add them to the end of this post. By the way, the Cluster team is getting an infusion of old-Sun database cluster experts; maybe there are some blogs there we should know about? Let me know!

Update: Another Swede, Magnus Svensson, quietly joined the Cluster team just while I was writing this article. Today (June 24th) he has a great tip for those of you who want to take your first look at the MySQL Cluster code: how to insert a simple printf statement for debugging purposes.

PS: I personally also have a blog, where I will never ever write about anything MySQL Cluster related (because that I do here), and which focuses on Open Source: phenomena, culture, business models and trends. Feel free to pop by for a leisurely non-technical read at The Open Life blog.

Friday May 23, 2008

"Telephony is just yet another Internet application." MySQL talks with Juha Heinänen.

During 2008 we are planning to run a series of interviews with interesting persons somehow related to the telecom field. In this first installment, we will have a chat with Juha Heinänen from Finland.

MySQL: Juha, you are a former professor of Computer Science and Communication technology, CTO (or a similar position) at at least Sonera, Telia and Song, a former ATM specialist, the man responsible for bringing the Internet to Finland and registering the .fi top-level domain, a consultant for many early network equipment startups (some of which are still with us today), and the author of several RFCs. Nowadays you are a core contributor to the OpenSER SIP proxy, and you sell a VoIP platform called OpenSIPg, based on it, to Nordic operators. In addition you seem to live a life that is likely the dream of many hackers, spending time in different parts of the world hacking on your favorite Open Source project. To this date, I don't think we have ever met in person.

MySQL: For many years now you have been working with the SIP protocol and the OpenSER SIP proxy. When did you first turn your eye to SIP, and why did you become interested in it?

JH: When working for Song Networks (now TDC) in the late 1990s and early 2000s, we saw a need for a hosted VoIP service for small businesses. Due to my IETF background, a natural protocol choice for me was SIP instead of the then dominant H.323. At first, we trialled a Swedish commercial SIP proxy called Hotsip, but didn't find it flexible enough for our purposes. I then heard about an Open Source SIP proxy project called SER, saw its great potential, and soon became a SER developer, even though I had not written a single line of code during the previous 10 years.

MySQL: Knowing that you enjoy coding, it must have been great to return to it! By the way, what is your short, 2-3 sentence introduction to OpenSER?

JH: The OpenSER project is a spin-off of the SER project. Our aim in OpenSER is to bring to market a well-tested new release of the OpenSER SIP proxy every 10 months or so. Today OpenSER is a very successful project with many high-quality developers and a wide user community.

MySQL: How do you see the Internet vs the traditional telephone network? Will SIP (or some other Internet based protocol) eventually completely replace the Plain Old Telephone System?

JH: This is hard to answer, because there always exists the "dark side" that wants to retain the old walled garden style POTS service no matter what equipment or protocols they internally use. These people see telephony as something special, not just yet another Internet application.

MySQL: I remember once talking to you when you were furious about an operator who insisted that you implement minute-based billing for OpenSER :-) I guess you never did that for them?

(Note to readers: Not that you would consider this for any other Internet protocol either, but this kind of requirement is especially ridiculous for a peer-to-peer protocol like SIP, since most of the data in a VoIP call may not route through the operator's network at all, so it would be hard to justify the operator charging for traffic that is actually happening in some other operator's network!)

JH: I don't recall this, but time based billing of SIP calls would be very difficult to implement without also getting involved with routing of media. That, in turn, would mean that most of the advantages that SIP based telephony has over POTS would be lost.

MySQL: What do you think about the IP Multimedia Subsystem?

JH: IMS is a next generation implementation of walled garden telecommunication services. I let it live its own life. I don't care if some users are too rich or lazy and choose IMS instead of open Internet based services as long as I'm not forced to do so.

MySQL: What will happen to service providers (mobile and fixed), especially as VoIP companies provide much cheaper calls and web companies like Google are offering services? Will the carriers be reduced to bitpipes?

JH: Mobile or fixed Internet access is always worth the money and I gladly pay for it. What I don't like is when operators start to milk their cows without providing any real added value, e.g., by charging huge roaming fees for mobile Internet access. It is operators' own choice if they let companies like Skype and Google take away their customers by not providing their own Internet based telecommunication services.

MySQL: Or asking the same question differently, who will eventually be our service provider for voice calls? Google, Nokia, my current telecom operator, or the current VoIP service providers or maybe some decentralised non-commercial and free peer-to-peer VoIP network?

JH: To me telephony is just yet another Internet application. The same parties will be providing it in the future that today are providing email, web, etc. services. In case of my own company, TutPro Inc., it is TutPro Inc. itself, because I don't like the idea that someone else (perhaps with ties to government spy agencies) is storing my emails or routing my VoIP calls.

MySQL: What is your view on convergence? Or even simpler, what is convergence?

JH: Convergence is a term that I don't fully understand. My goal is to be able to use Internet for all my communication needs. What prevents it from happening today is too slow and (sometimes) too expensive mobile Internet access that is unsuitable for real-time communications.

MySQL: So, tell us more about your current projects. What are you working on now?

JH: I have an OpenSER and SEMS based SIP platform called OpenSIPg that a few operators and organizations in Finland and Sweden use to offer their VoIP and presence services. Developing OpenSIPg keeps me busy but, thanks to mobile Internet access, does not tie me physically to any particular place or country.

One new thing that I have been working on is a simple, certificate-free mechanism for reliable verification of trusted peers. It is based on the RADIUS protocol and a broker model similar to what was used a long time ago for dial-up access.

MySQL: I know you recommend MySQL Cluster to your customers as the database to go with OpenSIPg. What is the database mainly storing, and what features make MySQL Cluster the best fit?

JH: Well, firstly OpenSER SIP proxy keeps all location and presence data in MySQL database tables. My own principle in developing OpenSIPg has been that my customers should not need to edit any text files when they provision users or the VoIP infrastructure itself. So all OpenSIPg information is kept in MySQL databases, where it can be accessed and manipulated via web based GUIs.

The databases should naturally be resilient and therefore a clustered implementation is the best fit.

MySQL: By the way, for the more technical readers, do you have any kind of numbers about the loads OpenSER and the database behind it must support? Like calls per second or SQL transactions per second? (I know the Finnish operators are not the biggest in the world, but still.)

JH: None of my customers have hit, or even been close to, any performance limits yet. Nevertheless, a good SIP proxy design tries to minimize the number of database operations that need to be performed per request. We thus recommend MySQL Cluster more for high availability than for performance reasons.

MySQL: If you had 3 wishes - but restricted to MySQL Cluster - what would you wish for?

JH: I would wish that MySQL 5.1 would become available also as Debian/Ubuntu packages, because the cluster capabilities in 5.1 are more developed than those in 5.0. From a maintenance point of view it is not a good idea to install software onto servers from tar files. My other wishes are related to ease of use: setting up and running MySQL Cluster should not require an advanced degree in database administration.

MySQL: So let's see, your product is based on Linux, OpenSER, PHP, FreeRADIUS and MySQL. What is the importance of Open Source in Telecom? What can Open Source do for Telecom?

JH: Open Source is important for everyone. The large developer and user communities of Open Source software can rapidly produce higher quality software than even the biggest companies can on their own.

MySQL: Years ago, we had an email chat about a mobile application that was using SMS messages to communicate with a server. Your quick comment was: "Nice, if you want to use such legacy technology." Pioneer spirit that you are, where do you see the border between "legacy" and "modern" in 2008?

JH: I think I was referring to SMS as a "legacy" technology because SMS is not a terminal- and network-independent Internet application. That is still true today, and for some strange reason even Nokia has not yet made SIP based messaging available in its phones.

MySQL: And what will be legacy in 2011?

JH: I'm afraid that in 2011 there will still exist mobile-network-specific services that do not work end-to-end unless each mobile operator has made a bilateral agreement with every other mobile operator. Such a service model simply does not scale, nor does it lead to rapid development of innovative services.

MySQL: Thanks Juha for taking the time to talk to us, it has been a pleasure. And all the best to your future projects.


The people of the MySQL Telecom team write about developments around MySQL and MySQL Cluster and how these products are used by our Communications industry customers. (Image jasmic@Flickr)

