Saturday Mar 14, 2009

Script to start lots of MySQL Cluster nodes on Sparc CMT

I recently had the chance to play with the new T5140 servers. Thanks to the SPARC CMT architecture, these servers present an amazing 128 CPUs to the operating system (as a combination of cores and hardware threads; there are only 2 sockets).

We are doing some trials with eager Sun customers who want to put these babies to use. The good news is that MySQL Cluster 7.0 (aka 6.4) supports a multi-threaded data node option. The bad news is that one ndbd process still only uses about 8 CPU cores, so to utilize all 128 there is some way to go: we still have to launch many ndbd processes to get the full power out of these boxes. But with 7.0 there is at least a point in trying at all.
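
For reference, the multi-threading is controlled from config.ini. Here is a minimal sketch of the relevant fragment (the memory values are placeholders, and the exact MaxNoOfExecutionThreads limits should be double-checked against the 7.0 documentation):

[ndbd default]
NoOfReplicas = 2
DataMemory = 4G
IndexMemory = 1G
# Tell the multi-threaded data node (ndbmtd) how many threads it may use.
# In 7.0 the useful maximum is around 8, matching the "about 8 CPU cores
# per process" observation above.
MaxNoOfExecutionThreads = 8

With that in place, each ndbmtd started by the script below can use several cores, and the script simply multiplies the number of processes to cover the rest of the box.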

I developed a simple script which lets me easily start a varying number of ndbd and mysqld processes on one host (and then copy the script to start the same number of processes on another host). If you have been using Johan's excellent severalnines scripts, I should explain that here I'm trying to do exactly the opposite of what those do. The benefit of the severalnines scripts is that you can comfortably start and manage the whole cluster from one single command line; they will ssh into your other servers for you and execute the needed commands. This script does not do that; indeed, the point is just to make it simple to start 6 ndbd and 6 mysqld processes on the same server I'm logged in to.


#!/bin/bash

# This is a simple script to start many ndbd and mysqld processes on one host.
# It is useful for the newest Sparc CMT architectures where you may want to
# start many processes per physical server.
#
# henrik.ingo@mysql.com
#
#
# Copyright 2000-2008 MySQL AB, 2008 Sun Microsystems, Inc.
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; version 2 of the License.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; see the file COPYING. If not, write to the
# Free Software Foundation, Inc., 51 Franklin St, Fifth Floor, Boston
# MA 02110-1301 USA.
#############################################################################

INSTALLDIR='/usr/local/mysqlcluster/'
# Note: the relative paths below (libexec/, var/, mysql-cluster/) assume the
# script is run from the install directory.
cd "$INSTALLDIR"

# The node-id's to start, odd on one host, even on the other
NUM_NDBD='3 5 7 9 11 13'
#NUM_NDBD='4 6 8 10 12 14'

# I also had some code to circulate ndbd's to bind to 3 different
# NIC's, but this is omitted here for simplicity.

# How many mysqld's to start on each host. In this case we don't need to
# use a cluster node-id; instead these numbers are used in the datadir
# pathname and in the port and socket to listen on.
NUM_MYSQL='01 02 03 04 05 06'

# Whether to start with --initial or not
INITIAL='--initial'

# Large memory pages Solaris optimization
#LD_PRELOAD=mpss.so.1
#export LD_PRELOAD
#MPSSHEAP=4M
#export MPSSHEAP

# MySQL Cluster 6.4 management daemon tries to store configuration state,
# which is annoying when you fiddle with config.ini a lot, so delete the
# stored configs.
rm mysql-cluster/ndb_1_config.bin.1
libexec/ndb_mgmd --ndb-nodeid=1 --config-file=/etc/mysql/config-kludge.ini

sleep 4

for i in $NUM_NDBD
do
libexec/ndbmtd $INITIAL --ndb-nodeid=$i --ndb-connectstring=$HOSTNAME:1186 &
done

sleep 10

for i in $NUM_MYSQL
do
mkdir var/var$i
bin/mysql_install_db --datadir=var/var$i >/dev/null
chown -R mysql var/var$i
bin/mysqld_safe --datadir=${INSTALLDIR}var/var$i/ --port=33$i --socket=/tmp/mysql.$i.sock &
# Not needed, running with skip-grant-tables instead
#bin/mysql --socket=/tmp/mysql.$i.sock -u root -e "GRANT ALL ON *.* TO 'root'@'%'"
done

Oh, you want to hear results from my tests? Sorry, maybe later, gotta go now...

Monday Dec 15, 2008

How much DataMemory+IndexMemory do you need for disk data?

One thing Massimo and I were guessing at yesterday is: if you store large
blobs as disk data, how much DataMemory and IndexMemory will they consume?
(The primary key, the hidden primary key of each blob "chunk", and the first
256 bytes of each blob are all stored in memory...)

My empirical test showed that about 2% of the total size of the blobs is
needed in RAM (25% of that being IndexMemory).

IMHO this is close to negligible in relative terms, but in many situations
it is not negligible at all in absolute terms (you may have close to a TB of
disk data -> 20 GB of RAM needed just for the disk data).

Also note that this is a minimum figure. If you actually have something
other than the blob (like other indexes) you will of course use much more RAM.

The test was:
CREATE TABLE `jpgtest` (
`id` int(11) NOT NULL,
`jpg` blob,
PRIMARY KEY (`id`)
) TABLESPACE ts_1 STORAGE DISK ENGINE=ndbcluster;

and inserting 100k blobs into that table (7+ GB in total).

Details below.

PS: Note that Johan just posted several excellent posts on using MySQL Cluster disk based data:
http://johanandersson.blogspot.com/2008/12/disk-data-summary.html
http://johanandersson.blogspot.com/2008/12/disk-data-counters-more.html
http://johanandersson.blogspot.com/2008/11/disk-data-counters.html

****** Loading 100k files as blobs into an NDB disk data table. *****
(Simple test, one datafile, one insert thread, etc...)

GRANT ALL ON *.* TO 'root'@'';
-- tablespace and undo log
CREATE LOGFILE GROUP lg_1
ADD UNDOFILE 'undo_1.dat'
ENGINE NDB;

CREATE TABLESPACE ts_1
ADD DATAFILE 'data_1.dat'
USE LOGFILE GROUP lg_1
INITIAL_SIZE 10G
ENGINE NDB;

use test;

CREATE TABLE `jpgtest` (
`id` int(11) NOT NULL,
`jpg` blob,
PRIMARY KEY (`id`)
) TABLESPACE ts_1 STORAGE DISK ENGINE=ndbcluster;

-bash-3.2$ cat loadpics.pl
#!/usr/bin/perl

use DBI;

# MySQL CONFIG VARIABLES
$hostname = "ndb05";
$database = "test";
$tablename = "jpgtest";
$user = "root";
$pw = "";

$dsn = "DBI:mysql:database=$database;host=$hostname;port=3306";

$dbh = DBI->connect($dsn, $user, $pw);
$drh = DBI->install_driver("mysql");

$n = 100000;
open FH, 'fakepic.jpg' or die "cannot open fakepic.jpg: $!";
binmode FH;
{ local $/; $jpg = <FH>; }  # slurp the whole (binary) file

$i=0;
while ($i < $n)
{
$i++;

$sth = $dbh->prepare("INSERT INTO jpgtest VALUES (?, ?);");
$sth->bind_param(1, $i, {TYPE => SQL_INTEGER});
$sth->bind_param(2, $jpg, {TYPE => SQL_BLOB});
$sth->execute;
print "$i\n";
}

-bash-3.2$ ls -lFh
total 108K
-rw-r--r-- 1 hingo hingo 100K 2008-12-09 16:26 fakepic.jpg
-rw-r--r-- 1 hingo hingo 634 2008-12-09 20:48 loadpics.pl

**************
Load speed:
real 24m36.396s
user 1m24.002s
sys 0m13.382s

mysql> select count(*) from jpgtest;
77831

77831 records
1476.4 seconds
52.72 rows/sec <<<<<<<<<<<
102404 bytes/row
5398420.02 bytes/sec
5.15 MB/sec <<<<<<<<<<

*****************
Data usage:
7970205724 bytes total
7.42 GB total (data inserted) <<<<<<<<<<<

[root@ndb05 mysql]# ls -laiFh /data1/mysqlcluster/ndb_2_fs
total 11G
21200937 -rw-r--r-- 1 root sroot 11G 2008-12-10 12:09 data_1.dat
21200936 -rw-r--r-- 1 root sroot 128M 2008-12-10 12:09 undo_1.dat
(The above means nothing, the sizes are as specified when created. However, it is interesting to note that 10GB in MySQL becomes 11GB in the filesystem...)

(This query from Johan's blog)
mysql> select free_extents, total_extents from information_schema.files where file_type='datafile';
+--------------+---------------+
| free_extents | total_extents |
+--------------+---------------+
| 5284 | 10240 |
| 5284 | 10240 |
+--------------+---------------+
(Interesting... Why are more than half of my extents still free,
even if I inserted 7 GB into a 10 GB file? Something in this is not right...)

-bash-3.2$ tail ndb_1_cluster.log
2008-12-10 12:11:40 [MgmSrvr] INFO -- Node 3: Data usage is 11%(3704 32K pages of total 32768)
2008-12-10 12:11:40 [MgmSrvr] INFO -- Node 3: Index usage is 28%(4725 8K pages of total 16416)

DataMemory
3704 pages
32 KB/page
118528 KB
115.75 MB

IndexMemory
4725 pages
8 KB/page
37800 KB
36.91 MB

RAM vs datafile ratio:
7.42 GB of largish blobs will use about
152.66 MB of RAM (indexes, hidden "chunk" indexes, the beginning of each blob in RAM...)
Conclusion: Allocate
2.01 %
of the size of your disk data blobs for RAM!

24.18 % of that is IndexMemory

Friday Nov 28, 2008

How to use JDBC (Connector/J) with MySQL Cluster

Last week I helped a customer set up a JBoss application against MySQL Cluster. It turns out it is not immediately obvious how you should set up our JDBC connector to do load balancing and failover. For instance, setting the connector up for a Master-Slave setup (with MySQL Enterprise) is well documented, but doing the same with MySQL Cluster is not.

It's not properly documented in the manual itself, but I found in the changelogs, and confirmed on IRC, that to do load balancing across the SQL nodes in MySQL Cluster you use a different JDBC connection string, with the "loadbalance" keyword added:


jdbc:mysql:loadbalance://host-1,host-2,...host-n/database

That does indeed load balance; however, it didn't properly handle failover. When an SQL node is killed, the driver still tries to round-robin queries to all specified hosts, resulting in exceptions 50% of the time (with 2 nodes, that is).

After further digging (in fact, my colleague got this from Mark Matthews himself) I finally found out that the correct string to use is:


jdbc:mysql:loadbalance://host-1,host-2,...host-n/database?loadBalanceBlacklistTimeout=5000

The "loadBalanceBlacklistTimeout" adds the needed feature that failed connections in a connection pool are put aside for the specified time, and only working connections are utilized.

That's all that is needed. It is simple and beautiful once you get it to work!

Update: I should add that transactions that are running while a node crashes will still be rolled back and return an exception. This is by design, and it is then up to the application to decide whether to give up or retry. If you retry the transaction, the JDBC driver will pick a working SQL node for the second try.

Sunday Sep 28, 2008

Accessing your MySQL data whatever way you want it (Part 2, InnoDB)

In the previous post we had a look at the MySQL Cluster NDB API and how it enables direct access to the MySQL Cluster data nodes, and therefore also enables access through other protocols than SQL.

I've often asked myself: since NDB is so great for MySQL Cluster, is there anything similar for MySQL Server (the not-cluster version...)? A couple of months ago Kazuho Oku did something like that and wrote about it in his blog.

The context for Kazuho's work is the social network use case: 1) You have users. 2) Some users are linked to each other as friends. 3) When a user logs in, he should see a timeline of events/messages from his friends. In a previous post he had already tested the difference between a "pull" and a "push" approach. (Taking a small sidetrack here, imho both approaches are wrong: the messages/events should first be submitted only to the originating user's table, then copied to each recipient by an asynchronous background process. This would give you the best of both worlds, the submission speed of the pull model and the retrieval speed of the push model. Anyway...)

For the test we are talking about now, Kazuho is exploring the pull model: when a user logs in, a query is executed to fetch messages/events from all of the user's friends. Kazuho then compared 3 ways to do this: by SQL from the app, by a stored procedure that does everything at once, and by a MySQL User Defined Function. (A UDF is something you write and compile in C and install as a plugin to the server. It can then be called simply as "SELECT functionname(...);".) The UDF accesses the InnoDB data structures directly using MySQL and InnoDB internal functions, so it is reminiscent of using the NDB API to bypass SQL in MySQL Cluster.
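
To make the UDF mechanism concrete, this is roughly how such a function is installed and called once it has been compiled into a shared library (the function and file names here are made up for illustration; Kazuho's actual UDF is of course different):

-- Install the compiled UDF; the .so must be somewhere the server can load it
-- from (the plugin directory in recent versions).
CREATE FUNCTION fetch_timeline RETURNS STRING SONAME 'fetch_timeline.so';

-- Call it like any built-in function, e.g. for the timeline of user 42:
SELECT fetch_timeline(42);

-- And remove it again:
DROP FUNCTION fetch_timeline;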

Kazuho's results are clear:

Building Timelines on MySQL (timelines / sec):

  SQL                        56.7
  Stored Procedure            136
  UDF using Direct Access   1,710

1) This is a good example of a use case where using a stored procedure gives you an advantage over raw SQL. Whether or not you think MySQL stored procedures are inefficient, these numbers are clear: the stored procedure approach is 2.5 times more efficient.

2) The UDF rocks! Accessing InnoDB structures directly, it is 10+ times faster than the stored procedure.

There is one drawback though. Accessing the storage engine directly inside MySQL is a bit dangerous. There is no well-defined API, so there is no guarantee that the next version will not break your UDF. Well, I guess it usually won't, but in theory at least it could. And a UDF is something you have to install in the server; it is not a client API in that sense. But getting 10 times better performance is something to think about, if you're ready to get your hands dirty.

PS. I hear the social networking problem is an especially interesting one from this point of view, in that it doesn't map easily to plain old relational databases and SQL. Getting what you want is a bit inefficient with SQL. Kazuho's UDFs show that it can be done tenfold more efficiently, by accessing the data in a more optimal way. This is of course exactly the point of native data access.

Monday Sep 01, 2008

Accessing your MySQL data whatever way you want it

One way to look at a database is that


  1. there is data, and

  2. there are ways to access data.

This dichotomy was actually coined (whether intentionally or not) by Vinay in the MySQL telecom team when discussing the MySQL Cluster vision some months ago.

Even if you typically think of MySQL Cluster as just a clustered version of the plain old MySQL server, it is actually more like the opposite is true, if you consider the architecture and history of MySQL Cluster. The original cluster was just the data store called Network DataBase, or NDB as we familiarly know it still. Then MySQL Server was integrated on top of that to provide an SQL interface. The original and "native" NDB interface is still there though, and many prefer to use direct C++ or Java access to their NDB-clustered data. It is faster in general, but especially applications with a real-time requirement will benefit from bypassing the MySQL Server: no additional network hop and no parsing of SQL, just direct access to your data. Sometimes you might also benefit from being able to do things with the NDB API that cannot be efficiently expressed in SQL at all.

But did you know that in addition to the SQL and NDB API there are actually several more ways to interface with MySQL Cluster available:


  • LDAP was presented at a MySQL User Conference BOF last spring and is now available as an option in the Carrier Grade Edition. The LDAP interface is actually an OpenLDAP server node, using a native NDB backend in slapd.

  • mod_ndb is a REST Web Services API for MySQL Cluster. This one is (obviously) implemented as an Apache module. Although produced by our own John 'JD' Duncan, it is not a Sun supported product.

  • We also know of a case where MySQL Cluster is accessed through the RADIUS protocol, although I don't think this implementation is publicly available.

  • And someone also does it with DIAMETER, a successor to RADIUS.

I don't know the details of the last two, but at least the first two use NDB directly. That is much more efficient and convenient than, for instance, doing LDAP->SQL conversions when SQL really isn't needed in the first place. Moreover, note that all these interfaces are equal citizens with mysqld - they are all just API nodes. Meaning, you could have one big cluster and access that same data with SQL, LDAP, HTTP, RADIUS and DIAMETER, and of course directly from your application code with the NDB C++ or Java API. Which brings us back to the title of this blog post: you have data, and you have ways to access the data. Whichever way suits you best.

Then of course the interesting question: are there more protocols/APIs out there for MySQL Cluster that we don't know about? (Or that I omitted by mistake?) Are there some protocols that would be interesting to implement? Let us know at hingo at mysql dot com (or comment below)!

Wednesday Aug 13, 2008

MySQL perspectives from a SQL Server guru

Ben Kaufman at SQLServerCentral introduces MySQL to the SQL Server DBA crowd. All in all his views seem to be fairly positive, in particular the MySQL Cluster experience:


NDB is the gem of MySQL, originally developed by Ericson to track cell phone calls this is a share nothing cluster engine stored in memory. This is a true cluster that supports both high availability and load balancing. [...]
This engine is similar to synchronous mirroring in SQL Server in that it is a 2-phase commit, the difference being the commit is done in memory at the data layer not the log. Logs are hardened at a later time, with the theory being that since data is committed on multiple nodes the data is safe and doesn't require a log flush as part of the transaction. [...]
For pure performance the cluster is comparable with a single instance of SQL Server. I've found on selects it beats SQL Server slightly as long as the data on SQL Server is in memory. This gap widens as more connections are made. Writes depend on whether SQL Server is using write back cache on its controller, in the case it is, it beats NDB, due to NDBs 2-phase commit. Without the controller cache NDB beat SQL. However this is not apples to apples. When compared to SQL Server synchronous mirroring NDB wins hands down. The cost associated with NDB is that it resides in memory (5.1 allows you to keep non indexed data on disk), and therefore your dataset is limited by the amount of memory you use. [...]
With all the negatives put aside if you have an application that requires redundancy, and fast inserts and selects on a single table, NDB is the best product out there. We've been running it for almost 18 months and it's been rock solid. Compared with other SQL Server and other MySQL implementations this has required the least amount of attention. One final note this will not run on Windows.

Other storage engines also get a good evaluation:


Myisam whether intentional or not is built and optimized for read-only datasets. Some of the features that make this the case is the fact that it doesn't support transactions, so taking user updates would be dangerous, but you don't incur the overhead of the transactional log. It performs locking at the table level not the row or page, which is not the best for active OLTP systems, unless the system is only doing inserts. On an application only performing inserts it performs well because it has the ability to perform concurrent inserts and selects. This is because data is appended to the end of the file, and a table lock is not issued, allowing for selects of the data not to be blocked.

Wow, even I had missed this fact about MyISAM. There's always so much talk about MyISAM performing badly with "concurrent writes" that I didn't realize pure INSERTs are not the problem, only UPDATEs are. I immediately tested this, and on a 2-core system (my laptop) the performance of bulk inserts doubled when using 10 threads instead of 1.
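
For reference, this behavior hinges on MyISAM's concurrent_insert setting, which is on by default. Here is a minimal sketch of how to check it and of the kind of insert-only table it helps with (table and column names are made up for illustration):

-- 1 (the default) lets INSERTs append to the end of the data file while
-- SELECTs are running, as long as there are no holes from deleted rows.
SHOW GLOBAL VARIABLES LIKE 'concurrent_insert';

-- A hypothetical insert-only collector table: many threads can append rows
-- while readers query it, without blocking each other on the table lock.
CREATE TABLE event_log (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  created TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  msg VARCHAR(255)
) ENGINE=MyISAM;

INSERT INTO event_log (msg) VALUES ('collected event');
SELECT COUNT(*) FROM event_log;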

Thanks Ben, this is useful info. There are many networking use cases where you have to collect lots of data from network elements into some kind of aggregator or mediator, where the data is then queried, forwarded and deleted later on, but never UPDATEd. It seems MyISAM is a good fit for this kind of use case after all, not just for pure data warehousing. (Especially with PARTITIONing, where you can drop full tables/partitions instead of deleting a bunch of rows.)

Friday Jun 06, 2008

Family of MySQL Cluster bloggers

While this blog is co-authored by the whole MySQL Telecom team, many members in or around the team also write their personal blogs, which you will find very useful. So please follow me on a tour of the absolute top MySQL Cluster blogs in the world:

Johan Andersson is the MySQL Cluster Principal Consultant, and has been with MySQL Cluster since the Ericsson days. He travels around the world to our most demanding customers and shares his guru advice. Rumor has it that recently on a training gig the students made him sign their MySQL t-shirts; can you get closer to living like a rock star than that? Occasionally he also shares some great tips and status info on his blog. Right now, for instance, you can find a set of handy scripts to manage your whole MySQL Cluster from one command line, definitely recommended to try!

Jonas Oreland is the Architect of MySQL Cluster. Now, we mere mortals may not always understand everything he is writing about in his blog, but if you want to know what is happening in the MySQL Cluster codebase right now, this is the place to go. And this is really cutting edge; the stuff he writes about in his latest post may not appear in a GA release until next year.

Speaking of architects, it is natural to next introduce Mikael Ronström, Father of MySQL Cluster. Yes, Mikael is the one who came up with the whole architecture of MySQL Cluster; we have him to thank for 100,000 writes/sec and linear scalability. (Thank you!) Mikael actually isn't on the Cluster team anymore; he has for some time been working on the general MySQL Server, on things like replication and performance improvements. For the newest benchmarks of MySQL Server and MySQL Cluster, go to Mikael's blog.

Having dealt with the old and honourable Ericsson alumni, the next blog I want you to follow is Jon Stephens'. Jon is a technical writer, meaning he writes the MySQL Cluster manual. He is very diligent, constantly pestering the developers to divulge information on how the Cluster works, to the benefit of all of us Cluster lovers. In short, if you want to know how Cluster works, ask Jon, he will know.

Like a cousin to Jon (in Cluster trivia, at least :-) is Ramon Graham, Product Manager for MySQL Cluster. His is a relatively new blog, but it appeared right on time to answer the worrying question: where did MySQL Cluster disappear to? In general, if you are ever wondering where MySQL Cluster is going... talk to Ramon.

MySQL Cluster may have been born in Sweden, but there is a strong Australian angle to it... So let me finish by introducing our 2 Australian Cluster heroes. (I have a theory about why these 2 hyperactive guys can get so much done every day: Australia is in such an early time zone that they simply have more hours in a day than the rest of us do!)

First up is Stewart Smith, developer in the Cluster team and beloved lecturer about MySQL Cluster. Apart from Jonas' blog, this is another blog to follow if you want to follow the development of MySQL Cluster. But Stewart writes actively about many things; in fact he is presiding over the organisation of the next linux.conf.au - one of the most popular Linux developer conferences in the world.

We started with a consultant, and we will end with one: the MySQL-HA blog is where Monty Taylor writes together with Alexander Rubin about Cluster, High Availability and performance. Monty is also one of those consultants traveling around the world, from the Amazon basin in Brazil to various European capitals. He recently mentioned that he has never been to China, so if you are in China and considering hiring a MySQL Cluster consultant, be sure to contact our Sales department! Monty is also the man behind the NDB connectors... hmmm, the NDB bindings project, which provides Python/PHP/Perl/Ruby etc. bindings to the native NDB API of MySQL Cluster. If you want to qualify as a MySQL Cluster geek, get familiar with one of the NDB bindings! (Of course, the native C++ NDB API is also an option.)

I believe those are the blogs of the Cluster team & friends I know about. But if there are more, let me know and we'll add more blogs to the end of this post. By the way, the Cluster team is getting an infusion of old-Sun database cluster experts, maybe there are some blogs there we should know about? Let me know!

Update: Another Swede, Magnus Svensson, had quietly joined the Cluster team just while I was writing this article. Today (June 24th) he has a great tip for those of you who want to have your first touch of the MySQL Cluster code: how to insert a simple printf statement for debugging purposes.

PS: I personally also have a blog, where I will never ever write about anything MySQL Cluster related (because I do that here); it focuses on Open Source: phenomena, culture, business models and trends. Feel free to pop by for a leisurely non-technical read at The Open Life blog.

Friday May 23, 2008

"Telephony is just yet another Internet application." MySQL talks with Juha Heinänen.

During 2008 we are planning to run a series of interviews with interesting persons somehow related to the telecom field. In this first installment, we will have a chat with Juha Heinänen from Finland.

MySQL: Juha, you are a former professor of Computer Science and Communication Technology, a CTO (or similar) at at least Sonera, Telia and Song, a former ATM specialist, responsible for bringing the Internet to Finland and registering the .fi top-level domain, a consultant for many early network equipment startups some of which are still with us today, and the author of several RFCs. Nowadays you are a core contributor to the OpenSER SIP proxy, and you sell a VoIP platform called OpenSIPg, based on it, to Nordic operators. In addition, you seem to live a life that many hackers would dream of, spending time in different parts of the world hacking on your favorite Open Source project. Even to this day, I don't think we've ever met in person.

MySQL: For many years now you have been working with the SIP protocol and the OpenSER SIP proxy. When did you first turn your eye to SIP, and why did you become interested in it?

JH: When working for Song Networks (now TDC) in the late 1990s and early 2000s, we saw a need for a hosted VoIP service for small businesses. Due to my IETF background, a natural protocol choice for me was SIP instead of the then dominant H.323. At first we trialled a Swedish commercial SIP proxy called Hotsip, but didn't find it flexible enough for our purposes. I then heard about an Open Source SIP proxy project called SER, saw its great potential, and soon became a SER developer, although I had not written a single line of code during the previous 10 years.

MySQL: Knowing that you enjoy coding, it must have been great to return to it! By the way, what is your short, 2-3 sentence introduction to OpenSER?

JH: The OpenSER project is a spin-off of the SER project. Our aim in OpenSER is to bring to the market a well-tested new release of the OpenSER SIP proxy every 10 months or so. Today OpenSER is a very successful project with many high-quality developers and a wide user community.

MySQL: How do you see the Internet vs the traditional telephone network? Will SIP (or some other Internet based protocol) eventually completely replace the Plain Old Telephone System?

JH: This is hard to answer, because there always exists the "dark side" that wants to retain the old walled garden style POTS service no matter what equipment or protocols they internally use. These people see telephony as something special, not just yet another Internet application.

MySQL: I remember once talking to you when you were furious about an operator who insisted that you implement minute-based billing for OpenSER :-) I guess you never did that for them?

(Note to readers: Not that you would consider this for any other Internet protocol either, but this kind of requirement is especially ridiculous for a peer-to-peer protocol like SIP, since most of the data in a VoIP call may not route through the operator's network at all, so it would be hard to justify the operator charging for traffic that is actually happening in some other operator's network!)

JH: I don't recall this, but time based billing of SIP calls would be very difficult to implement without also getting involved with routing of media. That, in turn, would mean that most of the advantages that SIP based telephony has over POTS would be lost.

MySQL: What do you think about the IP Multimedia Subsystem?

JH: IMS is a next generation implementation of walled garden telecommunication services. I let it live its own life. I don't care if some users are too rich or lazy and choose IMS instead of open Internet based services as long as I'm not forced to do so.

MySQL: What will happen to service providers (mobile and fixed)? Especially as VoIP companies provide much cheaper calls. And web companies like Google are
offering services. Will the carriers be reduced to bitpipes?

JH: Mobile or fixed Internet access is always worth the money and I gladly pay for it. What I don't like is when operators start to milk their cows without providing any real added value, e.g., by charging huge roaming fees for mobile Internet access. It is operators' own choice if they let companies like Skype and Google take away their customers by not providing their own Internet based telecommunication services.

MySQL: Or asking the same question differently, who will eventually be our service provider for voice calls? Google, Nokia, my current telecom operator, or the current VoIP service providers or maybe some decentralised non-commercial and free peer-to-peer VoIP network?

JH: To me telephony is just yet another Internet application. The same parties will be providing it in the future that today are providing email, web, etc. services. In case of my own company, TutPro Inc., it is TutPro Inc. itself, because I don't like the idea that someone else (perhaps with ties to government spy agencies) is storing my emails or routing my VoIP calls.

MySQL: What is your view on convergence? Or even simpler, what is convergence?

JH: Convergence is a term that I don't fully understand. My goal is to be able to use Internet for all my communication needs. What prevents it from happening today is too slow and (sometimes) too expensive mobile Internet access that is unsuitable for real-time communications.

MySQL: So, tell us more about your current projects. What are you working on now?

JH: I have an OpenSER- and SEMS-based SIP platform called OpenSIPg that a few operators and organizations in Finland and Sweden use to offer their VoIP and presence services. Developing OpenSIPg keeps me busy but, thanks to mobile Internet access, does not tie me physically to any particular place or country.

One new thing that I have been working on is a simple, certificate-free mechanism for reliable verification of trusted peers. It is based on the RADIUS protocol and a broker model similar to what was used a long time ago for dial-up access.

MySQL: I know you recommend MySQL Cluster to your customers as the database to go with OpenSIPg. What is the database mainly storing, and what features make MySQL Cluster the best fit?

JH: Well, firstly the OpenSER SIP proxy keeps all location and presence data in MySQL database tables. My own principle in developing OpenSIPg has been that my customers should not need to edit any text files when they provision users or the VoIP infrastructure itself. So all OpenSIPg information is kept in MySQL databases, where it can be accessed and manipulated via web-based GUIs.

The databases should naturally be resilient and therefore a clustered implementation is the best fit.

MySQL: By the way, for the more technical readers, do you have any kind of numbers about the loads OpenSER and the database behind it must support? Like calls per second or SQL transactions per second? (I know the Finnish operators are not the biggest in the world, but still.)

JH: None of my customers have hit, or even been close to, any performance limits yet. Nevertheless, a good SIP proxy design tries to minimize the number of database operations that need to be performed per request. We thus recommend MySQL Cluster more for high availability than for performance reasons.

MySQL: If you had 3 wishes - but restricted to MySQL Cluster - what would you wish for?

JH: I would wish that MySQL 5.1 would become available also as Debian/Ubuntu packages, because the cluster capabilities in 5.1 are more developed than those in 5.0. From a maintenance point of view it is not a good idea to install any software on servers from tar files. My other wishes are related to ease of use. Setting up and running MySQL Cluster should not require a high degree of database administration expertise.

MySQL: So let's see, your product is based on Linux, OpenSER, PHP, FreeRADIUS and MySQL. What is the importance of Open Source in Telecom? What can Open Source do for Telecom?

JH: Open Source is important for everyone. The large developer and user communities of Open Source software can rapidly produce higher quality software than even the biggest companies can on their own.

MySQL: Years ago, we had an email chat about a mobile application that was using SMS messages to communicate with a server. Your quick comment was: "Nice, if you want to use such legacy technology." Pioneering spirit that you are, where do you see the border between "legacy" and "modern" in 2008?

JH: I think I was referring to SMS as "legacy" technology because SMS is not a terminal- and network-independent Internet application. That is still true today, and for some strange reason even Nokia has not yet made SIP based messaging available in its phones.

MySQL: And what will be legacy in 2011?

JH: I'm afraid that in 2011 there will still exist mobile-network-specific services that do not work end-to-end unless each mobile operator has made a bilateral agreement with every other mobile operator. Such a service model simply does not scale, nor does it lead to rapid development of innovative services.

MySQL: Thanks Juha for taking the time to talk to us, it has been a pleasure. And all the best to your future projects.

About

The people of the MySQL Telecom team write about developments around MySQL and MySQL Cluster and how these products are used by our Communications industry customers. (Image jasmic@Flickr)
