Tuesday Nov 10, 2009

Q&A on VDI and MySQL Cluster

originally blogged by Tino Rachui

I recently gave an interview on Sun VDI and MySQL Cluster to Lenz Grimmer, MySQL Community Relations Manager. It has been published on dev.mysql.com. Check it out: http://dev.mysql.com/tech-resources/interviews/tino-rachui-sun-vdi-cluster.html


Monday Sep 14, 2009

Virtualization for MySQL on VMware

originally blogged by Tino Rachui

In case you missed it: there is a document, "Virtualization for MySQL on VMware: Best Practices and Performance Guide", which compares virtualized MySQL Server instances running on VMware ESX with non-virtualized MySQL Server instances running on bare metal. Additionally, the document contains "Best Practice" configurations for running MySQL Server on VMware ESX. Please note that the document is about MySQL Server and not MySQL Cluster, which has to run on bare metal.

Why is all this important in the context of VDI 3.x? You have probably heard by now that Sun VDI 3.x uses MySQL to host the VDI configuration. Second, the VDI Single Host configuration, which we introduced with VDI 3 Patch 2, requires a MySQL Server. With the ability to run MySQL Server in a virtualized environment, you also gain the possibility of running the whole VDI Single Host configuration in a virtualized environment. You will have to experiment with it a little, though, as we haven't tested this particular configuration extensively in our labs yet.

Thursday May 21, 2009

VDI 3 - Why you need 3 VDI hosts and what you can do about it

originally blogged by Tino Rachui

VDI 3 has been out for quite a few days now, and the feedback we've received so far ranges from positive to total excitement. That's a nice and really satisfying reward for all the hard work necessary to get this product done in the given time frame. Several things have already been written about VDI 3 and How it all came together, so I don't want to repeat such general information here. I'd rather concentrate on a specific aspect of VDI 3, namely the MySQL database integration. Why do I want to do that?

  1. I'm the engineer responsible for the MySQL database integration (you now have somebody to blame directly for any shortcomings in this particular area ;))
  2. Feedback we've received so far on various tech and feedback aliases suggests that there are a few open questions around this area, and a few misunderstandings as well (apparently the VDI 3 documentation still has some gaps to be filled in this regard).

VDI 3 has a far more sophisticated data model than its predecessor, VDI 2. This data model fits almost naturally into a relational data model, so we moved away from using the Sun Ray or Sun SGD data store (both LDAP databases) and decided to use a relational database instead. Around the time of this decision, Sun announced that it would acquire MySQL. Given this turn of events and the reputation of MySQL, it was not exactly difficult to choose a specific relational database for VDI 3. Besides that, what other requirements influenced the VDI 3 database design?

  1. No single point of failure (SPOF). As VDI 3 is a system destined to serve hundreds of users 24x7, the system shouldn't have a SPOF.
  2. No MySQL knowledge required to get started with VDI 3. People should be able to get started with VDI 3 in production without having to learn about relational databases in general and MySQL in particular, and yet be able to quickly set up a highly available (HA) system.
  3. Flexibility. VDI 3 is supposed to serve small to large deployments. What is optimal for smaller deployments will most likely not be optimal for larger deployments as well. So VDI 3 should offer the flexibility to satisfy a broad range of deployment scenarios.

All three are important requirements, but as so often in life you cannot treat everything as equally important and have to prioritize. For us, requirement 1 was the most important one, and it drove our initial choice of a particular MySQL version for VDI 3. To make a long story short, we chose MySQL Cluster, which is a "...high-availability, high-redundancy version of MySQL adapted for the distributed computing environment...". While this choice perfectly satisfies requirement 1, it doesn't help with (and rather contradicts) requirement 2, because setting up MySQL Cluster requires quite some knowledge and experience, even though the MySQL folks have tried hard to make it as easy as possible. To compensate for this disadvantage, we invested a lot of energy and effort in the VDI 3 installation process. The result is that with VDI 3, if you choose to use the embedded database, a complete, highly available MySQL Cluster is configured for you automatically. You don't need to know anything about the details and theory behind it. There is one noticeable implication of using MySQL Cluster: you need a minimum of three physical hosts to configure an HA MySQL Cluster. This is the recommended deployment option for the embedded VDI database scenario. Why it is necessary to have three physical hosts will be answered here.

Feedback we've received from the field so far reveals that this is a particularly painful implication. A lot of customers want to start small and cheap. Requiring a minimum of three physical VDI hosts when you only want to run a handful of virtual desktops can be hard to explain. How can you escape this dilemma? I will show some possibilities below.

But first let me say something about the third requirement, "Flexibility". The embedded VDI database is fine for a lot of small to medium deployments, but it might not be the perfect match for each and every situation or customer. So we've added the possibility of using an existing MySQL database, known as the remote database option in the VDI 3 configuration. This hopefully satisfies requirement 3. Here are two of the most likely reasons that make the remote database option the better choice:

  • People have a MySQL database running already and want VDI to integrate into that database.
  • People need more freedom in setting up the MySQL database for a variety of reasons e.g. performance, security etc.

So now on to the deployment options and what you can do about the requirement of three physical VDI hosts. I will enumerate a couple of supported scenarios. First of all, you can choose the remote database option. This allows you to use either MySQL Server version 5.0 or later with InnoDB, or MySQL Cluster version 6.2.15 or later, in whatever way is supported by MySQL. One minimal setup enabled by this option is the following:

  • All-in-one-host scenario [*]. In this scenario you run everything on one physical host. Everything means Sun VDI 3, VirtualBox and the storage. Please note that this deployment option requires Solaris 10 update 7 or later on this host. Concerning the database, you would have a locally installed MySQL Server and connect to it by selecting the remote database option during VDI 3 configuration. It should be pretty clear, however, that this kind of deployment offers zero redundancy, meaning it is one big SPOF (the cheapest one, though :)). Furthermore, you would have to buy a separate MySQL support contract if you need this part of the solution to be formally supported.
    [*] Requires VDI 3 patch 1, soon to be released!
    [Figure: All-in-one-host scenario]
  • Shared VirtualBox, VDI Primary host. In this scenario the Sun VDI 3 Primary and one of your VirtualBox hosts share one physical machine. Make sure the shared host has enough capacity to deal with these two roles at the same time.
    [Figure: Shared VirtualBox / Sun VDI Primary host]
  • Running the VDI 3 Primary in a virtual machine. This option is useful when you are using VMware as the virtualization platform. Running MySQL Cluster completely in a virtualized environment is not supported (see the MySQL FAQs). Given that the MySQL Cluster management node requires only few resources, the MySQL folks have agreed that running it in a virtual machine is an acceptable and supported scenario. The two VDI 3 secondary hosts running the MySQL Cluster data nodes nevertheless need to run on bare metal.
    [Figure: VDI Primary in VM]

That's it for now, a lot of stuff. I hope I could clarify a few things concerning the VDI 3 database integration.


Thanks for the explanation. It helps me understand the reasoning behind the steps, but there are flaws in your logic.

Flaw Number 1: "MySQL Cluster isn't a SPOF." This is a bad assumption. Once the management server goes down, so does the cluster.

Flaw Number 2: There is no way to get user assignments out of VDI 3.0 unless you write a SQL script to pull them out of VDI, as follows:

    select distinguished_name, name
      from t_pool, t_poolclient, t_poolclient_has_pool
      where t_pool.id = t_poolclient_has_pool.pool_id
        and t_poolclient_has_pool.pool_client_id = t_poolclient.id
      into outfile '/tmp/tim2.out'
      fields terminated by ':' enclosed by '"'
      lines terminated by '\n';
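The export query above can be tried out without a VDI installation. The following is a minimal sketch using Python's built-in sqlite3 module, assuming a heavily simplified schema: only the table and column names are taken from the query, while the column types, the table that owns each column, and the sample rows are assumptions made purely for illustration. It uses explicit JOINs instead of the comma-style join and formats each row the way the `fields terminated by ':' enclosed by '"'` clause would.

```python
import sqlite3

# Hypothetical, simplified schema. Only the table and column names appear
# in the original query; types and column ownership are assumptions.
SCHEMA = """
CREATE TABLE t_pool (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE t_poolclient (id INTEGER PRIMARY KEY, distinguished_name TEXT);
CREATE TABLE t_poolclient_has_pool (pool_id INTEGER, pool_client_id INTEGER);
"""

# Same join as the original query, written with explicit JOIN clauses.
ASSIGNMENTS_QUERY = """
SELECT c.distinguished_name, p.name
FROM t_pool AS p
JOIN t_poolclient_has_pool AS link ON link.pool_id = p.id
JOIN t_poolclient AS c ON c.id = link.pool_client_id
ORDER BY c.distinguished_name, p.name
"""


def export_assignments(conn):
    """Return one '"dn":"pool"' line per user-to-pool assignment."""
    rows = conn.execute(ASSIGNMENTS_QUERY).fetchall()
    return ['"%s":"%s"' % (dn, pool) for dn, pool in rows]


def demo():
    """Build an in-memory database with made-up sample data and export it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO t_pool VALUES (?, ?)",
                     [(1, "Pool A"), (2, "Pool B")])
    conn.executemany("INSERT INTO t_poolclient VALUES (?, ?)",
                     [(10, "cn=alice,dc=example"), (11, "cn=bob,dc=example")])
    conn.executemany("INSERT INTO t_poolclient_has_pool VALUES (?, ?)",
                     [(1, 10), (2, 11)])
    lines = export_assignments(conn)
    conn.close()
    return lines


if __name__ == "__main__":
    for line in demo():
        print(line)
```

Against a real VDI 3 database the query would of course run through a MySQL client instead; the sketch only shows the shape of the join and of the exported lines.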

Flaw Number 3: Flexibility. Adding the 2 nodes of MySQL Cluster isn't flexible for a small deployment. As a matter of fact, for a 100 Sun Ray/VM deployment on VMware, it increases my cost by 20%. Not sure what's flexible about that?

The fundamental change in data stores also means that having dual data centers or multiple sites becomes increasingly challenging. I don't want to have to manage all of the sites differently. There is no replication of tables from cluster to cluster to make managing a dual FOG environment easy. VC is a single point of failure, so I need to run 2 whole VMware environments.

Did Engineering even think about how these solutions are implemented in the field outside of SUN? Did you ask your customers?

Posted by Tim Ebbers on May 27, 2009 at 09:30 PM CEST


"Flaw Number 1: ...Once the management server goes down, so does the cluster."
This is *not* true! Not sure where you got this wrong information from. I won't deny that having more than one management node would be preferable, but it is not a must and would require even more physical hosts to get started.

"Flaw Number 2: There is no way to get user assignments out of VDI 3...."
The nice thing about having an underlying SQL database is that, as you showed, you are able to do it even though formal support for this through the admin UI or CLI is not available yet, which in my opinion makes it more a request for enhancement than a flaw. Why do you need this, by the way? Feel free to submit a corresponding change request.

"Flaw Number 3: Flexibility...Adding 2 nodes of mysql cluster isn't flexible for small deployments..."
I think you completely missed my point. With regard to flexibility, I was referring to "use of embedded MySQL Cluster vs. using a remote MySQL instance". If adding two MySQL Cluster data nodes is too much/heavy/whatever for your purposes, why not go for the remote db option?

And last but not least, "YES", we are asking our customers, and "YES", we are incorporating customer and field feedback on a continuous basis. VDI is a solution primarily focused on external rather than Sun-internal customers. Nothing is perfect, but so far we are getting great feedback on VDI 3.0. Nevertheless, we welcome every constructive form of criticism in order to meet customer needs and make the product even better.


Posted by Tino Rachui on May 28, 2009 at 08:56 AM CEST

"Flaw Number 1: ...Once the management server goes down, so does the cluster."
This is *not* true! Not sure where you got this wrong information from. I wont deny that having more than one management node would be preferable but it is not a must and would require even more physical hosts to get started.

***I got this information from my own experience, where the VDA service would have to be restarted if the data nodes or management nodes crashed, and from your own documentation from the link above: the SQL nodes talk to the MySQL management node. Management node down, so are the SQL nodes. And SRSS won't know about it and switch them to a server with the data nodes on it.

SQL node. This is simply an instance of MySQL Server (mysqld) that is built with support for the NDBCLUSTER storage engine and started with the --ndb-cluster option to enable the engine and the --ndb-connectstring option to enable it to connect to a MySQL Cluster management server. For more about these options, see Section 17.4.2, “mysqld Command Options for MySQL Cluster”.

"Flaw Number 2: There is no way to get user assignments out of VDI 3...."
I was pointing out the fact that you do need relational database knowledge, contrary to your "no MySQL knowledge to get started". Why do I need this? Because our business requires 2 full FOGs because of the SPOFs in a single data center. We run these active/active, and with VDI 3.0 I don't want to manage both, so I'm forced to use either table-level replication (replicating the pool and cloning information but, most importantly, the assigned user information from data center to data center, to make it easier on my end users) or SQL statements that sync the databases, which is what I have chosen to do at this point while my Sun team comes up with a better solution. This allows users to get the same desktop regardless of which DC they end up in and, if necessary, to be given a new VM if a full DC is down.

Flaw 3: Using a remote DB would be great. I want to use LDAP. Not SR's or SGD's; I want to use AD's LDAP as my datastore. We have no MySQL experience in house except me, and I don't count, as I used to sell VDI less than 4 weeks ago, maybe 8 by now. We would have to buy a service contract, increasing our TCO even further, for all 130-140 sites? That's not likely. And what happens now that you are Oracle... Are we going to have to use Oracle RAC in VDI 3.1?

Constructive feedback:
Don't tie yourself to one datastore product. Using LDAP allows me to integrate it with EDIR, AD, OpenDS, etc. And the data model, really? This isn't a complex or heavyweight data model. Maybe compared to VDI 2, but certainly not one that requires a relational database. 12 tables, and some are lookup tables?

Don't get me wrong...I love VDI 3...I think you did a great job on the software. I think you did a poor job of picking a backend datastore. If I didn't like VDI, I wouldn't have left Sun to implement it. But seeing my costs go up by 20% from VDI2 to VDI 3 is rather alarming.

My end users also love VDI 3. The RDP Broker on the Sun Rays is an awesome feature. The now easy integration with SGD is to die for.

So if you want to discuss, contact Peter Gehl, as I have shared my thoughts regarding the subject with him as well. This really becomes a cost exercise. If VDI and Sun Rays aren't cheaper than a desktop, what's the point? When I say cheaper, I mean in all aspects: total TCO. I went from a 3 server VDI environment to a 5 server VDI environment for 100 people?


Posted by Tim Ebbers on May 28, 2009 at 12:12 PM CEST

"I got this information from my own experience. where the VDA service would have to be restart if the data nodes or management nodes crashed."

Again, if the management node is down but the two data nodes are up, the cluster is running fine! I just double-checked this. What exactly was down on your side? As soon as two of the three critical cluster components are down, though, the whole cluster is down: for instance, the management node and one data node, or both data nodes.
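The availability rule described above can be written down in a few lines of Python. This is only an illustrative model of the three-component embedded setup (one management node, two data nodes) exactly as stated in this reply; the function name and its parameters are made up for the example and are not part of VDI or MySQL Cluster.

```python
from itertools import product


def cluster_is_up(mgmt_up, data_node_1_up, data_node_2_up):
    """Model of the rule stated above: the cluster survives the loss of any
    one of its three critical components, but not the loss of two."""
    components_up = sum([mgmt_up, data_node_1_up, data_node_2_up])
    return components_up >= 2


def surviving_states():
    """Enumerate every (mgmt, data1, data2) combination that keeps the
    cluster running according to cluster_is_up()."""
    return [state for state in product([True, False], repeat=3)
            if cluster_is_up(*state)]


if __name__ == "__main__":
    for mgmt, d1, d2 in product([True, False], repeat=3):
        status = "up" if cluster_is_up(mgmt, d1, d2) else "DOWN"
        print("mgmt=%-5s data1=%-5s data2=%-5s -> cluster %s"
              % (mgmt, d1, d2, status))
```

Of the eight possible combinations, only four keep the cluster running: everything up, or exactly one component down.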

"and from your own documentation from the link above: The SQL nodes talk to the MySQL mgmt node. MGMT node down, so are the sql nodes. AND SRSS won't know about and switch them to a Server with the Data Nodes on them."

Can you point me specifically to the place you are referring to? The management node only comes into play when nodes want to join or leave the cluster, or when it has to act as arbitrator in the split-brain scenario.

"I was pointing out the fact that you do need relational database knowledge contrary to your "no mysql knowledge to get started"..."

You know, everything is developed according to the 80-20 rule. ;) So far nobody except you has asked for this functionality, and so far nobody has been forced to resort to (My)SQL knowledge to get data out of the VDI 3 database.
This doesn't mean your request isn't valid. It just underlines the fact that you cannot make everybody happy all of the time.

"I want to use LDAP. Not SR's or SGD's, I want to use AD's LDAP as my datastore..."
Unfortunately this is not likely to happen any time soon. Does any other desktop broker or virtualization platform support LDAP backends (honest question, as I don't know for sure)?

"And the Data model Really? this isn't a complex or heavy weight data model. Maybe compared to VDI 2, but not that requires a relational Database for sure. 12 tables, and some are Lookup tables?"
I didn't claim that it is a complex data model; I just said it fits naturally into a relational data model. Besides that, the db serves as a synchronization point among the various VDA services running concurrently, and we are implementing a desktop state machine based on the data model, which is why we need synchronous replication. Would that easily be possible with LDAP-based dbs? I doubt it.

"...I love VDI 3...I think you did a great job on the software..."
THANKS for the flowers!

"...I think you did a poor job of picking a backend datastore..."
I'm still 100% convinced that the db choice was the right one, taking costs, features etc. into account. Maybe in the future we'll support dbs other than MySQL as well, but nothing has been decided in this direction yet.


Posted on May 29, 2009 at 08:30 AM CEST

Ok, so let me give you my business requirements, and as an engineer, help me come up with a solution:

I have 2 VDI 3 environments, load balanced with persistent sessions. Both environments are active/active.

I want a single point of management, just like the 7110's. I want to be able to say, for a FOG, "Replicate this project/pool or pools" to VDI environment B. I'm a Virtual Center user, so I have 2 separate Virtual Centers: one for FOG A and one for FOG B. I don't want all of the pools replicated, just one or two.

In the 7110, it's awesome and takes approx. 30 seconds to set up. Create a project, add one or many replication partners, add a share to that project and say sync.

I want the same kind of thing for VDI. I could do it myself, but would then be unsupported. It's bad enough that I had to create a cron job to go out and look to see if the data nodes were up or not, and then take the SRSS server down.

And the link you requested is from your post about the sql nodes...So only the first 2 nodes behave like you state.


Sorry for the delay in response.

Posted by Tim Ebbers on May 29, 2009 at 11:16 PM CEST

Hi, Tino:

Many thanks for the blog. I have been able to get pieces of the story from various places, but yours is the first more-or-less complete version I have found. I too am a reseller engineer talking with customers about implementing VDI 3.0 and I too am encountering resistance as regards the number of physical servers needed. I completely 'get it' as far as why, and from a technical perspective agree that the design has the virtues you describe. However, telling people that they will need five servers of some sort for the default implementation runs directly counter to the messaging Sun (as well as the rest of the industry) is producing concerning consolidation, eco-awareness, etc. Not to mention the cost.

Most VDI projects want to start small with a pilot (say, 10-20 DTU's) and then be able to scale up in increments of 10's or 100's. This was possible to do with the VDI 2.0 architecture, and provided a smooth transition from physical to virtual desktops. With this architecture of VDI 3.0 (and I'm NOT YET considering some of the alternatives you have offered, I will study them at length and see if they offer a means to overcome this issue), it becomes a requirement to invest in the servers needed for 100 or more DTU's at the outset. This is proving to be a difficult sell.

So, I have one suggestion and one alternative to offer. Let me first disclaim I am far from an expert in these underlying technologies, especially MySQL and VMware. If that negates the value of these two ideas, I apologize for my naiveté.

First, the suggestion: Just as you have the all-in-one-server Evaluation mode, would it be possible to craft an all-in-one (or two)-server(s) Pilot/Small Business version? Put an upper limit of 25 DTU sessions concurrently, state that performance will be X % less than the "full" product, fault tolerance will be less, etc.... just like the 7110 is the "baby brother" of the 7000 storage devices.

As for the alternative: at a recent conference, I was presented with some performance results for Microsoft SQL running on VMware 4.0 (a.k.a. vSphere 4.0). The conclusion of the exercise they were discussing is that MS SQL in a VMware 4.0 virtual server achieves about 85%-90% of the performance of a physical server running MS SQL. I cannot vouch for this fact, and I have not investigated to find the source material, but it was not presented as proprietary or confidential, so I assume it can be verified.

If that's true AND similar results occur for MySQL (85%-90% of physical server performance by virtual instances) AND the other improvements inherent to VMware 4.0 and/or Nehalem processors overcome the clock/timing/latency/cluster issues that are referred to in the MySQL FAQ documents, then would this be a viable deployment architecture:

1. Small physical server for Primary VDI core ($1000 Intel/AMD)
2. Two secondary servers running VMware 4.0 that each host (in VMs) the secondary copies of the MySQL Cluster nodes and some portion of the DTUs' VMs
3. A storage device that both the above servers connect to for the usual things

Not trying to expand this out beyond a single physical site for this example/question, but between storage replication and VMware VMotion, I should think BC/DR is within reach of the design.

I recognize that Sun has a great interest in promoting xVM/VirtualBox as an equal partner to VMware in VDI 3.0, and perhaps it is. I have not so far heard any similar performance results for SQL databases from the VirtualBox group; I apologize if I'm overlooking them.

If all the above will be functional, you then bring the design back into line number-of-server-wise (more or less) and make it modular to an extent as well (you can add additional virtual secondary MySQL cluster nodes as you add additional Tier 3 VM servers).

I'd be interested in your (and others') thoughts on this, or if someone is experimenting with it, please chime in.


Posted by Jack Glenn on June 16, 2009 at 06:16 PM CEST

Totally agree with all these comments as a reseller trying to promote Sun Ray/VDI 3. All users want to dip their toe in the water first; telling them they have to start with 5 physical machines is a non-starter and a real turn-off. OK, so we can build a simple 1 server demo/evaluation machine, but by the time you have put together the minimum recommended hardware for a production system to meet the user requirements and added all of the support, the cost is astronomical. To boot, it is an extremely complex solution to a very easy problem (that is, if the client even perceives he has a problem with 100/200 PCs to start with).

Posted by Terry Cox on July 24, 2009 at 05:14 PM CEST

I appreciate the fact that the relevant knowledge was targeted to my limited knowledge. Thank you for the time spent.

I'm currently assembling a large server farm to perform a large-scale demonstration of the VDI technology as my graduate project at the university. I'm having some difficulty with the setup, but nothing I cannot figure out.

I'm wondering if in a remote database setup, is it possible to run with just 1 primary and 1 secondary instead of the 2 secondary servers as suggested?

Posted by Richard Lowe on January 21, 2010 at 03:27 AM CET

