Wednesday Apr 02, 2008

Deployment and Failover Study of HA MySQL on a Solaris Cluster

 

 

  • Executive Summary
  • Document Scope
  • Benefits of Solaris Cluster
  • Setup Environment
  • Installation of Solaris Cluster and HA MySQL Service s/w
  • Configuring a failover file system using a ZFS pool
  • Installation of MySQL Server
  • Upgrade Testing : MySQL 5.0 to 5.1
  • Internal test suite run
  • S/W Fault Test Run
  • S/W Fault Regression Test Run
  • Uninstalling Sun Cluster
  • Software links
  • References

 

Executive Summary

 

This document illustrates the deployment process of MySQL on Solaris Cluster (SC). It also focuses on regression and failover testing of HA MySQL, and describes the tests performed. Solaris 10 fully supports MySQL and the HA cluster application agent (data service) for MySQL.

A cluster provides a single view of services for applications such as databases, web services, and file services. Services can scale to additional processors with the addition of nodes. A data service is an application designed to run under the control of a cluster.

The MySQL Open Source database is multi-threaded and consists of an SQL server, client programs and libraries, administrative tools, and APIs. Sun acquired MySQL AB in January 2008.

The objective is to facilitate adoption of the Sun stack by the Open Source community, and to reinforce Sun's strong commitment to open source products, including Open Source databases.

 

Document Scope

 

The scope of the deployment environment comprises MySQL Community Server on a Solaris 10 global zone. While the platform of choice was SPARC, the results are identical if deployed on an x86-64 platform.

The scope of testing includes upgrade testing, regression testing, and failover testing. Testing with MySQL Cluster and performance benchmarking are outside the scope.
 

Benefits of Solaris Cluster

Reduced system downtime, availability of systems and applications, and increased application throughput via scalable services are some of the benefits realized when using Solaris Cluster.

The Solaris Cluster framework includes Sun Cluster, Sun Cluster Geographic Edition, developer tools, and support for commercial and open-source applications through agents. It provides high availability and disaster recovery for local and geographic clusters, and offers a wide choice of storage replication and networking configurations.

A data service consists of an application, cluster configuration files, and management methods that control the starting, stopping, and monitoring of the application. The public network interface is a logical hostname or a shared address.

Solaris Cluster provides High Availability (HA) for enterprise applications. HA components can survive a single s/w or h/w failure. The I/O fencing feature keeps shared storage inaccessible to a faulty node, and ensures integrity of data. For example, in a 2-node cluster with a disabled interconnect, each node may assume that it is part of a cluster, and may try to form one. This potential split-brain scenario is avoided by I/O fencing.

A logical interface is presented to the applications. There is no single point of failure in the NICs. Striped traffic over the interconnects with transparent failover results in better network utilization.

Solaris Cluster has a proven monitoring and failover mechanism. A wide choice of low-latency, high-bandwidth interconnects (e.g., InfiniBand, Dolphin) is available for deployment, yielding higher throughput.

Device path names are uniform across nodes for the shared LUNs, thereby resulting in easier manageability. Node times can be synchronized during configuration.

A comprehensive list of HA applications (data services) is available, and all configurations are certified and tested. Tight integration with the Solaris kernel results in better fault management and a reliable heartbeat mechanism.

Setup Environment

A traditional cluster includes two or more nodes cabled together via private interconnects, and connected simultaneously to shared storage devices (multihost devices). Network adapters provide client access to the cluster. Different architectures and topologies can be used for deployment. The core cluster and data service software, and disk management software are used collectively to monitor, access and administer various resources.

The setup comprises two SPARC V1280 systems connected by two private interconnects.
The MySQL Community Server version used is 5.0.45.

  • SPARC platform :
      • Nodes : v1280-137-03 and v1280-137-04
      • CPU : 12 x UltraSPARC-III+ @ 1200 MHz
      • Memory : 49,152 MB and 98,304 MB
      • Storage : SE 6120 (14 x 73 GB)
      • Operating System : Solaris 10 SPARC, Update 4

After a default OS installation, it is recommended to install the patches for the OS version being used, from http://sunsolve.sun.com .

Ensure that the SC3.x version being used has support for Solaris 10, Update x.
 

Installation of Solaris Cluster and HA MySQL Service s/w

 

This section describes the installation and configuration of the Sun Cluster and the HA MySQL service s/w components.

On the first node, perform these steps as root user :

Remove the previous product registry, if any :
# /var/sadm/prod/SUNWentsys5/uninstall

Download and uncompress the SC s/w .zip file (suncluster-3_2-ga-solaris-sparc.zip) : navigate to sun.com, click on 'Downloads', and follow the links.

Set the PATH variable to include the directory of the Sun Cluster binaries :
# PATH=$PATH:/usr/cluster/bin:/usr/ccs/bin:/usr/sfw/bin; export PATH

Navigate to the sc/Solaris_sparc sub-directory, and type :
# ./installer

1. Choose 'No' for :
Install the full set of Sun Java(TM) Enterprise System Products and Services?

2. Choose options 4,6 to select these products, and then 'A' on each to install all its components :
[X] 4. Sun Cluster 3.2
[X] 6. Sun Cluster Agents 3.2

3. Choose 1 for :
Upgrade the shared components that were installed in the previous step ?

4. Choose 2 :
Configure Later - Manually configure following installation

5. # scinstall

6. Choose 1 :
Create a new cluster or add a cluster node

7. Choose 2 :
Create just the first node of a new cluster on this machine

8. Choose the 'Typical' mode of operation

9. Choose mysql-ca  as the name of the cluster

10. Choose 'Yes' for :
Do you want to run sccheck (yes/no) ?

11. Enter the 2 node names when prompted, and press Control-D to complete the configuration :
v1280-137-03
v1280-137-04
^D

12. Configure at least two cluster transport adapters (eg. ce0 and ge0 here).
Choose 'Yes' to confirm each as being dedicated.

13. Choose 'No' for :
Do you want to disable automatic quorum device selection (yes/no) [no]?

14. Choose 'yes' for :
Do you want scinstall to reboot for you (yes/no) [yes]?

15. Choose 'Enter' to reconfirm the options to scinstall
Monitor via an NTS console (escape sequence : '~' followed by '.')

On the 2nd node, perform all the above steps (1-15 as applicable), as done for the 1st node, except for these differences :

16. Choose 1 :
Create a new cluster or add a cluster node

17. Choose :
Add this machine as a node in an existing cluster

18. Choose 'Typical' mode of operation, and supply the cluster name chosen earlier

19. Choose yes for :
Do you want to use autodiscovery (yes/no) [yes]?

20. Choose remaining prompts similar to those for the first node.

Once the nodes have booted into the cluster, the configuration of the Sun Cluster and HA MySQL s/w is complete.


Configuring a failover file system using a ZFS pool

 

This section describes the creation and configuration of a failover file system using a ZFS pool.

The database will reside on a global file system, whereas the installation will be on local file systems. A failover file system performs better than a cluster file system because not all nodes have to commit writes. However, since only one node sees the file system at a time, there is a slight increase in failover time.

On the first node, perform these steps as root user :

1. Choose a suitable shared disk among those in your setup. A shared disk is listed with an entry from each node; list the devices by typing :
# scdidadm -L

2. Execute the zpool command (the pool name mysql3 and the shared disk name shown are for illustration) :
# zpool create mysql3 c1t20030003BA13E6A1d0

3. Mount the volume, and configure a resource group :
# cd /
# zfs create mysql3/sqlvol
# zfs set mountpoint=/global/mysql  mysql3/sqlvol 

# clrg create mysql-rg  // create a failover resource group

# clrt register SUNW.HAStoragePlus

# scrgadm -at SUNW.HAStoragePlus -x Zpools=mysql3 -g mysql-rg -j sql-stor  // create the HAStoragePlus resource named 'sql-stor' in the mysql-rg resource group for the MySQL disk storage.

# clrg online mysql-rg 
// bring the failover resource group online; the MySQL Disk storage resource previously added gets enabled. Subsequently, add and enable other resources to it, such as the logical host resource,  gds data service resource, and the MySQL resource.

# clrg status mysql-rg  // check the status of the resource group


Installation of MySQL Server

 

This section describes the installation and configuration of the MySQL Server.

On the first node, perform these steps as the root user :

If a MySQL environment already exists, do a cleanup, and remove the MySQL packages; otherwise proceed to Step 1 below :
# clrg offline mysql-rg
# clrg delete -F mysql-rg
# rm -rf /global/mysql/*
# pkgrm mysql


1. Obtain a standalone, separate hostname and IP for use as the MySQL resource group failover address (in this example, v1280-logical / 10.x.x.x).

2. On each node, download the packaged release binary file (pkg. format) from the MySQL site dev.mysql.com , and uncompress it into the target directory.

# pkgadd -d mysql-5.0.45-solaris10-sparc-64bit.pkg 
# cd /usr/local
# ln -s /opt/mysql/mysql mysql   // soft link to the mysql binaries directory

3. From the node on which /global/mysql is mounted, bind the failover IP to the resource group :
# scrgadm -aLl v1280-logical -g mysql-rg // create a resource for the logical hostname in mysql-rg
# clrs enable v1280-logical // enable the logical host resource
# chown -R mysql:mysql /global/mysql
 

Modify the default MySQL configuration file (/global/mysql/my.cnf) :
# cp /opt/SUNWscmys/etc/my.cnf_sample_master /global/mysql/my.cnf

In my.cnf, set :
bind-address=10.x.x.x
socket=/tmp/v1280-logical.sock
log=/global/mysql/logs/log1
log-bin=/global/mysql/logs/bin-log
innodb_data_home_dir=/global/mysql/innodb

# /opt/mysql/mysql/scripts/mysql_install_db --datadir=/global/mysql  // install the mysql grant tables
# chown -R mysql:mysql /global/mysql
# cd /global/mysql
# mkdir logs innodb


Modify the default file mysql_config in order to create a fault monitor user and a test database for the MySQL instance. The fault monitor attempts to restart the server in case of a shutdown :

# cp /opt/SUNWscmys/util/mysql_config /global/mysql

In mysql_config, set :
MYSQL_BASE=/opt/mysql/mysql
MYSQL_USER=root
MYSQL_PASSWD=admin123
FMUSER=fmuser
FMPASS=fmuser
MYSQL_SOCK=/tmp/v1280-logical.sock
MYSQL_NIC_HOSTNAME="v1280-137-03 v1280-137-04"

# chown -R mysql:mysql /global/mysql

Test MySQL server startup and shutdown :

#/opt/mysql/mysql/bin/mysqld --defaults-file=/global/mysql/my.cnf
--basedir=/opt/mysql/mysql --datadir=/global/mysql
--user=root --pid-file=/global/mysql/mysql.pid &

# chown -R mysql:mysql /global/mysql
#  /opt/mysql/mysql/bin/mysqladmin shutdown -S /tmp/v1280-logical.sock

Restart the MySQL server; grant database resource privileges to the administrator user:

#/opt/mysql/mysql/bin/mysqld --defaults-file=/global/mysql/my.cnf
--basedir=/opt/mysql/mysql/ --datadir=/global/mysql
--user=root --pid-file=/global/mysql/mysql.pid &

# /opt/mysql/mysql/bin/mysqladmin -S /tmp/v1280-logical.sock password 'admin123' // set the admin password for access to the local MySQL instance via the MySQL logical IP

# /opt/mysql/mysql/bin/mysql -S /tmp/v1280-logical.sock -uroot -padmin123
> use mysql
> grant all on *.* to 'root'@'v1280-137-03' identified by 'admin123';
> grant all on *.* to 'root'@'v1280-137-04' identified by 'admin123';
> grant all on *.* to 'root'@'v1280-logical' identified by 'admin123';
> UPDATE user SET Grant_priv='Y' WHERE User='root' AND Host='v1280-logical';
> UPDATE user SET Grant_priv='Y' WHERE User='root' AND Host='v1280-137-03';
> UPDATE user SET Grant_priv='Y' WHERE User='root' AND Host='v1280-137-04';
> exit

Create the fault monitor user and the test database by running mysql_register with the mysql_config file prepared earlier :
# /opt/SUNWscmys/util/mysql_register -f /global/mysql/mysql_config

Configure MySQL for HA, and register MySQL as a failover data service. Modify the default configuration file ha_mysql_config to include the cluster information :
# cp /opt/SUNWscmys/util/ha_mysql_config /global/mysql

In ha_mysql_config, set the resource and MySQL specifications :
RS=mysql
RG=mysql-rg
PORT=3306
LH=v1280-logical
HAS_RS=sql-stor
BASEDIR=/opt/mysql/mysql
DATADIR=/global/mysql
MYSQLUSER=mysql
MYSQLHOST=v1280-logical
FMUSER=fmuser
FMPASS=fmuser
LOGDIR=/global/mysql/logs
CHECK=yes

Register the SUNW.gds data service before registering and enabling the MySQL resource :
# chown -R mysql:mysql /global/mysql
# clrt register SUNW.gds 
# /opt/SUNWscmys/util/ha_mysql_register -f /global/mysql/ha_mysql_config 
# clrs enable mysql  // enable the HA MySQL resource specified by RS in the configuration file

Manually test the failover :
# clrg switch -n 2 mysql-rg 
# clrg status mysql-rg


Upgrade Testing : MySQL 5.0 to 5.1

Simulate a production environment scenario, wherein database upgrades are the norm. Upgrade the MySQL server from version 5.0.45 to 5.1.22 with minimal or no disruption to the underlying stack.

As root user, set PATH and disable the mysql resource :

# PATH=/ws/onnv-tools/SUNWspro/SS11/bin:/usr/ccs/bin:/usr/local/mysql/bin:\
/usr/local/mysql/libexec:/usr/sbin:/usr/bin:/usr/cluster/bin:/opt/mysql/mysql/bin
# export PATH
# clrs disable mysql   

On both nodes, replace the binaries under the /opt/mysql/mysql directory, moving from 5.0.45 (64-bit) to 5.1.22 (64-bit) :
# rm -rf /opt/mysql/mysql/*
# cd /opt/mysql/mysql

# gunzip -c mysql-5.1.22-rc-solaris10-sparc-64bit.tar.gz|tar xvf -

On the first node, comment out the 'innodb_arch_dir' entry in /global/mysql/my.cnf (and any other InnoDB setting that is not valid for 5.1).

On the first node, restart the 'mysqld' server unmonitored :

#./mysqld --defaults-file=/global/mysql/my.cnf  --basedir=/opt/mysql/mysql --datadir=/global/mysql --user=root  --pid-file=/global/mysql/mysql.pid &
# chown -R mysql:mysql /global/mysql/innodb /global/mysql/logs /global/mysql/mysql

Run the mysql_upgrade utility to upgrade and repair the grant tables in the target database :
# ./mysql_upgrade -S /tmp/v1280-logical.sock  -uroot -padmin123
# ./mysqlcheck   -S /tmp/v1280-logical.sock  -uroot -padmin123 --all-databases // run optionally

Refresh the grant privileges for the administrator :
# /opt/mysql/mysql/bin/mysql -S /tmp/v1280-logical.sock -uroot -padmin123 
> use mysql
> grant all on *.* to 'root'@'v1280-137-03' identified by 'admin123';
> grant all on *.* to 'root'@'v1280-137-04' identified by 'admin123';
> grant all on *.* to 'root'@'v1280-logical' identified by 'admin123';
> UPDATE user SET Grant_priv='Y' WHERE User='root'  AND Host='v1280-logical';
> UPDATE user SET Grant_priv='Y' WHERE User='root'  AND Host='v1280-137-03';
> UPDATE user SET Grant_priv='Y' WHERE User='root'  AND Host='v1280-137-04';

Shut down the server, and enable the mysql resource and fault monitor :
# /opt/mysql/mysql/bin/mysqladmin shutdown -S /tmp/v1280-logical.sock \
--user=root --password=admin123
# clrs enable mysql  

 

Internal test suite run

The MySQL benchmark suite (currently single-threaded) is part of a server installation. It can be used to determine which database operations in an implementation perform well or poorly.

The following run illustrates the setup and completion of the test suite. The actual performance numbers are not the focus, this being a default implementation.

Run the internal test suite on a MySQL 5.0.45 32-bit implementation :

On both nodes, download and install the MySQL DBD driver and the Perl DBI module used to access the database servers :
As root user, download and uncompress the latest DBI/DBD files from :
http://www.cpan.org/modules/by-category/07_Database_Interfaces/DBD/
(e.g., currently DBI-1.602.tar.gz and DBD-mysql-4.006.tar.gz).

Set a soft link to /usr/bin/perl, and include a compiler path in PATH, e.g. :

# cd /usr/local/bin
# ln -s /usr/bin/perl perl
# PATH=/ws/onnv-tools/SUNWspro/SS11/bin:/usr/ccs/bin:/usr/local/mysql/bin:\
/usr/local/mysql/libexec:/usr/sbin:/usr/bin:/usr/cluster/bin
# export PATH

Install the DBI module :

# cd DBI-1.602
# perl Makefile.PL
# make
cc -c  -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 -xarch=v8
-D_TS_ERRNO -xO3 -xspace -xildoff    -DVERSION=\"1.602\"
-DXS_VERSION=\"1.602\"  -KPIC
"-I/usr/perl5/5.8.4/lib/sun4-solaris-64int/CORE"
-DDBI_NO_THREADS DBI.c

# make test
...
All tests successful, 34 tests and 379 subtests skipped.
Files=126, Tests=5617, 159 wallclock secs (140.38 cusr + 15.71 csys = 156.09 CPU)
test.pl done

# make install

Install the MySQL DBD driver :
# cd DBD-mysql-4.006
# perl Makefile.PL
# make
..
cflags        (mysql_config) = -I/usr/local/mysql/include -mt
-D_FORTEC_ -xarch=v8..
..
cc -c  -I/usr/perl5/site_perl/5.8.4/sun4-solaris-64int/auto/DBI
-I/usr/local/mysql/include -mt -D_FORTEC_ -xarch=v8
-DDBD_MYSQL_INSERT_ID_IS_GOOD -g  -D_LARGEFILE_SOURCE
-D_FILE_OFFSET_BITS=64 -xarch=v8 -D_TS_ERRNO -xO3 -xspace -xildoff
-DVERSION=\"4.006\"  -DXS_VERSION=\"4.006\" -KPIC
"-I/usr/perl5/5.8.4/lib/sun4-solaris-64int/CORE"   dbdimp.c
# make test
# make install

Run the test suite :

# cd /opt/mysql/mysql/sql-bench
# ./run-all-tests --socket /tmp/v1280-logical.sock  --user=root
    --password=admin123

Benchmark DBD suite: 2.15
...
alter-table: Total time: 28 wallclock secs ( 0.05 usr  0.07 sys +  0.00 cusr  0.00 csys =  0.12 CPU)
ATIS: Total time: 40 wallclock secs ( 5.17 usr  0.83 sys +  0.00 cusr  0.00 csys =  6.00 CPU)
big-tables: Total time: 30 wallclock secs ( 5.56 usr  0.43 sys +  0.00 cusr  0.00 csys =  5.99 CPU)
connect: Total time: 244 wallclock secs (55.75 usr 45.62 sys +  0.00 cusr  0.00
csys = 101.37 CPU)
create:  Total time: 208 wallclock secs ( 4.86 usr  3.79 sys +  0.00 cusr  0.00
csys =  8.65 CPU)
insert: Total time: 2118 wallclock secs (434.92 usr 132.81 sys +  0.00 cusr 0.00 csys = 567.73 CPU)
select: Total time: 934 wallclock secs (40.27 usr  8.64 sys +  0.00 cusr  0.00 csys = 48.91 CPU)
transactions: Test skipped because the database doesn't support transactions
wisconsin: Total time: 19 wallclock secs ( 2.28 usr  1.68 sys +  0.00 cusr  0.00 csys =  3.96 CPU)
...
All 9 tests executed successfully
Totals per operation:

Operation                   seconds     usr     sys      cpu     tests
alter_table_add               12.00    0.01    0.01     0.02       100
alter_table_drop              11.00    0.01    0.00     0.01        91
connect                       20.00    9.00    4.02    13.02     10000
connect+select_1_row          23.00    9.42    4.61    14.03     10000
...
update_with_key_prefix        43.00    7.12    5.40    12.52    100000
wisc_benchmark                 3.00    1.31    0.04     1.35       114

TOTALS                      3633.00  543.28  193.53   736.81   3425950
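As a quick sanity check on such a table, the per-operation latency is just the seconds column divided by the tests column. A back-of-the-envelope sketch for the 'connect' row above (integer arithmetic only) :

```shell
#!/bin/sh
# Average per-operation latency for the 'connect' row of the totals table:
# 20.00 wallclock seconds over 10000 tests.
secs=20
tests=10000

# Integer microseconds per operation (20 s / 10000 ops = 2000 us = 2 ms).
echo "connect: $(( secs * 1000000 / tests )) us/op"
```

The same division applied to the insert row (2118 s over its test count) gives a feel for which operations dominate the total run time.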



S/W fault test run

 

A. Perform a manual failover of an executing MySQL client transaction:

On the primary node (resources and db instance up), execute multiple transactions that insert records in a table. Check that transactions are either committed or rolled back, and that the data is consistent. Measure the switchover (controlled failover) time:

Node 1  - 1st terminal window :
Create a script (eg. insert-data) with a few transactions :

BEGIN;
INSERT INTO t(f) VALUES (1);
INSERT INTO t(f) VALUES (1);
...
COMMIT;

BEGIN;
INSERT INTO t(f) VALUES (2);
INSERT INTO t(f) VALUES (2);
...
COMMIT;

Create an InnoDB table and run the script :
# /opt/mysql/mysql/bin/mysql -S /tmp/v1280-logical.sock -uroot -padmin123
mysql> CREATE TABLE t (f INT) TYPE=InnoDB;
mysql> source insert-data

Node 1  - 2nd window :
Let the script run for a while. Then, perform a manual failover onto Node 2 :
# clrg switch -n 2 mysql-rg

Node 2 - Window :
Verify that the resource group fails over, and that the mysqld process and /global/mysql are available :
# ps -ef|grep mysql
    root  8084   987   0 02:54:56 ?           0:00 /bin/sh -c /opt/SUNWscgds/bin/gds_probe -R mysql -T SUNW.gds:6 -G mysql-rg
     root  8085  8084   0 02:54:56 ?           0:00 /opt/SUNWscgds/bin/gds_probe -R mysql -T SUNW.gds:6 -G mysql-rg
   mysql  8045     1   0 02:54:45 ?           0:01 ./bin/mysqld --defaults-file=/global/mysql/my.cnf --basedir=/opt/mysql/mysql --

Node 1 - 3rd window :
The script aborts soon after 'clrg switch' begins executing. Transactions that completed before the abort are committed to the database successfully. The in-flight transaction rolls back completely, and pending transactions are not executed. All the threads are stopped, tables and logs are flushed, and a clean shutdown is performed.

Browse /var/adm/messages on both nodes :

Node 1 :
Mar 14 00:32:46 v1280-137-03 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <gds_svc_stop> completed successfully

Node 2 :
Mar 14 00:32:52 v1280-137-04 Cluster.RGM.rgmd: [ID 224900 daemon.notice] launching method <gds_svc_start> for resource <mysql>, resource group <mysql-rg>, node <v1280-137-04>, timeout <300> seconds

Failover time is the difference between the moment gds_svc_stop completes on Node 1 and the moment gds_svc_start launches on Node 2 = 6 sec.
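Given two /var/adm/messages timestamps like the ones quoted above, the difference can be computed with a small script. This is a sketch : the timestamps are hard-coded from the excerpts, and crossing midnight is not handled :

```shell
#!/bin/sh
# Compute the failover time from the two syslog timestamps quoted above:
# gds_svc_stop completing on Node 1, gds_svc_start launching on Node 2.
t_stop="00:32:46"     # Node 1 : method <gds_svc_stop> completed
t_start="00:32:52"    # Node 2 : launching method <gds_svc_start>

# Convert an HH:MM:SS timestamp to seconds since midnight.
to_secs() {
  echo "$1" | awk -F: '{ print $1 * 3600 + $2 * 60 + $3 }'
}

echo "Failover time: $(( $(to_secs "$t_start") - $(to_secs "$t_stop") )) sec"
```

This assumes the two nodes' clocks are synchronized, which Solaris Cluster arranges during configuration.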

B. Panic or reboot a node, and measure the failover time:

Panic :

On the administrative console of the primary node (resources and db instance up), type :

# uadmin 5 0   // simulate a panic
> ok boot

Browse /var/adm/messages on the secondary node :

Node 2 : 

Apr 15 01:17:53 v1280-137-04 cl_runtime: [ID 446068 kern.notice] NOTICE: CMM: Node v1280-137-03 (nodeid = 1) is down.

Node 2 :

Apr 15 01:19:00 v1280-137-04 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <gds_svc_start> completed successfully for resource <mysql>, resource group <mysql-rg>, node <v1280-137-04>, time used: 18% of timeout <300 seconds> 

Failover time is the difference between the moment the fault is injected on Node 1 (MySQL DB instance no longer available) and the moment gds_svc_start completes on Node 2 (MySQL DB instance available again) = 67 sec. Here, the nearest equivalent to a fault-injection message is the Node 2 report of Node 1 going down.

Reboot :

On the primary node (resources and db instance up), type :

# reboot

Browse /var/adm/messages on both nodes :

Node 1 :

Apr 14 23:13:19 v1280-137-03 Cluster.PNM: [ID 226280 daemon.notice] PNM daemon exiting.

Node 2 :

Apr 14 23:14:15 v1280-137-04 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <gds_svc_start> completed successfully for resource <mysql>, resource group <mysql-rg>, node <v1280-137-04>, time used: 6% of timeout <300 seconds>

Failover time is the difference between the moment the fault is injected on Node 1 (MySQL DB instance no longer available) and the moment gds_svc_start completes on Node 2 (MySQL DB instance available again) = 56 sec. Here, the nearest equivalent to a fault-injection message is the Node 1 report of the PNM daemon exiting.


C. Kill the database server process, and measure the restart time :

On the primary node (with all resources and db up), type :

# kill -9 <pid of mysqld>

Browse /var/adm/messages on Node 2 :

Node 2 :
Apr 14 03:36:30 v1280-137-04 Cluster.RGM.rgmd: [ID 784560 daemon.notice] resource mysql status on node v1280-137-04 change to R_FM_FAULTED

Node  2 :
Apr 14 03:37:20 v1280-137-04 Cluster.RGM.rgmd: [ID 515159 daemon.notice] method <gds_svc_start> completed successfully for resource <mysql>, resource group <mysql-rg>, node <v1280-137-04>, time used: 15% of timeout <300 seconds>

Restart time is the difference between the moment the fault is injected (MySQL DB instance no longer available) and the moment gds_svc_start completes on the same node (MySQL DB instance available again) = 50 sec. Here, the nearest equivalent to a fault-injection message is the Node 2 report of the resource status changing to R_FM_FAULTED.

S/W fault regression test run


An internal regression test suite (developed by Sun's cluster team) was used to execute cluster regression tests. The suite comprises a set of s/w fault tests, and is automated. The kit s/w is installed onto a client machine. The client is configured to access the cluster, and invokes the tests.

E.g., the client machine is v490-240-01 :
On the cluster machines and on the client, add the 'dats' user, and install the test suite package (SUNWdats).

As root user, add in /etc/passwd :
dats:x:55556:10:DATS Test User:/opt/SUNWdats:/bin/ksh
Add (if not already present) in /etc/group :
staff::10:

# pwconv
# su - dats
> cd /net/dv2.sfbay/vol/qevol/scqe/biweekly/24/lab/sparc/SUNWdats
> pkgadd -d . SUNWdats

in /.rhosts :
+ dats
v1280-137-03 +
v1280-137-04 +

Set PATH=/usr/sbin:/usr/bin:/opt/SUNWdats/tset_dataservice/bin

On the cluster nodes, correspondingly set /.rhosts :
+ dats
v1280-137-03 +
v1280-137-04 +
v490-240-01 +

On the client, supply input to generate the data services configuration file :

> cd /opt/SUNWdats/tset_dataservice/bin
> ./get_dsinfo
Enter the name of the output file : mysql-fault
Enter the name of one of the cluster nodes : v1280-137-03
Obtaining cluster configuration information. Please wait...
Do you want to run tests using New Command Set ? [y/n] : y

Select the Data Service Type
 1) Failover dataservice with one resource group
 2) Scalable dataservice with one Shared Address Resource group
    and one Scalable Resource Group
 3) Pre-created Resource Group Configuration
 4) Other
Enter your selection : 3

Obtaining the registered Resource Group Names on the cluster. Please wait...
These are the registered Resource Groups on the cluster : mysql-rg
Which Resource Groups are needed for this dataservice ?
Enter the Resource Group Names.
End the list with a blank line.
Resource Group Name ? mysql-rg
Resource Group Name ?
You entered the following Resource Group Names :
        mysql-rg
Is this correct ? [y/n] : y
Obtaining the Resource Groups/Resources Properties. Please wait...
Processing Resource Group mysql-rg (failover)
Do you have a client program for Resource Group mysql-rg ? [y/n] : n
Resource sql-stor of SUNW.HAStoragePlus:4

Resource sql-stor of SUNW.HAStoragePlus:4 done
Resource v1280-logical of SUNW.LogicalHostname:2
Resource v1280-logical of SUNW.LogicalHostname:2 done
Resource mysql of SUNW.gds:6

Enter the application daemon processes.
End the list with a blank line.
Daemon Process Name ? mysqld
Daemon Process Name ?
You entered the following values
        mysqld
Is this correct ? [y/n] : y
Enter the fault monitor daemon processes.
End the list with a blank line.
Fault Monitor Daemon Process Name ? gds_probe
Fault Monitor Daemon Process Name ?
You entered the following values
        gds_probe
Is this correct ? [y/n] : y
Resource mysql of SUNW.gds:6 done
Processing of Resource Group mysql-rg Done

Run the test suite :

> ./run_dats -f mysql-fault

The results are logged in this directory, and indicate the status of each test along with the sub-commands executed :
/opt/SUNWdats/dataservice_results/mysql-ca/results/log.xxxxx


The test suite comprises the following :

Non-Reboot tests :

1: Registration of Resource types for the data service

2: Creation of resource group(s) and resources

3: Bringing the resource group(s) online
# clrg online mysql-rg

4: Disabling Application resources
# clrg online -emM mysql-rg
# clrs disable mysql

5: Enabling Application resources
# clrs enable sql-stor

6: Taking the resource group(s) that contain the application resources offline
# clrg offline mysql-rg

7: Disabling the fault monitor for the application resources
# clrs unmonitor mysql

8: Enabling the fault monitor for the application resources
# clrs monitor mysql

9: Switchover of resource groups containing the application resources

10: Kill the application daemon process repeatedly to exceed the Retry_count within the Retry_interval. This should result in the restarting of the data service on the same node until Retry_count is reached, and failover of the data service after the subsequent kill attempt
# scrgadm -c -j mysql -y Retry_interval=1450
# scrgadm -c -g mysql-rg -y Pingpong_interval=360
# kill -9 <pid of mysqld> // repeat 2-3 times after mysqld restarts; failover should occur eventually
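The restart-then-failover behavior in test 10 can be sketched in plain shell. This illustrates only the counting logic described above (restarts up to Retry_count, then failover), not the actual RGM implementation :

```shell
#!/bin/sh
# Illustration of the Retry_count logic: each kill within Retry_interval is
# counted; once Retry_count restarts have been used, the next failure
# triggers a failover instead of another local restart.
retry_count=2
failures=0
action=""

for kill_event in 1 2 3; do
  if [ "$failures" -lt "$retry_count" ]; then
    failures=$(( failures + 1 ))
    action="restart"     # fault monitor restarts mysqld on the same node
  else
    action="failover"    # retries exhausted: resource group fails over
  fi
done

echo "after $failures restarts: $action"
```

With retry_count=2, the third kill exceeds the budget and the simulated action becomes a failover, matching the expected behavior of the data service.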

11: If the fault monitor daemon processes associated with a resource are killed, they should automatically be restarted
# kill -9 <pid of gds_probe>

12: Killing pnmd process should not affect the data service

13: Unmanage the resource group and manage it again
# clrs disable mysql
# clrs disable sql-stor
# clrs disable v1280-logical
# clrg offline mysql-rg
# clrg unmanage mysql-rg
     Group: mysql-rg       v1280-137-03             Unmanaged      No
     Group: mysql-rg       v1280-137-04             Unmanaged      No
# clrg manage mysql-rg
     Group: mysql-rg       v1280-137-03             Offline        No
     Group: mysql-rg       v1280-137-04             Offline        No
# clrs enable v1280-logical
# clrs enable sql-stor
# clrs enable mysql

14: Removing application resource

15: Removing the resource group

16: Removing the resource type

17: Check for the presence of the client program information

18: If a daemon process associated with a resource is killed, it should automatically be restarted

19: Rebooting the Primary Zone should not affect the availability of the dataservice. The resource groups/resources should failover to the next  available potential Primary node/zone

Reboot tests :

1: Rebooting the primary node should not affect the availability of the data service

2: Killing rgmd on primary node should not affect the availability of the data service

3: Failback property of the resource group should work as expected

4: Check for the presence of the client program information
 

Uninstalling Sun Cluster

Check the Sun Cluster documentation for the uninstallation procedure. The following is a brief outline :

On Node 2, execute :

# scswitch -S -h v1280-137-04
# shutdown -g0 -y -i0

On Node 1, execute :

# scconf -c -q node=v1280-137-04,maintstate
# scstat -q
 Node votes:       v1280-137-04         0        0       Offline
# scconf -r -h node=v1280-137-04

If these messages appear :
scconf:  Failed to remove node (v1280-137-04) - node is still cabled or otherwise in use.

scconf:    Node "v1280-137-04" is still cabled.
scconf:    Node "v1280-137-04" is still in use by quorum device "d2".
Then, execute these :
# clnode  clear  -F v1280-137-04
# scconf -r -h node=v1280-137-04
# scstat -n
  Cluster node:     v1280-137-03         Online
# reboot -- -x
# scinstall -r

On Node 2 :
> ok boot -x

Software links

 

References

Wednesday Feb 20, 2008

Sun Studio Compiler Options for MySQL on Solaris 10 x64 OS : Performance Study


  • Introduction
  • Activity
  • Setup and build environment
  • MySQL Configuration options
  • Studio Compiler flags
  • Studio 11 32-bit (1-16 threads)- sysbench read-only oltp test
  • Recommended compiler options for integration with webstack
  • Studio 12 64-bit (1-64 threads)- sysbench and iGen+sysbench read-only oltp test
  • Software documentation links

Introduction

Solaris 10, Sun's flagship OS, is multi-platform and scalable, and yields massive performance advantages for databases, Web, and Java technology-based services. Its advanced features include security (Process Rights Management), system observability (DTrace), system resource utilization (containers and virtualization), an optimized network stack, data management, system availability (Predictive Self Healing), interoperability tools, and Support & Services (s/w subscription, h/w support, technical help).

Sun Studio compiler delivers high-performance, optimizing C, C++, and Fortran compilers for the Solaris OS on SPARC, and for both Solaris and Linux on x86/x64 platforms, including the latest multi-core systems.

Sun Fire™ x64 servers deliver very high performance: their dual-core AMD Opteron processors provide eight-way performance in a four-processor system. Features include near-linear CPU scalability, enterprise reliability, high rack density, redundant power and cooling, RAID storage, and remote system monitoring, among others. Coupled with Solaris 10, these servers can deliver very high levels of throughput for demanding departmental and enterprise applications.

MySQL, the most popular Open Source database was developed, distributed, and supported by a commercial company MySQL AB, now part of Sun as a result of an acquisition. MySQL is multi-threaded and consists of an SQL server, client programs and libraries, administrative tools, and APIs. Java client programs that use JDBC connections can access a MySQL server via the MySQL Connector/J interface.

The sysbench workload kit is a modular, cross-platform and multi-threaded benchmark tool for evaluating OS parameters that are important for a system running a database under intensive load.

The iGen kit is an internally developed benchmark. It stresses commit operations and concurrency. Its core metrics are transactions per minute (tpm) and average response time. The SQL load is light, and most runs execute from the memory cache, so the log device is the I/O component that gets stressed.
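To make the metrics concrete, here is an illustrative sketch of how tpm and average response time are computed; the per-transaction log format and the 60-second wall time are assumptions for the example (the iGen kit reports these figures itself):

```shell
# Hypothetical log: one elapsed-time value (ms) per transaction, one per line.
cat > /tmp/igen_txn.log <<'EOF'
10
20
30
EOF
# tpm = transactions / wall-clock minutes; avg response = mean elapsed ms.
awk -v wall_secs=60 '
  { sum += $1; n++ }
  END { printf "transactions=%d tpm=%.1f avg_resp_ms=%.2f\n", n, n / (wall_secs / 60), sum / n }
' /tmp/igen_txn.log
```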

 

Activity

The objective was to recommend a set of high performance Studio compiler flags for 32-bit integration with project webstack. webstack addresses the OpenSolaris community needs for web tier technologies. It is a bundle of open source software delivered in Solaris and supported by Sun, and contains software that Sun considers critical to its business.

The MySQL source code was compiled on a Sun Fire™ x64 system with sets of Sun Studio compiler flags. The resulting binary for each set was then run against the sysbench workload to obtain the performance throughput.
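The build-and-measure loop can be sketched as follows. This is a dry run that only echoes the commands; the flag sets, the installation prefix, and the sysbench options are illustrative assumptions:

```shell
# Dry-run sketch: echo one build + benchmark command sequence per flag set.
BASE="-xlibmil -xO3 -DHAVE_RWLOCK_T -mt -fsimple=1 -fns=no"
{
  for EXTRA in "-xbuiltin=%all" "-xbuiltin=%all -xunroll=2"; do
    echo "CC=cc CXX=CC CFLAGS=\"$BASE $EXTRA\" ./configure --prefix=/usr/local/mysql"
    echo "make && make install"
    echo "sysbench --test=oltp --oltp-read-only=on --num-threads=8 run"
  done
} > /tmp/build_matrix.txt
cat /tmp/build_matrix.txt
```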

The recommended flags were integrated into webstack with appropriate MySQL configuration options.

 

Setup and Build environment

The OS version used is Solaris 10, Update 4 (s10x_u4wos_11). The C and C++ compilers are part of the Studio compiler collection. The MySQL Community Server version used is 5.0.4x. The sysbench kit version used is v0.3.3.

The MySQL Server and the sysbench kit are installed on a Sun Fire™ x64 server.

  • Database Node :
    • CPU : 4 core Opteron x 2593 MHz
    • Memory : 16,320 MB
    • Operating System : Solaris 10, Update 4

 

MySQL Configuration options :

Option : Possible reason for inclusion
--prefix : Specify the installation directory
--xxdir : Specify a directory serving a given purpose
--with-server-suffix : Add a suffix to the mysqld version string
--enable-thread-safe-client : Make mysql_real_connect() thread-safe, and compile a thread-safe client library, libmysqlclient_r
--with-mysqld-libs : Include the given libraries in mysqld
--with-named-curses=-lcurses : Use the specified curses libraries instead of those automatically found by configure
--with-client-ldflags=-static : Compile statically linked programs
--with-mysql-ldflags=-static : Compile statically linked programs
--with-pic : Try to use only PIC objects, and omit non-PIC objects
--with-big-tables : Support tables with more than 4 billion rows, even on 32-bit platforms
--with-yassl : Use SSL connections, configured against the bundled yaSSL library
--with-readline : Use the bundled copy of readline rather than the system readline
--with-xx-storage-engine : Enable the xx storage engine
--with-innodb : Include the InnoDB table handler
--with-extra-charsets=complex : Additionally compile into the server all character sets that cannot be dynamically loaded
--enable-local-infile : Permit LOAD DATA LOCAL INFILE with files on the client-side file system. This adds flexibility; with LOCAL, no access to the server host is needed beyond the MySQL connection
--with-ndb-cluster : Enable support for the NDB Cluster storage engine on applicable platforms
--with-zlib-dir=bundled : Help the linker find -lz (libz.so) when linking client programs
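A representative configure invocation assembling several of the options from the table might look like this; the line is echoed rather than executed, and the installation prefix is an assumption:

```shell
# Echo a sample configure line built from options in the table above.
echo ./configure \
  --prefix=/usr/local/mysql \
  --enable-thread-safe-client \
  --with-named-curses=-lcurses \
  --with-big-tables \
  --with-innodb \
  --with-extra-charsets=complex \
  --enable-local-infile \
  --with-zlib-dir=bundled > /tmp/configure_line.txt
cat /tmp/configure_line.txt
```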

 

Studio Compiler flags :

Compiler Options : Possible reason for inclusion
-m64 or -m32 : Specifies the 64-bit or 32-bit memory model for the compiled binary object
-mt : Macro option that expands to -D_REENTRANT -lthread
-fsimple=1 : Permits only conservative floating-point simplifications: the optimizer may not replace a computation with one that produces different results with rounding modes held constant at runtime. Include this explicitly in the C++ flags
-fns=no : Disables nonstandard (flush-to-zero) floating-point mode, so subnormal operands and results are handled with gradual underflow
-xbuiltin=%all : Improves the optimization of code that calls standard library functions
-xO3 : Generates a high level of optimization
-xstrconst : Inserts string literals into the read-only data section of the text segment
-xlibmil : Selects the appropriate assembly-language inline templates for the floating-point option and platform
-xlibmopt : Enables the compiler to use a library of optimized math routines
-xtarget=generic : Specifies the target system for instruction set and optimization; sets -xarch, -xchip and -xcache
-xrestrict : Tells the compiler that there is no pointer aliasing between function arguments
-xprefetch=auto : Enables automatic insertion of prefetch instructions
-xprefetch_level=3 : Controls the aggressiveness of automatic prefetch insertion as set by -xprefetch=auto
-xunroll=2 : Suggests that the optimizer unroll loops n times; instructions from multiple iterations are combined into a single iteration, and register usage and code size may increase
-xalias_level : Provides the compiler with information about pointer usage, enabling type-based alias analysis and optimizations

 

Studio 11 32-bit (1-8 threads) - sysbench read-only oltp test; in each pair of tps numbers, the second corresponds to a run with the SUNPRO_C source code change

Compiler Options : tps at 1 | 2 | 4 | 8 threads
1. Release binary (Studio 10) : -xO3 -mt -fsimple=1 -ftrap=%none -nofstore -xbuiltin=%all -xlibmil -xlibmopt -xtarget=generic [for C++, append -features=no%except] : 416.67 | 722.88 | 1337.65 | 1315.38
2. Studio 11 Baseline : -xlibmil -xO3 -DHAVE_RWLOCK_T -mt -fsimple=1 -fns=no : 393.95 / 444.90 | 674.42 / 747.58 | 1143.25 / 1297.07 | 1243.08 / 1284.92
3. -xbuiltin=%all : 400.09 / 443.71 | 692.67 / 758.34 | 1170.63 / 1270.33 | 1219.10 / 1333.03
4. -xbuiltin=%all -xunroll=2 : 400.76 / 440.06 | 682.87 / 751.58 | 1312.33 / 1352.47 | 1233.43 / 1395.44
5. -xbuiltin=%all -xprefetch=auto -xprefetch_level=3 : 403.03 / 437.99 | 693.91 / 739.96 | 1324.03 / 1189.17 | 1246.68 / 1360.16
6. -xbuiltin=%all -xalias_level=std [=simple for C++] : 394.16 / 435.58 | 684.62 / 737.08 | 1125.59 / 1408.82 | 1194.02 / 1403.18
7. -xbuiltin=%all -xtarget=native : 400.84 / 443.89 | 685.57 / 755.54 | 1151.23 / 1234.93 | 1209.17 / 1303.74
8. -xbuiltin=%all -xunroll=2 -xprefetch=auto -xprefetch_level=3 : 400.39 / 443.86 | 694.79 / 747.41 | 1131.03 / 1236.80 | 1261.71 / 1223.67
9. -xbuiltin=%all -xunroll=2 -xprefetch=auto -xprefetch_level=3 -xalias_level : 397.44 / 442.95 | 691.14 / 753.21 | 1119.71 / 1287.13 | 1222.34 / 1281.75

 

Recommended compiler options for integration with webstack

A.) The recommended Studio 11 compiler flags for webstack on AMD64 are '-xbuiltin=%all' and '-xprefetch=auto -xprefetch_level=3'. A throughput increase of 2.3%-3% was observed over the baseline.
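The percentage figures here are simple (new - baseline) / baseline comparisons. For example, taking the 2-thread baseline tps (674.42) and the 2-thread tps of the prefetch flag set (693.91) from the table above:

```shell
# Compute the throughput gain of one table cell over the baseline.
awk -v base=674.42 -v new=693.91 \
  'BEGIN { printf "gain=%.2f%%\n", (new - base) / base * 100 }' | tee /tmp/gain.txt
```

This prints a gain of 2.89%, consistent with the quoted 2.3%-3% range.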

B.) The SUNPRO_C source change yielded a consistent throughput increase in most cases (7%-10%) over those run without this change. This option can be used for SPARC and x64 platforms. The original MySQL sources have explicit inlining of small support functions only with gcc and Visual C++. However, this inlining is found to help Sun Studio as well, and can be enabled with the following change to the header file $MYSQL_HOME/innobase/include/univ.i on line 61:

#if !defined(__GNUC__) && !defined(__WIN__) && !defined(__SUNPRO_C)

 

Studio 12 64-bit (1-64 threads) - sysbench read-only oltp test; in each triplet of tps numbers, the first was obtained with a patched compiler, the second with an unpatched compiler, and the third with binaries previously run against the iGen workload

Compiler Options : tps at 1 | 2 | 4 | 8 | 16 | 32 | 64 threads
1. Release binary (64-bit) : -m64 -O2 -mtune=k8 [LDFLAGS=-static-libgcc] : 548.00 | 900.64 | 698.98 | 1588.82 | 1693.18 | 1587.62 | 1583.23
2. Studio 12 (64-bit) : -fast -m64 -DHAVE_RWLOCK_T -mt [for C++, append -fsimple=1 -fns=no] : 479.23 / 484.43 / 478.86 | 803.69 / 800.71 / 804.57 | 1423.95 / 1487.72 / 1246.74 | 1449.46 / 1457.24 / 1380.51 | 1398.36 / 1367.45 / 1455.21 | 1347.89 / 1324.32 / 1333.26 | 1342.14 / 1333.89 / 1324.67
3. Feedback Optimization added (FBO) : -fast -m64 -DHAVE_RWLOCK_T -mt -xprofile=use:dir [for C++, append -fsimple=1 -fns=no] : 522.23 / 527.12 / 525.32 | 861.59 / 862.50 / 862.14 | 1381.42 / 1385.52 / 1592.07 | 1436.56 / 1488.54 / 1595.25 | 1449.39 / 1489.28 / 1570.37 | 1351.54 / 1490.72 / 1406.15 | 1399.92 / 1427.93 / 1444.99
4. FBO + Loop Unrolling : -fast -m64 -DHAVE_RWLOCK_T -mt -xprofile=use:dir -xunroll=2 [for C++, append -fsimple=1 -fns=no] : 519.83 / 521.27 / 527.47 | 862.08 / 857.37 / 867.34 | 1489.89 / 1429.39 / 1605.27 | 1594.95 / 1515.42 / 1597.16 | 1470.04 / 1393.64 / 1572.66 | 1498.09 / 1466.43 / 1429.44 | 1435.28 / 1434.98 / 1438.04
5. FBO + Prefetching : -fast -m64 -DHAVE_RWLOCK_T -mt -xprofile=use:dir -xprefetch=auto -xprefetch_level=3 [for C++, append -fsimple=1 -fns=no] : 515.91 / 523.84 / 526.51 | 866.70 / 860.84 / 868.96 | 1484.00 / 1376.04 / 1342.13 | 1503.10 / 1476.83 / 1529.01 | 1510.76 / 1503.53 / 1575.71 | 1405.84 / 1388.53 / 1485.51 | 1434.42 / 1433.54 / 1452.49
6. FBO + Pointer aliasing : -fast -m64 -DHAVE_RWLOCK_T -mt -xprofile=use:dir -xalias_level=strong [=simple for C++; for C++, also append -fsimple=1 -fns=no] : 524.58 / 521.35 / 533.86 | 873.51 / 866.01 / 873.86 | 1354.31 / 1351.95 / 1623.73 | 1457.34 / 1561.47 / 1631.01 | 1538.97 / 1502.79 / 1558.29 | 1521.22 / 1481.15 / 1479.87 | 1451.58 / 1465.48 / 1448.59
7. FBO + Restricted Pointer Parameters : -fast -m64 -DHAVE_RWLOCK_T -mt -xprofile=use:dir -xrestrict=%all [for C++, append -fsimple=1 -fns=no] : 526.07 / 526.80 / 528.67 | 865.49 / 861.65 / 868.88 | 1329.60 / 1343.57 / 1359.18 | 1592.37 / 1317.78 / 1442.33 | 1522.06 / 1409.31 / 1456.71 | 1521.06 / 1410.47 / 1512.98 | 1433.70 / 1411.82 / 1400.65

A.) The runs with the patched compiler yielded a marginal throughput increase in about two thirds of the cases over those with an unpatched compiler.

B.) Patching involved compiler patch 6538437, and a change to the MySQL header file $MYSQL_HOME/innobase/include/univ.i on line 61 to include '__sparc'. The code snippet below determines whether functions get declared as "static inline"; currently this inlining is triggered only for gcc and Windows, but the SPARC compiler also supports the syntax :

#if !defined(__GNUC__) && !defined(__WIN__) && !defined(__sparc)

C.) The binaries previously run with the iGen workload and then run with sysbench yielded an appreciable throughput increase (4% - 7.7%) in about one fourth of the cases over binaries run with sysbench only.

 

Software documentation links

  • Solaris 10 OS : (here)
  • Sun Studio 12 Compiler Collection : (here)
  • Sun Fire™ SPARC and x64 servers : (here)
  • MySQL Database : (here)
  • sysbench site : (here)

Sun Studio Compiler Options for MySQL on Solaris 10 SPARC OS : Performance Study


  • Introduction
  • Activity
  • Setup and build environment
  • MySQL Configuration options
  • Studio Compiler flags
  • Studio 11 32-bit (1-16 threads)- sysbench read-only oltp test
  • Recommended compiler options
  • Studio 12 64-bit (1-8 threads)- sysbench read-only oltp test
  • Software documentation links
  • References
  • Acknowledgements

 

Introduction

Solaris 10, Sun's flagship OS, is multi-platform and scalable, and yields massive performance advantages for databases, Web, and Java technology-based services. Its advanced features include security (Process Rights Management), system observability (DTrace), system resource utilization (containers and virtualization), an optimized network stack, data management, system availability (Predictive Self Healing), interoperability tools, and Support & Services (s/w subscription, h/w support, technical help).

Sun Studio compiler delivers high-performance, optimizing C, C++, and Fortran compilers for the Solaris OS on SPARC, and for both Solaris and Linux on x86/x64 platforms, including the latest multi-core systems.

Sun Fire™ SPARC servers pack up to four UltraSPARC IV chip multithreading processors, delivering up to eight concurrent threads, with 32 GB of memory. Coupled with Solaris 10, these servers can deliver very high levels of throughput for demanding departmental and enterprise applications.

MySQL, the most popular Open Source database was developed, distributed, and supported by a commercial company MySQL AB, now part of Sun as a result of an acquisition. MySQL is multi-threaded and consists of an SQL server, client programs and libraries, administrative tools, and APIs. Java client programs that use JDBC connections can access a MySQL server via the MySQL Connector/J interface.

The sysbench workload kit is a modular, cross-platform and multi-threaded benchmark tool for evaluating OS parameters that are important for a system running a database under intensive load.

 

Activity

The objective was to recommend a set of high performance Studio compiler flags for 32-bit integration with project webstack. webstack addresses the OpenSolaris community needs for web tier technologies. It is a bundle of open source software delivered in Solaris and supported by Sun, and contains software that Sun considers critical to its business.

The MySQL source code was compiled on a Sun Fire™ SPARC system with sets of Sun Studio compiler flags. The resulting binary for each set was then run against the sysbench workload to obtain the performance throughput.

The recommended flags were integrated into webstack with appropriate MySQL configuration options.

 

Setup and Build environment

The OS version used is Solaris 10, Update 4 (s10x_u4wos_11). The C and C++ compilers are part of the Studio compiler collection. The MySQL Community Server version used is 5.0.4x. The sysbench kit version used is v0.3.3.

The MySQL Server and the sysbench kit are installed on a Sun Fire™ SPARC server.

  • Database Node :
    • CPU : 4 core UltraSPARC-IV x 1350 MHz
    • Memory : 32,768 MB
    • Operating System : Solaris 10, Update 4

MySQL Configuration options :

Option : Possible reason for inclusion
--prefix : Specify the installation directory
--xxdir : Specify a directory serving a given purpose
--with-server-suffix : Add a suffix to the mysqld version string
--enable-thread-safe-client : Make mysql_real_connect() thread-safe, and compile a thread-safe client library, libmysqlclient_r
--with-mysqld-libs : Include the given libraries in mysqld
--with-named-curses=-lcurses : Use the specified curses libraries instead of those automatically found by configure
--with-client-ldflags=-static : Compile statically linked programs
--with-mysql-ldflags=-static : Compile statically linked programs
--with-pic : Try to use only PIC objects, and omit non-PIC objects
--with-big-tables : Support tables with more than 4 billion rows, even on 32-bit platforms
--with-yassl : Use SSL connections, configured against the bundled yaSSL library
--with-readline : Use the bundled copy of readline rather than the system readline
--with-xx-storage-engine : Enable the xx storage engine
--with-innodb : Include the InnoDB table handler
--with-extra-charsets=complex : Additionally compile into the server all character sets that cannot be dynamically loaded
--enable-local-infile : Permit LOAD DATA LOCAL INFILE with files on the client-side file system. This adds flexibility; with LOCAL, no access to the server host is needed beyond the MySQL connection
--with-ndb-cluster : Enable support for the NDB Cluster storage engine on applicable platforms
--with-zlib-dir=bundled : Help the linker find -lz (libz.so) when linking client programs

 

Studio Compiler flags :

Compiler Options : Possible reason for inclusion
-m64 or -m32 : Specifies the 64-bit or 32-bit memory model for the compiled binary object
-mt : Macro option that expands to -D_REENTRANT -lthread
-fsimple=1 : Permits only conservative floating-point simplifications: the optimizer may not replace a computation with one that produces different results with rounding modes held constant at runtime. Include this explicitly in the C++ flags
-fns=no : Disables nonstandard (flush-to-zero) floating-point mode, so subnormal operands and results are handled with gradual underflow
-xbuiltin=%all : Improves the optimization of code that calls standard library functions
-xO3 : Generates a high level of optimization
-xstrconst : Inserts string literals into the read-only data section of the text segment
-xlibmil : Selects the appropriate assembly-language inline templates for the floating-point option and platform
-xlibmopt : Enables the compiler to use a library of optimized math routines
-xtarget=generic : Specifies the target system for instruction set and optimization; sets -xarch, -xchip and -xcache
-xrestrict : Tells the compiler that there is no pointer aliasing between function arguments
-xprefetch=auto : Enables automatic insertion of prefetch instructions
-xprefetch_level=3 : Controls the aggressiveness of automatic prefetch insertion as set by -xprefetch=auto
-xunroll=2 : Suggests that the optimizer unroll loops n times; instructions from multiple iterations are combined into a single iteration, and register usage and code size may increase
-xalias_level : Provides the compiler with information about pointer usage, enabling type-based alias analysis and optimizations

 

Studio 11 32-bit (1-16 threads) - sysbench read-only oltp test; in each pair of tps numbers, the second corresponds to a run with the SUNPRO_C source code change

Compiler Options : tps at 1 | 2 | 4 | 8 | 16 threads
1. Release binary (Studio 10) : -xO3 -Xa -xstrconst -mt -D_FORTEC_ -xarch=v8 -xc99=none [for C++, use -noex, and remove -Xa and -xstrconst] : 181.35 | 335.26 | 594.94 | 914.80 | 942.06
2. Studio 11 Baseline : -xlibmil -xO3 -DHAVE_RWLOCK_T -mt -fsimple=1 -fns=no : 183.28 / 184.22 | 337.34 / 338.78 | 598.87 / 597.71 | 833.04 / 804.49 | 929.94 / 871.35
3. -xbuiltin=%all : 186.10 / 185.59 | 345.30 / 342.60 | 604.56 / 603.12 | 812.25 / 921.53 | 930.23 / 942.36
4. -xbuiltin=%all -xunroll=2 : 188.53 / 189.39 | 347.28 / 349.89 | 613.24 / 613.07 | 927.57 / 846.86 | 941.56 / 888.54
5. -xbuiltin=%all -xprefetch=auto -xprefetch_level=3 : 184.16 / 186.40 | 343.20 / 344.76 | 602.33 / 604.19 | 839.48 / 813.32 | 943.60 / 948.48
6. -xbuiltin=%all -xalias_level=std [=simple for C++] : 186.81 / 187.77 | 347.56 / 346.61 | 610.44 / 611.10 | 817.28 / 923.46 | 946.79 / 922.31
7. -xbuiltin=%all -xtarget=native : 190.70 / 190.84 | 354.24 / 353.79 | 619.54 / 620.02 | 828.70 / 849.72 | 948.16 / 898.97
8. -xbuiltin=%all -xunroll=2 -xprefetch=auto -xprefetch_level=3 : 188.21 / 188.44 | 348.44 / 348.51 | 614.24 / 613.37 | 840.94 / 926.02 | 948.76 / 942.59
9. -xbuiltin=%all -xunroll=2 -xprefetch=auto -xprefetch_level=3 -xalias_level : 187.86 / 189.38 | 348.22 / 347.57 | 618.12 / 618.38 | 850.12 / 851.36 | 941.20 / 909.14

 

Recommended compiler options for integration with webstack

A.) The recommended Studio 11 compiler flags for webstack on SPARC are '-xbuiltin=%all' combined with '-xtarget=native' or '-xunroll=2'. With -xtarget, a throughput increase of 3.4%-4% was observed over the baseline. With -xunroll, a throughput increase of 2.3%-3.6% was observed over the baseline.

B.) The SUNPRO_C source change yielded a throughput increase in two-thirds of the cases over those run without this change. This option can be used for SPARC and x64 platforms. The original MySQL sources have explicit inlining of small support functions only with gcc and Visual C++. However, this inlining is found to help Sun Studio as well, and can be enabled with the following change to the header file $MYSQL_HOME/innobase/include/univ.i on line 61:

#if !defined(__GNUC__) && !defined(__WIN__) && !defined(__SUNPRO_C)

 

Studio 12 64-bit (1-8 threads) - sysbench read-only oltp test

Compiler Options : tps at 1 | 2 | 4 | 8 threads
1. Release binary (64-bit) : -m64 -O2 -mtune=k8 [LDFLAGS=-static-libgcc] : 182.93 | 335.54 | 592.00 | 902.69
2. Studio 12 (64-bit) : -Xa -fast -m64 -xarch=sparc -xstrconst -mt [for C++, append -noex -fsimple=1 -fns=no and remove -Xa] : 197.81 | 346.33 | 586.13 | 685.93
3. Feedback Optimization added (FBO) : As in 2, with -xprofile=use:dir : 223.09 | 385.44 | 635.79 | 719.65
4. FBO + Loop Unrolling : As in 3, with -xunroll=2 : 227.25 | 388.14 | 631.89 | 724.14
5. FBO + Prefetching : As in 3, with -xprefetch=auto -xprefetch_level=3 : 228.64 | 395.34 | 651.52 | 723.02
6. FBO + Restricted Pointer Parameters : As in 3, with -xrestrict=%all : 228.91 | 393.56 | 635.01 | 736.71

The Studio 12 compiler flags that performed the best were -xrestrict=%all and '-xprefetch=auto -xprefetch_level=3', when used with FBO. These combinations gave a throughput increase of 8% - 15% over the 64-bit Studio 12 baseline (without FBO).
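The quoted range can be approximately reproduced from the table: the sketch below computes the gain of row 6 (FBO + restricted pointers) over row 2 (the Studio 12 baseline) at each thread count, using the tps values copied from the table.

```shell
# Per-thread-count gain of row 6 over row 2, from the table above.
awk 'BEGIN {
  n = split("197.81 346.33 586.13 685.93", base)  # row 2: 1, 2, 4, 8 threads
  split("228.91 393.56 635.01 736.71", fbo)       # row 6: 1, 2, 4, 8 threads
  split("1 2 4 8", t)
  for (i = 1; i <= n; i++)
    printf "%s threads: +%.1f%%\n", t[i], (fbo[i] - base[i]) / base[i] * 100
}' | tee /tmp/fbo_gains.txt
```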

 

Software documentation links

  • Solaris 10 OS : (here)
  • Sun Studio 12 Compiler Collection : (here)
  • Sun Fire™ Servers : (here)
  • MySQL Database : (here)
  • sysbench site : (here)

About

Krish Shankar
