Wednesday Feb 25, 2015

Deploying MySQL over Fibre Channel / iSCSI using the Oracle ZFS Storage Appliance

This blog post serves as a follow-up to Deploying MySQL over NFS using the Oracle ZFS Storage Appliance. The benefits remain the same, so this discussion will focus entirely on configuring your Oracle ZFS Storage Appliance and database server.

Configuring the Oracle ZFS Storage Appliance

Depending on which block protocol you plan to use, set up target and initiator groups per the appliance documentation.

These instructions assume a target group called 'mysql-tgt' and an initiator group called 'mysql-init'.

1. From the ZFS controller’s CLI, create a project called ‘mysql’:

        zfs:> shares project mysql

2. Set logbias to latency to leverage write flash capabilities:

        zfs:shares mysql (uncommitted)> set logbias=latency
                               logbias = latency (uncommitted)

3. Commit the settings:

        zfs:shares mysql (uncommitted)> commit

4. Select the 'mysql' project:

        zfs:> shares select mysql

5. Create a LUN called 'innodb-data' to hold data files:

        zfs:shares mysql> lun innodb-data

6. Set the volume block record size to 16K to match InnoDB’s standard page size:

        zfs:shares mysql/innodb-data (uncommitted)> set volblocksize=16K
                               volblocksize = 16K (uncommitted)

7. Set the volume size to a value large enough to accommodate your database:

        zfs:shares mysql/innodb-data (uncommitted)> set volsize=1T
                               volsize = 1T (uncommitted)

8. Set the initiator and target groups:

        zfs:shares mysql/innodb-data (uncommitted)> set initiatorgroup=mysql-init
                               initiatorgroup = mysql-init (uncommitted)
        zfs:shares mysql/innodb-data (uncommitted)> set targetgroup=mysql-tgt
                                 targetgroup = mysql-tgt (uncommitted)
        zfs:shares mysql/innodb-data (uncommitted)> commit

9. Create a LUN called ‘innodb-log’ to hold the redo logs:

        zfs:shares mysql> lun innodb-log

10. Set the volume block record size to 128K:

        zfs:shares mysql/innodb-log (uncommitted)> set volblocksize=128K
                               volblocksize = 128K (uncommitted)

11. Set the volume size to a value large enough to accommodate your redo logs:

        zfs:shares mysql/innodb-log (uncommitted)> set volsize=256G
                               volsize = 256G (uncommitted)
        zfs:shares mysql/innodb-log (uncommitted)> commit
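
At this point the settings can be spot-checked from the same session. A quick sketch using standard appliance CLI navigation (select/get/done); the values echoed back should match what was committed above:

        zfs:shares mysql> select innodb-log
        zfs:shares mysql/innodb-log> get volblocksize
                          volblocksize = 128K
        zfs:shares mysql/innodb-log> done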

Configuring your database server

1. A directory structure should be created to contain the MySQL database:

# mkdir -p /mysql/san/innodb-data
# mkdir -p /mysql/san/innodb-log
# chown -R mysql:mysql /mysql/san

2. If using iSCSI, discover the targets and log in to the ZFS Storage Appliance:

# iscsiadm -m discovery -t sendtargets -p zfssa-ip-address
# iscsiadm -m node -p zfssa-ip-address --login
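
Before moving on, it's worth confirming that the sessions were actually established (standard iscsiadm usage; the target names listed will be those exported by your appliance):

# iscsiadm -m session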

3. The multipath configuration held in /etc/multipath.conf should contain the following:

defaults {
    find_multipaths yes
    user_friendly_names yes
}
devices {
    device {
        vendor "SUN"
        product "ZFS Storage 7430"
        getuid_callout "/lib/udev/scsi_id --page=0x83 --whitelisted --device=/dev/%n"
        prio alua
        hardware_handler "1 alua"
        path_grouping_policy group_by_prio
        failback immediate
        no_path_retry 600
        rr_min_io 100
        path_checker tur
        rr_weight uniform
        features "0"
    }
}

4. Discover the LUNs with multipath. This may require a restart of the multipath service to pick up the new configuration.
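
If the LUNs are not visible yet, restart the service and rebuild the maps first (standard multipath-tools commands on an Oracle Linux / RHEL-style system):

# service multipathd restart
# multipath -r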

# multipath -ll
mpatha (3600144f0fa2f948b0000537cdb970008) dm-2 SUN,ZFS Storage 7430
size=1.0T features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `-0:0:0:0 sda 8:0   active ready running
|-+- policy='round-robin 0' prio=1 status=enabled
| `-0:0:1:0 sdc 8:32  active ready running
|-+- policy='round-robin 0' prio=1 status=enabled
| `-1:0:0:0 sde 8:64  active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `-1:0:1:0 sdg 8:96  active  ready running
mpathb (3600144f0fa2f948b0000537cdbc10009) dm-3 SUN,ZFS Storage 7430
size=256G features='0' hwhandler='0' wp=rw
|-+- policy='round-robin 0' prio=1 status=active
| `-0:0:0:1 sdb 8:16  active ready running
|-+- policy='round-robin 0' prio=1 status=enabled
| `-0:0:1:1 sdd 8:48  active ready running
|-+- policy='round-robin 0' prio=1 status=enabled
| `-1:0:0:1 sdf 8:80  active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `-1:0:1:1 sdh 8:112 active ready running

5. Partition each LUN with fdisk, using 32 sectors per track and starting each partition at a multiple of 256 sectors so that partitions align with the underlying LUN blocks. This is documented more extensively in Aligning Partitions to Maximize Performance, page 31.

# fdisk -u -S 32 /dev/dm-2
# fdisk -u -S 32 /dev/dm-3
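
To confirm the alignment, list each partition table in sector units (standard fdisk usage); the first partition should report a starting sector that is a multiple of 256:

# fdisk -lu /dev/dm-2
# fdisk -lu /dev/dm-3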

6. Build an XFS filesystem* on the first partition of each LUN, using the full device-mapper path:

# mkfs.xfs /dev/mapper/mpathap1
# mkfs.xfs /dev/mapper/mpathbp1
* XFS is preferred on Linux. ZFS is preferred on Solaris.

7. Mount each LUN either from a shell or automatically from /etc/fstab:

# mount -t xfs -o noatime,nodiratime,nobarrier,logbufs=8,logbsize=256k \
  /dev/mapper/mpathap1 /mysql/san/innodb-data
# mount -t xfs -o noatime,nodiratime,nobarrier,logbufs=8,logbsize=256k \
  /dev/mapper/mpathbp1 /mysql/san/innodb-log
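
For the automatic option, the equivalent /etc/fstab entries would look like this (a sketch assuming the device-mapper names shown above):

/dev/mapper/mpathap1  /mysql/san/innodb-data  xfs  noatime,nodiratime,nobarrier,logbufs=8,logbsize=256k  0 0
/dev/mapper/mpathbp1  /mysql/san/innodb-log   xfs  noatime,nodiratime,nobarrier,logbufs=8,logbsize=256k  0 0
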
That's it. Refer to the previous blog post to understand how to set up the MySQL database once these filesystems are mounted. In a future blog post, I'll cover how to back up your MySQL database using snapshots and cloning.

Saturday Feb 14, 2015

Deploying MySQL over NFS using the Oracle ZFS Storage Appliance

The Oracle ZFS Storage Appliance supports NFS, iSCSI, and Fibre Channel access to a MySQL database. By consolidating the storage of MySQL databases onto the Oracle ZFS Storage Appliance, the following goals can be achieved:

1. Expand and contract storage space easily in a pooled environment.
2. Focus high-end caching on the Oracle ZFS Storage Appliance to simplify provisioning.
3. Minimize network overhead by leveraging InfiniBand and Fibre Channel connectivity.

This blog post will focus specifically on NFS. A follow-up post will discuss iSCSI and Fibre Channel.

Configuring the Oracle ZFS Storage Appliance

Each database should be contained in its own project.

1. From the ZFS controller’s CLI, create a project called ‘mysql’.

        zfs:> shares project mysql

2. Set logbias to latency to leverage write flash capabilities:

        zfs:shares mysql (uncommitted)> set logbias=latency
                               logbias = latency (uncommitted)

3. Set the default user to mysql and default group to mysql:

        zfs:shares mysql (uncommitted)> set default_user=mysql
                          default_user = mysql (uncommitted)
        zfs:shares mysql (uncommitted)> set default_group=mysql
                            default_group = mysql (uncommitted)

Note: If a name service such as LDAP or NIS is not being used, change these to the actual UID and GID found in /etc/passwd and /etc/group on the host.
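
A minimal sketch for confirming those IDs on the host (the numeric values shown here are examples and will vary by system):

        # id mysql
        uid=27(mysql) gid=27(mysql) groups=27(mysql)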

4. Disable ‘Update access time on read’:

        zfs:shares mysql> set atime=false
                                 atime = false (uncommitted)

5. Commit the changes:

        zfs:shares mysql> commit

6. Create a filesystem called innodb-data to hold data files:

        zfs:shares mysql> filesystem innodb-data

7. Set the database record size to 16K to match InnoDB’s standard page size:

        zfs:shares mysql/innodb-data (uncommitted)> set recordsize=16K
                             recordsize = 16K (uncommitted)
        zfs:shares mysql/innodb-data (uncommitted)> commit

8. Create a filesystem called ‘innodb-log’ to hold redo logs:

        zfs:shares mysql> filesystem innodb-log

9. Set the database record size to 128K:

        zfs:shares mysql/innodb-log (uncommitted)> set recordsize=128K
                            recordsize = 128K (uncommitted)
        zfs:shares mysql/innodb-log (uncommitted)> commit

Configuring the server

This example assumes a Linux server will be running the MySQL database; the commands are roughly the same on a Solaris machine:

1. A directory structure should be created to contain the MySQL database:

        # mkdir -p /mysql/nas/innodb-data
        # mkdir -p /mysql/nas/innodb-log
        # chown -R mysql:mysql /mysql/nas

2. Each filesystem provisioned on the Oracle ZFS Storage Appliance should be mounted with the following options:

        rw,bg,hard,rsize=1048576,wsize=1048576,vers=3,nointr,timeo=600,tcp,actimeo=0,nolock

3. This should be supplied in /etc/fstab in order to be mounted automatically at boot, or it can be run manually from a shell like so:

        # mount -t nfs -o rw,bg,hard,rsize=1048576,wsize=1048576,vers=3,nointr,timeo=600,tcp,actimeo=0,nolock zfs:/export/innodb-data /mysql/nas/innodb-data
        # mount -t nfs -o rw,bg,hard,rsize=1048576,wsize=1048576,vers=3,nointr,timeo=600,tcp,actimeo=0,nolock zfs:/export/innodb-log /mysql/nas/innodb-log
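
A sketch of the corresponding /etc/fstab entries, assuming the appliance hostname 'zfs' used above:

        zfs:/export/innodb-data /mysql/nas/innodb-data nfs rw,bg,hard,rsize=1048576,wsize=1048576,vers=3,nointr,timeo=600,tcp,actimeo=0,nolock 0 0
        zfs:/export/innodb-log /mysql/nas/innodb-log nfs rw,bg,hard,rsize=1048576,wsize=1048576,vers=3,nointr,timeo=600,tcp,actimeo=0,nolock 0 0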

Configuring the MySQL database

The option file my.cnf should be modified to offload the database onto the Oracle ZFS Storage Appliance and to apply additional tunings for optimal performance. Stop MySQL before changing this file, and restart it once the changes are complete:
    # service mysql stop
    # service mysql start

Important my.cnf changes

1. innodb_doublewrite = 0

A doublewrite buffer is normally needed to guard against partial page writes. However, the transactional nature of ZFS guarantees that partial writes never occur, so the buffer can be safely disabled.

2. innodb_flush_method = O_DIRECT

Ensures that InnoDB calls directio() instead of fcntl() for the data files. This allows the data to be accessed without OS level buffering and read-ahead.

3. innodb_data_home_dir = /path/to/innodb-data

The data filesystem for InnoDB should be located on its own share or LUN on the Oracle ZFS Storage Appliance.

4. innodb_log_group_home_dir = /path/to/innodb-log

The log filesystem for InnoDB should be located on its own share or LUN on the Oracle ZFS Storage Appliance.

5. innodb_data_file_path = ibdatafile:1G:autoextend

This configures a single large tablespace for InnoDB. The ZFS controller is then responsible for managing the growth of new data. This eliminates the complexity needed for controlling multiple tablespaces.

You can also download an example my.cnf file to get started.
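
If the download isn't handy, the five settings above combine into a minimal sketch like the following (the paths assume the NFS mount points created earlier; everything else is left at its default):

    [mysqld]
    innodb_doublewrite = 0
    innodb_flush_method = O_DIRECT
    innodb_data_home_dir = /mysql/nas/innodb-data
    innodb_log_group_home_dir = /mysql/nas/innodb-log
    innodb_data_file_path = ibdatafile:1G:autoextend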

Testing with Sysbench

Sysbench is a handy benchmark tool for generating a database workload. To fill a test database, run the following command:
    # sysbench \
    --test=oltp \
    --oltp-table-size=1000000 \
    --mysql-db=test \
    --mysql-user=root \
    --mysql-password= \
    prepare
Once filled, create an OLTP workload with the following command and parameters:
    # sysbench \
    --test=oltp \
    --oltp-table-size=1000000 \
    --oltp-test-mode=complex \
    --oltp-read-only=off \
    --num-threads=128 \
    --max-time=3600 \
    --max-requests=0 \
    --mysql-db=test \
    --mysql-user=root \
    --mysql-password= \
    --mysql-table-engine=innodb \
    run
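
When the run completes, the test tables can be dropped with sysbench's matching cleanup step:
    # sysbench \
    --test=oltp \
    --mysql-db=test \
    --mysql-user=root \
    --mysql-password= \
    cleanup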

Analytics

The Analytics feature of the Oracle ZFS Storage Appliance offers an unprecedented level of observability into your database workload. It can help identify performance bottlenecks based on the utilization of your network, hardware, and storage protocols. Its drill-down functionality can also narrow the focus onto a single MySQL instance's workload: operation type (read/write), I/O pattern (sequential/random), response latency, and I/O size for both the data and log files. At any point in time, a DBA can track a database instance at an incredibly granular level.

Once you have a single database installed, you can try creating more instances to analyze your I/O patterns. Run separate sysbench processes for each database and then use Analytics to monitor the differences between workloads.

Monday Sep 22, 2014

ZFS Storage at Oracle OpenWorld 2014

Join my colleagues and me at this year's Oracle OpenWorld. We have five hands-on lab sessions available to attend, all heavily focused on 12c, MySQL, and the new RESTful API for the Oracle ZFS Storage Appliance.

HOL9715 - Deploying Oracle Database 12c with Oracle ZFS Storage Appliance 

September 29, (Monday) 2:45 PM - Hotel Nikko - Mendocino I/II
September 30, (Tuesday) 5:15 PM - Hotel Nikko - Mendocino I/II
HOL9718 - Managing and Monitoring Oracle ZFS Storage Appliance via the RESTful API 

September 29, (Monday) 2:45 PM - Hotel Nikko - Mendocino I/II
October 1, (Wednesday) 10:15 AM - Hotel Nikko - Mendocino I/II
HOL9760 - Deploying MySQL with Oracle ZFS Storage Appliance 

September 30, (Tuesday) 6:45 PM - Hotel Nikko - Mendocino I/II

Tuesday Sep 17, 2013

ZFS Storage at Oracle OpenWorld 2013

Join my colleagues and me at this year's Oracle OpenWorld. I have a session, a hands-on lab, and a demo being held in and around Moscone, all heavily focused on 12c and ZFS analytics.

HOL10103 - Managing ZFS Storage Inside Oracle Database 12c Environments 

September 23, (Monday) 10:45 AM - Marriott Marquis - Salon 10A
CON2846 - Oracle Use and Best Practices for High-Performance Cloud Storage 

September 23, (Monday) 12:15 PM - Westin San Francisco - Franciscan II
DEMO3619 - Maintaining the Performance of Your Cloud Infrastructure

Moscone South Lower Level, SC-152

Monday Sep 24, 2012

ZFS Storage at Oracle OpenWorld 2012

Join my colleagues and me at this year's Oracle OpenWorld. We'll be hosting a hands-on lab demonstrating the ZFS Storage Appliance and its analytics features.

HOL10034 - Managing Storage in the Cloud 

October 1st (Monday) 3:15 PM - Marriott Marquis - Salon 14/15
October 2nd (Tuesday) 5:00 PM - Marriott Marquis - Salon 14/15

Tuesday Jan 03, 2012

NFS mounts with ZFS

I ran into a strange automount issue where my NFS shares were not being mounted at boot time. nfs/client was enabled, my entry in /etc/vfstab was correct, and issuing a `mount -a` worked flawlessly. So what was the problem? Well, this was the entry in my vfstab file:
biff:/paulie    -       /export/home/paulie/biff   nfs     -       yes     proto=tcp,vers=3
I wanted to place my NFS share inside a zfs filesystem so that it was easily accessible in my home directory.
[paulie@doc ~]$ zfs list | grep export/home/paulie
rpool/export/home/paulie  78.3M  2.82G  78.3M  /export/home/paulie
It turns out this is not such a good idea, since /etc/vfstab is read *before* zpools are imported and mounted. This means that any NFS share mounted at boot must live outside ZFS filesystems and then be symlinked into place.
[root@doc ~]# mkdir /biff
[paulie@doc ~]$ ln -s /biff/ /export/home/paulie/biff/
... and then changing around vfstab ...
biff:/paulie    -       /biff   nfs     -       yes     proto=tcp,vers=3
And that's it, NFS should automount now:
[paulie@doc ~]$ df -kh | grep biff
biff:/paulie           2.7T   1.2T       1.4T    47%    /biff
Lesson learned.

Wednesday Oct 12, 2011

ZFS Storage Appliance at Storage Networking World Fall 2011

I was an instructor at SNW this year at the JW Marriott hotel in Orlando, Florida. Together with fellow Oracle co-worker Ray Clarke, I represented the ZFS Storage Appliance in a hands-on environment that allowed storage administrators and industry experts to demo our product in a simulated setting.

Rather than haul physical equipment to the convention, we set up an array of Windows 7 virtual machine sessions paired with a ZFS simulator running on VirtualBox across two remote X4270 machines. This let us create a classroom environment of 24 stations (48 VM sessions) that provided a superb replica of the 7000 product, which each user could toy around with as they completed the storage exercises we devised.

If you missed the opportunity to demo our product, the simulator is available for download, so feel free to play with it in your own environment.

Monday Jan 03, 2011

ZFS Encryption for USB sticks on Solaris 11 Express

USB memory sticks are easily lost, so to keep your data safe, it's best to use the new encryption feature of ZFS available since snv_149 (ZFS version 30). Here's how to take advantage of it.
[paulie@adrenaline ~]$ uname -a
SunOS adrenaline 5.11 snv_155 i86pc i386 i86pc Solaris
Get the device id for the USB stick using rmformat.
[paulie@adrenaline ~]$ rmformat
Looking for devices...
     1. Logical Node: /dev/rdsk/c11t0d0p0
        Physical Node: /pci@0,0/pci108e,534a@2/hub@4/storage@1/disk@0,0
        Connected Device: SanDisk  U3 Cruzer Micro  8.02
        Device Type: Removable
	Bus: USB
	Size: 1.9 GB
	Label: 
	Access permissions: Medium is not write protected.
The device id is c11t0d0p0. Using this id, we can make a pool on the device called 'secret'. You can call yours whatever you want.
[paulie@adrenaline ~]# zpool create -O encryption=on secret c11t0d0p0
Enter passphrase for 'secret': 
Enter again: 
Let's create a 128MB file called file.enc in the new pool.
[paulie@adrenaline ~]# cd /secret; mkfile 128m file.enc
Now, let's make sure it works by exporting and importing the secret pool; it should prompt for the passphrase.
[paulie@adrenaline ~]# zpool export secret
[paulie@adrenaline ~]# zpool import secret
Enter passphrase for 'secret': 
It works as expected. Let's check for the created file.
[paulie@adrenaline ~]# ls /secret
file.enc
We can also check the encryption property of any ZFS filesystem with the following command:
[paulie@adrenaline ~]# zfs get encryption secret
NAME    PROPERTY    VALUE        SOURCE
secret  encryption  on           local
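For reference, the keysource property shows how the wrapping key is supplied, and the passphrase can be changed in place. A brief sketch using the Solaris 11 Express key-management subcommand (zfs key -c); the exact output formatting may differ:
[paulie@adrenaline ~]# zfs get keysource secret
NAME    PROPERTY   VALUE              SOURCE
secret  keysource  passphrase,prompt  local
[paulie@adrenaline ~]# zfs key -c secret
Enter new passphrase for 'secret': 
Enter again: 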
For more information visit:
http://docs.sun.com/app/docs/doc/821-1448/gkkih

Friday Jul 25, 2008

ZFS NAS with the Intel Atom

I've been looking to build a network attached storage (NAS) device for some time to store all my music, photos, films, etc. in one global location. I had a few specific requirements that were as follows:

  • Supports two disks (2TB RAID mirroring preferable)
  • Supports UPnP server (media sharing)
  • Low power (always on)
  • Bonus points: ZFS, SSH, Samba, etc.

    There are a few ready-to-go NAS solutions on the market that meet my requirements. The Linksys NAS200, at $130, supports two disks with the Twonky Media server for UPnP and is low power. The downside is that reviews indicate it is very slow, and it lacks gigabit Ethernet. On the high end, the QNAP TS-209 Pro also comes with two disk slots and UPnP support, is fast, and includes a bunch of extra goodies like Samba and an iTunes server. Its $400 price tag makes it a bit too much for a diskless system, so I decided on a different solution... why not build it myself?

    I had a Thermaltake LANBOX Lite Mini-ITX/ATX case lying around along with a 200W power supply, so all I needed to do was find some cheap, low-power hardware to go in it. Intel recently released a new type of processor called the Atom, aimed at bringing x86 into the embedded market. I'm not so sure how successful they will be at this venture, but it fits my needs perfectly. According to the specs, the 1.6GHz version draws 2W TDP, which is pretty amazing. The power figure turns out to be a bit of marketing hype, but considering the Atom is now being used in the ASUS Eee desktops and laptops, it should prove to be a viable candidate. To encourage the hobbyist market, Intel created a motherboard + Atom combo (BOXD945GCLF) that is available on Newegg for $75. After getting it along with a pair of 1TB drives at $170, I had a workable system shipped to me for under $500.

    My first reaction when I got all the parts was that the Intel board is TINY. I can fit my hand around the entire thing. Even in the media center case that I use, there is quite a bit of extra room.

    The LANBOX Lite is fully modular which makes installation much easier. If you have ever built a computer from scratch, then you probably understand how tedious it can be to screw in motherboards, install drives, etc. This case allows you to pull every section out to make installation a breeze. It also comes with a nice silo to install both the terabyte drives.

    The next step is to turn this into a fully functional NAS. After installing FreeBSD 7.0, I wanted to set up both my disks to mirror. Since the BSD family has such a friendly license, ZFS is included in the distribution, and ZFS mirroring makes things incredibly simple to set up.
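    Had I been starting from scratch, a mirrored pool could have been created in a single command (a sketch; 'tank' and the device names match the status output below):
    [root@bojangles ~]# zpool create tank mirror ad4s1d ad6s1d

    Since a single-disk pool named tank already existed on ad4s1d, I simply attached the second disk: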
    [root@bojangles ~]# zpool attach tank ad4s1d ad6s1d
    
    And that's it! Now to ensure that things went as expected.
    [root@bojangles ~]# zpool list
    NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
    tank                    928G   72.9G    855G     7%  ONLINE     -
    [root@bojangles ~]# zpool status
      pool: tank
     state: ONLINE
     scrub: none requested
    config:
    
    	NAME        STATE     READ WRITE CKSUM
    	tank        ONLINE       0     0     0
    	  mirror    ONLINE       0     0     0
    	    ad4s1d  ONLINE       0     0     0
    	    ad6s1d  ONLINE       0     0     0
    
    errors: No known data errors
    
    Now with ports, I can get Samba and uShare up and running in no time.
    [root@bojangles ~]# cd /usr/ports/net/samba3 && make install clean
    [root@bojangles ~]# cd /usr/ports/net/ushare && make install clean
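
    To have both services start at boot, entries along these lines go in /etc/rc.conf (a sketch; the exact rc.conf variable names depend on each port's rc script):
    [root@bojangles ~]# echo 'samba_enable="YES"' >> /etc/rc.conf
    [root@bojangles ~]# echo 'ushare_enable="YES"' >> /etc/rc.conf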
    
    The final product in my closet in a makeshift cabinet.

About

Hiya, my name is Paul Johnson and I'm a software engineer working on the Oracle ZFS Storage Appliance.
