Tuesday Apr 21, 2009

Try Postgres 8.4 Beta1 using OpenSolaris Appliance for VirtualBox

Postgres 8.4 Beta1 community binaries are now for OpenSolaris 2008.11. The Beta1 binaries for OpenSolaris can be downloaded from postgresql.org binary location. Postgres 8.4 binaries for Solaris 10 are also available.

For people who don't have OpenSolaris installed on their laptop but want to try out the new improved DTrace Probes in Postgres 8.4beta1, you can install the Minimal OpenSolaris Appliance OVF image for VirtualBox 2.2 and install the Postgres 8.4beta1 binaries in the appliance to try it out. You can also use the DTrace probes on your Mac OS X too.

Easiest way to install the binaries on the OpenSolaris Appliance is to first install SUNWwget package from the OpenSolaris repository

pkg install SUNWwget

and then using copy the download mirror url for those binaries using http and download it with wget in the appliance

wget "http://wwwmaster.postgresql.org/redir/198/h/binary/v8.4beta\\

The community binaries typically should be untarred in /opt.

bzcat postgresql-8.4beta1-opensolaris.i386-32.tar.bz2 |tar -xf -

This will then have the binaries in /opt/postgres/8.4beta1/. If you also untar the 64-bit binaries then the the 64-bit binaries are available from /opt/postgres/8.4beta1/64.

One thing that I have noticed with these binaries that it does not pick up the libraries if installed in /opt by default so depending on the type of bits you may need to set the following

LD_LIBRARY_PATH=/opt/postgres/8.4beta1/lib; export LD_LIBRARY_PATH


LD_LIBRARY_PATH_64=/opt/postgres/8.4beta1/lib/64; export LD_LIBRARY_PATH_64

Beyond that everything should work as you would expect. Well almost... One thing to also note is that the new 8.4 GUC parameter effective_io_concurrency to allow readahead for bitmap scan index scans is disabled on OpenSolaris / Solaris 10.

 If you do find something that doesn't seem to work, please feel free to leave comments.


Wednesday Apr 15, 2009

Lessons with OpenSolaris Appliances

Going through the comments for Minimal OpenSolaris Appliance entry,  I thought I will go over the problems I encountered when I was working on the create_solaris_appliance.sh script and what I think we can do to improve OpenSolaris in those areas.

1. Setting up the disk

This is probably the first thing that most appliance creators will have to do is to format a new media before it is usable by OpenSolaris. Now to make a disk usable by OpenSolaris (on x86/x64 platforms) two things needs to be done, one a primary Solaris partition needs to be created using fdisk and then a regular Solaris VTOC needs to be initialized on the Solaris partition. While the experience is bit easier with the interactive option of the commands, putting it in a script can be challenging.

Fortunately fdisk has -B option. From the man page:

         Default to one Solaris partition  that  uses  the  whole
         disk. On an x86 machine, if the disk is larger than 2 TB
         (terabytes), the default size of the  Solaris  partition
         will be limited to 2 TB.

Hence I could use the following easily in my script:

fdisk -n -B /dev/rdsk/${INSTALLDISK}p0

Unfortunately I saw no such option for fmthard. Infact it made it more difficult since you need to enter the geometry information of the target disk. I took the easy way out by finding the same for the default VirtualBox disk size which is 16GB and using it as follows:

fmthard -d 0:2:00:48195:33447330 /dev/rdsk/${INSTALLDISK}p0

There are some scripting ways to work this around as Vikram Datta commented on my earlier entry:

  SecCnt=`prtvtoc /dev/rdsk/${INSTALLDISK}p0 | awk '/sectors\\/cylinder/ { print $2 }'`
  LastSect=`prtvtoc /dev/rdsk/${INSTALLDISK}p0 | awk '$1 == "2" { print $5 }'` 
  LastSect=`expr $LastSect - $SecCnt` 
  fmthard -d 0:2:00:${SecCnt}:${LastSect} /dev/rdsk/${INSTALLDISK}p0 

But I think it should be as easy as  fdisk.. i.e. doing the following:

fmthard -B /dev/rdsk/${INSTALLDISK}p0

Hence I have filed a new RFE 6829475. I think this RFE is useful not just for my script but in general helps improve usability of the command to a new learner of OpenSolaris.

2. Setting up the ZPool

The next step of creating zpool for the root device was pretty straight forward

zpool create -f rpool ${INSTALLDISK}s0
zfs set compression=on rpool 
zfs create -o mountpoint=legacy rpool/ROOT
zfs create -o mountpoint=$PKG_IMAGE rpool/ROOT/VOSApp
zfs create -V 128M rpool/swap
zfs create -V 16M rpool/dump
zfs create rpool/ROOT/VOSApp/opt
zfs create rpool/ROOT/VOSApp/var
zfs create rpool/export
zpool set bootfs=rpool/ROOT/VOSApp rpool

Here I took liberty of separating out /opt /var into separate dataset so that I can enable zfs snapshots just for "optional" and "variable" data of the applications. This is a point of view of deployment. Your view may be different here.

3. Setting up the packages from OpenSolaris repository

The next step on how to use OpenSolaris repository to install the pacakges. Alex Eremin had great pointers on his blog that I adapted . The initial setup can be easily done by exporting the PKG_IMAGE environment variable to the directory where you are currently mounting the zpool and then using pkg image-create command.

pkg image-create -f -F -a opensolaris.org=http://pkg.opensolaris.org/release $PKG_IMAGE
pkg refresh

Then  I played with the package list over and over again to get a minimal size with most network adapter drivers to get on the internet and all required pacakges to allow "pkg" command to work. Of course this is the piece that took me quite a bit of trial and error to figure out the right mix of pacakges so OpenSolaris does boot up successfully and allow "pkg" to run successfully.

pkg install SUNWcsd 
pkg install SUNWcs 
pkg install SUNWcar SUNWcakr SUNWkvm SUNWos86r SUNWrmodr \\
 SUNWpsdcr SUNWpsdir SUNWcnetr SUNWesu SUNWkey SUNWnfsckr \\
 SUNWnfsc SUNWgss SUNWgssc SUNWbip SUNWbash SUNWloc SUNWsshcu \\
 SUNWsshd SUNWssh SUNWtoo SUNWzfskr SUNWipf SUNWintgige SUNWipkg \\
 SUNWadmr SUNWadmap SUNWPython SUNWperl584core SUNWgrub SUNWxcu6\\
 SUNWxcu4 SUNWgawk SUNWgtar SUNWgnu-coreutils SUNWscp SUNWfmd \\
 SUNWxge SUNWbge SUNWnge SUNWrge SUNWrtls \\
 SUNWixgb SUNWchxge SUNWzfs-auto-snapshot SUNWsolnm

I did realize one thing.. Loading one package at a time with pkg is awfully slow. By putting all the packages within one line (except for SUNWcsd and SUNWcs) I could cut down the time from hours to minutes. This was my "Eureka" moment when I could install the packages to a bare-metal .. well a bare-virtualbox VDI within 10 minutes.

4. Setting up the SMF Database on the system

cp $PKG_IMAGE/lib/svc/seed/global.db $PKG_IMAGE/etc/svc/repository.db
chmod 0600 $PKG_IMAGE/etc/svc/repository.db
chown root:sys $PKG_IMAGE/etc/svc/repository.db
# setup smf profiles
ln -s ns_files.xml $PKG_IMAGE/var/svc/profile/name_service.xml
ln -s generic_limited_net.xml $PKG_IMAGE/var/svc/profile/generic.xml
ln -s inetd_generic.xml $PKG_IMAGE/var/svc/profile/inetd_services.xml
ln -s platform_none.xml $PKG_IMAGE/var/svc/profile/platform.xml
# Set the environment variables for svccfg.
${SVCCFG} import ${PKG_IMAGE}/var/svc/manifest/milestone/sysconfig.xml 

Again this is one area I think OpenSolaris can improve a bit. It does some amount of research (google, Alex Eremin) before understanding how to set it up properly. (But then I am not a kernel engineer.)

5. Other Miscellaneous but important stuff to get a bootable system

The following are basically hacks in some ways to get to a bootable system. 

# Set TimeZone 
echo $HOSTNAME > $PKG_IMAGE/etc/nodename 
# configure our new /etc/vfstab
printf "rpool/ROOT/VOSApp -\\t/\\t\\tzfs\\t-\\tno\\t-\\n" >> $PKG_IMAGE/etc/vfstab
printf "/dev/zvol/dsk/rpool/swap\\t-\\t-\\t\\tswap\\t-\\tno\\t-\\n" >> $PKG_IMAGE/etc/vfstab
chmod a+r $PKG_IMAGE/etc/vfstab
# turn off root as a role
printf "/\^root::::type=role;\\ns/\^root::::type=role;/root::::/\\nw" |\\
ed -s $PKG_IMAGE/etc/user_attr
# Edit etc/ssh/sshd_config to allow ssh to root account
printf "/\^PermitRootLogin no\\ns/\^PermitRootLogin no/PermitRootLogin yes/\\nw" |\\
ed -s ${PKG_IMAGE}/etc/ssh/sshd_config 
# Generate ssh keys
ssh-keygen -t dsa -f $PKG_IMAGE/etc/ssh/ssh_host_dsa_key -N ''
ssh-keygen -t rsa -f $PKG_IMAGE/etc/ssh/ssh_host_rsa_key -N ''

6. Finally the boot archive and  grub

# configure /dev in the new image
devfsadm -R $PKG_IMAGE
bootadm update-archive -R $PKG_IMAGE
$PKG_IMAGE/boot/solaris/bin/update_grub -R $PKG_IMAGE
# For zfs root, menu.lst has moved to /rpool/boot/grub/menu.lst. # 
# create the new real grub menu
cat <⁞<-EOF > /rpool/boot/grub/menu.lst
default 0
timeout 10
splashimage /boot/grub/splash.xpm.gz
title  Appliance based on OpenSolaris 2008.11 
findroot (pool_rpool,0,a)
bootfs rpool/ROOT/VOSApp
kernel\\$ /platform/i86pc/kernel/\\$ISADIR/unix  -B \\$ZFS-BOOTFS
module\\$ /platform/i86pc/\\$ISADIR/boot_archive
# make the grub menu files readable by everyone.
chmod a+r $PKG_IMAGE/boot/grub/menu.lst
chmod a+r /rpool/boot/grub/menu.lst
# setup /etc/bootsign so that grub can find this zpool
dir -p /rpool/etc>
echo pool_rpool > /rpool/etc/bootsign
zfs set mountpoint=/ rpool/ROOT/VOSApp
zfs set compression=off rpool 

Hope this makes it easier for  someone thinking of making their own appliances based on OpenSolaris.

NOTE: There was fair number of people who did download the images. I did accidently lose the OpenSolaris image once after it was downloaded 75 times.  But you can track the images directly at mediacast.sun.com/tags/appliance

Sunday Apr 12, 2009

Openbravo ERP 2.40 Appliance using Postgres 8.3 appliance with OpenSolaris OVF

Few days ago I talked about a Postgres 8.3 Appliance based on OpenSolaris. Today lets look at how to use that appliance image to get an Openbravo ERP 2.40 appliance based on OpenSolaris in VirtualBox.

Download the Postgres 8.3 Appliance OVF image and unzip the two files. Fire up VirtualBox 2.2 and use File->Import Appliance and point it to the .ovf file  from the zip file. Change the networking from NAT to "Bridged Network" and start the VM and soon you get "postgresdb login:" screen.  Use root/opensolaris to login into the system and verify that postgres instance is already running as follows:

# svcs -a |grep postgres
disabled       19:08:00 svc:/application/database/postgresql_83:default_64bit
online         19:08:23 svc:/application/database/postgresql_83:default_32bit

The default options of postgresql.conf are pretty low so bump them up slightly

# vi /var/postgres/8.3/data/postgresql.conf


# svcadm restart svc:/application/database/postgresql_83:default_32bit

 Now import other required dependencies for Openbravo ERP 2.40

# pkg install SUNWj6dev SUNWant SUNWtcat
DOWNLOAD                                    PKGS       FILES     XFER (MB)
SUNWj6dev                                    0/4     25/4756    1.08/84.90

Make sure that your newly installed tomcat setup has a valid server.xml file or copy it from an example file included. 

# cp /var/apache/tomcat/conf/server.xml-example /var/apache/tomcat/conf/server.xml

 Now download Openbravo ERP 2.40 installer as follows:

# pkg install SUNWwget

# wget "http://voxel.dl.sourceforge.net/sourceforge/openbravo/OpenbravoERP_2.40-solaris-intel-installer.bin"

# chmod a+x OpenbravoERP_2.40-solaris-intel-installer.bin

# ./OpenbravoERP_2.40-solaris-intel-installer.bin

 And use the following options:

  • /opt/OpenbravoERP | /var/OpenbravoERP/AppsOpenbravo/attachments
  • Complete |Standard | /usr/jdk/latest | /usr/bin/ant 
  • /var/apache/tomcat
  • PostgreSQL
  • /usr/postgres/8.3/bin
  • localhost     5432
  • (Enter password for postgres user as "postgres" twice)
  • openbravo    tad     (Enter password for tad user  twice)
  • Context name: openbravo
  • Date format: DD MM YYYY, Date Separator -, Time format 24h, Time Separator :
  • Demo data: Y or N depending on your preferences

After the information the installation GUI takes quite a bit of time to complete specially if you select to load the demo data. (Hope you made changes to PostgreSQL before to tune this loading.)

Once the installation completes  start tomcat as follows

# /usr/apache/tomcat/bin/startup.sh

Now from any other machine (or host machine) fire a browser and enter the IP address of the VM with port 8080 and uri openbravo and you now have a virtual VM with Openbravo running


The login screen for Openbravo should appear. Use Openbravo as username and openbravo (all lower case) as password to login and set it up for your business.

Wednesday Apr 08, 2009

Postgres 8.3 Appliance for VirtualBox 2.2

With the release of VirtualBox 2.2 today, exporting and importing appliances becomes lot easier. The new features of VirtualBox 2.2 includes import and export of appliances based on OVF or Open Virtualization Format. 

One of  my earlier blog entry talks about creating appliances. Using the same script mentioned in the earlier blog entry,  I now have a OVF Image  which will work with VirtualBox 2.2  "Import Appliance"  and make it easier to try a virtual PostgreSQL 8.3 database appliance running OpenSolaris under the cover.

The PostgreSQL 8.3 Appliance Image is less than 330MB (unlike a full install of OpenSolaris which could take more than 1500MB).  Just unzip the files in a directory and import them through VirtualBox 2.2 using File->Import Appliance Wizard.  Since an appliance (database in this case) is typically accessed through external application servers, you will want to use Bridge Network instead of NAT.  Detailed instructions to make it accessible (both VirtualBox VM as well as PostgreSQL 8.3)  from external clients are published on Postgres 8.3 Appliance page.

Finally  it is lot easier to finally see the following grub menu on your own VirtualBox 2.2:

 Also for PostgreSQL users/developers who are not familiar with OpenSolaris,  this VM  allows you to try the DTrace probes and ZFS snapshots with PostgreSQL 8.3. More information related to it is  coming soon on Postgres 8.3 Appliance page. Also the PostgreSQL DTrace Toolkit 2009.03.29 is available on PGFoundry.org.

Tip: You can also use my PostgreSQL Monitor Demo to connect to it:

As usual if you have questions, I will be happy to answer them.

Sunday Apr 05, 2009

Effects of Flash/SSD on PostgreSQL - PostgreSQL East 2009

I presented a talk at PostgreSQL East 2009 at Drexel University today morning.  The topic was "Effects of Flash/SSDs on PostgreSQL".  Here are the slides from the presentation

If you have questions please leave comments.

Wednesday Mar 11, 2009

Thundering Herd of Elephants for Scalability

During last year's PgCon 2008 I had presented about "Problems with PostgreSQL on Multi-core Systems".  On slide 14 I talked about the results with IGEN with various think times and had identied the problem of how it is difficult to scale with increasing number of users. The chart showed of how 200ms think time tests will saturate about 1000  active users and then throughput starts to fall.  On slide 16 & 17  I identified ProcArrayLock as the culprit of why scalability tanks with increasing number of users. 

Today I was again intrigued by the same problem as I was trying out PostgreSQL 8.4 builds and once again hit the bottleneck at about 1000 users and frustrated that PostgreSQL cannot scale even with only 1 active socket (64 strands)  of Sun Enterprise SPARC T5240 which is a 2 socket UltraSPARC T2 plus system.  

Again I was digging through the source code of PostgreSQL 8.4 snapshot to see what can be done. While  reviewing lwlock.c I thought of a change  and quickly  changed couple  of lines in the code and recompiled the database binaries and re-ran the test. The results are as show below (8.4 Vs 8.4fix)

The setup was quite standard (as how I test them). The database and log files are on RAM (/tmp). The think times were about 200ms for each user between transactions.  But the results was quite pleasing. On the same system setup,  TPM throughput went up about 1.89x up and response time now shows a more gradual increase rather than a knee-jerk response.

So what was the change in the source code? Well before I explain what I did, let me explain what is PostgreSQL trying to do in that code logic.

After a process acquires a lock and does its critical work, when it is ready to release the lock, it finds the next postgres process that it needs to wake up so effectively not causing starvation for processes trying to do exclusive locks  and also I think it is trying to avoid a Thundering Herd problem. It is a good thing to do for single core systems since CPU resources are sacred and effective usage results in better performane. However on systems with many CPU resources (say like 256 threads of Sun SPARC Enterprise T5440) this ends up artificially bottlenecking since it is not the system but the application determining which next process should wake up and try to get the lock.

The change I did was discard the selective process waiter wake-up and just wake up all waiters waiting for that lock and let the processes, OS Dispatcher, and CPU resources do its magic on its own way (and that worked,  worked very well.)

          //if (!proc->lwExclusive)
           if (1)
                            while (proc->lwWaitLink != NULL &&
                                          // !proc->lwWaitLink->lwExclusive)
                                       proc = proc->lwWaitLink;

Its amazing that last year I had tried so many different approaches to the problem and a mere simple approach proved to be more effective.

I have put a proposal to the PostgreSQL Performance alias that a postgresql.conf tunable be defined for PostgreSQL 8.4 so people can tweak their instances to use the above method of awaking all waiters without impacting the existing behavior for most existing users.

In the meantime if you do compile your own PostgreSQL binaries, try the workaround if you are using PostgreSQL on ay 16 or more cores/threads system and provide feedback about the impact on your workloads.

Thursday Mar 05, 2009

PostgreSQL Conference East 2009 & upcoming 8.4 beta

Sometimes you need a kickstart to get one moving from inertia. Considering that I haven't blogged in 2009 at all, I welcome the PostgreSQL Conference East 2009  email reminders to get me going again.  That will motivate me to do some targetted work with PostgreSQL for the community. I will be presenting on SSDs and PostgreSQL during the event. Unlike Robert Treat who can pretty much prepare all his slides during the lunch break of the conference, some of us require more time.  As a reminder PostgreSQL East 2009 registration has already started. 

 I am also looking forward to PostgreSQL 8.4 beta and test drive it on 128-threaded Sun SPARC Enterprise  T5240.

Tuesday Dec 16, 2008

PostgreSQL 8.3 Appliance based on OpenSolaris using VirtualBox

Based on Alex Eremin's blog entry on minimal script and some of my own hacking,  I have a script that should allow to create a basic PostgreSQL 8.3 Appliance using OpenSolaris 2008.11 Kernel  in Virtualbox easily with an image which is not bloated (No X, Gnome, etc).

Here is how to try it out:

  • For the time being this requires the  OpenSolaris 2008.11 CD image (650+ MB) only to execute my script create_pg_appliance.sh which is about 6KB. Bear with me till I solve this problem and play along :-)
  • Download Virtualbox and install it
  • Create a New Virtual Box VM with following parameters in the Wizard Screen
    • Call it PostgreSQL 8.3 Appliance
    • Select OS Type "OpenSolaris" 
    • Base Memory  768MB or increase it to 1GB if you can spare your RAM for it
    • Create a New Dynamic Expanding Image with exactly 16.00 GB (Any other wise may not work)
  • Once the VM is created, immediately  click the blue Network link and modify it to select a "Host Based Network"  (in the setup make sure it is connected to the active host interface - wired or wireless depending on the host system)
  • Also Click the CD-ROM image and point it to the osol-200811.iso image that you downloaded earlier
  • Boot up the VM and select the first LiveCD option in the Grub Menu options
  • Select through the defaults to get the full Gnome Desktop
  • Open Firefox in the VM and make  sure your VM  has internet access
  • Open a terminal window and execute the following commands in sequence
    • wget "http://blogs.sun.com/jkshah/resource/create_pg_appliance.sh"
    • pfexec sh -x create_pg_appliance.sh
    • pfexec reboot
  • At this point stop the VM and disconnect the CD image connected to the VM and boot again or  select the  grub entry
    • Boot from Hard disk
  • Once the VM boots from the hard disk, select the  grub menu item
    • PostgreSQL 8.3 Appliance based on OpenSolaris 2008.11
  • The Operating system boots up to give a login prompt. Login as root/opensolaris
  • The PostgreSQL database server is already running with PGDATA in /var/postgres/8.3/data
  • Modify postgresql.conf  to add  listen-address='\*' parameter
  • Modify pg_hba .conf to  allow your clients in pg_hba.conf
  •  Restart the postgresql server instance as follows:
    • svcadm restart postgresql_83:default_32bit
  • Now the PostgreSQL Appliance is ready to be connected from your clients.
  • Remember to take snapshots of rpool/VOSApp/var so you can always revert back to earlier copies if you ever need it.
Image of Grub Menu

Note: In a trial installation on a MacBook Pro the script executed in  about 11 minutes which  includes the time it takes to download packages from pkg.opensolaris.org.  I thought that was fantastic.

Maybe I should also create a minimized LiveCD just to execute the script on Virtualbox.  If you have problems executing the script let me know.

Wednesday Dec 10, 2008

OpenSolaris 2008.11 and PostgreSQL 8.3

The second supported version of OpenSolaris called OpenSolaris 2008.11 is now officially launched. With the release, I guess now PostgreSQL 8.3 is now "officially" supported on OpenSolaris too (though you could have used it prior  anyway with the development releases of OpenSolaris after the initial OpenSolaris 2008.05 was released).

 Anyway let's quickly look at some tips and how-to which I think could be useful to improve the experience of PostgreSQL 8.3 with OpenSolaris 2008.11.  One way to take a quick drive of OpenSolaris 2008.11 is to also try out the new VirtualBox 2.0.6 and install OpenSolaris 2008.11 in a VM. (Note: If you do plan to access the VM from other systems then use Host Interface instead of NAT to make it easy to access it from outside.)

PostgreSQL 8.3 is not already installed once you install OpenSolaris 2008.11 from the LiveCD. This gives an opportunity for a tweak that I would highly recommend. By default OpenSolaris 2008.11 does not have option for separate ZFS dataset for /var (like Solaris 10 10/08 - U6 does). I would certainly like my PostgreSQL database to be on a separate dataset that I can control PostgreSQL snapshots  independently. Since I know the default PostgreSQL database path for PostgreSQL 8.3 on OpenSolaris is /var/postgres/8.3 I generally create a separate dataset as follows before I use pkg script to install PostgreSQL 8.3

# zfs create -o mountpoint=/var/postgres rpool/postgres

Now I am ready to install the various packages of PostgreSQL 8.3

# pkg install SUNWpostgr-83-server

# pkg install SUNWpostgr-83-client SUNWpostgr-jdbc SUNWpostgr-83-contrib

# pkg install SUNWpostgr-83-docs  SUNWpostgr-83-devel

# pkg install SUNWpostgr-83-tcl SUNWpostgr-83-pl

Now to install PGAdmin3 as follows:

# pkg install SUNWpgadmin3

Even pgbouncer is available in the repository

# pkg install SUNWpgbouncer-pg83

Now that all binaries are installed, let's look on how to get started. Generally the best way to start and stop PostgreSQL 8.3 server is via the svcadm command of OpenSolaris. The SMF manifest for PostgreSQL 8.3 is installed with SUNWpostgr-83-server, however the import of the script does not happen till the next reboot. We can always work that around by doing a quick manual update as follows:

# svccfg import /var/svc/manifest/application/database/postgresql_83.xml

This creates two entries one for 32-bit server instance and one for 64-bit server instance.

# svcs -a |grep postgres
disabled        1:14:44 svc:/application/database/postgresql_83:default_64bit
disabled        1:14:45 svc:/application/database/postgresql_83:default_32bit

Depending on your server/choice you can start the corresponding server. Or 32-bit instance may be just easier to select if there are doubts.

# svcadm enable postgresql_83:default_32bit

# svcs -a |grep postgres
disabled        1:14:44 svc:/application/database/postgresql_83:default_64bit
online          2:12:37 svc:/application/database/postgresql_83:default_32bit

# zfs list rpool/postgres
rpool/postgres  30.4M  11.8G  30.4M  /var/postgres

The client psql is still not in the default path. The path /usr/postgres/8.3/bin should be in your search PATH. (or if you are using 64-bit add /usr/postgres/8.3/bin/64).

# /usr/postgres/8.3/bin/psql -U postgres postgres
Welcome to psql 8.3.4, the PostgreSQL interactive terminal.

Type:  \\copyright for distribution terms
       \\h for help with SQL commands
       \\? for help with psql commands
       \\g or terminate with semicolon to execute query
       \\q to quit


For people who are using PostgreSQL 8.2 on OpenSolaris 2008.05 already, there is a way to get to PostgreSQL 8.3. First you have to update your OpenSolaris 2008.05 to OpenSolaris 2008.11 image using

# pkg image-update

Then install PostgreSQL 8.3 binaries as above and also install an additional package

# pkg install SUNWpostgr-upgrade

If there are more question feel free to leave a comment.

Wednesday Nov 12, 2008

Snapshots with PostgreSQL and Sun Storage 7000 Unified Storage Systems

Continuing on my earlier blog entry on setup of PostgreSQL using Sun Storage 7000  Unified Storage (Amber Road), let's now look at how to take snapshots for such systems. 

Considering a typical PostgreSQL system there are two ways to take backup: 

Cold or Offline backup: Backing of PostgreSQL while PostgreSQL server is  not running. The advantage of such a backup is that it is simple to achieve the purpose of backing up the database. However the disadvantage is that the database system is unavailable for work.  The general steps for doing a cold backup is, shutdown gracefully the PostgreSQL server and backup all the files including $PGDATA, pg_xlog if it is in different location and all the tablespaces location used for that server instance. After backup the files, start the PostgreSQL server again and you are done. In case if you ever want to go back to that earlier backup version, you shutdown the database, restore all the old files and restart the database.  This is essentially the same logic which we will follow with snapshots except if you now have a multi gigabyte database, the snapshot operation  will look like million times faster than having to do a "tar" or "cpio" (specially if you consider that "tar" is a single threaded application doing its best but still slow if you use it on multi-core systems).

Lets consider an example on how to take a snapshot with OpenStorage systems with Cold or Offline backup strategy.

Assume that Sun Storage 7000  Unified Storage System  has the  hostname "amberroad" (Sun Storage 7000 Unified Storage Systems is mouthful ) have a project defined "postgres" which has three devices exported "pgdata" which holds $PGDATA, "pglog" which holds $PGDATA/pg_xlog and "pgtbs1" which is one of the tablespaces defined.

Then a backup/snapshot script  (as postgres user ) will look something similar to the following (Disclaimer:  Provided AS-IS. You will have to modify it for your own setups):

echo "---------------------------------------------------------------"
echo "Shutting down the PostgreSQL database..."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA stop
echo "---------------------------------------------------------------"
echo "Deleting earlier snapshot called testsnap.."
echo "---------------------------------------------------------------"
ssh -l root amberroad confirm shares select postgres snapshots select testsnap destroy
echo "---------------------------------------------------------------"
echo "Taking a new snapshot..."
echo "---------------------------------------------------------------"
ssh -l root amberroad shares select postgres snapshots snapshot testsnap
echo "---------------------------------------------------------------"
echo "Verifying the snapshot.."
echo "---------------------------------------------------------------"
ssh -l root amberroad shares select postgres snapshots show
ssh -l root amberroad shares select postgres snapshots select testsnap show
echo "---------------------------------------------------------------"
echo "Restarting the database.."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA start -l $PGDATA/server.log
echo "---------------------------------------------------------------"

Note: The testsnap should not exist otherwise snapshot operation will fail and hence we delete any earlier snaps called testsnap in the script. You may modify the script to suit your priorities.

Now in case you want to restore to the snapshot "testsnap" version
you can use a sample script as follows:

echo "Restoring the database"
echo "---------------------------------------------------------------"
echo "Shutting down the database..."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA stop
echo "---------------------------------------------------------------"
echo "Restoring the snapshot.."
echo "---------------------------------------------------------------"
ssh -l root amberroad confirm shares select postgres select pgdata snapshots \\
select testsnap rollback
ssh -l root amberroad confirm shares select postgres select pglog snapshots \\
select testsnap rollback
ssh -l root amberroad confirm shares select postgres select pgtbs1 snapshots \\
select testsnap rollback
echo "---------------------------------------------------------------"
echo "Restarting the database.."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA start -l $PGDATA/server.log
echo "---------------------------------------------------------------"
echo "Database restored to earlier snapshot: testsnap."
echo "---------------------------------------------------------------"

NOTE: When you use the above script, all snapshots taken after “testsnap” are lost as they are rolled back to the point in time when “testsnap” was taken. There is another feature to clone the snapshot incase if you don't want to use rollback.

Hot or  Online backup:  A Hot or Online backup is when one takes a backup while PostgreSQL server is running. The advantage is that the database server is available. The disadvantage is that the setup is more complex than the Cold or Offline backup strategy. The recommended way to use Hot or Online backup with PostgreSQL is to use it in conjuction with PITR - Point in Time Recovery feature of PostgreSQL.

This can be achieved by turning on continuous WAL archiving in PostgreSQL and using   SELECT pg_start_backup('label');before taking the snapshot and then issuing SELECT pg_stop_backup(); after the snapshot is completed.

For this you will also need another dedicated device under the "postgres" project to just hold the WAL archive files. (In my case I call it "pgwal" and mounted as /pgwal/archive 

To turn on continuous WAL archiving you need to set the following variables in postgresql.conf (and then restart the PostgreSQL server):

archive_mode = true
archive_command = 'test ! -f /var/lib/postgres/8.2/data/backup_in_progress || cp -i %p /pgwal/archive/%f < /dev/null'

Then a sample backup/snapshot with hot backup  script can look as follows:

#PGDATA= /var/postgres/8.3/data; export PGDATA # if not set 
echo "---------------------------------------------------------------"
echo "Indicating PostgreSQL for a hot Snapshot.."
echo "---------------------------------------------------------------"
touch $PGDATA/backup_in_progress
psql  -c “select pg_start_backup('MyHotSnapShot');” postgres
echo "---------------------------------------------------------------"
echo "Deleting earlier snapshot.."
echo "---------------------------------------------------------------"
ssh -l root amberroad confirm shares select postgres snapshots select testsnap destroy
echo "---------------------------------------------------------------"
echo "Taking a new snapshot..."
echo "---------------------------------------------------------------"
ssh -l root amberroad shares select postgres snapshots snapshot testsnap
echo "---------------------------------------------------------------"
echo "Verifying the snapshot.."
echo "---------------------------------------------------------------"
ssh -l root amberroad shares select postgres snapshots show
ssh -l root amberroad shares select postgres snapshots select testsnap show
echo "---------------------------------------------------------------"
echo "Indicating to PostgreSQL Server that Hot Snapshot is done."
echo "---------------------------------------------------------------"
psql  -c “ select pg_stop_backup();” postgres
rm $PGDATA/backup_in_progress
echo "---------------------------------------------------------------"

Restoring from a hot backup snapshot is also bit tricky since we need to decide whether we need to roll forward and upto what point. It is probably beyond the scope of my post and hence I will just do a simple one with all files in the wal arhcives.

First of all restoring with PITR requires a recovery.conf file which should contain the path to the WAL Archives:

restore_command = 'cp /pgwal/archive/%f "%p"'

Also since we will be doing rollforward with the archive files,  we need to remove all the old pg_xlog files, create an archive status directory, remove the postmaster.pid file and an artifact of our earlier script, remove the backup_in_progress marker file too.

(We will do the above steps also as part  of our restore script.)

A restore snapshot script file can look something like as follows:

#PGDATA= /var/postgres/8.3/data; export PGDATA # if not set 
echo "Restoring the database"
echo "---------------------------------------------------------------"
echo "Shutting down the database..."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA stop
echo "---------------------------------------------------------------"
echo "Restoring the snapshot.."
echo "---------------------------------------------------------------"
ssh -l root amberroad confirm shares select postgres select pgdata snapshots \\
select testsnap rollback
ssh -l root amberroad confirm shares select postgres select pglog snapshots \\
select testsnap rollback
ssh -l root amberroad confirm shares select postgres select pgtbs1 snapshots \\
select testsnap rollback
#if there is reason to rollback walarchives then uncomment the next two lines
#ssh -l root amberroad confirm shares select postgres select pgwal snapshots \\
#select testsnap rollback
echo "---------------------------------------------------------------"
echo "Make sure pg_xlog/\* is empty ..."
echo "Make sure pg_xlog/archive_status exists..."
echo "Make sure postmaster.pid is removed"
echo "---------------------------------------------------------------"
rm $PGDATA/backup_in_progress
rm $PGDATA/postmaster.pid
rm $PGDATA/pg_xlog/\*
mkdir -p $PGDATA/pg_xlog/archive_status
echo "---------------------------------------------------------------"
echo "Set recovery.conf for recover.."
echo "---------------------------------------------------------------"
echo restore_command = \\'cp /pgwal/archive/%f "%p"\\' > $PGDATA/recovery.conf
echo "---------------------------------------------------------------"
echo "Restarting the database.."
echo "---------------------------------------------------------------"
pg_ctl -D $PGDATA start -l $PGDATA/server.log
echo "---------------------------------------------------------------"
echo "Database restored."
echo "---------------------------------------------------------------"

Of course the above scripts may not be best but gives an idea of how one can use snapshot features provided by Sun Storage 7000  Unified Storage Systems with PostgreSQL.


Jignesh Shah is Principal Software Engineer in Application Integration Engineering, Oracle Corporation. AIE enables integration of ISV products including Oracle with Unified Storage Systems. You can also follow me on my blog http://jkshah.blogspot.com


« April 2014