Monday Mar 02, 2009

New DTrace Probes in PostgreSQL 8.4

DTrace probes were introduced in PostgreSQL starting in 8.2. Initially, only a handful were added, and they were mostly for developers. In 8.4, many more (46 to be exact) have been added, and they are targeted more toward database administrators as shown below.

query-parse-start(const char *)
query-parse-done(const char *)
query-rewrite-start(const char *)
query-rewrite-done(const char *)
query-start(const char *)
query-done(const char *)
statement-status(const char *)
sort-start(int, bool, int, int, bool)
sort-done(unsigned long, long)
buffer-read-start(ForkNumber, BlockNumber, Oid, Oid, Oid, bool)
buffer-read-done(ForkNumber, BlockNumber, Oid, Oid, Oid, bool, bool)
buffer-flush-start(Oid, Oid, Oid)
buffer-flush-done(Oid, Oid, Oid)
buffer-sync-start(int, int)
buffer-sync-done(int, int, int)
buffer-write-dirty-start(ForkNumber, BlockNumber, Oid, Oid, Oid)
buffer-write-dirty-done(ForkNumber, BlockNumber, Oid, Oid, Oid)
checkpoint-done(int, int, int, int, int)
smgr-md-read-start(ForkNumber, BlockNumber, Oid, Oid, Oid)
smgr-md-read-done(ForkNumber, BlockNumber, Oid, Oid, Oid, const char *, int, int)
smgr-md-write-start(ForkNumber, BlockNumber, Oid, Oid, Oid)
smgr-md-write-done(ForkNumber, BlockNumber, Oid, Oid, Oid, const char *, int, int)
xlog-insert(unsigned char, unsigned char)
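
As one way to put the new probes to work, here is a minimal sketch that wraps a small D script in a shell function. It pairs query-start with query-done to build a latency distribution per query string. The helper name query_latency_script and the output file name are my own choices for illustration. The sketch also assumes that the provider is named postgresql and that you attach to a backend with dtrace -p so that $target is set.

```shell
#!/bin/sh
# Sketch: emit a D script that times each query between the
# query-start and query-done probes listed above.
# query_latency_script is a hypothetical helper name.
query_latency_script() {
    cat <<'EOF'
postgresql$target:::query-start
{
        self->query = copyinstr(arg0);   /* arg0 is the query text */
        self->ts = timestamp;
}

postgresql$target:::query-done
/self->ts/
{
        @latency[self->query] = quantize(timestamp - self->ts);
        self->ts = 0;
        self->query = 0;
}
EOF
}
```

To try it, save the output with "query_latency_script > query_latency.d" and, as root, run "dtrace -q -s query_latency.d -p PID" against a backend process. Press Ctrl-C to print the histograms.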

Documentation should be available on the 8.4 doc page soon, but if you don't want to wait, check out the doc patch I recently submitted. If you're using or plan to use the probes, I'd love to hear your feedback, both positive and constructive!

Special thanks to Theo Schlossnagle, Robert Treat, Zdenek Kotala, Alvaro Herrera and Simon Riggs for their contributions with the probes as well as reviewing them.

Wednesday Nov 19, 2008

Test driving Hyperic HQ 4.x

Hyperic just released Hyperic HQ 4.x, and I recently took it for a test drive on OpenSolaris 2008.05. The installation went smoothly, but I ran into one issue when starting the server. With a 64-bit kernel, the startup script assumes you have a 64-bit JRE on the system. In my case, the 64-bit JRE wasn't available, so the server startup failed. I could have installed the 64-bit version, but I decided to just comment out the check in the startup script to use the 32-bit version. Below are the details of the installation steps.

Before running the Hyperic HQ install script, set up the database first. Below are the commands I used for PostgreSQL. Run psql, and then run the following commands:

postgres=# create role admin with login createdb password 'hqadmin';
postgres=# CREATE DATABASE "HQ" OWNER admin;
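
If you'd rather script this step than type it into psql, a rough sketch follows. The helper name hq_db_sql is hypothetical, and it assumes the role name, password, and database name contain no quote characters, since it does no escaping.

```shell
#!/bin/sh
# Sketch: print the role and database setup SQL so it can be piped to psql.
# hq_db_sql is a hypothetical helper: hq_db_sql <role> <password> <database>
# No escaping is done, so the arguments must not contain quotes.
hq_db_sql() {
    role=$1 password=$2 db=$3
    printf "CREATE ROLE %s WITH LOGIN CREATEDB PASSWORD '%s';\n" "$role" "$password"
    printf 'CREATE DATABASE "%s" OWNER %s;\n' "$db" "$role"
}
```

For example, "hq_db_sql admin hqadmin HQ | psql -U postgres" would send the same two statements shown above to the server.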

Now, run the install script (assuming you've unpacked the download in /var/tmp) and note the answers I entered.

-bash-3.2# /var/tmp/hyperic-hq-installer/
Initializing Hyperic HQ 4.0.1 Installation...
Loading taskdefs...
Taskdefs loaded
Choose which software to install:
1: Hyperic HQ Server
2: Hyperic HQ Agent
You may enter multiple choices, separated by commas.
HQ server installation path [default '/home/hyperic']:
	1: Oracle 9i/10g
	2: PostgreSQL
	3: MySQL 5.x
What backend database should the HQ server use? [default '1']:
Enter the JDBC connection URL for the PostgreSQL database [default 'jdbc:postgresql://localhost:5432/HQ?protocolVersion=2']: Enter

Enter the username to use to connect to the database:
Enter the password to use to connect to the database: enter password here
(again): enter password again
HQ agent installation path [default '/usr/local/hyperic']: Enter

Loading install configuration...
Install configuration loaded.
Preparing to install...
Validating agent install configuration...
Validating server install configuration...
Checking server webapp port...
Checking server secure webapp port...
Checking server JRMP port...
Checking server JNP port...
Checking database permissions...
Verifying admin user properties
Validating server DB configuration...
Installing the agent...
Looking for previous installation
Unpacking agent to: /usr/local/hyperic/agent-4.0.1...
Setting permissions on agent binaries...
Fixing line endings on text files...
Installation Complete:
  Agent successfully installed to: /usr/local/hyperic/agent-4.0.1

 You can now start your HQ agent by running this command:

  /usr/local/hyperic/agent-4.0.1/bin/ start

Installing the server...
Unpacking server to: /usr/local/hyperic/server-4.0.1...
Creating server configuration files...
Copying binaries and libraries to server installation...
Copying server configuration file...
Copying server control file...
Copying server binaries...
Copying server libs...
Setting up server database...
Setting permissions on server binaries...
Fixing line endings on text files...
Installation Complete:
  Server successfully installed to: /usr/local/hyperic/server-4.0.1

 You can now start your HQ server by running this command:

  /usr/local/hyperic/server-4.0.1/bin/ start

 Note that the first time the HQ server starts up it may take several minutes
 to initialize.  Subsequent startups will be much faster.

 Once the HQ server reports that it has successfully started, you can log in
 to your HQ server at: 

  username: hqadmin
  password: hqadmin

 To change your password, log in to the HQ server, click the "Administration"
 link, choose "List Users", then click on the "hqadmin" user.

Setup completed.
A copy of the output shown above has been saved to:

At this point the installation is complete, and the server can be started with the following line. As you can see, though, HQ failed to start.

-bash-3.2# /usr/local/hyperic/server-4.0.1/bin/ start
Starting HQ server...
Initializing HQ server configuration...
Checking jboss jndi port...
Checking jboss mbean port...
Setting -d64 JAVA OPTION to enable SunOS 64-bit JRE
Booting the HQ server (Using JAVA_OPTS=-XX:MaxPermSize=192m -Xmx512m -Xms512m -d64)...
HQ failed to start
The log file /usr/local/hyperic/server-4.0.1/logs/server.out may contain further details on why it failed to start.

Looking at the script, I noticed that it checks whether you're running a 64-bit kernel. If so, it uses the 64-bit JRE. Since I only have a 32-bit JRE on my system, I commented out the following section in

#if [ $THISOS = "SunOS" ] ; then
#       ARCH=`isainfo -kv`
#       case $ARCH in
#               *64-bit*)
#                 echo "Setting -d64 JAVA OPTION to enable SunOS 64-bit JRE"
#                       HQ_JAVA_OPTS="${HQ_JAVA_OPTS} -d64"
#                       ;;
#       esac
#fi

After rerunning the script, the server started up properly.

-bash-3.2# /usr/local/hyperic/server-4.0.1/bin/ start
Starting HQ server...
Removing stale pid file /usr/local/hyperic/server-4.0.1/logs/
Initializing HQ server configuration...
Checking jboss jndi port...
Checking jboss mbean port...
Booting the HQ server (Using JAVA_OPTS=-XX:MaxPermSize=192m -Xmx512m -Xms512m)...
HQ server booted.
Login to HQ at:
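
Commenting out the check works, but a gentler alternative would be to add -d64 only when the JVM actually accepts it. The following is a sketch of that idea, not what the Hyperic script does. The JAVA variable and the helper name java_data_model_flag are assumptions of mine; HQ_JAVA_OPTS is the variable used in the section shown above.

```shell
#!/bin/sh
# Sketch: add -d64 only when the JVM actually accepts it.
# JAVA is an assumed variable naming the java binary to probe.
JAVA=${JAVA:-java}

# Print "-d64" if "$JAVA -d64 -version" succeeds, otherwise print nothing.
java_data_model_flag() {
    if "$JAVA" -d64 -version >/dev/null 2>&1; then
        echo "-d64"
    fi
}

# Possible use inside a startup script:
#   HQ_JAVA_OPTS="${HQ_JAVA_OPTS} $(java_data_model_flag)"
```

With only a 32-bit JRE installed, the probe command fails and no flag is added, so the server falls back to the 32-bit JVM without any manual editing.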

At this point you can just hit the server by pointing the browser to, and you should see the Portal Dashboard.

This is just a quick cheatsheet to get you started. The official installation/configuration instructions are available on the Hyperic website.

Tuesday Sep 30, 2008

Setting up MediaWiki with PostgreSQL 8.3

Setting up MediaWiki (I use version 1.13.1) with PostgreSQL 8.3 is quite straightforward. Thanks to Greg Mullane for fixing the issues.

Since tsearch2 is integrated into core in 8.3, you only need to run the following three commands before running the MediaWiki install script.

$ createuser -S -D -R -P -E wikiuser
$ createdb -O wikiuser wikidb
$ createlang plpgsql wikidb

With 8.2, you have to run the following commands in addition to the above.

$ psql wikidb < /usr/postgres/8.2/share/contrib/tsearch2.sql   # path for PostgreSQL on Solaris
$ psql -d wikidb -c "grant select on pg_ts_cfg to wikiuser;"
$ psql -d wikidb -c "grant select on pg_ts_cfgmap to wikiuser;"
$ psql -d wikidb -c "grant select on pg_ts_dict to wikiuser;"
$ psql -d wikidb -c "grant select on pg_ts_parser to wikiuser;"
$ psql -d wikidb -c "update pg_ts_cfg set locale = current_setting('lc_collate') where ts_name = 'default';"
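
If you maintain a setup script that has to handle both versions, the split above can be expressed as a small version test. This is only a sketch; the helper name needs_tsearch2_contrib is hypothetical, and it expects a version string such as 8.2.4 or 8.3.

```shell
#!/bin/sh
# Sketch: decide whether the extra tsearch2 steps are needed.
# Returns success (0) for servers older than 8.3, failure otherwise.
# needs_tsearch2_contrib is a hypothetical helper name.
needs_tsearch2_contrib() {
    major=${1%%.*}          # "8.2.4" -> "8"
    rest=${1#*.}            # "8.2.4" -> "2.4"
    minor=${rest%%.*}       # "2.4"   -> "2"
    [ "$major" -lt 8 ] || { [ "$major" -eq 8 ] && [ "$minor" -lt 3 ]; }
}

# Possible use:
#   if needs_tsearch2_contrib "$(pg_config --version | awk '{print $2}')"; then
#       ... run the tsearch2.sql and grant commands shown above ...
#   fi
```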

If you need to upgrade to PostgreSQL 8.3 from an older version of PostgreSQL, you may find the following links useful.

Thursday Apr 03, 2008

DTrace probes in PostgreSQL now work on Mac OS X Leopard

The issue with PostgreSQL's DTrace probes not working on Mac OS X Leopard, as reported on the mailing list, has been fixed and checked into the 8.4 development tree. The problem was that Leopard's DTrace implementation does not support the -G flag.

If you're curious about the gory details, check out the proposal, patch, and code commit.

With the new implementation, the steps for adding new probes are slightly different than before, but the provider and probe names remain the same. For details on how to use and add new probes, refer to the online doc.

Many thanks to Peter Eisentraut, Tom Lane, and Alvaro Herrera for their valuable feedback and assistance!

Monday Feb 04, 2008

Achieving High Availability with PostgreSQL using Solaris Cluster

Quick announcement: Sun has just released Solaris Express Developer Edition (SXDE) 1/08, a developer version that has all the latest and greatest Solaris features and tools. So what's new for PostgreSQL? The main updates/additions are 8.2.5, Perl driver, and pgAdmin III v1.6.3.

High availability (HA) is very important for most applications. PostgreSQL has an HA feature called warm standby or log shipping which allows one or more secondary servers to take over when the primary server fails. However, there are some limitations with PostgreSQL warm standby, two of which are:

  1. No mechanism in PostgreSQL to identify and perform automatic failover when the primary server fails
  2. No read-only support on the standby server

Fortunately, Solaris Cluster can be used to solve limitation #1. Solaris Cluster is a robust clustering solution that has been around for many years, and best of all it's open sourced and free.

Detlef Ulherr has recently implemented the Solaris Cluster agent to work with PostgreSQL warm standby. We discussed different possible use-case scenarios with PostgreSQL warm standby, and he came up with a design that I think will work well for non-shared storage clustering. Maybe not everything will be perfect initially, but as more people test out the agent, we'll know what can be improved. The agent is still in beta now and will be released soon, probably in a couple of months.

The cool thing with Solaris Cluster is that you can now set up a cluster on a single node using multiple Solaris Zones. This is extremely useful because it eliminates the need for multiple machines or complicated hardware setup if you just want to try it out or if you want a simple environment for doing development. Here are more details on the Solaris Cluster and Zones integration.

Of course you wouldn't want to deploy your HA application on a single machine. In a production environment, you should have at least a two-node cluster. Please refer to the Solaris Cluster documentation for more info.

In my recent test, I set up the cluster on a single machine with two Solaris Zones. Here's how the automatic failover works in a nutshell:

  • Client connects to a logical host. The logical host is configured to have two resource groups, each containing a Zone.
  • The logical host initially points to the primary server (Zone 1) where PostgreSQL is running, and PostgreSQL is configured to ship WAL logs to the secondary server (Zone 2) where PostgreSQL is running in continuous recovery mode.
  • When the primary server fails, the Solaris Cluster agent detects the failure and triggers the logical host to automatically switch to the IP address of the secondary server.
  • From the PostgreSQL client's perspective, it's still using the same logical host, but the actual PostgreSQL server has moved to a different machine. All of this happens transparently.
  • If the client was connecting to the DB on the primary, the session would be disconnected momentarily and reconnected automatically to the DB on the secondary server, and the application would continue on its merry way.
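
How the reconnect happens depends on the client. For a shell script that talks to the logical host, a small retry loop is one way to ride out the switchover. The sketch below is generic; the helper name retry, the logical host name pg-lh, and the database name mydb are all hypothetical.

```shell
#!/bin/sh
# Sketch: retry a command until it succeeds, as a client script might do
# while the logical host moves to the standby.
# retry is a hypothetical helper: retry <attempts> <delay-seconds> <command...>
retry() {
    attempts=$1 delay=$2
    shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1            # give up after the last allowed attempt
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Possible use (pg-lh and mydb are placeholder names):
#   retry 10 3 psql -h pg-lh -d mydb -c 'select 1'
```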

The combination of PostgreSQL warm standby and Solaris Cluster can provide an enterprise-class clustering solution for free (unless you need support services, of course). So, please try it out and provide your feedback on what can be improved.

In my next blog, I will discuss how Solaris ZFS and Zones can be used in a clever way to overcome limitation #2. This idea has been used already, and some of you may have seen Theo's blog on this topic. I will provide working sample code and step-by-step instructions for setting it up.

Wednesday Jan 16, 2008

Postgres and Solaris Virtualization

Many people today use virtualization technology to consolidate applications onto fewer, more powerful systems to improve system utilization and save datacenter space and power. One way to achieve this on Solaris is through Zones. Combined with Solaris Resource Management, a Solaris Zone provides a virtualized environment where resources such as CPU and memory can be controlled.

To demonstrate how this works, I will show a simple example of how to run PostgreSQL in a Solaris Zone and adjust the CPU cap as appropriate for this application. You can also restrict other resources such as memory or number of processes in a zone, but this example only covers CPU capping.

Note: To follow this example, you need a machine with Solaris Express Developer Edition 9/07 or later installed.

1) Create a new zone. Here is a script you can use to automate the zone creation. Save it to a file called, and run it like this: ./ pgzone /zones. By default, this script assigns the zone 20% of a CPU. You can adjust this number (ncpus) as necessary.

    #!/bin/ksh
    # ksh is assumed here because the script uses [[ ]] tests.

    PrintMsg() {
      CURRENTTIME=`date`
      echo "$CURRENTTIME: $*"
    }

    if [[ -z "$1" || -z "$2" ]]; then
                    PrintMsg "Usage: $0 <zone name> <zone dir>"
                    PrintMsg "Example: $0 pgzone /zones"
                    exit 2
    fi

    if [[ ! -d $2 ]]; then
                    PrintMsg "$2 does not exist or is not a directory"
                    exit 1
    fi

    name=$1
    dir=$2
    commands=/tmp/$name.zonecfg          # assumed path for the zonecfg command file
    cfg=$dir/$name/root/etc/sysidcfg     # assumed location of the zone's sysidcfg

    PrintMsg "Creating a new zone"
    zoneadm -z $name list > /dev/null 2>&1
    if [ $? != 0 ]; then
            PrintMsg "Configuring $name"
            rm -f $commands
            echo "create" > $commands
            echo "set zonepath=$dir/$name" >> $commands
            echo "set autoboot=true" >> $commands
            echo "set scheduling-class=FSS" >> $commands
            echo "add capped-cpu" >> $commands
            echo "set ncpus=0.2" >> $commands
            echo "end" >> $commands
            echo "add capped-memory" >> $commands
            echo "set physical=256m" >> $commands
            echo "set swap=256m" >> $commands
            echo "end" >> $commands
            echo "set cpu-shares=10" >> $commands
            echo "set max-lwps=500" >> $commands
            echo "commit" >> $commands
            zonecfg -z $name -f $commands 2>&1 | \
                sed 's/^/    /g'
    else
            PrintMsg "$name already configured"
    fi

    # Installing
    if [ `zoneadm -z $name list -p | \
        cut -d':' -f 3` != "configured" ]; then
            PrintMsg "$name already installed"
    else
            PrintMsg "Installing $name"
            mkdir -pm 0700 $dir/$name
            chmod 700 $dir/$name
            zoneadm -z $name install > /dev/null 2>&1
            PrintMsg "Setting up sysid for $name"
            rm -f $cfg
            echo "network_interface=NONE {hostname=$name}" > $cfg
            echo "system_locale=C" >> $cfg
            echo "terminal=xterms" >> $cfg
            echo "security_policy=NONE" >> $cfg
            echo "name_service=NONE" >> $cfg
            echo "timezone=US/Pacific" >> $cfg
            echo "root_password=Qexr7Y/wzkSbc" >> $cfg  # 'l1a'
    fi

    PrintMsg "Booting $name"
    zoneadm -z $name boot

2) Open a terminal. As root, log into the zone using zlogin (e.g., zlogin pgzone).

3) Once you're in the zone, do the following:
   a. As root, su to postgres:
      # su - postgres

   b. Create PostgreSQL DB cluster:
      $ /usr/postgres/8.2/bin/initdb -D /var/postgres/8.2/data

   c. As root, use the SMF's svcadm command to start PostgreSQL:
      # /usr/sbin/svcadm enable postgresql:version_82

   d. Create and load a db called bench
      $ /usr/postgres/8.2/bin/createdb bench
      $ /usr/postgres/8.2/bin/pgbench -i -s 5 bench

   e. Run this Cartesian join multiple times (try 5) to generate CPU load

      $ /usr/postgres/8.2/bin/psql -d bench -c "select count(*) from accounts foo, accounts bar;" &
4) In the global zone, using another terminal window, run the following command to see CPU usage for each zone. Note that the zone where Postgres is running should be capped at around 20% if you have a single-CPU system.
   # prstat -Z
5) You can dynamically adjust the amount of CPU assigned to the zone using the prctl command. In another terminal window, run:
   # prctl -n zone.cpu-cap -i zone 
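
The prctl command above is cut off in this copy, so the following is only a sketch of the general shape, not necessarily what was originally shown. It relies on zone.cpu-cap being expressed as a percentage of one CPU, so 0.5 of a CPU becomes 50. The helper name cpu_cap_cmd and the values in the example are hypothetical.

```shell
#!/bin/sh
# Sketch: print the prctl command that sets a zone's CPU cap.
# zone.cpu-cap is a percentage of one CPU, so 0.5 CPU -> 50.
# cpu_cap_cmd is a hypothetical helper: cpu_cap_cmd <zone> <ncpus>
cpu_cap_cmd() {
    # Round to the nearest whole percent.
    pct=$(awk -v n="$2" 'BEGIN { printf "%d", n * 100 + 0.5 }')
    echo "prctl -n zone.cpu-cap -r -v $pct -i zone $1"
}
```

For example, "cpu_cap_cmd pgzone 0.5" prints "prctl -n zone.cpu-cap -r -v 50 -i zone pgzone", which you can review and then run as root in the global zone while watching prstat -Z.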
So in a nutshell, that's how you can use Solaris Zones and Resource Management to improve system utilization in a virtualized environment. As I mentioned in the beginning, you can also cap other resources, which makes the combination of Solaris Zones and resource management very powerful.

Tuesday Jun 12, 2007

PostgreSQL 8.2.4 in Solaris Express Developer Edition

Today Sun announced the availability of Solaris Express Developer Edition 5/07. There are many new features in this release and among them is the inclusion of PostgreSQL 8.2.4 with SMF and DTrace integration.

Here's how you'd run Postgres 8.2:
    1) As root, su to postgres
    # su - postgres

    2) Create Postgres DB cluster
    $ /usr/postgres/8.2/bin/initdb -D /var/postgres/8.2/data

    3) As root, use the SMF's svcadm command to start Postgres
    # /usr/sbin/svcadm enable postgresql:version_82

Note that Postgres 8.1 is also available. The binaries are located in /usr/bin and /usr/postgres/8.2/bin for 8.1 and 8.2 respectively. To use 8.2, make sure to add /usr/postgres/8.2/bin to your PATH. For more info, see the postgres_82 man page (e.g. run "man postgres_82" from the command prompt).
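
One way to handle the PATH step in a login script is a small helper that adds the directory only once. This is a sketch; the helper name prepend_path is hypothetical.

```shell
#!/bin/sh
# Sketch: put a directory at the front of PATH unless it is already listed.
# prepend_path is a hypothetical helper name.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, leave PATH alone
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Possible use:
#   prepend_path /usr/postgres/8.2/bin
```

Because the 8.1 binaries live in /usr/bin, order matters: if /usr/postgres/8.2/bin is already in PATH but listed after /usr/bin, this helper leaves it there and the 8.1 binaries are still found first.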

With 8.2.4 all the user-level DTrace probes are now enabled. To see the list of available probes, run "dtrace -l | grep postgres"

For more info on how to use the probes, refer to

Sunday Aug 27, 2006

User-Level DTrace Probes in PostgreSQL

I'm excited to announce that PostgreSQL 8.2 now has user-level DTrace probes embedded in the source code. These probes will enable users to easily observe the behavior of PostgreSQL with simple D scripts, even in production. As you may already know, DTrace is being ported to FreeBSD and Mac OS X, so PostgreSQL users will be able to use the embedded probes on these OSes in addition to Solaris. My hope is that the presence of DTrace probes will not only help users identify one-off performance problems but will also enable developers to identify systemic performance and scalability issues on big multi-cpu/core/thread systems.

Here's the current list of available probes.
provider postgresql {
        probe transaction__start(int);
        probe transaction__commit(int);
        probe transaction__abort(int);
        probe lwlock__acquire(int, int);
        probe lwlock__release(int); 
        probe lwlock__startwait(int, int);
        probe lwlock__endwait(int, int);
        probe lwlock__condacquire(int, int);
        probe lwlock__condacquire__fail(int, int);
        probe lock__startwait(int, int);
        probe lock__endwait(int, int);
};

As you can see, the number of probes is small initially, but more will be added over time, and I encourage the community to identify areas in PostgreSQL where more observability is needed, for both developers and admins.

PostgreSQL runs on many operating systems, and the community is quite strict about keeping the code generic. To accommodate this, we created a higher level of abstraction whereby generic macro names are used instead of the DTrace-specific macros. For example, we define the macros PG_TRACE, PG_TRACE1, etc., which ultimately translate to DTRACE_PROBE, DTRACE_PROBE1, ... when used on a system with DTrace. This allows the tracing code to use generic macro names, and these macros can be mapped to other tracing facilities on other operating systems.

The next few sections explain how to:
  • Compile PostgreSQL with DTrace
  • Use the existing DTrace probes
  • Add new DTrace probes

Compile PostgreSQL with DTrace

By default DTrace probes are disabled, and the user needs to explicitly tell the configure script to make the probes available in PostgreSQL. Enabling DTrace only makes sense on operating systems with the DTrace facility. Currently DTrace is available on Solaris 10+ and soon on FreeBSD and Mac OS X.

To include DTrace probes in a 32-bit binary, specify --enable-dtrace to configure. For example:
        $ configure --enable-dtrace ...

To include DTrace probes in a 64-bit binary, specify --enable-dtrace and DTRACEFLAGS="-64" to configure. For example:

         Using gcc compiler:
        $ configure CC='gcc -m64' --enable-dtrace DTRACEFLAGS='-64' ...
         Using Sun compiler:
        $ configure CC='/path_to_sun_compiler/cc -xtarget=native64' --enable-dtrace DTRACEFLAGS='-64' ...

a) To successfully compile PostgreSQL 8.2 with --enable-dtrace, you need to run Solaris Express. The DTrace version in Solaris 10 (up until 11/06) does not allow probes to be added to static functions. This limitation will be fixed in the next update of Solaris 10.
b) When using DTRACEFLAGS='-64', you also have to tell the compiler to build a 64-bit binary as shown in the configure lines above; otherwise, you will get compilation errors.
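
Since the 64-bit compiler flag and DTRACEFLAGS have to travel together, a build script can keep them in one place. Here is a sketch for the gcc case only; the helper name dtrace_configure_args is hypothetical, and the strings it prints are the ones from the configure lines above.

```shell
#!/bin/sh
# Sketch: print the configure arguments from the gcc examples above,
# so that the 64-bit compiler flag and DTRACEFLAGS are never set separately.
# dtrace_configure_args is a hypothetical helper: dtrace_configure_args <32|64>
dtrace_configure_args() {
    case "$1" in
        32) echo "--enable-dtrace" ;;
        64) echo "CC='gcc -m64' --enable-dtrace DTRACEFLAGS='-64'" ;;
        *)  echo "usage: dtrace_configure_args <32|64>" >&2
            return 2 ;;
    esac
}

# Possible use (eval is needed so the embedded quotes are honored):
#   eval "./configure $(dtrace_configure_args 64)"
```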

Use Existing DTrace Probes

Using the probes in PostgreSQL is similar to using probes in other DTrace providers. Below is an example of a simple D script using the transaction-start, transaction-commit, and transaction-abort probes. The script prints out the total number of started, committed, and aborted transactions.

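#!/usr/sbin/dtrace -qs
postgresql$1:::transaction-start
{
        @start["Start"] = count();
        self->ts  = timestamp;
}

postgresql$1:::transaction-abort
{
        @abort["Abort"] = count();
}

postgresql$1:::transaction-commit
/self->ts/      /* only time transactions whose start was seen */
{
        @commit["Commit"] = count();
        @time["Total time (ns)"] = sum(timestamp - self->ts);
}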

Executing the above script produces the following output.

# ./txn_count.d `pgrep -n postgres`

  Start                                         71
  Commit                                   70
  Total time (ns)                        2312105013
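
The two aggregates can be combined into an average time per committed transaction. A quick sketch of the arithmetic, using the numbers from the output above (the helper name avg_txn_ms is hypothetical):

```shell
#!/bin/sh
# Sketch: average time per committed transaction, in milliseconds.
# avg_txn_ms is a hypothetical helper: avg_txn_ms <total_ns> <commits>
avg_txn_ms() {
    awk -v total="$1" -v n="$2" 'BEGIN {
        if (n == 0) { print "n/a"; exit }
        printf "%.2f\n", total / n / 1000000    # ns -> ms
    }'
}

avg_txn_ms 2312105013 70
```

For the run shown above this prints 33.03, i.e. roughly 33 ms per committed transaction.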

A number of sample D scripts are available from the DTrace PgFoundry project.

To learn more about DTrace, refer to the HowTo and DTrace Guides.

Add New DTrace Probes

New DTrace probes can easily be added to PostgreSQL. For example, if you were to add the transaction-start probe, follow these simple steps:

1) Add the probe definitions to src/backend/utils/probes.d
   provider postgresql {
        ...
        probe transaction__start(int);
        ...
   };

When a dash (-) is used in the probe name, it needs to be converted to double underscores (__) in the probe definition file. So, the above probe is called transaction-start in the D script.

2) Add "PG_TRACE1 (transaction__start, s->transactionId);" to backend/access/transam/xact.c
   static void
   StartTransaction(void)
   {
        ...
        /*
         * generate a new transaction id
         */
        s->transactionId = GetNewTransactionId(false);

        PG_TRACE1 (transaction__start, s->transactionId);
        ...
   }



a) PG_TRACE1 is mapped to DTRACE_PROBE1. See src/include/pg_trace.h
b) The provider name for all probes in PostgreSQL is called postgresql per the decision by the developers, so it's not specified in PG_TRACE. See src/include/pg_trace.h.
c) Make sure the data types in the probe definition match the argument passed to the probe. In this case s->transactionId has to be an integer (int).

When you have probes that might be useful to the community at large, send a proposal/patch to to get feedback from the developers.

3) Check to make sure the new probe is available

After recompiling, run the new binary, and as root, execute the following DTrace command to check that your newly added probe is available.

# dtrace -l -n transaction-start
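
The dash versus double-underscore rule from step 1 is easy to get wrong when you add several probes at once. A tiny sketch that converts between the two spellings (both helper names are hypothetical):

```shell
#!/bin/sh
# Sketch: convert between the two spellings of a probe name.
# Both helper names are hypothetical.

# Name used in D scripts -> name used in probes.d and PG_TRACE macros.
probe_to_decl() {   # transaction-start -> transaction__start
    echo "$1" | sed 's/-/__/g'
}

# Name used in probes.d -> name used in D scripts and with dtrace -n.
decl_to_probe() {   # transaction__start -> transaction-start
    echo "$1" | sed 's/__/-/g'
}
```

For example, "dtrace -l -n $(decl_to_probe lwlock__condacquire__fail)" would list the probe under the name a D script sees.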

More details which led to DTrace inclusion in PostgreSQL

1) Proposal submitted to the developer community.
2) Presented the proposal at the PostgreSQL Anniversary Summit (first developers conference). The timing of the conference was perfect, and I was fortunate to have the opportunity to present the proposal and demo DTrace to a live audience. I think the discussion with the developers/hackers after the conference was key in solidifying the proposed implementation and getting the thumbs up for inclusion into 8.2.
3) Patch submitted and follow-on discussions
4) Patch was updated by Peter Eisentraut
5) Patch was finally committed by Peter Eisentraut


Many people from the community and at Sun provided excellent feedback on the proposal and implementation, but without the help of the following individuals, it would not have been possible to get DTrace into PostgreSQL.

Gavin Sherry - Gavin was the first person in the community to help us identify locations in PostgreSQL to insert probes. We initially wanted to create a demo, and on short notice, Gavin made himself available to talk with us.

Tom Lane - At the PostgreSQL Anniversary Summit, Tom helped verify the probe locations, corrected a few of them, and provided excellent feedback on how the framework should be implemented.

Peter Eisentraut - Peter stepped up to help incorporate DTrace into the PostgreSQL build system. His help was invaluable in getting DTrace in 8.2 before code freeze.

Angelo Rajadurai - Angelo is a DTrace guru in MDE, and he was a great help in getting me up to speed with adding user-level probes and DTrace in general.

Adam Leventhal and Bryan Cantrill (The DTrace creators) - For making themselves available to answer questions and provide feedback.



