Ensuring that directio is active on an Oracle server

Last week, an Oracle DBA wanted me to prove that directio was active on a Solaris 10 server. Oracle data resided on UFS filesystems and mounted with "forcedirectio". When set like this at the filesystem level, in theory, all Oracle accesses to the filesystems are done via direct I/O independently of what is set at the Oracle instance level (filesystemio_options). In practise, there is no real evidence that directio is truly in effect.

Let's consider such a server. To reiterate, the server is using UFS filesystems mounted with forcedirectio. Oracle "filesystemio_options" is set to "asynch" (don't ask me why it is not set to a more recommended value). I want to prove that direct I/O is in effect.

1) I can run "vmstat -p" and check that there is no I/O going through the cache filesystem. But it is more a hint than a proof.

2) I could have used "directiostat" from http://www.solarisinternals.com/si/tools/directiostat/index.php. But I was not authorized to transfer and run it on the server under study.

3) Trussing an Oracle process - here, an Oracle Checkpoint process - yields the following:

root@server1 # pgrep -lf ckpt
22806 ora_ckpt_SID1
root@server1 # truss -f -t !lwp_park,semtimedop -p 22806
22806/2:        kaio(6, 0x00000000, 0xFFFFFFFF7CE480C0, 0x00000010, 0xFFFFFFFF7CE49D58, 0xFFFFFFFF7F700A00) (sleeping...)
22806/1:        kill(22884, SIG#0)                              = 0
22806/1:        open("/proc/22884/psinfo", O_RDONLY)            = 16
22806/1:        read(16, "02\\0\\0\\0\\0\\0\\0\\v\\0\\0 Y d".., 416)     = 416
22806/1:        close(16)                                       = 0
22806/1:        kill(22886, SIG#0)                              = 0
22806/1:        open("/proc/22886/psinfo", O_RDONLY)            = 16
22806/1:        read(16, "02\\0\\0\\0\\0\\0\\0\\v\\0\\0 Y f".., 416)     = 416
22806/1:        close(16)                                       = 0
22806/1:        kaio(AIOWAIT, 0xFFFFFFFF7FFFD950)               = 1
22806/16:       pwrite(257, "15C2\\0\\0\\0\\0\\003\\0\\0\\0\\0".., 16384, 49152) = 16384
22806/17:       pwrite(258, "15C2\\0\\0\\0\\0\\003\\0\\0\\0\\0".., 16384, 49152) = 16384
22806/16:       kaio(AIONOTIFY, 0)                              = 0
22806/17:       kaio(AIONOTIFY, 0)                              = 0
22806/10:       pwrite(256, "15C2\\0\\0\\0\\0\\003\\0\\0\\0\\0".., 16384, 49152) = 16384
22806/1:        kaio(AIOWAIT, 0xFFFFFFFF7FFFD950)               = 1
22806/10:       kaio(AIONOTIFY, 0)                              = 0

So, the process is doing asynchronous I/O writes using kaio. It continues to run without waiting for its writes to be acknowledged. The I/Os are handled asynchronously by the Solaris kernel. But the process truss looks the same whether or not directio is enabled.

4) The only way I have found to ensure directio is enabled is with DTrace. The first DTrace command below comes fromhttp://www.forsythesunsolutions.com/node/34. It checks how I/O is done to a particular file (here an Oracle Control file). The stack shows that the file is accessed through directio.

root@server1 # pfiles 22806 | grep ctl
      /oracle/d1/SID1/ctl/control_01.ctl
      /oracle/d2/SID1/ctl/control_02.ctl
      /oracle/d3/SID1/ctl/control_03.ctl
root@server1 # dtrace -n io:::start'/(args[0]->b_flags & B_READ) == 0 && args[2]->fi_pathname == "/oracle/d3/SID1/ctl/control_03.ctl"/{@[stack()] = count()}'
dtrace: description 'io:::start' matched 6 probes
\^C

              genunix`bdev_strategy+0x44
              ufs`directio_start+0xfc
              ufs`ufs_directio_write+0x8c4
              ufs`ufs_write+0x174
              genunix`fop_write+0x20
              genunix`pwrite+0x22c
              unix`syscall_trap+0xac
                8

              genunix`bdev_strategy+0x44
              md_stripe`md_stripe_strategy+0x2fc
              md_mirror`mirror_write_strategy+0x7dc
              md`mdstrategy+0xd4
              ufs`directio_start+0xfc
              ufs`ufs_directio_write+0x8c4
              ufs`ufs_write+0x174
              genunix`fop_write+0x20
              genunix`pwrite+0x22c
              unix`syscall_trap+0xac
                8

An even better proof is provided by the DTrace script from http://wikis.sun.com/display/SAPonSun/Demystifying+Oracle+IO. I copied the script verbatim to /var/tmp/sun/dio2.d. Running the script on the same Oracle Checkpoint process trussed above clearly shows what the process is performing directio.
 
root@server1 # cat /var/tmp/sun/dio2.d
#!/usr/sbin/dtrace -s
#pragma D option flowindent

fbt:ufs::entry
/pid == $1/
{
        self->ts[probefunc] = timestamp;
}

fbt:ufs::return
/pid == $1 && self->ts[probefunc] != 0/
{
        self->duration[probefunc] = (timestamp - self->ts[probefunc])/1000;
}

fbt:ufs::
/pid == $1/
{
        trace(self->duration[probefunc]);
        self->duration[probefunc] = 0;

root@server1 # /var/tmp/sun/dio2.d 22806
dtrace: script '/var/tmp/sun/dio2.d' matched 1826 probes
CPU FUNCTION
 68  -> ufs_rwlock                                            0
 68  <- ufs_rwlock                                           57
 68  -> ufs_write                                             0
 68    -> ufs_check_rewrite                                   0
 68      -> bmap_has_holes                                    0
 68      <- bmap_has_holes                                   18
 68    <- ufs_check_rewrite                                  54
 68    -> ufs_lockfs_begin                                    0
 68      -> ufs_lockfs_is_under_rawlockfs                     0
 68      <- ufs_lockfs_is_under_rawlockfs                    21
 68    <- ufs_lockfs_begin                                   63
 68    -> ufs_directio_write                                  0
 68      -> bmap_read                                         0
 68        -> findextent                                      0
 68        <- findextent                                     15
 68      <- bmap_read                                        48
 68      -> directio_start                                    0
 68      <- directio_start                                  145
 68      -> directio_wait                                     0
 68        -> directio_wait_one                               0
 68        <- directio_wait_one                             354
 68      <- directio_wait                                   388
 68    <- ufs_directio_write                                663
 68    -> ufs_itimes_nolock                                   0
 68      -> ufs_imark                                         0
 68      <- ufs_imark                                        14
 68    <- ufs_itimes_nolock                                  41
 68    -> ufs_lockfs_end                                      0
 68      -> ufs_lockfs_top_vop_return                         0
 68      <- ufs_lockfs_top_vop_return                        19
 68    <- ufs_lockfs_end                                     53
 68  <- ufs_write                                           978

An even better proof is provided by the DTrace script from http://wikis.sun.com/display/SAPonSun/Demystifying+Oracle+IO. I copied the script verbatim to /var/tmp/sun/dio2.d. Running the script on the same Oracle Checkpoint process trussed above clearly shows what the process is performing directio.To complete my tests, I ran the same set of DTrace command and script on a similar server running with UFS filesystems mounted WITHOUT forcedirectio. This time, the results reveal the use of buffered I/O instead of direct I/O.

root@server2 # pgrep -lf ora_ckpt_SID2
22185 ora_ckpt_SID2
root@server2 # pfiles 22185 | grep ctl
      /oracle/d1/SID2/ctl/control_01.ctl
      /oracle/d2/SID2/ctl/control_02.ctl
      /oracle/d3/SID2/ctl/control_03.ctl
root@server2 # dtrace -n io:::start'/(args[0]->b_flags & B_READ) == 0 && args[2]->fi_pathname == "/oracle/d3/SID2/ctl/control_03.ctl"/{@[stack()] = count()}'
dtrace: description 'io:::start' matched 6 probes
\^C

              genunix`bdev_strategy+0x44
              ufs`lufs_write_strategy+0x11c
              ufs`ufs_putapage+0x378
              ufs`ufs_putpages+0x2a4
              genunix`fop_putpage+0x1c
              genunix`segmap_release+0x344
              ufs`wrip+0x850
              ufs`ufs_write+0x580
              genunix`fop_write+0x20
              genunix`pwrite+0x22c
              unix`syscall_trap+0xac
                4

              genunix`bdev_strategy+0x44
              md_stripe`md_stripe_strategy+0x2f4
              md_mirror`mirror_write_strategy+0x750
              md`mdstrategy+0xd4
              ufs`lufs_write_strategy+0x11c
              ufs`ufs_putapage+0x378
              ufs`ufs_putpages+0x2a4
              genunix`fop_putpage+0x1c
              genunix`segmap_release+0x344
              ufs`wrip+0x850
              ufs`ufs_write+0x580
              genunix`fop_write+0x20
              genunix`pwrite+0x22c
              unix`syscall_trap+0xac
                4
root@server2 #
/var/tmp/sun/dio2.d 7053
dtrace: script '/var/tmp/sun/dio2.d' matched 1806 probes
CPU FUNCTION
  0  -> ufs_rwlock                                            0
  0  <- ufs_rwlock                                           10
  0  -> ufs_write                                             0
  0    -> ufs_check_rewrite                                   0
  0      -> bmap_has_holes                                    0
  0      <- bmap_has_holes                                   10
  0    <- ufs_check_rewrite                                  30
  0    -> ufs_lockfs_begin                                    0
  0      -> ufs_lockfs_is_under_rawlockfs                     0
  0      <- ufs_lockfs_is_under_rawlockfs                    13
  0    <- ufs_lockfs_begin                                   38
  0    -> ufs_itimes_nolock                                   0
  0      -> ufs_imark                                         0
  0      <- ufs_imark                                         7
  0    <- ufs_itimes_nolock                                  24
  0    -> wrip                                                0
  0      -> bmap_write                                        0
  0      <- bmap_write                                        8
  0      -> ufs_putpage                                       0
  0      <- ufs_putpage                                       8
  0      -> ufs_putpages                                      0
  0        -> ufs_putapage                                    0
  0          -> bmap_read                                     0
  0            -> findextent                                  0
  0            <- findextent                                  8
  0          <- bmap_read                                    25
  0          -> lufs_write_strategy                           0
  0            -> deltamap_remove                             0
  0            <- deltamap_remove                            10
  0          <- lufs_write_strategy                          91
  0        <- ufs_putapage                                  783
  0      <- ufs_putpages                                    803
  0      -> ufs_check_rewrite                                 0
  0        -> bmap_has_holes                                  0
  0        <- bmap_has_holes                                  8
  0      <- ufs_check_rewrite                                26
  0      -> bmap_write                                        0
  0      <- bmap_write                                        6
  0      -> ufs_putpage                                       0
  0      <- ufs_putpage                                       6
  0      -> ufs_putpages                                      0
  0        -> ufs_putapage                                    0
  0          -> bmap_read                                     0
  0            -> findextent                                  0
  0            <- findextent                                  6
  0          <- bmap_read                                    20
  0          -> lufs_write_strategy                           0
  0            -> deltamap_remove                             0
  0            <- deltamap_remove                             8
  0          <- lufs_write_strategy                          57
  0        <- ufs_putapage                                  358
  0      <- ufs_putpages                                    376
  0      -> ufs_check_rewrite                                 0
  0      <- ufs_check_rewrite                                 8
  0      -> top_delta                                         0
  0        -> deltamap_add                                    0
  0        <- deltamap_add                                    9
  0      <- top_delta                                        25
  0      -> ufs_itimes_nolock                                 0
  0        -> ufs_imark                                       0
  0        <- ufs_imark                                       6
  0      <- ufs_itimes_nolock                                22
  0    <- wrip                                             1437
  0    -> ufs_lockfs_end                                      0
  0      -> ufs_lockfs_top_vop_return                         0
  0      <- ufs_lockfs_top_vop_return                        11
  0    <- ufs_lockfs_end                                     34
  0  <- ufs_write                                          1630

 

Comments:

I am interested in finding the same info out about a vxfs filesystem. Do you know which calls from vxio are using directio versus which are not or is that something I would need to get from veritas/symmantec?

Posted by Chris Burkhardt on December 22, 2009 at 03:16 AM PST #

Post a Comment:
Comments are closed for this entry.
About

apatoki

Search

Categories
Archives
« April 2014
SunMonTueWedThuFriSat
  
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
   
       
Today