Thursday Sep 21, 2006

Oracle redo logs and dd

I recently had a case where a customer was concerned about the time taken to dd an Oracle redo log file even though Oracle had been shut down. Finding this strange, I thought I'd use TNF probes to see what was happening.
#!/bin/sh -x

# Script to collect TNF data whilst dd is happening on Oracle log file
#
# Andy H 19/09/06
#

TMPDIR=${TMPDIR:-/var/tmp/tnf}
output_file=$TMPDIR/ktrace.tnf
command="dd if=/u01/oracle/oradata/TEST/redo01.log of=/u01/oracle/oradata/TEST/tmp.tmp"

mkdir -p $TMPDIR

# Print an error and bail out
fail()
{
        echo "ERROR: $*" >&2
        exit 1
}

# Enable kernel tracing
start_prex()
{
        echo "Enabling kernel tracing"
        prex -k <<'EOF' || fail "Could not enable kernel tracing"
buffer alloc 4m
enable $all
trace $all
ktrace on
EOF
}

# Run the dd command and wait for it to complete
run_command()
{
        echo "Running: $command"
        $command &
        pid=$!
        echo $pid > $TMPDIR/pid.out
        wait $pid
}

# Disable kernel tracing
stop_prex()
{
        echo "Disabling kernel tracing"
        prex -k <<'EOF'
ktrace off
EOF
}

# Extract kernel trace buffer
ktrace_xtract() {
        echo "Extracting kernel trace buffer"
        tnfxtract $output_file || fail "Could not extract kernel trace buffer"
        ls -l $output_file
}

# Get kernel symbols
ktrace_sym() {
        echo "Collecting kernel symbol information"
        /usr/ccs/bin/nm -fx /dev/ksyms | tr "|" " " > $TMPDIR/nm.out
}

# tar and gzip data
ktrace_gzip() {
        echo "Packaging collected data"
        /usr/sbin/tar cf /tmp/tnf.tar $TMPDIR
        /usr/bin/gzip /tmp/tnf.tar
        echo "Collection complete, please send in /tmp/tnf.tar.gz for analysis"
}

start_prex
run_command
ktrace_xtract
ktrace_sym
stop_prex
ktrace_gzip
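The extracted binary buffer is turned into readable text with tnfdump before any analysis. Tim's sorting script isn't reproduced here, but as a rough stand-in (my own sketch, not his script), something like this totals the per-call elapsed times that tnfdump prints on each syscall_end line:

```shell
# syscall_summary: sum the per-call elapsed times from tnfdump
# syscall_end lines, keyed by syscall name. A minimal stand-in for
# the real sorting script, not a reproduction of it.
syscall_summary() {
        awk '/syscall_end/ {
                for (i = 1; i <= NF; i++)
                        if ($i == "errno:") { name = $(i + 2); t = $(i + 3) }
                total[name] += t; calls[name]++
        }
        END {
                for (n in total)
                        printf "%-10s %4d calls %10.6f s total\n", n, calls[n], total[n]
        }' "$@"
}
```

Feeding it the output of tnfdump on ktrace.tnf shows at a glance where the time goes; in the slow trace the read total dominates.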
Then, using a clever sorting awk script from Tim Uglow, I compared the TNF output from a normal dd run with the slow one. First the normal case:
64.922000  -  2562 1 0x300042edc20 0 page_unmap vnode: 0x600062ee440 offset: 157253632
64.924500  -  2562 1 0x300042edc20 0 page_unmap vnode: 0x600062ee440 offset: 157261824
64.927100  -  2562 1 0x300042edc20 0 page_unmap vnode: 0x600062ee440 offset: 157270016
64.929200  -  2562 1 0x300042edc20 0 page_unmap vnode: 0x600062ee440 offset: 157278208
64.931500  -  2562 1 0x300042edc20 0 page_unmap vnode: 0x600062ee440 offset: 157286400
101.904300  -  2562 1 0x300042edc20 0 syscall_end rval1: 4 rval2: 0 errno: 0 creat64 98.822400 No error
101.908200  -  2562 1 0x300042edc20 0 thread_queue tid: 0x300042edc20 cpuid: 0 priority: 0 queue_length: 0x0
101.908500  -  2562 1 0x300042edc20 0 thread_state tid: 0x300042edc20 state: 8 waiting for CPU
101.924000  -  2562 1 0x300042edc20 0 syscall_start sysnum: 137 sysconfig
101.924500  -  2562 1 0x300042edc20 0 syscall_end rval1: 8192 rval2: 18466816 errno: 0 sysconfig 0.000500 No error
101.928700  -  2562 1 0x300042edc20 0 syscall_start sysnum: 17 brk
101.931200  -  2562 1 0x300042edc20 0 syscall_end rval1: 0 rval2: 134217728 errno: 0 brk 0.002500 No error
101.932500  -  2562 1 0x300042edc20 0 syscall_start sysnum: 17 brk
101.941300  -  2562 1 0x300042edc20 0 syscall_end rval1: 0 rval2: 134217728 errno: 0 brk 0.008800 No error
101.943200  -  2562 1 0x300042edc20 0 thread_state state: 4 asleep in user data fault
101.945400  -  2562 1 0x300042edc20 0 address_fault address: 0x28000 fault_type: 0 access: 1
101.947900  -  2562 1 0x300042edc20 0 anon_zero address: 0x28000
101.967700  -  2562 1 0x300042edc20 0 thread_state state: 0 in userland 
101.978500  -  2562 1 0x300042edc20 0 syscall_start sysnum: 98 sigaction
101.980600  -  2562 1 0x300042edc20 0 syscall_end rval1: 0 rval2: 0 errno: 0 sigaction 0.002100 No error
101.985800  -  2562 1 0x300042edc20 0 syscall_start sysnum: 3 read
102.000000  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8193 errno: 0 read 0.014200 No error
102.009800  -  2562 1 0x300042edc20 0 syscall_start sysnum: 4 write
102.073000  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8962 errno: 0 write 0.063200 No error
102.074500  -  2562 1 0x300042edc20 0 syscall_start sysnum: 3 read
102.079700  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8193 errno: 0 read 0.005200 No error
102.082300  -  2562 1 0x300042edc20 0 syscall_start sysnum: 4 write
102.088200  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8962 errno: 0 write 0.005900 No error
102.089200  -  2562 1 0x300042edc20 0 syscall_start sysnum: 3 read
102.093600  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8193 errno: 0 read 0.004400 No error
102.096200  -  2562 1 0x300042edc20 0 syscall_start sysnum: 4 write
102.129600  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8962 errno: 0 write 0.033400 No error
102.130900  -  2562 1 0x300042edc20 0 syscall_start sysnum: 3 read
102.136100  -  2562 1 0x300042edc20 0 syscall_end rval1: 512 rval2: 8193 errno: 0 read 0.005200 No error
102.138600  -  2562 1 0x300042edc20 0 syscall_start sysnum: 4 write
So each read()/write() pair completes back to back, with no disk I/O in between: the reads are being satisfied from the page cache. But what about the so-called broken one:
59.433900  -  2635 1 0x300041792c0 3 page_unmap vnode: 0x600062ee440 offset: 157253632
59.436200  -  2635 1 0x300041792c0 3 page_unmap vnode: 0x600062ee440 offset: 157261824
59.438300  -  2635 1 0x300041792c0 3 page_unmap vnode: 0x600062ee440 offset: 157270016
59.440300  -  2635 1 0x300041792c0 3 page_unmap vnode: 0x600062ee440 offset: 157278208
59.442600  -  2635 1 0x300041792c0 3 page_unmap vnode: 0x600062ee440 offset: 157286400
95.378500  -  2635 1 0x300041792c0 3 syscall_end rval1: 4 rval2: 0 errno: 0 creat64 91.600000 No error
95.380600  -  2635 1 0x300041792c0 3 thread_queue tid: 0x300041792c0 cpuid: 3 priority: 0 queue_length: 0x0
95.380800  -  2635 1 0x300041792c0 3 thread_state tid: 0x300041792c0 state: 8 waiting for CPU
95.398100  -  2635 1 0x300041792c0 3 syscall_start sysnum: 137 sysconfig
95.398600  -  2635 1 0x300041792c0 3 syscall_end rval1: 8192 rval2: 18466816 errno: 0 sysconfig 0.000500 No error
95.404300  -  2635 1 0x300041792c0 3 syscall_start sysnum: 17 brk
95.406100  -  2635 1 0x300041792c0 3 syscall_end rval1: 0 rval2: 134217728 errno: 0 brk 0.001800 No error
95.407300  -  2635 1 0x300041792c0 3 syscall_start sysnum: 17 brk
95.415700  -  2635 1 0x300041792c0 3 syscall_end rval1: 0 rval2: 134217728 errno: 0 brk 0.008400 No error
95.417700  -  2635 1 0x300041792c0 3 thread_state state: 4 asleep in user data fault
95.419500  -  2635 1 0x300041792c0 3 address_fault address: 0x28000 fault_type: 0 access: 1
95.421600  -  2635 1 0x300041792c0 3 anon_zero address: 0x28000
95.440600  -  2635 1 0x300041792c0 3 thread_state state: 0 in userland 
95.453200  -  2635 1 0x300041792c0 3 syscall_start sysnum: 98 sigaction
95.454800  -  2635 1 0x300041792c0 3 syscall_end rval1: 0 rval2: 0 errno: 0 sigaction 0.001600 No error
95.460900  -  2635 1 0x300041792c0 3 syscall_start sysnum: 3 read
95.483700  -  2635 1 0x300041792c0 3 strategy device: 118,22  block: 2262752 size: 0x200 buf: 0x60008105278 flags: 0x2200061
95.522200  -  2635 1 0x300041792c0 3 thread_block reason: 0x60008105338 stack: 
95.523200  -  2635 1 0x300041792c0 3 thread_state state: 7 asleep for other reason
95.523400  -  2635 1 0x300041792c0 3 thread_state tid: 0x2a10001fcc0 state: 1 in syscall/pagefault
95.939700  -  2635 1 0x300041792c0 17 syscall_end rval1: 512 rval2: 8193 errno: 0 read 0.478800 No error
95.972400  -  2635 1 0x300041792c0 17 syscall_start sysnum: 4 write
96.070000  -  2635 1 0x300041792c0 17 syscall_end rval1: 512 rval2: 8962 errno: 0 write 0.097600 No error
96.072000  -  2635 1 0x300041792c0 17 syscall_start sysnum: 3 read
96.078800  -  2635 1 0x300041792c0 17 strategy device: 118,22  block: 2262753 size: 0x200 buf: 0x60008105278 flags: 0x2200061
96.105100  -  2635 1 0x300041792c0 17 thread_block reason: 0x60008105338 stack: 
96.106100  -  2635 1 0x300041792c0 17 thread_state state: 7 asleep for other reason
96.106300  -  2635 1 0x300041792c0 17 thread_state tid: 0x2a1005d1cc0 state: 1 in syscall/pagefault

Interesting, we appear to be blocking for a reason: each 512-byte read now issues a strategy call to the disk (device 118,22) and the thread sleeps until the I/O completes, rather than being satisfied from the page cache. Ahhh, following a quick chat with John Alderson and a read of his blog entry, this now becomes clear.

Basically, directio is still persistent on the file after Oracle is shut down. DIRECTIO_ON lets an application force directio on a per-file basis, and databases will potentially use it if their app tunable is so set (which in our case it is). The flag is set in the inode's i_flag and stays cached with the inode in the DNLC, so it continues to affect the file until the inode is flushed from the DNLC.
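As a sketch of what this means in practice (paths and mount point are taken from the example above, and the remount is just one way of forcing the inode out of the cache):

```shell
# Illustrative only: the copy stays slow until the cached inode
# (and with it the directio flag in i_flag) is dropped. Guarded so
# it does nothing on a machine without the file.
redo_copy_demo() {
        log=${1:-/u01/oracle/oradata/TEST/redo01.log}
        if [ ! -f "$log" ]; then
                echo "demo only: $log not present on this system"
                return 0
        fi
        time dd if="$log" of=/dev/null bs=8k   # slow: directio flag still cached
        umount /u01 && mount /u01              # push the inode out of the cache
        time dd if="$log" of=/dev/null bs=8k   # fast: page cache back in play
}
```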

About

I'm an RPE (Revenue Product Engineering) Engineer supporting Solaris on Exadata, Exalogic and Super Cluster. I attempt to diagnose and resolve any problems with Solaris on any of the Engineered Systems.
