Wednesday May 27, 2009

mod_sed is now integrated into opensolaris

mod_sed is now integrated into OpenSolaris. It will be available in the OpenSolaris 2009.06 release and will also be part of the Sun Web Stack 1.5 release.

mod_sed sources can be downloaded from here

mod_sed is already part of httpd trunk. An old version of the sources is available at webstack/mod_sed, but since the module was accepted into Apache trunk, that link is no longer maintained. To download mod_sed, get the files either from httpd trunk or from the OpenSolaris website.

Friday Aug 22, 2008

Tracing apr calls in Apache using dtrace

The pid provider can be used to trace apr calls in Apache. As a first step, we need to create probes for the Apache processes. The following script creates probes for a single process.
/* createprobes.d */
pid$1::apr_*:entry
/execname == "httpd"/
{
}

pid$1::apr_*:return
/execname == "httpd"/
{
}

profile:::tick-2sec
{
    exit(0);
}
Now run the script for all httpd processes.
# for each in `pgrep httpd`; 
do 
   echo "each = $each"; 
   dtrace -s createprobes.d $each; 
done

Once the probes are created, the following dtrace script can be used to trace apr calls in Apache.
pid*::apr_*:entry
/execname == "httpd"/
{
}

pid*::apr_*:return
/execname == "httpd"/
{
}
To execute the above script, we do not need any built-in probes inside Apache; the pid provider inserts the probes into user code.
If we run this script, we see the following output (snippet):
# dtrace -s apr-trace.d
CPU     ID                    FUNCTION:NAME
  0  73552  apr_pool_cleanup_register:entry
  0  73535                 apr_palloc:entry
  0  78695                apr_palloc:return
  0  79116 apr_pool_cleanup_register:return
  0  79191         apr_socket_accept:return
    ...

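To measure the time taken by each apr routine, we need to take the difference between the entry and return timestamps. Here is aprtime.d, which saves the entry timestamp per thread and prints the difference on return.
pid*::apr_*:entry
/execname == "httpd"/
{
    /* Thread-local, so concurrent threads do not overwrite each other. */
    self->ts[probefunc] = timestamp;
}

pid*::apr_*:return
/execname == "httpd" && self->ts[probefunc]/
{
    printf("%d nsecs", timestamp - self->ts[probefunc]);
    self->ts[probefunc] = 0;
}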

# dtrace -s aprtime.d
CPU     ID                    FUNCTION:NAME
  0  78695                apr_palloc:return 16834 nsecs
  0  79116 apr_pool_cleanup_register:return 51750 nsecs
  0  79078     apr_thread_mutex_lock:return 11250 nsecs
  0  79086    apr_thread_cond_signal:return 14750 nsecs
  0  79080   apr_thread_mutex_unlock:return 31167 nsecs
  0  79078     apr_thread_mutex_lock:return 6500 nsecs
  ...
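
When the trace is long, it can help to total the times per function. The following is a minimal shell sketch, assuming the output of aprtime.d was saved in the format shown above; the function name summarize_apr_times and the file name aprtime.out are hypothetical.

```shell
# Sum the "nsecs" values from aprtime.d output, per apr function.
# Assumes each data line looks like:
#   CPU ID FUNCTION:NAME <number> nsecs
summarize_apr_times() {
    awk '$NF == "nsecs" {
             split($(NF - 2), name, ":")   # "apr_palloc:return" -> "apr_palloc"
             calls[name[1]]++
             total[name[1]] += $(NF - 1)
         }
         END {
             for (fn in total)
                 printf "%s calls=%d total_nsecs=%d\n", fn, calls[fn], total[fn]
         }' "$@" | LC_ALL=C sort
}

# Usage (aprtime.out is a hypothetical file holding the saved output):
#   summarize_apr_times aprtime.out
```

With the sample lines above, apr_thread_mutex_lock would be reported with calls=2 and total_nsecs=17750.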

Friday Aug 15, 2008

Using mod_sed to filter web content in Apache

mod_sed is an Apache module that filters web content with powerful sed commands, whether that content is generated by PHP, JSP, or plain HTML. Basic configuration information can be found in the README. In this blog, I will cover how cryptic but powerful sed commands can be used inside Apache.

Using branches "b" to implement if/else type of code
Suppose I want to write
if (line contains "a") then
   replace "x" with "y"
else
   replace "y" with "x"
fi
If I want to write the above logic using "goto" syntax, I can write something like this (pseudo code):
if (line contains "a") go to :ifpart
# else part
   replace "y" with "x"
   go to :end
:ifpart
   replace "x" with "y"
:end
In sed we can use the branch command "b", which is the equivalent of goto. Here is the equivalent sed code:
/a/ b ifpart
s/y/x/g
b end
:ifpart
s/x/y/g
:end

$ cat one.txt
ax
xyz
$ /usr/ucb/sed -f one.sed < one.txt
ay
xxz
We can write the same example in apache :
OutputSed "/a/ b ifpart"
OutputSed "s/y/x/g"
OutputSed "b end"
OutputSed ":ifpart"
OutputSed "s/x/y/g"
OutputSed ":end"
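
Before putting a script into httpd.conf, it can be handy to try the same lines with the command-line sed. This is a minimal sketch, assuming mod_sed treats each OutputSed directive as one line of the script, in the same way that sed joins multiple -e expressions with newlines; the function name branch_filter is hypothetical.

```shell
# Each -e expression below corresponds to one OutputSed line above.
branch_filter() {
    sed -e '/a/ b ifpart' \
        -e 's/y/x/g' \
        -e 'b end' \
        -e ':ifpart' \
        -e 's/x/y/g' \
        -e ':end'
}

# Usage: feed the same two lines as one.txt through the filter.
printf 'ax\nxyz\n' | branch_filter
```

This prints "ay" and "xxz", matching the one.sed run shown earlier.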


Using hold buffer "h" as a buffer to save current text
Let's say I have a text :
It is Sunday today.
And I want to replace it with two lines:
It is Monday today.
It is Sunday today.
So I want to do the following (pseudo code)
saveline=curline
replace Sunday with Monday in curline
curline = curline + saveline
print curline
In sed, we will write something like :
# hold the buffer
h
s/Sunday/Monday/
# Append the hold buffer to current text.
G
Sed's G command appends the hold buffer to the current line (the pattern space), separated by a newline. Inside Apache, we can do the same thing using OutputSed directives:
OutputSed "h"
OutputSed "s/Sunday/Monday/"
OutputSed "G"
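
The same three commands can be checked from the shell. This is a minimal sketch that assumes a command-line sed is available; the function name hold_filter is hypothetical.

```shell
# h: copy the current line to the hold buffer
# s: change the current line
# G: append a newline plus the hold buffer to the current line
hold_filter() {
    sed -e 'h' -e 's/Sunday/Monday/' -e 'G'
}

# Usage: one input line produces two output lines.
printf 'It is Sunday today.\n' | hold_filter
```

This prints "It is Monday today." followed by "It is Sunday today.". Note that h and G run on every line, so a line without "Sunday" also comes out twice.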


Multiline expression using hold buffer and commands "N", "x", "h" and "H"
Sed is very powerful at multi-line text manipulation. Suppose I have a condition which says:
'If a line contains "Sunday" and the next line contains "Monday", then replace "Sunday" with "Monday" in the first line and replace "Monday" with "Tuesday" in the second line.'
As an example, I have this text:
It is Sunday today.
Tomorrow will be Monday.
The output should look like :
It is Monday today.
Tomorrow will be Tuesday.
So I want to do the following (pseudo code)
search for Sunday in current line
if found then 
    saveline=curline
    Read next line into curline
    search for Monday in curline (the second line)
    if found then 
        swap curline and saveline
        replace Sunday to Monday in curline
        swap curline and saveline again.
        replace Monday to Tuesday in curline
        saveline = saveline + curline
        curline = saveline
    end innerif
end outerif
The next line is read with the "N" command.
The swap is done with the "x" command.
Appending curline to saveline is done with the "H" command.
Replacing curline with saveline is done with the "g" command.
The overall sed script looks like this:
/Sunday/ {
# Save the current line in the hold buffer.
    h
# Delete the content of the current line.
    s/.*//
# Read the next line (appended after a newline).
    N
# Delete the leading newline that N inserted.
    s/^.//
# Search for Monday in the next line.
    /Monday/ {
# Exchange the hold buffer with the current line.
        x
# Now the current line contains the 1st line, so replace Sunday with Monday.
        s/Sunday/Monday/
# Exchange the hold buffer with the current line again.
        x
# Now the current line contains the 2nd line, so replace Monday with Tuesday.
        s/Monday/Tuesday/
# Append the current line (2nd line) to the hold buffer (1st line).
        H
# Replace the current line with the hold buffer.
        g
    }
}
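
The script can be tried from the shell before it goes into httpd.conf. This is a minimal sketch, assuming a command-line sed in which "." also matches the newline inside the pattern space (true for the seds I am aware of, but worth checking on yours); the function name multiline_filter is hypothetical.

```shell
# Same commands as the script above, passed to sed as one multi-line expression.
multiline_filter() {
    sed -e '/Sunday/ {
        h
        s/.*//
        N
        s/^.//
        /Monday/ {
            x
            s/Sunday/Monday/
            x
            s/Monday/Tuesday/
            H
            g
        }
    }'
}

# Usage: the two-line example text from above.
printf 'It is Sunday today.\nTomorrow will be Monday.\n' | multiline_filter
```

This prints "It is Monday today." and "Tomorrow will be Tuesday.". One thing to watch: if the line after a "Sunday" line does not contain "Monday", the script as written prints only that following line, because the first line stays behind in the hold buffer.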
Inside Apache's httpd.conf, I will write the equivalent sed script as follows:
OutputSed "/Sunday/ {"
OutputSed "h"
OutputSed "s/.*//"
OutputSed "N"
OutputSed "s/^.//"
OutputSed     "/Monday/ {"
OutputSed         "x"
OutputSed         "s/Sunday/Monday/"
OutputSed         "x"
OutputSed         "s/Monday/Tuesday/"
OutputSed         "H"
OutputSed         "g"
OutputSed     "}"
OutputSed "}"
The above examples show how powerful sed commands can be used to filter web content (whether it is plain HTML or generated by PHP or JSP). Details of sed can be obtained from the sed man page.

Little history behind mod_sed filter module

Sun has donated the "sed" filtering module mod_sed to the Apache Software Foundation. It is not yet part of the Apache Web Server; it is under consideration by the Apache httpd dev community.

In this blog, I will cover the history behind the mod_sed code. Solaris 10 has two separate "sed" utilities, one in /usr/bin/sed and another in /usr/ucb/sed. The latter is open sourced under the CDDL and available in OpenSolaris. Sun Java System Web Server 7.0 initially included a sed filter module. The Sun Web Server filter module was derived from the "/usr/bin/sed" code and was written by Chris Elving.

Last year, I took the Sun Web Server code and wrote mod_sed based on it. The difference between the Sun Web Server sed filter module and mod_sed is that mod_sed is derived from the /usr/ucb/sed code. Sun Web Server's sed filtering module uses NSPR as its portable API, while mod_sed uses APR, since it runs under Apache, which uses APR for portability.

Functionality-wise, the "/usr/bin/sed" code was a little better than "/usr/ucb/sed", but I have fixed some of the limitations of "/usr/ucb/sed" in mod_sed, e.g. the maximum number of characters in a line or in the hold buffer.
About

Basant Kukreja
