Caching Dynamic Content with Apache HTTP Server on OpenSolaris 2008.11

This article will show how to use Apache's mod_cache and mod_disk_cache to cache dynamic content from a web application, reducing load on the application and improving response time.  For the purposes of illustration, Tomcat will host the web application, and the requests to be cached are the example JSPs installed with Tomcat.

Setup and initial experimentation

Ensure that the following packages are installed:

  • SUNWapch22 (Apache HTTP Server)
  • SUNWapch22m-jk (mod_jk, for communicating with Tomcat)
  • SUNWtcat (Tomcat)

The example commands assume that the user is the primary administrator.

If Apache is not already set up, run the following commands to import the SMF definitions for Apache and enable it to run as a service:

$ svccfg import /var/svc/manifest/network/http-apache22.xml
$ svcadm enable network/http:apache22

If Tomcat is not already set up, run the following commands to create a sample configuration and start Tomcat:

$ pfexec cp /var/apache/tomcat/conf/server.xml-example \
/var/apache/tomcat/conf/server.xml
$ pfexec /usr/apache/tomcat/bin/startup.sh

Tomcat now listens for HTTP on port 8080, with a default page that links to a rich set of sample JSPs and servlets. Verify with Firefox:

$ firefox http://127.0.0.1:8080/

mod_jk is an Apache plug-in module which routes requests received by Apache over to Tomcat. The default mod_jk configuration already maps requests for /servlets-examples/ received by Apache to the Tomcat servlet examples. In order to route the JSP examples over to Tomcat, edit /etc/apache2/2.2/conf.d/jk.conf and add the following line after the existing JkMount directive:

JkMount /jsp-examples/* worker1

Restart Apache in order to pick up the changed configuration:

$ svcadm restart apache22

You can now access both /jsp-examples/ and /servlets-examples/ through Apache, with the web applications running on Tomcat. Verify:

$ firefox http://127.0.0.1/jsp-examples/

Next, we'll enable mod_cache and mod_disk_cache for both sets of examples. (mod_cache provides the general caching capability, and mod_disk_cache provides the cache storage.)

Create the file /etc/apache2/2.2/conf.d/cache_tomcat_examples.conf with these contents:

# Locate the disk cache.  Make sure there is adequate
# disk space available on this path.  Use htcacheclean
# to maintain the cache size at a maximum level.
CacheRoot /var/apache2/2.2/proxy

CacheEnable disk /jsp-examples/
CacheEnable disk /servlets-examples/

Restart Apache in order to pick up the changed configuration:

$ svcadm restart apache22

Now load http://127.0.0.1/jsp-examples/jsp2/el/basic-arithmetic.jsp from the browser and look in the CacheRoot directory:

$ firefox http://127.0.0.1/jsp-examples/jsp2/el/basic-arithmetic.jsp
$ pfexec find /var/apache2/2.2/proxy
$

Nothing is in the cache directory, so the application response was not cached.

In order to diagnose caching issues like this, turn on Apache's debug tracing. Edit /etc/apache2/2.2/httpd.conf and change the LogLevel directive to debug.

Restart Apache in order to pick up the changed configuration:

$ svcadm restart apache22

Request http://127.0.0.1/jsp-examples/jsp2/el/basic-arithmetic.jsp again from the browser and see these debug messages in the error log, /var/apache2/2.2/logs/error_log:

Adding CACHE_SAVE filter for /jsp-examples/jsp2/el/basic-arithmetic.jsp
Adding CACHE_REMOVE_URL filter for /jsp-examples/jsp2/el/basic-arithmetic.jsp
cache: /jsp-examples/jsp2/el/basic-arithmetic.jsp not cached. Reason: No 
  Last-Modified, Etag, or Expires headers

mod_cache looks for certain HTTP response header fields (Last-Modified, ETag, or Expires) when determining whether a response is cacheable, and it finds none of them. The "What Can Be Cached" documentation for mod_cache has the best summary of the conditions required for caching, as well as pointers to directives that apply to some of those conditions; consult that documentation when responses are not being cached.
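
The same check can be made from the command line without raising the log level. The following is a minimal sketch: the function name cache_headers is made up for illustration, it assumes a grep that supports -E (GNU grep or /usr/xpg4/bin/grep), and it only looks for the three header fields named in the log message, not the other cacheability conditions.

```shell
# Read HTTP response headers on stdin and print whichever of the three
# header fields named in the mod_cache log message are present.
# The function name is illustrative, not part of any Apache tool.
cache_headers() {
    tr -d '\r' | grep -i -E '^(Last-Modified|ETag|Expires):' \
        || echo "none of Last-Modified, ETag, Expires present"
}

# Possible usage, assuming curl is installed:
#   curl -s -D - -o /dev/null \
#       http://127.0.0.1/jsp-examples/jsp2/el/basic-arithmetic.jsp | cache_headers
```

If the function reports that none of the fields are present, the response will be rejected by mod_cache for the reason shown in the log above.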

In the best case, applications will specify an appropriate expiration value in their responses. However, the web server administrator often has to work around a lack of this information. A simple solution for many cases is to use mod_expires to set an Expires response header field which indicates how long the response is valid; mod_cache will then allow the response to be cached for that period of time.

Edit the file /etc/apache2/2.2/conf.d/cache_tomcat_examples.conf and add these lines to the bottom:

<Location /jsp-examples/>
  ExpiresActive On
  # The response is valid for the next hour.  Browsers don't have to
  # request it again, and the response can be served to other clients
  # from the disk cache.
  ExpiresDefault "access plus 60 minutes"
</Location>

(You can also use other Apache containers to set the expires configuration for specific requests or for requests matching a pattern; you can use the ExpiresByType directive to set Expires based on the content type of the response.)

Restart Apache in order to pick up the changed configuration:

$ svcadm restart apache22

Load the page again with a browser and check the error log for the following:

cache: serving /jsp-examples/jsp2/el/basic-arithmetic.jsp

Now these requests are cached because of the Expires response header. Any requests for URIs starting with /jsp-examples/ will be served from the disk cache for the next 60 minutes, if other caching requirements are met.
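
The cache directory should now be populated as well. As a rough sketch, the function below counts the entries under a CacheRoot; it relies on mod_disk_cache storing each cached response as a ".header" file alongside a ".data" file. The function name count_cache_entries is made up for illustration, and the count is only approximate when responses vary on request headers, since those add extra header files.

```shell
# Count cached responses under a mod_disk_cache CacheRoot by counting
# the ".header" files. The function name is illustrative.
count_cache_entries() {
    find "$1" -type f -name '*.header' | wc -l | tr -d ' '
}

# Possible usage, run with enough privilege to read the cache directory:
#   count_cache_entries /var/apache2/2.2/proxy
```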

Production considerations

Maintaining the cache

mod_disk_cache does not control the amount of disk space used for the cache; that task is handled by htcacheclean.

Run this command to start htcacheclean now:

$ pfexec /usr/apache2/2.2/bin/htcacheclean -d3 -n -p/var/apache2/2.2/proxy -l100M -i

Note: /var/apache2/2.2/proxy must match the CacheRoot directive in the Apache configuration.

An RFE has been opened to provide an SMF manifest for htcacheclean in a future release. In the meantime, a quick-and-dirty way to start and stop htcacheclean at run level 3 is to create the following files:

File /etc/rc3.d/S99htcacheclean:

#!/bin/sh
/usr/apache2/2.2/bin/htcacheclean -d3 -n -p/var/apache2/2.2/proxy -l100M -i

File /etc/rc3.d/K99htcacheclean:

#!/bin/sh
pgrep htcacheclean | xargs -n 1 kill -TERM

Mark both files executable:

$ pfexec chmod +x /etc/rc3.d/S99htcacheclean /etc/rc3.d/K99htcacheclean

(You can also combine the stop and start operations into a single script and make S99htcacheclean and K99htcacheclean symlinks to that script; consult other references for standard init script practices or, better yet, work on SMF support for htcacheclean and discuss it with other members of the OpenSolaris Web Stack Project on the mailing list.)
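
A combined script might look something like the sketch below. It is not a tested init script: the function name htcacheclean_ctl and the HTCACHECLEAN and CACHE_ROOT variables are made up for illustration, and it uses pkill in place of the pgrep pipeline shown earlier.

```shell
#!/bin/sh
# Sketch of a combined start/stop script for htcacheclean.
# The function and variable names are illustrative.
HTCACHECLEAN=${HTCACHECLEAN:-/usr/apache2/2.2/bin/htcacheclean}
CACHE_ROOT=${CACHE_ROOT:-/var/apache2/2.2/proxy}

htcacheclean_ctl() {
    case "$1" in
    start)
        # Same options as the S99htcacheclean script.
        "$HTCACHECLEAN" -d3 -n -p"$CACHE_ROOT" -l100M -i
        ;;
    stop)
        # -x matches the process name exactly.
        pkill -TERM -x htcacheclean
        ;;
    *)
        echo "Usage: htcacheclean {start|stop}" >&2
        return 1
        ;;
    esac
}

# In the installed script, the last line would dispatch on the argument
# that init passes to rc scripts:
#   htcacheclean_ctl "$1"
```

One possible layout, given here only as an assumption, is to install the script as /etc/init.d/htcacheclean and link it to both /etc/rc3.d/S99htcacheclean and /etc/rc3.d/K99htcacheclean.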

ZFS

Consider creating a separate ZFS filesystem specifically for the disk cache, in order to use the following ZFS capabilities:

  • quota: Set a hard limit at the filesystem level as a fail-safe measure in addition to htcacheclean's limit processing.
  • compression: Use ZFS compression to reduce the physical size of the disk cache. In some situations, this can improve performance due to reduced I/O.
  • atime (off): Turn off maintenance of access time for the filesystem. Neither mod_disk_cache nor htcacheclean requires that information.
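
As a sketch of how these three properties could be set when the filesystem is created, the function below wraps a single zfs create call. The function name create_cache_fs, the ZFS variable, the dataset name rpool/httpcache and the 200M quota are all assumptions chosen for illustration; pick a quota somewhat above the htcacheclean limit so that it acts only as a fail-safe.

```shell
# Create a ZFS filesystem for the disk cache with a quota, compression
# enabled, and access-time updates disabled.
# Function name, variable name, and example values are illustrative.
ZFS=${ZFS:-zfs}

create_cache_fs() {
    dataset=$1
    mountpoint=$2
    quota=$3
    "$ZFS" create \
        -o mountpoint="$mountpoint" \
        -o quota="$quota" \
        -o compression=on \
        -o atime=off \
        "$dataset"
}

# Possible usage, run with sufficient privilege:
#   create_cache_fs rpool/httpcache /var/apache2/2.2/proxy 200M
```

The mount point should be empty when the filesystem is created, and the new filesystem must be writable by the user that Apache runs as, so stop Apache and adjust ownership as needed before switching over.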

Determining cache use from logs

LogLevel debug cannot ordinarily be used in high-load production environments because of the large volume of logging, but no other facility is provided for logging whether a response was served from the cache.
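
On a test system, or during a short sampling window with debug logging enabled, the "cache: serving" messages shown earlier can be counted. This is a minimal sketch; the function name count_cache_served is made up for illustration, and it simply counts matching lines for a URI prefix.

```shell
# Count "cache: serving" debug messages for URIs starting with a prefix.
# $1 = path to the Apache error log, $2 = URI prefix.
# The function name is illustrative.
count_cache_served() {
    grep -F -c -- "cache: serving $2" "$1"
}

# Possible usage:
#   count_cache_served /var/apache2/2.2/logs/error_log /jsp-examples/
```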

If the dynamic content is delivered by another server when it is not served from Apache's cache, compare that server's logs with Apache's to see how many requests each one handled (if caching is effective, far more requests will be logged by Apache than by the other server).

Downstream caches

When you use mod_expires directives to provide HTTP caching information on behalf of the application, the information added to the HTTP response is used by mod_cache locally but also by any downstream caches, either in the network or in the browser. In the event that a cached object must be forcibly removed, perhaps because of an important update that must be distributed before the expiration time, the web server administrator will generally be unable to affect any downstream caches.

In order to restrict caching to the local web server, add mod_headers directives to remove the Expires header field before the response is sent to the client. This will add load to the web server, because clients and downstream caches will request the object more often, but in extreme circumstances when the object must be refreshed before the configured expiration, the web server administrator will be able to purge the web server cache, and clients will then receive the updated object.

Comments:

Hi,

I have two managed servers in a clustered environment using an Apache server via mod_jk. I am using HttpUrlConnection with the request header set to application/octet-stream. When I send a request, the Apache server forwards it, but in the response the header content is changed to text/html. How can I convert the response header (application/octet-stream) sent from the server to the client?

Thanks

Posted by subhasish on March 18, 2011 at 07:23 PM PDT #

About

Jeff Trawick
