December 2008 Archives

December 2, 2008

Day One UKOUG

First Day in Birmingham

Yesterday was the first day of the UK Oracle User Group conference in Birmingham – my home town.  The best presentation I attended was by Sten Vesterli, entitled “What’s Hot and What’s Not – an Overview of Oracle Development Tools”.  It gave a very realistic assessment of the options open to Oracle developers when choosing a development tool.  He also pointed out something that we all need to remember: use the right tool for the job.  This theme was later reinforced by Tom Kyte in his outstanding keynote, “The best way …”.  He made a strong case for understanding multiple languages, observing that this gives you access both to the right tools and to different ways of thinking.  Today I have a presentation on choosing the right Fusion Middleware component for the job.  Unfortunately I am up against Mike Lehman, so I suspect my attendance will be low.

If you are at the UKOUG stop by my presentation tomorrow (Wednesday) on using a SOA maturity model to drive project selection.

December 3, 2008

Limits to Scalability

Cameron on Limits to Scalability

Whilst working at Los Alamos, Richard Feynman delivered a series of lectures to the other scientists on the fundamentals of mathematics.  Of course everyone present knew where he was going, but the pleasure was in watching how he got there.  I had a similar experience watching Cameron Purdy’s presentation today at the Oracle User Group Conference in Birmingham.  His presentation was entitled “Top 10 Patterns for Scaling Out Java™ Technology-Based Applications”.  There wasn’t anything in there I hadn’t seen before, but watching Cameron build up the story and explain the why was a wonderful experience that clarified my thinking and opened up new possibilities to me.  Cameron originally gave this presentation at JavaOne 2008, where he had an hour slot; at the OUG he had only 45 minutes.  No-one complained when he ran over time, and Simon Haslam, as session chair, let him run over because he was as enthralled as everyone else by the delivery style and content.

So what were Cameron’s 10 patterns?

10. Understand the Problem

9. Define the Requirements

8. Architecture trumps technology

7. Understand the Basics

6. Visualize the Network

5. Visualize the Design

4a. Plan for Overload

4b. Partition for Scalability

3a. Plan for Failure

3b. Replicate for Availability

2. Tier where it makes sense

1. Simplify

If you didn’t attend the presentation at JavaOne or UKOUG then you missed out on half the experience.  If you did miss out, read the presentation anyway; you will probably come away wiser and with greater insight.  I know I did.

December 19, 2008

My Application is Too Fast … NOT!

I have lost track of the number of customers who come up to me and say “Antony, I’m worried that my application is too fast”.  Actually I have just checked my notebooks and I can announce that the number is zero.  What I do get asked a lot is “What technologies does Oracle have to help me cache my data?”  Obviously there are lots of caching options in different Oracle products, but there are three key technologies that focus specifically on caching to improve application performance: TimesTen, Web Cache and Coherence.

Some customers (and Oracle sales reps) get confused by these products and by which is most appropriate for a particular situation, so I thought I would shed a little light on the subject.

A Simple Application Model

Before considering all the caching options it is worth considering a simple model for web-based applications.  A typical application receives a request from a browser, parses the request, does some processing in the application server, requests data from and/or updates a database, formats a response and returns the response to the browser.  There is plenty of opportunity in there for things to go slowly.

image

With this simple but surprisingly accurate model we can examine where the three caching products fit.
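The model can be written down as a tiny sketch, which makes it easier to see where each cache would sit.  This is only an illustration: the function names and the dictionary standing in for the database are made up for this example and do not come from any Oracle product.

```python
import time

# Hypothetical stand-ins for the stages in the simple model.
def parse_request(raw):
    path, _, query = raw.partition("?")
    return {"path": path, "query": query}

def fetch_data(request, database):
    # The database round trip: the step a data cache would shorten.
    return database.get(request["path"], "not found")

def format_response(data):
    # Page generation: the step a front-end page cache would skip entirely.
    return "<html><body>" + data + "</body></html>"

def handle(raw, database):
    """Run one request through every stage and record how long each took."""
    timings = {}

    start = time.perf_counter()
    request = parse_request(raw)
    timings["parse"] = time.perf_counter() - start

    start = time.perf_counter()
    data = fetch_data(request, database)
    timings["data"] = time.perf_counter() - start

    start = time.perf_counter()
    page = format_response(data)
    timings["format"] = time.perf_counter() - start

    return page, timings

database = {"/orders": "42 open orders"}   # a dict standing in for the database
page, timings = handle("/orders?user=antony", database)
```

Timing each stage separately like this is a cheap way to find out which of the three caches is worth trying first.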

Database Caching

The typical database server spends a lot of time ensuring that frequently accessed data is available in memory, so what can be done to improve an already optimised process?  Using TimesTen as an in-memory database can give a significant performance boost when reading data, because it holds a subset of the data from the database in memory, usually on the same machine as the application server.  This avoids the network hop associated with accessing the database server and eliminates any need to read data from disk.  TimesTen can be configured to act as a pass through cache, so that any requests that cannot be satisfied by TimesTen are passed on to the database transparently.  A request may not be satisfied because the data is not being cached, or because the SQL constructs being used are not supported in TimesTen.  Updates may also be processed in TimesTen, with support for write through (update the cache and write to the database as part of the request) and write behind (write the data to the database in the background, after the client thinks the request has been processed).  This is important because it enables TimesTen to be largely transparent to its clients, meaning that it is easy to add TimesTen to an existing application just by re-configuring its database connections.
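The three behaviours described above (pass through, write through and write behind) can be sketched in a few lines.  This is a toy model and not the TimesTen API: the class name, its methods and the dictionary standing in for the database are all assumptions made for illustration, and a real cache speaks SQL rather than key lookups.

```python
from collections import deque

class PassThroughCache:
    """Toy cache in front of a slower store; `database` is a plain dict here."""

    def __init__(self, database, cached_keys, write_behind=False):
        self.database = database
        self.cached_keys = set(cached_keys)     # only this subset is held in memory
        self.memory = {k: database[k] for k in self.cached_keys if k in database}
        self.write_behind = write_behind
        self.pending = deque()                  # writes not yet sent to the database
        self.database_reads = 0

    def read(self, key):
        if key in self.memory:
            return self.memory[key]             # served locally, no network hop
        self.database_reads += 1                # not cached: pass through
        return self.database.get(key)

    def write(self, key, value):
        if key not in self.cached_keys:
            self.database[key] = value          # not cached: pass straight through
            return
        self.memory[key] = value
        if self.write_behind:
            self.pending.append((key, value))   # caller returns before the database write
        else:
            self.database[key] = value          # write through: part of the request

    def flush(self):
        """Background step for write behind: drain queued writes in order."""
        while self.pending:
            key, value = self.pending.popleft()
            self.database[key] = value

db = {"a": 1, "b": 2, "c": 3}
cache = PassThroughCache(db, cached_keys=["a", "b"], write_behind=True)
from_memory = cache.read("a")       # held in memory
from_database = cache.read("c")     # passed through to the database
cache.write("a", 10)
before_flush = db["a"]              # database has not seen the update yet
cache.flush()
after_flush = db["a"]
```

The gap between `before_flush` and `after_flush` is the price of write behind: the client gets its answer sooner, but for a short time the database lags behind the cache.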

image

To summarise, TimesTen can cache data in memory next to the application that requires it, providing a performance boost, and it continues to provide a SQL interface to that data, so the addition of TimesTen can be made transparent to the application.  TimesTen can be used to accelerate SQL queries where the data sets can be held in memory.  Introducing TimesTen should have no impact on existing code, so it is an easy component to try out to see how it improves performance.

Web Caching

When we look at what the application is doing in our simple model we are struck by the amount of work required to generate a web page.  The Web Cache sits in front of the application server and returns cached pages without the need to call the application server.  Unlike TimesTen, the Web Cache caches data on demand, so it will only give a performance boost the second and subsequent times it is asked for the same data.  If Web Cache does not have a page that has been requested, it transparently requests the page from the application server and saves it locally in memory before returning the response to the client.  Web Cache is configured with rules that decide which content is cached and which content, such as updates, is always passed to the application server.  The Web Cache has several ways of caching data:

  • Whole page caching – the whole web page is cached.  This is non-intrusive to the application, requiring no changes to how the application operates, and it is a good way to start using Web Cache.  Web Cache understands parameters to pages and can be made to ignore certain parameters if they do not affect the content of the page.
  • Partial page caching – the application is modified to return the page in multiple logical sections, which are often re-used across pages.  A portal will often use this technique, with each portlet being a logical section.  Even though no two users may have the same page, each user may be using the same data in portions of the page that are also used by other users.  The Web Cache will assemble the multiple portions of the page into a single page that is returned to the browser.  This approach allows a much higher degree of caching than the whole page approach, but it does require the application server to return partial pages rather than whole pages.  This is a little intrusive in that the application must be modified to take advantage of partial page caching.  However, applications are often written this way already, with a top level page that includes a number of smaller page segments.  Applications that use jsp:include are very easy to modify to take advantage of partial page caching.
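Both approaches can be sketched with one small class.  This is an illustration only and not how Web Cache is configured: the `PageCache` class, the `render_fragment` function and the `session` parameter are hypothetical names chosen for the example.  To keep it short, the toy origin is handed the normalised URL, so an ignored parameter cannot leak into a cached body.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

class PageCache:
    """Toy front-end cache; `origin` is any callable that renders a URL."""

    def __init__(self, origin, ignored_params=()):
        self.origin = origin
        self.ignored = set(ignored_params)
        self.pages = {}
        self.origin_calls = 0

    def cache_key(self, url):
        parts = urlsplit(url)
        # Drop parameters that do not change the page, then sort the rest so
        # that parameter order does not create duplicate entries.
        kept = sorted((k, v) for k, v in parse_qsl(parts.query)
                      if k not in self.ignored)
        query = urlencode(kept)
        return parts.path + ("?" + query if query else "")

    def get(self, url):
        key = self.cache_key(url)
        if key not in self.pages:           # first request: ask the application
            self.origin_calls += 1
            self.pages[key] = self.origin(key)
        return self.pages[key]              # later requests: served from memory

def render_fragment(url):
    # Stand-in for the application server rendering one page section.
    return "<div>" + url + "</div>"

fragments = PageCache(render_fragment, ignored_params=["session"])

def assemble(fragment_urls):
    """Partial page caching: build a page from separately cached sections."""
    return "".join(fragments.get(u) for u in fragment_urls)

alice = assemble(["/news?session=1", "/portfolio?user=alice&session=1"])
bob = assemble(["/news?session=2", "/portfolio?user=bob&session=2"])
```

Alice and Bob never see the same whole page, yet the news section is rendered only once; only the two portfolio sections differ, so the application is called three times rather than four.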

image

To summarise, Web Cache can cache both whole pages and partial pages with minimal modification to the application.  Interestingly, the partial page markup uses the same tags as the Akamai content delivery network, which places servers at key points in the Internet and provides caching of content closer to the browser.  The Web Cache can provide a large performance boost to predominantly read-only sites such as e-tailers.  The ability to control the freshness of the data, together with support for explicit data invalidation, means that the Web Cache can be used in far more scenarios than most people appreciate.
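Freshness control and explicit invalidation are what make changing data cacheable, and the idea fits in a short sketch.  Again this is a toy, not the Web Cache interface: the class, the `max_age_seconds` setting and the injectable `clock` (used here so the example does not have to wait a real minute) are assumptions made for illustration.

```python
import time

class FreshnessCache:
    """Toy cache whose entries expire by age or by explicit invalidation."""

    def __init__(self, origin, max_age_seconds, clock=time.monotonic):
        self.origin = origin
        self.max_age = max_age_seconds
        self.clock = clock
        self.entries = {}                   # key -> (value, time stored)
        self.origin_calls = 0

    def get(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and now - entry[1] < self.max_age:
            return entry[0]                 # still fresh: serve from memory
        self.origin_calls += 1              # missing or too old: refetch
        value = self.origin(key)
        self.entries[key] = (value, now)
        return value

    def invalidate(self, key):
        """Called by the application when it knows the data has changed."""
        self.entries.pop(key, None)

now = [0.0]                                 # fake clock we can move by hand
prices = {"widget": "9.99"}
cache = FreshnessCache(lambda k: prices[k], max_age_seconds=60,
                       clock=lambda: now[0])

first = cache.get("widget")                 # fetched from the origin
prices["widget"] = "7.99"
stale = cache.get("widget")                 # within 60 seconds: old price served
cache.invalidate("widget")
fresh = cache.get("widget")                 # invalidated: new price fetched
now[0] = 61.0
expired = cache.get("widget")               # too old: fetched again
```

The age limit bounds how stale a page can get, and the invalidation call removes even that window for the updates the application knows about.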

In Memory Data Grid

The poster child of the Oracle caching technologies is the Oracle Coherence in-memory data grid.  It provides a way to cache very large data sets in memory by not limiting the memory to that available in a single machine, but instead using the memory of multiple machines, the grid, to cache the data.  For caching purposes Coherence can be set up as a three layer cache: a local cache for recently used data, a distributed cache across multiple machines to store large volumes of data, and a backing store from which to retrieve data that is not held in memory.  For example, a very large data set of terabytes in size may have a frequently used subset of some tens or hundreds of gigabytes, with individual applications having a working subset of a few tens of megabytes.  The working subset may be stored in a local cache, just by specifying that there is a local cache and its aging and eviction policies; there is no need to decide explicitly which data is cached.  The frequently used subset may be stored in the data grid’s distributed cache, again using the same kind of rules as the local cache but now acting on a larger data set.  Finally, any cache misses can be satisfied by going to the backing store, which is usually a large database.  Data may be written either in write through mode, where the client waits for the data to be written to the backing store, or in write behind mode, where the data is written into the cache (which replicates the data for safety) and only written into the backing store when convenient, whilst the client continues processing.  To further improve performance, queries for data may be distributed across the data grid, and aggregation calculations are also done in parallel on all data grid nodes.  The ultimate performance boost can be obtained by moving processing into the data grid itself, moving the processing to the data rather than the other way round.
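The three layers can be sketched as follows.  This is a single-process toy and not the Coherence API: the class and method names are invented, a list of dictionaries stands in for the grid members, and the replication that a real grid uses for safety is left out.

```python
from collections import OrderedDict
import zlib

class TieredCache:
    """Toy three layer cache: local, then grid nodes, then backing store."""

    def __init__(self, backing_store, node_count=3, local_size=2):
        self.backing_store = backing_store                 # stands in for the database
        self.nodes = [dict() for _ in range(node_count)]   # stand in for grid members
        self.local = OrderedDict()                         # small per-application cache
        self.local_size = local_size
        self.hits = {"local": 0, "grid": 0, "store": 0}

    def _node_for(self, key):
        # Stable hash so that a key always lands on the same node.
        return self.nodes[zlib.crc32(key.encode()) % len(self.nodes)]

    def _remember_locally(self, key, value):
        self.local[key] = value
        self.local.move_to_end(key)
        if len(self.local) > self.local_size:
            self.local.popitem(last=False)                 # evict least recently used

    def get(self, key):
        if key in self.local:
            self.hits["local"] += 1
            self.local.move_to_end(key)
            return self.local[key]
        node = self._node_for(key)
        if key in node:
            self.hits["grid"] += 1
            value = node[key]
        else:
            self.hits["store"] += 1                        # cache miss: read through
            value = self.backing_store[key]
            node[key] = value
        self._remember_locally(key, value)
        return value

store = {"k1": "v1", "k2": "v2", "k3": "v3"}
cache = TieredCache(store, local_size=2)
for key in ["k1", "k1", "k2", "k3", "k1"]:
    cache.get(key)
```

The caller never says which layer to use.  The last read of `k1` finds it evicted from the two-entry local cache but still held in the grid, so the backing store is not touched again.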

All this sounds ideal and raises the question of why we don’t use Coherence for everything.  Within Oracle there is a move to use Coherence extensively in a number of products, but this is an aspiration rather than a reality for most products currently.  The reason is that Coherence is an intrusive technology that requires the application to be modified to take advantage of it.  Using Coherence for simple caching generally requires the least modification to the application, and exploiting the distributed processing capabilities of the data grid often requires the most.
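To see why distributed processing means more change to the application, consider an average over data spread across the grid.  The sketch below is an assumption-laden toy rather than Coherence code: three dictionaries stand in for the data held on three nodes, and threads stand in for the nodes working in parallel.

```python
from concurrent.futures import ThreadPoolExecutor

# Each inner dict stands in for the data held by one grid node.
partitions = [
    {"order-1": 120.0, "order-2": 80.0},
    {"order-3": 50.0},
    {"order-4": 200.0, "order-5": 50.0},
]

def local_sum_and_count(partition):
    """Runs next to the data: only two numbers travel back to the caller."""
    return sum(partition.values()), len(partition)

def grid_average(partitions):
    # Ask every node for its partial result in parallel, then combine them.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(local_sum_and_count, partitions))
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count

average = grid_average(partitions)
```

The application has to be restructured so that the calculation is split into a per-node step and a combining step, which is exactly the kind of modification simple caching does not need.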

image

In summary, Coherence can provide a tremendous performance boost to both read and write intensive applications, but it requires modification of the application to get this benefit.  Many customers have found it well worth their while to make these modifications, but doing so does increase the time taken to introduce Coherence into an environment.  In the future expect to see more and more Oracle products providing a Coherence option to boost performance.

Summary

If you believe your database is a bottleneck in your performance then consider TimesTen as a transparent way to boost SQL performance for both read and write operations.  If you also want to reduce the load on your application servers then consider the use of Web Cache as a front end for read intensive sites.  For more powerful distributed caching and massive scalability consider using Coherence.  Note that both Coherence and TimesTen can be used with any application, not just web applications, whilst Web Cache by its nature can only cache HTTP and HTTPS responses.  Indeed many customers use Coherence not just as a data grid but as an application platform, making the data grid the core of some of their environments.

Personally I don’t believe there are many sites that couldn’t benefit from caching technologies.  All three technologies have their place and all of them can boost the performance of even supposedly “uncacheable” data.  The benefits of using caching are three-fold:

  • Reduced response time (latency) – users get their response faster
  • Improved scalability (throughput) – better use is made of resources, so the demand for computing resources can grow at a lower rate than the use of the application.
  • Reduced hardware (efficiency) – better use is made of resources, and so there is a need for less hardware.
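A little arithmetic shows how the first two benefits follow from the cache hit ratio.  The numbers below are assumptions picked to keep the sums easy, not measurements of any product.

```python
def average_response_ms(hit_ratio, cache_ms, backend_ms):
    """Average latency when a fraction `hit_ratio` of requests hit the cache."""
    return hit_ratio * cache_ms + (1 - hit_ratio) * backend_ms

def backend_requests_per_second(hit_ratio, total_rps):
    """Load that still reaches the back end after the cache absorbs its share."""
    return (1 - hit_ratio) * total_rps

# Assumed figures: 4 ms from cache, 100 ms from the back end, 75% hit ratio.
latency = average_response_ms(0.75, cache_ms=4, backend_ms=100)
backend_load = backend_requests_per_second(0.75, total_rps=1000)
```

With these assumed figures the average response drops from 100 ms to 28 ms, and only 250 of every 1000 requests per second reach the back end, which is where the hardware saving comes from.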

So which cache is best for you?  I have no idea, but hopefully you can now make a more informed decision!

About December 2008

This page contains all entries posted to Antony Reynolds' Blog in December 2008. They are listed from oldest to newest.
