Last week I was at a customer site and was asked to give a 30-minute presentation to the "Friday Club" on a topic of my choosing. This came as a bit of a surprise, as I was there to assist with some migration work from another App Server to Oracle. However, I agreed and chose a presentation I gave at the UK Oracle User Group last November, which was originally written for a Java SIG talk I gave in London. Let me share my thoughts with you on J2EE scalability.
In the IT industry we put a lot of weight on benchmarks, especially if they are run or audited by a third party such as TPC-x or SPEC J. However, in the real world the fact that Oracle App Server is x% faster than vendor Y is only of interest if the difference is significant, at least 10%; otherwise it will be swallowed up in other areas.
With that in mind, a fast, scalable App Server is a good base on which to build a scalable J2EE architecture, but it is not enough.
A good definition of scalability is
"The ability of a system to cope with increasing demand"
A system uses Resources (R) which are some function (f) of load (L)
R = f(L)
We can categorise scalability as being of three types.
Linear scalability is what we hope for: double the load and we need to double the resources. Exponential scalability is what we dread: double the load and the resources required more than double. Sub-linear scalability is heaven, if only we could get there.
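To make the three categories concrete, here is a minimal sketch that compares f(2L) with 2 x f(L) for a given resource function. The class name, the enum and the sample functions are my own illustrations, not measurements from any real system. The case the text calls "exponential" is named SUPER_LINEAR here, since the test is only whether resources more than double.

```java
import java.util.function.DoubleUnaryOperator;

// Illustrative only: classifies how resources R = f(L) grow when the load doubles.
class ScalingSketch {
    enum Scaling { SUB_LINEAR, LINEAR, SUPER_LINEAR }

    // Compares f(2L) with 2 * f(L); the tolerance absorbs floating-point rounding.
    static Scaling classify(DoubleUnaryOperator f, double load) {
        double ratio = f.applyAsDouble(2 * load) / f.applyAsDouble(load);
        double tolerance = 1e-9;
        if (ratio < 2.0 - tolerance) return Scaling.SUB_LINEAR;
        if (ratio > 2.0 + tolerance) return Scaling.SUPER_LINEAR;
        return Scaling.LINEAR;
    }

    public static void main(String[] args) {
        // Hypothetical resource functions, chosen only to show each category.
        System.out.println(classify(l -> 3 * l, 100));        // LINEAR
        System.out.println(classify(l -> l * l, 100));        // SUPER_LINEAR
        System.out.println(classify(l -> Math.sqrt(l), 100)); // SUB_LINEAR
    }
}
```

In practice f is not known up front; you would measure resource usage at a few load levels and compare the ratios in the same way.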
Some key limits to scalability are resource contention, non-linear scaling, cost and manageability.
Examples of resource contention include hot indices in the database causing block contention or insufficient database connections.
Non-linear scaling is often typified by things such as a full table scan. As the data-set grows the system slows down.
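The full table scan problem can be illustrated without a database. The sketch below is an in-memory analogy, not Oracle behaviour: a list stands in for a table, a hash map stands in for an index, and all names are hypothetical. It counts how many rows a scan has to examine as the data set grows.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory analogy only: a List plays the table, a HashMap plays the index.
class ScanVersusIndexSketch {
    // Builds a "table" whose rows hold the keys 0 .. rows-1.
    static List<Integer> buildTable(int rows) {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < rows; i++) table.add(i);
        return table;
    }

    // "Full table scan": examine rows in order until the key is found.
    // Returns the number of rows examined.
    static int rowsExaminedByScan(List<Integer> table, int key) {
        int examined = 0;
        for (int row : table) {
            examined++;
            if (row == key) break;
        }
        return examined;
    }

    // "Index": maps each key to its row position, so a lookup is a single probe.
    static Map<Integer, Integer> buildIndex(List<Integer> table) {
        Map<Integer, Integer> index = new HashMap<>();
        for (int pos = 0; pos < table.size(); pos++) index.put(table.get(pos), pos);
        return index;
    }

    public static void main(String[] args) {
        for (int rows : new int[] {1_000, 10_000, 100_000}) {
            List<Integer> table = buildTable(rows);
            int lastKey = rows - 1; // worst case for the scan
            System.out.println(rows + " rows: scan examined "
                    + rowsExaminedByScan(table, lastKey) + " rows, index found position "
                    + buildIndex(table).get(lastKey) + " with one lookup");
        }
    }
}
```

The work done by the scan grows with the table even though the request itself has not changed, which is why a system that was fine in testing slows down as production data accumulates.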
The cost of providing additional hardware or software licences may be a barrier to increased system growth.
The ability to manage the system can limit the ability to scale the solution. Moving to scores of small, low-cost machines sounds cost effective, but without suitable management tools such as Oracle Grid Control it will result in a huge administration overhead.
Common problems I come across when looking at customers' systems that are not scaling as expected include the following.
My thoughts on this are based on the observation that a 16-CPU box costs more than four 4-CPU boxes. Use lots of smaller boxes: it is cheaper to scale, cheaper to add resiliency, and the incremental cost of scaling is lower.
It is possible to scale horizontally by adding more identical machines fronted by a load balancer. It is also possible to scale vertically by splitting the EJB and web tiers apart, but this introduces additional latency and is more complicated, so I don't feel it is a good approach. The resources needed for a single combined tier will always be less than the resources required for a split EJB/web tier.
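As a rough picture of what "identical machines fronted by a load balancer" means, here is a minimal round-robin selector. It is a sketch only: real load balancers also handle health checks, session affinity and weighting, and the server names below are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Minimal round-robin selection over a pool of identical servers (illustrative).
class RoundRobinSketch {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(List<String> servers) {
        this.servers = new ArrayList<>(servers); // defensive copy
    }

    // Returns the next server in rotation; safe to call from many threads.
    String pick() {
        // floorMod keeps the index valid even after the counter overflows.
        int i = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(i);
    }

    public static void main(String[] args) {
        // Hypothetical host names.
        RoundRobinSketch balancer = new RoundRobinSketch(Arrays.asList("app1", "app2", "app3"));
        for (int request = 0; request < 6; request++) {
            System.out.println("request " + request + " -> " + balancer.pick());
        }
    }
}
```

Because every machine is identical, adding capacity is just a matter of adding another entry to the pool.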
So far we have only spoken about achieving linear scalability. The key to achieving the nirvana of sub-linear scalability is caching.
Within the JVM we can use the Oracle Java Object Cache to cache Java objects. If necessary we can share these objects across JVMs. Basically, we should cache objects that are large or expensive to create in the JVM. Objects that are extremely expensive to create should be shared across JVMs. The Oracle Java Object Cache supports both of these models.
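The idea of caching expensive objects inside one JVM can be sketched with nothing but the standard library. To be clear, this is not the Oracle cache API: it is a stand-in built on ConcurrentHashMap, with hypothetical names, and it does not cover sharing across JVMs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Stand-in for an in-JVM object cache; NOT the Oracle cache API.
class InJvmCacheSketch<K, V> {
    private final Map<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;          // the expensive creation step
    private final AtomicInteger loads = new AtomicInteger();

    InJvmCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // Creates the object on first request and serves it from memory afterwards.
    V get(K key) {
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // How many times the expensive loader actually ran.
    int loadCount() {
        return loads.get();
    }

    public static void main(String[] args) {
        // Hypothetical "expensive" object: a report built from a customer id.
        InJvmCacheSketch<String, String> cache =
                new InJvmCacheSketch<>(id -> "report for " + id);
        cache.get("customer-1");
        cache.get("customer-1");
        cache.get("customer-2");
        System.out.println("requests: 3, expensive loads: " + cache.loadCount());
    }
}
```

This is where sub-linear behaviour comes from: the number of requests can keep growing while the number of expensive loads stays tied to the number of distinct objects.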
In addition to the Java Object Cache we can also use the Oracle Web Cache. This web-based caching tool can be used to front any HTTP-based server, not just Oracle. The Oracle Web Cache allows work to be done once for many users. The web cache supports invalidation of content based on time, or on messages from the database or application server tier. This allows very aggressive caching strategies. One customer I worked with saw their database CPU usage reduced by two thirds and their app server CPU usage reduced by three quarters when they deployed the web cache in front of their application.
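The two invalidation styles mentioned above, expiry by time and invalidation by message, can be sketched as follows. This models the idea only and is not how Oracle Web Cache is implemented or configured. The class, the injected clock and the URLs are assumptions made for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Illustrative page cache with time-based expiry and explicit invalidation.
class InvalidatingCacheSketch {
    private static final class Entry {
        final String body;
        final long expiresAtMillis;

        Entry(String body, long expiresAtMillis) {
            this.body = body;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> pages = new ConcurrentHashMap<>();
    private final Function<String, String> origin; // the server behind the cache
    private final long ttlMillis;
    private final LongSupplier clock;               // injected so expiry is testable

    InvalidatingCacheSketch(Function<String, String> origin, long ttlMillis, LongSupplier clock) {
        this.origin = origin;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    // Serves from the cache unless the entry is missing or has expired.
    // Simplification: two threads missing at once may both call the origin.
    String get(String url) {
        long now = clock.getAsLong();
        Entry entry = pages.get(url);
        if (entry == null || now >= entry.expiresAtMillis) {
            entry = new Entry(origin.apply(url), now + ttlMillis);
            pages.put(url, entry);
        }
        return entry.body;
    }

    // Message-driven invalidation: the back end says the content has changed.
    void invalidate(String url) {
        pages.remove(url);
    }

    public static void main(String[] args) {
        InvalidatingCacheSketch cache = new InvalidatingCacheSketch(
                url -> "rendered " + url, 60_000, System::currentTimeMillis);
        System.out.println(cache.get("/catalogue")); // fetched from the origin
        System.out.println(cache.get("/catalogue")); // served from the cache
        cache.invalidate("/catalogue");              // e.g. a price changed
        System.out.println(cache.get("/catalogue")); // fetched again
    }
}
```

Explicit invalidation is what makes long expiry times safe: content can be cached aggressively because the back end can evict it the moment it changes.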
We allow operators to cope with increasing demand by managing many boxes as one using clustering technology. This allows changes to be made once and automatically applied to all machines in a cluster, so that the overhead of managing multiple machines is marginal.
Another key to scaling is the use of grid computing coupled with standard builds to allow rapid provisioning of new servers.
Use IDEs to improve developer productivity. JDeveloper and other tools can greatly enhance productivity over vi or emacs. Use frameworks such as Struts and Oracle ADF. Trade some runtime performance for speed of development: a CPU will cost less than $2,000 but a developer will probably cost more than $100,000 per year. Put like this, one developer year buys 50 CPUs, or one CPU costs about a developer week.
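The arithmetic behind that comparison is simple enough to check. The sketch below uses the two boundary figures from the text; the number of paid weeks per year is my own assumption and is left as a parameter, because it decides whether a CPU comes out slightly above or slightly below one developer week.

```java
// Back-of-envelope cost comparison using the figures quoted in the text.
class CostSketch {
    // How many CPUs one year of developer cost would buy.
    static double cpusPerDeveloperYear(double developerCostPerYear, double cpuCost) {
        return developerCostPerYear / cpuCost;
    }

    // How many developer weeks one CPU costs; weeksPerYear is an assumption.
    static double developerWeeksPerCpu(double developerCostPerYear, double cpuCost,
                                       double weeksPerYear) {
        double costPerWeek = developerCostPerYear / weeksPerYear;
        return cpuCost / costPerWeek;
    }

    public static void main(String[] args) {
        double developer = 100_000; // dollars per year, lower bound from the text
        double cpu = 2_000;         // dollars, upper bound from the text

        System.out.println(cpusPerDeveloperYear(developer, cpu)); // 50.0
        // With 52 paid weeks this is just over one week; with 48 working weeks just under.
        System.out.println(developerWeeksPerCpu(developer, cpu, 52));
        System.out.println(developerWeeksPerCpu(developer, cpu, 48));
    }
}
```

Either way the conclusion holds: if a framework saves a developer more than about a week, it has paid for an extra CPU.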
Use automated testing. Do unit-level testing with JUnit, and have a master JUnit test suite that tests all application components. System testing with Mercury Interactive LoadRunner or similar will reveal all kinds of scalability issues, as well as being a valuable tool for regression testing. Test scalability with tools. Use tools to stress the application. Test with different users. Test with expected volumes. Test, Test, Test!
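A commercial load testing tool does far more than this, but the core idea of stressing an application with increasing numbers of concurrent users fits in a few lines. The sketch below is not LoadRunner or JUnit: it is a toy driver built on the standard library, and the "request" is a stand-in counter rather than a real call to an application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Toy load driver: N simulated users each issue M requests concurrently.
class LoadTestSketch {
    // Returns the elapsed wall-clock time in milliseconds for the whole run.
    static long run(int users, int requestsPerUser, Runnable request) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(users);
        try {
            List<Callable<Void>> tasks = new ArrayList<>();
            for (int u = 0; u < users; u++) {
                tasks.add(() -> {
                    for (int i = 0; i < requestsPerUser; i++) request.run();
                    return null;
                });
            }
            long start = System.nanoTime();
            // invokeAll waits for every user; get() rethrows any request failure.
            for (Future<Void> result : pool.invokeAll(tasks)) result.get();
            return (System.nanoTime() - start) / 1_000_000;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger served = new AtomicInteger();
        // Replace the counter with a real call to the system under test.
        for (int users : new int[] {1, 2, 4, 8}) {
            served.set(0);
            long millis = run(users, 1_000, served::incrementAndGet);
            System.out.println(users + " users: " + served.get()
                    + " requests in " + millis + " ms");
        }
    }
}
```

Plotting elapsed time against the number of users across a few runs gives a first impression of whether the system is scaling linearly or worse.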