Friday Nov 13, 2009

Sizing an RTD installation - Part 1

In every implementation of RTD it is necessary to determine the hardware configuration to support the expected loads of RTD applications. While we try to provide guidelines and generalizations, it helps to understand the most significant factors that affect the desired hardware configuration. In a series of blog entries we describe the different factors that need to be considered.

Throughput

The first factor to consider is the expected load, in terms of the number of events per second that the servers will need to handle. These events are of different types and may therefore place different loads on the servers.

Estimating the number of events per second usually begins with whatever metrics are available. Examples of typical metrics include:

• Web site pages served per second/day/month
• Web site [unique] visitors per month
• Web site visits/sessions per day
• Call Center calls per day
• Average call length
• Maximum number of concurrent agents
• IVR calls handled per day

The first thing to do with these metrics is to translate them to "per second" numbers. The translation from large time periods, like months, cannot be done by simply dividing by the number of seconds in a month, because there are typically busier days and busier hours of the day, and the servers must be sized for those peaks.

Some rules of thumb that I have found to result in numbers that are pretty close to reality for a wide variety of situations are as follows:

• Monthly numbers can be divided by 10 to produce the numbers for a busy day
• Daily numbers can be divided by 10 to produce the numbers on a busy hour
• Hourly numbers can be divided by 3000 (or sometimes 2000), rather than the full 3600 seconds in an hour, to produce the number per second
• If number of pages per visit is unknown, 10 to 15 can be assumed for many sites
• If call length is unknown, 5 minutes can be assumed
• Dividing the number of concurrently active agents by the length of a call (in seconds) gives the number of call starts per second
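The rules of thumb above can be captured as a few small helper functions. This is only a minimal sketch in Python; the function names and the default values (3000 effective seconds per hour, 300 seconds per call) are illustrative choices taken from the rules above, not part of RTD.

```python
# Hypothetical helpers that encode the rules of thumb listed above.

def busy_day_from_month(monthly_count):
    """Busy-day volume: the monthly count divided by 10."""
    return monthly_count / 10


def busy_hour_from_day(daily_count):
    """Busy-hour volume: the daily count divided by 10."""
    return daily_count / 10


def per_second_from_hour(hourly_count, effective_seconds=3000):
    """Per-second rate: the hourly count divided by 3000 (or sometimes 2000)."""
    return hourly_count / effective_seconds


def call_starts_per_second(concurrent_agents, call_length_seconds=300):
    """Call starts per second: concurrent agents divided by call length in seconds."""
    return concurrent_agents / call_length_seconds
```

Using a smaller divisor for the hour (2000 instead of 3000) gives a more conservative, higher per-second estimate.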

From these we can compute the expected number of requests per second. Let's look at some examples.

Web example: a bank. Only the following information is available: "The bank has 5M customers, of them 2M have signed up for online banking. They are planning to use RTD to determine content and promotions in several places in most online banking pages."

Since this is all the information we have, we will do a calculation based on many assumptions. Later on we can confirm or adjust our assumptions based on any additional information we are given.

Assuming half of the signed-up customers are active (1M), and that each makes 4 visits per month on average, we have 4M visits per month. Using the rules of thumb above, we can assume 400k visits on a busy day and 40k in a busy hour. Dividing by 2000 seconds per hour gives us about 20 visits started per second. Assuming 10 pages per visit and 3 requests per page, we have 30 requests per visit and therefore 600 requests per second.
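The same arithmetic can be written as a short Python sketch, which makes it easy to adjust the assumptions later. The function name and parameter names are hypothetical; the default values are the assumptions made in this example, not measured data.

```python
def web_requests_per_second(
    signed_up_customers,
    active_fraction=0.5,             # assumed: half of signed-up customers are active
    visits_per_month=4,              # assumed: visits per active customer per month
    pages_per_visit=10,              # assumed: pages per visit
    requests_per_page=3,             # assumed: RTD requests per page
    effective_seconds_per_hour=2000  # rule-of-thumb divisor (2000 or 3000)
):
    """Estimate peak RTD requests per second for a web site."""
    monthly_visits = signed_up_customers * active_fraction * visits_per_month
    busy_day_visits = monthly_visits / 10     # month -> busy day
    busy_hour_visits = busy_day_visits / 10   # busy day -> busy hour
    visits_per_second = busy_hour_visits / effective_seconds_per_hour
    return visits_per_second * pages_per_visit * requests_per_page


# The bank example: 2M customers signed up for online banking.
print(web_requests_per_second(2_000_000))  # 600.0
```

Changing a single assumption shows how sensitive the estimate is. For example, using 3000 instead of 2000 for the seconds divisor lowers the result to about 400 requests per second.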

Call Center example: "A telco has 5000 agents in the call center. They are interested in implementing RTD for offer recommendations at the end of service calls."

Let's assume that the maximum number of agents active at any given time is roughly two thirds of the agents, rounded up to 3500. Assuming 5-minute calls, which is 300 seconds, we have 3500 / 300, or about 12 call starts per second. Assuming 4 requests per call, we have about 48 requests per second.
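As a rough sketch, the call center estimate can be expressed in Python as follows. The function and parameter names are hypothetical, and the defaults are the assumptions used in this example. Note that the unrounded result is a little lower than the figure above, because the text rounds 11.7 call starts per second up to 12 before multiplying.

```python
def call_center_load(active_agents, call_length_seconds=300, requests_per_call=4):
    """Return (call starts per second, RTD requests per second).

    Assumes every active agent is continuously on calls of the given length.
    """
    call_starts = active_agents / call_length_seconds
    return call_starts, call_starts * requests_per_call


# The telco example: about 3500 agents active at the peak.
starts, requests = call_center_load(3500)
print(round(starts, 1), round(requests, 1))  # 11.7 46.7
```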

In upcoming posts we will explore other considerations that come into play when selecting a configuration.