Now that we know how to compute the number of requests per second, and we have seen the other things that need to be considered, we can finally compute the number of CPUs needed to cope with the desired load. This number is actually quite easy to compute. For planning purposes we usually account for 100 requests/second/CPU. This leaves enough room for higher peak loads or for underestimations elsewhere in the process. In typical cases we see a higher throughput per CPU.
For example, if we need to support 300 requests per second, we can plan for 3 CPUs for Decision Service. The other processes, Learning Server and Workbench Server, can usually be run either on one of the Decision Service CPUs or on their own.
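The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the planning rule, not part of RTD; the function name and the rounding up to whole CPUs are our own assumptions.

```python
import math

# Planning figure from the text: 100 requests/second per CPU.
REQUESTS_PER_SECOND_PER_CPU = 100


def decision_service_cpus(peak_requests_per_second,
                          per_cpu=REQUESTS_PER_SECOND_PER_CPU):
    """Return the number of CPUs to plan for Decision Service.

    Hypothetical helper: rounds up to a whole CPU and never returns
    fewer than one.
    """
    if peak_requests_per_second < 0:
        raise ValueError("load cannot be negative")
    return max(1, math.ceil(peak_requests_per_second / per_cpu))


# The example from the text: 300 requests per second.
print(decision_service_cpus(300))  # 3
```

A load that is not a multiple of 100, say 250 requests per second, rounds up to 3 CPUs as well.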
Now, let's say that there is the desire to use standard servers with 2 CPUs, each CPU with 4 cores. In this case, one server would have more than enough computing power to cope with the number of requests per second. Nevertheless, we may choose to have 2 of these servers, that is, 16 cores in total, to provide for high availability.
If this same configuration were used with Disaster Recovery, then we may end up running two servers at each of two sites, for a total of 32 CPU cores. That, of course, is more computing power than necessary to cope with the load.
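The core counts for these configurations follow from a simple multiplication, sketched here. The function and parameter names are ours, chosen for illustration; the default values are the ones used in the example.

```python
def total_cores(cpus_per_server=2, cores_per_cpu=4,
                servers_per_site=2, sites=1):
    """Total CPU cores across all servers and sites (hypothetical helper)."""
    return cpus_per_server * cores_per_cpu * servers_per_site * sites


# High availability at one site: 2 servers x 2 CPUs x 4 cores.
print(total_cores())          # 16

# Adding a Disaster Recovery site doubles it.
print(total_cores(sites=2))   # 32
```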
An alternative that is counterintuitive for people running transactional applications is to have RTD running on just one server, and pay the price of non-availability. This may be acceptable depending on the application. For example, in offer optimization, if the expected downtime of a single server is just a couple of hours per year, then the cost of having non-redundant servers may be lower than the cost of having an HA setup.
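To make this trade-off concrete, here is a rough sketch of the comparison. It makes a simplifying assumption of our own: that the HA setup removes the downtime entirely, so the only costs compared are the expected cost of downtime and the yearly cost of the extra server. All names are hypothetical, and the figures in the usage example are placeholders, not measurements.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours in a non-leap year


def availability(downtime_hours_per_year):
    """Fraction of the year a single server is expected to be up."""
    return 1 - downtime_hours_per_year / HOURS_PER_YEAR


def single_server_is_cheaper(downtime_hours_per_year,
                             cost_per_downtime_hour,
                             yearly_cost_of_ha):
    """Compare expected downtime cost with the yearly cost of an HA setup.

    Assumption: the HA setup eliminates the downtime completely.
    """
    expected_downtime_cost = downtime_hours_per_year * cost_per_downtime_hour
    return expected_downtime_cost < yearly_cost_of_ha


# Placeholder figures for illustration only.
print(round(availability(2) * 100, 2), "% availability")
print(single_server_is_cheaper(2, 1000, 20000))
```

With two hours of downtime per year the server is still available well over 99.9% of the time, which is why a single server can be a reasonable choice when an hour of missed decisions is cheap.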
In any case, the numbers above are for basic planning purposes. If there are many sessions being initialized and not so many other kinds of events, then the equations may look different, because a session initialization usually takes more resources than other events. Additionally, the load balancing strategy in front of the RTD servers also affects performance. Maximum speed is attained when the load balancing scheme is capable of maintaining session affinity, that is, when all requests for a given session are routed to the same server.
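As an illustration of what session affinity means, the sketch below routes every request for a given session key to the same server by hashing the key. This is a generic technique, not a description of how any particular load balancer or RTD itself does it; the function name, the server names, and the choice of hash are all assumptions.

```python
import hashlib


def pick_server(session_key, servers):
    """Map a session key to one server, always the same one for the same key.

    Hypothetical helper: uses a stable hash (SHA-256) so the mapping does
    not change between processes or restarts.
    """
    if not servers:
        raise ValueError("at least one server is required")
    digest = hashlib.sha256(session_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


# Placeholder server names for illustration only.
servers = ["rtd-server-1", "rtd-server-2"]
first = pick_server("customer-12345", servers)
second = pick_server("customer-12345", servers)
print(first == second)  # True: same session key, same server
```

Note that with this simple modulo scheme, adding or removing a server remaps most sessions; real load balancers typically use cookies or consistent hashing to avoid that.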
Finally, for really high throughput, in the thousands of requests per second, the strategy is to partition the servers along some strict lines. This partitioning strategy can be taken all the way into the database.