CMT CPUs have been around for quite a while now. It is well known that they were developed for parallel, throughput-oriented workloads. However, finding out whether a specific application is a good fit for these CPUs remains a challenge, and it is one of my personal FAQs. Here are a few hints that might help you answer this question yourself.
The first and most important criterion for suitability is always the service time of your application. If it meets your requirements, the application is OK on CMT. If it does not, and the cause is actually the CPU and not some other high-latency component (like a remote database), you will need to test on other CPU architectures.
It is of course desirable that the application is multi-threaded and that the individual threads actually perform useful work in parallel. Only then will the application be able to make use of all the CPU resources available. It is important to understand that (high) server utilization can never be a criterion for good application performance. Performance always needs to be measured using application metrics like throughput (transactions per minute) or service time, or both. Server or CPU utilization only indicates whether the application actually makes good use of the available resources.
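To make the two application metrics concrete, here is a minimal Python sketch that derives throughput and service time from a list of completed transactions. The function name `summarize` and the input format (start and end timestamps in seconds) are assumptions made for illustration; your application will log these values in its own way.

```python
def summarize(transactions):
    """Compute throughput and service time from completed transactions.

    transactions: list of (start_seconds, end_seconds) pairs.
    This input format is hypothetical; adapt it to your own logs.
    Returns (transactions per minute, mean service time, max service time).
    """
    if not transactions:
        raise ValueError("need at least one completed transaction")

    # Service time of each transaction: how long a single request took.
    service_times = [end - start for start, end in transactions]

    # Observation window: from the first start to the last end.
    first_start = min(start for start, _ in transactions)
    last_end = max(end for _, end in transactions)
    window = last_end - first_start
    if window <= 0:
        raise ValueError("observation window must be longer than zero")

    throughput_per_minute = len(transactions) / window * 60.0
    mean_service_time = sum(service_times) / len(service_times)
    return throughput_per_minute, mean_service_time, max(service_times)


if __name__ == "__main__":
    log = [(0.0, 0.5), (0.0, 1.0), (10.0, 10.5), (29.0, 30.0)]
    tpm, mean_s, max_s = summarize(log)
    print(f"throughput: {tpm:.1f} tx/min, mean: {mean_s:.2f}s, max: {max_s:.2f}s")
```

Note that neither number says anything about CPU utilization. Both are measured purely from the application's point of view, which is the point of the paragraph above.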
Use threadbar to establish whether an application is multi-threaded and whether its threads are actually doing something. Or, if you're more the command-line type, use prstat -L. It lists the individual threads of your processes, sorted by CPU usage. What you want to see here is many threads with CPU usage higher than 0%. It is also important to check whether any thread is limited by the single-thread performance of the CPU strand. A first estimate of this can also be obtained from prstat. The "%CPU" column relates to all active CPUs (strands) in your system. For example, a T5120 would show you 64 CPUs. One strand is therefore equivalent to 1/64, or about 1.6%, of the overall system. What you want is that no single thread of your application permanently consumes about 1.6% CPU. That would be a hint that the thread is limited by the single-thread performance of the CPU strand.
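The arithmetic behind this check is simple enough to sketch in a few lines of Python. The sketch below does not parse prstat output. It assumes you have already collected a few %CPU readings per thread by hand, and the thread ids, the sample values, and the 0.9 tolerance factor are all made up for illustration.

```python
def strand_share(num_strands):
    """Percentage of the whole system that one hardware strand represents."""
    if num_strands <= 0:
        raise ValueError("num_strands must be positive")
    return 100.0 / num_strands


def strand_bound_threads(samples, num_strands, tolerance=0.9):
    """Return ids of threads that stayed near one full strand in every sample.

    samples: dict mapping a thread id to a list of %CPU readings
             (hypothetical format, collected manually from prstat -L).
    tolerance: fraction of a strand's share that counts as saturated.
               The default of 0.9 is an assumption, not a documented threshold.
    """
    limit = strand_share(num_strands) * tolerance
    return sorted(
        tid
        for tid, readings in samples.items()
        if readings and all(r >= limit for r in readings)
    )


if __name__ == "__main__":
    # Made-up readings for three threads on a 64-strand system.
    samples = {
        "java/17": [1.5, 1.6, 1.5],  # permanently at about one strand
        "java/18": [0.3, 0.4, 0.2],  # busy, but far from the limit
        "java/19": [1.5, 0.1, 1.6],  # bursts only, not permanent
    }
    print(f"one strand = {strand_share(64):.4f}% of the system")
    print("possibly strand-bound:", strand_bound_threads(samples, 64))
```

With these sample values only "java/17" is flagged, because it is the only thread that sits near the per-strand share in every reading. A thread that only occasionally peaks at that value, like "java/19", is not the kind of permanent consumer described above.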
I've put these hints and some examples into a small presentation. Perhaps it's helpful for some of you.