Response to Joe Temple's blog on my blog...

My attention was called to this blog page containing Joe Temple's (of IBM) continued argument with me and the points I raised on my blog. I tried to respond by adding a comment on his blog (as I published his comments on my blog), but he declined to return the favor of publishing my response. That's a shame, don't you think?

I would prefer to ignore this and let sleeping dogs snooze. We've made our points and neither of us is going to convince the other. We probably won't convince anybody else who already has a position staked out. However, a lot of what Mr. Temple said about Sun products and about IBM System z is wrong. I dislike misrepresentation of facts, and especially misrepresentation of what I said, so I'm going to pick up the sword at least this one more time.

So, I'm going to respond to his blog. I won't recap my dismembering of the phony comparisons that accompanied the z10 announcement, as you can find that on my blog at Ten Percent Solution and No, there isn't a Santa Claus. Especially read the latter, because that is where Mr. Temple and I previously conversed, and I extended him the courtesy of saying "Joe Temple is an IBM Distinguished Engineer, and in my opinion a person who has earned respect". All the more reason for my disappointment at seeing his blog. Consequently, I feel compelled to respond to some of Mr. Temple's distortions.

First: He says of me "he compares the LSPR scaling ratios to Industry benchmark results on UNIX SMPs." I'll be blunt: Joe knows I did the exact opposite. See Ten Percent Solution where on March 18, 2008, I said of LSPR "What I object to is it being used as a marketing tool in an official IBM announcement to extrapolate performance for comparison to a completely different platform based on a workload that isn't even the same as the LSPR benchmark workload." and I describe the '*legitimate* purpose IBM has for LSPR: for same-platform-family capacity planning. Anything else should be marked with disclaimers that admit to the "this is only an estimate". For cross-platform comparisons, there are the standard open benchmarks which IBM refuses to publish for System z.' Joe read the material I quote here long before his post saying I said the opposite.

Let's go to the beginning of this. My blog responded to IBM's February announcement of the z10 and a related IBM blog that had the text "384.5 RPEs is approximately equivalent to the number of z10 RPEs at 90% when you use 20 RPEs equal to 1 MIP where MIPS are based on the LSPR curve for the z10." It was IBM using LSPR to compare z to Unix benchmarks, and it was me saying why that was baseless. Joe accuses me of what I clearly was refuting, on a blog he read months before his blog. I do not like being misrepresented in this manner.

The IBM announcement originally had a footnote 3 that had "justification" for their claims, and the blog elaborated on it. When it turned out that this fuzzy math was based on inappropriate use of a 3rd party's tools, both the announcement page and the blog were airbrushed to remove those claims. I wish I had the foresight to have printed or saved those pages, but the text I quoted has been removed. Dear Reader - feel free to speculate why.

Fortunately, the Internet has cached copies of the IBM press release before it was censored. See the original text or use http://tinyurl.com/5u3emk. That page includes the removed text: "*3 Source: On Line Transaction Processing Relative Processing Estimates (OLTP-RPEs): Derivation of 760 Sun X2100 2.8 Opteron processor cores with average OLTP-RPEs per Ideas International of 3,845 RPEs and available utilization of 10% and 20 RPEs equating to 1 MIPS compared to 26 z10 EC IFLs and an average utilization of 90%." So, there is IBM using RPEs inappropriately. The IBM blog used a mythical "LSPR ratio" (there is no such thing - there are lots of ratios, one for each combination of benchmark and platform) to extrapolate from z9 to z10, though unfortunately nobody kept a copy of the above blog page. To use Joe's wording, it is IBM that "compares the LSPR scaling ratios to Industry benchmark results on UNIX SMPs." Not me.
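For readers who want to check the footnote's arithmetic, here is a rough Python sketch of it. The function names are illustrative only, and it assumes the 3,845 RPE figure is the rated capacity that the footnote then discounts to 10% utilization; the footnote itself does not spell out its units any further.

```python
def utilized_rpes(rated_rpes, utilization_pct):
    """RPEs actually counted when a system is assumed to run at utilization_pct."""
    return rated_rpes * utilization_pct / 100


def rpes_to_mips(rpes, rpes_per_mips=20):
    """Convert RPEs to MIPS using the footnote's '20 RPEs equating to 1 MIPS'."""
    return rpes / rpes_per_mips


RATED_RPES = 3845      # figure quoted in the footnote (units assumed, see above)
DISTRIBUTED_UTIL = 10  # percent, as stipulated for the Sun servers
Z_UTIL = 90            # percent, as stipulated for the z10 IFLs

counted = utilized_rpes(RATED_RPES, DISTRIBUTED_UTIL)
mips_equivalent = rpes_to_mips(counted)
utilization_leverage = Z_UTIL / DISTRIBUTED_UTIL

print(counted)               # 384.5, the figure quoted from the IBM blog
print(mips_equivalent)
print(utilization_leverage)  # 9.0
```

The last number is the point: the 9x factor comes entirely from the stipulated utilizations, before any measured performance enters the comparison.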

By the way, the basis of IBM's claim of replacing 1,500 servers was that 1,375 of them were essentially idle. With that trick, why not claim you can replace 15,000 or 150,000, or 1,500,000 if they're powered off! See the IBM blog entry or for convenience, http://tinyurl.com/5lxzrn for IBM blogger Tony Pearson saying "125 Backup machines running idle ready for active failover in case a production machine fails. 1250 machines for test, development and quality assurance, running at 5 percent average utilization" (bold fonted for emphasis) I think we can agree that any contemporary machine can replace a large number of machines doing essentially nothing. It just costs much more if you do that on z.
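The server breakdown can be tallied in a few lines of Python. This is only a sketch using the counts quoted above; the variable names are made up for illustration, and "busy equivalents" is simply machines multiplied by their stated average utilization.

```python
TOTAL_SERVERS = 1500   # servers IBM claimed to replace
BACKUP_IDLE = 125      # "running idle ready for active failover"
TEST_DEV_QA = 1250     # "running at 5 percent average utilization"
TEST_UTIL_PCT = 5

essentially_idle = BACKUP_IDLE + TEST_DEV_QA
remaining = TOTAL_SERVERS - essentially_idle

# Work done by the test/dev/QA pool, expressed as fully busy machines.
test_busy_equivalents = TEST_DEV_QA * TEST_UTIL_PCT / 100

print(essentially_idle)       # 1375
print(remaining)              # 125
print(test_busy_equivalents)  # 62.5
```

In other words, the 1,250 test machines together do about as much work as 62.5 fully busy ones, and the 125 backups contribute none.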

Joe says that Sun disparages TPC-C because we don't do it well. Not so, IBM says the same. Read the IBM document at ftp://ftp.software.ibm.com/eserver/benchmarks/wp_TPC-E_Benchmark_022307.pdf which describes TPC-C as "an aging benchmark losing relevance", and lists its deficiencies for current computers and workloads. So, IBM also acknowledges that TPC-C is broken (except for the cases where they still use it - not on IBM z, of course - to convince the credulous).

Frankly, I've argued that we should blow away the numbers on TPC-C (I believe our current servers could handily do this), and then hold up the trophy for best TPC-C and say "we won the record", and then say outright it's bogus without people making spurious claims that we only say so because we can't do it. The counter argument, which has won so far, is that we've long made a statement of principle that TPC-C is broken, and it would be misleading and a distraction to then go out and publish new TPC-C results. Unlike some other vendors who both disparage it (see above) and also run it, depending on which audience they have handy.

What is comical about this is that Joe implies that Sun doesn't run one particular benchmark because we're unable to "keep up", while defending IBM not publishing ANY standard benchmark on System z. This is called "chutzpah". I can't make up my mind whether z-advocates truly believe they deserve a free pass and are exempt from proving performance and price/performance, or if this is a cynical ploy to avoid publishing z's price/performance.

In fact, Sun has many world record performance results on its servers. See for example a Java app server benchmark (IBM is there, but not for z, of course). Same with web serving. Or the world's largest data warehouse on Sun and Sybase. Or world record with SAP using Oracle and Sun. Click here for many more. There are lots of them.

In sharp contrast, IBM refuses to publish performance results on System z in a way that would permit direct platform comparison. Despite the handwaving and chaff Joe spreads about cache coherency (much of which is fanciful and wrong), many of the standard benchmarks are indeed very good predictors for application performance: such as web serving, file serving, Java application servers and databases. There is nothing mystical about it. These are exactly the workloads IBM wants you to run on z without offering any public evidence that z performs them adequately. Need I add that the industry standard benchmarks vary widely in terms of cache coherence and threadedness? IBM publishes none of them for z.

IBM does run workloads on z similar to some industry benchmarks, but only shows performance relative to other z models. See the LSPR page where WASDB (on z/OS) and WASDB/L (on z/Linux) is a Java application server benchmark "written to open Web and Java Enterprise APIs, making the WASDB application portable across J2EE-compliant application servers." It's very much like the Java app server benchmarks Joe says are invalid and so different from what you run on a mainframe. Except it does run on a mainframe, and IBM says it is a valid predictor of performance, contrary to what Joe says. What they won't do is make it easy for you to do a direct comparison with any other platform. Think about it. Also take the time to look at the actual benchmark descriptions: just as with the open benchmarks they have different levels of parallelism and cache interference. To describe the mainframe workloads as being on one side of parallel nirvana, purgatory or hell (see Temple's remarks on his blog, or on the copy below) is nonsensical, as the LSPR studies include all three. IBM also doesn't run the z/OS LSPR studies under z/VM, so his references to virtualization overhead are irrelevant: IBM reports the non-virtualized results, and they still scale poorly.

I do not understand Joe's obsession with simplistic characterizations of the parallelism and scalability of our processors and industry-standard benchmarks, which have a wide range of properties. Labelling them as a group having a single characteristic for scalability, NUMA sensitivity, parallelism, etc, is simply wrong. An M-series enterprise server is very different in performance characteristics and other features from a CoolThreads chip-multithreading server. To lump them together as identical is silly. A T5220, for example, has uniform memory latency - it's not NUMA in the least. Not so with an E25K or an M8000, which do have NUMA properties. Sun recognizes that different workloads have different processor requirements, and makes SPARC and Solaris binary compatible systems that can handle all of them so you can trivially move your application to the compatible system that runs it best. Unlike with IBM, where by Joe's own words (see below), you have to switch platform architectures and (consequently) all the platform software if an application turns out not to be a good performance match for z.

Joe is mistaken in how he characterizes the different processor families. One difference is that Sun's enterprise servers increase cache and I/O bandwidth as processors are added, while IBM's don't. As you enable CPs in an IBM z9 or z10 "book" you decrease the cache available to each processor, since a "book" contains a fixed amount of cache, regardless of CPUs. Not too bad for z/OS because it uses shared address spaces (eg: LPA and CSA) for multiple jobs, but a bigger problem for Linux under z/VM which has to discard cache and TLB contents on every dispatch. (Joe must have missed Sun's E25K product, which, like z, also had a large L3 cache. That's hardly unique to IBM). Anyway, the cache limitations may explain IBM z's sublinear scalability shown in all the LSPR tests, and the fact that IBM doesn't publish LSPR for more than 32 CPUs in a single OS instance. See IBM LSPR report and notice how doubling the number of CPUs doesn't double the measured throughput and performance. It's sometimes stated that IBM doesn't do full-system tests of a single OS instance on z for reasons of cost, but that's obviously not the case: IBM runs LSPR tests on fully-configured z systems but they need multiple OS instances to drive the boxes and still can't get near linear scale.

Joe also neglected to respond to my blog's mention of negative scale on IBM mainframes (which he has already seen): a 16-way z990 doing only 1,322 web transactions per second, and a 24-way doing under 1,000. That's right - adding CPUs resulted in lower performance! So much for the claims of scalability for IBM mainframe. Crikey! Any contemporary 1RU x86 or SPARC server will dramatically outperform that at a tiny fraction of the cost, floor space, software licenses, staff, and environmentals. Talk about "inconvenient facts". Instead of misapplying Gunther's book, I suggest Joe read "IBM Mainframes: Architecture and Design, 2nd ed" (Prasad and Savit). Or my VM performance and internals books, for that matter.
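To put numbers on that negative scale, here is a small Python sketch. The helper names are hypothetical, and because the 24-way figure is only given as "under 1,000", the code treats 1,000 as a ceiling, so the real results are at least this bad.

```python
def per_cpu_throughput(tps, cpus):
    """Transactions per second delivered per CPU."""
    return tps / cpus


def scaling_efficiency(base_tps, base_cpus, new_tps, new_cpus):
    """Measured speedup divided by ideal (linear) speedup; 1.0 means linear."""
    measured_speedup = new_tps / base_tps
    ideal_speedup = new_cpus / base_cpus
    return measured_speedup / ideal_speedup


TPS_16WAY = 1322
TPS_24WAY_CEILING = 1000  # "under 1,000", so an upper bound, not a measurement

print(per_cpu_throughput(TPS_16WAY, 16))  # 82.625
print(per_cpu_throughput(TPS_24WAY_CEILING, 24))
print(scaling_efficiency(TPS_16WAY, 16, TPS_24WAY_CEILING, 24))
```

A measured speedup below 1.0 means the larger configuration did less total work than the smaller one, which is worse than merely sublinear scaling.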

So, I urge you to simply ignore all this hand-waving about cache and NUMA properties and the supposed magic capabilities on IBM's z line. It's completely inaccurate, in the cases where it isn't irrelevant. It's just chaff to distract people from looking at z's disastrously bad price/performance.

Joe also disingenuously claims IBM doesn't enjoy monopoly pricing for z. Of course it does. If you want to run z/OS applications, you have nowhere to run them except on IBM processors using IBM's z/OS, CICS, VTAM, etcetera (it's not just the hardware price). IBM has the luxury of charging high margins because there are no Amdahl or Hitachi systems to compete against, nor alternative sources for the software I just listed, and because they know rehosting an application takes a substantial amount of effort. I have hands-on experience with the excellent Unikix application suite, now owned by Clerity Inc. It absolutely can rehost a z/OS batch + CICS application on open systems without rewriting. I've seen successful migrations with a replacement TCO a mere 10% to 15% of the mainframe's TCO (that's TCO, folks. Not acquisition cost.) But, it requires a good deal of effort. The barriers to exit from mainframe are much, much higher than between Unix dialects. That's the whole point of Open Systems, and why you shouldn't check into a z-motel it's hard to check out of.

So, yes. Mainframes are proprietary and have monopoly pricing. Just try to buy a z/OS equivalent or a system to run it on from anyone else. See for example IBM shuts up competitor PSI by buying it - The INQUIRER. In contrast, you can run Solaris on SPARC machines from us and from a partner+competitor, and you can run Solaris on x86 servers from dozens of vendors - including IBM.

Which reminds me: Joe distorts the agreement between IBM and Sun with his statement "Finally, Sun itself recognized System z and zVM as "the premier virtualization platform" when Sun and IBM jointly announced support of Open Solaris on IBM hardware." We did no such thing: the statement was support for Solaris on IBM's x86 (Intel) servers, not on "System z and zVM". Surely Joe knows this. See IBM Expands Support for the Solaris OS on x86 Systems. By the way, I've been involved with the OpenSolaris on z project since its inception - I obtained the workstation the non-IBM developers used for the port, personally installed it on z/VM, and it's not finished yet, either. It does give me the chance to do like-to-like performance comparisons of SPARC, Intel, and IBM System z (same app, same OS), and it substantiates everything I've been saying here.

The last paragraph illustrates one of the things that most disturbs me about this sorry episode. I've witnessed a shameless willingness from certain quarters to "just make things up". I've seen claims that you could use z/OS features when running z/Linux under z/VM, and completely imaginary ratios between different computers and their own, and claims that people (like me) said the exact opposite of what they actually said. It's really a sad thing, and makes me doubt the many years I spent as a loyal IBM customer and (alas) fan.

Jeff


Just to be on the safe side, since some things I've referred to have disappeared from the sites that originally hosted them, here is a copy of the blog article I am responding to. Can't be too careful these days. As he notes, my material (following the "Posted by" line) is mingled with his. His text is in blue italics, just as on his blog.

Response to Jeff Savit Blog

As part of the announcement of z10 IBM made some marketing claims about the large number of distributed Intel servers that could be consolidated with zVM on a z10.  The example cited used Sun rack optimized servers with Intel Architecture CPUs.  Sun Blogger Jeff Savit objected strenuously to the claims mainly because of the low utilization assumed on the Sun machines that the claims compared to.  You can read it here:

http://blogs.sun.com/jsavit/entry/no_there_isn_t_a

I responded, he responded.  I was out of pocket for a while and did not respond soon enough, and his blog cut off replies on that thread.  I am putting my latest response here.  Thanks to Mainframe blog for providing the venue to do so.  My latest responses to Jeff are in blue italics.

Posted by Joe Temple on June 24, 2008 at 11:28 AM EDT #

This format is very difficult for parry and riposte, but let's try. I would like to use different colors, but I can't (AFAIK) put in HTML markup to permit that. So: Joe's stuff verbatim within brackets, and each of his sections starts with a quote of a sentence of mine (which I identify, within quotes) for context. Each stanza identified by name and employer (this is Jeff speaking):

Joe(IBM): [[[Jeff, your post is rather long and rather than build a point by point discussion too long for a single comment I will put up several comments. Starting with the moral of the story: There are several: • quoting Jeff: "Use open, standard benchmarks, such as those from SPEC and TPC."

Better to use your own. They have not been hyper tuned and specifically designed for. They have a better chance of representing reality. But be careful not to measure wall clock time on “hello world” or lap tops will beat servers every time.]]] 

Jeff(Sun): In a perfect world, every customer would have the opportunity to test their applications on a wide variety of hardware platforms to see how they perform. But they don't, and they rely on open standard benchmarks to give them some information about how the platforms would perform. Or, they do have applications they could benchmark, but they're non-portable, or run solely on a single CPU (making all non-uniprocessor results worthless), or otherwise have poor scalability or any of a hundred other problems. Imagine comparing IBM processors based on the speed of somebody writing to tape with a blocksize of 80 bytes! Even if they get a useful result, the next customer doesn't benefit at all and has to start from scratch. It's not trivial to make good benchmarks that aren't flawed in some way. That's why the benchmark organizations exist - to provide benchmarks that characterize performance and give a level playing field for all vendors. IBM, Sun, and others are active in them - our employers must think they have value. Obviously there is "benchmarketing" and misuse of benchmarks. THAT is what I'm railing against. Hence, my following bullet that says "read and understand". But frankly, benchmarks Specweb/specwebssl/Specjvm, the SPEC fileserver benchmarks, and benchmarks like TPC.org's TPC-E provide representative characterization of system performance (with sad exceptions like TPC-C, which is broken and obsolete, but IBM still uses for POWER). The characterization of TPC-C as "old and broken"  may have something to do with Sun's inability to keep up on that benchmark.  One of the characteristics of TPC-C that none of the other benchmarks has is that it has at least some "non local" accesses in the transactions.  Sun's problem with this is that such accesses defeat the strong NUMA characteristic of their large machines.  One of the results of this  is that all machines scale worse on TPC-C than on the benchmarks Jeff cites. 
Since Sun is very dependent on scaling a large number of engines to get large machine capacity close to IBM's machines they are highly susceptible to this.   The effect  is  exacerbated by NUMA (non uniform memory access).  That is, a flat SMP structure will mitigate this.   The mainframe community's problem with TPC-C is that the non-local traffic is all balanced and a low percentage of the load.  As a result TPC-C still runs best on a machine with a hard affinity switch set and does not drive enough cache coherence traffic to defeat numa structures.  When workload runs this way it does not gain any advantage from z's schedulers or shared cache or flat design. Think of TPC-C as a fence.  There is workload on Sun's side and there is workload on the mainframe side of TPC-C.  All the Industry Standard Benchmarks sit on Sun's side and scale more linearly than TPC-C.  For workloads that are large enough to need scale that run on the Sun side of the TPC-C fence, IBM sells System p and System x.  When you consolidate disparate loads the Industry Standard benchmarks do not represent the load and  with enough "mixing"  the  composite workload will eventually move to the mainframe side of the TPC-C fence.  See Neil Gunther's Guerilla Capacity Planning, for a discussion of contention and coherence traffic and their effect on scale.  Particularly  read chapter 8, to get an idea about how the benchmarks lead to overestimation of scaling capability.    A lot of people have worked very hard to make them be as good as they are. IBM uses these benchmarks all the time - with the notable exception of System z.  System z is designed  to run workloads with non uniform memory access patterns, randomly variable loads, and much more serialization and cache migration than occurs in the standard benchmarks , where strong affinity hurts, rather than enhances throughput. It is the only machine designed that way (Large shared L3 and only 4 way NUMA on 64 processors). 
Also, the standard benchmarks are generally used for "benchmarketing".  As a result the hard work involved is not purely driven by the noble effort by technical folks that Jeff portrays, but rather by practical business needs, including the need to show throughput and scale in the best possible light.  That's the point, isn't it. It works in a monopoly priced marketplace where it doesn't have to compete on price/performance,  as it does with its x86 and POWER products. Where else are you going to run CICS, IMS, and JES2?  There are alternatives to System z on all workloads, it is a matter of migration costs v benefits of moving.  Many applications have moved off CICS and IMS to UNIX and Windows over the years. Sun has whole marketing programs to encourage migration.  In fact a large fraction of UNIX/Windows loads do work that was once done on mainframes.  As a result the mainframe must compete.   Similar costs are incurred moving work from any UNIX (Solaris, HPUX,  AIX, Linux) to zOS. Or moving from UNIX to Windows.  The other part of the barrier is the difference in machine structure.  This barrier is workload dependent.  Usually, when considering two platforms for a given piece of work one of the machine structures will be a better fit.   When moving work in the direction favored by the machine structure difference the case can be made to pay for the migration.  This is what all vendors do.  Greg Pfister (In Search of Clusters), suggests that there are three basic categories of work.  Parallel Hell, Parallel Nirvana, and Parallel Purgatory.  I would suggest that there are three types of machines optimized for these environments (Blades in Nirvana, Large UNIX machines in Purgatory, and Mainframes in Hell).  To the extent that workload is in parallel hell, the barrier to movement off the mainframe will be quite high.   Similarly attempts to run purgatory or nirvana loads on the mainframe will run into price and scaling issues. 
IBM asserts that consolidation of disparate workloads using virtualization will drive the composite workload toward parallel hell, where the mainframe has advantages due to its design features, mature hypervisors and machine structure.

To the second observation about wall clock time on trivial applications: yes, obviously.

Joe(IBM): [[[quoting Jeff: •"Read and understand what they measure, instead of just accepting them uncritically."
Yes, particularly understand that the industry standard benchmarks run with low enough variability and low thread interaction that it makes sense to turn on a hard affinity scheduler. Your workload probably does not work this way.]]] 

Jeff(Sun): I'm not sure what's intended by that. Are you claiming that benchmarks should be run against systems without fully loading them to see what they can achieve at max loads? Hmm. Anyway, see below my comments about low variability and low thread count - which applies nicely to IBM's LSPR.]]]   I guess I am claiming that the industry benchmarks basically represent parallel nirvana and parallel purgatory.  I am asserting that mixing workload under single OS or virtualizing servers within an SMP drives platforms toward parallel hell.  The near linear scaling of the industry standard loads on machines optimized for them will not be achieved on mixed and virtualized workloads.  In part this is because sharing the hardware across multiple applications will lead to more cache reloads and migrations than occur in the benchmarks.   I see Jeff's reference  to LSPR as a red herring for two reasons.  While LSPR has not been applied across the industry,  the values it contains have been used to do capacity planning rather than marketing. The loads for which this planning is done are usually a combination of virtualized images each either running mixed and workload managed  under zOS or  VM and zLinux.   This could not be done successfully if  the scalability were as idealized as the Industry standard benchmarks.   Second, I do not suggest that LSPR is the answer, but rather that the current benchmarks do not sufficiently represent the workloads in question (mixed/virtualized) for Jeff to make the claim that z does not scale as he did elsewhere in the blog entry.  Basically,  to draw his conclusion he compares the LSPR scaling ratios to Industry benchmark results on UNIX SMPs. This is not  a good comparison.

Joe(IBM): [[[quoting Jeff: •"Get the price-tag associated with the system used to run the benchmark." Better to understand your total costs including admin, power, cooling, floorspace, outages, licensing, etc."

Jeff(Sun): That's what I meant. Great.  Because the hardware price difference that Sun usually talks about is only a small percentage of total cost.  The share of total cost represented by hardware price shrinks every year.

Joe(IBM): [[[quoting Jeff: • Relate benchmarks to reality. Nobody buys computers to run Dhrystone." Only performance engineers run benchmarks for a living.]]]

Jeff(Sun): Sounds like a dog's life, eh? OTOH, they don't have users...

Joe(IBM): [[[quoting Jeff: •"Don't permit games like "assume the other guy's system is barely loaded while ours is maxed out". That distorts price/performance dishonestly." Understand what your utilization story is by measuring it. Don’t permit games in which hypertuned benchmarks with little or no load variability and low thread interaction represent your virtualized or consolidated workload. Understand the differences in utilization saturation design points in your IT infrastructure and what drives them."]]]

Jeff(Sun): Your comment has nothing to do with what I'm describing. What I'm talking about is the dishonest attempt to make expensive products look competitive by proposing that they be run at 90% utilization, while the opposition is stipulated to be at 10%, and claim magic technology (like WLM, which z/Linux can't use) to permit higher utilization and claim better cost per unit of work on your own kit. That's nothing more than a trick to make mainframes look only 1/9th as expensive as they are. Imagine comparing EPA mileage between two cars by spilling 90% of the gas out of the competitor's tank before starting. As far as "no load variability and low thread interaction", I suggest you take a good look at IBM's LSPR. See http://www-03.ibm.com/servers/eserver/zseries/lspr/lsprwork.html which describes long running batch jobs (NO thread interaction at all) on systems run 100% busy (NO load variability). The IMS, CICS (mostly a single address space, remember), and WAS workloads in LSPR should not be assumed to be different in this regard either. This doesn't make LSPR evil: it is not - it's very useful for comparisons within the same platform family. But consider SPECjAppserver, which has interactions between web container, JSP/servlet, EJB container, database, JMS messaging layer, and transaction management - many in different thread and process contexts. I suggest you reconsider your characterization about thread interaction. Complaints about thread interaction and variability of load are misplaced and misleading.  The comparison of zLinux /VM at high utilization with highly distributed solution at low utilization is valid, and well founded on both data  and system theory.   You could make similar comparisons of  consolidated  Virtualized UNIX v  distributed Unix, VMware v Distributed Intel.  Any cross comparison of virtualized v distributed servers  will be leveraged mainly by utilization rather than by raw  performance as measured by benchmarks.  
Thus the comparison Jeff complains about as dishonest does in fact represent what happens when consolidating existing servers using virtualization.   My second point is that in making comparisons between consolidated mixed workload solutions that industry benchmarks are not representative of the relative capacity or the saturation design point for each of the  systems in question.  There is no current benchmark to use for these comparisons.  This includes LSPR, Sun's Mvalues, rPerfs,  as well as the industry benchmarks.  None of them works.  Each vendor asserts leverage for consolidation based on their own empirical results, or perceived strengths in terms of machine design.     I am saying that the scaling of these types of workloads is  less linear than the industry benchmark results and that  some of the things z leverages to do LSPR well  will  apply in this environment as well.

Joe(IBM): [[[quoting Jeff: •"Don't compare the brand-new machine to the competitor's 2 year old machine" Understand what the vintage of your machine population is. When you embark on a consolidation or virtualization project compare alternative consolidated solutions, but understand that the relative capacity of mixed workload solutions is not represented by any of the existing industry standard benchmarks.]]] 

Jeff(Sun): We're talking at cross purposes. What I mean is that one vendor's 2008 product tends to look a lot better than the competition's 2002 box, making invidious comparisons easy. Moore's Law has marched on.  The truth is that when you do a consolidation you usually deal with a range of servers some of which are 4 or 5 years old.  2 year old  vintage is probably fairly representative.  In any case Moore's law does not improve utilization of distributed boxes unless you consolidate work in the process of upgrading. Unless a consolidation is done the utilization will drop when you replace old servers with new servers.  For the consolidation to occur within a single application, the application has to span multiple old servers in capacity.  Server farms are full of applications which do not use a single modern engine efficiently let alone a full multicore server.   Jeff's main argument is with the utilization comparison.   The utilization of distributed servers, including HP's, Sun's and IBM's, is  very often quite low.  It is possible to consolidate a lot of low utilized servers on a larger machine. The mainframe has a long term lead in the ability to do this, that includes hardware design characteristics (Cache/Memory Nest), specific scheduling capability in hypervisors (PR/SM and VM), and hardware features (SIE).   How many two year old low utilized servers  running disparate work can an M9000 consolidate?   

Joe(IBM): [[[quoting Jeff: • "Insist that your vendors provide open benchmarks and not just make stuff up."
Get underneath benchmarketing and really understand what vendor data is telling you. Relate benchmark results to design characteristics. Characterize your workloads. (Greg Pfister's In Search of Clusters and Neil Gunther's Guerilla Capacity Planning suggest taxonomies for doing so.) Understand how fundamental design attributes are featured or masked by benchmark loads. Understand that ultimately standard benchmarks are “made up” loads that scale well. Learn to derate claims appropriately, by knowing your own situation. (Neil Gunther's Guerilla Capacity Planning suggests a method for doing so)]]]

Jeff(Sun): This is not the "making stuff up" that I was referring to. I was referring to misuse of benchmarks in the z10 announcement, which IBM was required to redact from the announcement web page and the blogs that linked to it. I'm not arguing against synthetic benchmarks that honestly try to mimic reality, I'm arguing against attempts to game the system that I discussed in my "Ten Percent Solution" blog entry. I have explained the comparison made for the z10 announcement above. Jeff objects to the utilization comparison, which is legitimate. In fact, when servers are running at low utilization most of them are doing nothing most of the time. That is the central argument for virtualization, which is generally accepted in the industry. I am also pointing out that industry standard benchmarks are not created in a purely noble attempt to uncover the truth about capacity. In fact they are generally defined in a way that supports the distributed processing, scale-out, client-server camp of solution design, which is why they scale so well. Think about it. On the industry standard committees each vendor has a vote. System z represents 1/4 of IBM's vote. Do you think there will ever be an industry standard benchmark that represents loads that do well on its machine structure? The benchmarks and their machines have evolved together. They can represent loads from single application codes that are cluster or NUMA conscious. What happens to all of those optimizations when workloads are stacked and the data doesn't remain in cache or must migrate from cache to cache? The point is that the relevance and validity of either side of this argument is highly workload dependent. The local situation will govern most cases. Neither an industry benchmark result nor a single consolidation scenario is more valid than the other.

Joe(IBM): [[[quoting Jeff: "Be suspicious!"Be aware of your own biases. Most marketing hype is preaching to the choir. Do not trust “near linear scaling” claims. Measure your situation. Don’t accept the assertion that the lowest hardware price leads to the lowest cost solution. Pay attention to your costs, and don’t mask business priorities with flat service levels. Be aware of your chargeback policies and their effects. Work to adjust when those effects distort true value and costs."]]]

Jeff(Sun): With this I cannot disagree. That's exactly what I have been discussing in my blog entries: unsubstantiated claims of "near linear scaling" to permit 1,500 servers to be consolidated onto a single z (well, the trick here is to stipulate that 1,250 of the 1,500 do no work!) or to ignore service levels (see my "Don't keep your users hostage" entry). By definition, servers running at low utilization are doing nothing most of the time. Actually, virtualization of servers on shared hardware can improve service levels by improving latency of interconnects. I'll also add "beware of the 'sunk cost fallacy'": you shouldn't throw more money into using a too-expensive product that has excess capacity because you've already sunk costs there. Actually, adding workload to an existing large server can be the most efficient thing to do in terms of power, cooling, floorspace, people, deployment, and time to market, even if the price of the processor hardware is higher. These efficiencies and the need for them are locally driven. In general there may or may not be a "sunk cost fallacy". In fact you should also be aware of the "hardware price bargain fallacy". Finally, Sun itself recognized System z and z/VM as "the premier virtualization platform" when Sun and IBM jointly announced support of OpenSolaris on IBM hardware.

(end of quoted material)