Some ruminations about software application licensing

The  Problem

Many important software packages (e.g. Oracle) are licensed on a per processor basis. That is, if you purchase a license for a 72 processor machine, the price is on the near order of 72times more expensive than a single processor license. Is this sensible? What are the unintended consequences for Society at large?


Consider a timesharing service a veryprevious company of mine employed from time to time, BCS (BoeingComputing Services).

BCS  had a large ensemble (more than 100) CDC mainframes. They were binary compatible, and setup to have automatic failover, so that a job that started on one system could end up on another (or having run on a series of different systems).This was necessary to provide adequate reliability.

Most software was metered, that is, one was billed as the sum of resources consumed, such as

  • Nc Dollars per CPU minute 

  • Nd Dollars per amount of diskspace used

  • Ni Dollars per I/O transaction

etc. a complicating factor, however, was that while all the systems were binary compatible, they ran at different speeds. It wasthe clear goal of the software providers (most of the computer vendorthemselves! Or the timesharing system operator itself) that the price for running a job should either be independent of the speed of thesystem, or (more frequently) carry a premium based on the faster CPU.

Since jobs frequently wound up running on more than one system,the billing algorithm was adjusted so that one was charged as if oneran on the original selected system (so if the job “failed over” to a faster system, the rate was adjusted so that the final charge was the same as if it had run on the original system).

When the industry moved away from Timesharing, and thus charge per unit time and towards a software purchase model, there remained avague notion on the part of software vendors (now, more frequently someone other than the computer vendor itself) that if the customerhas 10 slow machines, or machine that is 10x faster, the payment due to the software vendor should remain the same.

As different vendors processors are more (or less) capable, there are software vendors who establish a base price for different vendorsthat differ. This may or may not be acceptable (legally or economically) for some.

Even where it is Legal and Accepted, when a vendor has multiple microarchitectures there may be no single processor whose performance can act as a reliable base.

As a result, many software vendors have simply relied on "processor count", it being a crude metric for the power of a system.

There are, of course, other ways to price software licenses(including per user, per system, per site, and per actual user, andper employee arrangements). However, we'll focus on the lamentablycommon practice of pricing per “CPU”...


Unfortunately, there is no objective definition of a processor.For example, a VLIW machine (like the extinct Multiflow 28) has avery large number of functional units --- more than a quad coreSPARC. The result may well be faster performance for the “singleprocessor”, but  the  licensing fees for the multi-coreprocessor are 4x more expensive!

Current trends in computer design make such issues increasinglyproblematic.

One could argue, as IBM does, that eachidentifiable “processor element” (what sun calls a“core”) is an objectively identifiable“processor”.

But that a “core” exists asan identifiable physical entity is merely a side effect of current design tools and methods (viz. define a single core, step andreplicate). With more advanced CADtools, all of the logical corescould be instantiated and then baked into one huge monolithic mass (which might well have technical benefits beyond that of confusing licensing schemes).

An Aside: Software engineers may recognize this as essentially what a “globally optimizing”compiler does (full interprocedural analysis, etc.) vs. separate compilation.

An Aside: Yes, to all the CAD developers and hardware engineers reading this, I appreciate the manifold reasons why we don't do this (today), and why it's hard (it's possible we might never do it)

Another  complication comes about from the software (or firmware) concept of “virtualization”. Schemes such as those touted by IBM and Microsoft (one physical processor can appear to be any number of processors) provide another confusing view of the actual system from the perspective of software licensing (not to mention how exciting it can be to maintain 20x the number of OS configurations on a single box...).

Yet another complication is provided bythe concept of hardware threads (such as are found in chips ranging from Intel's Xeon to IBM's Power5 and various Sun chips). These threads are typically exposed to the programmer as “virtual CPUs” that is, if the program inquires from the system how many processors there are, a single Xeon chip currently reports 2. The performance of most such hardware threading schemes has been poor (that is, the second thread adds 10%-30%performance, and therefore has been ignored by ISV software licensing..but as hardware threading matures, the performance may well approach N where N is the number of hardware threads).

In the event that the problem is not clear, let us consider the  case of  a chip such as described by:Ace's Hardware (this is not to say that I am confirming any or all bits of speculation and assertions made by that author). But to sum up, they claim it can be described as a single chip, with 8 SPARC cores, each of which has 4 hardware threads. Let us assume for this discussion that they are correct, then....

Is this a single processor with 32 threads? (If so, the license fees for Oracle, would be approximately $15K, using my understanding of  the current rules which ignore threads) Or is it “8 processors”? (If so, the price might be more like $320,000 because larger processor counts start with a higher base cost), or would it be even higher due to the large number of hardware threads?

Getting very speculative, what if some  key hardware resources  were not replicated all N  times? Indeed, what if there was a key pipeline resource critical to Oracle performance  shared amongst all N cores? Does this change the picture? If not, isn't that a truly inequitable licensing algorithm?

It should be clear to the reader that the current situation, where software is licensed by number of“processors” is hardly architecture neutral and has no objectively measurable basis.

In marketplaces where the software vendor has a near monopolistic position, having no objective basis for pricing across platforms may represent a litigation risk (I Am Not a Lawyer, this is my opinion and not that of any member of a Bar Association).

So what can we do instead?

A Solution (starting from the basics)

Every modern computing device is composed of a collection of many chips each of which has a number of transistors. Each transistor has performance characteristics, such as switching speed.

In an Ideal implementation, all hardware vendors would disclosethe number of total transistors in the system, and a breakdown by speeds (most frequently, all the transistors have similar characteristics in a given chip, but in a large system, different chips may have radically different characteristics. Also, in some cases, someparts of a chip may have very different characteristics (e.g.Different clock rates)).

In an Ideal implementation a software vendor would compute a billing factor (BF) for a given system model. BF would be defined as the sum of  all Ti\*Ni fori from 1 to the number of transistor types, and where each T is acost per transistor type and each N is the number of each transistor type. The total price can then be computed  as the product ofBase_Price\*BF. Base_Price will be the same across all platforms (it may be adjusted on a per customer basis based on volumeor other applicable discounts). The BF provides for an architecture and CAD tool neutral platform adjustment.

As most large computer systems are composed of replicated elements, so it may be the case that the most common sub-block may be used to compute a BF and the system BF can be reasonably approximated by Nsubblock\*BF.  This technique will be most useful in “Capacity on Demand” systems where entire sections of the machine are only enabled at some later date.At the time additional subblocks are enabled, the incremental software license charges are trivial to compute.

In the event that hardware vendors fail to publish the precise counts and transistor speeds, the number of transistors can be approximated based on the size of the chip and the particular geometry (130nm, 90nm, etc.). An “average value”may be employed where various transistor speeds are unavailable.


The primary advantage of this system is that it is architecture(both micro and macro) independent, and it is independent of the State of the Art in CAD tools. Having an objective system would be anice thing to have.


Is this a unique optimal solution? Perhaps; but if one is environmentally focused, there's another approach that may have some appeal...


In the Ideal implementation, the end user's computing system would be augmented by a machine readable watt-hour meter, and it's operating system would have fine grained accounting facilities.

At the start of the licensed applications execution,the current value of the watt-hour meter would be recorded. At theend of the licensed applications execution,current value of the watt-hour meter would be obtained, and subtracted from the initial value. This represents the entire power consumed by the system during the licensed applications execution.

Since most computing systems execute more than one program at atime, the Operating Systems accounting facilities will be employed to determine the fraction of the machine's resources that were consumed during the execution of the licensed application. The cost, per execution will then be computed as a Base_Factor\*watt_hours\*Usage_factor (where UF is typically less than 1. It can only be larger than 1 when there issome sort of “Capacity on Demand” functionality deployed).

This would represent a return to “price per usage” asin the days of mainframe computing.

<Un>intended Consequences

The current "processor count" metric encourages users to buy machines with the fastest single thread performance (admittedly, this isn't the only encouragement ;>). The most reliable way to deliever that, generation after generation, has been to "chase" very high clock rates. Unfortunately this is exceedingly energy inefficient (as well as driving up Fab costs rapidly). While California's energy problems certainly weren't created by simply having too many fast clocked computers, it provides a graphic lesson in the downside to power hungry approaches (as well as the laws of economnics and the consequences of incompetent government).

The transistor count approach would reward designers (and consumers) who got the most performance per transistor; while this is appealing from an engineering/logical sense, it's hard to see how that provides a useful benefit to society per se.

The price per watt approach has the intended  consequence of rewarding consumers and designers for providing better performance for lower engergy consumption.

[and before anyone asks, yes, I think it would have been more sensible than setting fleet mpg goals to have made the taxes on gasoline vehicle dependent, and had a factor tied to fuel economy. More efficient cars should be charged less, and less efficient ones should be charged more. Show cars and other historical vehicles could remain unmodified ... but would pay ruinious rates if run as daily commuters ;> ]     

If you like these ideas, please pass them on. Write about them. Lobby for them. Implement them in your products.

Post a Comment:
Comments are closed for this entry.



« April 2014