Corporate Security Blog

Common Criteria and the Future of Security Evaluations

Mary Ann Davidson
Chief Security Officer, Oracle

For years, I (and many others) have recommended that customers demand more of their information technology suppliers in terms of security assurance – that is, proof that security is “built in” and not “bolted on,” that security is “part of” the product or service developed and can be assessed in a meaningful way. While many customers are focused on one kind of assurance – the degree to which a product is free from security vulnerabilities – it is extremely important to know the degree to which a product was designed to meet specific security threats (and how well it does that). These are two distinct but complementary approaches to security, and both should increasingly be of value to all customers. The good news is that many IT customers – whether of on-premises products or cloud services – are asking for more “proof of assurance,” and many vendors are paying more attention. Great! At the same time, sadly, a core international standard for assurance, the Common Criteria (CC) (ISO 15408), is at risk.

The Common Criteria allows you to evaluate your IT products via an independent lab (certified by the national “scheme” in which the lab is domiciled). Seven levels of assurance are defined – generally, the higher the evaluation assurance level (EAL), the more “proof” you have to provide that your product 1) addresses specific (named) security threats 2) via specific (named) technical remedies to those threats. Over the past few years, CC experts have packaged technology-specific security threats, objectives, functions and assurance requirements into “Protection Profiles” that have a pre-defined assurance level. The best part of the CC is the CC Recognition Arrangement (CCRA), the benefit of which is that a CC security evaluation done in one country (subject to some limits) is recognized in multiple other countries (27, at present). The benefit to customers is that they can have a baseline level of confidence in a product they buy because an independent entity has looked at/validated a set of security claims about that product.

Unfortunately, the CC is in danger of losing this key benefit of mutual recognition. The main tension is between countries that want fast, cookie cutter, “one assurance size fits all” evaluations, and those that want (for at least some classes of products) higher levels of assurance. These tensions threaten to shatter the CCRA, with the risk of an “every country for itself,” “every market sector for itself” or worse, “every customer for itself” attempt to impose inconsistent assurance requirements on vendors that sell products and services in the global marketplace. Customers will not be well-served if there is no standardized and widely-recognized starting point for a conversation about product assurance.

The uncertainty about the future of the CC creates opportunity for new, potentially expensive and unproven assurance validation approaches. Every Tom, Dick, and Harriet is jumping on the assurance bandwagon, whether it is developing a new assurance methodology (that the promoters hope will be adopted as a standard, although it’s hardly a standard if one company “owns” the methodology), or lobbying for the use of one proprietary scanning tool or another (noting that none of the tools that analyze code are themselves certified for accuracy and cost-efficiency, nor are the operators of these tools). Nature abhors a vacuum: if the CCRA fractures, there are multiple entities ready to promote their assurance solutions – which may or may not work. (Note: I freely admit that a current weakness of the CC is that, while vulnerability analysis is part of a CC evaluation, it’s not all that one would want. A needed improvement would be a mechanism that ensures that vendors use a combination of tools to more comprehensively attempt to find security vulnerabilities that can weaken security mechanisms and have a risk-based program for triaging and fixing them. Validating that vendors are doing their own tire-kicking – and fixing holes in the tires before the cars leave the factory – would be a positive change.)

Why does this threat of CC balkanization matter? First of all, testing the exact same product or service 27 times won’t in all likelihood lead to a 27-fold security improvement, especially when the cost of the testing is borne by the same entity over and over (the vendor). Worse, since the resources (time, money, and people) that would be used to improve actual security are assigned to jumping through the same hoop 27 times, we may paradoxically end up with worse security. We may also end up with worse security to the extent that there will be less incentive for the labs that do CC evaluations to pursue excellence and cost efficiency in testing if they have less competition (for example, from labs in other countries, as is the case under the CCRA) and they are handed a captive marketplace via country-specific evaluation schemes.

Second, whatever the shortcomings of the CC, it is a strong, broadly-adopted foundation for security that to date has the support of multiple stakeholders. While it may be improved upon, it is nonetheless better to do one thing in one market that benefits, and is accepted in, 26 other markets than to do 27 or more expensive testing iterations that will not lead to a 27-fold improvement in security. This is especially true in categories of products that some national schemes have deemed “too complex to evaluate meaningfully.” The alternative clearly isn't per-country testing or per-customer testing, because it is in nobody's interests and not feasible for vendors to do repeated one-off assurance fire-drills for multiple system integrators. Even if the CC is “not sufficient” for all types of testing for all products, it is still a reputable and strong baseline to build upon.

Demand for Higher Assurance

In part, the continuing demand for higher assurance CC evaluations is due to the nature of some of the products: smart cards, for example, are often used for payment systems, where there is a well understood need for “higher proof of security-worthiness.” Also, smart cards generally have a smaller code footprint and fewer, well-defined interfaces, and thus lend themselves fairly well to more in-depth, higher assurance validation. Indeed, the smart card industry – in a foreshadowing and/or inspiration of CC community Protection Profiles (cPPs) – was an early adopter of devising common security requirements and “proof of security claims,” doubtless understanding that all smart card manufacturers – and the financial institutions who are heavy issuers of them – have a vested interest in “shared trustworthiness.” This is a great example of understanding that, to quote Ben Franklin, “We must all hang together or assuredly we shall all hang separately.”

The demand for higher assurance evaluations continues in part because the CC has been so successful. Customers worldwide became accustomed to “EAL4” as the gold standard for most commercial software. “EAL-none” – the direction of new-style community Protection Profiles (cPPs) – hasn’t captured the imagination of the global marketplace for evaluated software in part because the promoters of “no-EAL is the new EAL4” have not made the necessary business case for why “new is better than old.” An honorable, realistic assessment of “new-style” cPPs would explain both the benefits and the downsides of the new approach as part of making that case. Consumers do not necessarily upgrade their TV just because they are told “new is better than old”; they upgrade because they can see a larger screen, clearer picture, and better value for money.

Product Complexity and Evaluations

To the extent security evaluation methodology can be more precise and repeatable, that facilitates more consistent evaluations across the board at a lower evaluation cost. However, there is a big difference between products that were designed to do a small set of core functions, using standard protocols, and products that have a broader swathe of functionality and have far more flexibility as to how that functionality is implemented. This means that it will be impossible to standardize testing across products in some product evaluation categories.

For example, routers use standard Internet protocols (or well-known proprietary protocols) and are relatively well defined in terms of what they do. Therefore, it is far easier to test their security using standardized tests as part of a CC evaluation to, for example, determine attack resistance, correctness of protocol implementation, and so forth. The Network Device Protection Profile (NDPP) is the perfect template for this type of evaluation.

Relational databases, on the other hand, use structured query language (SQL) but that does not mean all SQL syntax in all commercial databases is identical, or that protocols used to connect to the database are all identical, or that common functionality is completely comparable among databases. For example, Oracle was the first relational database to implement commercial row level access control: specifically, by attaching a security policy to a table that causes a rewrite of SQL to enforce additional security constraints. Since Oracle developed (and patented) row level access control, other vendors have implemented similar (but not identical) functionality.

As a result, no set of standard tests can adequately test each vendor’s row level security implementation, any more than you can use the same key on locks made by different manufacturers. Prescriptive (monolithic) testing can work for verifying protocol implementations; it will not work in cases where features are implemented differently. Even worse, prescriptive testing may have the effect of “design by test harness.”
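To make the idea of a table-attached policy that rewrites SQL more concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is purely illustrative and is not how Oracle or any other vendor implements row level access control: the names attach_policy, region_policy and secure_select, the orders table and the per-region rule are all assumptions invented for this example.

```python
import sqlite3

# Hypothetical policy registry: table name -> function returning (predicate, params).
POLICIES = {}

def attach_policy(table, policy_fn):
    """Attach a security policy function to a table (illustrative only)."""
    POLICIES[table] = policy_fn

def region_policy(user):
    # Hypothetical rule: a user may only see rows for their own region.
    return "region = ?", (user["region"],)

def secure_select(conn, table, user, columns="*"):
    """Build a SELECT on `table` and append the attached policy's predicate.

    `table` and `columns` are assumed to be trusted identifiers here;
    only the policy's values are passed as bound parameters.
    """
    sql = f"SELECT {columns} FROM {table}"
    params = ()
    policy = POLICIES.get(table)
    if policy is not None:
        predicate, params = policy(user)
        sql += f" WHERE {predicate}"  # the "rewrite" step
    return conn.execute(sql, params).fetchall()

def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "EMEA", 100), (2, "APAC", 250), (3, "EMEA", 75)],
    )
    attach_policy("orders", region_policy)
    alice = {"name": "alice", "region": "EMEA"}
    return secure_select(conn, "orders", alice, "id, amount")
```

In this sketch the caller asks for all orders, but the policy silently narrows the query to the caller's region. A test written against this particular policy interface would say little about a product that enforces the same constraint through a different mechanism, which is the point about prescriptive testing made above.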

Some national CC schemes have expressed concerns that an evaluation of some classes of products (like databases) will not be “meaningful” because of the size and complexity of these products [1], or that these products do not lend themselves to repeatable, cross-product (prescriptive) testing. This is true, to a point: it is much easier to do a building inspection of a 1000-square foot or 100-square meter bungalow than of Buckingham Palace. However, given that some of these large, complex products are the core underpinning of many critical systems, does it make sense to ignore them because it’s not “rapid, repeatable and objective” to evaluate even a core part of their functionality? These classes of products are heavily used in the core market sectors the national schemes serve: all the more reason the schemes should not preclude evaluation of them.

Worse, given that customers subject to these CC schemes still want evaluated products, a lack of mutual recognition of these evaluations (thus breaking the CCRA) or negation of the ability to evaluate merely drives costs up. Demand for inefficient and ineffective ad hoc security assurances continues to increase and will explode if vendors are precluded from evaluating entire classes of products that are widely-used and highly security relevant. No national scheme, despite good intentions, can successfully control its national marketplace, or the global marketplace for information technology.


One of the downsides of rapid, basic, vanilla evaluations is that they stifle the uptake of innovative security features in a customer base that has a lot to protect. Most security-aware customers (like defense and intelligence customers) want new and innovative approaches to security to support their mission. They also want those innovations vetted properly (via a CC evaluation).

Typically, a community Protection Profile (cPP) defines the minimum set of security functions that a product in category X provides. Add-ons can in theory be done via an extended package (EP) – if the community agrees to it and the schemes allow it. The vendor and customer community should encourage the ability to evaluate innovative solutions through an EP, as long as the EP does not specify a particular approach to a threat to the exclusion of other ways to address the threat. This would continue to advance the state of the security art in particular product categories without waiting until absolutely everyone has Security Feature Y. It’s almost always a good thing to build a better mousetrap: there are always more mice to fend off. Rapid adoption of EPs would enable security-aware customers, many of whom are required to use evaluated products, to adopt new features readily, without waiting for:

a) every vendor to have a solution addressing that problem (especially since some vendors may never develop similar functionality),

b) the cPP to have been modified, and

c) all vendors to have evaluated against the new cPP (that includes the new security feature).

Given the increasing focus of governments on improvements to security (in some cases by legislation), national schemes should be the first in line to support “faster innovation/faster evaluation,” to support the customer base they are purportedly serving. Last but really first, in the absence of the ability to rapidly evaluate new, innovative security features, customers who would most benefit from using those features may be unable or unwilling to use them, or may only use them at the expense of “one-off” assurance validation. Is it really in anyone’s interest to ask vendors to do repeated one-off assurance fire-drills for multiple system integrators?


The Common Criteria – and in particular, the Common Criteria recognition – form a valuable, proven foundation for assurance in a digital world that is increasingly in need of it. That strong foundation can nonetheless be strengthened by:

1) recognizing and supporting the legitimate need for higher assurance evaluations in some classes of product

2) enabling faster innovation in security and the ability to evaluate it

3) continuing to evaluate core products that have historically had and continue to have broad usage and market demand (e.g., databases and operating systems)

4) embracing, where apropos, repeatable testing and validation, while recognizing the limitations thereof that apply in some cases to entire classes of products and ensuring that such testing is not unnecessarily prescriptive.

[1] https://www.niap-ccevs.org/Documents_and_Guidance/ccevs/DBMS%20Position%20Statement.pdf
