Lies, Damn Lies, and Statistics
By user701213 on Feb 12, 2008
There is an aphorism famously attributed to Mark Twain (among others) to the effect that there are "lies, damn lies and statistics." The Mark Twain quotes on truth I was able to verify were almost as interesting though not quite so pithy:
A lie can travel halfway around the world while the truth is putting on its shoes. (Remember, this was before the Internet).
I've had several reminders recently that what we think we know and take for granted is often not only wrong, but quite wrong. The ease with which we can search for things on the Internet leads us to forget that what we are finding is information, but not always expertise and almost never actual wisdom. To the extent we rely on "received wisdom" it is a good opportunity to remind ourselves that information and knowledge are two different and often diametrically opposed beasties, indeed.
For example, someone recently sent a resume of a former colleague. I use "resume" loosely, as the description of work experience (the portion I have direct knowledge of, which is the only section to which I address my comments) is better described as "fiction." Perhaps, "fiction based on actual events," if I am feeling generous, except that I am not. This was by far the worst example of resume embellishment I've seen in 20-some years.
In the interests of protecting the guilty, I will call the individual involved Fictional Resume Writer (FRW). FRW's sins were 1) claiming credit for work FRW never did, 2) claiming origination of work done by others - which I find especially reprehensible - and 3) gross exaggeration of accomplishments. I emailed FRW and said that I thought a resume rewrite was in order, especially given that FRW was seeking business with Oracle. Business is personal, I said, and I'd be unlikely to trust or want to work with someone who is materially misleading about credentials. I also went point by point through the "issues" in the resume, just to be clear about what I thought was inaccurate and why.
The response I got was the email equivalent of a shoulder shrug and a comment that the amount of hard work FRW expended "justified" claiming credit. (Is this the new world of Web 2.0, where "mashup" owners claim origination based on the "hard work" involved in taking others' work and creating something different from it?)
Perhaps I am old-fashioned, but there is a clear difference between a good idea, initiating that good idea, and carrying through on a good idea to effect positive and measurable change. And common sense if not a sense of honor should dictate how one expresses the difference among them.
For example, once upon a time, I got tired of explaining to developers for the umpteenth time what a buffer overflow was, so I wrote up a few pages - perhaps two or three - on what constituted a buffer overflow and how to avoid them. Though I did not know it at the time, this was the genesis of the Oracle Secure Coding Standards. I note at the outset, for reasons that will become all too obvious if you keep reading, that I do not claim "authorship" of these standards.
My prototype document grew over time, substantially. Someone else expanded the list to be a "top ten" list of secure coding dos and don'ts. "Top ten" then grew to be an extensive list of security vulnerabilities and how to avoid them. There are also examples of what happens if you don't write correct, secure code (drawn from actual coding errors development teams have made). All in, the document has grown to about 300 pages, to include "case law" (not just what not to do, but how to address specific technical issues the correct way). One individual (Oracle's Chief Hacking Officer) has written the bulk of the secure coding standards with input and review from others and he is clearly the author and redactor of this document. (Mahalo nui loa, Howard.)
There have been other enhancements and uses of the secure coding standards. Someone got the bright idea of tying the secure coding standards directly to our product release security checklists. A couple of people developed the secure coding class (online web-based training based on the Oracle Secure Coding Standards), while still others have watched over the rollout of this class to the development groups that need to take it (to include restructurings, new hires and acquisitions).
In theory, were I to write my resume the way FRW has, I would claim to be "originator," "author" or "founder" of the secure coding standards, since I wrote the first two - count them - two glorious pages. But what I wrote does not have the breadth, depth, examples, actual technical know-how, proactive guidance, and utility of what now exists. My claim to "authorship" - if I were vain enough to make it - is like the person who puts the front page and inside page (the one with the ISBN number) together for a book claiming to be the "author." It's simply ridiculous, and I'd deserve to get whacked with all 300 pages, hard bound, if I made such a statement.
There are security lessons here. The first is the age-old one of "trust, but verify." It is not my job - and I would not do it - to tell FRW's current employer that FRW's resume in some particulars is much closer to fiction than fact. "Caveat emptor" - let the buyer beware. If you are hiring someone on the basis of credentials, it's well worth checking them.
The second security lesson is also an old one. Business is still personal, and personal attributes matter, like honor and trust. Contracts, for example, cannot create trust where there is none; they just specify requirements for performance and remedies for non-performance. A person who is untrustworthy in small things is likely to be untrustworthy in large things, and if there is anything more untrustworthy than taking credit for others' work, I do not know what it is.
The second reminder of the difference between what we think we know and the truth was occasioned by a recent op-ed piece in the Wall Street Journal called "The Lies of Tet" by historian Arthur Herman (I can personally recommend his book To Rule the Waves - How the British Navy Shaped the Modern World).
For many years, I've tried a little "knowledge experiment," by asking random people if they had heard of the Tet Offensive and, if so, who they thought "won." The response (if I exclude people who have served in the armed forces who know the truth) is astonishing. Most people, when asked, believe that the Tet Offensive was a resounding defeat for the forces of the United States and the Republic of South Vietnam. In particular, those who were alive at the time and recall the media coverage are shocked to find out that what they think they know is all wrong. One hundred percent wrong, in fact.
As Arthur Herman says:
"The Tet Offensive was Hanoi's desperate throw of the dice to seize South Vietnam's northern provinces using conventional armies, while simultaneously triggering a popular uprising in support of the Viet Cong. Both failed. Americans and South Vietnamese soon put down the attacks, which began under cover of a cease-fire to celebrate the Tet lunar new year. By March 2, when U.S. Marines crushed the last North Vietnamese pockets of resistance in the northern city of Hue, the VC had lost 80,000-100,000 killed or wounded without capturing a single province. Tet was a particularly crushing defeat for the VC (emphasis mine). It had not only failed to trigger any uprising but also cost them "our best people," as former Viet Cong doctor Duong Quynh Hoa later admitted to reporter Stanley Karnow. Yet the very fact of the U.S. military victory -- "The North Vietnamese," noted National Security official William Bundy at the time, "fought to the last Viet Cong" -- was spun otherwise by most of the U.S. press." ("The Lies of Tet," Wall Street Journal, February 6, 2008)
There are "truths" that are so embedded in the fabric of what we think we know that we don't even bother reading broadly, from a breadth of sources (and reputable sources) to reach our own conclusions about what is true vs. what is received wisdom. We simply must do so on issues that matter to us, instead of "outsourcing" wisdom to pundits. Otherwise, "collective" wisdom substitutes for actual facts and analysis. Of all the maxims wandering loose about the Internet (and on it), the one I find the most obnoxiously untrue is "the wisdom of the crowds." Sometimes, the crowds are dead wrong, because they've been massively misinformed. As with Tet.
It is an inescapable truth that the media got Tet wrong, spectacularly wrong, and "the lies of Tet," to use Arthur Herman's phrase, continue to shape people's opinions of not only Vietnam, but warfare in general and the veracity of the armed forces decades after the actual events.
As much as I have expressed concerns about every idiot with an opinion being able to express it on the Internet (as I am doing here!), the fact remains that in some cases, bloggers have spotted untruths, exaggerations and fabrications reported by the media (doctored pictures and doctored service records, to think of a couple of prominent examples). There is an important utility in keeping professional journalists and industry analysts honest and objective that is worth something to the millions of people who expect that from mainstream media. Score one for the blogosphere.
The corollary, and cautionary note to the blogosphere, is the realization that not all truths are apparent in nanoseconds. Technologists are used to rapidity of change, and the barrage of information and the rapidity of change often consume us as we try to keep up with the latest technology trend. Often, however, it is only with the passage of time, careful analysis, and hindsight, that we can correctly weigh events. There is a reason for the phrase rendered "timeless truths" instead of "nanosecond truths."
I was on vacation recently at a venue that couldn't be more removed from Silicon Valley: Colonial Williamsburg, Virginia, at the Williamsburg Antiques Forum. Looking at decorative objects that are between 300 and 400 years old and determining what they say to us now about the time at which they were made and the people who owned them could not be more different from what I do for a living. Yet even in the world of decorative arts, curators continue to uncover new facts that may lead them to reinterpret history. In short, even with a 350-year-old highboy, there is still much to learn, to the point that one's view of history may change.
The security issue in the above is still "trust, but verify," and I would add "from multiple sources, not merely one." Be especially wary of "received wisdom" on things that matter, and be willing to do your own research and develop your own expertise. For anything I read about military history - and history, in large part, is military history - I use at least two sources if the subject is important to me, and occasionally more.
Thus far, I've talked about lies (FRW), damn lies (the media about the Tet offensive) but not about statistics. The statistics part comes with a presentation I have been doing recently (three times in Eastern Europe a couple of weeks ago) about security metrics.
I'm going to skip over a lot of what I talked about (I have already opined in a previous blog entry why "number of published vulnerabilities" is a metric very easy to "game" to achieve unintended results), to focus on a truth I stumbled upon by sheer accident. I suspect that metrics kahunas have known what I found for a long time, so I don't claim novelty, just a "eureka!" moment.
I talked in my presentation about what constitutes a good metric (objective, measurable, helps you answer basic questions like "are we doing better or worse," incents the right behavior, fosters additional questions, helps you identify positive or negative trends, and so on). As an example, I used the various metrics we keep pertaining to the release of CPUs, which I wanted to discuss as a group because there is no single metric that you could use to answer "goodness questions" about how we are doing. In fact, picking a single metric and focusing on it to the exclusion of all others would lead to incorrect incentives.
For example, one of the metrics we keep is "number and percentage of patches that were published on the announced CPU date." That's a good one, except that you do not want people only hitting the date and ignoring quality. So, "number and percentage of patch reloads" is another one we keep, because while we want CPUs to come out on the target date, we do not want to have to reload patches because the quality was poor. Both quality and timeliness are important; hence, two metrics. We are also concerned that the issues we identify as worthy of going into a Critical Patch Update make it through the process (sometimes, issues drop out for regressions). Ideally, you'd want all critical issues you identify to actually make it into the target CPU (because there are no regressions). So, we look at number of issues that drop out through the CPU process because we are trying to make that number as low (over time) as is feasible. I walked through all of the aforementioned metrics (and a few related to CPUs I did not discuss here) and slapped a heading on the slide: "combined metric."
My eureka moment was noting that, if security metrics are useful, and they are, the idea of a combined metric is even more useful. The goal of a metric is to be able to manage better, and just as in the pre-GPS days of navigation you needed to take multiple "fixes" to triangulate your position, you are often better served by triangulating how you are doing by measuring and weighing several different metrics. Rarely can you manage well by measuring just one thing.
The real goal of any metric, or "statistic," to round out my theme, is to manage better. Metrics can help you allocate resources to effect the most good for the most people and to spot trends (both positive and negative) quickly. Ultimately, a good metric needs to help you answer the question, "Are we doing better or worse?" You can do a lot with metrics, and some people lie with them, but above all, you have to be honest with yourself.
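For the programmers in the audience, here is a rough sketch of what I mean by looking at the CPU metrics as a group. To be clear, this is not how we actually compute anything: the class, the field names and the numbers are all made up for illustration, and "better" is defined in the simplest way I could think of (no metric got worse and at least one improved).

```python
from dataclasses import dataclass


@dataclass
class CpuRelease:
    """Hypothetical per-release counts; names and values are illustrative only."""
    name: str
    patches_planned: int
    patches_on_time: int
    patches_reloaded: int
    issues_targeted: int
    issues_dropped: int


def combined_metric(release: CpuRelease) -> dict:
    """Report the three metrics side by side instead of collapsing them into one."""
    return {
        "on_time_pct": 100.0 * release.patches_on_time / release.patches_planned,
        "reload_pct": 100.0 * release.patches_reloaded / release.patches_planned,
        "dropout_pct": 100.0 * release.issues_dropped / release.issues_targeted,
    }


def doing_better(previous: CpuRelease, current: CpuRelease) -> bool:
    """True only if no metric got worse and at least one improved (an assumed rule)."""
    before, after = combined_metric(previous), combined_metric(current)
    no_worse = (
        after["on_time_pct"] >= before["on_time_pct"]
        and after["reload_pct"] <= before["reload_pct"]
        and after["dropout_pct"] <= before["dropout_pct"]
    )
    return no_worse and after != before


# Made-up data: the second release hits the date more often but reloads more patches.
first = CpuRelease("first", patches_planned=100, patches_on_time=90,
                   patches_reloaded=10, issues_targeted=50, issues_dropped=5)
second = CpuRelease("second", patches_planned=100, patches_on_time=100,
                    patches_reloaded=25, issues_targeted=50, issues_dropped=5)

print(combined_metric(first))
print(combined_metric(second))
print(doing_better(first, second))
```

Judged on timeliness alone, the second made-up release looks like an improvement; taken together with the reload percentage, it does not. That is the "triangulation" point in miniature.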
As Shakespeare put it so well:
This above all: to thine own self be true,
And it must follow, as the night the day,
Thou canst not then be false to any man.
For More Information:
Book of the week - War Made New: Technology, Warfare and the Course of History: 1500 to Today by Max Boot
A really great read about how technological changes influence warfare. If you have no idea how IT-centric warfare now is (in terms of command and control), the last 100 pages are really insightful.
One of the very best security metrics kahunas I know is Dan Geer. Anyone interested in this topic should Start With Dan:
More on books by Arthur Herman:
An article by Arthur Herman on the Vietnam War:
One of the best books on Vietnam is still Lewis Sorley's A Better War: