Measuring User Productivity
By user720949 on Jun 13, 2008
Authors: Jeff Sauro, Principal Usability Engineer - Oracle Applications User Experience and Joe Dumas, Senior Usability Consultant - Oracle Applications User Experience
(Editor's note: Jeff Sauro and Joe Dumas, industry experts in measuring usability, usually publish their findings in journals of human-computer interaction. I invited them to share the results of their latest study with customers on this blog.)
Does a new user-interface design make its users more productive? That question is often asked in the software industry but seldom answered satisfactorily. The reason is the difficulty of finding metrics that measure user productivity. Enhancing the productivity of enterprise software means helping its users get more work done per unit of time. Productivity is only one component of a product's ease of use, but it is an important one, and it has to be measured with experienced users.
We know how to measure the usability of products. But one common measurement of usability, the usability test, is heavily weighted toward ease-of-learning, that is, performance and user satisfaction during the first few times a new product is used. After those first few trials, productivity is just beginning. From that point on, it's the hundreds, perhaps thousands, of trials that make users more or less productive. A successful product must have high usability, but it should also enable users to accomplish work faster over the life of the product. Productivity is one component of usability and is best measured after the first exposure to an application.
The figure below shows the mean performance time of a task for the first, second, and third usage of two applications. The initial use would be characterized as measuring ease-of-learning (the user getting acquainted with the interface), whereas each subsequent use becomes a better measure of productivity (ease-of-use). Application B has a steeper learning curve than Application A, but after some exposure, the difference between the products narrows and stabilizes.
Learning curves for two applications: The dots represent the mean time to complete a task in an expense reporting application. The gap in completion time between applications narrows and stabilizes as users become more familiar with the interface.
We are exploring a metric that has the potential to measure the user productivity of Oracle products. It's called Keystroke Level Modeling (KLM) and was created more than 20 years ago by researchers at Carnegie Mellon University and Xerox PARC.
With KLM, each user action is assigned a standard time. For example, clicking on a button with a mouse takes 230 milliseconds, and moving your hand from the mouse to the keyboard takes 360 milliseconds (1000 milliseconds equals 1 second). There are also standard times for mental operations. For example, locating the right icon in a toolbar takes 1350 milliseconds. The times for many activities have been standardized by averaging the times observed across trials with many experienced users. Times for new activities can be standardized in the same way.
With KLM, the estimated time to complete a task can be obtained by determining every step needed to complete it successfully and then adding up the standard times for all of the steps. The total should be an estimate of the time it will take an experienced user to complete the task. But does it? While dozens of published studies have shown that KLM can predict task time to within 10%-30% of actual user time, there are few examples of its use in the fast-paced environment of commercial software development.
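The summation described above can be sketched in a few lines of Python. This is only an illustration: the three operator times are the ones quoted in this post, while the operator names, the `klm_estimate_ms` helper, and the example task are hypothetical and are not taken from our actual analysis.

```python
# Standard times in milliseconds, as quoted in this post.
# The dictionary keys are illustrative names, not official KLM operator codes.
OPERATOR_MS = {
    "click": 230,             # clicking a button with the mouse
    "hand_to_keyboard": 360,  # moving the hand from the mouse to the keyboard
    "locate_icon": 1350,      # mental operation: finding the right icon in a toolbar
}


def klm_estimate_ms(steps):
    """Return the predicted task time in ms by summing the standard time of each step."""
    return sum(OPERATOR_MS[step] for step in steps)


# Hypothetical task: find an icon, click it, then move the hand to the keyboard.
steps = ["locate_icon", "click", "hand_to_keyboard"]
total_ms = klm_estimate_ms(steps)
print(f"{total_ms} ms = {total_ms / 1000:.2f} s")  # prints "1940 ms = 1.94 s"
```

A real analysis would list every step of every task, but the arithmetic is the same: one lookup per step, then a sum.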
To test its efficacy on our own products, we conducted a study at Oracle's Denver Usability Lab. First, we did a KLM analysis of two different expense reporting applications. We selected five common tasks that both applications support, for example, creating a report to claim travel expenses. We found that over the five tasks, the total predicted task time was 4.5 minutes for application A and 6.4 minutes for application B. Consequently, the KLM analysis predicts that application A will allow users to be approximately 29% faster and, therefore, more productive (see below).
Comparison between predicted task time (KLM) and actual task time from two expense reporting applications.
Next, we recruited 26 people who submit expense reports (about half with experience with the products and half without). We trained them on each application and then had them perform the same five tasks for three errorless trials with both applications. The results showed that their average time on the third errorless trial was 4.8 minutes for application A and 7.3 minutes for application B. The users' average time was within 5% and 12% of the KLM time for applications A and B, respectively. The estimate of being 29% more productive with application A was within 5 percentage points of the observed difference. Twenty-five of the 26 users preferred application A and rated themselves as being more productive with it.
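For readers who want to retrace the comparison, here is a rough Python sketch using the rounded times reported in this post. The helper names and the choice of denominators (application B's time for the speed advantage, observed time for the prediction error) are assumptions made for illustration. Because the inputs are rounded to one decimal place, the results can differ slightly from the percentages quoted above.

```python
def percent_faster(time_a, time_b):
    """Share of application B's time that is saved by using application A, in percent."""
    return (time_b - time_a) / time_b * 100


def percent_error(predicted, observed):
    """Absolute gap between predicted and observed time, relative to observed, in percent."""
    return abs(observed - predicted) / observed * 100


# Total time over the five tasks, in minutes, as reported in this post.
predicted = {"A": 4.5, "B": 6.4}  # KLM analysis
observed = {"A": 4.8, "B": 7.3}   # mean of the third errorless trial

predicted_advantage = percent_faster(predicted["A"], predicted["B"])
observed_advantage = percent_faster(observed["A"], observed["B"])

print(f"Predicted advantage of A: {predicted_advantage:.1f}%")
print(f"Observed advantage of A:  {observed_advantage:.1f}%")
print(f"Gap: {abs(observed_advantage - predicted_advantage):.1f} percentage points")

for app in ("A", "B"):
    error = percent_error(predicted[app], observed[app])
    print(f"KLM error for application {app}: {error:.1f}%")
```

With these rounded inputs, the predicted and observed advantages come out near 30% and 34%, a gap of under 5 percentage points, in line with the figures reported above.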
Our study confirms that a KLM analysis of a user interface predicts the actual time of experienced users to within a manageable margin of error. We believe that we can now apply KLM analysis to new products to see whether they are likely to make users more productive than a previous product did. We can build these productivity estimates well before any code has been written, which allows us to test and retest designs as we move toward the most productive interface.
For more information on Keystroke Level Modeling, see the references and detailed examples on Wikipedia.