iPad's Natural User Interface
By mprove on Jul 08, 2010
John Cartan blogged about iPad's Natural User Interface at Work. See my comments inline:
Many people have been surprised by the immediate success of Apple's iPad. It is already beginning to transform the way we read and relax at home - and also the way we work. Why did it succeed where so many earlier tablets failed? And will it (and its inevitable imitators) also transform the enterprise?
The answer to both of these questions, I believe, lies in something called the NUI, the Natural User Interface. We are in the early stages of a paradigm shift that will indeed transform the enterprise. I think I know how it will play out because I've seen it happen once before.
The NUI image is an iPad screenshot of the DICE HD application by Fullpower Technologies, Inc.
In 1984, Apple introduced a new computer, the Macintosh, and a new interaction paradigm developed at Xerox PARC called the Graphical User Interface (GUI).
Ted Nelson labels Xerox PARC's user interface and all its successors as PUI, the PARC User Interface, in order to distinguish between Alto, Star, Mac, Windows, Gnome etc. on the one side and other possible graphical user interfaces on the other. I call it WIMP-Desktop GUI for the very same reason, and I will let you continue to explain WIMP.
Its windows, icons, menus, and pointing device - the "mouse" - were a radical change from the then reigning Command Line Interface (CLI). At first it was dismissed as a fad or a game system with no relevance for the workplace.
Sources indicate that the term WIMP was even coined by UNIX hackers who liked its connotation.
But its influence increased steadily because, for most things, GUIs worked better than CLIs. GUIs required less training and made possible whole new kinds of applications. Within a few years, mice began to appear on every desktop.
For all of its power and ease of use, though, a GUI has limitations. Instead of interacting directly the way we do with people, you still have to formulate commands and interpret responses. The windows desktop is a metaphor, rather than a direct representation.
Yes, indeed it is. It is an illusion of a physical desktop, or even an office, on a two-dimensional computer screen. Some aspects work similarly to the real objects, overlapping windows and overlapping sheets of paper for instance. Others magically do more, like calculations in spreadsheets. There are also several kinds of applications that do not pay any attention to the desktop metaphor. Games and web browsers come to mind. In fact, the paradigm of WIMP Desktop Computing is already being challenged by the web.
And the mouse requires you to move your hand horizontally in order to move a pointer vertically - while also chaining you to your desk.
Isn't it astonishing how well people manage to learn the hand-eye coordination? And there are also substitutes for the mouse, like the trackpad, that add mobility to PCs while still providing a pointing device. From my point of view, the most severe deficiency of the mouse is its limited expressiveness: point (to hover on something), click (to poke something), and click-drag (to move something around). Right-click (to open a context menu) adds many more choices that apply to the context of the clicked object. Last but not least, there is the scroll wheel (to operate vertical window sliders).
To complete the picture, we should not forget the keyboard, an input device that is tightly coupled to human language and opens an infinite space of expressive power. If the computer understands what the user is typing, we can call it a CLI, a command line interface. The shell in Unix is one example. Or, for the younger readers: the Google search field is also a CLI where you enter a bunch of characters and get something back.
So even as the GUI age was dawning, researchers were already working on its successor, a new approach which came to be known as a NUI. So what is a NUI?
A NUI is an interface that lets people use their natural behaviors to interact directly with information. I find that NUIs have four defining characteristics:
- Direct, natural input
- Realistic, real-time output
- Content, not chrome
- Immediate consequences
"Direct, natural input" can include 3D gestures, speech recognition, facial expressions, and anything else that comes naturally, but for now it mostly means multi-touch. Multi-touch goes far beyond mouse clicks. It allows natural, expressive gestures like pinching, stretching, twisting, and flicking. And it's especially well-suited for devices like phones and tablets that you are already holding in your hands.
It remains to be shown that these multi-touch gestures are natural. At the very least, the objects they are applied to are represented by pixels on a screen. So once again, they are metaphoric. Just by luck (or good design) they imply a mapping between manipulating real objects and interacting with high-resolution computer images of certain virtual objects. Twisting works fine, if it works. But I have never zoomed into a real paper photo by touching it with two fingers and spreading them apart. Well, maybe it works. I never tried!
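To make that mapping a bit more concrete, here is a minimal sketch of how a pinch and a twist might be turned into a zoom factor and a rotation angle. It is not taken from any particular toolkit: the function names, the touch representation as (x, y) tuples, and the clamping limits are all my own assumptions for illustration.

```python
import math

def distance(p, q):
    """Euclidean distance between two touch points given as (x, y) tuples."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def pinch_scale(start_touches, current_touches, min_scale=0.25, max_scale=8.0):
    """Zoom factor for a two-finger pinch (hypothetical helper).

    The factor is the ratio of the current finger distance to the distance
    when the gesture started, clamped to assumed limits.
    """
    start = distance(*start_touches)
    if start == 0:
        return 1.0  # fingers started on the same spot: no meaningful ratio
    scale = distance(*current_touches) / start
    return max(min_scale, min(max_scale, scale))

def twist_angle(start_touches, current_touches):
    """Rotation in degrees between the start and current finger axes.

    The result is normalized to the range [-180, 180).
    """
    (s0, s1), (c0, c1) = start_touches, current_touches
    a_start = math.atan2(s1[1] - s0[1], s1[0] - s0[0])
    a_now = math.atan2(c1[1] - c0[1], c1[0] - c0[0])
    delta = math.degrees(a_now - a_start)
    return (delta + 180.0) % 360.0 - 180.0

# Two fingers start 100 px apart and end 200 px apart: the image doubles in size.
zoom = pinch_scale(((0, 0), (100, 0)), ((0, 0), (200, 0)))
# The second finger swings a quarter turn around the first one.
turn = twist_angle(((0, 0), (100, 0)), ((0, 0), (0, 100)))
```

Nothing about the ratio of two finger distances is natural by itself. It only feels natural because the picture follows the fingers closely enough to sustain the illusion.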
There is another thought. If you hold a mobile device in your hand, there is just one hand left to interact with the system. Or one thumb if you think about cell phones.
And yet another thought is just catching up: Is texting/SMS a form of CLI?
In any interface, richer input demands richer output. In order to harness natural responses, a NUI output has to be as fast and convincing as nature itself. When the user makes a natural gesture like a pinch, the display has to respond in an animated, often photorealistic way in real time, or else the illusion will be broken.
The GUIs of today, with their windows and icons and menus, are laden with visual signals and controls, or "chrome". This is one of the most unnatural features of a computer interface and tends to distract users from the actual content they are trying to work with. A NUI strips most of this away and lets users focus on one thing at a time.
Do not blame the WIMP desktop / browser GUIs if it is in fact featuritis that is causing the pain in many applications. Conversely, is it desirable to focus on just one thing at a time in the so-called NUI? Is it natural? What is the cost of not being able to jump with ease between related or unrelated apps?
Finally, a NUI is not just spatially realistic, but temporally realistic as well. In the real world, actions have immediate consequences. If you want to go swimming, you don't have to wait for a river to "boot up". Splashes happen as you swim and will not be lost if you forget to save them. Similarly, NUI devices and applications start instantly and stop on a dime. Changes are saved as you go.
Again, I call it bad usability if desktop applications do not respond in time. Hovering, clicking, dragging and typing should happen without perceivable delay. It is poor software engineering if these gestures have no immediate response. The user is more patient when it comes to real computation, database access, or page loads in a web browser. In general, I think WIMP/desktop/web GUIs and NUIs should have the same objectives. But I have to confess that state-of-the-art touch devices provide micro-feedback much better than classical PCs.
Apple did not invent the NUI, but its iPhone was the first device to take these concepts mainstream. Competing smart phones, with their GUI interactions and rows of tiny buttons, were no match for the iPhone's chrome-free, fully-responsive, multi-touch UI. The iPhone was NUI's proof of concept.
But the iPhone did not trigger a full paradigm shift, because there is a relatively small overlap between smartphone use cases and desktop use cases. The iPad, however, is a different beast altogether.
Most smartphone tasks follow what our Oracle mobile team calls the two-minute rule: you take the phone out of your pocket, do your task, and put it right back. Tasks on the iPad, in contrast, often last just as long as desktop tasks. In fact, they are often the very same tasks.
The iPad's screen size makes all the difference. It allows a much fuller expression of NUI interactions with innovations like popovers and orientation-sensitive split screens not possible on a pocket-sized device. The result is that many common tasks currently performed on a desktop or laptop can be done more efficiently and more pleasantly on an iPad.
That's interesting, but I have neither valid data nor personal experience to tell whether this is really the case. The length of a session alone does not put two tasks in the same category. I would expect that typical use cases for PCs and TTs (touch tablets, to introduce a new acronym) are quite different. Tasks for knowledge workers are a better fit for PCs, while tasks for information seekers and multimedia entertainment are a better fit for tablets. Communication will remain omnipresent on all devices. As I see it, we agree to a certain extent:
CLIs did not go away, and neither will GUIs. Both are still superior for certain types of tasks. For awhile at least, GUIs will be preferred for complex, desk-bound tasks that really require multiple windows and lots of chrome.
But for simpler tasks, like reading, surfing the web, dealing with email, sketching diagrams, writing blogs, and for unfettered tasks now done with paper or clipboards in warehouses or hospitals or hallways or airports, more and more people will prefer NUIs.
Just as GUI did many things better than CLI, so NUI now does many things better than GUI. The iPad is a tipping point, just as the Mac was. And because this change will be so far-reaching, the impact on the enterprise is not "if," but "when."
Thanks, John, for your initial blog article! It rang so many bells that I had to answer the way I did.
- Darren David, Nathan Moody: Designing Natural User Interfaces. Video at http://vimeo.com/4420794, slides at http://stimulant.io/files/uxweek_stimulant_DNI.pdf
- Bret Victor: A brief rant on the future of interaction design. – The future of interaction design is not pictures behind glass.