If you saw my talk on this topic at UX Australia 2012, you may wish to jump down to the part on applying mental operators. Otherwise, I'd suggest going through the deck on Slideshare first.
Despite having touched on the GOMS Keystroke Level Model (KLM) while doing my Masters of Information Architecture, it wasn't until I read about it in Jef Raskin's The Humane Interface last year that I discovered just how simple and useful it could be. I was able to apply it immediately on an investment banking project I was doing at Deutsche Bank in London and was a little bit miffed that I'd managed to come this far without someone teaching it to me. I was working with some very talented people at EMC Consulting (formerly Conchango) at the time and almost no one there had heard of it, let alone used it on a project.
Doing Keystroke Level Model analysis makes you a better designer as it forces you to think about how humans interact with technology at a level that you probably haven’t before. You learn which types of interaction tend to create more work for your users (like big scrolling drop-downs) and have to think at a detailed level about automation and how people chunk tasks.
Speed isn't everything and other factors need to be taken into account – but to some degree I would argue that speed always matters. Elegance and economy of form are fundamental qualities of any type of design and we should always ensure that we are allowing people to achieve their goals with as little work as possible.
Doing KLM almost seems too easy until you get to the part when you have to apply mental operators. Here are the 6 heuristics from Card, Moran and Newell as cited in The Humane Interface (p 77):
Rule 0: Initial insertion of candidate Ms
Rule 1: Deletion of anticipated Ms
Rule 2: Deletion of Ms within cognitive units
Rule 3: Deletion of Ms before consecutive terminators
Rule 4: Deletion of Ms that are terminators of commands
Rule 5: Deletion of overlapped Ms
Basically you start by adding an M before everything. You then delete any Ms that would already have been anticipated as part of a previous M. Any Ms that fall within a cognitive unit (such as a word, number or name) are then deleted. Any Ms preceding Ks that are habitually pressed are removed. Finally, any M that overlaps an R (system delay) is also deleted.
When I first read this I thought it was impenetrable but it really just comes down to chunking. The most common chunks you will see will be pointing and clicking (MPK or think-point-click) and typing a word (MKKKK – with a K for each keystroke). Personally, I find it easier to think about chunking and add my Ms as I go. Some KLM sources use a specific value for drag and mouse button click, but for back of the envelope calculations simply using P for a drag and K for a mouse click is sufficient. Most click and drag behaviour is just a variation on MPK (think-point-click).
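For back-of-the-envelope calculations like the ones above, a few lines of code are all you need. The sketch below sums the commonly cited average operator times from Card, Moran and Newell; the exact values vary between KLM sources, and the `klm_time` helper and dictionary are my own illustrative naming, not part of any standard tool.

```python
# Rough KLM estimator. Operator times are the commonly cited averages
# (seconds); different sources quote slightly different values.
KLM_SECONDS = {
    "K": 0.2,   # keystroke or mouse-button click (average-skill typist)
    "P": 1.1,   # point with a mouse
    "H": 0.4,   # home hands between keyboard and mouse
    "M": 1.35,  # mental operator (thinking/preparation)
}

def klm_time(ops: str) -> float:
    """Predicted time in seconds for a string of operators, e.g. 'MPK'."""
    return sum(KLM_SECONDS[op] for op in ops)

# Think-point-click:
print(klm_time("MPK"))     # 2.65
# Typing a five-letter word as a single cognitive unit:
print(klm_time("MKKKKK"))  # 2.35
```

Note that the five-letter word comes out faster than a single point-and-click – a good illustration of why big scrolling drop-downs can create more work than simply letting people type.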
I encourage you to start using this on your own projects – once you make a start it's amazing how quickly you pick it up. Generally KLM is best for modelling fairly short interactions and tasks so don't get too ambitious. How this applies to touch screen interfaces is an interesting topic but something I'll have to cover in another post.
A recurring theme I come across – particularly when dealing with larger organisations – is the tendency for people to try to fix design flaws by throwing more words at the screen. These have a way of quickly adding up like spitballs, creating screens full of mushy fragments of content that don’t really add up to anything cohesive or useful. The catch-phrase used at a large bank I recently did some work for was “that’s ok, we’ll just put some wording in there” and it was uttered repeatedly every time we ran usability testing sessions.
Despite being well-intended, it doesn't work. In fact, it usually makes the problem worse. This is because content problems need to be fixed by content and design problems need to be fixed by design. If people are confused by something there is often a good reason for it. Is the label poor? Is the visual hierarchy at odds with the logical hierarchy? Is the sequence out of step with customers' mental models? Is there a disconnect between the action and the entity it acts upon? Should the action be there at all? Throwing wording at broken designs just weighs them down more and reduces the relative visibility of everything else on the screen that is actually important.
Repeat after me: verbosity does not equal usability. “Perfection is achieved not when there is nothing left to add, but when there is nothing left to take away” is the oft-quoted wisdom of Antoine de Saint-Exupéry. Can you spot the content in the following example that might be considered tedious verbosity?
The area of the screen that is visually strongest here is that which conveys the least content. In fact I would go as far as saying that it actually conveys nothing at all and this screen would be clearer and more usable without it.
Settings (or is that preferences?) in software – like their mechanical counterparts – have long been grouped together and placed out of sight from day-to-day use. Alternatively they are surfaced to create a control-laden interface weighed-down with options that are almost never needed. Revealing options contextually can be an elegant third way to solve this problem but the execution needs to be tight.
The Kindle iPhone app has a nice example of how this can work in the way that it manages orientation. Rather than burying the lock option in a menu somewhere or letting it permanently clutter the screen, the control appears the moment the screen switches. It's subtle but obvious, presented in an elegant, modeless way: it doesn't interrupt you if you did intend to switch the orientation, and extends a hand if you didn't. If you've already locked the orientation, it instead provides a nice status indicator right at the point it becomes relevant.
I’m reading my book:
The orientation switches (whether accidentally or deliberately) and the lock status/control appears:
If I’ve already locked it I’ll instead see this:
You aren’t conscious of the orientation of the screen until it switches; and at the very moment your locus of attention is drawn to the orientation, the locking option appears. Before you can finish thinking “I need to lock that screen” the control to do just that appears – it almost feels a little bit like magic.
I’ve always found the iPad’s handling of orientation a little bit clunky. I need to think about it and it becomes a mini-task in its own right. This is a more elegant solution and an exemplar of how to deal with complexity without breaking flow.
I came across this age-old dilemma recently while working on the design of a major online banking site. Choosing which personal pronoun to use is deceptively complex and requires far more consideration than it would first seem.
Ultimately I think the most important guideline is to only use a personal pronoun where it is absolutely necessary to communicate meaning. Your first question should be “Do I need a personal pronoun?” and only then “Which one should it be?” It's a slippery slope when you start adding an empty modifier to all of your labels for consistency and end up murdering scannability. You need to ensure it's adding value.
The usual sage advice people like to prescribe for a problem like this is “it doesn't matter which one you choose, just be consistent”. This advice is well-meaning but entirely useless. Unless your site does almost nothing you'll inevitably end up with cases where you have to use “your” regardless of which one you choose. This is because the scope of “my” is fundamentally limited to the few objects or items that can be considered an extension of the user. The interaction will become a dialog eventually, at which point “my” breaks down and just starts to sound ridiculous. So if you go with “my”, don't be deceived into thinking that it's simply a find and replace. Each instance needs to be considered in its own right.
As for whether you should start with “my” or “your”, consider what the metaphor for the interaction is (I'm talking about a very basic metaphor for the interaction here, not some kind of overarching navigational or behavioural metaphor). If you're using an iPhone the metaphor is one of operating a machine, so “my” applies broadly and works well (but should still only appear where absolutely necessary). The “My number” at the top of the contacts screen is a good example of this – only the use of a personal pronoun could communicate what that number is. You don't think of it as a dialog between you and Apple. It's the same kind of interaction that you'd have with your car – you drive your car, you don't have a conversation with it.
On the web however things are not so clear cut. You end up with spaces that are a bit of both. Think about online banking. Is it an intimate, private space or is the metaphor one of interacting with a teller/assistant? I think the answer is both. When it comes to selecting a personal pronoun, you need to consider both the brand/site *and* the nature of the specific interaction.
I just bought a Canon Ixus 1000 HS digital camera and my favourite improvement over my old Ixus is one that you won't find amongst all of the talk of megapixels and optical zoom in the reviews – the removal of a subtle but insidious mode.
On my old Ixus, there is a button on top that switches the mode from photography to video recording. You then press the ubiquitous button on the top right to take the photo or – if you’ve switched to video recording mode – the video. Seems simple enough. In fact, at first glance you might think that only a fool could get that wrong. If you look at my video collection however, you’ll find scores of little videos that run for a couple of seconds and all share the same sound track of “huh? aarrgggghhhh!!!”. Some of them are me, some are my wife. I was amused to see that my parents had also amassed a collection of these little short films on a recent trip to Portugal using their identical camera.
Every time I do this I'm angry at myself for being so absent-minded and fumbling over a simple switch. I'm sure you're the same. However, it occurred to me that this was in fact a text-book example of what is known in interaction design as a mode error. Mode errors are the mistakes people make when performing the same gesture in a system (in this case pressing the take photo/video button) produces a different result depending on the mode the system happens to be in at the time. Anyone can see with the most cursory of glances that the little switch shows the camera as being in video recording mode. The problem with modes however is that unless the current mode is part of your locus of attention, you will make mode errors. Period. It's inevitable. If you consciously work it through you are able to get it right. But when you're taking a photograph, your cognitive consciousness is focused on taking the shot, the timing, the composition. Operating the camera is automated and unconscious – until you suddenly realise that there has been no click and you are staring at a little video recording icon on the viewfinder. By that point it's too late, the shot is ruined and you are the victim of yet another mode error.
My old Ixus with mode selector and single button
The new Ixus has a far more elegant solution to the problem. Instead of overloading one button with several modes, a second button has been introduced. This is operated by the thumb and has been placed right where the button on a camcorder would be. So rather than being just another button on a device, the placement and operation maps perfectly to the established conventions for what used to be two different classes of device – index finger for a photo, thumb for video. Simple, elegant and effective – since buying the new camera we've not once made an accidental video recording.
My new Ixus with one button for photographs and another for video
This post is heavily influenced by the late and great Jef Raskin. I recommend you read more about modes, the cognitive conscious and automation in his classic, The Humane Interface.
Firefox's error messaging is a beautiful piece of anthropomorphic design. “Well, this is embarrassing…”. You immediately want to make it feel better, to put your arm on its shoulder: “don't worry about it Firefox, it's not that big a deal”. Such a contrast to the usual obnoxious dialog that either shows no remorse or pity whatsoever (“Operation failed, start again”), or even blames you (“you closed me down with 5 tabs still open, fool!”).
Think about what we currently tolerate. Imagine if you took the time to carefully fill in a paper form and took it to the service desk only to have the person behind the counter bark “invalid phone number format!”, throw it back in your face and slam the window shut? You’d be furious. You’d complain. You’d go somewhere else next time. Yet, we suffer this on a daily basis online. Trying to buy something from some sites is like ordering from the Soup Nazi, and you’re trying to give them money!
The fact that this kind of message is the exception rather than rule says a lot about where we are in the way we interact with virtual environments, and how far we’ve got to go…
I'm a London-based freelance interaction designer. I'm currently halfway through creating this site so will have this section finished soon. Maybe I should dig up one of those “under construction” animated GIFs from the mid-'90s and let it rip...