thinktime

The ripples

Seth Godin - Wed 28th Sep 2016 19:09
Every decision we make changes things. The people we befriend, the examples we set, the problems we solve... Sometimes, if we're lucky, we get to glimpse those ripples as we stand at the crossroads. Instead of merely addressing the urgency...        Seth Godin
Categories: thinktime

Task Performance Indicator: A Management Metric for Customer Experience

a list apart - Wed 28th Sep 2016 00:09

It’s hard to quantify the customer experience. “Simpler and faster for users” is a tough sell when the value of our work doesn’t make sense to management. We have to prove we’re delivering real value—an increased success rate or reduced time-on-task, for example—to get their attention. Management understands metrics that link with other organizational metrics, such as lost revenue, support calls, or repeat visits. So, we need to describe our environment with metrics of our own.

For the team I work with, that meant developing a remote testing method that would measure the impact of changes on customer experience—assessing alterations to an app or website in relation to a defined set of customer “top tasks.” The resulting metric is stable, reliable, and repeatable over time. We call it the Task Performance Indicator (TPI).

For example, if a task has a TPI score of 40 (out of 100), it has major issues. If you measure again in 6 months’ time but nothing has been done to address the issues, the testing score will again result in a TPI of 40.

In traditional usability testing, it has long been established that if you test with between three and eight people, you’ll find out if significant problems exist. Unfortunately, that’s not enough to reveal precise success rates or time-on-task measurements. What we’ve discovered from hundreds of tests over many years is that reliable and stable patterns aren’t apparent until you’re testing with between 13 and 18 people. Why is that?

When the number of participants ranges anywhere from 13–18 people, testing results begin to stabilize and you’re left with a reliable baseline TPI metric.
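
As a rough intuition (not the authors' analysis), the toy Python sketch below uses invented scores to show how a per-task average swings widely with only a handful of participants and settles as more are added.

```python
# Toy illustration only: simulated task scores show how a per-task average
# steadies as the number of participants grows. The score distribution is invented.
import random

random.seed(7)
scores = [random.gauss(61, 18) for _ in range(18)]  # 18 simulated participants

for n in range(3, len(scores) + 1):
    running_mean = sum(scores[:n]) / n
    print(f"{n:2d} participants: running mean = {running_mean:5.1f}")
```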

The following chart shows why we can do this (Fig. 1).

Fig 1: TPI scores start to level out and stabilize as more participants are tested.

How TPI scores are calculated

We’ve spent years developing a single score that we believe is a true reflection of the customer experience when completing a task.

For each task, we present the user with a “task question” via live chat. Once they understand what they have to do, the user indicates that they are starting the task. At the end of the task, they must provide an answer to the question. We then ask people how confident they are in their answer.

A number of factors affect the resulting TPI score.

Time: We establish what we call the “Target Time”—how long it should take to complete the task under best practice conditions. The more they exceed the target time, the more it affects the TPI.

Time out: The person takes longer than the maximum time allocated. We set it at 5 minutes.

Confidence: At the end of each task, people are asked how confident they are. For example, low confidence in a correct answer would have a slight negative impact on the TPI score.

Minor wrong: The person is unsure; their answer is almost correct.

Disaster: The person has high confidence, but the wrong result; acting on this wrong answer could have serious consequences.

Gives up: The person gives up on the task.

A TPI of 100 means that the user has successfully completed the task within the agreed target times.
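
The article doesn't publish the exact formula behind the score, but as a rough illustration, here is a minimal Python sketch of how a composite 0–100 score could combine the factors above. Every weight and penalty in it is an invented assumption for demonstration, not the authors' actual method.

```python
# Hypothetical TPI-style scoring sketch. The weights and penalty rules are
# illustrative assumptions, not the authors' actual formula.

def task_score(completion_time, target_time, outcome, confidence, max_time=300):
    """Score one task attempt on a 0-100 scale.

    outcome: "correct", "minor_wrong", "disaster", or "gave_up"
    confidence: participant's self-rated confidence, 0.0-1.0
    """
    if outcome in ("gave_up", "disaster"):   # hard failures dominate the score
        return 0.0
    if completion_time >= max_time:          # timed out (5-minute cap)
        return 10.0

    score = 100.0
    if completion_time > target_time:        # penalize exceeding the target time
        overrun = (completion_time - target_time) / target_time
        score -= min(40.0, 40.0 * overrun)
    if outcome == "minor_wrong":             # almost correct still costs something
        score -= 20.0
    if outcome == "correct" and confidence < 0.5:
        score -= 5.0                         # low confidence in a correct answer
    return max(0.0, score)

# A task's TPI could then be the mean score across 13-18 participants.
attempts = [
    task_score(120, 90, "correct", 0.9),
    task_score(310, 90, "correct", 0.8),     # timed out
    task_score(80, 90, "minor_wrong", 0.4),
]
print(round(sum(attempts) / len(attempts), 1))
```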

In the following chart, the TPI score is 61 (Fig. 2).

Fig 2: A visual breakdown of sample results for Overall Task Performance, Mean Completion Times, and Mean Target Times.

Developing task questions

Questions are the greatest source of potential noise in TPI testing. If a question is not worded correctly, it will invalidate the results. To get an overall TPI for a particular website or app, we typically test 10-12 task questions. In choosing a question, keep in mind the following:

Based on customer top tasks. You must choose task questions that are examples of top tasks. If you measure and then seek to improve the performance of tiny tasks (low demand tasks) you may be contributing to a decline in the overall customer experience.

Repeatable. Create task questions that you can test again in 6 to 12 months.

Representative and typical. Don’t make the task questions particularly difficult. Start off with reasonably basic, typical questions.

Universal, everyone can do it. Every one of your test participants must be able to do each task. If you’re going to be testing a mixture of technical, marketing, and sales people, don’t choose a task question that only a salesperson can do.

One task, one unique answer. Limit each task question to only one actual thing you want people to do, and one unique answer.

Does not contain clues. The participant will examine the task question like Sherlock Holmes would hunt for a clue. Make sure it doesn’t contain any obvious keywords that could be answered by conducting a search.

Short—30 words or less. Remember, the participant is seeing each task question for the first time, so aim to keep its length at less than 20 words (and definitely less than 30).

No change within testing period. Choose questions where the website or app is not likely to change during the testing period. Otherwise, you’re not going to be testing like with like.

Case Study: Task questions for OECD

Let’s look at some top tasks for the customers of the Organisation for Economic Co-operation and Development (OECD), an economic and policy advice organization.

  1. Access and submit country surveys, reviews, and reports.
  2. Compare country statistical data.
  3. Retrieve statistics on a particular topic.
  4. Browse a publication online for free.
  5. Access, submit, and review working papers.

Based on that list, these task questions were developed:

  1. What are OECD’s latest recommendations regarding Japan’s healthcare system?
  2. In 2008, was Vietnam on the list of countries that received official development assistance?
  3. Did more males per capita die of heart attacks in Canada than in France in 2004?
  4. What is the latest average starting salary, in US dollars, of a primary school teacher across OECD countries?
  5. What is the title of Box 1.2 on page 73 of OECD Employment Outlook 2009?
  6. Find the title of the latest working paper about improvements to New Zealand’s tax system.
Running the test

To test 10-12 task questions usually takes about one hour, and you’ll need between 13 and 18 participants (we average 15). Make sure that they’re representative of your typical customers. 

We’ve found that remote testing is better, faster, and cheaper than traditional lab-based measurement for TPI testing. With remote testing, people are more likely to behave in a natural way because they are in their normal environment—at home or in the office—and using their own computer. That makes it much easier for someone to give you an hour of their time, rather than spend the morning at your lab. And since the cost is much lower than lab-based tests, we can set them up more quickly and more often. It’s even convenient to schedule them using Webex, GoToMeeting, Skype, etc.

The key to a successful test is that you are confident, calm, and quiet. You’re there to facilitate the test—not to guide it or give opinions. Aim to become as invisible as possible.

Prior to beginning the test, introduce yourself and make sure the participant gives you permission to record the session. Next, ask that they share their screen. Remember to stress that you are only testing the website or app—not them. Ask them to go to an agreed start point where all the tasks will originate. (We typically choose the homepage for the site/app, or a blank tab in the browser.)

Explain that for each task, you will paste a question into the chat box on their screen. Test the chat box to confirm that the participant can read it, and tell them that you will also read the task aloud a couple of times. Once they understand what they have to do, ask them to indicate when they are starting the task and to give an answer once they’ve finished. After they’ve completed the task, ask the participant how confident they are in their answer.

Analyzing the results

As you observe the tests, you’re looking for patterns. In particular, look for the major reasons people give for selecting the wrong answer or exceeding the target time.

Video recordings of your customers as they try—and often fail—to complete their tasks have powerful potential. They are the raw material of empathy. When we identify a major problem area during a particular test, we compile a video featuring three to six participants who were affected. For each participant, we select less than a minute of footage showing them struggling with the problem, then edit these snippets into a combined video that we try to keep under three minutes. Finally, we get as many stakeholders as possible to watch it. Seek to distribute these videos as widely and as often as possible.

How Cisco uses the Task Performance Indicator

Every six months or so, we measure several tasks for Cisco, including the following:

Task: Download the latest firmware for the RV042 router.

The top task of Cisco customers is downloading software. When we started the Task Performance Indicator for software downloads in 2010, a typical customer might take 15 steps and more than 300 seconds to download a piece of software. It was a very frustrating and annoying experience. The Cisco team implemented a continuous improvement process based on the TPI results. Every six months, the Task Performance Indicator was carried out again to see what had been improved and what still needed fixing. By 2012—for a significant percentage of software—the number of steps to download software had been reduced from 15 to 4, and the time on task had dropped from 300 seconds to 40 seconds. Customers were getting a much faster and better experience.

According to Bill Skeet, Senior Manager of Customer Experience for Cisco Digital Support, implementing the TPI has had a dramatic impact on how people think about their jobs:

We now track the score of each task and set goals for each task. We have assigned tasks and goals to product managers to make sure we have a person responsible for managing the quality of the experience ... Decisions in the past were driven primarily by what customers said and not what they did. Of course, that sometimes didn’t yield great results because what users say and what they do can be quite different.

Troubleshooting and bug fixing are also top tasks for Cisco customers. Since 2012, we’ve tested the following.

Task: Ports 2 and 3 on your ASR 9001 router, running v4.3.0 software, intermittently stop functioning for no apparent reason. Find the Cisco recommended fix or workaround for this issue.

Fig 3: Bug Task Success Rate Comparisons, February 2012 through December 2014.

For a variety of reasons, it was difficult to solve the underlying problems connected with finding the right bug fix information on the Cisco website. Thus, the scores from February 2012 to February 2013 did not improve in any significant way.

For the May 2013 measurement, the team ran a pilot to show how (with the proper investment) it could be much easier to find bug fix information. As we can see in the preceding image, the success rate jumped. However, it was only a pilot and by the next measurement it had been removed and the score dropped again. The evidence was there, though, and the team soon obtained resources to work on a permanent fix. The initial implementation was for the July 2014 measurement, where we see a significant improvement. More refinements were made, then we see a major turnaround by December 2014.

Task: Create a new guest account to access the Cisco.com website and log in with this new account.

Fig 4: Success/Failure rates from March 2014 through June 2015

This task was initially measured in 2014; the results were not good.

In fact, nobody succeeded in completing the task during the March 2014 measurements, resulting in three specific design improvements to the sign-up form. These involved:

  1. Clearly labelling mandatory fields
  2. Improving password guidance
  3. Eliminating address mismatch errors.

A shorter pilot form was also launched as a proof of concept. Success jumped by 50% in the July 2014 measurements, but dropped 21% by December 2014 because the pilot form was no longer there. By June 2015, a shorter, simpler form was fully implemented, and the success rate again reached 50%.

The team was able to show that because of their work:

  • The three design improvements raised the success rate by 29%.
  • The shorter form improved the success rate by 21%.

That’s very powerful. You can isolate a piece of work and link it to a specific increase in the TPI. You can start predicting that if a company invests X it will get a Y TPI increase. This is control and the route to power and respect within your organization, or to trust and credibility with your client.

If you can link it with other key performance indicators, that’s even more powerful.

The following table shows that improvements to the registration form halved the support requests connected with guest account registration (Fig. 5).

Fig 5: Registration Support Requests, Q1 2014, Q2 2015, and Q3 2015.

A simpler guest registration process resulted in:

  • A reduction in support requests, from 1,500 a quarter to fewer than 700
  • Three fewer people needed to support customer registration
  • An 80% productivity improvement
  • Registration time cut from 3:25 to 2 minutes

Task: Pretend you have forgotten the password for your Cisco account and take whatever actions are required to log in.

When we first measured this task, we found a 37% failure rate.

A process of improvement was undertaken, as can be seen by the following chart, and by December 2013, we had a 100% success rate (Fig. 6).

Fig 6: Progression of success rate improvement from November 2012 to December 2013.

A 100% success rate is a fantastic result. Job done, right? Wrong. In digital, the job is never done. It is always an evolving environment. You must keep measuring the top tasks because the digital environment they exist within is constantly changing. Stuff is getting added, stuff is getting removed, and stuff just breaks (Fig. 7).

Fig 7: Comparison of success rates, March 2014 and July 2014.

When we measured again in March 2014, the success rate had dropped to 59% because of a technical glitch. It was quickly dealt with, so the rate shot back up to 100% by July.

At every step of the way, the TPI gave us evidence about how well we were doing our job. It’s really helped us fight against some of the “bright shiny object” disease and the tendency for everyone to have an opinion on what we put on our webpages ... because we have data to back it up. It gave us more insight into how content organization played a role in our work for Cisco, something that Jeanne Quinn (senior manager responsible for the Cisco Partner) told us kept things clear and simple while working with the client.

The TPI allows you to express the value of your work in ways that make sense to management. If it makes sense to management—and if you can prove you’re delivering value—then you get more resources and more respect.

Categories: thinktime

Wedding syndrome

Seth Godin - Tue 27th Sep 2016 19:09
Running a business is a lot more important than starting one. Choosing and preparing for the job you'll do for the next career is a much more important task than getting that job. Serving is more important than the campaign....        Seth Godin
Categories: thinktime

Spectator sports

Seth Godin - Mon 26th Sep 2016 19:09
Every year, we spend more than a trillion dollars worth of time and attention on organized spectator sports. The half-life of a sporting event is incredibly short. Far more people are still talking about the Godfather movie or the Nixon...        Seth Godin
Categories: thinktime

Anxiety loves company

Seth Godin - Sun 25th Sep 2016 18:09
Somehow, at least in our culture, we find relief when others are anxious too. So we spread our anxiety, stoking it in other people, looking for solace in the fear in their eyes. And thanks to the media, to the...        Seth Godin
Categories: thinktime

Looking for the trick

Seth Godin - Sat 24th Sep 2016 19:09
When you find a trick, a shortcut, a hack that gets you from here to there without a lot of sweat or risk, it's really quite rewarding. So much so that many successful people are hooked on the trick, always...        Seth Godin
Categories: thinktime

Skills vs. talents

Seth Godin - Fri 23rd Sep 2016 19:09
If you can learn it, it's a skill. If it's important, but innate, it's a talent. The thing is, almost everything that matters is a skill. If even one person is able to learn it, if even one person is...        Seth Godin
Categories: thinktime

For the weekend...

Seth Godin - Fri 23rd Sep 2016 03:09
New podcast with Brian Koppelman Classic podcast with Krista Tippett Unmistakable Creative from 2015 And a video of Creative Mornings and their podcast The Your Turn book continues to spread. Have you seen it yet? Early-bird pricing on the huge...        Seth Godin
Categories: thinktime

Widespread confusion about what it takes to be strong

Seth Godin - Thu 22nd Sep 2016 18:09
Sometimes we confuse strength with: Loudness Brusqueness An inability to listen A resistance to seeing the world as it is An unwillingness to compromise small things to accomplish big ones Fast talking Bullying External unflappability Callousness Lying Policies instead of...        Seth Godin
Categories: thinktime

Big fish in a little pond

Seth Godin - Wed 21st Sep 2016 19:09
There's no doubt that the big fish gets respect, more attention and more than its fair share of business as a result. The hard part of being a big fish in a little pond isn't about being the right fish....        Seth Godin
Categories: thinktime

Why We Should All Be Data Literate

a list apart - Wed 21st Sep 2016 00:09

Recently, I was lucky enough to see the great Jared Spool talk (spoiler: all Spool talks are great Spool talks). In this instance, the user interface icon warned of the perils of blindly letting data drive design.

I am in total agreement with 90 percent of his premise. Collecting and analyzing quantitative data can indeed inform your design decisions, and smart use of metrics can fix critical issues or simply improve the user experience. However, this doesn’t preclude a serious problem with data, or more specifically, with data users. Spool makes this clear: When you don’t understand what data can and can’t tell you and your work is being dictated by decisions based on that lack of understanding—well, your work and product might end up being rubbish. (Who hasn’t heard a manager fixate on some arbitrary metric, such as, “Jane, increase time on page” or “Get the bounce rate down, whatever it takes”?) Designing to blindly satisfy a number almost always leads to a poorer experience, a poorer product, and ultimately the company getting poorer.

Where Spool and I disagree is in his conclusion that all design teams need to include a data scientist. Or, better yet, that all designers should become data scientists. In a perfect world, that would be terrific. In the less-perfect world that most of us inhabit, I feel there’s a more viable way. Simply put: all designers can and should learn to be data literate. Come to think of it, it’d be nice if all citizens learned to be data literate, but that’s a different think piece.

For now, let’s walk through what data literacy is, how to go about getting it for less effort and cost than a certificate from Trump University, and how we can all build some healthy data habits that will serve our designs for the better.

What Data Literacy Is and Isn’t

Okay, data literacy is a broad term—unlike, say, “design.” In the education field, researchers juggle the terms “quantitative literacy,” “mathematical literacy,” and “quantitative reasoning,” but parsing out fine differences is beyond the scope of this article and, probably, your patience. To keep it simple, let’s think about data literacy as healthy skepticism or even bullshit detection. It’s the kind of skepticism you might adopt when faced with statements from politicians or advertisers. If a cookie box is splashed with a “20% more tasty!” banner, your rightful reaction might be “tastier than what, exactly, and who says?” Yes. Remember that response.

Data literacy does require—sorry, phobics—some math. But it’s not so bad. As a designer, you already use math: figuring pixels, or calculating the square footage of a space, or converting ems to percent and back. The basics of what you already do should give you a good handle on concepts like percentages, probability, scale, and change over time, all of which sometimes can hide the real meaning of a statistic or data set. But if you keep asking questions and know how multiplication and division work, you’ll be 92 percent of the way there. (If you’re wondering where I got that percentage from, well—I made it up. Congratulations, you’re already on the road to data literacy.)

Neil Lutsky writes about data literacy in terms of the “construction, communication, and evaluation of arguments.” Why is this relevant to you as a designer? As Spool notes, many design decisions are increasingly driven by data. Data literacy enables you to evaluate the arguments presented by managers, clients, and even analytics packages, as well as craft your own arguments. (After all, a key part of design is being able to explain why you made specific design decisions.) If someone emails you a spreadsheet and says, “These numbers say why this design has to be 5 percent more blue,” you need to be able to check the data and evaluate whether this is a good decision or just plain bonkers.

Yes, this is part of the job.

It’s So Easy

Look, journalists can get pretty good at being data literate. Not all journalists, of course, but there’s a high correlation between the ability to question data and the quality of the journalism—and it’s not high-level or arcane learning. One Poynter Institute data course was even taught (in slightly modified form) to grade schoolers. You’re a smart cookie, so you can do this. Not to mention the fact that data courses are often self-directed, online, and free (see “Resources” listed below).

Unlike data scientists, who face complex questions and large data sets and need to master concepts like regressions and Fourier transforms, you’re probably going to deal with less complex data. If you regularly need to map out complex edge-node relationships in a huge social graph or tackle big data, then yes, get that master’s degree in the subject or consult a pro. But if you’re up against Google Analytics? You can easily learn how to ask questions and look for answers. Seriously, ask questions and look for answers.

Designers need to be better at data literacy for many of the same reasons we need to work on technical literacy, as Sarah Doody explains. We need to understand what developers can and can’t do, and we need to understand what the data can and can’t do. For example, an A/B test of two different designs can tell you one thing about one thing, but if you don’t understand how data works, you probably didn’t set up the experiment conditions in a way that leads to informative results. (Pro tip: if you want to see how a change affects click-through, don’t test two designs where multiple items differ, and don’t expect the numbers to tell you why that happened.) Again: We need to question the data.
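
As a concrete, purely hypothetical illustration of what an A/B test can and can't tell you: the sketch below checks whether two click-through rates differ by more than chance would explain, using a standard two-proportion z-test with made-up numbers. Even a convincing p-value only says the rates differ; it says nothing about which of several changed items caused the difference.

```python
# Hypothetical A/B click-through numbers; the test says whether the rates
# differ beyond chance, not why they differ.
from math import sqrt, erf

def two_proportion_p_value(clicks_a, views_a, clicks_b, views_b):
    """Two-sided p-value for a difference between two click-through rates."""
    p_a, p_b = clicks_a / views_a, clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (p_a - p_b) / se
    # Standard normal CDF via erf; doubled for a two-sided test.
    return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

print(two_proportion_p_value(120, 2400, 156, 2380))  # variants A and B
```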

So we’ve defined a need, researched our users, and identified and defined a feature called data literacy. What remains is prototyping. Let’s get into it, shall we?

How to Build Data Literacy by Building Habits

Teaching data literacy is an ongoing topic of academic research and debate, so I’ll leave comprehensive course-building to more capable hands than mine. But together, we can cheaply and easily outline simple habits of critical thought and mathematical practice, and this will get us to, let’s say, 89 percent data literacy. At the least, you’ll be better able to evaluate which data could make your work better, which data should be questioned more thoroughly, and how to talk to metric-happy stakeholders or bosses. (Optional homework: this week, take one metric you track or have been told to track at work, walk through the habits below, and report back.)

Habit one: Check source and context

This is the least you should do when presented with a metric as a fait accompli, whether that metric is from a single study, a politician, or an analytics package.

First, ask about the source of the data (in journalism, this is reflex—“Did the study about the health benefits of smoking come from the National Tobacco Profiteering Association?”). Knowing the source, you can then investigate the second question.

The second question concerns how the data was collected, and what that can tell you—and what it can’t.  Let’s say your boss comes in with some numbers about time-on-page, saying “Some pages are more sticky than others. Let’s redesign the others to keep customers on all the other pages longer.” Should you jump to redesign the less-sticky pages, or is there a different problem at play?

It’s simple, and not undermining, to ask how time-on-page was measured and what it means. It could mean a number of things, things that that single metric will never reveal. Things that could be real problems, real advantages, or a combination of the two. Maybe the pages with higher time-on-page numbers simply took a lot longer to load, so potential customers were sitting there as a complex script or crappy CDN was slooooowly drawing things on the not-a-customer-any-more’s screen. Or it could mean some pages had more content. Or it could mean some were designed poorly and users had to figure out what to do next.

How can you find this out? How can you communicate that it’s important to find out? A quick talk with the dev team or running a few observations with real users could lead you to discover what the real problem is and how you can redesign to improve your product.

What you find out could be the difference between good and bad design. And that comes from knowing how a metric is measured, and what it doesn’t measure. The metric itself won’t tell you.

For your third question, ask the size of the sample. See how many users were hitting that site, whether the time-on-page stat was measured for all or some of these users, and whether that’s representative of the usual load. Your design fix could go in different directions depending on the answer. Maybe the metric was from just one user! This is a thing that sometimes happens.

Fourth, think and talk about context. Does this metric depend on something else? For example, might this metric change over time? Then you have to ask over what time period the metric was measured, if that period is sufficient, and whether the time of year when measured might make a difference.

Remember when I said change over time can be a red flag? Let’s say your boss is in a panic, perusing a chart that shows sales from one product page dropping precipitously last month. Design mandates flood your inbox: “We’ve got to promote this item more! Add some eye-catching design, promote it on our home page!”

What can you do to make the right design decisions? Pick a brighter blue for a starburst graphic on that product page?

Maybe it would be more useful to look at a calendar. Could the drop relate to something seasonal that should be expected? Jack o’lantern sales do tend to drop after November 1. Was there relevant news? Apple’s sales always drop before their annual events, as people expect new products to be announced. A plethora of common-sense questions could be asked.

The other key point about data literacy and change is that being data literate can immunize against common errors when looking at change over time. This gets to numeracy.

Habit two: Be numerate

I first learned about numeracy through John Allen Paulos’ book Innumeracy: Mathematical Illiteracy and its Consequences, though the term “innumeracy” was originated by Pulitzer Prize-winning scientist Douglas Hofstadter. Innumeracy is a parallel to illiteracy; it means the inability to reason with numbers. That is, the innumerate can do math but are more likely to trip up when mathematical reasoning is critical. This often happens when dealing with probability and coincidence, with statistics, and with things like percentages, averages, and changes. It’s not just you—these can be hard to sort out! We’re presented with these metrics a lot, but usually given little time to think about them, so brushing up on that bit of math can really help put out (or avoid) a trash fire of bad design decisions.

Consider this: A founder comes in with the news that an app has doubled its market base in the two weeks it’s been available. It’s literally gone up 100 percent in that time. That’s pretty awesome, right? Time to break out the bubbly, right? But what if you asked a few questions and found that this really meant the founder was the first user, and then eventually her mom got onto it? That is literally doubling the user base by exactly 100 percent.

Of course that’s obvious and simple. You see right off why this startup probably shouldn’t make the capital outlay to acquire a bottle or two juuuust yet. But exactly this kind of error gets overlooked easily and often when the math gets a bit more complex.

Any time you see a percentage, such as “23% more” or “we lost 17%,” don’t act until you’ve put on your math hat. You don’t even need to assume malice; this stuff simply gets confusing fast, and it’s part of your job not to misread the data and then make design decisions based on an erroneous understanding.

Here’s an example from Nicolas Kayser-Bril, who looks into the headline, “Risk of Multiple Sclerosis Doubles When Working at Night”:

“Take 1,000 Germans. A single one will develop MS over his lifetime. Now, if every one of these 1,000 Germans worked night shifts, the number of MS sufferers would jump to two. The additional risk of developing MS when working in shifts is one in 1,000, not 100%. Surely this information is more useful when pondering whether to take the job.”

This is a known issue in science journalism that isn’t discussed enough, and often leads to misleading headlines. Whenever there’s a number suggesting something that affects people, or a number suggesting change, look not just at the percentage but at what this would mean in the real world; do the math and see if the result matches the headline’s intimation. Also ask how the percentage was calculated. How was the sausage made? Lynn Arthur Steen explains how percentages presented to you may not just be the difference of two numbers divided by a number. Base lesson: always learn what your analytics application measures and how it calculates things. Four out of five dentists agree...so that’s, what, 80 percent true?
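
Kayser-Bril’s point is easy to check with a few lines of arithmetic: the relative change (“doubles,” i.e., 100%) and the absolute change (one extra case per 1,000) describe the same data very differently. A quick sketch using the figures from the quote:

```python
# Relative vs. absolute change, using the figures from the MS example above.
baseline_risk = 1 / 1000      # 1 in 1,000 develops MS over a lifetime
night_shift_risk = 2 / 1000   # "risk doubles" for night-shift workers

relative_increase = (night_shift_risk - baseline_risk) / baseline_risk
absolute_increase = night_shift_risk - baseline_risk

print(f"Relative increase: {relative_increase:.0%}")   # 100% -- the headline
print(f"Absolute increase: {absolute_increase:.1%}")   # 0.1% -- one extra case per 1,000
```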

Averages are another potentially deceptive metric that simple math can help untangle; sometimes the average is barely relevant, if at all. “The average length of a book purchased on Amazon is 234.23 pages” may not actually tell you anything. Sometimes you need to look into what’s being averaged. Given the example “One in every 15 Europeans is illiterate,” Kayser-Bril points out that maybe close to one in 15 Europeans is under the age of seven. It’s good advice to learn the terms “mode,” “median,” and “standard deviation.” (It doesn’t hurt (much), and can make you a more interesting conversationalist at dinner parties!)
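
To make the averages point concrete, here is a tiny sketch with invented page counts: one very long book drags the mean well above what a “typical” book looks like, which is exactly what the median and standard deviation reveal.

```python
# Invented page counts: a single very long book pulls the mean upward.
from statistics import mean, median, stdev

pages = [120, 135, 150, 160, 170, 180, 190, 210, 240, 1200]

print(mean(pages))    # 275.5 -- "the average book is 275 pages"
print(median(pages))  # 175.0 -- half the books are shorter than this
print(stdev(pages))   # large spread: the mean alone hides the outlier
```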

Habit three: Check your biases

I know, that sounds horrible. But in this context, we’re talking about cognitive biases, which everyone has (this is why I encourage designers to study psychology, cognition studies, and sociology as much as they can). Though we have biases, it’s how aware we are of these issues and how we deal with them that counts.

It’s out of scope to list and describe them all (just thinking I know them all is probably an example of Dunning-Kruger). We’ll focus on two that are most immediately relevant when you’re handed supposedly-objective metrics and told to design to them. At least, these are two that I most often see, but that may be selection bias.

Selection bias

Any metric or statistical analysis is only as good as (in part) what you choose to measure. Selection bias is when your choice of what to measure isn’t really random or representative. This can come from a conscious attempt to skew the result, from carelessly overlooking context, or due to some hidden process.

One example might be if you’re trying to determine the average height of the adult male in the United States and find it to be 6'4"—oops, we only collected the heights of basketball players. Online opinion polls are basically embodied examples of selection bias, as the readers of a partisan site are there because they already share the site operator’s opinion. Or you may be given a survey that shows 95 percent of users of your startup’s app say they love it, but when you dig into the numbers, the people surveyed were all grandmothers of the startup team employees (“Oh, you made this, dear? I love it!”). This holds in usability testing, too: if you only select, say, high-level programmers, you may be convinced that a “to install this app, recompile your OS kernel” step is a totally usable feature. Or end up with Pied Piper’s UI.

Now, these all seem like “sure, obvs” examples. But selection bias can show up in much more subtle forms, and in things like clinical studies. Dr. Madhukar Pai’s slides here give some great examples — especially check out Slide 47, which shows how telephone surveys have almost built-in selection biases.

So, what’s a designer to do? As you can see from Dr. Pai’s lecture slides, you can quickly get into some pretty “mathy” work, but the main point is that when you’re faced with a metric, after you’ve checked out the context, look at the sample. You can think about the claim on the cookie box in this way. It’s “20% more tasty”?  What was the sample, 19 servings of chopped liver and one cookie?
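
To see how much a non-representative sample can distort a metric, here is a toy simulation of the basketball-player example above; every number in it is invented.

```python
# Toy selection-bias simulation: sampling only basketball players inflates
# the estimated average height. All figures are invented for illustration.
import random

random.seed(1)

general_population = [random.gauss(70, 3) for _ in range(100_000)]  # ~5'10" on average
basketball_players = [random.gauss(77, 2) for _ in range(500)]      # ~6'5" on average

biased_sample = random.sample(basketball_players, 200)
random_sample = random.sample(general_population, 200)

print(sum(biased_sample) / len(biased_sample))  # ~77 inches: the inflated estimate
print(sum(random_sample) / len(random_sample))  # ~70 inches: closer to reality
```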

Confirmation bias

Storytelling is a powerful tool. Again, it’s how our brains are wired. But as with all tools, it can be used for good or for evil, and can be intentional or accidental. As designers, we’re told we have to be storytellers: how do people act, how do they meet-cute our product, how do they feel, what’s the character arc? This is how we build our knowledge of the world, by building stories about it. But, as Alberto Cairo explains in The Truthful Art, this is closely linked to confirmation bias, where we unconsciously (or consciously) search for, select, shape, remember, interpret, or otherwise torture basic information so that it matches what we already think we know, the stories we have. We want to believe.

Confirmation bias can drive selection bias, certainly. If you only test your design with users who already know how your product works (say, power users, stakeholders, and the people who built the product), you will get distorted numbers and a distorted sense of how usable your product is. Don’t laugh: I know of a very large and popular internet company that only does user research with power users and stakeholders.

But even if the discovery process is clean, confirmation bias can screw up the interpretation. As Cairo writes, “Even if we are presented with information that renders our beliefs worthless, we’ll try to avoid looking at it, or we’ll twist it in a way that confirms them. We humans try to reduce dissonance no matter what.” What could this mean for your design practice? What could this mean for your designs when stakeholders want you to design to specific data?

Reading (Numbers) is Fundamental

So, yes. If you can work with a data scientist in your design team, definitely do so. Try to work with her and learn alongside her. But if you don’t have this luxury, or the luxury of studying statistics in depth, think of data literacy as a vital part of your design practice. Mike Monteiro is passionate that designers need to know math, and he’s of course correct, but we don’t need math only for visual design calculations. We need to know enough math to question and analyze any metric we’re given.

This is something you can practice in everyday life, especially in an election season. When you see someone citing a study, or quoting a number, ask: What was measured? How was it measured? What was the context? What wasn’t measured? Does that work out in real life? Keep looking up terms like selection bias, confirmation bias, Dunning-Kruger, sample size effect, until you remember them and their application. That is how you build habits, and how you’ll build your data literacy muscles.

I’ve long loved the Richard Feynman quote (that Cairo cites in The Truthful Art): “The first principle is that you must not fool yourself — and you are the easiest person to fool.” Consider always that you might be fooling yourself by blindly accepting any metric handed to you. And remember, the second-easiest person to fool is the person who likely handed you the metric, and is motivated to believe a particular outcome. Data literacy requires honesty, mastering numeracy, and stepping through the habits we’ve discussed. Practice every day with news from politics: does a statistic in the news give you that “of course, that’s how things are” feeling? Take a deep breath, and dig in; do you agree with a policy or action because it’s your political party proposing it? What’s the context, the sample size, the bias?

It’s tough to query yourself this way. But that’s the job. It’s tougher to query someone else this way, whether it’s your boss or your significant other. I can’t help you figure out the politics and social minefield of those. But do try. The quality of your work (and life) may depend on it.

Resources
Categories: thinktime

Three things to keep in mind about your reputation

Seth Godin - Tue 20th Sep 2016 18:09
Your reputation has as much impact on your life as what you actually do. Early assumptions about you are sticky and are difficult to change. The single best way to maintain your reputation is to do things you're proud of....        Seth Godin
Categories: thinktime

Understanding taxonomy

Seth Godin - Mon 19th Sep 2016 19:09
If you need to add a word to the dictionary, it's pretty clear where it goes. The dictionary is a handy reminder of how taxonomies work. The words aren't sorted by length, or frequency or date of first usage. They're...        Seth Godin
Categories: thinktime

This week's sponsor: OPTIMAL WORKSHOP

a list apart - Mon 19th Sep 2016 14:09

OPTIMAL WORKSHOP — test your website‘s performance with fast and powerful UX research tools.​

Categories: thinktime

The opposite of the freeloader problem

Seth Godin - Sun 18th Sep 2016 18:09
Is the freegiver advantage. Freeloaders, of course, are people who take more than they give, drains on the system. But the opposite, the opposite is magical. These are the people who feed the community first, who give before taking, who...        Seth Godin
Categories: thinktime

The post-reality paradox

Seth Godin - Sat 17th Sep 2016 18:09
Reality and rational thought have paid more dividends in the last century than ever before. Science-based medicine has dramatically increased the lifespan and health of people around the world. Vaccines have prevented millions of children from lifelong suffering and even...        Seth Godin
Categories: thinktime

But how much does it cost?

Seth Godin - Fri 16th Sep 2016 19:09
I know what the price tag says. But what does it cost? Does it need dry cleaning? What does it eat? How long does the training take? What happens when it breaks? Where will I store it? What's the productivity...        Seth Godin
Categories: thinktime

The professional pushes back

Seth Godin - Thu 15th Sep 2016 19:09
The architect refuses to design the big, ugly building that merely maximizes short term revenue. She understands that raising the average is part of her job. The surgeon refuses to do needless surgery, no matter how much the client insists....        Seth Godin
Categories: thinktime

The clown suit

Seth Godin - Wed 14th Sep 2016 18:09
It's ever more tempting to put on the (metaphorical) clown suit. It allows you to provoke with impunity. Clowns enjoy a different relationship with the laws of physics. You can spray someone in the face with a seltzer bottle, hit...        Seth Godin
Categories: thinktime

Designing Interface Animation: an Interview with Val Head

a list apart - Wed 14th Sep 2016 00:09

A note from the editors: To mark the publication of Designing Interface Animation, ALA managing editor Mica McPheeters and editor Caren Litherland reached out to Val Head via Google Hangouts and email for a freewheeling conversation about web animation. The following interview has been edited for clarity and brevity.

Animation is not new, of course, but its journey on the web has been rocky. For years, technological limitations compelled us to take sides: Should we design rich, captivating sites in Flash? Or should we build static, standards-compliant sites with HTML and CSS (and maybe a little JavaScript)?

Author Val Head describes herself as a “weirdo” who never wanted to choose between those two extremes—and, thanks to the tools at our disposal today, we no longer have to. Without compromising standards, we can now create complex animations natively in the browser: from subtle transitions using CSS to immersive, 3-D worlds with WebGL. Animation today is not just on the web, but of the web. And that, says Val, is a very big deal.

Caren Litherland: Are people intimidated by animation?

Val Head: There are definitely some web folks out there who are intimidated by the idea of using web animation in their work. For some, it’s such a new thing—very few of us have a formal background in motion design or animation—and it can be tough to know where to start or how to use it. I’ve noticed there’s some hesitation to embrace web animation due to the “skip intro” era of Flash sites. There seems to be a fear of recreating past mistakes. But it doesn’t have to be that way at all.

We’re in a new era of web animation right now. The fact that we can create animation with the same technologies we’ve always used to make websites—things like CSS and JavaScript—completely changes the landscape. Now that we can make animation that is properly “of the web” (to borrow a phrase from Jeremy Keith), not just tacked on top with a plug-in, we get to define what the new definition of web animation is with our work.

Right now, on the web, we can create beautiful, purposeful animation that is also accessible, progressively enhanced, and performant. No other medium can do that. Which is really exciting!

CL: I’ve always felt that there was something kind of ahistorical and ahistoricizing about the early web. As the web has matured, it seems to have taken a greater interest in the history and traditions that inform it. Web typography is a good example of this increased self-awareness. Can the same be said for animation?

VH: I think so! In the early days of the web, designers often looked down on it as a less capable medium. Before web type was a thing, a number of my designer friends would say that they could never design for the web because it wasn’t expressive enough as a medium. That the web couldn’t really do design. Then the web matured, web type came along, and that drastically changed how we designed for the web. Web animation is doing much the same thing. It’s another way we have now to be expressive with our design choices, to tell stories, to affect the experience in meaningful ways, and to make our sites unique.

With type, we turned to the long-standing craft of print typography for some direction and ideas, but the more we work with type on the web, the more web typography becomes its own thing. The same is true of web animation. We can look to things like the 12 classic principles of animation for reference, but we’re still defining exactly what web animation will be and the tools and technologies we use for it. Web animation adds another dimension to how we can design on the web and another avenue for reflecting on what the rich histories of design, animation, and film can teach us.

Mica McPheeters: Do you find that animation often gets tacked on at the end of projects? Why is that? Shouldn’t it be incorporated from the outset?

VH: Yes, it often does get left to the end of projects and almost treated as just the icing on top. That’s a big part of what can make animation seem like it’s too hard or ineffective. If you leave any thought of animation until the very end of a project, it’s pretty much doomed to fail or just be meaningless decoration.

Web animation can be so much more than just decoration, but only if we make it part of our design process. It can’t be a meaningful addition to the user experience if you don’t include it in the early conversations that define that experience.

Good web animation takes a whole team. You need input from all disciplines touching the design to make it work well. It can’t just be designed in a vacuum and tossed over the fence. That approach fails spectacularly when it comes to animation.

Communicating animation ideas and making animation truly part of the process can be the biggest hurdle for teams to embrace animation. Change is hard! That’s why I dedicated two entire chapters of the book to how to get animation done in the real world. I focus on how to communicate animation ideas to teammates and stakeholders, as well as how to prototype those ideas efficiently so you can get to solutions without wasting time. I also cover how to represent animation in your design systems or documentation to empower everyone (no matter what their background is) to make good motion design decisions.

CL: Can you say more about the importance of a motion audit? Can it be carried out in tandem with a content audit? And how do content and animation tie in with each other?

VH: I find motion audits to be incredibly useful before creating a motion style guide or before embarking on new design efforts. It’s so helpful to know where animation is already being used, and to take an objective look at how effective it is both from a UX angle and a branding angle. If you have a team of any significant size, chances are you’ve probably got a lot of redundant, and maybe even conflicting, styles and uses of animation in your site. Motion audits give you a chance to see what you’re already doing, identify things that are working, as well as things that might be broken or just need a little work. They’re also a great way to identify places where animation could provide value but isn’t being used yet.

Looking at all your animation efforts at a high level gives you a chance to consolidate the design decisions behind them, and establish a cohesive approach to animation that will help tie the experience together across mediums and viewport sizes. You really need that high-level view of animation when creating a motion style guide or animation guidelines.

You could definitely collect the data for a motion audit in tandem with a content audit. You’ll likely be looking in all the same places, just collecting up more data as you go through your whole site.

There is a strong tie between content and animation. I’ve been finding this more and more as I work with my consulting clients. Both can be focused around having a strong message and communicating meaningfully. When you have a clear vision of what you want to say, you can say it with the motion you use just like you can say it with the words you choose.

Voice and tone documents can be a great place to start for deciding how your brand expresses itself in motion. I’ve leaned on these more than once in my consulting work. Those same words you use to describe how you’d like your content to feel can be a basis of how you aim to make the animation feel as well. When all your design choices—everything from content, color, type, animation—come from the same place, they create a powerful and cohesive message.

CL: One thing in your book that I found fascinating was your statement that animation “doesn’t have to include large movements or even include motion at all.” Can you talk more about that? And is there any sort of relationship between animation and so-called calm technology?

VH: It’s true, animation doesn’t always mean movement. Motion and animation are really two different things, even though we tend to use the words interchangeably. Animation is a change in some property over time, and that property doesn’t have to be a change in position. It can be a change in opacity, or color, or blur. Those kinds of non-movement animation convey a different feel and message than animation with a lot of motion.

If you stick to animating only non-movement properties like opacity, color, and blur, your interface will likely have a more calm and stable feel than if it included a lot of movement. So if your goal is to design something that feels calm, animation can definitely be a part of how you convey that feeling.

Any time you use animation, it says something; there’s no getting around that. When you’re intentional with what you want it to say and how it fits in with the rest of your design effort, you can create animation that feels like it’s so much a part of the design that it’s almost invisible. That’s a magical place to be for design.

MM: Do we also need to be mindful of the potential of animation to cause harm?

VH: We do. Animation can help make interfaces more accessible by reducing cognitive load, helping to focus attention in the right place, or other ways. But it also has potential to cause harm, depending on how you use it. Being aware of how animation can potentially harm or help users leads us to make better decisions when designing it. I included a whole chapter in the book on animating responsibly because it’s an important consideration. I also wrote about how animation can affect people with vestibular disorders a little while back on A List Apart.

MM: Who today, in your opinion, is doing animation right/well/interestingly?

VH: I’m always on the lookout for great uses of animation on the web—in fact, I highlight noteworthy uses of web animation every week in the UI Animation Newsletter.

Stripe Checkout has been one of my favorites for how well it melds UI animation seamlessly into the design. It really achieves that invisible animation that is so well integrated that you don’t necessarily notice it at first. The smooth 3D, microinteraction animation, and sound design on the Sirin Labs product page are also really well done, but take a completely different approach to UI animation than Checkout.

Publications have been using animation in wonderful ways for dataviz and storytelling lately, too. The Wall Street Journal’s Hamilton algorithm piece was a recent data-based favorite of mine, and the New York Times did some wonderful storytelling work with animation around the Olympics with their piece on Simone Biles. I also really love seeing editorial animation, like what The Verge had on a story about Skype’s sound design. The animations they used really brought the story and the sounds they were discussing to life.

I really love seeing web animation used in such a variety of ways. It makes me extra excited for the future of web animation!

MM: Any parting thoughts, Val?

VH: My best advice for folks who want to use more animation in their work is to start small and don’t be afraid to take risks as you get more comfortable working with animation. The more you animate, the better you’ll get at developing a sense for how to design it well. I wrote Designing Interface Animation to give web folks a solid foundation on animation to build from and I’m really excited to see how web animation will evolve in the near future.

For even more web animation tips and resources, join me and a great bunch of designers and developers on the UI Animation Newsletter for a weekly dose of animation knowledge.

Categories: thinktime
