The main goals of this chapter are to accomplish the following:
Explain the key concepts and terms used in evaluation.
Introduce a range of different types of evaluation methods.
Show how different evaluation methods are used for different purposes at different stages of the design process and in different contexts of use.
Show how evaluation methods are mixed and modified to meet the demands of evaluating novel systems.
Discuss some of the practical challenges of doing evaluation, including the need for remote evaluation.
Illustrate through short case studies how methods discussed in more depth in Chapters 8 and 9 are used in evaluation and describe some methods that are specific to evaluation.
Provide an overview of methods that are discussed in detail in the next two chapters.
Evaluation is integral to the design process. It involves collecting and analyzing data about users’ experiences when interacting with a sketch, prototype, or component of a system. Evaluation can happen during design, before a product is released, or even after a product is launched, with the aim of improving it or addressing a pain point reported by a customer. A central goal of evaluation is to improve the product’s design. Evaluation focuses both on the usability of the product (that is, how easy it is to learn and to use) and on the users’ experiences when interacting with it (for example, how satisfying, enjoyable, or motivating the interaction is). Devices such as smartphones, iPads, and e-readers, as well as mobile apps, continue to stimulate awareness about interaction design and usability. Evaluation enables designers to check that their design is appropriate and acceptable for the people who will use it.
There are many different evaluation methods. Which to use depends on the goals of the evaluation. Evaluations can occur in a range of places such as in labs, people’s homes, outdoors, work settings, and remotely, using digital video conferencing systems like Zoom or Teams, or via distributed design and evaluation systems (Ali et al., 2019, 2021). Product evaluations, such as the ranking and commenting systems that retailers use to get feedback about their products, can also be thought of as a kind of evaluation.
Evaluations used to focus primarily on observing participants and measuring their performance during usability testing, experiments, or studies in natural settings, increasingly referred to as in-the-wild studies or research in the wild (Chapter 2, Box 2.4), in order to evaluate a design or design concept. But evaluation has become much broader, encompassing a range of methods, some of which involve working with participants remotely via digital and other technologies. Others do not involve participants directly, such as modeling users’ behavior and analytics. Modeling users’ behavior provides an approximation of what users might do when interacting with an interface; such models are often built as a quick way of assessing the potential of different interface configurations. Analytics provide a way of examining the performance of an already existing product, such as a website, so that it can be improved. The level of control over what is evaluated varies: sometimes there is none, as in in-the-wild studies, while in others, such as experiments, there is considerable control over which tasks are performed and the context in which they take place. The methods selected will depend on several factors, including what the evaluators want to find out, the type of product, when in the design process the evaluation occurs, and logistical constraints such as cost and time.
In this chapter, we discuss why evaluation is important, what needs to be evaluated, where evaluation should take place, and when in the lifecycle evaluation is needed. Some examples of different types of evaluation studies are then illustrated by short case studies.