Roadmap to Usability Testing
Usability Evaluation: Current Standards, Methods & Metrics
Debates in Usability: Criticisms, Limitations, and Future Perspectives
Towards an Integrated Approach
A New Usability Paradigm
Usability stands as a pivotal factor in determining the success of any product. It concerns users' seamless and efficient interaction with digital products or interfaces. ISO 9241-11, the International Organization for Standardization (ISO) standard dedicated to the ergonomics of human-computer interaction, provides the most widely used definition of usability (Sauro, 2015). According to this standard, usability is characterized by "the extent to which a product can be used by specified users to achieve specified goals, with effectiveness, efficiency, and satisfaction in a specified context of use."
Additionally, the ISO 9241-210 standard is another pertinent guideline, focusing on the human-centered design and development process for interactive systems. By following these guidelines throughout the lifecycle of digital products, organizations can establish a structured, human-centric design process that supports product usability. Dedicated usability tests at various stages of the product lifecycle can then assess how well this human-centric design process guarantees product usability.
Usability tests combine a structured evaluation process with metrics that are crucial for understanding user challenges and enhancing the user-friendliness of digital products. Testing can act as a comprehensive bug hunt, evaluating specific aspects of user interaction to provide detailed insights. Systematically identifying issues like unclear navigation or confusing settings enables precise rectification. The business advantage of detecting problems early lies in ensuring efficient collaboration across design and engineering teams and averting expensive coding errors and design flaws. Testing helps prevent frustrations caused by communication gaps across teams, unnecessary product iterations, a longer time to market, and a product that is disconnected from actual user needs. This proactive approach not only enhances user satisfaction but also saves time and resources for the business.
In this article, we explore the measurement and assessment of usability, addressing the limitations of current standards and their associated metrics. To contribute to the field, we propose a comprehensive, integrated approach aimed at improving usability evaluation methods.
Usability testing follows a well-structured procedure, typically consisting of defining the test goals, recruiting representative participants, designing realistic tasks, conducting moderated or unmoderated sessions, analyzing the resulting data, and reporting the findings.
The Common Industry Format (CIF) outlined in ISO/IEC 25062 provides a standardized way to present information about usability testing. It facilitates the effective exchange of usability-related information among different stakeholders. Insights from usability tests serve as guiding principles for iterative improvement, enabling design and development teams to implement refinements that enhance the product's usability.
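To make the reporting step concrete, the sketch below outlines, in Python, the kind of information a CIF-style report organizes. The section names paraphrase the standard's content areas for illustration; they are not the normative ISO/IEC 25062 headings.

```python
# Illustrative outline of a CIF-style usability test report. Section
# names paraphrase ISO/IEC 25062 content areas and are assumptions for
# illustration, not the standard's normative headings.
cif_report = {
    "executive_summary": "high-level findings for stakeholders",
    "introduction": "product description and test objectives",
    "method": {
        "participants": "who was tested and how they were recruited",
        "context_of_use": "tasks, environment, and equipment",
        "procedure": "how the sessions were conducted",
    },
    "results": {
        "effectiveness": "completion rates, error counts",
        "efficiency": "task times",
        "satisfaction": "questionnaire scores such as SUS",
    },
}

for section in cif_report:
    print(section)
```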
Usability evaluation is a multifaceted process that employs various methods. Qualitative and quantitative methods, as well as a combined approach known as mixed methods, are integral components of this evaluation. Typically, these assessments are carried out by expert evaluators or through automated procedures.
Quantitative methods focus on gathering numerical data, offering concrete measurements of user interactions such as task success rates, completion times, and error rates. They objectively address the "what" questions, identifying which aspects of user interaction are challenging. Qualitative methods, in contrast, involve gathering non-numerical insights into user preferences and emotions. Techniques such as user interviews and observations are employed to collect subjective data. These methods explore the "why" questions, uncovering the reasons behind user challenges. Employing mixed methods ensures the triangulation of data from diverse sources, addressing both numerical metrics and subjective insights.
Expert-based methods rely on expert evaluators who assess user tasks and conformance with guidelines and standards such as the Systems and Software Quality Requirements and Evaluation (SQuaRE) series outlined in ISO/IEC 25000:2014. Additionally, experts conduct heuristic evaluation, which assesses a system's user interface design issues against recognized design and usability principles. The Nielsen Norman Group's heuristics, widely recognized in usability assessment, and Shneiderman's Eight Golden Rules offer frameworks for this evaluation (Wong, 2022; Wong, 2020). The Cognitive Walkthrough (Lewis & Wharton, 1997), a method for finding user difficulties by simulating the steps of their tasks, involves experts stepping through the interface from the user's perspective to identify problems.
Automated evaluation streamlines usability testing by automating the process. Popular usability testing tools and applications like Maze, UserTesting, and UsabilityHub can cut evaluation time and costs, offering both quantitative and qualitative insights into product usability. They provide objective measurements, problem detection, and suggestions for correcting usability issues, along with qualitative analysis of sentiments and user surveys.
Blending automated tools with meticulous manual testing provides a well-rounded assessment of a digital product's usability. It ensures data triangulation, capturing both the fundamental issues identified by automation and the nuanced user experiences that require human evaluation.
Most contemporary evaluation tools gauge usability based on ISO 9241-11's three core criteria of effectiveness, efficiency, and satisfaction (Hornbaek & Law, 2007). This standard sets the benchmarks for what constitutes a 'usable' product. Discussed below are some frequently employed usability metrics based on the ISO criteria.
The Effectiveness criterion assesses how easily users achieve their goals while interacting with a digital product. Quantitative measures such as Completion Rate gauge the percentage of users who successfully complete tasks, providing a tangible measure of task success. The Number of Errors quantifies user mistakes during tasks, offering insights into the user experience and identifying areas for improvement. Qualitatively, user feedback, observations, and success stories provide valuable narratives of effectiveness (Budiu, 2017).
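As an illustration, the minimal Python sketch below computes the Completion Rate and the mean Number of Errors from a handful of hypothetical session records; the field names and data are invented for the example.

```python
# Hypothetical session records from a usability test of one task.
sessions = [
    {"user": "P1", "completed": True,  "errors": 0},
    {"user": "P2", "completed": True,  "errors": 2},
    {"user": "P3", "completed": False, "errors": 4},
    {"user": "P4", "completed": True,  "errors": 1},
]

# Completion Rate: share of users who successfully finished the task.
completion_rate = sum(s["completed"] for s in sessions) / len(sessions)

# Number of Errors: mean mistakes made per user during the task.
mean_errors = sum(s["errors"] for s in sessions) / len(sessions)

print(f"Completion rate: {completion_rate:.0%}")   # 75%
print(f"Mean errors per user: {mean_errors:.2f}")  # 1.75
```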
The Efficiency criterion assesses how quickly users achieve their goals with minimal effort. Quantitative measures such as Task Time evaluate efficiency by assessing the time users take to complete specific tasks. Learnability evaluates how quickly users can perform a task after gaining experience with the product. Assessing learnability through observation and reflective reports offers a qualitative measure of efficiency (Budiu, 2017).
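The sketch below shows one way Task Time and a simple learnability indicator might be computed from repeated trials; the trial data and the relative-improvement measure are illustrative assumptions, not a standardized procedure.

```python
from statistics import mean

# Hypothetical completion times (in seconds) for three successive
# attempts at the same task by each participant.
trials = {
    "P1": [95, 62, 48],
    "P2": [120, 80, 70],
    "P3": [88, 75, 60],
}

# Task Time: average time on the first attempt.
first_attempt = mean(times[0] for times in trials.values())

# Learnability: relative speed-up from the first to the last attempt,
# i.e. how much faster users become with experience.
last_attempt = mean(times[-1] for times in trials.values())
improvement = (first_attempt - last_attempt) / first_attempt

print(f"Mean first-attempt time: {first_attempt:.0f}s")   # 101s
print(f"Improvement after practice: {improvement:.0%}")   # 41%
```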
The Satisfaction criterion captures the user's subjective experience. The System Usability Scale (SUS), a widely used tool, contributes a quantitative assessment of the user experience by providing a standardized measure of perceived usability. Task-Level Satisfaction and Test-Level Satisfaction gather feedback on user satisfaction with individual tasks and with the entire testing experience, respectively, providing valuable qualitative insights into users' emotional responses and preferences (Budiu, 2017).
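SUS scoring itself follows a fixed rule: each of the ten items is rated on a 1-5 scale; odd-numbered items contribute (response - 1) points, even-numbered items contribute (5 - response) points; and the sum is multiplied by 2.5 to yield a 0-100 score. The sketch below implements this rule for one hypothetical respondent.

```python
def sus_score(responses):
    """Compute the System Usability Scale score (0-100) for one
    respondent, given their ten item ratings on a 1-5 scale."""
    assert len(responses) == 10
    total = 0
    for item, rating in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (rating - 1) if item % 2 == 1 else (5 - rating)
    return total * 2.5

# Hypothetical answers (1 = strongly disagree, 5 = strongly agree).
print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))  # 85.0
```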
Despite methodological differences, these contemporary usability metrics align with ISO standards, emphasizing adherence to established guidelines (Bertoa & Vallecillo, 2012; Bevan et al., 2016). While these standards offer a common usability framework, it is important to acknowledge certain limitations.
Critics frequently contend that ISO's approach to usability relies on broad categorization, a method criticized for its excessive abstraction and lack of measurability (Tractinsky, 2017). This leaves the scientific community fragmented, with ongoing debates among experts regarding the conceptualization of usability and its measures, reflecting a broader challenge of dispersed knowledge within the field. The relationship between the three dimensions of effectiveness, efficiency, and satisfaction is also reportedly inconsistent and unclear (Hornbaek, 2006). This reduces replicability and leads to diverse operationalizations of the concepts (Tractinsky, 1997; Frokjaer et al., 2000).
Current usability metrics measuring effectiveness, efficiency, and satisfaction also present inherent limitations. They offer broad measurements without delving into specific components. This oversimplification hinders a deeper understanding by reducing complex aspects to single numbers, potentially overlooking crucial details and contextual factors (Hornbaek, 2006). Moreover, inclusivity considerations and cultural differences are often overlooked, impacting the interpretation and validity of measurements (Borsci et al., 2018). To address this, usability measures should be attuned to cultural variations and inclusive of diverse user groups, ensuring that assessments accurately reflect the experiences of a wide range of users. Experts have accordingly emphasized the need for a comprehensive and context-aware approach to usability assessment (Borsci et al., 2018; Hornbaek, 2017).
Moreover, it is crucial to recognize that technological advancements significantly impact product development in the fields of user experience design and interface engineering. The generic and process-oriented nature of ISO standards may face challenges in keeping up with these changes, potentially rendering some guidelines outdated (Kohl, 2020). However, amid development and process-related changes, the fundamental principles rooted in user psychology and experience will remain constant. Considering the subjectivity of usability and varying user preferences, the widely used ISO standards may not capture the full spectrum of user experiences and reactions. Therefore, adopting a holistic approach to usability that incorporates principles from behavioural science is crucial in assessing usability outcomes.
Researchers and practitioners affirm the utility of ISO guidelines. However, they advocate for a nuanced understanding that recognizes the limitations inherent in these guidelines (Borsci et al., 2018; Hornbaek, 2017). Furthermore, usability researchers stress the need for ongoing development to enhance the inclusivity, comprehensiveness, and validity of these constructs (Borsci et al., 2018).
We propose that usability standards and metrics move beyond a focus on the development process and functional outcomes. While a user-centric development process is impactful in creating functional products, the subjectivity of user satisfaction means that usability metrics need to be more closely integrated with users' cognitive capabilities, preferences, and behaviours. This can be achieved by integrating the ISO guidelines and metrics with behavioural science theories and principles related to user experience.
Case in point: consider the integration of the ISO criteria for usability with the 6 Minds framework (Whalen, 2019). Within the context of measuring usability, the 6 Minds framework addresses various cognitive processes including visual perception, memory recall, language comprehension, decision-making, emotional response, and strategic thinking. It underscores the importance of considering not only functional usability but also the emotional, social, and contextual aspects of user interactions.
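As a purely illustrative sketch, one hypothetical way to represent such an integrated rubric is to pair each ISO 9241-11 criterion with plausibly related 6 Minds dimensions and candidate measures; the pairings and measure names below are our assumptions, not part of either framework.

```python
# Hypothetical integrated rubric pairing ISO 9241-11 criteria with
# 6 Minds dimensions; all pairings and measures are illustrative.
integrated_rubric = {
    "effectiveness": {
        "six_minds": ["visual perception", "decision-making"],
        "measures": ["completion rate", "error count"],
    },
    "efficiency": {
        "six_minds": ["memory recall", "strategic thinking"],
        "measures": ["task time", "learnability curve"],
    },
    "satisfaction": {
        "six_minds": ["emotional response", "language comprehension"],
        "measures": ["SUS score", "sentiment from interviews"],
    },
}

for criterion, detail in integrated_rubric.items():
    print(f"{criterion}: {', '.join(detail['six_minds'])}")
```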
As the field of usability continues to evolve, our proposal encourages a shift toward a more integrated and adaptive framework, bridging the gap between established standards and the dynamic nature of user experiences. By integrating ISO standards with insights from behavioural science theories, incorporating best practices from UX, and drawing upon existing knowledge, we aim to capture not only the functional aspects of usability but also the emotional, social, and contextual dimensions of user interactions. By embracing this approach, we can navigate the limitations of current metrics and pave the way for a more user-centric and effective usability assessment in the ever-changing digital landscape.