RESNA 26th International Annual Conference
User-Centered Design of Software for Assessing Computer Usage Skills
User-centered development of the Compass software for assessing computer input skills continues. This paper describes how we gather information about user requirements and evaluate system usability. These activities help to ensure that Compass will meet the needs of its users.
We are developing software, called Compass, which allows an evaluator to assess an individual's computer input skills. This assessment tool can help diagnose difficulties with an existing interface; evaluate and compare the expected performance with potential access systems; plan training interventions; track changes in a client's abilities over time; and measure the effectiveness of an intervention [1].
A user-centered design process has been employed throughout the project to make sure that the final version of Compass delivers real benefits to clinicians and their clients. In the initial development phase, surveys and usability tests involving clinicians were conducted to determine if the prototype met clinical needs [2]. In the second phase of the project, we now have the opportunity to develop this early prototype into a commercial system.
To be successful, Compass must be highly usable and provide relevant information. To ensure that these goals are met, we established a usability test plan that incorporates informal testing throughout and formal testing at 9, 15, and 21 months into the 2-year project. The primary methods of gathering and incorporating user feedback into the development of the prototype are described below.
The second phase of development began with a survey to get more detailed information on the features that potential users see as important. The survey included 86 questions covering types of tests, evaluator configuration options, and the evaluators' backgrounds and clinical practices. The evaluator options addressed test configurations, reporting options, data management, program installation, remote use, whether the program should suggest interventions, and licensing options.
The survey was placed on our website, and a request for assistance was sent to clinicians who had expressed interest in the project as well as to several AT-related listservs. Ninety-one clinicians representing a variety of disciplines and settings responded. Answers were given on a 1–5 scale, from "Not useful at all" to "Extremely useful." Items averaging significantly above 4.0 (at 95% confidence) were considered required features, while those averaging significantly below 3.6 (at 95% confidence) were considered unnecessary at the present time. General comments were also encouraged.
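The cutoff rule amounts to a confidence-interval check on each item's mean rating. A minimal sketch of that check is shown below in Python; the example ratings, the normal-approximation interval, and the function name are illustrative assumptions, since the paper does not specify how the intervals were computed.

```python
from statistics import mean, stdev, NormalDist

REQUIRED_CUTOFF = 4.0     # mean significantly above this -> required feature
UNNECESSARY_CUTOFF = 3.6  # mean significantly below this -> unnecessary for now

def classify_item(ratings, confidence=0.95):
    """Classify one survey item from its 1-5 usefulness ratings.

    Uses a normal-approximation confidence interval for the mean,
    which is a reasonable simplification for samples the size of the
    91 respondents described here.
    """
    n = len(ratings)
    m = mean(ratings)
    sem = stdev(ratings) / n ** 0.5                 # standard error of the mean
    z = NormalDist().inv_cdf((1 + confidence) / 2)  # ~1.96 for 95% confidence
    lo, hi = m - z * sem, m + z * sem
    if lo > REQUIRED_CUTOFF:
        return "required"
    if hi < UNNECESSARY_CUTOFF:
        return "unnecessary for now"
    return "undecided"

# Made-up ratings for a single item, for illustration only
print(classify_item([5, 4, 5, 5, 4, 5, 4, 3, 5, 4, 5, 5]))
```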
The results helped us prioritize development of specific features and provided a few surprises, as well. For example, although we had not initially considered including tests for basic visual abilities, tests for visual acuity, color discrimination, object tracking, and contrast all had an average usefulness rating significantly greater than 4.0. As a result, we have started working on vision tests. In contrast, there was relatively little interest in being able to use Compass remotely over the World Wide Web. All but one of the questions in this area scored significantly below 3.6, leading us to reduce the scope of our goals for remote operation.
The results of the survey and the comments were used to develop three prototypes in Visual Basic for further testing. These prototypes had different styles of clinician interfaces for managing data and for configuring tests and reports. A review of the prototypes was conducted to determine which design best reflected the way clinicians work. The prototypes were taken to several facilities, and each was shown to six reviewers with varied backgrounds. Because some features were not yet active, the prototypes were demonstrated to the reviewers so that any errors reflected the design rather than program limitations. Each clinician was asked the same set of questions about the features of each prototype version. They were then asked to compare the versions and provide relative rankings.
Most of the clinicians preferred the simpler interface styles, describing them as "cleaner" and easier to use. They reported that the versions with more detail, intended to provide extra assistance for novice users, were nice in theory and might be good for other users but were not necessary for them. On the configuration screens there was a split between those who liked seeing all the options as prompts and those who wanted just the most commonly changed items. There was strong agreement that trying to include all possible client and evaluator data fields would make the input screens too complicated and still would not meet the specific needs of the range of settings where Compass might be used.
Using the rankings and comments we received, the positive features of each prototype were refined and combined, where possible, into a single working prototype. The configuration and report screens were set up to present basic information, with a button to show more detail if desired. The input screen was simplified to contain only basic fields such as name and date, but a large comment field was added. This allows users who wish to record specific information, such as registration numbers, to enter it in the format that best meets their needs.
The new prototype was shown at the Developers' Forum at the RESNA 2002 annual conference. This provided an excellent opportunity to gain input from a wider range of potential users than would typically be available to us. Participants had the opportunity to try the prototype themselves and discuss it with the development team. No formal questioning protocol was followed but the comments and issues were recorded. This feedback confirmed most of the changes made as the result of the prototype reviews.
The usability of this single prototype was then evaluated in a formal usability test, following methods used in the first phase of the project [2]. A series of seven tasks was defined to cover a range of activities within the software, including the selection of a client, setting up a series of tests for a client, running the tests, and reviewing the results. Before testing began, specific measurable objectives for the performance of the seven tasks were defined based on the surveys and testing in the first phase of the project (see Table 1).
Eight clinicians with varied backgrounds and experience participated. Subjects were given a basic introduction to the goals of Compass, but no specific instructions or orientation to its use. Each task was described on a single sheet of paper for reference during testing. Clinicians performed all seven tasks in series, playing the roles of both the evaluator and the client during each task. The clinicians were asked to try to solve any problems on their own, asking questions of the experimenter only if they felt stuck.
Each subject action and its associated time were recorded. All comments, requests for help, and any apparent confusion or indecision were noted along with the action. Data were analyzed to determine, for each task, the successful completion rate, the completion time, and the number and type of errors made. Subjects completed a post-test survey that assessed their level of agreement with 10 statements about the prototype. Positive responses to survey statements such as "It was easy to learn how to use Compass" were additional indicators of successful prototype design. Clinicians were also asked open-ended questions about the features and the ways they might use Compass.
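As an illustration of this per-task analysis, the logged actions could be aggregated with a small script like the sketch below; the record structure, field names, and example log entries are hypothetical and do not represent the tooling actually used in the study.

```python
from dataclasses import dataclass
from collections import Counter
from statistics import mean

@dataclass
class TaskRecord:
    """One subject's logged attempt at one task (hypothetical structure)."""
    task: str        # task name, e.g. "select a client"
    completed: bool  # did the subject finish the task successfully?
    seconds: float   # time from task start to completion or abandonment
    errors: list     # error types noted by the observer

def summarize(records, task):
    """Return completion rate, mean time, and error-type counts for one task."""
    attempts = [r for r in records if r.task == task]
    rate = sum(r.completed for r in attempts) / len(attempts)
    avg_time = mean(r.seconds for r in attempts)
    error_types = Counter(e for r in attempts for e in r.errors)
    return rate, avg_time, error_types

# Illustrative log entries only
logs = [
    TaskRecord("select a client", True, 48.0, []),
    TaskRecord("select a client", True, 61.5, ["wrong menu"]),
]
print(summarize(logs, "select a client"))
```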
Table 1. Pre-defined usability objectives and observed results for the seven tasks (including task completion rate).
Table 1 shows that the results met the pre-defined objectives. Subjects completed each task within the target time for that task and completed all of the tasks within the overall time goal. There were only two requests for experimenter assistance: one was for general information, and the other subject found the answer before the experimenter could respond. All of the survey questions met the target level, indicating a very positive response to the prototype. Subjects made an average of 2.1 errors across all tasks; none were catastrophic.
There were 22 distinct points of confusion or concern across all of the tasks. These were ranked by their frequency of occurrence, and where more than one person had difficulty, we made design decisions regarding changes to the methods of use, layout, or instructions. Examples include confusion about the use of a "map" to help users track their location within the program, instructions for running one particular test, file management problems, and interpretation of report data. All of these areas have been reworked for the next prototype version.
The user-centered design process has provided invaluable information as we develop the Compass software, resulting in some significant changes from our initial design. By collecting this information throughout the design process, we were able to make the appropriate design changes without significant reworking of the software. Formal and informal testing will continue throughout the duration of the project.
This work was funded by the National Institutes of Health, grant #1R42-NS36252-01, as a Phase II STTR award to Koester Performance Research. Many thanks to the participating clinicians for their time and thoughtful insights.
Glen Ashlock, M.S., ATP
Ann Arbor CIL
2568 Packard Rd.
Ann Arbor, MI 48104