Usability as Procurement Criteria for Software, by N. Bevan
Below is an article by Nigel Bevan that should be of interest to all.
Usability as Procurement Criteria for Software
Workshop organised by NIST, Gaithersburg, 9-10 September 1998
Report by Nigel Bevan
This was the third workshop in a series, intended to finalise the procedure
to be used by suppliers to provide usability reports to purchasing
organisations. The workshop was attended by representatives from suppliers
including IBM, Microsoft, HP, Sun, Kodak, Oracle and Compaq.
Representatives of purchasers included Boeing, Northwest Mutual Life, State
Farm Insurance and Fidelity.
The initial plans had assumed that most existing usability test reports
were potentially useful for procurement purposes. Following discussion at
the workshop it was acknowledged that a test which measures usability was
required, which may be quite different from the more common usability
evaluations used to provide feedback for design. My draft description of
this type of testing, for potential inclusion in the proposed scheme, is
given below.
The workshop accepted the suggestion to use ISO 9241-11 as a framework for
reporting usability test results. The MUSiC metrics for effectiveness,
efficiency and satisfaction are an example of the application of this
standard. Susan Brummel of GSA, who works in the area of special needs,
was a guest speaker at the meeting. The guidance on the context
description will need to be specific about special needs.
It is expected that provision of reports will initially be negotiated
between individual purchasers and suppliers, although the report format
could also be used to make comparative evaluations, and the purchaser or a
third party could carry out the tests.
The scheme will comprise a usability report format, usability
measurement guidelines, and rules for joining the pilot evaluation. It is
planned to start pilot tests of the scheme in January 1999. Negotiations
have already started between pairs of suppliers and purchasers represented
at the workshop. Participants will provide feedback on the appropriateness
of the scheme. In the subsequent 30 months, data will also be collected on
productivity improvements and other benefits to the purchasers resulting
from acquisition of more usable software.
Details of the scheme will be publicly available on the web from January
1999. It is intended to be international in scope, and other pairs of
organisations will be encouraged to join the pilot and provide feedback and
cost benefit data (in English) to NIST, the US National Institute of
Standards and Technology. We decided that international participants
should report their results directly to NIST, but that there may be a need
for local support for organisations outside the US that are considering
INDUSTRY STANDARD USABILITY TEST REPORT
The purpose of the Industry Standard Usability Test Report is to give the
results of measuring the usability of a product in specific contexts of
use. (This contrasts with typical internal usability test reports where
the major purpose is often to identify usability defects.)
Industry Standard Usability Test Reports need to provide sufficient
information to enable the reader to judge whether the results are reliable
and relevant to their needs. For results to be reliable, the following
conditions apply:
1. The users, tasks and context used for the evaluation must be
representative of the intended real world usage of the product.
2. To be representative of real world usage, during testing users should
not be asked to think aloud. They must not be given any hints or
assistance, other than by mechanisms available to real users (such as
documentation or a telephone help desk).
3. Data must be obtained from sufficient users in each category for the
sample of users to be representative of the intended user group: experience
has shown that this generally means that it is necessary to test at least 8
users from each user group.
4. The measures taken must be suitable for establishing acceptance
criteria or for making comparisons between products. This means that the
measures must count items of equal value. (For example, using an
unweighted count of errors is usually inappropriate as the impact of
errors differs.)
5. It is recommended that results are given for three aspects of usability:
effectiveness, efficiency and satisfaction.
- Effectiveness should be measured by the extent to which the goals of the
task have been achieved. One possible metric is the percentage of users
that wholly achieve their goals. If goals can be partially achieved (eg by
incomplete or sub-optimum results) then an appropriate metric is the
average goal achievement, scored on a scale of 0 to 100% based on specified
criteria.
- Efficiency is generally assessed by the mean time taken to achieve the
task (in relation to the average effectiveness). Efficiency may also
relate to other resources (eg total cost of usage), or be relatively
unimportant (eg for some consumer applications).
- Satisfaction is typically assessed by using a standardised questionnaire
such as SUS or PSSUQ (both available for use without charge), or
SUMI or QUIS. Other specific user attitudes may also be assessed
(eg "delight" with the product), whenever possible using a generally
available standardised scale.
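
As an illustration of the three measures above, the following is a minimal
Python sketch, not part of the proposed scheme. The record layout (Session),
the function names, and the choice to express efficiency as mean goal
achievement per minute are assumptions made for the example; the SUS scoring
follows the standard procedure for that questionnaire (ten items answered on
a 1 to 5 scale, giving a score from 0 to 100).

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Session:
    """One user's unassisted attempt at one task (hypothetical layout)."""
    goal_achievement: float  # 0-100, scored against criteria fixed in advance
    task_minutes: float      # time taken to carry out the task


def summarise(sessions):
    """Summarise effectiveness and efficiency for one user group and task."""
    achievement = [s.goal_achievement for s in sessions]
    mean_effectiveness = mean(achievement)
    mean_time = mean(s.task_minutes for s in sessions)
    wholly_achieved = sum(1 for a in achievement if a == 100)
    return {
        # Percentage of users who wholly achieved their goals.
        "completion_rate_pct": 100 * wholly_achieved / len(sessions),
        # Average goal achievement when partial success is possible.
        "mean_effectiveness_pct": mean_effectiveness,
        "mean_task_minutes": mean_time,
        # Assumption: relate time to effectiveness as percent per minute.
        "efficiency_pct_per_minute": mean_effectiveness / mean_time,
    }


def sus_score(responses):
    """Score one completed SUS questionnaire (ten answers, each 1 to 5)."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    # Odd-numbered items contribute (answer - 1); even-numbered (5 - answer).
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5


# Example: eight users from one user group attempting the same task.
sessions = [Session(a, t) for a, t in zip(
    [100.0, 100.0, 50.0, 100.0, 75.0, 100.0, 0.0, 100.0],
    [4.0, 5.0, 8.0, 3.0, 6.0, 4.0, 10.0, 4.0])]
summary = summarise(sessions)
```

Reporting the completion rate alongside the mean goal achievement keeps the
distinction between users who wholly achieved their goals and those who
achieved them only in part.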
To judge whether the results are relevant to the needs of the purchaser, it
is essential that the report contains a complete description of:
- the intended users of the product
- the goals intended to be achieved using the product
- the intended context of use (hardware, software, and physical and
social environment).
A detailed description of the context of use is required for each
combination of user group and user goals which has been tested. Any
differences between the intended context of use and actual context of
evaluation also need to be explained.
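
To show how a reader might check a report against the reliability
conditions and the context description above, here is a minimal Python
sketch. The record layout (EvaluatedCondition), its field names and the
review function are hypothetical and are not part of the proposed report
format; the threshold of 8 users is the rule of thumb quoted in condition 3.

```python
from dataclasses import dataclass

MIN_USERS_PER_GROUP = 8  # rule of thumb from condition 3


@dataclass
class EvaluatedCondition:
    """One tested combination of user group and goal (hypothetical layout)."""
    user_group: str
    goal: str
    intended_context: str    # hardware, software and environment as intended
    evaluation_context: str  # what was actually used during the test
    users_tested: int
    think_aloud_used: bool = False
    assistance_given: bool = False


def review(condition):
    """Return the points a reader of the report would want explained."""
    issues = []
    if condition.users_tested < MIN_USERS_PER_GROUP:
        issues.append(
            f"only {condition.users_tested} users tested; "
            f"at least {MIN_USERS_PER_GROUP} expected")
    if condition.think_aloud_used:
        issues.append("users were asked to think aloud")
    if condition.assistance_given:
        issues.append("users received help not available to real users")
    if condition.intended_context != condition.evaluation_context:
        issues.append("evaluation context differs from intended context")
    return issues


# Example: a test with too few users in an otherwise matching context.
small_test = EvaluatedCondition(
    user_group="claims clerks",
    goal="register a new claim",
    intended_context="office PC, standard network",
    evaluation_context="office PC, standard network",
    users_tested=6,
)
issues = review(small_test)
```

A non-empty list does not make a report unusable; it marks what the report
would need to explain before the results can be judged reliable or relevant.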
For more information on measuring usability, see ISO 9241-11, Dumas and
Redish (1993), Bevan and Macleod (1994) and Macleod et al (1997).
Brooke, J. (1996). SUS: A "quick and dirty" usability scale. In:
Usability Evaluation in Industry. Taylor & Francis.
Lewis, J. R. (1995). IBM Computer Usability Satisfaction
Questionnaires: Psychometric Evaluation and Instructions for Use.
International Journal of Human-Computer Interaction, 7, 57-78.
Kirakowski, J. (1996). The software usability measurement inventory:
background and usage. In: P. Jordan, B. Thomas and B. Weerdmeester,
Usability Evaluation in Industry. Taylor & Francis, UK.
Shneiderman, B. (1998). Designing the User Interface. Reading, MA,
Addison-Wesley Publishing Co.
ISO 9241-11 (1998). Guidance on usability. ISO. (Available from
national standards bodies.)
Dumas, J. S. and Redish, J. C. (1993). A Practical Guide to Usability
Testing. Norwood, NJ, Ablex Publishing Co. (chapter 13)
Bevan, N. and Macleod, M. (1994). Usability measurement in context.
Behaviour and Information Technology, 13, 132-145.
Macleod, M., Bowden, R., Bevan, N. and Curson, I. (1997). The MUSiC
Performance Measurement method. Behaviour and Information Technology, 16.
| *** NEW ADDRESS *** |
| Nigel Bevan | firstname.lastname@example.org |
| Serco Usability Services | Tel: 0181 614 3811 |
| Alderney House, 4 Sandy Lane | +44 181 614 3811 |
| Teddington, Middlesex | Fax: 0181 614 3765 |
| TW11 0DU, UK | +44 181 614 3765 |
| http://www.usability.serco.com |