ISO 9126 - Nigel Bevan.

Summary of presentation.

Nigel Bevan outlined the content of his presentation by saying that he would first discuss the 1991 ISO/IEC 9126 standard ("Information Technology - Software product evaluation - Quality Characteristics and guidelines for their use"), and then look at some things that were wrong with it. He would then consider some issues to do with the relationship between quality and usability before turning to a series of more recent ISO documents: ISO/IEC 14598 Software Product Evaluation, ISO FCD 9126-1, and ISO/PDTR 9126-2 and 9126-3.

His presentation of the ISO 9126 work started by reminding us that the 1991 version of the standard set out six quality characteristics. Although the 1991 version mentioned that each characteristic could be broken down into sub-characteristics, it was in the new version that explicit suggestions for these sub-characteristics were made, as follows:

FUNCTIONALITY: accuracy, suitability, interoperability, compliance, security
USABILITY: understandability, learnability, operability
MAINTAINABILITY: analysability, changeability, stability, testability
RELIABILITY: maturity, fault tolerance, recoverability
EFFICIENCY: time behaviour, resource utilisation
PORTABILITY: adaptability, installability, conformance, replaceability

After recalling the ISO 9126 (1991) definitions of each of the quality characteristics, he pointed out that defining a standard for quality characteristics presupposed some definition of quality itself. As starting points for reaching such a definition, he summarized the definitions available in different ISO documents dealing with quality. Thus, he quoted the ISO 8402 definition of quality as being "the totality of characteristics of an entity that bear on its ability to satisfy stated and implied needs" (the definition that the first round of EAGLES work had also taken as a starting point). The proposed revision of ISO 9126 defines quality as the "extent to which the quality requirement is met", and defines a software quality characteristic as "a set of attributes that bear on (the effort needed for use)".
Nigel also quoted from Garvin (1984), who distinguishes different kinds of quality:

Transcendent quality: a simple unanalyzable property recognised through experience
Product quality: an inherent characteristic of the product determined by the presence of measurable product attributes
User perceived quality: product attributes which provide the greatest satisfaction to a specified user
Economic quality: a product which provides product performance at an acceptable price

ISO DIS 9241-11 offers some guidance on usability, a quality characteristic which is intuitively very closely related to the overall quality of a software product, through the following definitions:

Usability: the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use. Three terms here are critical and are further defined:

Effectiveness: the accuracy and completeness with which users achieve specified goals.
Efficiency: the resources expended in relation to the accuracy and completeness with which users achieve goals.
Satisfaction: the comfort and quality of use.

On the basis of these preliminary definitions, Nigel suggested an approach to quality based on distinguishing quality process, product quality and quality in use. Quality process relates to development and life cycle processes, and depends to a large extent on use of resources. Product quality relates to the product as considered independently of any context, and can be measured by internal measurements, which look at the code and internal workings of the software, or by external measurements, which test how the system behaves. Finally, quality in use relates to the effect of the product and is measured by looking at the product in its contexts of use.

ISO 14598-1 offers a general overview of software product evaluation. Three levels are distinguished: requirements definition, specification, and design and development. (The overall scenario supposed is very closely related to evaluation for development purposes.) A chain passing through each level and back again can be discerned. Requirements are determined by the real world, which determines what the needs are. At the end of the chain, needs relate very strongly to quality in use, and there is a feedback loop between the two. Before getting to the end of the chain, though, at the specification level, needs determine the external quality requirements, which relate to the specification of system behaviour. External quality requirements in their turn determine, at the design and development level, the internal quality requirements, which are related to software attributes. The chain now turns back towards the specification level, since internal quality, measured by internal metrics, both verifies that the internal quality requirements have been met and serves as an indicator of external quality, which, at the specification level, is related to system integration and testing and is measured by external metrics. External quality in turn is an indicator of quality in use, which, back at the requirements level and at the end of the chain, relates to the system in operation and is also measured by external metrics.

Internal quality here is defined to be the totality of attributes of a product that determines its ability to satisfy stated and implied needs when used under specified conditions. External quality is defined as the extent to which a product satisfies stated and implied needs when used under specified conditions. Nigel reinforced this definition by presenting an analysis in terms of sets: sub-characteristics are subsets of characteristics, and attributes are subsets of sub-characteristics. In the presentation on EAGLES work, we shall see that this definition is equivalent to the formalisation in terms of feature structures adopted by the EAGLES Evaluation Group. However, one distinction made here which is absent from the EAGLES work is the distinction between external characteristics and internal characteristics. Much discussion during the presentation turned around clarifying this distinction. The common understanding eventually seemed to be that internal attributes were things like the number of lines of code, or the amount of disc space required, whereas external attributes were things like producing the right answer or having a pleasant interface.
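
The set-based reading of characteristics, sub-characteristics and attributes can be sketched as nested collections. This is only an illustration of the nesting: the attribute names below are hypothetical placeholders (two of them echo the examples from the discussion), and the function name attributes_of is not taken from the ISO drafts or from the presentation.

```python
# Illustrative sketch only: attribute names are hypothetical placeholders.
# characteristic -> sub-characteristic -> set of attributes
quality_model = {
    "usability": {
        "learnability": {"time_to_learn_basic_task", "help_coverage"},
        "operability": {"undo_available", "consistent_commands"},
    },
    "efficiency": {
        "time behaviour": {"response_time"},
        "resource utilisation": {"disc_space_required"},
    },
}


def attributes_of(characteristic: str) -> set[str]:
    """Union of the attribute sets of every sub-characteristic."""
    result: set[str] = set()
    for attributes in quality_model[characteristic].values():
        result |= attributes
    return result


# Each sub-characteristic's attributes form a subset of the
# attributes of the characteristic it belongs to.
for sub_characteristics in quality_model.values():
    pass  # iteration shown only to stress the two-level nesting
```

Under this reading, asking which attributes bear on a characteristic amounts to taking the union over its sub-characteristics, for example attributes_of("efficiency").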

Nigel then went on to present a schema of relationships between measures and attributes. In this schema, measures of actual usage measure quality in use, whilst at the same time indirectly measuring the external attributes of the computer system. External measures of software measure external attributes of the computer system, whilst at the same time indirectly measuring internal attributes of software and indicating quality in use. Internal measures of software measure internal attributes of software whilst also indicating external attributes of software.
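
The schema just described can be written down as a small lookup table, which makes it easier to see which kinds of measure give evidence about what. This is a sketch of the relationships as reported here, not a structure defined in the ISO drafts; the names MeasureRole, SCHEMA and evidence_for are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MeasureRole:
    measures: str                        # what is measured directly
    indirectly_measures: Optional[str]   # what is measured indirectly, if anything
    indicates: Optional[str]             # what the measure merely indicates


# Wording of the targets follows the summary paragraph above.
SCHEMA = {
    "measures of actual usage": MeasureRole(
        measures="quality in use",
        indirectly_measures="external attributes of the computer system",
        indicates=None,
    ),
    "external measures of software": MeasureRole(
        measures="external attributes of the computer system",
        indirectly_measures="internal attributes of software",
        indicates="quality in use",
    ),
    "internal measures of software": MeasureRole(
        measures="internal attributes of software",
        indirectly_measures=None,
        indicates="external attributes of software",
    ),
}


def evidence_for(target: str) -> list[str]:
    """Kinds of measure that measure, indirectly measure, or indicate the target."""
    return [
        name
        for name, role in SCHEMA.items()
        if target in (role.measures, role.indirectly_measures, role.indicates)
    ]
```

For instance, evidence_for("quality in use") returns both the measures of actual usage (direct) and the external measures of software (as an indicator).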

The presentation then turned to the points where the draft proposal for ISO 9126 differs significantly from the 1991 version.

A major difference was the introduction of quality in use (see the INUSE presentation) as a single characteristic to which all other quality characteristics contributed. As well as the breakdown of the quality characteristics into sub-characteristics as mentioned at the beginning of this section, the actual definitions of the quality characteristics had changed as shown below.

FUNCTIONALITY:
Old definition: A set of attributes that bear on the existence of a set of functions and their specified properties. The functions are those that satisfy stated or implied needs.

New definition: The capability of the software product to provide functions which meet stated and implied needs when the software is used under specified conditions.

Sub-characteristics: suitability, accuracy, interoperability, security.

RELIABILITY:
Old definition: A set of attributes that bear on the capability of software to maintain its level of performance under stated conditions for a stated period of time.

New definition: The capability of the software product to maintain a specified level of performance when used under specified conditions.

Sub-characteristics: maturity, fault tolerance, recoverability.

USABILITY:
Old definition: A set of attributes that bear on the effort needed for use and on the individual assessment of such use, by a stated or implied set of users.

New definition: The capability of the software product to be understood, learned and liked by the user, when used under specified conditions.

Sub-characteristics: understandability, learnability, operability, attractiveness.

EFFICIENCY:
Old definition: A set of attributes that bear on the relationship between the level of performance of the software and the amount of resources used, under stated conditions.

New definition: The capability of the software product to provide appropriate performance, relative to the amount of resources used, under stated conditions.

Sub-characteristics: time behaviour, resource utilisation.

MAINTAINABILITY:
Old definition: A set of attributes that bear on the effort needed to make specific modifications.

New definition: The capability of the software product to be modified. Modifications may include corrections, improvements or adaptation of the software to changes in environment, and in requirements and functional specifications.

Sub-characteristics: analysability, changeability, stability, testability.

PORTABILITY:
Old definition: A set of attributes that bear on the ability of software to be transferred from one environment to another.

New definition: The capability of the software product to be transferred from one environment to another.

Sub-characteristics: adaptability, installability, coexistence, replaceability.
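
The sub-characteristics listed with the new definitions differ in a few places from the listing given at the start of this summary. A small sketch comparing the two listings as sets makes the differences explicit; the variable and function names are hypothetical, and the data is simply transcribed from this report.

```python
# Listing given at the start of the summary.
EARLIER_LISTING = {
    "functionality": {"accuracy", "suitability", "interoperability", "compliance", "security"},
    "usability": {"understandability", "learnability", "operability"},
    "maintainability": {"analysability", "changeability", "stability", "testability"},
    "reliability": {"maturity", "fault tolerance", "recoverability"},
    "efficiency": {"time behaviour", "resource utilisation"},
    "portability": {"adaptability", "installability", "conformance", "replaceability"},
}

# Listing given alongside the new definitions.
LATER_LISTING = {
    "functionality": {"suitability", "accuracy", "interoperability", "security"},
    "usability": {"understandability", "learnability", "operability", "attractiveness"},
    "maintainability": {"analysability", "changeability", "stability", "testability"},
    "reliability": {"maturity", "fault tolerance", "recoverability"},
    "efficiency": {"time behaviour", "resource utilisation"},
    "portability": {"adaptability", "installability", "coexistence", "replaceability"},
}


def differences(earlier: dict, later: dict) -> dict:
    """Per characteristic, the sub-characteristics found in only one listing."""
    diffs = {}
    for name in earlier:
        only_earlier = earlier[name] - later[name]
        only_later = later[name] - earlier[name]
        if only_earlier or only_later:
            diffs[name] = {"only_earlier": only_earlier, "only_later": only_later}
    return diffs


if __name__ == "__main__":
    for name, diff in differences(EARLIER_LISTING, LATER_LISTING).items():
        print(name, diff)
```

On the listings as transcribed here, only functionality, usability and portability differ between the two.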

As already noted, quality in use serves as an overriding quality characteristic to which all others contribute. It is defined as "the extent to which a product used by specific users meets their need to achieve specified goals with effectiveness, productivity, safety and satisfaction." Definitions of four critical terms are:

Effectiveness: the extent to which a software product enables users to achieve specified goals with accuracy and completeness in a specified context of use.
Productivity: the resources expended by the user and system in relation to the effectiveness achieved when using the software product in a specified context of use.
Safety: the extent to which the software product limits the risk of harm (to persons) or damage to an acceptable level in a specified context of use.
Satisfaction: the extent to which the software product satisfies users in a specified context of use.

The proposed revision of ISO 9126 is accompanied by two other documents, which are currently in a fairly preliminary stage of drafting. Both concern metrics. The 1991 version of ISO 9126 explicitly refrained from talking about metrics or about metric design, seeing this as something that was specific to particular evaluations and where little of value could be said at a very general level. The new drafts set out requirements for metrics in the following terms:

Metrics used for comparison shall be valid and accurate.
Measurements shall be objective. There should be a written and agreed procedure.
Measurements shall be empirical. They should use observation or a psychometric scale. The scales may be interval, ratio or absolute scales.
Measurements shall be reproducible. The same measures should be obtained by different people on different occasions.

The two accompanying documents concern external metrics and internal metrics. ISO 9126-2 (External metrics) is concerned with defining external metrics, internal metrics and quality in use. It also distinguishes three types of metrics: direct, indirect and indicators. (This relates to the relationship between measures and attributes described above.) Properties of metrics will be defined. Some important properties include the purpose of the metric, the method, the measurement formula, interpretation, the type of measure, the life cycle stage to which the metric is pertinent and who the metric is useful to. ISO 9126-3 (Internal metrics) is less advanced, and its contents were not given.
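
To make the list of metric properties concrete, here is a minimal sketch of a record holding them, with a check for properties left blank. It assumes that the direct/indirect/indicator distinction is recorded separately from the "type of measure"; the class, field and function names are hypothetical, and the example metric is invented for illustration rather than taken from the ISO 9126-2 draft.

```python
from dataclasses import dataclass, fields
from enum import Enum


class MetricKind(Enum):
    DIRECT = "direct"
    INDIRECT = "indirect"
    INDICATOR = "indicator"


@dataclass(frozen=True)
class MetricDescription:
    name: str
    kind: MetricKind        # direct, indirect or indicator
    purpose: str
    method: str
    formula: str            # the measurement formula
    interpretation: str
    measure_type: str       # the type of measure (assumed to be free text)
    life_cycle_stage: str   # stage to which the metric is pertinent
    useful_to: str          # who the metric is useful to


def missing_properties(metric: MetricDescription) -> list[str]:
    """Names of the textual properties that have been left empty."""
    return [f.name for f in fields(metric) if getattr(metric, f.name) == ""]


# Hypothetical example, invented for illustration only.
example = MetricDescription(
    name="task completion rate",
    kind=MetricKind.DIRECT,
    purpose="estimate how often users reach their goal",
    method="observe users carrying out a fixed set of tasks",
    formula="completed_tasks / attempted_tasks",
    interpretation="closer to 1.0 is better",
    measure_type="ratio",
    life_cycle_stage="system in operation",
    useful_to="evaluators and purchasers",
)
```

A description is complete in this sketch when missing_properties returns an empty list.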

Reference:

Garvin (1984) What does "product quality" really mean? Sloan Management Review, Fall 1984.

Discussion.

There had been quite a lot of discussion during the presentation of the paper, mostly aimed at clarifying certain points. In the discussion after the presentation, the following points were raised:

a) The EAGLES group had introduced a seventh quality characteristic, customisability, which had been felt to reflect the special character of language software. (It nearly always has to be modified, often by the user himself, before it meets user requirements). Given the new definition of maintainability, customisability fits much more comfortably under this quality characteristic, and the introduction of a new characteristic seems less justifiable.

b) Much discussion concerned the use of the words "external" and "internal" in the phrases

internal and external quality
internal and external attributes
internal and external metrics

to which were added the EAGLES use of them in the phrase:

internal and external validity (of metrics).

(The EAGLES use is based on work on testing in the social sciences, where internal validity is a quality inherent to the nature of the test, while external validity is based on a correlation between the values obtained in the test and some other, external criterion. More detailed discussion and exemplification can be found in the EAGLES report.)

During the discussion, the current author managed to convince herself that basically the same notion was present in each case under slightly different guises. That conviction has since disappeared, to the extent that she does not feel able to give a coherent description of how the different uses relate to one another or reflect some common interpretation. Any help will be gratefully received.

c) It was remarked that the requirements for metrics set out in the current draft version of ISO 9126-2 seemed at first glance only to apply to attributes of type test in the EAGLES framework. (See the EAGLES presentation for type test). However, once it was noticed that the first requirement stipulated that metrics should be valid and accurate, whilst the other stipulations talked about measurements rather than metrics, this was less obviously the case. Here too was a point for further reflection and discussion.

d) A number of participants found that there was a very pleasing convergence between ISO based work and EAGLES work.
