What is a "System"?

Measurement is about assigning numbers or symbols to attributes of entities, so when measuring we must be clear about which attribute we are measuring and which entities it belongs to. In the case of the Qualitas Corpus the entities are "systems", so we must be clear about what we mean by "system".

The kinds of empirical studies the Qualitas Research Group carry out are intended to help us understand how software engineers create code and the relationship between the code structure and quality attributes such as modifiability, reusability, maintainability, and testability. We would like to understand what decisions developers have made when writing the code.

Many systems require third-party software. Should such software be considered part of the system? Given that such software is usually not under the control of the developers of the system, including it in the analysis would be misleading when trying to understand what decisions the developers of that system have made.

In theory we can just look at what's distributed and determine from that what is and what is not the system code. However, there is no common format for organising distributions, and consequently it has sometimes proved difficult to answer this question.

The kinds of issues we have faced in identifying what constitutes a system's code include:

Ideally we would have an exact specification of what the developers consider to be "in" the system (assuming there is agreement amongst them!). However, obtaining such information is very time-consuming, so for the moment we have made our best guess following the principles described below.

Principles for identifying system contents

The two main principles we have used in making decisions about what is in a system and what is not are:

The decision we have made regarding what is in a system is recorded in the sourcepackages attribute.
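
As an illustration of how such a recorded decision might be used, the sketch below filters type names against a list of package prefixes. The format of the sourcepackages attribute is not described in this section, so the sketch assumes it is a whitespace-separated list of package prefixes. The function names and the example values are hypothetical.

```python
def parse_prefixes(sourcepackages: str) -> list[str]:
    """Split the attribute value into package prefixes.

    Assumption: the value is a whitespace-separated list of prefixes.
    """
    return sourcepackages.split()


def in_system(qualified_name: str, prefixes: list[str]) -> bool:
    """Return True if the name equals a prefix or is nested under one."""
    for prefix in prefixes:
        # Require a "." after the prefix so that "org.example.app" does
        # not accidentally match "org.example.application".
        if qualified_name == prefix or qualified_name.startswith(prefix + "."):
            return True
    return False


# Hypothetical attribute value and type names, for illustration only.
prefixes = parse_prefixes("org.example.app org.example.util")
print(in_system("org.example.app.Main", prefixes))          # True
print(in_system("org.thirdparty.json.Parser", prefixes))    # False
print(in_system("org.example.application.Main", prefixes))  # False
```

Under these assumptions, third-party code distributed alongside a system would be excluded from analysis simply because its packages do not fall under any listed prefix.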