The Qualitas Corpus is a curated collection of software systems intended for use in empirical studies of code artefacts. Its primary goal is to provide a resource that supports reproducible studies of software. The current release of the Corpus contains open-source Java software systems, often in multiple versions.
What do you get?
The current release is version 20130901. It contains 112 systems, 15 of which have 10 or more versions, for a total of 754 versions. There are two main distributions: the "r" (recent) release, containing the most recent version we have of each of the 112 systems, and the "e" (evolution) release, containing all versions of the 15 systems with 10 or more versions, a total of 579 versions. Other distributions are also available.
In publications that use the corpus, please cite the APSEC paper and always identify the release used.
| Acquiring the corpus | Installing the corpus |
| Distribution structure | Structure of the content |
| Defining systems | Metadata about the contents |
| Criteria for inclusion | Development status and plans |
| History of the corpus | Conventions used |
| Citing the corpus | Publications based on the corpus |