There are several distributions of the corpus available. These include earlier releases, as well as different variants of the current release. They are listed below with the most recent releases first.
The recent versions release: all 112 systems, but only the most recent version we have of each. (Note that for some systems, mainly those that appear to be no longer active, the version we have can be quite old.) This is intended for breadth studies. It is 3.31 GiB uninstalled and 9.65 GiB installed (not including the JRE).
The evolution release: the 15 systems for which we have 10 or more versions, a total of 579 versions. This is intended for studies of software evolution. It is 16.75 GiB uninstalled and 62.98 GiB installed (not including the JRE).
The recent versions release: all 111 systems, but only the most recent version we have of each. (Note that for some systems, mainly those that appear to be no longer active, the version we have can be quite old.) This is intended for breadth studies. It is 3.27 GiB uninstalled and 9.57 GiB installed (not including the JRE).
The evolution release: the 14 systems for which we have 10 or more versions, a total of 486 versions. This is intended for studies of software evolution. It is 12.12 GiB uninstalled and 45.64 GiB installed (not including the JRE).
The recent versions release: all 106 systems, but only the most recent version we have of each. (Note that for some systems, mainly those that appear to be no longer active, the version we have can be quite old.) This is intended for breadth studies. It is 2.9 GiB uninstalled and 8.5 GiB installed (not including the JRE).
The evolution release: the 13 systems for which we have 10 or more versions, a total of 414 versions. This is intended for studies of software evolution. It is 9.3 GiB uninstalled and 31.9 GiB installed (not including the JRE).
This is the complete corpus, with all 100 systems and every version of each system that we have. (9.42 GiB/10.12 GB distributed, 32.80 GiB installed)
The recent releases: all 100 systems, but only the most recent version we have of each. (Note that for some systems the version we have can be quite old.) Useful if you only do breadth studies, not studies of system evolution, and so don't need the complete distribution. The contents of this distribution should be a proper subset of the complete distribution, so the separate identifier is only needed to tell which distribution was used in case there are issues. (1.39 GiB/1.48 GB distributed, 4.52 GiB installed)
Replaced by 20100719. Useful if you want to replicate studies based on this release.
This is the complete corpus, with all 100 systems and every version of each system that we have. This release has two distributions available:
Replaced by 20100719r. Useful if you want to replicate studies based on this release.
The recent releases: all 100 systems, but only the most recent version we have of each. (Note that for some systems the version we have can be quite old.) Useful if you only do breadth studies, not studies of system evolution, and so don't need the complete distribution. The contents of this distribution should be a proper subset of the complete distribution, so the separate identifier is only needed to tell which distribution was used in case there are issues. (1.2 GiB distributed, 3.5 GiB installed)
The corrected version of the 20080603 release. Useful if you really want to reproduce studies done on this version, but don't want to pick the relevant system versions out of the complete corpus or acquire the complete corpus at all. (2.8 GiB distributed, 8.6 GiB installed)
Useful only if you really want exactly what was used in studies on this release. (2.8 GiB distributed, 8.6 GiB installed)
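The sizes above mix binary units (GiB) and decimal units (GB), so it can help to convert between them and check free disk space before downloading. The following is a minimal sketch, not part of the corpus tooling: the function names are hypothetical, and it assumes that the downloaded archive and the installed copy sit on the same disk at the same time. Because the listed sizes are rounded, a converted value may differ from a listed one in the last digit.

```python
import shutil

GIB = 1024 ** 3  # bytes in one gibibyte (binary unit)
GB = 1000 ** 3   # bytes in one gigabyte (decimal unit)


def gib_to_gb(gib: float) -> float:
    """Convert a size in GiB to the equivalent size in GB."""
    return gib * GIB / GB


def has_room(path: str, archive_gib: float, installed_gib: float) -> bool:
    """Return True if the disk holding `path` has enough free space.

    Assumption: the archive and the installed files must coexist,
    so the two sizes are added together.
    """
    needed_bytes = (archive_gib + installed_gib) * GIB
    return shutil.disk_usage(path).free >= needed_bytes
```

For example, `has_room(".", 3.31, 9.65)` checks whether the current directory's disk can hold the most recent "recent versions" release in both its uninstalled and installed forms.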