20120401 Release Notes

The goal for this release was to reorganise some of the metadata, specifically the data that does (or should) appear in the .properties files and the summary.csv file. As part of this exercise, some issues with the metadata were uncovered and fixed (details below).

Some new systems were added and the evolution release systems were updated with all their new versions since the last release of the corpus. One of the new systems (freemind) was also added as an evolution system.

In summary, 5 new systems were added, bringing the total to 111, and a total of 76 new sysvers were added (including for the new systems), bringing the total to 661 sysvers.

The "r" distribution has versions of 111 systems (3.27 GiB uninstalled, 9.56 GiB installed). The "e" distribution contains the 14 systems for which there are 10 or more versions. This distribution (12.12 GiB uninstalled, 45.64 GiB installed) is intended for evolution studies. The full distribution (at 15.69 GiB uninstalled, 56.20 GiB installed) can be made available on request.
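
The selection rule for the "e" distribution (systems with 10 or more versions) can be sketched as follows. This is only an illustration, not the procedure actually used to build the distribution. The function names are hypothetical, and the sketch assumes that a sysver is the system name followed by a hyphen and the version, with no hyphen inside the system name.

```python
from collections import defaultdict


def split_sysver(sysver):
    """Split a sysver such as 'hibernate-3.6.0-beta1' into (system, version).

    Assumption: the system name never contains a hyphen, so the first
    hyphen separates the system name from the version string.
    """
    system, _, version = sysver.partition("-")
    return system, version


def evolution_systems(sysvers, min_versions=10):
    """Return the systems that have at least `min_versions` distinct versions."""
    versions = defaultdict(set)
    for sysver in sysvers:
        system, version = split_sysver(sysver)
        versions[system].add(version)
    return sorted(s for s, v in versions.items() if len(v) >= min_versions)
```

Given the full list of sysvers, `evolution_systems(sysvers)` would list the systems that qualify for evolution studies under the stated threshold.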


The attributes metadata has undergone extensive change. Previously this data was found in two places: the individual .properties files and the global summary.csv file. Some attributes were found in both places and some in only one, so it was non-trivial to get all the attribute values. With this release, attribute values are still found in the same two places, but every attribute value now appears in both.

Some new attributes have been added, most notably license (albeit incompletely), status, jreversion, and distribution. The versionnotes attribute, which had been used as a catch-all for any notes regarding a given version, including internal management notes, has been repurposed to contain only version-specific notes of use to a corpus user. The internal notes have been removed.

Some of the attribute names have been modified to provide a more consistent naming scheme. This affects the old names systemversion (now sysver), LOC(Both) (now loc(both)), NCLOC(Both) (now ncloc(both)), #Both (now n_both), #Bin (now n_bin), #Top(Bin) (now n_top(bin)), and #Files (now n_files). Other attributes are no longer provided, being either for internal use only or obsolete (notes, acquisitiondate, acquisitionperson, language, languageversion, origin, opensource, obfuscated, source).
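
For scripts written against the old attribute names, the renames and removals listed above could be applied with a small mapping like the one below. This is a sketch only: the names RENAMED, REMOVED, and migrate_row are hypothetical, and it assumes each record has already been read into a dictionary keyed by attribute name (for example with csv.DictReader).

```python
# Old attribute name -> new attribute name, as listed in these notes.
RENAMED = {
    "systemversion": "sysver",
    "LOC(Both)": "loc(both)",
    "NCLOC(Both)": "ncloc(both)",
    "#Both": "n_both",
    "#Bin": "n_bin",
    "#Top(Bin)": "n_top(bin)",
    "#Files": "n_files",
}

# Attributes that are no longer provided.
REMOVED = {
    "notes", "acquisitiondate", "acquisitionperson", "language",
    "languageversion", "origin", "opensource", "obfuscated", "source",
}


def migrate_row(row):
    """Return a copy of `row` using the new names, without removed attributes."""
    return {RENAMED.get(key, key): value
            for key, value in row.items()
            if key not in REMOVED}
```

Attributes that were neither renamed nor removed pass through unchanged.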

During this exercise, a lot of checks were made of existing metadata, and some errors or missing data were uncovered. Below are the changes to attribute values that are not represented by the comments above:

Changed releasedate to use ISO 8601 ("2002-10-2" to "2002-10-02")
Changed releasedate to use ISO 8601 ("2004-09-4" to "2004-09-04")
Corrected url "http://www.columbamail.org" to "http://sourceforge.net/projects/columba"
Changed releasedate to use ISO 8601 ("2006/02/12" to "2006-02-12")
Changed releasedate to use ISO 8601 ("2006/06/10" to "2006-06-10")
Corrected domain for hibernate-3.5.3-final, hibernate-3.5.5-final, hibernate-3.6.0-beta1, hibernate-3.6.0-beta2, hibernate-3.6.0-beta3, and hibernate-3.6.0-beta4 ("Persistence Object Mapper" to "database")
Added fullname ("HyperSQL DataBase")
Changed releasedate to use ISO 8601 ("2006/04/16" to "2006-04-16")
Changed releasedate to use ISO 8601 ("2005/10/10" to "2005-10-10")
Corrected domain ("J2EE server" to "middleware")
Added fullname ("Chemistry Development Kit")
Changed releasedate to use ISO 8601 ("2007-06-10" to "2008-06-10")
Added domain ("programming language")
Changed fullname for marauroa-2.5 and marauroa-3.8.1 ("Server for Arianne, a multiplayer online games framework and engine" to "Marauroa")
Corrected domain ("parsers/generator/make" to "parsers/generators/make")
Changed releasedate to use ISO 8601 ("2006/03/06" to "2006-03-06")
Changed releasedate to use ISO 8601 ("2006/03/03" to "2006-03-03")
Added fullname ("Velocity Engine")
Changed fullname for all versions ("Weka---Machine Learning Software in Java" to "Weka")
Changed fullname ("Xalan." to "Xalan")
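
The releasedate changes listed above all amount to rewriting a year, month, day value in the ISO 8601 calendar date form YYYY-MM-DD. A minimal sketch of that normalisation is shown here; it is not the script used for the corpus, the function name to_iso8601 is hypothetical, and it assumes the fields always appear in year, month, day order separated by "-" or "/".

```python
import re
from datetime import date


def to_iso8601(value):
    """Normalise a date such as '2002-10-2' or '2006/02/12' to 'YYYY-MM-DD'.

    Assumption: the three fields are always in year, month, day order.
    Raises ValueError if the value does not have three numeric fields or
    does not name a real calendar date.
    """
    parts = re.split(r"[-/]", value.strip())
    if len(parts) != 3:
        raise ValueError("expected year, month and day: %r" % value)
    year, month, day = (int(part) for part in parts)
    return date(year, month, day).isoformat()
```

Values that are already in ISO 8601 form are returned unchanged, so the function can be run over every releasedate safely.
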
New Systems
batik, collections, freemind, hadoop, wct
Added versions
ant-1.8.2, antlr-3.3, antlr-3.4, argouml-0.22, argouml-0.30.1, argouml-0.32, argouml-0.32.1, argouml-0.32.2, argouml-0.34, azureus-, azureus-, azureus-, azureus-, azureus-, azureus-, batik-1.7, collections-3.2.1, eclipse_SDK-3.4.1, eclipse_SDK-3.6.1, eclipse_SDK-3.6.2, eclipse_SDK-3.7, eclipse_SDK-3.7.1, freecol-0.10.0, freecol-0.10.1, freecol-0.10.2, freecol-0.10.3, freecol-0.9.5, freemind-0.0.2, freemind-0.0.3, freemind-0.1.0, freemind-0.2.0, freemind-0.3.0, freemind-0.3.1, freemind-0.4, freemind-0.5, freemind-0.6, freemind-0.6.1, freemind-0.6.5, freemind-0.6.7, freemind-0.7.1, freemind-0.8.0, freemind-0.8.1, freemind-0.9.0, hadoop-1.0.0, hibernate-3.6.0, hibernate-3.6.1, hibernate-3.6.10, hibernate-3.6.2, hibernate-3.6.3, hibernate-3.6.4, hibernate-3.6.5, hibernate-3.6.6, hibernate-3.6.7, hibernate-3.6.8, hibernate-3.6.9, hibernate-4.0.0, hibernate-4.0.1, hibernate-4.1.0, jmeter-2.5, jmeter-2.5.1, junit-4.10, junit-4.9, lucene-2.9.4, lucene-3.0.3, lucene-3.1.0, lucene-3.2.0, lucene-3.3.0, lucene-3.4.0, lucene-3.5.0, wct-1.5.2, weka-3.6.3, weka-3.6.4, weka-3.6.5, weka-3.6.6, weka-3.7.4, weka-3.7.5
Changed properties
The contents.csv files changed for existing versions of eclipse_SDK due to the source now being unpacked.
Changed install
All of the .install files for existing versions of eclipse_SDK have been changed to unpack the source code (previously this was not happening).
Changed bin
With the change to the eclipse_SDK installation procedure, the organisation of the bin directory has changed for most versions.
Changed src
With the change to the eclipse_SDK installation procedure, the organisation of the src directory has changed for most versions: those versions now have the source unpacked.
Changed docs
The documentation for the metadata has been reorganised to reflect the new structure. A glossary has been added.