Term (link to details) |
Short description |
Candidate Pair |
A candidate pair is a pair of code fragments
for which there is some information regarding whether or not one is a
clone of the other. The information may be that
the pair is in fact a
clone pair, but it could also be that the
pair is not a clone pair.
|
Code Fragment |
A code fragment is any contiguous sequence of
text lines in a source code file.
|
Clone |
One code fragment is a clone of another
fragment if it is conceivable that a rational developer created one
fragment by copying (and possibly modifying) the other.
|
Clone Pair |
A clone pair is a pair of code fragments
for which there is some evidence that one fragment is a
clone of the other. That is, it is a
candidate pair where the information
is in support of the clone relationship existing.
|
Cluster |
A cluster is a set of code fragments
where, for every code fragment, there is at least one other code
fragment such that the two fragments together are a
clone pair. Note that this is the
"connected component" definition. The "clique" variant would require
that every pair of code fragments form a clone pair, but this
variant is not used in the Collection.
|
Confidence Level |
Confidence level is an ordinal-scale value indicating the degree of
confidence regarding some datum.
|
ELOC |
"Executable" lines of code: lines that are not blank,
are not entirely comments, and contain more than braces. |
Master File |
The master file is the authoritative data source for
clone information.
There is one master file for each system version.
|
Provenance |
Provenance,
in the context of the Collection, refers to identifying the origin of, and
(ideally) the processes for creating, the data that provides supporting
evidence for the clone data.
|
Clone type |
Several classifications have been proposed for code clones;
the one that seems to be referred to the most is the clone 'type'.
The categories in this classification are Type-1,
Type-2, Type-3, and
Type-4 (definitions taken from
Roy et
al.). There is no unanimous agreement on these categories,
especially on what belongs in Type-3 and Type-4. Type-4 is not considered in the
Collection.
|
Type-1 clone |
Identical code fragments except for variations in whitespace, layout and
comments.
|
Type-2 clone |
Syntactically identical fragments except for variations in
identifiers, literals, types, whitespace, layout and comments.
|
Type-3 clone |
Copied fragments with further modifications such as changed,
added or removed statements, in addition to variations in identifiers,
literals, types, whitespace, layout and comments.
|
Type-4 clone |
Two or more code fragments that perform the same computation
but are implemented by different syntactic variants.
|
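
Two of the definitions above, ELOC and Cluster, are mechanical enough to sketch in code. The following is a minimal illustration only, not the Collection's actual tooling. The function names `count_eloc` and `clusters`, the fragment identifiers, and the choice to recognise only `//` line comments (block comments are not handled) are all assumptions made for this example.

```python
from collections import defaultdict


def count_eloc(lines):
    """Count ELOC: lines that are not blank, are not entirely comments,
    and contain more than braces.

    Assumption: only `//` line comments are recognised; block comments
    are not handled in this sketch.
    """
    eloc = 0
    for line in lines:
        text = line.strip()
        if not text:
            continue  # blank line
        if text.startswith("//"):
            continue  # the whole line is a comment
        if all(ch in "{}" or ch.isspace() for ch in text):
            continue  # nothing but braces
        eloc += 1
    return eloc


def clusters(clone_pairs):
    """Group fragments into clusters using the "connected component"
    definition: two fragments share a cluster if a chain of clone pairs
    links them, even if they are not themselves a clone pair.
    """
    neighbours = defaultdict(set)
    for a, b in clone_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    result = []
    for start in neighbours:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            fragment = stack.pop()
            if fragment in seen:
                continue
            seen.add(fragment)
            component.add(fragment)
            stack.extend(neighbours[fragment] - seen)
        result.append(component)
    return result


# Hypothetical fragment identifiers: A-B and B-C are clone pairs, A-C is not.
pairs = [("A", "B"), ("B", "C"), ("D", "E")]
found = clusters(pairs)
```

With these hypothetical pairs, fragments A, B and C end up in one cluster even though A and C are not a clone pair, which is the difference between the "connected component" definition and the "clique" variant. A line such as `} else {` counts towards ELOC because it contains more than braces.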