|Term (link to details)
A candidate pair is a pair of code fragments
for which there is some information regarding whether or not one is a
clone of the other. The information may be that
the pair is in fact a
clone pair, but it could also be that the
pair is not a clone pair.
A code fragment is any contiguous sequence of
text lines in a source code file.
One code fragment is a clone of another
fragment if it is conceivable that a rational developer created one
fragment by copying (and possibly modifying) the other.
A clone pair is a pair of code fragments
for which there is some evidence that one fragment is a
clone of the other. That is, it is a
candidate pair where the information
is in support of the clone relationship existing.
A cluster is a set of code fragments
where, for every code fragment, there is at least one other code
fragment such that the two fragments together are a
clone pair. Note that this is the
"connected component" definition. The "clique" variant would require
that every pair of code fragments form a clone pair, but this
variant is not used in the Collection.
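The "connected component" reading can be made concrete with a small sketch. It assumes, purely for illustration, that fragments are hashable identifiers and that clone pairs arrive as 2-tuples; the function name `clusters_from_clone_pairs` is hypothetical and not part of the Collection.

```python
from collections import defaultdict


def clusters_from_clone_pairs(clone_pairs):
    """Group fragments into clusters, i.e. connected components of the clone-pair graph."""
    # Undirected adjacency map: fragment -> fragments it forms a clone pair with.
    neighbours = defaultdict(set)
    for a, b in clone_pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    clusters = []
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first traversal collects every fragment reachable from `start`.
        component = set()
        stack = [start]
        while stack:
            fragment = stack.pop()
            if fragment in component:
                continue
            component.add(fragment)
            stack.extend(neighbours[fragment] - component)
        seen |= component
        clusters.append(component)
    return clusters


# A-B and B-C are clone pairs, so A, B and C share a cluster
# even though A-C is not itself recorded as a clone pair.
pairs = [("A", "B"), ("B", "C"), ("D", "E")]
print(clusters_from_clone_pairs(pairs))
```

Under the "clique" variant, A, B and C would not form a single cluster unless A-C were also a clone pair.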
Confidence level is an ordinal-scale value indicating the degree of
confidence regarding some datum.
"Executable" lines of code: lines that are not blank,
are not entirely comments, and contain more than braces.
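As a rough illustration of that counting rule, the sketch below assumes C/Java-style comments (`//` and `/* ... */`) and ignores the possibility of comment markers inside string literals. The function name `count_eloc` is hypothetical, and the Collection's own tooling may apply the rule differently.

```python
def count_eloc(source: str) -> int:
    """Count lines that are not blank, not entirely comments, and more than braces."""
    count = 0
    in_block_comment = False
    for line in source.splitlines():
        code = []  # characters of this line that lie outside comments
        i = 0
        while i < len(line):
            if in_block_comment:
                end = line.find("*/", i)
                if end == -1:
                    i = len(line)  # the whole rest of the line is comment
                else:
                    in_block_comment = False
                    i = end + 2
            elif line.startswith("//", i):
                break  # the rest of the line is a line comment
            elif line.startswith("/*", i):
                in_block_comment = True
                i += 2
            else:
                code.append(line[i])
                i += 1
        remaining = "".join(code).strip()
        # Count the line only if something other than braces and whitespace is left.
        if remaining.strip("{} \t"):
            count += 1
    return count


sample = """
// header comment
int f(int x) {
    /* block
       comment */
    if (x > 0) {   // trailing comment
        return x;
    }
    return 0;
}
"""
print(count_eloc(sample))  # prints 4
```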
This is the authoritative data source for
There is one for each system version.
in the context of the Collection, refers to identifying the origin and
(ideally) processes for creating the data that provides supporting
evidence for the clone data.
There have been several classifications proposed for code clones;
the one that seems to be referred to the most is the clone 'type'.
The categories in this classification are Type-1,
Type-2, Type-3, and
Type-4 (definitions taken from
al.). There is not unanimous agreement on these categories,
especially what's in Type-3 and Type-4 is not considered in the
Type-1: Identical code fragments except for variations in whitespace, layout and
comments.
Type-2: Syntactically identical fragments except for variations in
identifiers, literals, types, whitespace, layout and comments.
Type-3: Copied fragments with further modifications such as changed,
added or removed statements, in addition to variations in identifiers,
literals, types, whitespace, layout and comments.
Type-4: Two or more code fragments that perform the same computation
but are implemented by different syntactic variants.
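The Type-1 criterion (identical fragments apart from whitespace, layout and comments) is the easiest of these to check mechanically. The sketch below is one possible approximation, not a method described by the Collection: it assumes C/Java-style comments, uses a deliberately crude tokenizer, and does not handle comment markers or whitespace inside string literals. The names `tokens` and `looks_like_type1` are hypothetical.

```python
import re

# Assumption: C/Java-style line and block comments.
_COMMENT = re.compile(r"//[^\n]*|/\*.*?\*/", re.DOTALL)
# Crude tokens: identifiers/keywords, digit runs, or any single non-space character.
_TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokens(fragment):
    """Return the fragment's tokens with comments and whitespace discarded."""
    return _TOKEN.findall(_COMMENT.sub(" ", fragment))


def looks_like_type1(a, b):
    """True if the fragments differ only in whitespace, layout and comments."""
    return tokens(a) == tokens(b)


first = "int sum = a + b; // add\n"
second = "int sum=a+b;   /* add them */"
renamed = "int total = a + b;"

print(looks_like_type1(first, second))   # True
print(looks_like_type1(first, renamed))  # False: an identifier differs
```

Detecting Type-2 and Type-3 relationships needs more than this (for example, abstracting identifiers and literals, or tolerating inserted statements), and Type-4 generally cannot be decided by textual comparison at all.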