A confidence level reflects the amount of evidence supporting a measurement and how well that evidence agrees. Generally, the more evidence there is, the higher the confidence. However, if there is a lot of evidence but it is inconsistent (for example, some evidence says a candidate pair is a clone pair and other evidence says it is not), the confidence level is lowered. Another interpretation is that the confidence level indicates how likely the measurement is to change in light of more evidence.
A confidence level provides little information about the error in the measurement. A measurement could have the lowest level of confidence and still be absolutely correct; there is simply no other supporting evidence.
The current measurement values are:
The output from any tool, unless that tool is known to have perfect precision and recall, is assigned confidence level "Low" in the absence of further data. The rationale is that, assuming the tool is of reasonable quality, we can have some confidence in the value (that is, the confidence level should not be "Lowest"), but there will be doubt about any measurement from a tool with imperfect precision and recall, so we cannot claim particularly high confidence.
Assessment by a human generally increases the level of confidence, unless the human judges the original measurement to be wrong. In the latter case, the confidence level should be changed to "Lowest". In the former case, how much it goes up depends on its current value and how confident the human judge is.
For example, if the level is currently "Low", the human agrees with the measurement, and there has been no other assessment, then the level should be increased to at least "Medium". But in some cases (e.g. Type I clones), the human could be quite sure that no one else will disagree, in which case the level could be increased to "High".
However, if the current level is "Medium" or "High", then there may not yet be enough supporting evidence to increase the level.
Under no circumstances will a measurement assessed by only one human get the level "Highest".
At this time, I don't actually know what it will take for a measurement to reach "Highest"!
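The rules for tool output and a single human assessment can be sketched in Python as follows. This is only an illustrative sketch, not an implementation from any tool: the names `Confidence`, `TOOL_DEFAULT`, and `after_human_assessment`, and the `very_sure` and `prior_assessments` parameters, are hypothetical. The text does not say what happens when a human agrees with a measurement at "Lowest", or at "Low" after earlier assessments, so the sketch assumes the level is left unchanged in those cases.

```python
from enum import IntEnum


class Confidence(IntEnum):
    """The five levels named in the text, ordered from weakest to strongest."""
    LOWEST = 0
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    HIGHEST = 4


# Output from a tool without known perfect precision and recall starts at "Low".
TOOL_DEFAULT = Confidence.LOW


def after_human_assessment(current, agrees, very_sure=False, prior_assessments=0):
    """Return the new level after one human assessment (hypothetical helper)."""
    if not agrees:
        # The human judges the measurement to be wrong.
        return Confidence.LOWEST
    if current == Confidence.LOW and prior_assessments == 0:
        # At least "Medium"; "High" when the judge is quite sure
        # that no one else will disagree (e.g. Type I clones).
        return Confidence.HIGH if very_sure else Confidence.MEDIUM
    # "Medium" or "High": there may not yet be enough evidence to go higher.
    # Other cases are not covered by the text; assumed unchanged here.
    return current


if __name__ == "__main__":
    level = after_human_assessment(TOOL_DEFAULT, agrees=True)
    print(level.name)
```

Because the function never returns `Confidence.HIGHEST` for a lower input, a single human assessment cannot produce "Highest", which matches the rule stated above.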