|
|
|
The above table is an
example confusion matrix. The diagonal elements of this matrix give the
number of samples for which the classification results agree with the
reference data.
|
|
The matrix contains the
complete information on categorical accuracy. The off-diagonal elements in
each row give the number of samples that the classifier has misclassified
into that row's class, i.e., the classifier has committed that label to samples
that actually belong to other classes. This misclassification error is called
commission error.
|
|
The off-diagonal elements in
each column are samples of that column's reference class that the classifier
omitted by assigning them to other classes. This misclassification error is
therefore called omission error.
|
|
|
|
To summarize the
classification results, the most commonly used accuracy measure is the
overall accuracy: the sum of the diagonal elements divided by the total
number of samples.
|
|
From the example
confusion matrix, we obtain an overall accuracy of (28 + 15 + 20)/100 = 63%.
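
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the layout (rows as classifier output, columns as reference data) follows the description in the text, and only the diagonal counts 28, 15, 20 and the total of 100 are taken from it. The off-diagonal counts and the names `confusion`, `correct`, and `total` are assumptions made up for the example.

```python
# Rows = classes assigned by the classifier, columns = reference classes.
# Only the diagonal (28, 15, 20) and the total of 100 come from the text;
# the off-diagonal counts are hypothetical values chosen to sum to 100.
confusion = [
    [28, 14, 15],
    [1, 15, 5],
    [1, 1, 20],
]

# Correctly classified samples sit on the diagonal.
correct = sum(confusion[i][i] for i in range(len(confusion)))
# Every cell counts toward the total number of samples.
total = sum(sum(row) for row in confusion)

overall_accuracy = correct / total
print(f"overall accuracy = {correct}/{total} = {overall_accuracy:.0%}")
# prints: overall accuracy = 63/100 = 63%
```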
|
|
More specific measures are
needed because the overall accuracy does not indicate how the accuracy is
distributed across the individual categories. The categories could, and
frequently do, exhibit drastically different accuracies, but the overall
accuracy treats them as if their accuracies were equivalent or similar.
|
|
By examining the confusion
matrix, it can be seen that at least two methods can be used to determine
individual category accuracies.
|
|
(1) The ratio between the
number of correctly classified samples and the row total
|
|
(2) The ratio between the
number of correctly classified samples and the column total
|
|
(1) is called the user's
accuracy because users are concerned with what percentage of the samples
assigned to each class has been correctly classified.
|
|
(2) is called the producer's
accuracy.
|
|
The producer is more
interested in (2) because it tells what percentage of the reference samples
of each class has been correctly classified.
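
As a rough sketch of the two ratios, the following Python computes per-class user's and producer's accuracies. It assumes rows hold the classifier output and columns hold the reference data, as described in the text. The off-diagonal counts in `confusion` are hypothetical (only the diagonal 28, 15, 20 and the total of 100 come from the text), and the function names are invented for this example.

```python
# Hypothetical confusion matrix: rows = classifier output, columns = reference.
# Only the diagonal values and the total of 100 are taken from the text.
confusion = [
    [28, 14, 15],
    [1, 15, 5],
    [1, 1, 20],
]

def users_accuracy(cm):
    """Method (1): diagonal element divided by its row total, per class."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]

def producers_accuracy(cm):
    """Method (2): diagonal element divided by its column total, per class."""
    column_totals = [sum(column) for column in zip(*cm)]
    return [cm[j][j] / column_totals[j] for j in range(len(cm))]

users = users_accuracy(confusion)
producers = producers_accuracy(confusion)

for k, (u, p) in enumerate(zip(users, producers), start=1):
    print(f"class {k}: user's accuracy = {u:.1%}, producer's accuracy = {p:.1%}")
```

With these assumed counts the first class has a high producer's accuracy (28 of 30 reference samples found) but a low user's accuracy (28 of 57 samples labeled as that class are correct), which is exactly the kind of difference the overall accuracy hides.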
|
|
However, there is a more
appropriate way of presenting the individual classification accuracies. This
is through the use of commission error and omission error.
|
|
Commission error = 1 -
user's accuracy
|
|
Omission error = 1 -
producer's accuracy
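
A minimal Python sketch of these two error measures follows. It counts the off-diagonal elements in each row and column directly and checks that the result matches 1 minus the corresponding accuracy. The matrix layout (rows as classifier output, columns as reference data) follows the text, but the off-diagonal counts and all variable names are assumptions made for illustration.

```python
# Hypothetical confusion matrix: rows = classifier output, columns = reference.
# Only the diagonal (28, 15, 20) and the total of 100 are taken from the text.
confusion = [
    [28, 14, 15],
    [1, 15, 5],
    [1, 1, 20],
]
n = len(confusion)

row_totals = [sum(row) for row in confusion]
column_totals = [sum(column) for column in zip(*confusion)]

# Commission error: share of each row that lies off the diagonal.
commission = [(row_totals[i] - confusion[i][i]) / row_totals[i] for i in range(n)]
# Omission error: share of each column that lies off the diagonal.
omission = [(column_totals[j] - confusion[j][j]) / column_totals[j] for j in range(n)]

# Both agree with the definitions "1 - user's accuracy" and
# "1 - producer's accuracy".
for k in range(n):
    users = confusion[k][k] / row_totals[k]
    producers = confusion[k][k] / column_totals[k]
    assert abs(commission[k] - (1 - users)) < 1e-12
    assert abs(omission[k] - (1 - producers)) < 1e-12
    print(f"class {k + 1}: commission = {commission[k]:.1%}, omission = {omission[k]:.1%}")
```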
|
|
|