Some Basic Performance Evaluation Measures
After training a machine learning model, we need to measure how well it performs, and for that purpose there are six common measures:
- Confusion Matrix
- Accuracy
- Recall
- Precision
- F1-Score
- Decision Boundary Plot
Confusion Matrix
A confusion matrix is a table used to describe the performance of a classification
model on a set of test data for which the true values are known. It gives a visual
summary of how an algorithm performs and makes it easy to spot confusion between
classes, e.g. one class being commonly mislabeled as another. Most of the other performance measures can be derived from the confusion matrix.
It summarizes the prediction results of a classification problem: the numbers of
correct and incorrect predictions are broken down by class with count values. In short, the confusion matrix shows the ways in which your classification model is confused when it makes predictions.
It gives us insight not only into the errors being made by a classifier but, more importantly, into the types of errors that are being made.

Definition of the Terms :-
Positive (P) : Observation is positive (for example: is an orange).
Negative (N) : Observation is not positive (for example: is not an orange).
True Positive (TP) : Observation is positive and is predicted positive.
False Negative (FN) : Observation is positive, but is predicted negative.
True Negative (TN) : Observation is negative, and is predicted negative.
False Positive (FP) : Observation is negative, but is predicted positive.
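To make these counts concrete, here is a minimal sketch of computing a confusion matrix with scikit-learn (the labels below are made-up example data; 1 stands for the positive class, 0 for the negative class):

```python
from sklearn.metrics import confusion_matrix

# Made-up example labels: 1 = positive (is an orange), 0 = negative (is not an orange)
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# With labels=[1, 0], rows are the true classes and columns the predicted classes:
# [[TP, FN],
#  [FP, TN]]
cm = confusion_matrix(y_true, y_pred, labels=[1, 0])
print(cm)  # [[3 1]
           #  [1 3]]
```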
Accuracy
Accuracy is simply the ratio of correctly predicted observations to the total
number of observations. A high accuracy looks like a good result, but accuracy is a
reliable measure only when the dataset is symmetric, i.e. when the numbers of false
positives and false negatives are almost the same. Therefore, you also have to look
at other parameters to evaluate the performance of your model.
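In terms of the confusion matrix counts, Accuracy = (TP + TN) / (TP + TN + FP + FN). A minimal sketch, reusing the made-up labels from above with scikit-learn's accuracy_score:

```python
from sklearn.metrics import accuracy_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # made-up example labels from above
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# (TP + TN) / (TP + TN + FP + FN) = (3 + 3) / 8
print(accuracy_score(y_true, y_pred))  # 0.75
```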

Recall
Recall is defined as the ratio of the number of correctly classified positive examples to the total number of positive examples. High recall indicates that the class is correctly recognized (a small number of FN).
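In terms of the confusion matrix counts, Recall = TP / (TP + FN). A minimal sketch with scikit-learn's recall_score on the same made-up labels:

```python
from sklearn.metrics import recall_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # made-up example labels from above
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# TP / (TP + FN) = 3 / (3 + 1)
print(recall_score(y_true, y_pred))  # 0.75
```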

Precision
To get the value of precision we divide the number of correctly classified positive examples by the total number of examples predicted as positive. High precision indicates that an example labeled as positive is indeed positive (a small number of FP).
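In terms of the confusion matrix counts, Precision = TP / (TP + FP). A minimal sketch with scikit-learn's precision_score on the same made-up labels:

```python
from sklearn.metrics import precision_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # made-up example labels from above
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

# TP / (TP + FP) = 3 / (3 + 1)
print(precision_score(y_true, y_pred))  # 0.75
```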

High recall, low precision :-
This means that most of the positive examples are recognized correctly, but the
number of false positives is large.
Low recall, high precision :-
This shows that a lot of positive examples are missed, but those that are predicted
as positive are indeed positive.
F1-Score
Since we have two measures (precision and recall), it helps to have a single measurement that represents both of them. We calculate an F-measure, which uses the harmonic mean in place of the arithmetic mean because it punishes extreme values more.
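The formula is F1 = 2 * (Precision * Recall) / (Precision + Recall), i.e. the harmonic mean of the two. A minimal sketch on the same made-up labels, computed both by hand and with scikit-learn's f1_score:

```python
from sklearn.metrics import f1_score, precision_score, recall_score

y_true = [1, 1, 1, 0, 0, 0, 1, 0]  # made-up example labels from above
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]

p = precision_score(y_true, y_pred)  # 0.75
r = recall_score(y_true, y_pred)     # 0.75

# Harmonic mean of precision and recall
print(2 * p * r / (p + r))       # 0.75
print(f1_score(y_true, y_pred))  # same value via scikit-learn
```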

Decision Boundary Plot
In classification problems, we predict one particular class among multiple classes.
In other words, each instance needs to be assigned to a particular region of the feature space and separated from the other regions. This separation between regions is marked by a boundary known as the decision boundary. The decision boundary is visualized in feature space on a scatter plot, where every point depicts a data point of the dataset and the axes depict the features.
The decision boundary separates the data points into regions, which correspond to
the classes they belong to.
For example, a decision boundary plot of linearly separable data for the KNN algorithm can be drawn as shown below:
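Here is a minimal sketch of how such a plot can be produced, assuming scikit-learn and matplotlib and a toy two-feature dataset generated with make_blobs (not the data from the story):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.neighbors import KNeighborsClassifier

# Toy, roughly linearly separable two-feature data (made-up example)
X, y = make_blobs(n_samples=200, centers=2, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X, y)

# Evaluate the classifier on a grid covering the feature space
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.linspace(x_min, x_max, 300),
                     np.linspace(y_min, y_max, 300))
Z = knn.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

# Colored regions show the predicted class; the edge between them is the decision boundary
plt.contourf(xx, yy, Z, alpha=0.3)
plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor="k")
plt.xlabel("Feature 1")
plt.ylabel("Feature 2")
plt.title("KNN decision boundary on toy data")
plt.show()
```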

This is all for performance measures. In the next story there will be something more about other classification algorithms.