Metrics

Published on: October 20, 2021

Classification

Binary cross entropy

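Binary cross entropy is a loss function used for binary classification tasks (tasks with only two outcomes/classes). It works by calculating the following average:

$$L(y, \hat{y}) = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right]$$

The above equation can be split into two parts to make it easier to understand:

$$L(y_i, \hat{y}_i) = \begin{cases} -\log(\hat{y}_i) & \text{if } y_i = 1 \\ -\log(1 - \hat{y}_i) & \text{if } y_i = 0 \end{cases}$$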

Plotting the two parts shows that the further away the prediction is from the actual y value, the bigger the loss gets.

That means that if the correct answer is 0, then the cost function will be 0 if the prediction is also 0. If the prediction approaches 1, then the cost function will approach infinity.

If the correct answer is 1, then the cost function will be 0 if the prediction is 1. If the prediction approaches 0, then the cost function will approach infinity.
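The behaviour described above can be checked with a minimal NumPy sketch. The function name `binary_cross_entropy` and the clipping constant `eps` are assumptions made for this illustration and are not necessarily what the linked implementation uses.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average binary cross entropy (illustrative sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    # Assumption: clip predictions away from 0 and 1 so the logarithm stays finite.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0 - eps)
    per_sample = y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred)
    return -np.mean(per_sample)
```

Confident correct predictions give a loss close to 0, while a confident wrong prediction (for example predicting 0.999999 when the label is 0) gives a large loss.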

Resources:

Code:

Binary Cross Entropy Numpy Implementation

Categorical Crossentropy

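Categorical crossentropy is a loss function used for multi-class classification tasks. The output loss is the negative average, over all samples, of the sum of the true values multiplied by the log of the predicted values:

$$L(y, \hat{y}) = -\frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{C} y_{ij} \log(\hat{y}_{ij})$$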
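As a rough sketch of this computation, assuming one-hot encoded labels and predicted class probabilities of shape `(n_samples, n_classes)` (the name `categorical_cross_entropy` and the `eps` clipping are assumptions for this example):

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Average categorical cross entropy for one-hot labels (illustrative sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    # Assumption: clip probabilities so log(0) never occurs.
    y_pred = np.clip(np.asarray(y_pred, dtype=float), eps, 1.0)
    # Sum over classes for each sample, then average over samples.
    return -np.mean(np.sum(y_true * np.log(y_pred), axis=1))
```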

Resources:

Code:

Categorical Cross Entropy Numpy Implementation

Accuracy Score

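Accuracy is the fraction of predictions the model classified correctly.

$$\text{Accuracy} = \frac{\text{Number of correct predictions}}{\text{Total number of predictions}}$$

For binary classification, accuracy can also be calculated in terms of positives and negatives as follows:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Where $TP$ = true positives, $TN$ = true negatives, $FP$ = false positives, and $FN$ = false negatives.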
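Both views of accuracy can be sketched in a few lines of NumPy. The helper names `accuracy_score` and `accuracy_from_counts` are hypothetical and chosen only for this example.

```python
import numpy as np

def accuracy_score(y_true, y_pred):
    """Fraction of predictions that match the labels (illustrative sketch)."""
    return np.mean(np.asarray(y_true) == np.asarray(y_pred))

def accuracy_from_counts(tp, tn, fp, fn):
    """Binary accuracy from true/false positive and negative counts."""
    return (tp + tn) / (tp + tn + fp + fn)
```

For labels `[1, 0, 1, 1]` and predictions `[1, 0, 0, 1]` there are 2 true positives, 1 true negative and 1 false negative, so both functions give 0.75.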

Resources:

Code:

Accuracy Score Numpy Implementation

Confusion matrix

A confusion matrix is a table that summarises the predictions of a classifier or classification model. By definition, entry $C_{i,j}$ in a confusion matrix is the number of observations actually in group $i$, but predicted to be in group $j$.
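A minimal sketch of how such a table could be built, assuming the classes are encoded as integers `0 .. n_classes - 1` (the function name and the `n_classes` parameter are assumptions for this example):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are actual classes, columns are predicted classes (illustrative sketch)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    # np.add.at accumulates correctly even when the same (actual, predicted) pair repeats.
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm
```

The diagonal holds the correctly classified observations, and everything off the diagonal is a misclassification.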

Resources:

Precision

Precision is a metric for classification models that identifies the frequency with which a model was correct when predicting the positive class. Precision is defined as the number of true positives over the number of true positives plus the number of false positives.

$$\text{Precision} = \frac{TP}{TP + FP}$$
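The definition translates directly into NumPy. This is a sketch that assumes binary labels encoded as 0 and 1; returning 0.0 when nothing is predicted positive is a convention chosen for this example.

```python
import numpy as np

def precision_score(y_true, y_pred):
    """True positives over all predicted positives (illustrative sketch)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    # Assumption: define precision as 0.0 when there are no positive predictions.
    return tp / (tp + fp) if (tp + fp) > 0 else 0.0
```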

Resources:

Code:

Precision Numpy Implementation

Recall

Recall is a metric for classification models that identifies how many positive labels the model identified out of all the possible positive labels.

$$\text{Recall} = \frac{TP}{TP + FN}$$
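A small sketch of recall under the same assumptions as before (binary labels encoded as 0 and 1; the function name and the zero-division convention are choices made for this example):

```python
import numpy as np

def recall_score(y_true, y_pred):
    """True positives over all actual positives (illustrative sketch)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    # Assumption: define recall as 0.0 when there are no positive labels.
    return tp / (tp + fn) if (tp + fn) > 0 else 0.0
```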

Resources:

Code:

Recall Numpy Implementation

F1-Score

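The F1-Score is the harmonic mean of precision and recall. A perfect model will have an F1-Score of 1.

$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$

It's also possible to weight precision or recall differently using the $F_\beta$-Score. Here a positive real factor $\beta$ is used to weight the recall $\beta$ times as much as the precision.

$$F_\beta = (1 + \beta^2) \cdot \frac{\text{precision} \cdot \text{recall}}{(\beta^2 \cdot \text{precision}) + \text{recall}}$$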
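The weighting can be illustrated with a short sketch that takes precision and recall as already computed numbers. The function name `fbeta_score` is an assumption for this example; with `beta=1.0` it reduces to the F1-Score.

```python
def fbeta_score(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (illustrative sketch)."""
    beta_sq = beta ** 2
    denominator = beta_sq * precision + recall
    # Assumption: return 0.0 when both precision and recall are zero.
    if denominator == 0:
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator
```

With a precision of 0.5 and a recall of 1.0, choosing `beta=2.0` gives a higher score than `beta=1.0`, because the strong recall is weighted more heavily.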

Resources:

Code:

Receiver operating characteristic (ROC)

The ROC curve (receiver operating characteristic curve) is a graph that illustrates the performance of a classification model as its discrimination threshold is varied. The ROC curve is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.

True Positive Rate (TPR):

$$TPR = \frac{TP}{TP + FN}$$

False Positive Rate (FPR):

$$FPR = \frac{FP}{FP + TN}$$
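To make the threshold sweep concrete, here is a minimal sketch that computes one (FPR, TPR) point per threshold. It assumes binary labels encoded as 0 and 1, at least one example of each class, and that a sample is predicted positive when its score is greater than or equal to the threshold; the name `roc_points` is hypothetical.

```python
import numpy as np

def roc_points(y_true, y_score, thresholds):
    """Return arrays of FPR and TPR, one entry per threshold (illustrative sketch)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    fpr, tpr = [], []
    for t in thresholds:
        predicted_positive = y_score >= t
        tpr.append(np.sum(predicted_positive & (y_true == 1)) / n_pos)
        fpr.append(np.sum(predicted_positive & (y_true == 0)) / n_neg)
    return np.array(fpr), np.array(tpr)
```

Plotting `tpr` against `fpr` for a fine grid of thresholds yields the ROC curve.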

Resources:

Area under the ROC curve (AUC)

AUC stands for "Area under the ROC Curve". AUC provides an aggregate measure of performance across all possible classification thresholds. One way of interpreting AUC is as the probability that the model ranks a random positive example more highly than a random negative example. - Google Developers Machine Learning Crash Course
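The ranking interpretation can be turned into a small sketch: compare every positive example with every negative example and count how often the positive one receives the higher score. The function name `auc_by_ranking` is an assumption for this example, ties are counted as half a win by convention, and the pairwise comparison is only practical for small datasets.

```python
import numpy as np

def auc_by_ranking(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly (sketch)."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    # Pairwise score differences between every positive and every negative example.
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return wins / (len(pos) * len(neg))
```

For labels `[0, 0, 1, 1]` and scores `[0.1, 0.4, 0.35, 0.8]`, three of the four positive/negative pairs are ranked correctly, giving an AUC of 0.75.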

Resources:

Hinge Loss

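Hinge loss is a loss function used for "maximum-margin" classification, most notably for Support Vector Machines (SVMs). For a target $t = \pm 1$ and a raw classifier score $y$, it is defined as:

$$\ell(y) = \max(0, 1 - t \cdot y)$$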
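A minimal sketch of the averaged hinge loss, assuming labels encoded as -1 and +1 and raw (unthresholded) decision scores as predictions; the function name is chosen for this example:

```python
import numpy as np

def hinge_loss(y_true, y_score):
    """Mean hinge loss for labels in {-1, +1} (illustrative sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    # Samples classified correctly with a margin of at least 1 contribute zero loss.
    return np.mean(np.maximum(0.0, 1.0 - y_true * y_score))
```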

Resources:

Code:

Hinge Loss Numpy Implementation

KL Divergence

The Kullback-Leibler divergence, $D_{KL}(P \parallel Q)$, often shortened to just KL divergence, is a measure of how one probability distribution $P$ is different from a second, reference probability distribution $Q$.

$$D_{KL}(P \parallel Q) = \sum_{x} P(x) \log\frac{P(x)}{Q(x)}$$
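For discrete distributions this can be sketched as follows. The sketch assumes both inputs are valid probability vectors and that the reference distribution is non-zero wherever the first one is; terms where the first distribution is zero are skipped, following the usual convention that they contribute nothing.

```python
import numpy as np

def kl_divergence(p, q):
    """KL divergence between two discrete distributions (illustrative sketch)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Convention: terms with p == 0 contribute 0 to the sum.
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))
```

Note that the measure is not symmetric: swapping the two distributions generally gives a different value.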

Resources:

Code:

KL Divergence Numpy Implementation

Brier Score

The Brier Score is a strictly proper score function or strictly proper scoring rule that measures the accuracy of probabilistic predictions. For unidimensional predictions, it is strictly equivalent to the mean squared error as applied to predicted probabilities. - Wikipedia
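For binary outcomes the equivalence with the mean squared error can be seen in a short sketch. It assumes outcomes encoded as 0 and 1 and predicted probabilities for the positive class; the function name is an assumption for this example.

```python
import numpy as np

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and outcomes (sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.asarray(y_prob, dtype=float)
    return np.mean((y_prob - y_true) ** 2)
```

Lower is better: a perfectly calibrated and perfectly confident forecaster scores 0.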

Resources:

Code:

Brier Score Numpy Implementation

Regression

Mean Squared Error

The mean squared error (MSE), or mean squared deviation (MSD), measures the average of the squares of the errors, that is, the average squared difference between the estimated and actual values.

$$MSE = \frac{1}{N} \sum_{i=1}^{N} (y_i - \hat{y}_i)^2$$
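A minimal NumPy sketch of this definition (the function name is chosen for this example and may differ from the linked implementation):

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared errors (illustrative sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)
```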

Resources:

Code:

Mean Squared Error Numpy Implementation

Mean Squared Logarithmic Error

Mean Squared Logarithmic Error (MSLE) is an extension of Mean Squared Error (MSE) often used when the target grows exponentially.

$$MSLE = \frac{1}{N} \sum_{i=1}^{N} \left( \log(1 + y_i) - \log(1 + \hat{y}_i) \right)^2$$

Note: This metric penalizes under-predictions more heavily than over-predictions.
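The asymmetry mentioned in the note can be observed with a short sketch. It assumes non-negative targets and predictions and uses `np.log1p`, which computes `log(1 + x)`; the function name is an assumption for this example.

```python
import numpy as np

def mean_squared_log_error(y_true, y_pred):
    """MSE computed on log(1 + value) (illustrative sketch, non-negative inputs)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2)
```

For a target of 100, predicting 50 (under-prediction by 50) yields a larger loss than predicting 150 (over-prediction by 50).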

Code:

Mean Squared Logarithmic Error Numpy Implementation

Resources:

Mean Absolute Error

The mean absolute error (MAE) measures the average of the absolute values of the errors, that is, the average absolute difference between the estimated and actual values.

$$MAE = \frac{1}{N} \sum_{i=1}^{N} |y_i - \hat{y}_i|$$
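A minimal NumPy sketch of this definition (the function name is chosen for this example):

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average of the absolute errors (illustrative sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))
```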

Code:

Mean Absolute Error Numpy Implementation

Resources:

Mean Absolute Percentage Error

Mean absolute percentage error (MAPE) is an extension of the mean absolute error (MAE) that divides the difference between the predicted value and the actual value by the actual value. The main idea of MAPE is to be sensitive to relative errors. For example, it is not changed by a global scaling of the target variable.

$$MAPE = \frac{1}{N} \sum_{i=1}^{N} \frac{|y_i - \hat{y}_i|}{|y_i|}$$
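The scale invariance can be checked with a small sketch. It assumes that no actual value is zero, since the relative error is undefined there; the function name is an assumption for this example, and the result is returned as a fraction rather than multiplied by 100.

```python
import numpy as np

def mean_absolute_percentage_error(y_true, y_pred):
    """Average relative error as a fraction (illustrative sketch, y_true != 0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred) / np.abs(y_true))
```

Multiplying both the targets and the predictions by the same constant leaves the result unchanged.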

Code:

Mean Absolute Percentage Error Numpy Implementation

Resources:

Median Absolute Error

The median absolute error, also often called the median absolute deviation (MAD), is a metric that is particularly robust to outliers. The loss is calculated by taking the median of all absolute differences between the target and the prediction.
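The robustness to outliers is easy to see in a short sketch (the function name is chosen for this example):

```python
import numpy as np

def median_absolute_error(y_true, y_pred):
    """Median of the absolute errors (illustrative sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.median(np.abs(y_true - y_pred))
```

If four out of five predictions are off by 0.1 and one is off by 100, the median absolute error is still about 0.1, whereas the mean absolute error would be dominated by the single outlier.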

Code:

Median Absolute Error Numpy Implementation

Resources:

Cosine Similarity

Cosine similarity is a measure of similarity between two vectors. The cosine similarity is the cosine of the angle between two vectors.
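The cosine of the angle follows from the dot product divided by the product of the vector norms. A minimal sketch, assuming two non-zero 1-D vectors of equal length (the function name is an assumption for this example):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors (illustrative sketch)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

Parallel vectors give 1, orthogonal vectors give 0, and opposite vectors give -1, independent of their lengths.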

Code:

Cosine Similarity Numpy Implementation

Resources:

R2 Score

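The coefficient of determination, denoted as $R^2$, is the proportion of the variation in the dependent variable that has been explained by the independent variables in the model.

$$R^2(y, \hat{y}) = 1 - \frac{\sum_{i=1}^{N} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{N} (y_i - \bar{y})^2}$$

where $\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i$.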
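As a sketch, the score is one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the targets. The function name is chosen for this example, and the sketch assumes the targets are not all identical (otherwise the denominator is zero).

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination (illustrative sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot
```

A perfect model scores 1, and a model that always predicts the mean of the targets scores 0.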

Code:

R2 Score Numpy Implementation

Resources:

Tweedie deviance

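The Tweedie distributions are a family of probability distributions, which include the purely continuous normal, gamma and inverse Gaussian distributions and more.

The unit deviance of a reproductive Tweedie distribution with power parameter $p$ is given by:

$$d(y, \mu) = \begin{cases} (y - \mu)^2 & \text{for } p = 0 \\ 2\left(y \log\frac{y}{\mu} + \mu - y\right) & \text{for } p = 1 \\ 2\left(\log\frac{\mu}{y} + \frac{y}{\mu} - 1\right) & \text{for } p = 2 \\ 2\left(\frac{\max(y, 0)^{2-p}}{(1-p)(2-p)} - \frac{y\,\mu^{1-p}}{1-p} + \frac{\mu^{2-p}}{2-p}\right) & \text{otherwise} \end{cases}$$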
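A rough sketch of the mean Tweedie deviance is given below. It uses the standard unit deviance with the usual special cases for powers 0 (normal), 1 (Poisson) and 2 (gamma). The function name and the `power` parameter are assumptions for this example, and the sketch assumes strictly positive predictions (and strictly positive targets for `power=2`).

```python
import numpy as np

def mean_tweedie_deviance(y_true, y_pred, power=0.0):
    """Mean Tweedie unit deviance (illustrative sketch)."""
    y = np.asarray(y_true, dtype=float)
    mu = np.asarray(y_pred, dtype=float)
    p = power
    if p == 0:    # normal distribution: squared error
        dev = (y - mu) ** 2
    elif p == 1:  # Poisson distribution; y * log(y / mu) is taken as 0 when y == 0
        with np.errstate(divide="ignore", invalid="ignore"):
            y_log = np.where(y > 0, y * np.log(y / mu), 0.0)
        dev = 2.0 * (y_log + mu - y)
    elif p == 2:  # gamma distribution
        dev = 2.0 * (np.log(mu / y) + y / mu - 1.0)
    else:         # general case
        dev = 2.0 * (
            np.maximum(y, 0.0) ** (2 - p) / ((1 - p) * (2 - p))
            - y * mu ** (1 - p) / (1 - p)
            + mu ** (2 - p) / (2 - p)
        )
    return np.mean(dev)
```

With `power=0` the result equals the mean squared error, and for any power the deviance is zero when the predictions match the targets exactly.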

Code:

Tweedie deviance Numpy Implementation

Resources:

D^2 score

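The $D^2$-Score computes the percentage of deviance explained. It is a generalization of $R^2$, where the squared error is replaced by the Tweedie deviance. - Scikit Learn

$D^2$, also known as McFadden’s likelihood ratio index, is calculated as

$$D^2(y, \hat{y}) = 1 - \frac{\text{dev}(y, \hat{y})}{\text{dev}(y, y_{\text{null}})}$$

where $y_{\text{null}}$ is the optimal prediction of an intercept-only model (for example, the mean of $y$ in the Tweedie case).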
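The idea can be sketched with a pluggable deviance function. To keep the example self-contained, the default deviance here is the squared error, in which case the score coincides with the R2 score; the names `d2_score` and `squared_error_deviance` are hypothetical, and the null model is assumed to predict the mean of the targets.

```python
import numpy as np

def squared_error_deviance(y, mu):
    """Mean squared error, i.e. the Tweedie deviance with power 0."""
    return np.mean((y - mu) ** 2)

def d2_score(y_true, y_pred, deviance=squared_error_deviance):
    """Fraction of deviance explained relative to a mean-only model (sketch)."""
    y = np.asarray(y_true, dtype=float)
    mu = np.asarray(y_pred, dtype=float)
    # Assumption: the null model predicts the mean of the targets for every sample.
    y_null = np.full_like(y, y.mean())
    return 1.0 - deviance(y, mu) / deviance(y, y_null)
```

Passing a different deviance function, such as a Poisson or gamma deviance, gives the corresponding generalization.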

Code:

D^2 Score Numpy Implementation

Resources:

D² score, the coefficient of determination

Huber Loss

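Huber loss is a loss function that is often used in robust regression. The function is quadratic for small values of $a$ and linear for large values.

$$L_\delta(a) = \begin{cases} \frac{1}{2} a^2 & \text{for } |a| \le \delta \\ \delta \left(|a| - \frac{1}{2}\delta\right) & \text{otherwise} \end{cases}$$

where $a = y - \hat{y}$ and $\delta$ is the point where the loss changes from quadratic to linear.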
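A minimal sketch of the averaged Huber loss, where the residual is the difference between target and prediction and `delta` (default chosen arbitrarily as 1.0 for this example) marks the switch from the quadratic to the linear regime:

```python
import numpy as np

def huber_loss(y_true, y_pred, delta=1.0):
    """Mean Huber loss (illustrative sketch)."""
    a = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    abs_a = np.abs(a)
    quadratic = 0.5 * a ** 2                 # used where |a| <= delta
    linear = delta * (abs_a - 0.5 * delta)   # used where |a| > delta
    return np.mean(np.where(abs_a <= delta, quadratic, linear))
```

Because large residuals only contribute linearly, single outliers influence the loss far less than they would with the mean squared error.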

Code:

Huber Numpy Implementation

Resources:

Log Cosh Loss

Logarithm of the hyperbolic cosine of the prediction error.
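A short sketch of the averaged log cosh loss follows. Evaluating `np.log(np.cosh(x))` directly overflows for large errors, so this sketch uses the identity log(cosh(x)) = log(e^x + e^-x) - log(2) via `np.logaddexp`; the function name is an assumption for this example.

```python
import numpy as np

def log_cosh_loss(y_true, y_pred):
    """Mean of log(cosh(prediction error)) (illustrative sketch)."""
    error = np.asarray(y_pred, dtype=float) - np.asarray(y_true, dtype=float)
    # log(cosh(x)) = log(exp(x) + exp(-x)) - log(2), computed without overflow.
    return np.mean(np.logaddexp(error, -error) - np.log(2.0))
```

The loss behaves roughly like half the squared error for small errors and like the absolute error (minus log 2) for large ones.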

Code:

Log Cosh Loss Numpy Implementation

Resources:

Log Cosh loss Tensorflow
