Version: | 0.1.4 |
Title: | Evaluation Metrics for Machine Learning |
Description: | An implementation of evaluation metrics in R that are commonly used in supervised machine learning. It implements metrics for regression, time series, binary classification, classification, and information retrieval problems. It has zero dependencies and a consistent, simple interface for all functions. |
Maintainer: | Michael Frasco <mfrasco6@gmail.com> |
Suggests: | testthat |
URL: | https://github.com/mfrasco/Metrics |
BugReports: | https://github.com/mfrasco/Metrics/issues |
License: | BSD_3_clause + file LICENSE |
RoxygenNote: | 6.0.1 |
NeedsCompilation: | no |
Packaged: | 2018-07-09 03:33:50 UTC; mfrasco |
Author: | Ben Hamner [aut, cph], Michael Frasco [aut, cre], Erin LeDell [ctb] |
Repository: | CRAN |
Date/Publication: | 2018-07-09 04:30:18 UTC |
Mean Quadratic Weighted Kappa
Description
MeanQuadraticWeightedKappa
computes the mean quadratic weighted kappa,
which can optionally be weighted
Usage
MeanQuadraticWeightedKappa(kappas, weights = rep(1, length(kappas)))
Arguments
kappas |
A numeric vector of possible kappas. |
weights |
An optional numeric vector of ratings. |
See Also
Examples
kappas <- c(0.3 ,0.2, 0.2, 0.5, 0.1, 0.2)
weights <- c(1.0, 2.5, 1.0, 1.0, 2.0, 3.0)
MeanQuadraticWeightedKappa(kappas, weights)
Quadratic Weighted Kappa
Description
ScoreQuadraticWeightedKappa
computes the quadratic weighted kappa between
two vectors of integers
Usage
ScoreQuadraticWeightedKappa(rater.a, rater.b, min.rating = min(c(rater.a,
rater.b)), max.rating = max(c(rater.a, rater.b)))
Arguments
rater.a |
An integer vector of the first rater's ratings. |
rater.b |
An integer vector of the second rater's ratings. |
min.rating |
The minimum possible rating. |
max.rating |
The maximum possible rating. |
See Also
Examples
rater.a <- c(1, 4, 5, 5, 2, 1)
rater.b <- c(2, 2, 4, 5, 3, 3)
ScoreQuadraticWeightedKappa(rater.a, rater.b, 1, 5)
Accuracy
Description
accuracy
is defined as the proportion of elements in actual
that are
equal to the corresponding element in predicted
Usage
accuracy(actual, predicted)
Arguments
actual |
The ground truth vector, where elements of the vector can be any variable type. |
predicted |
The predicted vector, where elements of the vector represent a
prediction for the corresponding value in |
See Also
Examples
actual <- c('a', 'a', 'c', 'b', 'c')
predicted <- c('a', 'b', 'c', 'b', 'a')
accuracy(actual, predicted)
Absolute Error
Description
ae
computes the elementwise absolute difference between two numeric vectors.
Usage
ae(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
ae(actual, predicted)
Absolute Percent Error
Description
ape
computes the elementwise absolute percent difference between two numeric
vectors
Usage
ape(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
Details
ape
is calculated as (actual
- predicted
) / abs(actual)
.
This means that the function will return -Inf
, Inf
, or NaN
if actual
is zero.
See Also
mape
smape
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
ape(actual, predicted)
Average Precision at k
Description
apk
computes the average precision at k, in the context of information
retrieval problems.
Usage
apk(k, actual, predicted)
Arguments
k |
The number of elements of |
actual |
The ground truth vector of relevant documents. The vector can contain
any numeric or character values, order does not matter, and the
vector does not need to be the same length as |
predicted |
The predicted vector of retrieved documents. The vector can
contain any numeric of character values. However, unlike |
Details
apk
loops over the first k values of predicted
. For each value, if
the value is contained within actual
and has not been predicted before,
we increment the number of sucesses by one and increment our score by the number
of successes divided by k. Then, we return our final score divided by the number
of relevant documents (i.e. the length of actual
).
apk
will return NaN
if length(actual)
equals 0
.
See Also
Examples
actual <- c('a', 'b', 'd')
predicted <- c('b', 'c', 'a', 'e', 'f')
apk(3, actual, predicted)
Area under the ROC curve (AUC)
Description
auc
computes the area under the receiver-operator characteristic curve (AUC).
Usage
auc(actual, predicted)
Arguments
actual |
The ground truth binary numeric vector containing 1 for the positive class and 0 for the negative class. |
predicted |
A numeric vector of predicted values, where the smallest values correspond
to the observations most believed to be in the negative class
and the largest values indicate the observations most believed
to be in the positive class. Each element represents the
prediction for the corresponding element in |
Details
auc
uses the fact that the area under the ROC curve is equal to the probability
that a randomly chosen positive observation has a higher predicted value than a
randomly chosen negative value. In order to compute this probability, we can
calculate the Mann-Whitney U statistic. This method is very fast, since we
do not need to compute the ROC curve first.
Examples
actual <- c(1, 1, 1, 0, 0, 0)
predicted <- c(0.9, 0.8, 0.4, 0.5, 0.3, 0.2)
auc(actual, predicted)
Bias
Description
bias
computes the average amount by which actual
is greater than
predicted
.
Usage
bias(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
Details
If a model is unbiased bias(actual, predicted)
should be close to zero.
Bias is calculated by taking the average of (actual
- predicted
).
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
bias(actual, predicted)
Classification Error
Description
ce
is defined as the proportion of elements in actual
that are not equal
to the corresponding element in predicted
.
Usage
ce(actual, predicted)
Arguments
actual |
The ground truth vector, where elements of the vector can be any variable type. |
predicted |
The predicted vector, where elements of the vector represent a
prediction for the corresponding value in |
See Also
Examples
actual <- c('a', 'a', 'c', 'b', 'c')
predicted <- c('a', 'b', 'c', 'b', 'a')
ce(actual, predicted)
F1 Score
Description
f1
computes the F1 Score in the context of information retrieval problems.
Usage
f1(actual, predicted)
Arguments
actual |
The ground truth vector of relevant documents. The vector can contain
any numeric or character values, order does not matter, and the
vector does not need to be the same length as |
predicted |
The predicted vector of retrieved documents. The vector can contain
any numeric or character values, order does not matter, and the
vector does not need to be the same length as |
Details
f1
is defined as 2 * precision * recall / (precision + recall)
. In the
context of information retrieval problems, precision is the proportion of retrieved
documents that are relevant to a query and recall is the proportion of relevant
documents that are successfully retrieved by a query. If there are zero relevant
documents that are retrieved, zero relevant documents, or zero predicted documents,
f1
is defined as 0
.
See Also
Examples
actual <- c('a', 'c', 'd')
predicted <- c('d', 'e')
f1(actual, predicted)
F-beta Score
Description
fbeta_score
computes a weighted harmonic mean of Precision and Recall.
The beta
parameter controls the weighting.
Usage
fbeta_score(actual, predicted, beta = 1)
Arguments
actual |
The ground truth binary numeric vector containing 1 for the positive class and 0 for the negative class. |
predicted |
The predicted binary numeric vector containing 1 for the positive
class and 0 for the negative class. Each element represents the
prediction for the corresponding element in |
beta |
A non-negative real number controlling how close the F-beta score is to
either Precision or Recall. When |
See Also
Examples
actual <- c(1, 1, 1, 0, 0, 0)
predicted <- c(1, 0, 1, 1, 1, 1)
recall(actual, predicted)
Log Loss
Description
ll
computes the elementwise log loss between two numeric vectors.
Usage
ll(actual, predicted)
Arguments
actual |
The ground truth binary numeric vector containing 1 for the positive class and 0 for the negative class. |
predicted |
A numeric vector of predicted values, where the values correspond
to the probabilities that each observation in |
See Also
Examples
actual <- c(1, 1, 1, 0, 0, 0)
predicted <- c(0.9, 0.8, 0.4, 0.5, 0.3, 0.2)
ll(actual, predicted)
Mean Log Loss
Description
logLoss
computes the average log loss between two numeric vectors.
Usage
logLoss(actual, predicted)
Arguments
actual |
The ground truth binary numeric vector containing 1 for the positive class and 0 for the negative class. |
predicted |
A numeric vector of predicted values, where the values correspond
to the probabilities that each observation in |
See Also
Examples
actual <- c(1, 1, 1, 0, 0, 0)
predicted <- c(0.9, 0.8, 0.4, 0.5, 0.3, 0.2)
logLoss(actual, predicted)
Mean Absolute Error
Description
mae
computes the average absolute difference between two numeric vectors.
Usage
mae(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
mae(actual, predicted)
Mean Absolute Percent Error
Description
mape
computes the average absolute percent difference between two numeric vectors.
Usage
mape(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
Details
mape
is calculated as the average of (actual
- predicted
) / abs(actual)
.
This means that the function will return -Inf
, Inf
, or NaN
if actual
is zero. Due to the instability at or near zero, smape
or
mase
are often used as alternatives.
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
mape(actual, predicted)
Mean Average Precision at k
Description
mapk
computes the mean average precision at k for a set of predictions, in
the context of information retrieval problems.
Usage
mapk(k, actual, predicted)
Arguments
k |
The number of elements of |
actual |
A list of vectors, where each vector represents a ground truth vector of relevant documents. In each vector, the elements can be numeric or character values, and the order of the elements does not matter. |
predicted |
A list of vectors, where each vector represents the predicted vector
of retrieved documents for the corresponding element of |
Details
mapk
evaluates apk
for each pair of elements from actual
and
predicted
.
See Also
Examples
actual <- list(c('a', 'b'), c('a'), c('x', 'y', 'b'))
predicted <- list(c('a', 'c', 'd'), c('x', 'b', 'a', 'b'), c('y'))
mapk(2, actual, predicted)
actual <- list(c(1, 5, 7, 9), c(2, 3), c(2, 5, 6))
predicted <- list(c(5, 6, 7, 8, 9), c(1, 2, 3), c(2, 4, 6, 8))
mapk(3, actual, predicted)
Mean Absolute Scaled Error
Description
mase
computes the mean absolute scaled error between two numeric
vectors. This function is only intended for time series data, where
actual
and numeric
are numeric vectors ordered by time.
Usage
mase(actual, predicted, step_size = 1)
Arguments
actual |
The ground truth numeric vector ordered in time, with most recent observation at the end of the vector. |
predicted |
The predicted numeric vector ordered in time, where each element
of the vector represents a prediction for the corresponding
element of |
step_size |
A positive integer that specifies how many observations to look back
in time in order to compute the naive forecast. The default is
However, if |
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
step_size <- 1
mase(actual, predicted, step_size)
Median Absolute Error
Description
mdae
computes the median absolute difference between two numeric vectors.
Usage
mdae(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
mdae(actual, predicted)
Mean Squared Error
Description
mse
computes the average squared difference between two numeric vectors.
Usage
mse(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
mse(actual, predicted)
Mean Squared Log Error
Description
msle
computes the average of squared log error between two numeric vectors.
Usage
msle(actual, predicted)
Arguments
actual |
The ground truth non-negative vector |
predicted |
The predicted non-negative vector, where each element in the vector
is a prediction for the corresponding element in |
Details
msle
adds one to both actual
and predicted
before taking
the natural logarithm to avoid taking the natural log of zero. As a result,
the function can be used if actual
or predicted
have zero-valued
elements. But this function is not appropriate if either are negative valued.
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
msle(actual, predicted)
Inherit Documentation for Binary Classification Metrics
Description
This object provides the documentation for the parameters of functions that provide binary classification metrics
Arguments
actual |
The ground truth binary numeric vector containing 1 for the positive class and 0 for the negative class. |
predicted |
The predicted binary numeric vector containing 1 for the positive
class and 0 for the negative class. Each element represents the
prediction for the corresponding element in |
Inherit Documentation for Classification Metrics
Description
This object provides the documentation for the parameters of functions that provide classification metrics
Arguments
actual |
The ground truth vector, where elements of the vector can be any variable type. |
predicted |
The predicted vector, where elements of the vector represent a
prediction for the corresponding value in |
Inherit Documentation for Regression Metrics
Description
This object provides the documentation for the parameters of functions that provide regression metrics
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
Percent Bias
Description
percent_bias
computes the average amount that actual
is greater
than predicted
as a percentage of the absolute value of actual
.
Usage
percent_bias(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
Details
If a model is unbiased percent_bias(actual, predicted)
should be close
to zero. Percent Bias is calculated by taking the average of
(actual
- predicted
) / abs(actual)
across all observations.
percent_bias
will give -Inf
, Inf
, or NaN
, if any
elements of actual
are 0
.
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
percent_bias(actual, predicted)
Precision
Description
precision
computes proportion of observations predicted to be in the
positive class (i.e. the element in predicted
equals 1)
that actually belong to the positive class (i.e. the element
in actual
equals 1)
Usage
precision(actual, predicted)
Arguments
actual |
The ground truth binary numeric vector containing 1 for the positive class and 0 for the negative class. |
predicted |
The predicted binary numeric vector containing 1 for the positive
class and 0 for the negative class. Each element represents the
prediction for the corresponding element in |
See Also
Examples
actual <- c(1, 1, 1, 0, 0, 0)
predicted <- c(1, 1, 1, 1, 1, 1)
precision(actual, predicted)
Relative Absolute Error
Description
rae
computes the relative absolute error between two numeric vectors.
Usage
rae(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
Details
rae
divides sum(ae(actual, predicted))
by sum(ae(actual, mean(actual)))
,
meaning that it provides the absolute error of the predictions relative to a naive model that
predicted the mean for every data point.
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
rrse(actual, predicted)
Recall
Description
recall
computes proportion of observations in the positive class
(i.e. the element in actual
equals 1) that are predicted
to be in the positive class (i.e. the element in predicted
equals 1)
Usage
recall(actual, predicted)
Arguments
actual |
The ground truth binary numeric vector containing 1 for the positive class and 0 for the negative class. |
predicted |
The predicted binary numeric vector containing 1 for the positive
class and 0 for the negative class. Each element represents the
prediction for the corresponding element in |
See Also
Examples
actual <- c(1, 1, 1, 0, 0, 0)
predicted <- c(1, 0, 1, 1, 1, 1)
recall(actual, predicted)
Root Mean Squared Error
Description
rmse
computes the root mean squared error between two numeric vectors
Usage
rmse(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
rmse(actual, predicted)
Root Mean Squared Log Error
Description
rmsle
computes the root mean squared log error between two numeric vectors.
Usage
rmsle(actual, predicted)
Arguments
actual |
The ground truth non-negative vector |
predicted |
The predicted non-negative vector, where each element in the vector
is a prediction for the corresponding element in |
Details
rmsle
adds one to both actual
and predicted
before taking
the natural logarithm to avoid taking the natural log of zero. As a result,
the function can be used if actual
or predicted
have zero-valued
elements. But this function is not appropriate if either are negative valued.
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
rmsle(actual, predicted)
Root Relative Squared Error
Description
rrse
computes the root relative squared error between two numeric vectors.
Usage
rrse(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
Details
rrse
takes the square root of sse(actual, predicted)
divided by
sse(actual, mean(actual))
, meaning that it provides the squared error of the
predictions relative to a naive model that predicted the mean for every data point.
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
rrse(actual, predicted)
Relative Squared Error
Description
rse
computes the relative squared error between two numeric vectors.
Usage
rse(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
Details
rse
divides sse(actual, predicted)
by sse(actual, mean(actual))
,
meaning that it provides the squared error of the predictions relative to a naive model that
predicted the mean for every data point.
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
rse(actual, predicted)
Squared Error
Description
se
computes the elementwise squared difference between two numeric vectors.
Usage
se(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
se(actual, predicted)
Squared Log Error
Description
sle
computes the elementwise squares of the differences in the logs of two numeric vectors.
Usage
sle(actual, predicted)
Arguments
actual |
The ground truth non-negative vector |
predicted |
The predicted non-negative vector, where each element in the vector
is a prediction for the corresponding element in |
Details
sle
adds one to both actual
and predicted
before taking
the natural logarithm of each to avoid taking the natural log of zero. As a result,
the function can be used if actual
or predicted
have zero-valued
elements. But this function is not appropriate if either are negative valued.
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
sle(actual, predicted)
Symmetric Mean Absolute Percentage Error
Description
smape
computes the symmetric mean absolute percentage error between
two numeric vectors.
Usage
smape(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
Details
smape
is defined as two times the average of abs(actual - predicted) / (abs(actual) + abs(predicted))
.
Therefore, at the elementwise level, it will provide NaN
only if actual
and predicted
are both zero. It has an upper bound of 2
, when either actual
or
predicted
are zero or when actual
and predicted
are opposite
signs.
smape
is symmetric in the sense that smape(x, y) = smape(y, x)
.
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
smape(actual, predicted)
Sum of Squared Errors
Description
sse
computes the sum of the squared differences between two numeric vectors.
Usage
sse(actual, predicted)
Arguments
actual |
The ground truth numeric vector. |
predicted |
The predicted numeric vector, where each element in the vector
is a prediction for the corresponding element in |
See Also
Examples
actual <- c(1.1, 1.9, 3.0, 4.4, 5.0, 5.6)
predicted <- c(0.9, 1.8, 2.5, 4.5, 5.0, 6.2)
sse(actual, predicted)