If you are not only using R in the interactive console for ad-hoc data analysis, but running eg batch jobs (for ETL, reporting, modeling, forecasting etc) as well, then logging the status(changes) of your script is a must so that later on you can review / debug what have happened.
For most cases, it’s enough to load the package and use the functions
with the log
prefix to log important and not so important
messages, for example:
library(logger)
log_info("Loading data")
#> INFO [2024-10-21 00:07:14] Loading data
data(mtcars)
log_info("The dataset includes {nrow(mtcars)} rows")
#> INFO [2024-10-21 00:07:14] The dataset includes 32 rows
if (max(mtcars$hp) < 1000) {
log_warn("Oh, no! There are no cars with more than 1K horsepower in the dataset :/")
log_debug("The most powerful car is {rownames(mtcars)[which.max(mtcars$hp)]} with {max(mtcars$hp)} hp")
}
#> WARN [2024-10-21 00:07:14] Oh, no! There are no cars with more than 1K horsepower in the dataset :/
Interestingly, the most powerful car was not being logged – because
by default the logger
prints messages with at least the
INFO
log level:
To change that, specify the new log level threshold, eg
TRACE
to log everything:
The rerunning the above code chunk:
log_info("Loading data")
#> INFO [2024-10-21 00:07:14] Loading data
data(mtcars)
log_info("The dataset includes {nrow(mtcars)} rows")
#> INFO [2024-10-21 00:07:14] The dataset includes 32 rows
if (max(mtcars$hp) < 1000) {
log_warn("Oh, no! There are no cars with more than 1K horsepower in the dataset :/")
log_debug("The most powerful car is {rownames(mtcars)[which.max(mtcars$hp)]} with {max(mtcars$hp)} hp")
}
#> WARN [2024-10-21 00:07:14] Oh, no! There are no cars with more than 1K horsepower in the dataset :/
#> DEBUG [2024-10-21 00:07:14] The most powerful car is Maserati Bora with 335 hp
You may also find the ?log_eval
function useful to log
both an R expression and its result in the same log record:
f <- sqrt
g <- mean
x <- 1:31
log_eval(y <- f(g(x)), level = INFO)
#> INFO [2024-10-21 00:07:14] 'y <- f(g(x))' => '4'
str(y)
#> num 4
Sometimes, it may be reasonable to log R objects as markdown, e.g. a
smallish data.frame
or data.table
,
e.g. mtcars
or iris
. Calling the formatter
using pander
instead of glue
can help:
log_formatter(formatter_pander)
log_info(head(iris))
#> INFO [2024-10-21 00:07:15]
#> INFO [2024-10-21 00:07:15]
#> INFO [2024-10-21 00:07:15] Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> INFO [2024-10-21 00:07:15] -------------- ------------- -------------- ------------- ---------
#> INFO [2024-10-21 00:07:15] 5.1 3.5 1.4 0.2 setosa
#> INFO [2024-10-21 00:07:15] 4.9 3 1.4 0.2 setosa
#> INFO [2024-10-21 00:07:15] 4.7 3.2 1.3 0.2 setosa
#> INFO [2024-10-21 00:07:15] 4.6 3.1 1.5 0.2 setosa
#> INFO [2024-10-21 00:07:15] 5 3.6 1.4 0.2 setosa
#> INFO [2024-10-21 00:07:15] 5.4 3.9 1.7 0.4 setosa
#> INFO [2024-10-21 00:07:15]
For more details, check the function reference in the manual, or start with the The Anatomy of a Log Request and Customizing the Format and the Destination of a Log Record vignettes.