Caching of function calls, code chunks, and other elements of a project
is a critical component of a reproducible workflow. The objective of a
reproducible workflow is that the entire workflow, from raw data to
publication, decision support, report writing, presentation building,
etc., can be built and reproduced anywhere, on any computer and
operating system, with any starting conditions, on demand. The
reproducible::Cache function is built to work with any R function.
Cache uses DBI as a backend, with key functions dbReadTable,
dbRemoveTable, dbSendQuery, dbSendStatement, dbCreateTable and
dbAppendTable. These can all be accessed via Cache, showCache,
clearCache, and keepCache. It is optimized for speed of transactions,
using digest::digest on objects and files. The main function is
superficially similar to archivist::cache, which uses digest::digest in
all cases to determine whether the arguments are identical in subsequent
iterations. However, Cache also handles many cases in which standard
caching with digest::digest does not work reliably between systems. For
these, the function .robustDigest is introduced to make caching
transferable between systems. This is relevant for file paths,
environments, parallel clusters, functions (which are contained within
an environment), and many others (e.g., see ?.robustDigest for methods).
Cache also adds important elements like automated tagging and the option
to retrieve disk-cached values via stashed objects in memory using
memoise::memoise. This means that running Cache 1, 2, and 3 times on the
same function will get progressively faster. This can be extremely
useful for web apps built with, say, shiny.
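To make the role of digest::digest concrete, here is a minimal sketch of hash-keyed caching. It is not how Cache is implemented: toyCache and store are hypothetical names, the store lives only in memory, and none of the .robustDigest handling, tagging, or memoising described above is included.

```r
# Toy illustration only: `toyCache` and `store` are hypothetical names,
# not part of the reproducible package.
store <- new.env()  # in-memory stand-in for an on-disk cache

toyCache <- function(FUN, ...) {
  # The key is a hash of the function's source and its evaluated arguments
  key <- digest::digest(list(deparse(FUN), list(...)), algo = "xxhash64")
  if (!is.null(store[[key]])) {
    return(store[[key]])  # cache hit: skip the computation
  }
  result <- FUN(...)
  assign(key, result, envir = store)  # cache miss: compute, then remember
  result
}

a <- toyCache(rnorm, 5, mean = 10)
b <- toyCache(rnorm, 5, mean = 10)  # same inputs: returned from the store
d <- toyCache(rnorm, 5, mean = 11)  # different input: computed afresh
identical(a, b)
```

Because the key depends only on the inputs, the second call returns the stored random draw rather than a new one, which is the behaviour the examples below rely on.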
Any function can be cached by wrapping Cache around the function call,
or by using the base pipe |>. This requires only a slight change to a
function call, such as changing:
terra::project(raster, crs = terra::crs(newRaster))
to:
Cache(terra::project(raster, crs = terra::crs(newRaster)))
or, with the pipe, which may be more convenient as it is easy to add and
remove caching in the code base:
terra::project(raster, crs = terra::crs(newRaster)) |> Cache()
This is particularly useful for expensive operations.
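The two spellings are interchangeable because the base pipe (R 4.1 or later) is rewritten by the parser into an ordinary nested call. The short check below uses only base R; quote() captures each expression without evaluating it, so neither terra nor reproducible needs to be loaded and nothing is run or cached.

```r
# Capture both spellings as unevaluated calls
piped  <- quote(terra::project(raster, crs = terra::crs(newRaster)) |> Cache())
nested <- quote(Cache(terra::project(raster, crs = terra::crs(newRaster))))

# The parser turns `x |> f()` into `f(x)`, so the two calls are the same object
identical(piped, nested)
```

This should return TRUE, which is why adding or removing `|> Cache()` at the end of a line does not otherwise change the call.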
##
## Attaching package: 'data.table'
## The following object is masked from 'package:terra':
##
## shift
tmpDir <- file.path(tempfile(), "reproducible_examples", "Cache")
dir.create(tmpDir, recursive = TRUE)
# Source raster with a complete LCC definition
ras <- terra::rast(terra::ext(0, 300, 0, 300), vals = 1:9e4, res = 1)
terra::crs(ras) <- "+proj=lcc +lat_1=60 +lat_2=70 +lat_0=50 +lon_0=-100 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs"
# Target CRS in PROJ form (no EPSG lookup)
newCRS <- "+proj=longlat +datum=WGS84 +no_defs"
# Derive target extent from source extent (no registry lookup)
target_ext <- terra::project(terra::ext(ras), from = terra::crs(ras), to = newCRS)
# Build template with chosen resolution; assign CRS
tmplate <- terra::rast(target_ext, resolution = 0.00001)
terra::crs(tmplate) <- newCRS
# No Cache
system.time(map1 <- terra::project(ras, tmplate, method = "near"))
## user system elapsed
## 0.03 0.00 0.05
# Try with memoise for this example -- for many simple cases, memoising will not be faster
opts <- options("reproducible.useMemoise" = TRUE)
# With Cache -- a little slower the first time because saving to disk
system.time({
suppressWarnings({
map1 <- terra::project(ras, tmplate, method = "near") |>
Cache(cachePath = tmpDir)
})
})
## Saved! Cache file: b080b61ea3444bd7.rds; fn: project (and added a memoised copy)
## user system elapsed
## 0.21 0.03 0.79
# faster the second time; improvement depends on size of object and time to run function
system.time({
map2 <- terra::project(ras, tmplate, method = "near") |>
Cache(cachePath = tmpDir)
})
## Warning: [readValues] raster has no values
## Object to retrieve (fn: project, b080b61ea3444bd7.rds) ...
## Loaded! Memoised result from previous project call
## user system elapsed
## 0.11 0.00 0.14
## [1] "Attributes: < Component \".Cache\": Component \"newCache\": 1 element mismatch >"
try(clearCache(tmpDir, ask = FALSE), silent = TRUE) # just to make sure it is clear
ranNumsA <- rnorm(10, 16) |> Cache(cachePath = tmpDir)
## Saved! Cache file: aa549dd751b2f26d.rds; fn: rnorm
## Object to retrieve (fn: rnorm, aa549dd751b2f26d.rds) ...
## Loaded! Cached result from previous rnorm call
## No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
## this will not persist across R sessions.
## Saved! Cache file: aa549dd751b2f26d.rds; fn: rnorm
## Saved! Cache file: 4629054c73f1eaab.rds; fn: Cache
## No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
## this will not persist across R sessions.
## Object to retrieve (fn: rnorm, aa549dd751b2f26d.rds) ...
## Loaded! Cached result from previous rnorm call
## Saved! Cache file: 9850a44f9407dcc5.rds; fn: Cache
## Object to retrieve (fn: rnorm, aa549dd751b2f26d.rds) ...
## Loaded! Cached result from previous rnorm call
# Any minor change makes it different
ranNumsE <- rnorm(10, 6) |> Cache(cachePath = tmpDir) # different
## Saved! Cache file: d78b46a2a76d6d80.rds; fn: rnorm
## Saved! Cache file: adf21923cd1e50d0.rds; fn: rnorm
## Saved! Cache file: e23cab430872a0ea.rds; fn: runif
# access it again, from Cache
Sys.sleep(1)
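ranNumsA <- rnorm(4) |> Cache(cachePath = tmpDir, userTags = "objectName:a")
## Object to retrieve (fn: rnorm, adf21923cd1e50d0.rds) ...
## Loaded! Cached result from previous rnorm call
# list every entry currently in the cache (used in the steps below)
wholeCache <- showCache(tmpDir)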
## Cache size:
## Total (including Rasters): 40 bytes
## Selected objects (not including Rasters): 40 bytes
# keep only items accessed "recently" (i.e., only objectName:a)
onlyRecentlyAccessed <- showCache(tmpDir, userTags = max(wholeCache[tagKey == "accessed"]$tagValue))
## Cache size:
## Total (including Rasters): 40 bytes
## Selected objects (not including Rasters): 40 bytes
# inverse join with 2 data.tables ... using: a[!b]
# i.e., return all of wholeCache that was not recently accessed
# Note: the two different ways to access -- old way with "artifact" will be deprecated
toRemove <- unique(wholeCache[!onlyRecentlyAccessed, on = "cacheId"], by = "cacheId")$cacheId
clearCache(tmpDir, toRemove, ask = FALSE) # remove ones not recently accessed
## Cache size:
## Total (including Rasters): 40 bytes
## Selected objects (not including Rasters): 40 bytes
## Empty data.table (0 rows and 4 cols): cacheId,tagKey,tagValue,createdDate
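The inverse (anti) join used above, a[!b, on = ...], can be seen in isolation on two small tables. This is a sketch with made-up data: allEntries and recentEntries are hypothetical stand-ins for the tables returned by showCache, and the cacheId values are invented.

```r
library(data.table)

# Hypothetical stand-ins for showCache() results (long format: one row per tag)
allEntries <- data.table(
  cacheId = c("id1", "id1", "id2", "id3"),
  tagKey  = c("function", "accessed", "function", "function")
)
recentEntries <- data.table(cacheId = "id1")

# Anti-join: keep rows of allEntries whose cacheId has no match in recentEntries
notRecent <- allEntries[!recentEntries, on = "cacheId"]

# One id per cache entry, ready to be passed to a removal step
toRemove <- unique(notRecent$cacheId)
toRemove
```

Since the cache table holds several rows per entry, the unique() step matters: it reduces the joined rows to one cacheId per entry.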
keepCache does the same as the previous example, but more
simply.
## Saved! Cache file: adf21923cd1e50d0.rds; fn: rnorm
## No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
## this will not persist across R sessions.
## Saved! Cache file: e23cab430872a0ea.rds; fn: runif
## Saved! Cache file: 6830acdefaf58da9.rds; fn: Cache
# keep only those cached items from the last 24 hours
oneDay <- 60 * 60 * 24
keepCache(tmpDir, after = Sys.time() - oneDay, ask = FALSE)
## Nothing to remove; keeping all
## cacheId tagKey tagValue
## <char> <char> <char>
## 1: 6830acdefaf58da9 function Cache
## 2: 6830acdefaf58da9 objectName b
## 3: 6830acdefaf58da9 accessed 2026-01-07 21:16:49.52534
## 4: 6830acdefaf58da9 inCloud FALSE
## 5: 6830acdefaf58da9 elapsedTimeDigest 0.004920959 secs
## 6: 6830acdefaf58da9 preDigest FUN:97eff0b205c774d7
## 7: 6830acdefaf58da9 preDigest cacheSaveFormat:cf2828ea967d53e7
## 8: 6830acdefaf58da9 preDigest dryRun:e9aac936a0e8f6ae
## 9: 6830acdefaf58da9 preDigest .FUN:ab4b977119e40b21
## 10: 6830acdefaf58da9 preDigest .cacheChaining:71681d621365dfd7
## 11: 6830acdefaf58da9 preDigest .cacheExtra:c85d88fc56f4e042
## 12: 6830acdefaf58da9 preDigest .functionName:c85d88fc56f4e042
## 13: 6830acdefaf58da9 preDigest conn:118387d5d48f757d
## 14: 6830acdefaf58da9 preDigest drv:9ce9a83896bf68a1
## 15: 6830acdefaf58da9 class numeric
## 16: 6830acdefaf58da9 object.size 1008
## 17: 6830acdefaf58da9 fromDisk FALSE
## 18: 6830acdefaf58da9 resultHash
## 19: 6830acdefaf58da9 elapsedTimeFirstRun 0.01528692 secs
## 20: adf21923cd1e50d0 function rnorm
## 21: adf21923cd1e50d0 objectName a
## 22: adf21923cd1e50d0 accessed 2026-01-07 21:16:49.50805
## 23: adf21923cd1e50d0 inCloud FALSE
## 24: adf21923cd1e50d0 elapsedTimeDigest 0.003265858 secs
## 25: adf21923cd1e50d0 preDigest .FUN:4f604aa46882b368
## 26: adf21923cd1e50d0 preDigest mean:c40c00762a0dac94
## 27: adf21923cd1e50d0 preDigest n:7eef4eae85fd9229
## 28: adf21923cd1e50d0 preDigest sd:853b1797f54b229c
## 29: adf21923cd1e50d0 class numeric
## 30: adf21923cd1e50d0 object.size 80
## 31: adf21923cd1e50d0 fromDisk FALSE
## 32: adf21923cd1e50d0 resultHash
## 33: adf21923cd1e50d0 elapsedTimeFirstRun 0.001113176 secs
## cacheId tagKey tagValue
## <char> <char> <char>
## createdDate
## <char>
## 1: 2026-01-07 21:16:49.541876
## 2: 2026-01-07 21:16:49.541876
## 3: 2026-01-07 21:16:49.541876
## 4: 2026-01-07 21:16:49.541876
## 5: 2026-01-07 21:16:49.541876
## 6: 2026-01-07 21:16:49.541876
## 7: 2026-01-07 21:16:49.541876
## 8: 2026-01-07 21:16:49.541876
## 9: 2026-01-07 21:16:49.541876
## 10: 2026-01-07 21:16:49.541876
## 11: 2026-01-07 21:16:49.541876
## 12: 2026-01-07 21:16:49.541876
## 13: 2026-01-07 21:16:49.541876
## 14: 2026-01-07 21:16:49.541876
## 15: 2026-01-07 21:16:49.541876
## 16: 2026-01-07 21:16:49.541876
## 17: 2026-01-07 21:16:49.541876
## 18: 2026-01-07 21:16:49.541876
## 19: 2026-01-07 21:16:49.541876
## 20: 2026-01-07 21:16:49.509952
## 21: 2026-01-07 21:16:49.509952
## 22: 2026-01-07 21:16:49.509952
## 23: 2026-01-07 21:16:49.509952
## 24: 2026-01-07 21:16:49.509952
## 25: 2026-01-07 21:16:49.509952
## 26: 2026-01-07 21:16:49.509952
## 27: 2026-01-07 21:16:49.509952
## 28: 2026-01-07 21:16:49.509952
## 29: 2026-01-07 21:16:49.509952
## 30: 2026-01-07 21:16:49.509952
## 31: 2026-01-07 21:16:49.509952
## 32: 2026-01-07 21:16:49.509952
## 33: 2026-01-07 21:16:49.509952
## createdDate
## <char>
# Keep all Cache items created with an rnorm() call
keepCache(tmpDir, userTags = "rnorm", ask = FALSE)
## cacheId tagKey tagValue
## <char> <char> <char>
## 1: adf21923cd1e50d0 function rnorm
## 2: adf21923cd1e50d0 objectName a
## 3: adf21923cd1e50d0 accessed 2026-01-07 21:16:49.50805
## 4: adf21923cd1e50d0 inCloud FALSE
## 5: adf21923cd1e50d0 elapsedTimeDigest 0.003265858 secs
## 6: adf21923cd1e50d0 preDigest .FUN:4f604aa46882b368
## 7: adf21923cd1e50d0 preDigest mean:c40c00762a0dac94
## 8: adf21923cd1e50d0 preDigest n:7eef4eae85fd9229
## 9: adf21923cd1e50d0 preDigest sd:853b1797f54b229c
## 10: adf21923cd1e50d0 class numeric
## 11: adf21923cd1e50d0 object.size 80
## 12: adf21923cd1e50d0 fromDisk FALSE
## 13: adf21923cd1e50d0 resultHash
## 14: adf21923cd1e50d0 elapsedTimeFirstRun 0.001113176 secs
## createdDate
## <char>
## 1: 2026-01-07 21:16:49.509952
## 2: 2026-01-07 21:16:49.509952
## 3: 2026-01-07 21:16:49.509952
## 4: 2026-01-07 21:16:49.509952
## 5: 2026-01-07 21:16:49.509952
## 6: 2026-01-07 21:16:49.509952
## 7: 2026-01-07 21:16:49.509952
## 8: 2026-01-07 21:16:49.509952
## 9: 2026-01-07 21:16:49.509952
## 10: 2026-01-07 21:16:49.509952
## 11: 2026-01-07 21:16:49.509952
## 12: 2026-01-07 21:16:49.509952
## 13: 2026-01-07 21:16:49.509952
## 14: 2026-01-07 21:16:49.509952
## Cache size:
## Total (including Rasters): 20 bytes
## Selected objects (not including Rasters): 20 bytes
## cacheId tagKey tagValue
## <char> <char> <char>
## 1: adf21923cd1e50d0 function rnorm
## 2: adf21923cd1e50d0 objectName a
## 3: adf21923cd1e50d0 accessed 2026-01-07 21:16:49.50805
## 4: adf21923cd1e50d0 inCloud FALSE
## 5: adf21923cd1e50d0 elapsedTimeDigest 0.003265858 secs
## 6: adf21923cd1e50d0 preDigest .FUN:4f604aa46882b368
## 7: adf21923cd1e50d0 preDigest mean:c40c00762a0dac94
## 8: adf21923cd1e50d0 preDigest n:7eef4eae85fd9229
## 9: adf21923cd1e50d0 preDigest sd:853b1797f54b229c
## 10: adf21923cd1e50d0 class numeric
## 11: adf21923cd1e50d0 object.size 80
## 12: adf21923cd1e50d0 fromDisk FALSE
## 13: adf21923cd1e50d0 resultHash
## 14: adf21923cd1e50d0 elapsedTimeFirstRun 0.001113176 secs
## createdDate
## <char>
## 1: 2026-01-07 21:16:49.509952
## 2: 2026-01-07 21:16:49.509952
## 3: 2026-01-07 21:16:49.509952
## 4: 2026-01-07 21:16:49.509952
## 5: 2026-01-07 21:16:49.509952
## 6: 2026-01-07 21:16:49.509952
## 7: 2026-01-07 21:16:49.509952
## 8: 2026-01-07 21:16:49.509952
## 9: 2026-01-07 21:16:49.509952
## 10: 2026-01-07 21:16:49.509952
## 11: 2026-01-07 21:16:49.509952
## 12: 2026-01-07 21:16:49.509952
## 13: 2026-01-07 21:16:49.509952
## 14: 2026-01-07 21:16:49.509952
# Remove all Cache items that happened within a rnorm() call
clearCache(tmpDir, userTags = "rnorm", ask = FALSE)
## Cache size:
## Total (including Rasters): 20 bytes
## Selected objects (not including Rasters): 20 bytes
## Empty data.table (0 rows and 4 cols): cacheId,tagKey,tagValue,createdDate
# Also, can set a time before caching happens and remove based on this
# --> a useful, simple way to control Cache
ranNumsA <- rnorm(4) |> Cache(cachePath = tmpDir, userTags = "objectName:a")
## Saved! Cache file: adf21923cd1e50d0.rds; fn: rnorm
startTime <- Sys.time()
Sys.sleep(1)
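ranNumsB <- rnorm(5) |> Cache(cachePath = tmpDir, userTags = "objectName:b")
## Saved! Cache file: 438a3028a4570cf9.rds; fn: rnorm
# keep only the items created after startTime (i.e., only objectName:b)
keepCache(tmpDir, after = startTime, ask = FALSE)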
## cacheId tagKey tagValue
## <char> <char> <char>
## 1: 438a3028a4570cf9 function rnorm
## 2: 438a3028a4570cf9 objectName b
## 3: 438a3028a4570cf9 accessed 2026-01-07 21:16:50.70756
## 4: 438a3028a4570cf9 inCloud FALSE
## 5: 438a3028a4570cf9 elapsedTimeDigest 0.003247023 secs
## 6: 438a3028a4570cf9 preDigest .FUN:4f604aa46882b368
## 7: 438a3028a4570cf9 preDigest mean:c40c00762a0dac94
## 8: 438a3028a4570cf9 preDigest n:a4f076b3db622faf
## 9: 438a3028a4570cf9 preDigest sd:853b1797f54b229c
## 10: 438a3028a4570cf9 class numeric
## 11: 438a3028a4570cf9 object.size 96
## 12: 438a3028a4570cf9 fromDisk FALSE
## 13: 438a3028a4570cf9 resultHash
## 14: 438a3028a4570cf9 elapsedTimeFirstRun 0.002373934 secs
## createdDate
## <char>
## 1: 2026-01-07 21:16:50.710372
## 2: 2026-01-07 21:16:50.710372
## 3: 2026-01-07 21:16:50.710372
## 4: 2026-01-07 21:16:50.710372
## 5: 2026-01-07 21:16:50.710372
## 6: 2026-01-07 21:16:50.710372
## 7: 2026-01-07 21:16:50.710372
## 8: 2026-01-07 21:16:50.710372
## 9: 2026-01-07 21:16:50.710372
## 10: 2026-01-07 21:16:50.710372
## 11: 2026-01-07 21:16:50.710372
## 12: 2026-01-07 21:16:50.710372
## 13: 2026-01-07 21:16:50.710372
## 14: 2026-01-07 21:16:50.710372
# default userTags is "and" matching; for "or" matching use |
ranNumsA <- runif(4) |> Cache(cachePath = tmpDir, userTags = "objectName:a")
## Saved! Cache file: e23cab430872a0ea.rds; fn: runif
## Saved! Cache file: adf21923cd1e50d0.rds; fn: rnorm
## Cache size:
## Total (including Rasters): 40 bytes
## Selected objects (not including Rasters): 40 bytes
## cacheId tagKey tagValue
## <char> <char> <char>
## 1: adf21923cd1e50d0 function rnorm
## 2: adf21923cd1e50d0 objectName b
## 3: adf21923cd1e50d0 accessed 2026-01-07 21:16:50.85068
## 4: adf21923cd1e50d0 inCloud FALSE
## 5: adf21923cd1e50d0 elapsedTimeDigest 0.003268957 secs
## 6: adf21923cd1e50d0 preDigest .FUN:4f604aa46882b368
## 7: adf21923cd1e50d0 preDigest mean:c40c00762a0dac94
## 8: adf21923cd1e50d0 preDigest n:7eef4eae85fd9229
## 9: adf21923cd1e50d0 preDigest sd:853b1797f54b229c
## 10: adf21923cd1e50d0 class numeric
## 11: adf21923cd1e50d0 object.size 80
## 12: adf21923cd1e50d0 fromDisk FALSE
## 13: adf21923cd1e50d0 resultHash
## 14: adf21923cd1e50d0 elapsedTimeFirstRun 0.001499891 secs
## 15: e23cab430872a0ea function runif
## 16: e23cab430872a0ea objectName a
## 17: e23cab430872a0ea accessed 2026-01-07 21:16:50.83370
## 18: e23cab430872a0ea inCloud FALSE
## 19: e23cab430872a0ea elapsedTimeDigest 0.002732992 secs
## 20: e23cab430872a0ea preDigest .FUN:881ec847b7161f3c
## 21: e23cab430872a0ea preDigest max:853b1797f54b229c
## 22: e23cab430872a0ea preDigest min:c40c00762a0dac94
## 23: e23cab430872a0ea preDigest n:7eef4eae85fd9229
## 24: e23cab430872a0ea class numeric
## 25: e23cab430872a0ea object.size 80
## 26: e23cab430872a0ea fromDisk FALSE
## 27: e23cab430872a0ea resultHash
## 28: e23cab430872a0ea elapsedTimeFirstRun 0.001397133 secs
## cacheId tagKey tagValue
## <char> <char> <char>
## createdDate
## <char>
## 1: 2026-01-07 21:16:50.853539
## 2: 2026-01-07 21:16:50.853539
## 3: 2026-01-07 21:16:50.853539
## 4: 2026-01-07 21:16:50.853539
## 5: 2026-01-07 21:16:50.853539
## 6: 2026-01-07 21:16:50.853539
## 7: 2026-01-07 21:16:50.853539
## 8: 2026-01-07 21:16:50.853539
## 9: 2026-01-07 21:16:50.853539
## 10: 2026-01-07 21:16:50.853539
## 11: 2026-01-07 21:16:50.853539
## 12: 2026-01-07 21:16:50.853539
## 13: 2026-01-07 21:16:50.853539
## 14: 2026-01-07 21:16:50.853539
## 15: 2026-01-07 21:16:50.836402
## 16: 2026-01-07 21:16:50.836402
## 17: 2026-01-07 21:16:50.836402
## 18: 2026-01-07 21:16:50.836402
## 19: 2026-01-07 21:16:50.836402
## 20: 2026-01-07 21:16:50.836402
## 21: 2026-01-07 21:16:50.836402
## 22: 2026-01-07 21:16:50.836402
## 23: 2026-01-07 21:16:50.836402
## 24: 2026-01-07 21:16:50.836402
## 25: 2026-01-07 21:16:50.836402
## 26: 2026-01-07 21:16:50.836402
## 27: 2026-01-07 21:16:50.836402
## 28: 2026-01-07 21:16:50.836402
## createdDate
## <char>
# show objects that are both runif and rnorm
# (i.e., none in this case, because objects are either one or the other, not both)
showCache(tmpDir, userTags = c("runif", "rnorm")) ## empty
## Cache size:
## Total (including Rasters): 0 bytes
## Selected objects (not including Rasters): 0 bytes
## Empty data.table (0 rows and 4 cols): cacheId,tagKey,tagValue,createdDate
# show objects that are either runif or rnorm ("or" search)
showCache(tmpDir, userTags = "runif|rnorm")
## Cache size:
## Total (including Rasters): 40 bytes
## Selected objects (not including Rasters): 40 bytes
## cacheId tagKey tagValue
## <char> <char> <char>
## 1: adf21923cd1e50d0 function rnorm
## 2: adf21923cd1e50d0 objectName b
## 3: adf21923cd1e50d0 accessed 2026-01-07 21:16:50.85068
## 4: adf21923cd1e50d0 inCloud FALSE
## 5: adf21923cd1e50d0 elapsedTimeDigest 0.003268957 secs
## 6: adf21923cd1e50d0 preDigest .FUN:4f604aa46882b368
## 7: adf21923cd1e50d0 preDigest mean:c40c00762a0dac94
## 8: adf21923cd1e50d0 preDigest n:7eef4eae85fd9229
## 9: adf21923cd1e50d0 preDigest sd:853b1797f54b229c
## 10: adf21923cd1e50d0 class numeric
## 11: adf21923cd1e50d0 object.size 80
## 12: adf21923cd1e50d0 fromDisk FALSE
## 13: adf21923cd1e50d0 resultHash
## 14: adf21923cd1e50d0 elapsedTimeFirstRun 0.001499891 secs
## 15: e23cab430872a0ea function runif
## 16: e23cab430872a0ea objectName a
## 17: e23cab430872a0ea accessed 2026-01-07 21:16:50.83370
## 18: e23cab430872a0ea inCloud FALSE
## 19: e23cab430872a0ea elapsedTimeDigest 0.002732992 secs
## 20: e23cab430872a0ea preDigest .FUN:881ec847b7161f3c
## 21: e23cab430872a0ea preDigest max:853b1797f54b229c
## 22: e23cab430872a0ea preDigest min:c40c00762a0dac94
## 23: e23cab430872a0ea preDigest n:7eef4eae85fd9229
## 24: e23cab430872a0ea class numeric
## 25: e23cab430872a0ea object.size 80
## 26: e23cab430872a0ea fromDisk FALSE
## 27: e23cab430872a0ea resultHash
## 28: e23cab430872a0ea elapsedTimeFirstRun 0.001397133 secs
## cacheId tagKey tagValue
## <char> <char> <char>
## createdDate
## <char>
## 1: 2026-01-07 21:16:50.853539
## 2: 2026-01-07 21:16:50.853539
## 3: 2026-01-07 21:16:50.853539
## 4: 2026-01-07 21:16:50.853539
## 5: 2026-01-07 21:16:50.853539
## 6: 2026-01-07 21:16:50.853539
## 7: 2026-01-07 21:16:50.853539
## 8: 2026-01-07 21:16:50.853539
## 9: 2026-01-07 21:16:50.853539
## 10: 2026-01-07 21:16:50.853539
## 11: 2026-01-07 21:16:50.853539
## 12: 2026-01-07 21:16:50.853539
## 13: 2026-01-07 21:16:50.853539
## 14: 2026-01-07 21:16:50.853539
## 15: 2026-01-07 21:16:50.836402
## 16: 2026-01-07 21:16:50.836402
## 17: 2026-01-07 21:16:50.836402
## 18: 2026-01-07 21:16:50.836402
## 19: 2026-01-07 21:16:50.836402
## 20: 2026-01-07 21:16:50.836402
## 21: 2026-01-07 21:16:50.836402
## 22: 2026-01-07 21:16:50.836402
## 23: 2026-01-07 21:16:50.836402
## 24: 2026-01-07 21:16:50.836402
## 25: 2026-01-07 21:16:50.836402
## 26: 2026-01-07 21:16:50.836402
## 27: 2026-01-07 21:16:50.836402
## 28: 2026-01-07 21:16:50.836402
## createdDate
## <char>
# keep only objects that are either runif or rnorm ("or" search)
keepCache(tmpDir, userTags = "runif|rnorm", ask = FALSE)
## Nothing to remove; keeping all
## cacheId tagKey tagValue
## <char> <char> <char>
## 1: adf21923cd1e50d0 function rnorm
## 2: adf21923cd1e50d0 objectName b
## 3: adf21923cd1e50d0 accessed 2026-01-07 21:16:50.85068
## 4: adf21923cd1e50d0 inCloud FALSE
## 5: adf21923cd1e50d0 elapsedTimeDigest 0.003268957 secs
## 6: adf21923cd1e50d0 preDigest .FUN:4f604aa46882b368
## 7: adf21923cd1e50d0 preDigest mean:c40c00762a0dac94
## 8: adf21923cd1e50d0 preDigest n:7eef4eae85fd9229
## 9: adf21923cd1e50d0 preDigest sd:853b1797f54b229c
## 10: adf21923cd1e50d0 class numeric
## 11: adf21923cd1e50d0 object.size 80
## 12: adf21923cd1e50d0 fromDisk FALSE
## 13: adf21923cd1e50d0 resultHash
## 14: adf21923cd1e50d0 elapsedTimeFirstRun 0.001499891 secs
## 15: e23cab430872a0ea function runif
## 16: e23cab430872a0ea objectName a
## 17: e23cab430872a0ea accessed 2026-01-07 21:16:50.83370
## 18: e23cab430872a0ea inCloud FALSE
## 19: e23cab430872a0ea elapsedTimeDigest 0.002732992 secs
## 20: e23cab430872a0ea preDigest .FUN:881ec847b7161f3c
## 21: e23cab430872a0ea preDigest max:853b1797f54b229c
## 22: e23cab430872a0ea preDigest min:c40c00762a0dac94
## 23: e23cab430872a0ea preDigest n:7eef4eae85fd9229
## 24: e23cab430872a0ea class numeric
## 25: e23cab430872a0ea object.size 80
## 26: e23cab430872a0ea fromDisk FALSE
## 27: e23cab430872a0ea resultHash
## 28: e23cab430872a0ea elapsedTimeFirstRun 0.001397133 secs
## cacheId tagKey tagValue
## <char> <char> <char>
## createdDate
## <char>
## 1: 2026-01-07 21:16:50.853539
## 2: 2026-01-07 21:16:50.853539
## 3: 2026-01-07 21:16:50.853539
## 4: 2026-01-07 21:16:50.853539
## 5: 2026-01-07 21:16:50.853539
## 6: 2026-01-07 21:16:50.853539
## 7: 2026-01-07 21:16:50.853539
## 8: 2026-01-07 21:16:50.853539
## 9: 2026-01-07 21:16:50.853539
## 10: 2026-01-07 21:16:50.853539
## 11: 2026-01-07 21:16:50.853539
## 12: 2026-01-07 21:16:50.853539
## 13: 2026-01-07 21:16:50.853539
## 14: 2026-01-07 21:16:50.853539
## 15: 2026-01-07 21:16:50.836402
## 16: 2026-01-07 21:16:50.836402
## 17: 2026-01-07 21:16:50.836402
## 18: 2026-01-07 21:16:50.836402
## 19: 2026-01-07 21:16:50.836402
## 20: 2026-01-07 21:16:50.836402
## 21: 2026-01-07 21:16:50.836402
## 22: 2026-01-07 21:16:50.836402
## 23: 2026-01-07 21:16:50.836402
## 24: 2026-01-07 21:16:50.836402
## 25: 2026-01-07 21:16:50.836402
## 26: 2026-01-07 21:16:50.836402
## 27: 2026-01-07 21:16:50.836402
## 28: 2026-01-07 21:16:50.836402
## createdDate
## <char>
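The difference between the "and" and "or" searches shown above can be sketched in base R. This only illustrates the matching semantics and is not the package's implementation: the tags list, the entry names, and the matchAll helper are hypothetical, and treating each userTags element as a regular expression is an assumption suggested by the "runif|rnorm" form.

```r
# Hypothetical tag values for two cache entries (illustrative names only)
tags <- list(
  entryA = c("rnorm", "objectName:b"),
  entryB = c("runif", "objectName:a")
)

# "and" matching: every pattern must match at least one tag of the entry
matchAll <- function(entryTags, patterns) {
  all(vapply(patterns, function(p) any(grepl(p, entryTags)), logical(1)))
}

# Two separate patterns: an entry must satisfy both, so nothing matches
andHits <- names(tags)[vapply(tags, matchAll, logical(1),
                              patterns = c("runif", "rnorm"))]

# One pattern with "|": an entry needs to satisfy either alternative
orHits <- names(tags)[vapply(tags, matchAll, logical(1),
                             patterns = "runif|rnorm")]

andHits
orHits
```

Under these assumptions, the vector form selects no entries while the single "|" pattern selects both, mirroring the empty and non-empty showCache results above.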
ras <- terra::rast(terra::ext(0, 5, 0, 5),
res = 1,
vals = sample(1:5, replace = TRUE, size = 25),
crs = "+proj=lcc +lat_1=48 +lat_2=33 +lon_0=-100 +ellps=WGS84"
)
rasCRS <- terra::crs(ras)
# A slow operation, like GIS operation
notCached <- suppressWarnings(
# project raster generates warnings when run non-interactively
terra::project(ras, rasCRS, res = 5)
)
cached <- suppressWarnings(
# project raster generates warnings when run non-interactively
# using quote works also
terra::project(ras, rasCRS, res = 5) |> Cache(cachePath = tmpDir)
)
## Saved! Cache file: ac92e464c27589f5.rds; fn: project
# second time is much faster
reRun <- suppressWarnings(
# project raster generates warnings when run non-interactively
terra::project(ras, rasCRS, res = 5) |> Cache(cachePath = tmpDir)
)
## Object to retrieve (fn: project, ac92e464c27589f5.rds) ...
## Loaded! Cached result from previous project call
# recovered cached version is same as non-cached version
all.equal(notCached, reRun, check.attributes = FALSE) ## TRUE
## [1] "Attributes: < Names: 2 string mismatches >"
## [2] "Attributes: < Length mismatch: comparison on first 2 components >"
## [3] "Attributes: < Component 1: Modes: character, list >"
## [4] "Attributes: < Component 1: names for current but not for target >"
## [5] "Attributes: < Component 1: Attributes: < names for target but not for current > >"
## [6] "Attributes: < Component 1: Attributes: < Length mismatch: comparison on first 0 components > >"
## [7] "Attributes: < Component 1: target is character, current is list >"
## [8] "Attributes: < Component 2: 'current' is not an envRefClass >"
Nested caching occurs when a cached function call sits inside an outer function that is itself cached. This is a critical element of working within a reproducible workflow. It is not enough during development to cache flat code chunks, as there will be many levels of “slow” functions. Ideally, at all points in a development cycle, it should be possible to get to any line of code starting from the very initial steps, running through everything up to that point, in less than a few seconds. If the workflow can be kept this fast, it can be rerun from the start at any point, which is the best assurance that it still works.
##########################
## Nested Caching
# Make 2 functions
inner <- function(mean) {
d <- 1
rnorm(n = 3, mean = mean)
}
outer <- function(n) {
inner(0.1) |> Cache(cachePath = tmpdir2)
}
# make 2 different cache paths
tmpdir1 <- file.path(tempfile(), "first")
tmpdir2 <- file.path(tempfile(), "second")
# Run the Cache ... cachePath is tmpdir1 in the top-level Cache call and
# in all nested Cache calls, unless individually overridden ... here the
# Cache call inside outer uses the tmpdir2 repository
outer(n = 2) |> Cache(cachePath = tmpdir1)
## Saved! Cache file: a352d42cb0291199.rds; fn: inner
## Saved! Cache file: 61564dd5e84ab6d5.rds; fn: outer
## [1] -0.4524105 -0.8258873 -0.5288960
## attr(,".Cache")
## attr(,".Cache")$newCache
## [1] TRUE
##
## attr(,"tags")
## [1] "cacheId:61564dd5e84ab6d5"
## attr(,"callInCache")
## [1] ""
## Cache size:
## Total (including Rasters): 252 bytes
## Selected objects (not including Rasters): 252 bytes
## cacheId tagKey tagValue
## <char> <char> <char>
## 1: 61564dd5e84ab6d5 function outer
## 2: 61564dd5e84ab6d5 accessed 2026-01-07 21:16:51.17550
## 3: 61564dd5e84ab6d5 inCloud FALSE
## 4: 61564dd5e84ab6d5 elapsedTimeDigest 0.00630784 secs
## 5: 61564dd5e84ab6d5 preDigest .FUN:fd3ff16451bebbef
## 6: 61564dd5e84ab6d5 preDigest n:82dc709f2b91918a
## 7: 61564dd5e84ab6d5 class numeric
## 8: 61564dd5e84ab6d5 object.size 1008
## 9: 61564dd5e84ab6d5 fromDisk FALSE
## 10: 61564dd5e84ab6d5 resultHash
## 11: 61564dd5e84ab6d5 elapsedTimeFirstRun 0.01651311 secs
## createdDate
## <char>
## 1: 2026-01-07 21:16:51.193193
## 2: 2026-01-07 21:16:51.193193
## 3: 2026-01-07 21:16:51.193193
## 4: 2026-01-07 21:16:51.193193
## 5: 2026-01-07 21:16:51.193193
## 6: 2026-01-07 21:16:51.193193
## 7: 2026-01-07 21:16:51.193193
## 8: 2026-01-07 21:16:51.193193
## 9: 2026-01-07 21:16:51.193193
## 10: 2026-01-07 21:16:51.193193
## 11: 2026-01-07 21:16:51.193193
## Cache size:
## Total (including Rasters): 20 bytes
## Selected objects (not including Rasters): 20 bytes
## cacheId tagKey tagValue
## <char> <char> <char>
## 1: a352d42cb0291199 function inner
## 2: a352d42cb0291199 outerFunction outer
## 3: a352d42cb0291199 accessed 2026-01-07 21:16:51.18428
## 4: a352d42cb0291199 inCloud FALSE
## 5: a352d42cb0291199 elapsedTimeDigest 0.006507158 secs
## 6: a352d42cb0291199 preDigest .FUN:c411a17f70613a02
## 7: a352d42cb0291199 preDigest mean:22413394efd9f6a3
## 8: a352d42cb0291199 class numeric
## 9: a352d42cb0291199 object.size 80
## 10: a352d42cb0291199 fromDisk FALSE
## 11: a352d42cb0291199 resultHash
## 12: a352d42cb0291199 elapsedTimeFirstRun 0.001060963 secs
## createdDate
## <char>
## 1: 2026-01-07 21:16:51.186084
## 2: 2026-01-07 21:16:51.186084
## 3: 2026-01-07 21:16:51.186084
## 4: 2026-01-07 21:16:51.186084
## 5: 2026-01-07 21:16:51.186084
## 6: 2026-01-07 21:16:51.186084
## 7: 2026-01-07 21:16:51.186084
## 8: 2026-01-07 21:16:51.186084
## 9: 2026-01-07 21:16:51.186084
## 10: 2026-01-07 21:16:51.186084
## 11: 2026-01-07 21:16:51.186084
## 12: 2026-01-07 21:16:51.186084
# userTags get appended
# the outer tag propagates to all items; inner items also carry the inner tag
clearCache(tmpdir1, ask = FALSE)
outerTag <- "outerTag"
innerTag <- "innerTag"
inner <- function(mean) {
d <- 1
rnorm(n = 3, mean = mean) |> Cache(notOlderThan = Sys.time() - 1e5, userTags = innerTag)
}
outer <- function(n) {
inner(0.1) |> Cache()
}
aa <- Cache(outer, n = 2) |> Cache(cachePath = tmpdir1, userTags = outerTag)
## No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
## this will not persist across R sessions.
## No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
## this will not persist across R sessions.
## No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
## this will not persist across R sessions.
## Saved! Cache file: a481e2b85f7337f2.rds; fn: rnorm
## Saved! Cache file: 9691c7bae9ad6582.rds; fn: inner
## Saved! Cache file: 2d26a68d5154433e.rds; fn: outer
## Saved! Cache file: 2e46ad9c8f6c63f4.rds; fn: Cache
## Cache size:
## Total (including Rasters): 252 bytes
## Selected objects (not including Rasters): 252 bytes
## cacheId tagKey tagValue
## <char> <char> <char>
## 1: 2e46ad9c8f6c63f4 function Cache
## 2: 2e46ad9c8f6c63f4 userTags outerTag
## 3: 2e46ad9c8f6c63f4 accessed 2026-01-07 21:16:51.23474
## 4: 2e46ad9c8f6c63f4 inCloud FALSE
## 5: 2e46ad9c8f6c63f4 elapsedTimeDigest 0.003523111 secs
## 6: 2e46ad9c8f6c63f4 preDigest FUN:f4ab56347506d214
## 7: 2e46ad9c8f6c63f4 preDigest cacheSaveFormat:cf2828ea967d53e7
## 8: 2e46ad9c8f6c63f4 preDigest dryRun:e9aac936a0e8f6ae
## 9: 2e46ad9c8f6c63f4 preDigest .FUN:ab4b977119e40b21
## 10: 2e46ad9c8f6c63f4 preDigest .cacheChaining:71681d621365dfd7
## 11: 2e46ad9c8f6c63f4 preDigest .cacheExtra:c85d88fc56f4e042
## 12: 2e46ad9c8f6c63f4 preDigest .functionName:c85d88fc56f4e042
## 13: 2e46ad9c8f6c63f4 preDigest conn:118387d5d48f757d
## 14: 2e46ad9c8f6c63f4 preDigest drv:9ce9a83896bf68a1
## 15: 2e46ad9c8f6c63f4 preDigest n:82dc709f2b91918a
## 16: 2e46ad9c8f6c63f4 class numeric
## 17: 2e46ad9c8f6c63f4 object.size 1008
## 18: 2e46ad9c8f6c63f4 fromDisk FALSE
## 19: 2e46ad9c8f6c63f4 resultHash
## 20: 2e46ad9c8f6c63f4 elapsedTimeFirstRun 0.03766489 secs
## cacheId tagKey tagValue
## <char> <char> <char>
## createdDate
## <char>
## 1: 2026-01-07 21:16:51.273777
## 2: 2026-01-07 21:16:51.273777
## 3: 2026-01-07 21:16:51.273777
## 4: 2026-01-07 21:16:51.273777
## 5: 2026-01-07 21:16:51.273777
## 6: 2026-01-07 21:16:51.273777
## 7: 2026-01-07 21:16:51.273777
## 8: 2026-01-07 21:16:51.273777
## 9: 2026-01-07 21:16:51.273777
## 10: 2026-01-07 21:16:51.273777
## 11: 2026-01-07 21:16:51.273777
## 12: 2026-01-07 21:16:51.273777
## 13: 2026-01-07 21:16:51.273777
## 14: 2026-01-07 21:16:51.273777
## 15: 2026-01-07 21:16:51.273777
## 16: 2026-01-07 21:16:51.273777
## 17: 2026-01-07 21:16:51.273777
## 18: 2026-01-07 21:16:51.273777
## 19: 2026-01-07 21:16:51.273777
## 20: 2026-01-07 21:16:51.273777
## createdDate
## <char>
Sometimes it is not desirable to rerun functions after changes that are
irrelevant to the analysis, such as changes to the messages sent to a
user, even though such changes would normally invalidate the cache. The
cacheId argument is for this. Once a piece of code has been run, the
cacheId can be manually extracted (it is reported at the end of a Cache
call) and manually placed in the code, passed in as, say,
cacheId = "ad184ce64541972b50afd8e7b75f821b".
## Saved! Cache file: ca275879d5116967.rds; fn: rnorm
## [1] -0.6264538
## attr(,".Cache")
## attr(,".Cache")$newCache
## [1] TRUE
##
## attr(,"tags")
## [1] "cacheId:ca275879d5116967"
## attr(,"callInCache")
## [1] ""
# manually look at output attribute which shows cacheId: 422bae4ed2f770cc
rnorm(1) |> Cache(cachePath = tmpdir1, cacheId = "422bae4ed2f770cc") # same value
## cacheId passed to override automatic digesting; using 422bae4ed2f770cc
## Saved! Cache file: 422bae4ed2f770cc.rds; fn: rnorm
## [1] 0.1836433
## attr(,".Cache")
## attr(,".Cache")$newCache
## [1] TRUE
##
## attr(,"tags")
## [1] "cacheId:422bae4ed2f770cc"
## attr(,"callInCache")
## [1] ""
# override even with different inputs:
rnorm(2) |> Cache(cachePath = tmpdir1, cacheId = "422bae4ed2f770cc")
## cacheId passed to override automatic digesting; using 422bae4ed2f770cc
## Object to retrieve (fn: rnorm, 422bae4ed2f770cc.rds) ...
## Loaded! Cached result from previous rnorm call
## [1] 0.1836433
## attr(,".Cache")
## attr(,".Cache")$newCache
## [1] FALSE
##
## attr(,"tags")
## [1] "cacheId:422bae4ed2f770cc"
## attr(,"callInCache")
## [1] ""
Since the cache is simply a DBI data table (of an SQLite database by
default), it can be inspected with standard DBI tools. In addition,
there are several helpers in the reproducible package, including
showCache, keepCache and clearCache, that may be useful. One can also
access cached items manually (rather than simply rerunning the same
Cache function again).
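As a rough illustration of what a DBI-backed tag table makes possible, the sketch below queries a small table with plain DBI calls. It does not open a real cache: the in-memory database, the table name cacheTags, and its rows are all hypothetical, modelled only on the cacheId, tagKey and tagValue columns printed by showCache.

```r
library(DBI)

# In-memory SQLite database as a stand-in for a cache database
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical tag table with the same column names that showCache() prints
dbWriteTable(con, "cacheTags", data.frame(
  cacheId  = c("id1", "id1", "id2"),
  tagKey   = c("function", "class", "function"),
  tagValue = c("rnorm", "numeric", "runif"),
  stringsAsFactors = FALSE
))

# Find the ids of all entries that were created by a call to rnorm
rnormIds <- dbGetQuery(
  con,
  "SELECT DISTINCT cacheId FROM cacheTags
   WHERE tagKey = 'function' AND tagValue = 'rnorm'"
)

dbDisconnect(con)
rnormIds$cacheId
```

In day-to-day use the helpers named above are likely the more convenient route; direct queries of this kind are mainly useful for ad hoc inspection.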
# As of reproducible version 1.0, there is a new backend directly using DBI
mapHash <- unique(showCache(tmpDir, userTags = "project")$cacheId)
## Cache size:
## Total (including Rasters): 676 bytes
## Selected objects (not including Rasters): 676 bytes
## Loaded! Cached result from previous call
By default, caching relies on an SQLite database for its backend.
While this works in many situations, there are some important
limitations of using SQLite for caching, including 1) speed; 2)
concurrent transactions; 3) sharing a database across machines or
projects. Fortunately, Cache makes use of the DBI package and thus
supports several database backends, including MySQL and PostgreSQL.
See https://github.com/PredictiveEcology/SpaDES/wiki/Using-alternate-database-backends-for-Cache for further information on configuring these additional backends.
In general, we feel that a liberal use of Cache will
make a reusable and reproducible workflow. shiny apps can
be made that take advantage of Cache. Indeed, much of the
difficulty in managing data sets and saving them for future use can be
accommodated by caching.