Version: 0.17.0
Depends: R (≥ 2.14.0)
Imports: utils, R.methodsS3 (≥ 1.8.1), R.oo (≥ 1.24.0), R.utils (≥ 2.10.1), digest (≥ 0.6.13)
Title: Fast and Light-Weight Caching (Memoization) of Objects and Results to Speed Up Computations
Author: Henrik Bengtsson [aut, cre, cph]
Maintainer: Henrik Bengtsson <henrikb@braju.com>
Description: Memoization can be used to speed up repetitive and computationally expensive function calls. The first time a function that implements memoization is called, the results are stored in a cache memory. The next time the function is called with the same set of parameters, the results are retrieved from the cache almost instantly, which avoids repeating the calculations. With this package, any R object can be cached in a key-value storage where the key can be an arbitrary set of R objects. The cache memory is persistent (on the file system).
License: LGPL-2.1 | LGPL-3 [expanded from: LGPL (≥ 2.1)]
LazyLoad: TRUE
URL: https://github.com/HenrikBengtsson/R.cache
BugReports: https://github.com/HenrikBengtsson/R.cache/issues
Encoding: UTF-8
NeedsCompilation: no
Packaged: 2025-05-02 21:22:23 UTC; henrik
Repository: CRAN
Date/Publication: 2025-05-02 22:20:02 UTC

Package R.cache

Description

Memoization can be used to speed up repetitive and computationally expensive function calls. The first time a function that implements memoization is called, the results are stored in a cache memory. The next time the function is called with the same set of parameters, the results are retrieved from the cache almost instantly, which avoids repeating the calculations. With this package, any R object can be cached in a key-value storage where the key can be an arbitrary set of R objects. The cache memory is persistent (on the file system).

Installation and updates

To install this package and all of its dependent packages, do: install.packages("R.cache")

To get started

loadCache, saveCache

Methods for loading and saving objects from and to the cache.

getCacheRootPath, setCacheRootPath

Methods for getting and setting the directory where cache files are stored.
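
As a minimal sketch of how these four functions fit together (the temporary directory name and the contents of the key are illustrative assumptions, not something the package prescribes):

```r
library(R.cache)

# Assumption: use a throwaway cache root under tempdir() so that
# nothing is left behind on the file system after the session
setCacheRootPath(file.path(tempdir(), "R.cache-quickstart"))
print(getCacheRootPath())

# The key can be any set of R objects; here a small list
key <- list("quick-start", n = 10)

# Save an object under the key, then load it back
saveCache(1:10, key = key)
x <- loadCache(key)
print(x)

# Clean up
file.remove(findCache(key = key))
```

After the cache file is removed, loadCache(key) returns NULL again.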

How to cite this package

Whenever using this package, please cite [1] as

Bengtsson, H. The R.oo package - Object-Oriented Programming with References Using
Standard R Code, Proceedings of the 3rd International Workshop on Distributed
Statistical Computing (DSC 2003), ISSN 1609-395X, Hornik, K.; Leisch, F. & Zeileis,
A. (ed.), 2003

Wishlist

Here is a list of features that would be useful, but which I have too little time to add myself. Contributions are appreciated.

If you consider implementing any of these, make sure it is not already implemented by downloading the latest "devel" version!

Related work

See also the filehash package, and the cache() function in the Biobase package of Bioconductor.

License

The releases of this package are licensed under LGPL version 2.1 or newer.

References

[1] H. Bengtsson, The R.oo package - Object-Oriented Programming with References Using Standard R Code, In Kurt Hornik, Friedrich Leisch and Achim Zeileis, editors, Proceedings of the 3rd International Workshop on Distributed Statistical Computing (DSC 2003), March 20-22, Vienna, Austria. https://www.r-project.org/conferences/DSC-2003/Proceedings/

Author(s)

Henrik Bengtsson


Loads an object from a file connection

Description

Loads an object from a file connection similar to load(), but without resetting file connections (to position zero).

WARNING: This is an internal function that should not be called by anything but the internal code of the R.cache package.

Usage

.baseLoad(con, envir=parent.frame())

Arguments

con

A connection.

envir

An environment where the loaded object will be stored.

Details

It is not possible to use load() because it resets the file position of the connection before trying to load the object. This happens because a regular file connection passed to load() is coerced via gzcon(), which is the function that resets the file position.

The workaround is to create a local copy of base::load() and modify it by dropping the gzcon() coercion. This is possible because this function, that is .baseLoad(), is always called with a gzfile() connection.

Value

Returns (invisibly) a character vector of the names of objects loaded.

See Also

This function is used by loadCache() and readCacheHeader().


Non-documented objects

Description

This page contains aliases for all "non-documented" objects that R CMD check detects in this package.

Almost all of them are generic functions that have specific documentation for the corresponding method coupled to a specific class. Other functions are re-defined by setMethodS3() to default methods. Neither of these two classes is actually undocumented. The rest are deprecated methods.

Author(s)

Henrik Bengtsson


Options used by R.cache

Description

Below are all R options specific to the R.cache package.
WARNING: Note that the names and the default values of these options may change in future versions of the package. Please use with care until further notice.

Options for controlling caching

R.cache.compress:

If TRUE, saveCache() will write compressed cache files, otherwise not. (Default: FALSE)

R.cache.enabled:

If TRUE, loadCache() is reading from and saveCache() is writing to the cache, otherwise not. (Default: TRUE)

R.cache.rootPath:

A character string specifying the default cache root path. If not set, environment variable R_CACHE_ROOTPATH is considered.

R.cache.touchOnLoad:

If TRUE, loadCache() will update the "last-modified" timestamp of the cache file (to the current time), otherwise not. (Default: FALSE)
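
These options are set with base R's options(). The following is a small sketch, assuming a temporary cache root and an illustrative key, of how R.cache.enabled could be switched off temporarily and then restored:

```r
library(R.cache)

# Assumption: a throwaway cache root under tempdir()
setCacheRootPath(file.path(tempdir(), "R.cache-options-demo"))

key <- list("options-demo")
saveCache("value", key = key)

# Temporarily disable the cache; options() returns the previous settings
oopts <- options(R.cache.enabled = FALSE)
res_disabled <- loadCache(key)   # the cache is not read
options(oopts)                   # restore the previous setting

res_enabled <- loadCache(key)    # the cache is read again
print(res_enabled)

# Clean up
file.remove(findCache(key = key))
```

Saving the return value of options() and passing it back afterwards keeps the change local to the code that needs it.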


Creates a copy of an existing function such that its results are memoized

Description

Creates a copy of an existing function such that its results are memoized.

Usage

## Default S3 method:
addMemoization(fcn, envir=parent.frame(), ...)

Arguments

fcn

A function (or the name of a function) that should be copied and have memoization added.

envir

The environment from where to look for the function.

...

Additional arguments for controlling the memoization, i.e. all arguments of memoizedCall() that are not passed to do.call().

Details

The new function is set up such that the memoized call is done in the environment of the caller (the parent frame of the function).

If the function returns NULL, that particular function call is not memoized.

Value

Returns a function.

Author(s)

Henrik Bengtsson

See Also

The returned function utilizes memoizedCall() internally.
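
A minimal usage sketch follows. The function slow_square() and the temporary cache root are hypothetical and exist only for illustration:

```r
library(R.cache)

# Assumption: a throwaway cache root under tempdir()
setCacheRootPath(file.path(tempdir(), "R.cache-addMemoization-demo"))

# Hypothetical slow function
slow_square <- function(x) {
  Sys.sleep(0.5)   # emulate a slow computation
  x^2
}

# Create a memoized copy; the original function is left untouched
fast_square <- addMemoization(slow_square)

y1 <- fast_square(4)   # computed and saved to the cache
y2 <- fast_square(4)   # same arguments, so a cached result can be reused
print(c(y1, y2))
```

Because a NULL return value is never memoized, a function that is called only for its side effects gains nothing from addMemoization().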


Removes all files in a cache file directory

Description

Removes all files in a cache file directory.

Usage

## Default S3 method:
clearCache(path=getCachePath(...), ..., recursive=FALSE, prompt=TRUE && interactive())

Arguments

path

A character string specifying the directory to be cleared. By default, the path is what is returned by getCachePath(), to which the arguments ... are also passed.

...

Arguments passed to getCachePath(), especially argument dirs to specify subdirectories.

recursive

If TRUE, subdirectories are also removed, otherwise just the files in the specified directory.

prompt

If TRUE, the user will be prompted to confirm that the directory will be cleared before files are removed.

Details

If the specified directory does not exist, an exception is thrown.

Value

Returns (invisibly) a character vector of pathnames of the files removed. If no files were removed, NULL is returned.

Author(s)

Henrik Bengtsson
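
As a sketch of non-interactive use, the example below clears a single cache subdirectory without prompting. The subdirectory name "demo" and the keys are assumptions made for illustration:

```r
library(R.cache)

# Assumption: a throwaway cache root under tempdir()
setCacheRootPath(file.path(tempdir(), "R.cache-clearCache-demo"))

# Store two objects in the (hypothetical) subdirectory "demo"
saveCache(1, key = list("a"), dirs = "demo")
saveCache(2, key = list("b"), dirs = "demo")

# Remove the files in that subdirectory; prompt=FALSE skips the confirmation
removed <- clearCache(dirs = "demo", prompt = FALSE)
print(basename(removed))

# The cached objects are gone
res_a <- loadCache(list("a"), dirs = "demo")
res_b <- loadCache(list("b"), dirs = "demo")
```

Argument dirs is passed on to getCachePath() via ..., so only that subdirectory is affected.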


Evaluates an R expression with memoization

Description

Evaluates an R expression with memoization such that the same objects are assigned to the current environment and the same result is returned, if any.

Usage

evalWithMemoization(expr, key=NULL, ..., envir=parent.frame(),
  drop=c("srcref", "srcfile", "wholeSrcref"), force=FALSE)

Arguments

expr

The expression to be evaluated.

key

Additional objects to uniquely identify the evaluation.

...

Additional arguments passed to loadCache() and saveCache().

envir

The environment in which the expression should be evaluated.

drop

A character vector of expr attributes to drop. The default is to drop all source-reference information.

force

If TRUE, existing cached results are ignored.

Value

Returns the value of the evaluated expr expression, if any.

Author(s)

Henrik Bengtsson

See Also

Internally, eval() is used to evaluate the expression.

Examples

for (kk in 1:5) {
  cat(sprintf("Iteration #%d:\n", kk))
  res <- evalWithMemoization({
    cat("Evaluating expression...")
    a <- 1
    b <- 2
    c <- 4
    Sys.sleep(1)
    cat("done\n")
    b
  })
  print(res)

  # Sanity checks
  stopifnot(a == 1 && b == 2 && c == 4)

  # Clean up
  rm(a, b, c)
} # for (kk ...)


## OUTPUTS:
## Iteration #1:
## Evaluating expression...done
## [1] 2
## Iteration #2:
## [1] 2
## Iteration #3:
## [1] 2
## Iteration #4:
## [1] 2
## Iteration #5:
## [1] 2


############################################################
# WARNING
############################################################
# If the expression being evaluated depends on
# "input" objects, then these must be specified
# explicitly as "key" objects.
for (ii in 1:2) {
  for (kk in 1:3) {
    cat(sprintf("Iteration #%d:\n", kk))
    res <- evalWithMemoization({
      cat("Evaluating expression...")
      a <- kk
      Sys.sleep(1)
      cat("done\n")
      a
    }, key=list(kk=kk))
    print(res)

    # Sanity checks
    stopifnot(a == kk)

    # Clean up
    rm(a)
  } # for (kk ...)
} # for (ii ...)

## OUTPUTS:
## Iteration #1:
## Evaluating expression...done
## [1] 1
## Iteration #2:
## Evaluating expression...done
## [1] 2
## Iteration #3:
## Evaluating expression...done
## [1] 3
## Iteration #1:
## [1] 1
## Iteration #2:
## [1] 2
## Iteration #3:
## [1] 3

Locates a cache file

Description

Locates a cache file from a key object.

Usage

## Default S3 method:
findCache(key=NULL, ...)

Arguments

key

An optional object from which a hexadecimal hash code will be generated and appended to the filename.

...

Additional arguments passed to generateCache().

Value

Returns the pathname as a character string, or NULL if no cached data exists.

Author(s)

Henrik Bengtsson

See Also

generateCache(). loadCache().


Generates a cache pathname from a key object

Description

Generates a cache pathname from a key object.

Usage

## Default S3 method:
generateCache(key, suffix=".Rcache", ...)

Arguments

key

A list or an environment from which a character string checksum will be calculated and that will constitute the name part of the cache filename.

suffix

A character string to be appended to the end of the filename.

...

Arguments passed to getCachePath().

Value

Returns the pathname as a character string.

Author(s)

Henrik Bengtsson

See Also

findCache(). Internally, the generic function getChecksum() is used to calculate the checksum of argument key.
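
The relationship between generateCache() and findCache() can be sketched as follows, assuming a temporary cache root and an illustrative key:

```r
library(R.cache)

# Assumption: a throwaway cache root under tempdir()
setCacheRootPath(file.path(tempdir(), "R.cache-generateCache-demo"))

key <- list(dataset = "demo", version = 1L)

# generateCache() only derives the pathname; it does not create a file
pathname <- generateCache(key = key)
print(basename(pathname))

# findCache() returns NULL until something has been saved under the key
before <- findCache(key = key)

saveCache(1:3, key = key)
after <- findCache(key = key)

# Clean up
file.remove(after)
```

The same key always maps to the same filename, which is what lets loadCache() find what saveCache() wrote.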


Gets the path to the file cache directory

Description

Gets the path to the file cache directory. If missing, the directory is created.

Usage

## Default S3 method:
getCachePath(dirs=NULL, path=NULL, rootPath=getCacheRootPath(), ...)

Arguments

dirs

A character vector constituting the path to the cache subdirectory (of the cache root directory as returned by getCacheRootPath()) to be used. If NULL, the path will be the cache root path.

path, rootPath

(Advanced) character strings specifying the explicit/default cache path and root cache path.

...

Not used.

Value

Returns the path as a character string. If the user does not have write permissions to the path, then an error is thrown.

Author(s)

Henrik Bengtsson

See Also

setCachePath.
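
A short sketch of how argument dirs maps to a subdirectory of the cache root. The directory names "myProject" and "step1" are hypothetical:

```r
library(R.cache)

# Assumption: a throwaway cache root under tempdir()
setCacheRootPath(file.path(tempdir(), "R.cache-getCachePath-demo"))

# Each element of 'dirs' is one level below the cache root;
# the directory is created if it is missing
dirs <- c("myProject", "step1")
path <- getCachePath(dirs = dirs)
print(path)

# Objects saved with the same 'dirs' end up in that directory
saveCache(pi, key = list("pi"), dirs = dirs)
files <- list.files(path, pattern = "\\.Rcache$")
print(files)
```

Using a separate subdirectory per project or processing step makes it possible to clear one part of the cache without touching the rest.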


Gets the root path to the file cache directory

Description

Gets the root path to the file cache directory.

Usage

## Default S3 method:
getCacheRootPath(defaultPath=NULL, ...)

Arguments

defaultPath

The default path, if no user-specified directory has been given.

...

Not used.

Value

Returns the path as a character string.

Author(s)

Henrik Bengtsson

See Also

To set the directory where cache files are stored, see setCacheRootPath().

Examples

  print(getCacheRootPath())

Generates a deterministic checksum for an R object

Description

Generates a deterministic checksum for an R object such that (i) if the same object is used again, then the same checksum is obtained, and (ii) if another object is used, then a different checksum is obtained with extremely high probability. In other words, it is highly unlikely that two different objects have the same checksum.

Usage

## Default S3 method:
getChecksum(object, ...)

Arguments

object

The object for which a checksum should be calculated.

...

Additional arguments passed to digest.

Details

Because getChecksum() is a generic function, it is possible to provide custom methods for specific classes of objects. This means that, if a certain class specifies fields that carry auxiliary data, then these can be excluded from the checksum calculation. For instance, assume that all objects of class 'TimestampedObject' contain timestamps specifying when each object was created. Then a custom getChecksum() method for this class can first drop the timestamp and then call the default getChecksum() function.

Value

Returns checksum represented as a character string.

Author(s)

Henrik Bengtsson

See Also

Internally, the digest method is used to calculate the checksum.
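
The custom-method idea described under Details can be sketched as below. The class 'TimestampedObject', its fields, and the constructor are hypothetical and serve only as an illustration:

```r
library(R.cache)

# Hypothetical class: a list with a 'value' field and an auxiliary 'timestamp'
newTimestampedObject <- function(value, timestamp) {
  structure(list(value = value, timestamp = timestamp),
            class = "TimestampedObject")
}

# Custom method: drop the class and the timestamp, then let the
# default getChecksum() method handle the remaining plain list
getChecksum.TimestampedObject <- function(object, ...) {
  object <- unclass(object)
  object$timestamp <- NULL
  getChecksum(object, ...)
}

obj_a <- newTimestampedObject(42, as.POSIXct("2024-01-01", tz = "UTC"))
obj_b <- newTimestampedObject(42, as.POSIXct("2024-06-01", tz = "UTC"))
obj_c <- newTimestampedObject(43, as.POSIXct("2024-01-01", tz = "UTC"))

# Same value, different timestamps: same checksum
print(getChecksum(obj_a) == getChecksum(obj_b))

# Different value: different checksum
print(getChecksum(obj_a) == getChecksum(obj_c))
```

In a package, such a method would normally be registered through the package's NAMESPACE file rather than defined at top level as done here.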


Loads data from file cache

Description

Loads data from file cache, which is unique for an optional key object.

Usage

## Default S3 method:
loadCache(key=NULL, sources=NULL, suffix=".Rcache", removeOldCache=TRUE, pathname=NULL,
  dirs=NULL, ..., onError=c("warning", "error", "message", "quiet", "print"))

Arguments

key

An optional object from which a hexadecimal hash code will be generated and appended to the filename.

sources

Optional source objects. If the cache object has a timestamp older than one of the source objects, it will be ignored and removed.

suffix

A character string to be appended to the end of the filename.

removeOldCache

If TRUE and the cache is older than the sources, the cache file is removed, otherwise not.

pathname

The pathname to the cache file. If specified, arguments key and suffix are ignored. Note that this is only needed in order to read a cache file for which the key is unknown, for instance, in order to investigate an unknown cache file.

dirs

A character vector constituting the path to the cache subdirectory (of the cache root directory as returned by getCacheRootPath()) to be used. If NULL, the path will be the cache root path.

...

Not used.

onError

A character string specifying what the action is if an exception is thrown.

Details

The hash code calculated from the key object is a 32-character hexadecimal MD5 hash code. For more details, see getChecksum().

Value

Returns an R object or NULL, if cache does not exist.

Author(s)

Henrik Bengtsson

See Also

saveCache().

Examples

simulate <- function(mean, sd) {
  # 1. Try to load cached data, if already generated
  key <- list(mean, sd)
  data <- loadCache(key)
  if (!is.null(data)) {
    cat("Loaded cached data\n")
    return(data)
  }

  # 2. If not available, generate it.
  cat("Generating data from scratch...")
  data <- rnorm(1000, mean=mean, sd=sd)
  Sys.sleep(1)             # Emulate slow algorithm
  cat("ok\n")
  saveCache(data, key=key, comment="simulate()")

  data
}

data <- simulate(2.3, 3.0)
data <- simulate(2.3, 3.5)
data <- simulate(2.3, 3.0) # Will load cached data

# Clean up
file.remove(findCache(key=list(2.3,3.0)))
file.remove(findCache(key=list(2.3,3.5)))

Calls a function with memoization

Description

Calls a function with memoization, that is, caches the results to be retrieved if the function is called again with the exact same arguments.

Usage

## Default S3 method:
memoizedCall(what, ..., envir=parent.frame(), force=FALSE, sources=NULL, dirs=NULL)

Arguments

what

The function to be called, or a character string specifying the name of the function to be called, cf. do.call().

...

Arguments passed to the function.

envir

The environment in which the function is evaluated.

force

If TRUE, any cached results are ignored, otherwise not.

sources, dirs

Optional arguments passed to loadCache() and saveCache().

Details

If the function returns NULL, that particular function call is not memoized.

Value

Returns the result of the function call.

Author(s)

Henrik Bengtsson

See Also

Internally, loadCache() is used to load memoized results, if available. If not available, then do.call() is used to evaluate the function call, and saveCache() is used to save the results to cache.
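
A minimal usage sketch, in which slow_sum() and the temporary cache root are assumptions made for illustration:

```r
library(R.cache)

# Assumption: a throwaway cache root under tempdir()
setCacheRootPath(file.path(tempdir(), "R.cache-memoizedCall-demo"))

# Hypothetical slow function
slow_sum <- function(x, y) {
  Sys.sleep(0.5)   # emulate a slow computation
  x + y
}

# First call: the function is evaluated and the result is cached
res1 <- memoizedCall(slow_sum, 1, 2)

# Same function and arguments: a cached result can be reused
res2 <- memoizedCall(slow_sum, 1, 2)

# force=TRUE ignores any cached result and re-evaluates the function
res3 <- memoizedCall(slow_sum, 1, 2, force = TRUE)

print(c(res1, res2, res3))
```

Unlike addMemoization(), which returns a new function, memoizedCall() memoizes one call at a time, which is convenient when only some call sites should use the cache.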


Reads the header of a cache file

Description

Reads the header of a cache file, which is given as a filename or a connection.

Usage

## Default S3 method:
readCacheHeader(file, ...)

Arguments

file

A filename or a connection.

...

Not used.

Value

Returns a named list structure with elements identifier, version, comment (optional), sources (optional), and timestamp.

Author(s)

Henrik Bengtsson

See Also

findCache(). loadCache(). saveCache().

Examples


data <- 1:120
key <- list(some=1, vari=2, ables=3)

saveCache(key=key, data, comment="A simple example of a cached object.")

header <- readCacheHeader(findCache(key=key))
print(header)

# Clean up
file.remove(findCache(key=key))

Saves data to file cache

Description

Saves data to file cache, which is unique for an optional key object.

Usage

## Default S3 method:
saveCache(object, key=NULL, sources=NULL, suffix=".Rcache", comment=NULL, pathname=NULL,
  dirs=NULL, compress=NULL, ...)

Arguments

object

The object to be saved to file.

key

An optional object from which a hexadecimal hash code will be generated and appended to the filename.

sources

Source objects used for comparison of timestamps when cache is loaded later.

suffix

A character string to be appended to the end of the filename.

comment

An optional character string written in ASCII at the beginning of the file.

pathname

(Advanced) An optional character string specifying the pathname to the cache file. If not specified (default), a unique one is automatically generated from arguments key and suffix among other things.

dirs

A character vector constituting the path to the cache subdirectory (of the cache root directory as returned by getCacheRootPath()) to be used. If NULL, the path will be the cache root path.

compress

If TRUE, the cache file will be saved using gzip compression, otherwise not.

...

Additional arguments passed to save().

Value

Returns (invisibly) the pathname of the cache file.

Compression

The saveCache() method saves a compressed cache file (with filename extension *.gz) if argument compress is TRUE. The loadCache() method locates (via findCache()) and loads such cache files as well.
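
The behavior described above can be sketched as follows, assuming a temporary cache root and an illustrative key:

```r
library(R.cache)

# Assumption: a throwaway cache root under tempdir()
setCacheRootPath(file.path(tempdir(), "R.cache-compress-demo"))

key <- list("compressed-demo")
data <- 1:1000

# Write a compressed cache file
saveCache(data, key = key, compress = TRUE)

# findCache() locates the compressed file as well
pathname <- findCache(key = key)
print(basename(pathname))

# loadCache() reads it back without any extra arguments
data2 <- loadCache(key)

# Clean up
file.remove(pathname)
```

Setting option R.cache.compress to TRUE has the same effect for all calls that leave argument compress at its default.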

Author(s)

Henrik Bengtsson

See Also

For more details on how the hash code is generated, etc., see loadCache().

Examples

## Not run: For an example, see ?loadCache

Sets the path to the file cache directory

Description

Sets the path to the file cache directory.

Usage

## Default S3 method:
setCachePath(dirs=NULL, path=NULL, ...)

Arguments

dirs

A character vector constituting the path to the cache subdirectory of interest.

path

The path to override the path according to the dirs argument.

...

Not used.

Value

Returns nothing.

Author(s)

Henrik Bengtsson

See Also

getCachePath().


Sets the root path to the file cache directory

Description

Sets the root path to the file cache directory.

Usage

## Default S3 method:
setCacheRootPath(path=NULL, ...)

Arguments

path

The path.

...

Not used.

Value

Returns (invisibly) the old root path.

Author(s)

Henrik Bengtsson

See Also

getCacheRootPath().
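
Because the old root path is returned, it can be saved and restored later. The sketch below switches between two hypothetical temporary roots, "cache-A" and "cache-B":

```r
library(R.cache)

# Assumption: two throwaway cache roots under tempdir()
root_a <- file.path(tempdir(), "cache-A")
root_b <- file.path(tempdir(), "cache-B")

setCacheRootPath(root_a)
saveCache("stored under A", key = list("where"))

# Switch to another root and keep the previous one
previous <- setCacheRootPath(root_b)
in_b <- loadCache(list("where"))   # nothing cached under this root

# Switch back to the previous root
setCacheRootPath(previous)
in_a <- loadCache(list("where"))
print(in_a)

# Clean up
file.remove(findCache(key = list("where")))
```

Each root is an independent cache, so the same key can hold different objects under different roots.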


Interactively offers the user to set up the default root path

Description

Interactively offers the user to set up the default root path.

Usage

## Default S3 method:
setupCacheRootPath(defaultPath=NULL, ...)

Arguments

defaultPath

Default root path to set.

...

Not used.

Details

If the cache root path is already set, it is used and nothing is done. If the "default" root path (defaultPath) exists, it is used, otherwise, if running interactively, the user is asked to approve the usage (and creation) of the default root path. In all other cases, the cache root path is set to a session-specific temporary directory.

Value

Returns (invisibly) the root path, or NULL if running a non-interactive session.

Author(s)

Henrik Bengtsson

See Also

Internally, setCacheRootPath() is used to set the cache root path. The interactive() function is used to test whether R is running interactively or not.
