Package {dbMatrix}


Title: Database-Backed Matrix Classes and Operations
Version: 0.1.0
Description: Provides S4 classes and methods for storing dense and sparse matrices in 'DuckDB' databases. The package supports constructing database-backed matrices from base R and 'Matrix' objects, extracting slices and summaries, performing arithmetic and selected linear algebra operations, and materializing results for larger-than-memory workflows. It integrates with 'dbProject' to keep database paths, live connections, and lazy matrix tables synchronized across interactive analyses.
License: GPL-3
Encoding: UTF-8
URL: https://github.com/dbverse-org/dbmatrix-r, https://dbverse-org.github.io/dbmatrix-r/
BugReports: https://github.com/dbverse-org/dbmatrix-r/issues
RoxygenNote: 7.3.3
Depends: R (≥ 4.1.0)
Imports: Matrix (≥ 1.6-5), MatrixGenerics (≥ 1.12.3), methods, DBI, dplyr, dbplyr, duckdb (≥ 1.4.0), data.table (≥ 1.12.2), glue, bit64, cli, Rcpp, arrow, nanoarrow, dbProject, rlang
LinkingTo: Rcpp, RcppEigen, RSpectra, nanoarrow
Suggests: knitr, rmarkdown, testthat (≥ 3.0.0), irlba, crayon, R.utils, checkmate, reticulate, sparseMatrixStats, RSpectra
VignetteBuilder: knitr
Config/testthat/edition: 3
NeedsCompilation: yes
Packaged: 2026-05-14 02:35:49 UTC; ecruiz
Author: Edward C. Ruiz [aut, cre], Jiaji George Chen [aut], Ruben Dries [aut]
Maintainer: Edward C. Ruiz <ecr7407@gmail.com>
Repository: CRAN
Date/Publication: 2026-05-19 08:30:02 UTC

Value Matching

Description

Implements the %in% operator for dbMatrix objects. This operator checks if elements from the left operand are contained in the right operand, returning a logical vector.

Usage

## S4 method for signature 'dbDenseMatrix,ANY'
x %in% table

## S4 method for signature 'ANY,dbDenseMatrix'
x %in% table

## S4 method for signature 'dbSparseMatrix,ANY'
x %in% table

Arguments

x

A dbMatrix object or any other object

table

Any object or a dbMatrix object

Details

This is a method for the standard %in% operator for dbMatrix objects. It follows R's standard behavior for the %in% operator.

Value

A logical vector of the same length as x, indicating which elements of x are in table.

Examples

con <- DBI::dbConnect(duckdb::duckdb(), ":memory:")
mat <- matrix(1:9, nrow = 3, ncol = 3)
dbmat <- dbMatrix(
  value = mat,
  con = con,
  name = "example_matrix",
  class = "dbDenseMatrix",
  overwrite = TRUE
)

dbmat %in% c(1, 3, 5, 7, 9)

c(1, 3, 5, 7, 9) %in% dbmat
DBI::dbDisconnect(con, shutdown = TRUE)

Input validation for data arg

Description

Input validation for the value argument.

Usage

.check_value(value)

Arguments

value

A Matrix, matrix, or tbl_duckdb_connection object

Value

No return value. Called for input validation and throws an error if value is invalid.


Evaluate if a dbSparseMatrix should be densified

Description

Evaluate if a dbSparseMatrix should be densified

Usage

.eval_op_densify(generic_char, vec_matrix)

Arguments

generic_char

A character string representing the operation to be performed.

vec_matrix

A dbMatrix object with 1D row or col.

Details

Evaluates whether a dbSparseMatrix should be densified for Arith operations and specific scalar values, for operations whose operands are in the order dbSparseMatrix, vector.

Value

TRUE if the operation should densify the sparse matrix before evaluation, otherwise FALSE.


Join a dbSparseMatrix with a dbMatrix object

Description

Join a dbSparseMatrix with a dbMatrix object

Usage

.join_dbm_vect(dbm, vec_matrix, op, swap_arith_order = FALSE)

Arguments

dbm

A dbSparseMatrix object.

vec_matrix

A dbMatrix object with 1D row or col.

op

A character string representing the operation to be performed.

swap_arith_order

Logical. Whether to swap the order of the arguments for the operation. Default: FALSE.

Value

A dbMatrix object containing the result of applying op between dbm and vec_matrix.


Convert a dbSparseMatrix to dbDenseMatrix

Description

Internal function to convert a dbSparseMatrix to dbDenseMatrix.

Usage

.to_db_dense(x, chunk_size = NULL)

Arguments

x

A dbSparseMatrix object

chunk_size

Integer. Number of columns to process per chunk during densification. If NULL (default), the function first checks the global option dbMatrix.chunk_size. If that is also NULL, it calculates a chunk size such that the estimated memory usage of each chunk does not exceed dbMatrix.max_mem_convert (default 8 GB). If the total size is within the limit, a single chunk is used.

Value

A dbDenseMatrix object
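
The chunk-size fallback described for chunk_size can be sketched in base R. This is a rough illustration only, not the package's implementation: the helper name estimate_chunk_cols and the assumption of 8 bytes per dense value are hypothetical.

```r
# Hypothetical helper: number of columns per chunk so that one dense
# chunk (n_rows x chunk_cols doubles) stays within max_mem_bytes.
estimate_chunk_cols <- function(n_rows, n_cols,
                                max_mem_bytes = 8 * 1024^3,
                                bytes_per_value = 8) {
  total_bytes <- n_rows * n_cols * bytes_per_value
  if (total_bytes <= max_mem_bytes) {
    return(n_cols)  # everything fits: a single chunk
  }
  cols_per_chunk <- floor(max_mem_bytes / (n_rows * bytes_per_value))
  max(1, cols_per_chunk)  # always process at least one column
}

# A small matrix fits in one chunk; a large one is split by columns.
small <- estimate_chunk_cols(n_rows = 1000, n_cols = 1000)
large <- estimate_chunk_cols(n_rows = 1e6, n_cols = 1e5, max_mem_bytes = 8e9)
```

Chunking by columns keeps each intermediate dense block bounded, at the cost of more passes over the sparse table.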


Arith dbMatrix, e2

Description

See methods::Arith for more details.

See methods::Ops for more details.

Usage

## S4 method for signature 'dbMatrix,ANY'
Arith(e1, e2)

## S4 method for signature 'ANY,dbMatrix'
Arith(e1, e2)

## S4 method for signature 'dbMatrix,dbMatrix'
Arith(e1, e2)

## S4 method for signature 'dbMatrix,ANY'
Ops(e1, e2)

## S4 method for signature 'ANY,dbMatrix'
Ops(e1, e2)

## S4 method for signature 'dbMatrix,dbMatrix'
Ops(e1, e2)

## S4 method for signature 'DBIConnection'
dbLoad(conn, name, class)

## S4 method for signature 'dbMatrix'
writeMM(obj, file, ...)

Arguments

e1

First operand.

e2

Second operand.

conn

DBIConnection object

name

valid name value (character)

class

character, class of the dbMatrix object (e.g. "dbDenseMatrix" or "dbSparseMatrix")

obj

dbMatrix object

file

path to file

...

additional arguments

Value


Math Operations for dbMatrix Objects

Description

Implements the Math S4groupGeneric functions for dbMatrix objects. This includes various mathematical operations such as logarithms, exponentials, trigonometric functions, and other transformations.

Usage

## S4 method for signature 'dbMatrix'
Math(x)

Arguments

x

A dbMatrix object.

Details

This method provides implementations for the following Math functions:

Arithmetic and rounding:

Cumulative operations:

Logarithmic:

DuckDB Log Function Mappings:

R Function    DuckDB Function    Notes
log(x)        LN(x)              Natural logarithm
log10(x)      LOG10(x)           Base-10 logarithm
log2(x)       LOG2(x)            Base-2 logarithm
log1p(x)      LN(x + 1)          log(1 + x), computed as LN

Sparsity-Preserving Log: For dbSparseMatrix with pending operations, log(x + 1) operations preserve sparsity since log(0 + 1) = 0. The multiplicative component is applied first, then the log transformation is applied to sparse values only.

Trigonometric:

Exponential:

Special functions:

The function applies the specified mathematical operation to each element of the dbMatrix object.

Value

A dbMatrix object with the mathematical operation applied to each element.

Examples

mat <- matrix(1, nrow = 3, ncol = 3)
dbmat <- as.dbMatrix(mat)
log(dbmat)
sqrt(dbmat)
sin(dbmat)
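
The sparsity-preserving behaviour of log(x + 1) noted in Details can be checked with plain triplets in base R. This sketch does not call dbMatrix; the ijx data frame and the to_dense helper are hypothetical stand-ins for a sparse table and its dense expansion.

```r
# A sparse matrix stored as (i, j, x) triplets: only non-zero entries are kept.
ijx <- data.frame(i = c(1L, 3L, 2L), j = c(1L, 1L, 3L), x = c(4, 9, 0.5))
dims <- c(3L, 3L)

# Hypothetical helper that expands triplets to a dense base R matrix.
to_dense <- function(tbl, dims) {
  m <- matrix(0, nrow = dims[1], ncol = dims[2])
  m[cbind(tbl$i, tbl$j)] <- tbl$x
  m
}

# log1p(0) is 0, so transforming only the stored values is enough.
ijx_log <- transform(ijx, x = log1p(x))

# Compare against applying log1p to every entry of the dense matrix.
sparse_result <- to_dense(ijx_log, dims)
dense_result <- log1p(to_dense(ijx, dims))
same <- isTRUE(all.equal(sparse_result, dense_result))
```

By contrast, plain log() maps the implicit zeros to -Inf, so it cannot be applied to the stored values alone.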


Summary Methods for dbMatrix Objects

Description

Implements the Summary S4groupGeneric functions for dbMatrix objects.

Usage

## S4 method for signature 'dbMatrix'
Summary(x, ..., na.rm = TRUE)

Arguments

x

A dbMatrix object.

...

Additional arguments (not used, but included for compatibility with the generic).

na.rm

Logical. If TRUE, remove NA values before computation. Always set to TRUE for this implementation.

Details

This method provides implementations for the following S4groupGeneric functions:

Value

The result of applying the respective summary function to the dbMatrix object. The type of the return value depends on the specific function called.

Examples

mat <- matrix(1, nrow = 3, ncol = 3)
dbmat <- as.dbMatrix(mat)
max(dbmat)
min(dbmat)
prod(dbmat)
sum(dbmat)
any(dbmat > 0)
all(dbmat > 0)


Extract or replace values in database-backed matrices

Description

Methods for subsetting and replacing values in dbMatrix objects.

Usage

## S4 method for signature 'dbMatrix,dbIndex,missing,ANY'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'dbMatrix,missing,dbIndex,ANY'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'dbMatrix,dbIndex,dbIndex,ANY'
x[i, j, ..., drop = FALSE]

## S4 method for signature 'dbMatrix,dbMatrix,missing,ANY'
x[i, j, ..., drop = TRUE]

## S4 replacement method for signature 'dbMatrix,dbMatrix,missing,ANY'
x[i, j] <- value

## S4 method for signature 'dbMatrix,dbDenseMatrix,missing,ANY'
x[i, j, ..., drop = TRUE]

## S4 method for signature 'dbMatrix,missing,dbDenseMatrix,ANY'
x[i, j, ..., drop = FALSE]

## S4 method for signature 'dbMatrix,dbDenseMatrix,dbDenseMatrix,ANY'
x[i, j, ..., drop = FALSE]

Arguments

x

A dbMatrix object.

i

Row, logical matrix, or matrix-style index.

j

Column index.

...

Additional arguments.

drop

Ignored; included for matrix API compatibility.

value

Replacement value.

Value

A subsetted or modified dbMatrix, or an extracted vector for matrix-style indexing.


Convert Matrix::Matrix to dbMatrix

Description

Converts in-memory matrix, Matrix::dgeMatrix, or Matrix::dgCMatrix into a dbMatrix object.

Generic function to convert in-memory objects to dbMatrix objects.

Usage

as.dbMatrix(x, con = NULL, name = "dbMatrix", overwrite = FALSE, ...)

Arguments

x

Object to convert (e.g., matrix, dgCMatrix)

con

DBI or duckdb connection object

name

Table name to assign within database

overwrite

Whether to overwrite if table already exists

...

Additional arguments passed to methods

Details

If no con is provided, a temporary in-memory database connection is created. If no name is provided, a unique table name is generated.

Value

A dbDenseMatrix for dense matrix inputs or a dbSparseMatrix for sparse matrix inputs. The returned object keeps the input dimensions and dimnames while storing matrix values in DuckDB.


Convert dbMatrix to in-memory matrix

Description

Converts a dbMatrix object into an in-memory matrix or sparse matrix.

Usage

## S3 method for class 'dbMatrix'
as.matrix(x, ..., sparse = FALSE, names = TRUE)

Arguments

x

A dbMatrix object (dbSparseMatrix or dbDenseMatrix)

...

Additional arguments (not used)

sparse

Logical indicating whether the output should be a sparse matrix. Default: FALSE.

names

Logical indicating whether the output should have dimnames. Default: TRUE.

Details

This method converts a dbMatrix object into an in-memory Matrix::dgCMatrix (sparse = TRUE) or matrix() (default, sparse = FALSE).

Warning: This function can cause memory issues for large dbMatrix objects.

Set sparse = TRUE to convert to a sparse matrix. Set names = TRUE to keep dimnames.

Value

A Matrix::dgCMatrix or matrix


Coerce dbMatrix to dgCMatrix

Description

Coercion methods to convert dbMatrix objects to in-memory dgCMatrix objects. Respects dbMatrix.max_mem_convert option to prevent OOM errors.

Value

A Matrix::dgCMatrix object containing the collected matrix values. Dense inputs are converted to sparse Matrix format after collection.


Coerce dbMatrix to matrix

Description

Coercion methods to convert dbMatrix objects to in-memory matrix objects. Respects dbMatrix.max_mem_convert option to prevent OOM errors.

Value

A base R matrix containing the collected matrix values with the same dimensions and dimnames as the source object.


Coerce matrix to dbMatrix

Description

Coercion methods to convert in-memory matrix objects to dbMatrix objects. Creates a new in-memory DuckDB connection.

Value

A database-backed matrix object. Dense inputs return a dbDenseMatrix, while sparse Matrix::dgCMatrix inputs return a dbSparseMatrix.


Row (column) standard deviations for dbMatrix objects

Description

Calculates the standard deviation for each row (column) of a matrix-like object.

Usage

## S4 method for signature 'dbDenseMatrix'
colSds(
  x,
  rows = NULL,
  cols = NULL,
  na.rm = FALSE,
  center = NULL,
  ...,
  useNames = TRUE
)

## S4 method for signature 'dbSparseMatrix'
colSds(
  x,
  rows = NULL,
  cols = NULL,
  na.rm = FALSE,
  center = NULL,
  ...,
  useNames = TRUE
)

## S4 method for signature 'dbDenseMatrix'
rowSds(
  x,
  rows = NULL,
  cols = NULL,
  na.rm = TRUE,
  center = NULL,
  ...,
  useNames = TRUE
)

## S4 method for signature 'dbSparseMatrix'
rowSds(
  x,
  rows = NULL,
  cols = NULL,
  na.rm = TRUE,
  center = NULL,
  ...,
  useNames = TRUE
)

Arguments

x

A dbMatrix object.

rows

Always NULL for dbMatrix queries. Included for compatibility with the generic.

cols

Always NULL for dbMatrix queries. Included for compatibility with the generic.

na.rm

Always TRUE for dbMatrix queries. Included for compatibility with the generic.

center

Always NULL for dbMatrix queries. Included for compatibility with the generic.

...

Additional arguments (not used, but included for compatibility with the generic).

useNames

Always TRUE for dbMatrix queries. Included for compatibility with the generic.

Value

A named numeric vector containing one sample standard deviation per row or column of x.


Force computation of a dbMatrix

Description

Explicitly compute a dbMatrix and save it to a table in the database. This overrides the default dplyr::compute to use a direct CREATE TABLE AS statement, which is more robust for large tables in DuckDB.

Usage

## S3 method for class 'dbMatrix'
compute(
  x,
  name = NULL,
  temporary = TRUE,
  dimnames = TRUE,
  overwrite = FALSE,
  ...
)

Arguments

x

A dbMatrix object

name

Name of the table to create. If NULL, a random name is generated.

temporary

Logical. If TRUE (default), create a temporary table.

dimnames

Logical. If TRUE (default), the rownames and colnames are saved in the database. This allows full reconstruction of the dbMatrix object using dbProject::dbLoad().

overwrite

Logical. If TRUE, overwrite the table if it already exists. Default is FALSE.

...

Additional arguments passed to methods (ignored).

Value

A dbMatrix object pointing to the new table.


S4 Class for dbDenseMatrix

Description

Representation of dense matrices using an on-disk database. Inherits from dbMatrix.

Value

Objects of class dbDenseMatrix store all matrix entries explicitly in DuckDB. They are typically returned by dbMatrix() or as.dbMatrix() for dense inputs.


S4 virtual class for dbMatrix

Description

Representation of sparse and dense matrices in a database. Each object is used as a connection to a single table that exists within the database. Inherits from dbData.

Create an S4 dbMatrix object in sparse or dense triplet vector format.

Usage

dbMatrix(
  value,
  class = NULL,
  con = NULL,
  overwrite = FALSE,
  name = "dbMatrix",
  dims = NULL,
  dim_names = NULL,
  mtx_rowname_file_path,
  mtx_rowname_col_idx = 1,
  mtx_colname_file_path,
  mtx_colname_col_idx = 1,
  ...
)

Arguments

value

data to be added to the database. See details for supported data types (required)

class

class of the dbMatrix: dbDenseMatrix or dbSparseMatrix (required)

con

DBI or duckdb connection object (required)

overwrite

whether to overwrite if table already exists in database (required)

name

table name to assign within database (required, default: "dbMatrix")

dims

dimensions of the matrix (optional: [int, int])

dim_names

dimension names of the matrix (optional: list(enum, enum))

mtx_rowname_file_path

path to .mtx rowname file to be read into database. By default, no header is assumed. (optional)

mtx_rowname_col_idx

column index of row name file (optional)

mtx_colname_file_path

path to .mtx colname file to be read into database. By default, no header is assumed. (optional)

mtx_colname_col_idx

column index of column name file (optional)

...

additional params to pass to dplyr::copy_to

Details

This function reads data into a pre-existing DuckDB database. Supported value data types:

Value

dbMatrix() returns an initialized dbDenseMatrix or dbSparseMatrix S4 object, depending on class, that points to matrix data stored in DuckDB. The object records the matrix dimensions and dimension names.

Slots

dim_names

row (1) and col (2) names

dims

dimensions of the matrix

init

logical. Whether the object is fully initialized

Examples

dgc <- readRDS(system.file("extdata", "dgc.rds", package = "dbMatrix"))
con <- DBI::dbConnect(duckdb::duckdb(), ":memory:")
dbSparse <- dbMatrix(
  value = dgc,
  con = con,
  name = "sparse_matrix",
  class = "dbSparseMatrix",
  overwrite = TRUE
)
dbSparse

dbMatrix_from_tbl

Description

Constructs a dbSparseMatrix object from a tbl_duckdb_connection object.

Usage

dbMatrix_from_tbl(
  tbl,
  rownames_colName,
  colnames_colName,
  value_colName = NULL,
  name = "dbMatrix",
  overwrite = FALSE,
  row_names = NULL,
  col_names = NULL,
  i_col = NULL,
  j_col = NULL
)

Arguments

tbl

tbl_duckdb_connection table in DuckDB database in long format

rownames_colName

character column name of rownames in tbl (required)

colnames_colName

character column name of colnames in tbl (required)

value_colName

character column name containing pre-aggregated integer counts. If NULL (default), counts occurrences of each row-column pair. (optional)

name

table name to assign within database (required, default: "dbMatrix")

overwrite

whether to overwrite if table already exists in database (required)

row_names

character vector of pre-computed row names (sorted). If NULL (default), row names are extracted from the table. (optional)

col_names

character vector of pre-computed column names (sorted). If NULL (default), column names are extracted from the table. (optional)

i_col

character column name containing pre-computed row indices (1-based integers). If provided with j_col, skips index encoding for optimal performance. (optional)

j_col

character column name containing pre-computed column indices (1-based integers). If provided with i_col, skips index encoding for optimal performance. (optional)

Details

The tbl_duckdb_connection object must contain dimension names as columns in long format.

If value_colName is provided, the function uses pre-aggregated counts from that column. This is useful when the input table already contains aggregated counts (e.g., from a GROUP BY + SUM operation). If value_colName is NULL (default), the function counts occurrences of each row-column pair.

When row_names and/or col_names are provided, the function uses these directly instead of querying distinct values from the table. This can significantly improve performance when the input table is a complex lazy query (e.g., result of spatial joins).

When i_col and j_col are provided, the function uses these pre-computed integer indices directly, skipping expensive string-to-index encoding. This is the fastest path.

Value

dbMatrix object
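
The counting and index-encoding steps described in Details can be sketched on an in-memory data frame. This is an illustration in base R, not the queries the function issues against DuckDB; the column names gene and cell and the toy data are hypothetical.

```r
# Long-format table: one record per observation, dimension names as columns.
long <- data.frame(
  gene = c("g2", "g1", "g2", "g3", "g2"),
  cell = c("c1", "c1", "c1", "c2", "c2"),
  stringsAsFactors = FALSE
)

# Sorted dimension names; these play the role of row_names / col_names.
row_names <- sort(unique(long$gene))
col_names <- sort(unique(long$cell))

# Count occurrences of each row-column pair (the value_colName = NULL case).
counts <- aggregate(n ~ gene + cell, data = cbind(long, n = 1L), FUN = sum)

# Encode names as 1-based integer indices (the step that i_col / j_col skip).
ijx <- data.frame(
  i = match(counts$gene, row_names),
  j = match(counts$cell, col_names),
  x = counts$n
)
ijx <- ijx[order(ijx$j, ijx$i), ]
```

If the input already carried a count column, the aggregation step would be replaced by reading that column directly, and if it already carried integer indices, the match() step would be unnecessary.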


dbMatrix Package Global Options

Description

The following global options can be modified to control the behavior of the dbMatrix package.

Details

Use options() to set the below options.

Value

No return value. This documentation page describes package options.

Options


S4 Class for dbSparseMatrix

Description

Representation of sparse matrices using an on-disk database. Inherits from dbMatrix.

Value

Objects of class dbSparseMatrix store only non-zero matrix entries in DuckDB. They are typically returned by dbMatrix() or as.dbMatrix() for sparse inputs.
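
The difference between storing only non-zero entries and storing every entry can be seen by writing one small matrix both ways. This base R sketch uses hypothetical data frames sparse_ijx and dense_ijx to stand in for the database tables; the actual table layout used by the package may differ.

```r
m <- matrix(c(0, 2, 0,
              0, 0, 5), nrow = 2, byrow = TRUE)

# Sparse-style storage: keep only the non-zero entries.
nz <- which(m != 0, arr.ind = TRUE)
sparse_ijx <- data.frame(i = nz[, "row"], j = nz[, "col"], x = m[nz])

# Dense-style storage: keep every entry, zeros included.
dense_ijx <- data.frame(
  i = as.vector(row(m)),
  j = as.vector(col(m)),
  x = as.vector(m)
)
```

Here the sparse form holds 2 rows and the dense form holds 6, so the saving grows with the share of zeros in the matrix.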


Perform Streaming SVD on a dbMatrix

Description

Perform Streaming SVD on a dbMatrix

Usage

db_svd(
  dbm,
  k = 10,
  center = TRUE,
  scale = FALSE,
  center_rows = NULL,
  memory_limit = getOption("dbMatrix.svd_memory", 8 * 1024^3),
  return_format = c("svd", "pca")
)

Arguments

dbm

A dbSparseMatrix object

k

Number of singular values to compute

center

Logical, center rows (default TRUE)

scale

Logical, scale rows (default FALSE)

center_rows

Logical, center rows vs columns (default TRUE for standard PCA)

memory_limit

Memory limit, in bytes, for the fast path. Default: 8 GB (8 * 1024^3 bytes).

return_format

"svd" (d, u, v) or "pca" (eigenvalues, loadings, coords)

Value

List with SVD or PCA components
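
The two return formats correspond to quantities that are commonly related as follows. This sketch uses base R svd() on a small in-memory matrix rather than db_svd(), and the scaling convention shown (eigenvalues as d^2 / (n - 1), coordinates as u times d) is an assumption; this page does not state which convention db_svd() applies.

```r
set.seed(1)
X <- matrix(rnorm(40), nrow = 10, ncol = 4)  # 10 observations, 4 features

# Center each column (feature), then keep k components of the SVD.
k <- 2
Xc <- scale(X, center = TRUE, scale = FALSE)
s <- svd(Xc, nu = k, nv = k)

# "svd"-style pieces: singular values d, left vectors u, right vectors v.
d <- s$d[seq_len(k)]

# "pca"-style pieces under the usual convention (an assumption here).
eigenvalues <- d^2 / (nrow(X) - 1)    # variance along each component
loadings <- s$v                       # feature weights
coords <- s$u %*% diag(d, nrow = k)   # observation scores

# Cross-check against stats::prcomp on the same data.
ref <- prcomp(X, center = TRUE, scale. = FALSE)
```

Signs of individual components are arbitrary in any SVD, so comparisons between implementations should be made up to sign.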


Dimensions of an Object

Description

Retrieve the dimension of an object.

Usage

## S4 method for signature 'dbMatrix'
dim(x)

Arguments

x

dbMatrix object

Value

An integer vector of length 2 giving the number of rows and columns in x.


get_MM_dim

Description

Internal function to read dimensions of a .mtx file

Usage

get_MM_dim(mtx_file_path)

Arguments

mtx_file_path

path to .mtx file to be read into database

Details

Scans for the header of an mtx file (starting with %) and takes one more line representing the dimensions and number of nonzero values.

Note: the header size can vary depending on the .mtx file.

Value

integer vector of dimensions
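
The header scan described in Details can be sketched in base R as follows. The function read_mm_dims is a hypothetical stand-in written for illustration, not the package's code, and it also returns the count of stored entries alongside the two dimensions.

```r
# Hypothetical stand-in for the header scan.
read_mm_dims <- function(path) {
  con <- file(path, open = "r")
  on.exit(close(con))
  repeat {
    line <- readLines(con, n = 1)
    if (length(line) == 0) stop("no size line found in ", path)
    # Skip the banner and comment lines, which start with '%'.
    if (!startsWith(line, "%") && nzchar(trimws(line))) break
  }
  # The size line holds: rows, columns, number of stored entries.
  size <- as.integer(strsplit(trimws(line), "[[:space:]]+")[[1]])
  c(nrow = size[1], ncol = size[2], nnz = size[3])
}

# Write a tiny .mtx file with a two-line header and read its size line.
tmp <- tempfile(fileext = ".mtx")
writeLines(c(
  "%%MatrixMarket matrix coordinate real general",
  "% a comment line of arbitrary length",
  "3 4 2",
  "1 1 2.5",
  "3 4 1.0"
), tmp)
dims <- read_mm_dims(tmp)
unlink(tmp)
```

Reading line by line means the scan stops as soon as the size line is found, regardless of how many comment lines precede it.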


get_MM_dimnames

Description

Internal function to read row and column names of a .mtx file

Usage

get_MM_dimnames(
  mtx_file_path,
  mtx_rowname_file_path,
  mtx_rowname_col_idx = 1,
  mtx_colname_file_path,
  mtx_colname_col_idx = 1,
  ...
)

Arguments

mtx_file_path

path to .mtx file to be read into database

mtx_rowname_file_path

path to .mtx rowname file to be read into database. by default, no header is assumed.

mtx_rowname_col_idx

column index of row name file

mtx_colname_file_path

path to .mtx colname file to be read into database. by default, no header is assumed.

mtx_colname_col_idx

column index of column name file

...

additional params to pass to data.table::fread()

Details

Can be used to read row and column names from .mtx files. Note: these files must not contain a header (colnames).

The mtx_rowname_col_idx and mtx_colname_col_idx can be used to specify the column index of the row and column name files, respectively. By default, the first column is used for both.

TODO: Support for reading in only rownames or colnames.

Value

list of row and column name character vectors


get_con

Description

Get the live DBI connection associated with a database-backed object.

Usage

get_con(dbMatrix)

Arguments

dbMatrix

A database-backed object inheriting from dbData.

Value

A live DBI connection associated with the database-backed object.


get_dbdir

Description

Get the DuckDB database directory used by a database-backed object.

Usage

get_dbdir(dbMatrix)

Arguments

dbMatrix

A database-backed object inheriting from dbData.

Value

A character scalar giving the DuckDB database directory used by the object.


get_tblName

Description

Get the DuckDB table name associated with a database-backed object.

Usage

get_tblName(dbMatrix)

Arguments

dbMatrix

A database-backed object inheriting from dbData.

Value

A character scalar giving the DuckDB table name associated with the object.


Return the First or Last Parts of an Object

Description

Returns the first or last parts of a vector, matrix, array, table, data frame or function. Since head() and tail() are generic functions, they have been extended to other classes, including "ts" from stats.

Usage

## S4 method for signature 'dbMatrix'
head(x, n = 6L, ...)

## S4 method for signature 'dbMatrix'
tail(x, n = 6L, ...)

Arguments

x

an object

n

an integer vector of length up to dim(x) (or 1, for non-dimensioned objects). A logical is silently coerced to integer. Values specify the indices to be selected in the corresponding dimension (or along the length) of the object. A positive value of n[i] includes the first/last n[i] indices in that dimension, while a negative value excludes the last/first abs(n[i]), including all remaining indices. NA or non-specified values (when length(n) < length(dim(x))) select all indices in that dimension. Must contain at least one non-missing value.

...

arguments to be passed to or from other methods.

Value

A dbMatrix object containing the first or last n rows of x, with updated dimensions and row names.


Element-wise is.na for dbMatrix

Description

Returns a dbMatrix with numeric values indicating NA positions (1 = NA, 0 = not NA).

Usage

## S4 method for signature 'dbMatrix'
is.na(x)

Arguments

x

A dbMatrix object.

Value

A dbMatrix with same dimensions, containing 1 where the original value was NA and 0 otherwise.

Examples

mat <- matrix(c(1, NA, 3, NA), nrow = 2)
dbmat <- as.dbMatrix(mat)
is.na(dbmat)


Length of a dbMatrix Object

Description

Get or set the length of vectors (including lists) and factors, and of any other R object for which a method has been defined.

Usage

## S4 method for signature 'dbMatrix'
length(x)

Arguments

x

dbMatrix object

Value

A length-one integer giving the number of stored elements in x.


Map dimnames to i,j indices

Description

Map dimnames to i,j indices

Usage

map_ijx_dimnames(dbMatrix, colName_i, colName_j)

Arguments

dbMatrix

dbMatrix object

colName_i

name of the row-names column to add to the database

colName_j

name of the column-names column to add to the database

Details

Constructs a table in a database that contains the accompanying dimnames for a dbMatrix. The resulting table contains the columns i, j, and x, plus the row-name and column-name columns named by colName_i and colName_j.

Value

A lazy tbl_dbi with i, j, x, and the mapped row/column name columns.


Arithmetic Mean for dbMatrix objects

Description

Generic function for the (trimmed) arithmetic mean.

Usage

## S4 method for signature 'dbDenseMatrix'
mean(x, ...)

## S4 method for signature 'dbSparseMatrix'
mean(x, ...)

Arguments

x

dbMatrix object

...

further arguments passed to or from other methods.

Value

A length-one numeric vector giving the arithmetic mean of all entries in x.


The names of a dbMatrix Object

Description

The names of a dbMatrix Object

Usage

## S4 method for signature 'dbDenseMatrix'
names(x)

Arguments

x

A dbMatrix object

Value

A character vector of the names of the 1D dbMatrix object (1D matrices only)


The Number of Rows/Columns of a dbMatrix Object

Description

nrow and ncol return the number of rows or columns present in x.

Usage

nrow.dbMatrix(x)

ncol.dbMatrix(x)

Arguments

x

dbMatrix object

Value

A length-one integer giving the number of rows or columns in x.


Compute a dense COO table in a database connection

Description

Precomputes a COO list table in a specified database connection in column-major order. This can speed up operations that break the sparsity of a dbSparseMatrix, such as + or - arithmetic operations.

Usage

precompute(conn, m, n, verbose = FALSE)

Arguments

conn

duckdb database connection

m

number of rows of precomputed dbMatrix table

n

number of columns of precomputed dbMatrix table

verbose

logical, print progress messages. default: FALSE.

Details

The m and n parameters must exceed the maximum row and column indices of the dbMatrix in order to be used for densifying any dbMatrix. If these parameters are less than the maximum row and column indices, a new precomputed table will be automatically generated with the name 'precomp_mXn'.

In such cases, run this function again with larger m and n. To manually remove the precomputed table, set options(dbMatrix.precomp = NULL) in the R console.

Value

A tbl_dbi object referencing the newly created precomputed lookup table in DuckDB.
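
The idea of a dense COO lookup table in column-major order, and how it helps densify a sparse matrix, can be sketched in base R. This illustration uses in-memory data frames rather than DuckDB tables; the names grid, sparse_ijx, and dense_ijx are hypothetical.

```r
m <- 3L
n <- 2L

# Dense (i, j) grid; expand.grid varies i fastest, which is column-major order.
grid <- expand.grid(i = seq_len(m), j = seq_len(n))

# Sparse triplets to densify.
sparse_ijx <- data.frame(i = c(2L, 3L), j = c(1L, 2L), x = c(7, 1))

# Left join onto the grid, then fill the missing entries with zero.
dense_ijx <- merge(grid, sparse_ijx, by = c("i", "j"), all.x = TRUE)
dense_ijx$x[is.na(dense_ijx$x)] <- 0
dense_ijx <- dense_ijx[order(dense_ijx$j, dense_ijx$i), ]

# Column-major order means the x column reshapes directly into a matrix.
dense <- matrix(dense_ijx$x, nrow = m, ncol = n)
```

Because the grid depends only on m and n, it can be built once and reused for any matrix whose indices fit inside it.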


print_array

Description

Generate array for pretty printing of matrix values

Usage

print_array(
  i = NULL,
  j = NULL,
  x = NULL,
  dims,
  rownames = rep("", dims[1]),
  class = c("sparse", "dense"),
  fill = ".",
  digits = 5L
)

Arguments

i, j, x

matched vectors of integers in i and j, with value in x

dims

dimensions of the array (integer vector of 2)

fill

fill character

digits

default = 5. If numeric, round to this number of digits

Value

No return value. Called for its side effect of printing a formatted matrix preview to the console.


Row (column) means for dbMatrix objects

Description

Calculates the mean for each row (column) of a matrix-like object.

Usage

## S4 method for signature 'dbMatrix'
rowMeans(x, na.rm = FALSE, dims = 1, ...)

## S4 method for signature 'dbMatrix'
colMeans(x, na.rm = FALSE, dims = 1, ...)

Arguments

x

An NxK matrix-like object, a numeric data frame, or an array-like object of two or more dimensions.

na.rm

Always TRUE for dbMatrix queries. Included for compatibility with the generic.

dims

Always 1 for dbMatrix queries. Included for compatibility with the generic.

...

Additional arguments passed to specific methods.

Value

A named numeric vector containing one mean per row or column of x.


Row (column) sums for dbMatrix objects

Description

Calculates the sum for each row (column) of a matrix-like object.

Usage

## S4 method for signature 'dbDenseMatrix'
rowSums(x, na.rm = FALSE, dims = 1, ..., memory = FALSE)

## S4 method for signature 'dbSparseMatrix'
rowSums(x, na.rm = FALSE, dims = 1, ...)

## S4 method for signature 'dbDenseMatrix'
colSums(x, na.rm = FALSE, dims = 1, ...)

## S4 method for signature 'dbSparseMatrix'
colSums(x, na.rm = FALSE, dims = 1, ...)

Arguments

x

An NxK matrix-like object, a numeric data frame, or an array-like object of two or more dimensions.

na.rm

Always TRUE for dbMatrix. Included for compatibility with the generic.

dims

Always 1 for dbMatrix queries. Included for compatibility with the generic.

...

Additional arguments passed to specific methods.

memory

logical. If FALSE (default), results returned as dbDenseMatrix. This is recommended for large computations. Set to TRUE to return the results as a vector.

Value

A named numeric vector containing one sum per row or column of x.


Row (column) variances for dbMatrix objects

Description

Calculates the variance for each row (column) of a matrix-like object.

Usage

## S4 method for signature 'dbDenseMatrix'
rowVars(
  x,
  rows = NULL,
  cols = NULL,
  na.rm = TRUE,
  center = NULL,
  ...,
  useNames = TRUE
)

## S4 method for signature 'dbSparseMatrix'
rowVars(
  x,
  rows = NULL,
  cols = NULL,
  na.rm = TRUE,
  center = NULL,
  ...,
  useNames = TRUE
)

## S4 method for signature 'dbDenseMatrix'
colVars(
  x,
  rows = NULL,
  cols = NULL,
  na.rm = TRUE,
  center = NULL,
  ...,
  useNames = TRUE
)

## S4 method for signature 'dbSparseMatrix'
colVars(
  x,
  rows = NULL,
  cols = NULL,
  na.rm = TRUE,
  center = NULL,
  ...,
  useNames = TRUE
)

Arguments

x

A dbMatrix object.

rows

Always NULL for dbMatrix queries. Included for compatibility with the generic.

cols

Always NULL for dbMatrix queries. Included for compatibility with the generic.

na.rm

Always TRUE for dbMatrix queries. Included for compatibility with the generic.

center

Always NULL for dbMatrix queries. Included for compatibility with the generic.

...

Additional arguments (not used, but included for compatibility with the generic).

useNames

Always TRUE for dbMatrix queries. Included for compatibility with the generic.

Value

A named numeric vector containing one sample variance per row or column of x.
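
For a sparse matrix, a per-row sample variance has to count the implicit zeros as observations. The base R sketch below shows one way to do this from triplets using sums of x and x^2; it is an illustration under that assumption, not necessarily the query the package runs, and the names ijx, s1, and s2 are hypothetical.

```r
# Sparse 2 x 4 matrix as triplets; all other entries are implicit zeros.
ijx <- data.frame(i = c(1L, 1L, 2L), j = c(1L, 3L, 4L), x = c(2, 4, 6))
n_rows <- 2L
n_cols <- 4L

# Per-row sums of x and x^2 over the stored values only.
row_id <- factor(ijx$i, levels = seq_len(n_rows))
s1 <- tapply(ijx$x, row_id, sum)
s2 <- tapply(ijx$x^2, row_id, sum)
s1[is.na(s1)] <- 0  # rows with no stored values
s2[is.na(s2)] <- 0

# Sample variance with n_cols observations per row (zeros included).
row_mean <- s1 / n_cols
row_var <- as.numeric((s2 - n_cols * row_mean^2) / (n_cols - 1))

# Reference: expand to a dense matrix and use stats::var per row.
dense <- matrix(0, nrow = n_rows, ncol = n_cols)
dense[cbind(ijx$i, ijx$j)] <- ijx$x
reference <- apply(dense, 1, var)
```

Taking the square root of row_var gives the corresponding sample standard deviations. The one-pass formula can lose precision when the mean is large relative to the spread, so a two-pass approach may be preferable for such data.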


Retrieve and Set Row (Column) Dimension Names of dbMatrix Objects

Description

Retrieve and Set Row (Column) Dimension Names of dbMatrix Objects

Usage

rownames.dbMatrix(x, do.NULL = TRUE, prefix = "row")

## S3 replacement method for class 'dbMatrix'
rownames(x) <- value

colnames.dbMatrix(x, do.NULL = TRUE, prefix = "col")

## S3 replacement method for class 'dbMatrix'
colnames(x) <- value

## S4 method for signature 'dbMatrix'
dimnames(x)

## S4 replacement method for signature 'dbMatrix,list'
dimnames(x) <- value

Arguments

x

a matrix-like R object, with at least two dimensions for colnames.

do.NULL

Not used for this method. Included for compatibility with the generic.

prefix

Not used for this method. Included for compatibility with the generic.

value

a valid value for that component of dimnames(x). For a matrix or array this is either NULL or a character vector of non-zero length equal to the appropriate dimension.

Value

rownames() and colnames() return character vectors of dimension names. dimnames() returns a length-2 list containing row and column name vectors. The replacement forms return the modified dbMatrix object.


sim_dgc

Description

Simulate a dbSparseMatrix in memory

Simulate a dbDenseMatrix in memory.

Usage

sim_duckdb(value = datasets::iris, name = "test", con = NULL, memory = TRUE)

sim_dgc(num_rows = 50, num_cols = 50, n_vals = 50)

sim_denseMat(num_rows = 50, num_cols = 50)

sim_ijx_matrix(mat_type = NULL, num_rows = 50, num_cols = 50, seed_num = 42)

sim_dbSparseMatrix(
  num_rows = 50,
  num_cols = 50,
  seed_num = 42,
  name = "sparse_test",
  memory = FALSE
)

sim_dbDenseMatrix(
  num_rows = 50,
  num_cols = 50,
  seed_num = 42,
  name = "dense_test",
  memory = FALSE
)

Arguments

num_rows

The number of rows in the matrix (default: 50)

num_cols

The number of columns in the matrix (default: 50)

seed_num

The seed number for reproducibility (default: 42)

Details

This function generates a simulated sparse matrix (dgCMatrix) with number of rows and columns and sets n_vals random values to a non-zero value.

This function generates a simulated dense matrix object with a specified number of rows and columns.

This function generates an ijx representation of a simulated dgCMatrix object with a specified number of rows and columns and sets 50 random values to a non-zero value.

Value

A dgCMatrix object

Functions


Matrix Transpose

Description

Given a dbMatrix x, t returns the transpose of x.

Usage

## S4 method for signature 'dbMatrix'
t(x)

Arguments

x

dbMatrix object

Value

dbMatrix object


to_ijx_disk

Description

to_ijx_disk

Usage

to_ijx_disk(con, name)

Arguments

con

duckdb connection

name

name of table to convert to ijx on disk

Value

remote table in long format unpivoted from wide format matrix


Convert dbMatrix to named ijx table

Description

Converts a dbMatrix to a lazy long table where row and column indices are replaced by dimension names.

Usage

to_named_ijx_tbl(
  x,
  row_col = "row_name",
  col_col = "col_name",
  compute = FALSE
)

Arguments

x

A dbMatrix object (dbSparseMatrix or dbDenseMatrix)

row_col

Name for the row-name column (default: "row_name")

col_col

Name for the column-name column (default: "col_name")

compute

Whether to materialize as temp table (default: FALSE)

Value

A lazy tbl with columns: row_col, col_col, x
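
Replacing integer indices with dimension names amounts to two lookups. The base R sketch below shows the shape of the result on an in-memory data frame; it does not call to_named_ijx_tbl(), and the lookup tables row_map and col_map are hypothetical.

```r
row_names <- c("geneA", "geneB", "geneC")
col_names <- c("cell1", "cell2")
ijx <- data.frame(i = c(1L, 3L, 2L), j = c(1L, 1L, 2L), x = c(5, 2, 8))

# Lookup tables mapping integer indices to dimension names.
row_map <- data.frame(i = seq_along(row_names), row_name = row_names)
col_map <- data.frame(j = seq_along(col_names), col_name = col_names)

# Join on the indices, then keep only the named columns and the value.
named <- merge(merge(ijx, row_map, by = "i"), col_map, by = "j")
named <- named[order(named$j, named$i), c("row_name", "col_name", "x")]
```

The output column names here match the defaults of row_col and col_col.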
