Type: Package
Title: Time-Based Rolling Functions
Version: 0.1.6
Description: Provides rolling statistical functions based on date and time windows instead of n-lagged observations.
URL: https://mps9506.github.io/tbrf/
BugReports: https://github.com/mps9506/tbrf/issues
License: GPL-3 | file LICENSE
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.1
Depends: R (>= 2.10)
Imports: boot, dplyr, lubridate, purrr, rlang, tibble, tidyr
Suggests: spelling, covr, ggalt, ggplot2, testthat, knitr, rmarkdown
VignetteBuilder: knitr
Language: en-US
Config/Needs/website: mps9506/mpsTemplates
NeedsCompilation: no
Packaged: 2025-04-02 15:12:30 UTC; michael.schramm
Author: Michael Schramm [aut, cre, cph], Frank Harrell [ctb]
Maintainer: Michael Schramm <mpschramm@gmail.com>
Repository: CRAN
Date/Publication: 2025-04-02 16:00:05 UTC

Dissolved oxygen measurements from the Tres Palacios River

Description

Data from the Texas Commission on Environmental Quality Surface Water Quality Monitoring Information System. The Average_DO field is the mean of the dissolved oxygen concentrations (mg/L) measured at a field site on that day. The Min_DO field is the minimum dissolved oxygen concentration measured at that site on that day.

Usage

data(Dissolved_Oxygen)

Format

A data frame with 236 rows and 6 variables:

Station_ID

unique water quality monitoring station identifier

Date

sampling date in yyyy-mm-dd format

Param_Code

unique parameter code

Param_Desc

parameter description with units

Average_DO

mean of dissolved oxygen measurements, in mg/L

Min_DO

minimum dissolved oxygen measurement, in mg/L

Source

https://www80.tceq.texas.gov/SwqmisPublic/public/default.htm


Confidence Intervals for Binomial Probabilities

Description

An implementation of the binconf function in Frank Harrell's Hmisc package. Produces 1-alpha confidence intervals for binomial probabilities.

Usage

binom_ci(
  x,
  n,
  alpha = 0.05,
  method = c("wilson", "exact", "asymptotic"),
  return.df = FALSE
)

Arguments

x

vector containing the number of "successes" for binomial variates.

n

vector containing the numbers of corresponding observations.

alpha

probability of a type I error, so confidence coefficient = 1-alpha.

method

character string specifying which method to use. The "exact" method uses the F distribution to compute exact (based on the binomial cdf) intervals; the "wilson" interval is score-test-based; and the "asymptotic" is the text-book, asymptotic normal interval. Following Agresti and Coull, the Wilson interval is to be preferred and so is the default.

return.df

logical flag to indicate that a data frame rather than a matrix be returned.

Author(s)

Frank Harrell, modified by Michael Schramm

References

A. Agresti and B.A. Coull, Approximate is better than "exact" for interval estimation of binomial proportions, American Statistician, 52:119–126, 1998.

R.G. Newcombe, Logit confidence intervals and the inverse sinh transformation, American Statistician, 55:200–202, 2001.

L.D. Brown, T.T. Cai and A. DasGupta, Interval estimation for a binomial proportion (with discussion), Statistical Science, 16:101–133, 2001.

Examples

binom_ci(46,50,method="wilson")
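
For readers who want to see what the default "wilson" method computes, the following is a minimal base R sketch of the Wilson score interval. The helper name wilson_sketch is hypothetical and is not part of the package; binom_ci itself may handle edge cases (such as x = 0 or x = n) differently.

```r
# Hypothetical helper: Wilson score interval for x successes in n trials.
wilson_sketch <- function(x, n, alpha = 0.05) {
  p <- x / n                          # observed proportion
  z <- qnorm(1 - alpha / 2)           # two-sided normal critical value
  denom <- 1 + z^2 / n
  centre <- (p + z^2 / (2 * n)) / denom
  half <- z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / denom
  c(PointEst = p, Lower = centre - half, Upper = centre + half)
}

wilson_sketch(46, 50)
```

The result can be cross-checked against prop.test(46, 50, correct = FALSE)$conf.int, which also reports a score-based interval.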

Calculates the Geometric Mean

Description

Originally from Paul McMurdie, Ben Bolker, and Gregor on Stack Overflow: https://stackoverflow.com/questions/2602583/geometric-mean-is-there-a-built-in

Usage

gm_mean(x, na.rm = TRUE, zero.propagate = FALSE)

Arguments

x

vector of numeric values

na.rm

logical TRUE/FALSE. Remove NA values.

zero.propagate

logical TRUE/FALSE. Allows the optional propagation of zeros.

Value

the geometric mean of the vector
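
As a point of reference, the geometric mean of strictly positive values is the exponential of the mean of their logarithms. The sketch below assumes positive, non-missing input and uses a hypothetical helper name, geo_mean_sketch; it does not reproduce how gm_mean handles na.rm or zero.propagate.

```r
# Hypothetical helper: geometric mean of strictly positive, non-missing values.
geo_mean_sketch <- function(x) {
  stopifnot(is.numeric(x), !anyNA(x), all(x > 0))
  exp(mean(log(x)))   # average on the log scale, then back-transform
}

geo_mean_sketch(c(1, 10, 100))
```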


Returns the Geomean and CI

Description

Generates the geometric mean and confidence intervals using the bootstrap.

Usage

gm_mean_ci(
  window,
  conf = 0.95,
  na.rm = TRUE,
  type = "basic",
  R = 1000,
  parallel = "no",
  ncpus = getOption("boot.ncpus", 1L),
  cl = NULL,
  zero.propagate = FALSE
)

Arguments

window

vector of data values

conf

confidence level of the required interval. Use NA to skip calculating the bootstrapped CI.

na.rm

logical TRUE/FALSE. Remove NAs from the dataset. Defaults to TRUE.

type

character string, one of c("norm","basic", "stud", "perc", "bca"). "all" is not a valid value. See boot.ci

R

the number of bootstrap replicates. see boot

parallel

The type of parallel operation to be used (if any). see boot

ncpus

integer: number of processes to be used in parallel operation. see boot

cl

optional parallel or snow cluster for use if parallel = "snow". see boot

zero.propagate

logical TRUE/FALSE Allows the optional propagation of zeros.

Value

named list with geometric mean and (optionally) specified confidence interval


List NA

Description

function to return tibble with NAs as specified

Usage

list_NA(x)

Arguments

x

named vector

Value

empty tibble


Returns the mean and CI

Description

Generates mean and confidence intervals using bootstrap.

Usage

mean_ci(
  window,
  conf = 0.95,
  na.rm = TRUE,
  type = "basic",
  R = 1000,
  parallel = "no",
  ncpus = getOption("boot.ncpus", 1L),
  cl = NULL
)

Arguments

window

vector of data values

conf

confidence level of the required interval. Use NA to skip calculating the bootstrapped CI.

na.rm

logical TRUE/FALSE. Remove NAs from the dataset. Defaults to TRUE.

type

character string, one of c("norm","basic", "stud", "perc", "bca"). "all" is not a valid value. See boot.ci

R

the number of bootstrap replicates. see boot

parallel

The type of parallel operation to be used (if any). see boot

ncpus

integer: number of processes to be used in parallel operation. see boot

cl

optional parallel or snow cluster for use if parallel = "snow". see boot

Value

named list with mean and (optionally) specified confidence interval
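
The type and R arguments above refer to boot and boot.ci from the boot package. To illustrate what a "basic" bootstrap interval is, the following base R sketch resamples by hand instead of calling boot. The helper name basic_boot_ci_sketch and the element names of its result are hypothetical, and boot.ci interpolates between order statistics, so its endpoints would generally differ slightly from these.

```r
# Hypothetical helper: hand-rolled "basic" bootstrap interval for the mean.
basic_boot_ci_sketch <- function(x, conf = 0.95, R = 1000) {
  x <- x[!is.na(x)]
  theta_hat <- mean(x)
  # Resample with replacement R times and recompute the mean each time.
  theta_star <- replicate(R, mean(x[sample.int(length(x), replace = TRUE)]))
  a <- (1 - conf) / 2
  q <- quantile(theta_star, probs = c(1 - a, a), names = FALSE)
  # Basic interval: reflect the bootstrap quantiles around the estimate.
  list(mean = theta_hat,
       lower = 2 * theta_hat - q[1],
       upper = 2 * theta_hat - q[2])
}

set.seed(1)  # make the resampling reproducible
basic_boot_ci_sketch(c(4.2, 5.1, 6.3, 5.8, 4.9, 7.0, 6.1, 5.5))
```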


Returns the median and CI

Description

Generates median and confidence intervals using bootstrap.

Usage

median_ci(
  window,
  conf = 0.95,
  na.rm = TRUE,
  type = "basic",
  R = 1000,
  parallel = "no",
  ncpus = getOption("boot.ncpus", 1L),
  cl = NULL
)

Arguments

window

vector of data values

conf

confidence level of the required interval. Use NA to skip calculating the bootstrapped CI.

na.rm

logical TRUE/FALSE. Remove NAs from the dataset. Defaults to TRUE.

type

character string, one of c("norm","basic", "stud", "perc", "bca"). "all" is not a valid value. See boot.ci

R

the number of bootstrap replicates. see boot

parallel

The type of parallel operation to be used (if any). see boot

ncpus

integer: number of processes to be used in parallel operation. see boot

cl

optional parallel or snow cluster for use if parallel = "snow". see boot

Value

named list with median and (optionally) specified confidence interval


Open Window

Description

calculates the period at each row from the row of interest

Usage

open_window(x, tcolumn, unit = "years", n, i)

Arguments

x

dataframe

tcolumn

time column

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row number

Value

vector
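
The idea of a time-based window can be sketched in base R as follows. This is an illustration only: it assumes the window for row i contains every observation whose timestamp lies between 0 and n units before that row's timestamp (inclusive), it works in days only, and the helper name window_sketch is hypothetical. The boundary handling in open_window may differ.

```r
# Hypothetical helper: values observed within n_days days up to and including row i.
window_sketch <- function(x, dates, n_days, i) {
  # Days elapsed between each observation and the row of interest.
  elapsed <- as.numeric(difftime(dates[i], dates, units = "days"))
  x[elapsed >= 0 & elapsed <= n_days]
}

dates <- as.Date(c("2020-01-01", "2020-01-03", "2020-01-10"))
values <- c(1, 3, 5)

# Rows 1 and 2 fall inside the 5-day window that ends at row 2.
window_sketch(values, dates, n_days = 5, i = 2)
```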


Time-Based Rolling Binomial Probability

Description

Produces a rolling time-window based vector of binomial probability and confidence intervals.

Usage

tbr_binom(.tbl, x, tcolumn, unit = "years", n, alpha = 0.05)

Arguments

.tbl

dataframe with two variables.

x

indicates the variable column containing "success" and "failure" observations coded as 1 or 0.

tcolumn

indicates the variable column containing Date or Date-Time values.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window in the selected units.

alpha

numeric, probability of a type 1 error, so confidence coefficient = 1-alpha

Value

tibble with binomial point estimate and confidence intervals.

See Also

binom_ci

Examples

## Generate Sample Data
df <- tibble::tibble(
date = sample(seq(as.Date('2000-01-01'), as.Date('2015/12/30'), by = "day"), 100),
value = rbinom(100, 1, 0.25)
)

## Run Function
tbr_binom(df, x = value,
tcolumn = date, unit = "years", n = 5,
alpha = 0.1)

Binomial test based on time window

Description

Binomial test based on time window

Usage

tbr_binom_window(x, tcolumn, unit = "years", n, i, alpha)

Arguments

x

column containing "success" and "failure" observations as 0 or 1

tcolumn

formatted time column

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

rows

alpha

numeric, probability of a type 1 error, so confidence coefficient = 1-alpha

Value

list


Time-Based Rolling Geometric Mean

Description

Produces a rolling time-window based vector of geometric means and confidence intervals.

Usage

tbr_gmean(.tbl, x, tcolumn, unit = "years", n, ...)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the values to calculate the geometric mean.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

...

additional arguments passed to gm_mean_ci

Value

tibble with columns for the rolling geometric mean and upper and lower confidence levels.

See Also

gm_mean_ci

Examples


## Return a tibble with new rolling geometric mean column
tbr_gmean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5)

## Not run: 
## Return a tibble with rolling geometric mean and 95% CI
tbr_gmean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)

Geometric mean based on a time-window

Description

Geometric mean based on a time-window

Usage

tbr_gmean_window(x, tcolumn, unit = "years", n, i, ...)

Arguments

x

column containing the values to calculate the geometric mean.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row

...

additional arguments passed to gm_mean_ci

Value

list


Time-Based Rolling Mean

Description

Produces a rolling time-window based vector of means and confidence intervals.

Usage

tbr_mean(.tbl, x, tcolumn, unit = "years", n, ...)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the numeric values to calculate the mean.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

...

additional arguments passed to mean_ci.

Value

tibble with columns for the rolling mean and upper and lower confidence intervals.

See Also

mean_ci

Examples

## Return a tibble with new rolling mean column
tbr_mean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5)

## Not run: 
## Return a tibble with rolling mean and 95% CI
tbr_mean(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)

Mean Based on a Time-Window

Description

Mean Based on a Time-Window

Usage

tbr_mean_window(x, tcolumn, unit = "years", n, i, ...)

Arguments

x

column containing the values to calculate the mean.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row

...

additional arguments passed to mean_ci

Value

list


Time-Based Rolling Median

Description

Produces a rolling time-window based vector of medians and confidence intervals.

Usage

tbr_median(.tbl, x, tcolumn, unit = "years", n, ...)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the numeric values to calculate the median.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

...

additional arguments passed to median_ci

Value

tibble with columns for the rolling median and upper and lower confidence intervals.

See Also

median_ci

Examples

## Return a tibble with new rolling median column
tbr_median(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years",
n = 5)

## Not run: 
## Return a tibble with rolling median and 95% CI 
tbr_median(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, conf = .95)
## End(Not run)

Median Based on a Time-Window

Description

Median Based on a Time-Window

Usage

tbr_median_window(x, tcolumn, unit = "years", n, i, ...)

Arguments

x

column containing the values to calculate the median.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row

...

additional arguments passed to median_ci

Value

list


Use Generic Functions with Time Windows

Description

Use Generic Functions with Time Windows

Usage

tbr_misc(.tbl, x, tcolumn, unit = "years", n, func, ...)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the values the function is applied to.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

func

specified function

...

optional additional arguments passed to function func

Value

tibble

Examples

tbr_misc(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5, func = mean)
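
To show how an arbitrary summary function can be applied over time windows, here is a minimal base R sketch. It assumes func returns a single number for each window, measures windows in days only, and includes observations from 0 to n_days before each row (inclusive). The helper name rolling_apply_sketch is hypothetical, and whether tbr_misc treats window boundaries the same way is an assumption.

```r
# Hypothetical helper: apply func to the trailing n_days window at every row.
rolling_apply_sketch <- function(x, dates, n_days, func, ...) {
  vapply(seq_along(x), function(i) {
    elapsed <- as.numeric(difftime(dates[i], dates, units = "days"))
    func(x[elapsed >= 0 & elapsed <= n_days], ...)
  }, numeric(1))
}

dates <- as.Date(c("2020-01-01", "2020-01-03", "2020-01-10"))
values <- c(1, 3, 5)

rolling_apply_sketch(values, dates, n_days = 5, func = mean)
rolling_apply_sketch(values, dates, n_days = 5, func = sum)
```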

Time-Based Rolling Standard Deviation

Description

Time-Based Rolling Standard Deviation

Usage

tbr_sd(.tbl, x, tcolumn, unit = "years", n, na.rm = FALSE)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the values to calculate the standard deviation.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

na.rm

logical. Should missing values be removed?

Value

tibble with column for the rolling sd.

See Also

sd

Examples

tbr_sd(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5)

Standard Deviation Based on a Time-Window

Description

Standard Deviation Based on a Time-Window

Usage

tbr_sd_window(x, tcolumn, unit = "years", n, i, ...)

Arguments

x

column containing the values to calculate the standard deviation.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row

...

additional arguments passed to base::sd()

Value

numeric value


Time-Based Rolling Sum

Description

Time-Based Rolling Sum

Usage

tbr_sum(.tbl, x, tcolumn, unit = "years", n, na.rm = FALSE)

Arguments

.tbl

a data frame with at least two variables; time column formatted as date, date/time and value column.

x

column containing the values to calculate the sum.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

na.rm

logical. Should missing values be removed?

Value

dataframe with column for the rolling sum.

See Also

sum

Examples

tbr_sum(Dissolved_Oxygen, x = Average_DO, tcolumn = Date, unit = "years", n = 5)

Sum Based on a Time-Window

Description

Sum Based on a Time-Window

Usage

tbr_sum_window(x, tcolumn, unit = "years", n, i, na.rm)

Arguments

x

column containing the values to calculate the sum.

tcolumn

formatted time column.

unit

character, one of "years", "months", "weeks", "days", "hours", "minutes", "seconds"

n

numeric, describing the length of the time window.

i

row

na.rm

logical. Should missing values be removed?

Value

numeric value
