Type: | Package |
Title: | Extract 'REDCap' Databases into Tidy 'Tibble's |
Version: | 1.2.3 |
Description: | Convert 'REDCap' exports into tidy tables for easy handling of 'REDCap' repeat instruments and event arms. |
License: | MIT + file LICENSE |
URL: | https://chop-cgtinformatics.github.io/REDCapTidieR/, https://github.com/CHOP-CGTInformatics/REDCapTidieR |
BugReports: | https://github.com/CHOP-CGTInformatics/REDCapTidieR/issues |
Depends: | R (≥ 3.5.0) |
Imports: | checkmate, cli, dplyr, glue, lobstr, lubridate, purrr, REDCapR (≥ 1.2.0), rlang, stringi, stringr, tibble, tidyr, tidyselect, formattable, pillar, vctrs, readr, stats, forcats |
Suggests: | covr, knitr, labelled, lintr, openxlsx2 (≥ 0.8), prettyunits, rmarkdown, skimr, testthat (≥ 3.0.0), withr |
VignetteBuilder: | knitr |
Config/testthat/edition: | 3 |
Encoding: | UTF-8 |
Language: | en-US |
LazyData: | true |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2025-06-06 15:31:54 UTC; porterej |
Author: | Richard Hanna |
Maintainer: | Richard Hanna <hannar1@chop.edu> |
Repository: | CRAN |
Date/Publication: | 2025-06-06 16:20:05 UTC |
REDCapTidieR: Extract 'REDCap' Databases into Tidy 'Tibble's
Description
Convert 'REDCap' exports into tidy tables for easy handling of 'REDCap' repeat instruments and event arms.
Author(s)
Maintainer: Richard Hanna hannar1@chop.edu (ORCID)
Authors:
Stephan Kadauke kadaukes@chop.edu (ORCID)
Ezra Porter ezrajporter@gmail.com (ORCID)
See Also
Useful links:
Report bugs at https://github.com/CHOP-CGTInformatics/REDCapTidieR/issues
Supplement a supertibble from a longitudinal database with information about the events associated with each instrument
Description
Supplement a supertibble from a longitudinal database with information about the events associated with each instrument
Usage
add_event_mapping(supertbl, linked_arms, repeat_event_types)
Arguments
supertbl |
a supertibble object to supplement with metadata |
linked_arms |
the tibble with event mappings created by
|
repeat_event_types |
a dataframe output from |
Value
The original supertibble with a redcap_events list column containing the arms and events associated with each instrument.
Add labelled features to write_redcap_xlsx
Description
Helper function that adds labelled aesthetics to the XLSX supertibble output.
Usage
add_labelled_xlsx_features(
supertbl,
supertbl_meta,
wb,
sheet_vals,
include_toc_sheet = TRUE,
include_metadata_sheet = TRUE,
supertbl_toc = NULL
)
Arguments
supertbl |
a supertibble generated using |
supertbl_meta |
supertibble metadata generated by |
wb |
An |
sheet_vals |
Helper argument passed from |
include_toc_sheet |
Include a sheet capturing the supertibble output.
Default |
include_metadata_sheet |
Include a sheet capturing the combined output of the
supertibble |
supertbl_toc |
The table of contents supertibble defined in the parent
function. Default |
Supplement a supertibble with additional metadata fields
Description
Supplement a supertibble with additional metadata fields
Usage
add_metadata(
supertbl,
db_metadata,
redcap_uri,
token,
suppress_redcapr_messages
)
Arguments
supertbl |
a supertibble object to supplement with metadata |
db_metadata |
a REDCap metadata tibble |
redcap_uri |
The URI/URL of the REDCap server (e.g., "https://server.org/apps/redcap/api/"). Required. |
token |
The user-specific string that serves as the password for a project. Required. |
suppress_redcapr_messages |
A logical to control whether to suppress messages
from REDCapR API calls. Default |
Details
This function assumes that db_metadata has been processed to include a row for each option of each multiselection field, i.e. with update_field_names().
Value
The original supertibble with additional fields:
- instrument_label, containing labels for each instrument
- redcap_metadata, containing metadata for the fields in each instrument as a list column
Add the metadata sheet
Description
Internal helper function. Adds appropriate elements to the wb object. Returns a dataframe.
Usage
add_metadata_sheet(
supertbl,
supertbl_meta,
wb,
add_labelled_column_headers,
table_style,
column_width,
na_replace
)
Arguments
supertbl |
a supertibble generated using |
supertbl_meta |
an |
wb |
An |
add_labelled_column_headers |
Whether or not to include labelled outputs. |
table_style |
Any excel table style name or "none" (see "formatting"
in |
column_width |
Width to set columns across the workbook. Default "auto", otherwise a numeric value. Standard Excel is 8.43. |
na_replace |
The value used to replace |
Value
A dataframe
Add partial key helper variables to dataframes
Description
Make the helper variables redcap_event and redcap_arm available as branches from var for later use.
Usage
add_partial_keys(db_data, var = NULL)
Arguments
db_data |
The REDCap database output defined by
|
var |
The unquoted name of the field containing event and arm
identifiers. Default |
Value
The read_redcap output tibbles with two columns, redcap_event and redcap_arm, appended to the end.
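As a rough illustration of the split described above, the sketch below uses base R on a column of REDCap-style unique event names of the form `<event>_arm_<n>`. The helper `split_event_arm()`, the default column name `redcap_event_name`, and the regular expressions are assumptions made for illustration; they are not the package's implementation, and the real function may type the arm column differently.

```r
# Hypothetical helper: split "<event>_arm_<n>" into event and arm parts.
split_event_arm <- function(db_data, var = "redcap_event_name") {
  x <- db_data[[var]]
  # Everything before the final "_arm_<n>" is treated as the event name
  db_data$redcap_event <- sub("_arm_[0-9]+$", "", x)
  # The trailing digits are treated as the arm number
  db_data$redcap_arm <- as.integer(sub("^.*_arm_([0-9]+)$", "\\1", x))
  db_data
}

db_data <- data.frame(
  record_id = c(1, 1, 2),
  redcap_event_name = c("baseline_arm_1", "follow_up_arm_1", "baseline_arm_2")
)
out <- split_event_arm(db_data)
out
```

The two new columns land at the end of the data frame, matching the Value description.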
Add skimr metrics to a supertibble's metadata
Description
Add default skim metrics to the redcap_data list elements of a supertibble output from read_redcap.
Usage
add_skimr_metadata(supertbl)
Arguments
supertbl |
a supertibble generated using |
Details
For more information on the default metrics provided, check the get_default_skimmer_names documentation.
Value
A supertibble with skimr metadata metrics
Examples
superheroes_supertbl
add_skimr_metadata(superheroes_supertbl)
## Not run:
redcap_uri <- Sys.getenv("REDCAP_URI")
token <- Sys.getenv("REDCAP_TOKEN")
supertbl <- read_redcap(redcap_uri, token)
add_skimr_metadata(supertbl)
## End(Not run)
Add the supertbl table of contents sheet
Description
Internal helper function. Adds appropriate elements to the wb object. Returns a dataframe.
Usage
add_supertbl_toc(
wb,
supertbl,
include_metadata_sheet,
add_labelled_column_headers,
table_style,
column_width,
na_replace
)
Arguments
wb |
An |
supertbl |
a supertibble generated using |
include_metadata_sheet |
Include a sheet capturing the combined output of the
supertibble |
add_labelled_column_headers |
Whether or not to include labelled outputs. |
table_style |
Any excel table style name or "none" (see "formatting"
in |
column_width |
Width to set columns across the workbook. Default "auto", otherwise a numeric value. Standard Excel is 8.43. |
na_replace |
The value used to replace |
Value
A dataframe
Apply factor labels to a vector
Description
Apply factor labels to a vector
Usage
apply_labs_factor(x, labels, ...)
Arguments
x |
a vector to label |
labels |
a named vector of labels in the format |
... |
unused, needed to ignore extra arguments that may be passed |
Details
Dots are needed to ignore the ptype argument that may be passed to apply_labs_haven.
Value
factor
Apply haven value labels to a vector
Description
Apply haven value labels to a vector
Usage
apply_labs_haven(x, labels, ptype, ...)
Arguments
x |
a vector to label |
labels |
a named vector of labels in the format |
ptype |
vector to serve as prototype for label values |
... |
unused, needed to ignore extra arguments that may be passed |
Details
Assumes a check_installed() has been run for labelled. Since haven preserves the underlying data values, we need to make sure the data type of the value options in the metadata matches the data type of the values in the actual data. This function accepts a prototype, usually a column from db_data, and uses force_cast() to do a best-effort casting of the value options in the metadata to the same data type as ptype. The fallback is to convert x and the value labels to character.
Value
haven_labelled vector
Add supertbl S3 class
Description
Add supertbl S3 class
Usage
as_supertbl(x)
Arguments
x |
an object to class |
Value
The object with the redcaptidier_supertbl S3 class
Bind supertbl metadata
Description
Simple helper function for binding supertbl metadata into one table. This supports creating the metadata XLSX sheet as well as supertbl_recode.
Usage
bind_supertbl_metadata(supertbl)
Arguments
supertbl |
A supertibble generated using |
Extract data tibbles from a REDCapTidieR supertibble and bind them to an environment
Description
Take a supertibble generated with read_redcap() and bind its data tibbles (i.e. the tibbles in the redcap_data column) to an environment. The default is the global environment.
Usage
bind_tibbles(supertbl, environment = global_env(), tbls = NULL)
Arguments
supertbl |
A supertibble generated by |
environment |
The environment to bind the tibbles to. Default is
|
tbls |
A vector of the |
Value
This function returns nothing as it's used solely for its side effect of modifying an environment.
Examples
## Not run:
# Create an empty environment
my_env <- new.env()
ls(my_env)
superheroes_supertbl
bind_tibbles(superheroes_supertbl, my_env)
ls(my_env)
## End(Not run)
Utility function to calculate summary for each tibble in a supertibble
Description
Utility function to calculate summary for each tibble in a supertibble
Usage
calc_metadata_stats(data)
Arguments
data |
a tibble of redcap data stored in the |
Value
A list containing:
- data_rows, the number of rows in the data
- data_cols, the number of columns in the data
- data_size, the size of the data in bytes
- data_na_pct, the percentage of cells that are NA excluding identifiers and form completion fields
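To make the four statistics concrete, here is a minimal base-R sketch. The helper `sketch_metadata_stats()` and its `excluded_cols` argument are hypothetical, the size is taken from `utils::object.size()` rather than whatever the package uses internally, and reporting the NA share on a 0 to 100 scale is an assumption.

```r
# Hypothetical re-creation of the four summary statistics using base R only.
sketch_metadata_stats <- function(data, excluded_cols = character()) {
  # Drop columns that should not count toward the NA percentage
  na_cols <- setdiff(names(data), excluded_cols)
  na_cells <- is.na(data[na_cols])
  list(
    data_rows = nrow(data),
    data_cols = ncol(data),
    data_size = as.numeric(utils::object.size(data)),
    data_na_pct = 100 * mean(na_cells)
  )
}

data <- data.frame(
  record_id = 1:4,
  age = c(30, NA, 41, NA),
  form_complete = c(2, 2, 0, 2)
)
res <- sketch_metadata_stats(data, excluded_cols = c("record_id", "form_complete"))
res$data_na_pct
```

Only the `age` column counts toward the NA share here, because the identifier and the completion field are passed as exclusions.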
Check requested data argument exists in REDCap data
Description
Provide an error message when an argument is requested, but is not found in any read_redcap() redcap_data output.
Usage
check_data_arg_exists(db_data, col, arg, call = caller_env())
Arguments
db_data |
The REDCap database output generated by
|
col |
The column to check for in |
arg |
The argument used for the column check |
call |
The calling environment to use in the error message |
Details
Currently used for the following arguments:
- export_survey_fields: *_timestamp
- export_data_access_groups: redcap_data_access_group
Value
An error message saying the requested data does not exist
Check equal distinct values between two columns
Description
Takes a dataframe and two columns and checks if n_distinct of the second column is all unique based on grouping of the first column.
Usage
check_equal_col_summaries(data, col1, col2, call = caller_env())
Arguments
data |
a dataframe |
col1 |
a column to group by |
col2 |
a column to check for uniqueness |
Check data field for field values not in metadata
Description
Check data field for field values not in metadata
Usage
check_extra_field_values(x, values)
Arguments
x |
data field |
values |
expected field values |
Parse logical field and compile data for warning if parsing errors occurred
Description
Parse logical field and compile data for warning if parsing errors occurred
Usage
check_field_is_logical(x)
Arguments
x |
vector to parse |
Check fields are of checkbox field type
Description
Check fields are of checkbox field type
Usage
check_fields_are_checkboxes(metadata_tbl, call = caller_env())
Arguments
metadata_tbl |
A metadata tibble from a supertibble |
call |
The calling environment to use in the error message |
Check fields exist for checkbox combination
Description
Check fields exist for checkbox combination
Usage
check_fields_exist(fields, expr, call = caller_env())
Arguments
fields |
Vector of character strings to check the length of |
expr |
An expression |
call |
The calling environment to use in the error message |
Check if file already exists
Description
Provide an error message when a file is declared for writing that already exists.
Usage
check_file_exists(file, overwrite, call = caller_env())
Arguments
file |
The file that is being checked |
overwrite |
Whether the file was declared to be overwritten |
call |
The calling environment to use in the error message |
Details
In the case of write_redcap_xlsx(), this should only error when a file already exists and is not declared for overwrite.
Value
An error message saying the requested file already exists
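The guard described here can be sketched in a few lines of base R. `stop_if_file_exists()` is a hypothetical stand-in written for illustration, and the error wording is invented; the package's own function takes an additional `call` argument and formats its message differently.

```r
# Hypothetical guard: stop if the target exists and overwrite was not requested.
stop_if_file_exists <- function(file, overwrite = FALSE) {
  if (file.exists(file) && !overwrite) {
    stop("File '", file, "' already exists. Set overwrite = TRUE to replace it.",
         call. = FALSE)
  }
  invisible(TRUE)
}

tmp <- tempfile(fileext = ".xlsx")
file.create(tmp)

# Errors because the file exists and overwrite is FALSE
blocked <- tryCatch(stop_if_file_exists(tmp), error = function(e) "error")
# Passes because overwriting was explicitly requested
allowed <- stop_if_file_exists(tmp, overwrite = TRUE)
unlink(tmp)
```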
Check that all requested instruments are in REDCap project metadata
Description
Provide an error message when any instrument names are passed to read_redcap() that do not exist in the project metadata.
Usage
check_forms_exist(db_metadata, forms, call = caller_env())
Arguments
db_metadata |
The metadata file read by
|
forms |
The character vector of instrument names passed to
|
call |
the calling environment to use in the error message |
Value
An error message listing the requested instruments that don't exist
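A minimal sketch of this kind of check, assuming the metadata carries a `form_name` column as in a REDCap data dictionary. The helper `missing_forms()` and the message text are hypothetical and only illustrate the comparison, not the package's error handling.

```r
# Hypothetical check: report requested instruments missing from the metadata.
missing_forms <- function(db_metadata, forms) {
  setdiff(forms, unique(db_metadata$form_name))
}

db_metadata <- data.frame(
  field_name = c("record_id", "age", "visit_date"),
  form_name = c("demographics", "demographics", "visits")
)
not_found <- missing_forms(db_metadata, c("demographics", "labs"))
if (length(not_found) > 0) {
  message("Instruments not found in metadata: ", paste(not_found, collapse = ", "))
}
```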
Check if labelled
Description
Checks if a supplied supertibble is labelled and throws an error if it is not but labelled is set to TRUE.
Usage
check_labelled(supertbl, add_labelled_column_headers, call = caller_env())
Arguments
supertbl |
a supertibble generated using |
add_labelled_column_headers |
Whether or not to include labelled outputs |
call |
the calling environment to use in the warning message |
Value
A boolean
Check metadata fields exist for checkbox combination
Description
Similar to check_fields_exist(), but instead of verifying fields that exist in the data tibble this seeks to verify their existence under the metadata tibble field_names.
Usage
check_metadata_fields_exist(metadata_tbl, cols, call = caller_env())
Arguments
metadata_tbl |
A metadata tibble from the supertibble generated by |
cols |
Selected columns identified for |
call |
The calling environment to use in the error message |
Check that parsed labels are not duplicated
Description
Check that parsed labels are not duplicated
Usage
check_parsed_labels(
parsed_labels_output,
field_name,
warn_stripped_text = FALSE,
call = caller_env(n = 2)
)
Arguments
parsed_labels_output |
a vector of parsed labels produced by |
field_name |
the name of the field associated with the labels to use in the warning message |
warn_stripped_text |
logical for whether to include a note about HTML tag stripping in the message |
call |
the calling environment to use in the error message. The parent of calling environment by default because this check usually occurs 2 frames below the relevant context for the user |
Value
a warning message specifying the duplicate labels and the REDCap field affected
Check that a supplied REDCap database is populated
Description
Check for potential outputs where metadata is present, but nrow and ncol equal 0. This causes multi_choice_to_labels to fail, but a helpful error message should be provided.
Usage
check_redcap_populated(db_data, call = caller_env())
Arguments
db_data |
The REDCap database output generated by
|
call |
the calling environment to use in the error message |
Value
A helpful error message alerting the user to check their API privileges.
Check for instruments that have both repeating and non-repeating structure
Description
Check for potential instruments that are given both repeating and nonrepeating structure. REDCapTidieR does not support database structures built this way.
Usage
check_repeat_and_nonrepeat(db_data, call = caller_env())
Arguments
db_data |
The REDCap database output generated by
|
call |
the calling environment to use in the error message |
Value
A helpful error message alerting the user to existence of an instrument that was designated both as repeating and non-repeating.
Check that all metadata tibbles within a supertibble contain field_name and field_label columns
Description
Check that all metadata tibbles within a supertibble contain field_name and field_label columns
Usage
check_req_labelled_metadata_fields(supertbl, call = caller_env())
Arguments
supertbl |
a supertibble containing a |
call |
the calling environment to use in the error message |
Value
an error message alerting that instrument metadata is incomplete
Check for possible API user privilege issues
Description
Check for potential user access privilege issues and provide an appropriate warning message. This can occur when metadata forms/field names do not appear in a database export.
Usage
check_user_rights(db_data, db_metadata, call = caller_env())
Arguments
db_data |
The REDCap database output generated by
|
db_metadata |
The REDCap metadata output generated by |
call |
the calling environment to use in the warning message |
Value
A helpful error message alerting the user to check their API privileges.
Check an argument with checkmate
Description
Check an argument with checkmate
Usage
check_arg_is_supertbl(
x,
req_cols = c("redcap_data", "redcap_metadata"),
arg = caller_arg(x),
call = caller_env()
)
check_arg_is_env(x, ..., arg = caller_arg(x), call = caller_env())
check_arg_is_character(x, ..., arg = caller_arg(x), call = caller_env())
check_arg_is_logical(x, ..., arg = caller_arg(x), call = caller_env())
check_arg_choices(x, ..., arg = caller_arg(x), call = caller_env())
check_arg_is_valid_token(x, arg = caller_arg(x), call = caller_env())
check_arg_is_valid_extension(
x,
valid_extensions,
arg = caller_arg(x),
call = caller_env()
)
Arguments
x |
An object to check |
req_cols |
required fields for |
arg |
The name of the argument to include in an error message. Captured
by |
call |
the calling environment to use in the error message |
... |
additional arguments passed on to checkmate |
Value
TRUE if x passes the checkmate check. An error otherwise with the name of the checkmate function as a class
Extract non-longitudinal REDCap databases into tidy tibbles
Description
Helper function internal to read_redcap responsible for extraction and final processing of a tidy tibble to the user from a non-longitudinal REDCap database.
Usage
clean_redcap(db_data, db_metadata)
Arguments
db_data |
The REDCap database output defined by
|
db_metadata |
The REDCap metadata output defined by
|
Value
Returns a tibble with list elements containing tidy dataframes. Users can access dataframes under the redcap_data column with reference to form_name and structure column details.
Extract longitudinal REDCap databases into tidy tibbles
Description
Helper function internal to read_redcap responsible for extraction and final processing of a tidy tibble to the user from a longitudinal REDCap database.
Usage
clean_redcap_long(
db_data_long,
db_metadata_long,
linked_arms,
allow_mixed_structure = FALSE
)
Arguments
db_data_long |
The longitudinal REDCap database output defined by
|
db_metadata_long |
The longitudinal REDCap metadata output defined by
|
linked_arms |
Output of |
allow_mixed_structure |
A logical to allow for support of mixed repeating/non-repeating
instruments. Setting to |
Value
Returns a tibble with list elements containing tidy dataframes. Users can access dataframes under the redcap_data column with reference to form_name and structure column details.
Combine checkbox fields with respect to repaired outputs
Description
This function seeks to preserve the original data columns and types from the originally supplied data_tbl and add on the new columns from data_tbl_mod. If names_repair presents a repair strategy, the output columns will be captured and updated here while dropping the original columns.
Usage
combine_and_repair_tbls(data_tbl, data_tbl_mod, new_cols, names_repair)
Arguments
data_tbl |
The original data table given to |
data_tbl_mod |
A modified data table from |
new_cols |
The new columns created for checkbox combination |
names_repair |
What happens if the output has invalid column names?
The default, "check_unique" is to error if the columns are duplicated.
Use "minimal" to allow duplicates in the output, or "unique" to de-duplicate
by adding numeric suffixes. See |
Value
a tibble
Combine Checkbox Fields into a Single Column
Description
combine_checkboxes() consolidates multiple checkbox fields in a REDCap data tibble into a single column. This transformation simplifies analysis by merging several binary columns into one labeled factor column, making the data more interpretable and easier to analyze.
Usage
combine_checkboxes(
supertbl,
tbl,
cols,
names_prefix = "",
names_sep = "_",
names_glue = NULL,
names_repair = "check_unique",
multi_value_label = "Multiple",
values_fill = NA,
raw_or_label = "label",
keep = TRUE
)
Arguments
supertbl |
A supertibble generated by |
tbl |
The |
cols |
Checkbox columns to combine to single column. Required. |
names_prefix |
String added to the start of every variable name. |
names_sep |
String to separate new column names from |
names_glue |
Instead of |
names_repair |
What happens if the output has invalid column names?
The default, "check_unique" is to error if the columns are duplicated.
Use "minimal" to allow duplicates in the output, or "unique" to de-duplicate
by adding numeric suffixes. See |
multi_value_label |
A string specifying the value to be used when multiple checkbox fields are selected. Default "Multiple". |
values_fill |
Value to use when no checkboxes are selected. Default |
raw_or_label |
Either 'raw' or 'label' to specify whether to use raw coded values or labels for the options. Default 'label'. |
keep |
Logical indicating whether to keep the original checkbox fields in
the output. Default |
Details
combine_checkboxes() operates on the data and metadata tibbles produced by the read_redcap() function. Since it relies on the checkbox field naming conventions used by REDCap, changes to the checkbox variable names or their associated metadata field_names could lead to errors.
REDCap checkbox fields are typically expanded into separate variables for each checkbox option, with names formatted as checkbox_var___1, checkbox_var___2, etc. combine_checkboxes() detects these variables and combines them into a single column. If the expected variables are not found, an error is returned.
Value
A modified supertibble.
Examples
library(dplyr)
# Set up sample data tibble
data_tbl <- tibble::tribble(
  ~"study_id", ~"multi___1", ~"multi___2", ~"multi___3",
  1, TRUE, FALSE, FALSE,
  2, TRUE, TRUE, FALSE,
  3, FALSE, FALSE, FALSE
)
# Set up sample metadata tibble
metadata_tbl <- tibble::tribble(
  ~"field_name", ~"field_type", ~"select_choices_or_calculations",
  "study_id", "text", NA,
  "multi___1", "checkbox", "1, Red | 2, Yellow | 3, Blue",
  "multi___2", "checkbox", "1, Red | 2, Yellow | 3, Blue",
  "multi___3", "checkbox", "1, Red | 2, Yellow | 3, Blue"
)
# Create sample supertibble
supertbl <- tibble::tribble(
  ~"redcap_form_name", ~"redcap_data", ~"redcap_metadata",
  "tbl", data_tbl, metadata_tbl
)
class(supertbl) <- c("redcap_supertbl", class(supertbl))
# Combine checkboxes under column "multi"
combine_checkboxes(
  supertbl = supertbl,
  tbl = "tbl",
  cols = starts_with("multi")
) %>%
  dplyr::pull(redcap_data) %>%
  dplyr::first()
## Not run:
redcap_uri <- Sys.getenv("REDCAP_URI")
token <- Sys.getenv("REDCAP_TOKEN")
supertbl <- read_redcap(redcap_uri, token)
combine_checkboxes(
  supertbl = supertbl,
  tbl = "tbl",
  cols = starts_with("col"),
  multi_value_label = "Multiple",
  values_fill = NA
)
## End(Not run)
Convert a new checkbox column's values
Description
This function takes a single column of data and converts the values based on the overall data tibble cross referenced with a nested section of the metadata tibble. case_when logic helps determine whether the value is a coalesced singular value or a user-specified one via multi_value_label or values_fill.
Usage
convert_checkbox_vals(
metadata,
.new_value,
data_tbl,
raw_or_label,
multi_value_label,
values_fill
)
Arguments
metadata |
A nested portion of the overall metadata tibble |
.new_value |
The new column values made by |
data_tbl |
The data tibble from the original supertibble |
raw_or_label |
Either 'raw' or 'label' to specify whether to use raw coded values or labels for the options. Default 'label'. |
multi_value_label |
A string specifying the value to be used when multiple checkbox fields are selected. Default "Multiple". |
values_fill |
Value to use when no checkboxes are selected. Default |
Details
This function is used in conjunction with pmap().
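The three-way decision described for checkbox conversion (one box checked, several checked, none checked) can be sketched in base R as follows. `coalesce_checkboxes()` is a hypothetical helper, the labels are passed in directly instead of being parsed from metadata, and the result is a plain character vector where the package produces a factor, so treat it only as an illustration of the branching.

```r
# Hypothetical coalescing of logical checkbox columns into one character column.
coalesce_checkboxes <- function(data_tbl, cols, labels,
                                multi_value_label = "Multiple",
                                values_fill = NA_character_) {
  checked <- as.matrix(data_tbl[cols])
  n_checked <- rowSums(checked)
  # Index of the first checked box per row; only used when exactly one is checked
  first_checked <- unname(apply(checked, 1, function(r) which(r)[1]))
  out <- rep(values_fill, nrow(data_tbl))
  out[n_checked == 1] <- labels[first_checked[n_checked == 1]]
  out[n_checked > 1] <- multi_value_label
  out
}

data_tbl <- data.frame(
  study_id = 1:3,
  multi___1 = c(TRUE, TRUE, FALSE),
  multi___2 = c(FALSE, TRUE, FALSE),
  multi___3 = c(FALSE, FALSE, FALSE)
)
multi <- coalesce_checkboxes(
  data_tbl,
  cols = c("multi___1", "multi___2", "multi___3"),
  labels = c("Red", "Yellow", "Blue")
)
multi
```

Record 1 has a single box checked and receives its label, record 2 has two boxes checked and receives `multi_value_label`, and record 3 has none checked and receives `values_fill`.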
Convert Mixed Structure Instruments to Repeating Instruments
Description
For longitudinal projects where users set allow_mixed_structure to TRUE, this function will handle the process of setting the nonrepeating parts of the instrument to repeating ones with a single instance.
Usage
convert_mixed_instrument(db_data_long, mixed_structure_ref)
Arguments
db_data_long |
The longitudinal REDCap database output defined by
|
mixed_structure_ref |
Reference dataframe containing mixed structure fields and forms. |
Value
Returns a tibble with list elements containing tidy dataframes. Users can access dataframes under the redcap_data column with reference to form_name and structure column details.
Utility function to convert redcap repeat instance columns into appropriate form and event columns
Description
Utility function to convert redcap repeat instance columns into appropriate form and event columns
Usage
create_repeat_instance_vars(db_data)
Arguments
db_data |
The REDCap database output generated by
|
Details
The output of a standard REDCap export with repeating forms and/or events makes use of redcap_repeat_instance in combination with redcap_repeat_instrument and whether or not data exists in both. Instead, rename and separate redcap_repeat_instance into redcap_form_instance and redcap_event_instance.
Value
A dataframe.
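One way to picture the separation described in the Details is the base-R sketch below. It assumes that an instance number belongs to a repeating instrument whenever redcap_repeat_instrument is filled, and to a repeating event otherwise. That rule and the helper name `split_repeat_instance()` are assumptions for illustration, not a description of the package's actual logic.

```r
# Hypothetical split of redcap_repeat_instance into form- and event-level columns.
split_repeat_instance <- function(db_data) {
  instance <- db_data$redcap_repeat_instance
  is_form_repeat <- !is.na(db_data$redcap_repeat_instrument)
  # Instance belongs to a repeating instrument when an instrument name is present
  db_data$redcap_form_instance <- ifelse(is_form_repeat, instance, NA)
  # Otherwise a filled instance is assumed to describe a repeating event
  db_data$redcap_event_instance <- ifelse(is_form_repeat, NA, instance)
  db_data$redcap_repeat_instance <- NULL
  db_data
}

db_data <- data.frame(
  record_id = c(1, 1, 1),
  redcap_repeat_instrument = c(NA, "labs", NA),
  redcap_repeat_instance = c(NA, 2, 1)
)
out <- split_repeat_instance(db_data)
out
```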
Check whether a REDCap database has repeat forms
Description
Simple utility function checking for the existence of repeat forms in a REDCap database.
Usage
db_has_repeat_forms(db_data)
Arguments
db_data |
A REDCap dataframe. |
Value
A boolean.
Extract non-repeat tables from non-longitudinal REDCap databases
Description
Sub-helper function to clean_redcap for single nonrepeat table extraction.
Usage
distill_nonrepeat_table(form_name, db_data, db_metadata)
Arguments
form_name |
The |
db_data |
The REDCap database output defined by
|
db_metadata |
The REDCap metadata output defined by
|
Value
A subset tibble of all data related to a specified form_name
Extract non-repeat tables from longitudinal REDCap databases
Description
Sub-helper function to clean_redcap_long for single nonrepeat table extraction.
Usage
distill_nonrepeat_table_long(
form_name,
db_data_long,
db_metadata_long,
linked_arms
)
Arguments
form_name |
The |
db_data_long |
The REDCap database output defined by
|
db_metadata_long |
The REDCap metadata output defined by
|
linked_arms |
Output of |
Value
A tibble of all data related to a specified form_name
Extract repeat tables from non-longitudinal REDCap databases
Description
Sub-helper function to clean_redcap for single repeat table extraction.
Usage
distill_repeat_table(form_name, db_data, db_metadata)
Arguments
form_name |
The |
db_data |
The non-longitudinal REDCap database output defined by
|
db_metadata |
The non-longitudinal REDCap metadata output defined by
|
Value
A subset tibble of all data related to a specified form_name
Extract repeat tables from longitudinal REDCap databases
Description
Sub-helper function to clean_redcap_long for single repeat table extraction.
Usage
distill_repeat_table_long(
form_name,
db_data_long,
db_metadata_long,
linked_arms,
has_mixed_structure_forms = FALSE,
mixed_structure_ref = NULL
)
Arguments
form_name |
The |
db_data_long |
The REDCap database output defined by
|
db_metadata_long |
The REDCap metadata output defined by
|
linked_arms |
Output of |
has_mixed_structure_forms |
Whether the instrument under evaluation has a mixed
structure. Default |
mixed_structure_ref |
A mixed structure reference dataframe supplied
by |
Value
A tibble of all data related to a specified form_name
Extract a specific metadata tibble from a supertibble
Description
Utility function to extract a specific metadata tibble from a supertibble given a redcap_form_name
Usage
extract_metadata_tibble(supertbl, redcap_form_name)
Arguments
supertbl |
A supertibble generated by |
redcap_form_name |
A character string identifying the |
Value
A tibble
Extract a single data tibble from a REDCapTidieR supertibble
Description
Take a supertibble generated with read_redcap() and return one of its data tibbles.
Usage
extract_tibble(supertbl, tbl)
Arguments
supertbl |
A supertibble generated by |
tbl |
The |
Details
This function makes it easy to extract a single instrument's data from a REDCapTidieR supertibble.
Value
A tibble.
Examples
superheroes_supertbl
extract_tibble(superheroes_supertbl, "heroes_information")
Extract data tibbles from a REDCapTidieR supertibble into a list
Description
Take a supertibble generated with read_redcap() and return a named list of data tibbles.
Usage
extract_tibbles(supertbl, tbls = everything())
Arguments
supertbl |
A supertibble generated by |
tbls |
A vector of |
Details
This function makes it easy to extract multiple instruments' data from a REDCapTidieR supertibble into a named list. Specifying instruments using tidyselect helper functions such as dplyr::starts_with() or dplyr::ends_with() is supported.
Value
A named list of tibbles
Examples
superheroes_supertbl
# Extract all data tibbles
extract_tibbles(superheroes_supertbl)
# Only extract data tibbles starting with "heroes"
extract_tibbles(superheroes_supertbl, starts_with("heroes"))
Format REDCap variable labels
Description
Use these functions with the format_labels argument of make_labelled() to define how variable labels should be formatted before being applied to the data columns of redcap_data. These functions are helpful to create pretty variable labels from REDCap field labels.
- fmt_strip_whitespace() removes extra white space inside and at the start and end of a string. It is a thin wrapper of stringr::str_trim() and stringr::str_squish().
- fmt_strip_trailing_colon() removes a colon character at the end of a string.
- fmt_strip_trailing_punct() removes punctuation at the end of a string.
- fmt_strip_html() removes html tags from a string.
- fmt_strip_field_embedding() removes text between curly braces {} which REDCap uses for special "field embedding" logic. Note that read_redcap() removes html tags and field embedding logic from field labels in the metadata by default.
Usage
fmt_strip_whitespace(x)
fmt_strip_trailing_colon(x)
fmt_strip_trailing_punct(x)
fmt_strip_html(x)
fmt_strip_field_embedding(x)
Arguments
x |
a character vector |
Value
a modified character vector
Examples
fmt_strip_whitespace("Poorly Spaced Label ")
fmt_strip_trailing_colon("Label:")
fmt_strip_trailing_punct("Label-")
fmt_strip_html("<b>Bold Label</b>")
fmt_strip_field_embedding("Label{another_field}")
superheroes_supertbl
make_labelled(superheroes_supertbl, format_labels = fmt_strip_trailing_colon)
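To show what each kind of formatter does to a string, and how several can be chained, here is a base-R approximation. The four `strip_*` helpers and `clean_label()` are hypothetical stand-ins built on `gsub()` and `trimws()`; they are not the package's functions, which wrap stringr and may handle edge cases differently.

```r
# Hypothetical base-R stand-ins for the formatting helpers described above.
strip_whitespace <- function(x) gsub("\\s+", " ", trimws(x))
strip_trailing_colon <- function(x) sub(":$", "", x)
strip_html <- function(x) gsub("<[^>]+>", "", x)
strip_field_embedding <- function(x) gsub("[{][^}]*[}]", "", x)

# Formatters compose by plain function application; whitespace is squished last
# so that gaps left behind by the earlier steps are cleaned up too
clean_label <- function(x) {
  strip_whitespace(strip_trailing_colon(strip_field_embedding(strip_html(x))))
}

clean_label("<b>Date   of birth</b> {dob_note}:")
```

Order matters in this sketch: removing the embedded field leaves a space before the colon, so whitespace trimming has to run after the colon is stripped.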
Format value for error message
Description
Format value for error message
Usage
format_error_val(x)
Arguments
x |
value to format |
Value
If x is atomic, x with cli formatting to truncate to 5 values. Otherwise, a string summarizing x produced by as_label
Determine fields included in REDCapR::redcap_read_oneshot output that should be dropped from results of read_redcap
Description
Determine fields included in REDCapR::redcap_read_oneshot output that should be dropped from results of read_redcap
Usage
get_fields_to_drop(db_metadata, form)
Arguments
db_metadata |
metadata tibble created by
|
form |
the name of the instrument containing identifiers |
Details
This function applies rules to determine which fields are included in the results of REDCapR::redcap_read_oneshot because the user didn't request the instrument containing identifiers.
Value
A character vector of extra field names that can be used to filter the results of REDCapR::redcap_read_oneshot
Get metadata specification table
Description
Get metadata specification table
Usage
get_metadata_spec(
metadata_tbl,
selected_cols,
names_prefix,
names_sep,
names_glue
)
Arguments
metadata_tbl |
A metadata tibble from the supertibble generated by |
selected_cols |
Character string vector of field names for checkbox combination |
names_prefix |
String added to the start of every variable name. |
names_sep |
String to separate new column names from |
names_glue |
Instead of |
Value
a tibble
Get Mixed Structure Instrument List
Description
Define fields in a given project that are used in both a repeating and nonrepeating manner.
Usage
get_mixed_structure_fields(db_data)
Arguments
db_data |
The REDCap database output generated by
|
Value
a dataframe
Utility function to extract the name of the project identifier field for a tibble of REDCap data
Description
Utility function to extract the name of the project identifier field for a tibble of REDCap data
Usage
get_record_id_field(data)
Arguments
data |
a tibble of REDCap data |
Details
The current implementation assumes that the first field in the data is the project identifier
Value
The name of the identifier field in the data
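The assumption stated in Details can be illustrated in base R. This is a minimal sketch rather than the package's own code, and first_field_name() is a hypothetical helper name:

```r
# Hypothetical helper: treat the first column as the project identifier,
# mirroring the assumption described in Details.
first_field_name <- function(data) {
  names(data)[1]
}

toy <- data.frame(record_id = 1:2, name = c("A", "B"))
id_field <- first_field_name(toy)
id_field
```

If a project's exported data does not start with the identifier column, a lookup like this would return the wrong field, which is why the assumption is worth keeping in mind.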
Add identification for repeat event types
Description
To correctly assign repeat event types a few assumptions must be made:
- There are only 3 behaviors: nonrepeating, repeat_separately, and repeat_together
- If an event only shows redcap_repeat_instance and redcap_repeat_instrument as NA, it can be considered a nonrepeat event.
- If an event is always NA for redcap_repeat_instrument and filled for redcap_repeat_instance it can be assumed to be a repeat_together event
- repeat_separate and nonrepeating event types exhibit the same behavior along the primary keys of the data. nonrepeating event types can have data display with redcap_repeat_instance values both filled and as NA. If this is the case, it can be assumed the event is a repeating separate event.
Usage
get_repeat_event_types(data)
Arguments
data |
the REDCap data |
Value
A dataframe with unique event names mapped to their corresponding repeat types
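The assumptions listed in the Description can be sketched as a small classifier in base R. This is an illustration of the stated rules only, not the package's implementation. classify_events() and the repeat_type column name are hypothetical, and "filled" is read here as "never NA within the event".

```r
# Hypothetical classifier following the rules in the Description.
classify_events <- function(data) {
  events <- unique(data$redcap_event_name)
  types <- vapply(events, function(ev) {
    rows <- data[data$redcap_event_name == ev, , drop = FALSE]
    instrument_na <- is.na(rows$redcap_repeat_instrument)
    instance_na <- is.na(rows$redcap_repeat_instance)
    if (all(instrument_na) && all(instance_na)) {
      # Both repeat columns are always NA
      "nonrepeating"
    } else if (all(instrument_na) && !any(instance_na)) {
      # Instrument always NA, instance always filled
      "repeat_together"
    } else {
      # Anything else, including mixed filled/NA instances
      "repeat_separately"
    }
  }, character(1))
  data.frame(
    redcap_event_name = events,
    repeat_type = unname(types),
    stringsAsFactors = FALSE
  )
}

toy <- data.frame(
  redcap_event_name = c("e1", "e1", "e2", "e2", "e3", "e3"),
  redcap_repeat_instrument = c(NA, NA, NA, NA, "form_a", NA),
  redcap_repeat_instance = c(NA, NA, 1, 2, 1, NA),
  stringsAsFactors = FALSE
)
event_types <- classify_events(toy)
event_types
```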
Swap vector names for values
Description
Swap vector names for values
Usage
invert_vec(x)
Arguments
x |
a vector |
Value
Vector with names and values reversed
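For a named vector, the swap can be reproduced in base R as follows. This is a minimal sketch and swap_names_and_values() is a hypothetical stand-in, not the internal function itself:

```r
# Hypothetical stand-in: the old values become names, the old names become values.
swap_names_and_values <- function(x) {
  stats::setNames(names(x), unname(x))
}

choices <- c(yes = "1", no = "0")
swapped <- swap_names_and_values(choices)
swapped
```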
Determine if an object is labelled
Description
An internal utility function used to inform other processes of whether or not a given object has been labelled (i.e. with make_labelled()).
Usage
is_labelled(obj)
Arguments
obj |
An object to be tested for "label" attributes |
Details
An object is considered labelled if it has "label" attributes.
Value
A boolean
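The rule in Details can be checked on a single vector with base R. This is a sketch under the assumption that a "label" attribute on the object itself is what matters. has_label_attr() is a hypothetical name, and the package may inspect objects differently (for example, column by column).

```r
# Hypothetical check: TRUE when the object carries a "label" attribute.
has_label_attr <- function(obj) {
  !is.null(attr(obj, "label", exact = TRUE))
}

height <- c(180, 165, 172)
before <- has_label_attr(height)

attr(height, "label") <- "Height"
after <- has_label_attr(height)

c(before = before, after = after)
```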
Link longitudinal REDCap instruments with their events/arms
Description
For REDCap databases containing arms and events, it is necessary to determine how these are linked and what variables belong to them.
Usage
link_arms(redcap_uri, token, suppress_redcapr_messages = TRUE)
Arguments
redcap_uri |
The REDCap URI |
token |
The REDCap API token |
suppress_redcapr_messages |
A logical to control whether to suppress messages
from REDCapR API calls. Default |
Value
Returns a tibble of redcap_event_names with list elements containing a vector of associated instruments.
Apply variable labels to a REDCapTidieR supertibble
Description
Take a supertibble and use the labelled package to apply variable labels to the columns of the supertibble as well as to each tibble in the redcap_data, redcap_metadata, and redcap_events columns of that supertibble.
Usage
make_labelled(supertbl, format_labels = NULL)
Arguments
supertbl |
a supertibble generated using |
format_labels |
one or multiple optional label formatting functions. A label formatting function is a function that takes a character vector and returns a modified character vector of the same length. This function is applied to field labels before attaching them to variables. One of:
|
Details
The variable labels for the data tibbles are derived from the field_label column of the metadata tibble.
Value
A labelled supertibble.
Examples
superheroes_supertbl
make_labelled(superheroes_supertbl)
make_labelled(superheroes_supertbl, format_labels = tolower)
## Not run:
redcap_uri <- Sys.getenv("REDCAP_URI")
token <- Sys.getenv("REDCAP_TOKEN")
supertbl <- read_redcap(redcap_uri, token)
make_labelled(supertbl)
## End(Not run)
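Because a label formatting function is any function that maps a character vector to a modified character vector of the same length, a user-defined formatter can be written in a few lines. The sketch below is illustrative: strip_trailing_question() is a hypothetical name, and the final call assumes the REDCapTidieR and labelled packages are installed.

```r
# Hypothetical formatter: drop a trailing question mark from each label.
strip_trailing_question <- function(x) {
  sub("\\?[[:space:]]*$", "", x)
}

labels <- c("REDCap instrument completed?", "Hero name")
formatted <- strip_trailing_question(labels)
formatted

# Pass the custom formatter to make_labelled() when the packages are available.
if (requireNamespace("REDCapTidieR", quietly = TRUE) &&
    requireNamespace("labelled", quietly = TRUE)) {
  REDCapTidieR::make_labelled(
    REDCapTidieR::superheroes_supertbl,
    format_labels = strip_trailing_question
  )
}
```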
Make skimr labels from default skimr outputs
Description
A simple helper function that returns all default skimr names as a formatted character vector for use in make_labelled
Usage
make_skimr_labels()
Details
All labels supplied are manually created and agreed upon as human-readable
Value
A character vector
Update multiple choice fields with label data
Description
Update REDCap variables with multi-choice types to standard form labels taken from REDCap metadata.
Usage
multi_choice_to_labels(
db_data,
db_metadata,
raw_or_label = "label",
call = caller_env()
)
Arguments
db_data |
A REDCap database object |
db_metadata |
A REDCap metadata object |
raw_or_label |
A string (either 'raw', 'label', or 'haven') that specifies whether
to export the raw coded values or the labels for the options of categorical
fields. Default is 'label'. If 'haven' is supplied, categorical fields are converted
to |
call |
call for conditions |
Details
Coerce variables of field_type "truefalse", "yesno", and "checkbox" to logical. Introduce form_status_complete column and append to end of tibble outputs. Ensure field_types "dropdown" and "radio" are converted appropriately since label appendings are important and unique to these.
Parse labels from REDCap metadata into usable formats
Description
Takes a string separated by ,s and/or |s (i.e. comma/pipe separated values) containing key value pairs (raw and label) and returns a tidy tibble.
Usage
parse_labels(string, return_vector = FALSE, return_stripped_text_flag = FALSE)
Arguments
string |
A |
return_vector |
logical for whether to return result as a vector |
return_stripped_text_flag |
logical for whether to return a flag indicating whether or not text was stripped from labels |
Details
The associated string comes from metadata outputs.
Value
A tidy tibble from a matrix giving raw and label outputs to be used in later functions if return_vector = FALSE, the default. Otherwise a vector result in a c(raw = label) format to use with dplyr::recode
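The kind of parsing described above can be sketched in base R. This is an illustration only, not the package's parser. parse_choices() is a hypothetical name, the input string is made up, and each option is assumed to take the form "raw, label" with options separated by "|".

```r
# Hypothetical parser: split options on "|", then split each option on its
# first comma into a raw code and a label.
parse_choices <- function(string) {
  pairs <- trimws(strsplit(string, "|", fixed = TRUE)[[1]])
  data.frame(
    raw = trimws(sub(",.*$", "", pairs)),
    label = trimws(sub("^[^,]*,", "", pairs)),
    stringsAsFactors = FALSE
  )
}

choices <- parse_choices("1, Good | 2, Evil | 3, Neutral, mostly")
choices

# Named-vector form (names are raw codes, values are labels) used as a lookup.
lookup <- stats::setNames(choices$label, choices$raw)
recoded <- unname(lookup[c("2", "1")])
recoded
```

Splitting on the first comma only keeps labels that themselves contain commas intact, as in the third option above.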
Convert yesno, truefalse, and checkbox fields to logical
Description
Convert yesno, truefalse, and checkbox fields to logical
Usage
parse_logical_cols(db_data, db_metadata, call = caller_env())
Arguments
db_data |
A REDCap database object |
db_metadata |
A REDCap metadata object |
call |
call for conditions |
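The conversion can be pictured with a small base R sketch. It assumes the raw export codes these fields as 0/1 and that the names of the affected fields have already been looked up from the metadata. to_logical() and the toy columns are hypothetical, and the package's own function also handles conditions through call.

```r
# Hypothetical converter for 0/1 coded fields.
to_logical <- function(x) {
  as.logical(as.integer(x))
}

toy <- data.frame(
  record_id = 1:3,
  is_hero = c(1, 0, NA),
  can_fly = c("1", "0", "1"),
  stringsAsFactors = FALSE
)

# In practice these names would come from metadata rows whose field_type is
# "yesno", "truefalse", or "checkbox".
logical_fields <- c("is_hero", "can_fly")
toy[logical_fields] <- lapply(toy[logical_fields], to_logical)
toy
```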
Import a REDCap database into a tidy supertibble
Description
Query the REDCap API to retrieve data and metadata about a project, and transform the output into a "supertibble" that contains data and metadata organized into tibbles, broken down by instrument.
Usage
read_redcap(
redcap_uri,
token,
raw_or_label = "label",
forms = NULL,
export_survey_fields = NULL,
export_data_access_groups = NULL,
suppress_redcapr_messages = TRUE,
guess_max = Inf,
allow_mixed_structure = getOption("redcaptidier.allow.mixed.structure", FALSE)
)
Arguments
redcap_uri |
The URI/URL of the REDCap server (e.g., "https://server.org/apps/redcap/api/"). Required. |
token |
The user-specific string that serves as the password for a project. Required. |
raw_or_label |
A string (either 'raw', 'label', or 'haven') that specifies whether
to export the raw coded values or the labels for the options of categorical
fields. Default is 'label'. If 'haven' is supplied, categorical fields are converted
to |
forms |
A character vector of REDCap instrument names that specifies
which instruments to import. Default is |
export_survey_fields |
A logical that specifies whether to export
survey identifier and timestamp fields. The default, |
export_data_access_groups |
A logical that specifies whether to export
the data access group field. The default, |
suppress_redcapr_messages |
A logical to control whether to suppress messages
from REDCapR API calls. Default |
guess_max |
A positive base::numeric value
passed to |
allow_mixed_structure |
A logical to allow for support of mixed repeating/non-repeating
instruments. Setting to |
Details
This function uses the REDCapR package to query the REDCap API. The REDCap API returns a block matrix that mashes data from all data collection instruments together. The read_redcap() function deconstructs the block matrix and splices the data into individual tibbles, where one tibble represents the data from one instrument.
Value
A tibble in which each row represents a REDCap instrument. It contains the following columns:
- redcap_form_name, the name of the instrument
- redcap_form_label, the label for the instrument
- redcap_data, a tibble with the data for the instrument
- redcap_metadata, a tibble of data dictionary entries for each field in the instrument
- redcap_events, a tibble with information about the arms and longitudinal events represented in the instrument. Only if the project has longitudinal events enabled
- structure, the instrument structure, either "repeating" or "nonrepeating"
- data_rows, the number of rows in the instrument's data tibble
- data_cols, the number of columns in the instrument's data tibble
- data_size, the size in memory of the instrument's data tibble computed by lobstr::obj_size()
- data_na_pct, the percentage of cells in the instrument's data columns that are NA excluding identifier and form completion columns
Examples
## Not run:
redcap_uri <- Sys.getenv("REDCAP_URI")
token <- Sys.getenv("REDCAP_TOKEN")
read_redcap(
redcap_uri,
token,
raw_or_label = "label"
)
## End(Not run)
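To make the "block matrix" idea in Details concrete, the toy sketch below splits one wide table into one table per instrument using a field-to-instrument mapping. It is a simplified illustration and not how read_redcap() is implemented. split_by_instrument() and the toy data are hypothetical, and repeating instruments, events, and empty-row removal are ignored.

```r
# Toy "block matrix": fields from two instruments in one wide table.
db_data <- data.frame(
  record_id = c(1, 2),
  name = c("A-Bomb", "Abe Sapien"),
  power = c("Flight", NA),
  stringsAsFactors = FALSE
)

# Toy data dictionary mapping each field to its instrument.
db_metadata <- data.frame(
  field_name = c("name", "power"),
  form_name = c("heroes_information", "super_hero_powers"),
  stringsAsFactors = FALSE
)

# Hypothetical splitter: keep the identifier plus each instrument's own fields.
split_by_instrument <- function(db_data, db_metadata, id_field = names(db_data)[1]) {
  fields_by_form <- split(db_metadata$field_name, db_metadata$form_name)
  lapply(fields_by_form, function(fields) {
    db_data[, c(id_field, fields), drop = FALSE]
  })
}

instrument_tables <- split_by_instrument(db_data, db_metadata)
names(instrument_tables)
instrument_tables$super_hero_powers
```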
Additional release questions
Description
Additional release questions to be added when using devtools::release() during CRAN submissions.
Usage
release_questions()
Details
This follows the documentation provided in devtools::release().
Value
A series of character string questions
Remove rows with empty data
Description
Remove rows that are empty in all associated data columns (those derived from fields in REDCap). This occurs when a form is filled out in an event, but other forms are not. Regardless of a form's status, all forms in an event are included in the output so long as any form in the event contains data.
This only applies to longitudinal REDCap databases containing events.
Usage
remove_empty_rows(data, my_record_id)
Arguments
data |
A REDCap dataframe from a longitudinal database,
pre-processed within a |
my_record_id |
The record ID defined in the project. |
Value
A dataframe.
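The row filter described above can be sketched in base R. This is an illustration, not the package's code. drop_empty_rows() is a hypothetical name, and which columns count as identifiers is passed in explicitly as an assumption.

```r
# Hypothetical filter: keep rows with at least one non-NA value among the
# data columns (everything that is not an identifier column).
drop_empty_rows <- function(data, id_cols) {
  data_cols <- setdiff(names(data), id_cols)
  has_data <- rowSums(!is.na(data[, data_cols, drop = FALSE])) > 0
  data[has_data, , drop = FALSE]
}

toy <- data.frame(
  record_id = c(1, 1, 2),
  redcap_event = c("baseline", "followup", "baseline"),
  height = c(180, NA, NA),
  weight = c(NA, NA, 70),
  stringsAsFactors = FALSE
)

kept <- drop_empty_rows(toy, id_cols = c("record_id", "redcap_event"))
kept
```

Here the "followup" row for record 1 is dropped because both data columns are NA, while rows with at least one filled data column are kept.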
Replace checkbox TRUEs with raw_or_label values
Description
Simple utility function for replacing checkbox field values.
Usage
replace_true(col, col_name, metadata, raw_or_label)
Arguments
col |
A vector |
col_name |
A string |
metadata |
A metadata tibble from the original supertibble |
raw_or_label |
Either 'raw' or 'label' to specify whether to use raw coded values or labels for the options. Default 'label'. |
Value
A character string
Convert user input into label formatting function
Description
Convert user input into label formatting function
Usage
resolve_formatter(format_labels, env = caller_env(n = 2), call = caller_env())
Arguments
format_labels |
argument passed to |
env |
the environment in which to look up functions if
|
call |
the calling environment to use in the error message |
Value
a function
Safely set variable labels
Description
A utility function for setting labels of a tibble from a named vector while accounting for labels that may not be present in the data.
Usage
safe_set_variable_labels(data, labs)
Value
A tibble
Apply applicable skimmers to data
Description
A helper function for add_skimr_metadata() which applies applicable skimmers to a given dataframe.
Usage
skim_data(redcap_data, redcap_metadata, is_labelled)
Value
A dataframe
Remove html tags and field embedding logic from a string
Description
Remove html tags and field embedding logic from a string
Usage
strip_html_field_embedding(x)
Arguments
x |
vector of strings to format |
Value
vector of strings with html tags, field embedding logic, and extra whitespace removed
Superheroes Data
Description
A dataset of superheroes in a REDCapTidieR supertbl object
Usage
superheroes_supertbl
Format
heroes_information
A tibble with 734 rows and 12 columns:
- record_id: REDCap record ID
- name: Hero name
- gender: Gender
- eye_color: Eye color
- race: Race
- hair_color: Hair color
- height: Height
- weight: Weight
- publisher: Publisher
- skin_color: Skin color
- alignment: Alignment
- form_status_complete: REDCap instrument completed?
super_hero_powers
A tibble with 5,966 rows and 4 columns:
- record_id: REDCap record ID
- redcap_form_instance: REDCap repeat instance
- power: Super power
- form_status_complete: REDCap instrument completed?
Source
Recode fields using supertbl metadata
Description
This utility function helps to map metadata field types in order to apply changes in supertbl tables.
Usage
supertbl_recode(supertbl, supertbl_meta, add_labelled_column_headers)
Arguments
supertbl |
A supertibble generated using |
supertbl_meta |
an |
add_labelled_column_headers |
Whether or not to include labelled outputs |
Provide a succinct summary of an object
Description
tbl_sum() gives a brief textual description of a table-like object, which should include the dimensions and the data source in the first element, and additional information in the other elements (such as grouping for dplyr). The default implementation forwards to obj_sum().
Usage
## S3 method for class 'redcap_supertbl'
tbl_sum(x)
Arguments
x |
Object to summarise. |
Value
A named character vector, describing the dimensions in the first element and the data source in the name of the first element.
Make a REDCapR API call with custom error handling
Description
Make a REDCapR API call with custom error handling
Usage
try_redcapr(expr, call = caller_env())
Arguments
expr |
an expression making a |
call |
the calling environment to use in the warning message |
Value
If successful, the data element of the REDCapR result. Otherwise an error
Implement REDCapR DAG Data into Supertibble
Description
This helper function uses output from REDCapR::redcap_dag_read and applies the necessary raw/label values to the redcap_data_access_group column.
This is done because REDCapTidieR retrieves raw data by default, then merges labels from the metadata. However, some columns like redcap_data_access_group are not in the metadata and so there is nothing by default to reference.
Usage
update_dag_cols(data, dag_data, raw_or_label)
Arguments
data |
the REDCap data |
dag_data |
a DAG dataset exported from REDCapR::redcap_dag_read |
raw_or_label |
A string (either 'raw', 'label', or 'haven') that specifies whether
to export the raw coded values or the labels for the options of categorical
fields. Default is 'label'. If 'haven' is supplied, categorical fields are converted
to |
Correctly label variables belonging to checkboxes with minus signs
Description
Using db_data and db_metadata, temporarily create a conversion column that reverts automatic REDCap behavior where database column names have "-"s converted to "_"s.
Usage
update_data_col_names(db_data, db_metadata)
Arguments
db_data |
The REDCap database output defined by
|
db_metadata |
The REDCap metadata output defined by
|
Details
This is an issue with checkbox fields since analysts should be able to verify checkbox variable suffixes with their label counterparts.
Value
Updated db_data column names for checkboxes where "-"s were replaced by "_"s.
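One way to picture the renaming is a lookup from the exported column names (with "_") back to the metadata-derived names (with "-"). The base R sketch below is an assumption-laden illustration, not the package's implementation. restore_minus_names() and the example field names are hypothetical.

```r
# Hypothetical helper: map exported names back to names containing "-".
restore_minus_names <- function(data_names, metadata_names) {
  # What the export would have produced for each metadata-derived name
  exported <- gsub("-", "_", metadata_names, fixed = TRUE)
  lookup <- stats::setNames(metadata_names, exported)
  hit <- data_names %in% names(lookup)
  data_names[hit] <- unname(lookup[data_names[hit]])
  data_names
}

data_names <- c("record_id", "symptoms____99", "symptoms___1")
metadata_names <- c("symptoms___-99", "symptoms___1")

restored <- restore_minus_names(data_names, metadata_names)
restored
```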
Update metadata field names for checkbox handling
Description
Takes a db_metadata object and:
- replaces checkbox field rows with a set of rows, one for each checkbox option
- appends a field_name_updated field to the end for checkbox variable handling
- updates field_label for any new checkbox rows to include the specific option in "field_label: option label" format
- strips html and field embedding logic from field_label
Usage
update_field_names(db_metadata)
Arguments
db_metadata |
The REDCap metadata output defined by
|
Details
Assumes db_metadata:
- has non-zero number of rows
- contains field_name and field_label columns
Value
Column db_metadata with field_name_updated appended and field_label updated for new rows corresponding to checkbox options
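The row expansion can be sketched in base R as follows. This is a simplified illustration, not the package's implementation. expand_checkbox_rows() is a hypothetical name, and the sketch assumes options are stored as "raw, label | raw, label" in a select_choices_or_calculations column and that per-option names use a "___" separator. HTML and field-embedding stripping are left out.

```r
# Hypothetical expansion of checkbox rows into one row per option.
expand_checkbox_rows <- function(db_metadata) {
  rows <- lapply(seq_len(nrow(db_metadata)), function(i) {
    row <- db_metadata[i, , drop = FALSE]
    if (row$field_type != "checkbox") {
      row$field_name_updated <- row$field_name
      return(row)
    }
    options <- trimws(
      strsplit(row$select_choices_or_calculations, "|", fixed = TRUE)[[1]]
    )
    raw <- trimws(sub(",.*$", "", options))
    label <- trimws(sub("^[^,]*,", "", options))
    # Repeat the checkbox row once per option, then fill in per-option values
    out <- row[rep(1, length(options)), , drop = FALSE]
    out$field_name_updated <- paste0(row$field_name, "___", raw)
    out$field_label <- paste0(row$field_label, ": ", label)
    out
  })
  out <- do.call(rbind, rows)
  rownames(out) <- NULL
  out
}

db_metadata <- data.frame(
  field_name = c("name", "powers"),
  field_label = c("Name", "Powers"),
  field_type = c("text", "checkbox"),
  select_choices_or_calculations = c(NA, "1, Flight | 2, Strength"),
  stringsAsFactors = FALSE
)

expanded <- expand_checkbox_rows(db_metadata)
expanded[, c("field_name_updated", "field_label")]
```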
Vector type as a string
Description
vec_ptype_full() displays the full type of the vector. vec_ptype_abbr() provides an abbreviated summary suitable for use in a column heading.
Usage
## S3 method for class 'redcap_supertbl'
vec_ptype_abbr(x, ..., prefix_named, suffix_shape)
Arguments
x |
A vector. |
... |
These dots are for future extensions and must be empty. |
prefix_named |
If |
suffix_shape |
If |
Value
A string.
Write Supertibbles to XLSX
Description
Transform a supertibble into an XLSX file, with each REDCap data tibble in a separate sheet.
Usage
write_redcap_xlsx(
supertbl,
file,
add_labelled_column_headers = NULL,
use_labels_for_sheet_names = TRUE,
include_toc_sheet = TRUE,
include_metadata_sheet = TRUE,
table_style = "tableStyleLight8",
column_width = "auto",
recode_logical = TRUE,
na_replace = "",
overwrite = FALSE
)
Arguments
supertbl |
A supertibble generated using |
file |
The name of the file to which the output will be written. |
add_labelled_column_headers |
If |
use_labels_for_sheet_names |
If |
include_toc_sheet |
If |
include_metadata_sheet |
If |
table_style |
Any Excel table style name or "none". For more details, see
the
"formatting" vignette
of the |
column_width |
Sets the width of columns throughout the workbook. The default is "auto", but you can specify a numeric value. |
recode_logical |
If |
na_replace |
The value used to replace |
overwrite |
If |
Value
An openxlsx2 workbook object, invisibly
Examples
## Not run:
redcap_uri <- Sys.getenv("REDCAP_URI")
token <- Sys.getenv("REDCAP_TOKEN")
supertbl <- read_redcap(redcap_uri, token)
supertbl %>%
write_redcap_xlsx(file = "supertibble.xlsx")
# Add variable labels
library(labelled)
supertbl %>%
make_labelled() %>%
write_redcap_xlsx(file = "supertibble.xlsx", add_labelled_column_headers = TRUE)
## End(Not run)