This vignette describes the functionality available in the clinDataReview package for annotating, filtering and transforming data.
Utility functions that automate standard pre-processing steps of the data are available in the package.
Note that these functions are mainly useful in combination with the parameters specified in the ‘config’ files of the clinical data reports (see the dedicated reporting vignette).
For this vignette, we will use example data available in the clinUtils package.
library(clinDataReview)
The input dataset for the clinical data review should be a data.frame containing clinical data. Such data is typically imported from a SAS data file or an xpt data file.
Datasets can be imported from multiple files at once with the clinUtils::loadDataADaMSDTM function.
The labels of the variables stored in the SAS datasets are also used for titles and captions.
A few ADaM datasets are included in the clinUtils package for demonstration, in the dataset dataADaMCDISCP01, together with the corresponding variable labels.
library(clinUtils)
data(dataADaMCDISCP01)
labelVars <- attr(dataADaMCDISCP01, "labelVars")
dataLB <- dataADaMCDISCP01$ADLBC
dataDM <- dataADaMCDISCP01$ADSL
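dataAE <- dataADaMCDISCP01$ADAE
The annotateData function adds metadata to a specific domain/dataset. In the example below, the ETHNIC and ARM variables of the subject-level dataset are added to the laboratory data, matched on USUBJID.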
dataLBAnnot <- annotateData(
data = dataLB,
annotations = list(data = dataDM, vars = c("ETHNIC", "ARM")),
verbose = TRUE
)
## Data annotated with variable(s): ETHNIC ('ETHNIC'), ARM ('ARM') from the 'custom' dataset based on the variable(s): USUBJID ('USUBJID').
knitr::kable(
head(dataLBAnnot),
caption = paste("Laboratory parameters annotated with",
"demographics information with the `annotateData` function"
)
)

| STUDYID | SUBJID | USUBJID | TRTP | TRTPN | TRTA | TRTAN | TRTSDT | TRTEDT | AGE | AGEGR1 | AGEGR1N | RACE | RACEN | SEX | COMP24FL | DSRAEFL | SAFFL | AVISIT | AVISITN | ADY | ADT | VISIT | VISITNUM | PARAM | PARAMCD | PARAMN | PARCAT1 | AVAL | BASE | CHG | A1LO | A1HI | R2A1LO | R2A1HI | BR2A1LO | BR2A1HI | ANL01FL | ALBTRVAL | ANRIND | BNRIND | ABLFL | AENTMTFL | LBSEQ | LBNRIND | LBSTRESN | DATASET | ETHNIC | ARM |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Baseline | 0 | -9 | 2013-08-14 | SCREENING 1 | 1 | Sodium (mmol/L) | SODIUM | 18 | CHEM | 139.00 | 139.00 | NA | 132.0 | 147.0 | 1.053030 | 0.9455782 | 1.053030 | 0.9455782 | | 81.50 | N | N | Y | | 26 | NORMAL | 139.00 | ADLBC | NOT HISPANIC OR LATINO | Xanomeline High Dose |
| CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Baseline | 0 | -9 | 2013-08-14 | SCREENING 1 | 1 | Potassium (mmol/L) | K | 19 | CHEM | 4.00 | 4.00 | NA | 3.4 | 5.4 | 1.176471 | 0.7407407 | 1.176471 | 0.7407407 | | 4.10 | N | N | Y | | 19 | NORMAL | 4.00 | ADLBC | NOT HISPANIC OR LATINO | Xanomeline High Dose |
| CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Baseline | 0 | -9 | 2013-08-14 | SCREENING 1 | 1 | Chloride (mmol/L) | CL | 20 | CHEM | 109.00 | 109.00 | NA | 94.0 | 112.0 | 1.159574 | 0.9732143 | 1.159574 | 0.9732143 | | 62.00 | N | N | Y | | 11 | NORMAL | 109.00 | ADLBC | NOT HISPANIC OR LATINO | Xanomeline High Dose |
| CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Baseline | 0 | -9 | 2013-08-14 | SCREENING 1 | 1 | Bilirubin (umol/L) | BILI | 21 | CHEM | 8.55 | 8.55 | NA | 3.0 | 21.0 | 2.850000 | 0.4071429 | 2.850000 | 0.4071429 | | 22.95 | N | N | Y | | 6 | NORMAL | 8.55 | ADLBC | NOT HISPANIC OR LATINO | Xanomeline High Dose |
| CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Baseline | 0 | -9 | 2013-08-14 | SCREENING 1 | 1 | Alkaline Phosphatase (U/L) | ALP | 22 | CHEM | 88.00 | 88.00 | NA | 31.0 | 110.0 | 2.838710 | 0.8000000 | 2.838710 | 0.8000000 | | 77.00 | N | N | Y | | 2 | NORMAL | 88.00 | ADLBC | NOT HISPANIC OR LATINO | Xanomeline High Dose |
| CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Baseline | 0 | -9 | 2013-08-14 | SCREENING 1 | 1 | Gamma Glutamyl Transferase (U/L) | GGT | 23 | CHEM | 43.00 | 43.00 | NA | 10.0 | 61.0 | 4.300000 | 0.7049180 | 4.300000 | 0.7049180 | | 48.50 | N | N | Y | | 15 | NORMAL | 43.00 | ADLBC | NOT HISPANIC OR LATINO | Xanomeline High Dose |
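The effect of this annotation step can be pictured as a left join on the subject identifier. The following is only a minimal base R sketch on hypothetical toy data (`labs`, `subjects` and their values are made up for illustration); it is not the implementation used by annotateData.

```r
# Toy stand-ins for a laboratory dataset and a subject-level dataset
# (hypothetical values, not taken from dataADaMCDISCP01).
labs <- data.frame(
  USUBJID = c("S1", "S1", "S2", "S3"),
  PARAMCD = c("ALT", "BILI", "ALT", "ALT"),
  AVAL = c(34, 8.55, 41, 29),
  stringsAsFactors = FALSE
)
subjects <- data.frame(
  USUBJID = c("S1", "S2"),
  ARM = c("Placebo", "Xanomeline High Dose"),
  ETHNIC = c("NOT HISPANIC OR LATINO", "HISPANIC OR LATINO"),
  stringsAsFactors = FALSE
)

# Left join: keep every laboratory record and add the requested
# subject-level columns, matched on USUBJID.
labsAnnot <- merge(
  x = labs,
  y = subjects[, c("USUBJID", "ETHNIC", "ARM")],
  by = "USUBJID",
  all.x = TRUE,
  sort = FALSE
)

# Subject S3 has no subject-level record, so its annotation is missing (NA).
labsAnnot[labsAnnot$USUBJID == "S3", c("ETHNIC", "ARM")]
```

In this sketch the number of laboratory records is unchanged by the join; records without a matching subject simply get missing values in the added columns.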
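The filterData function filters a dataset. In the example below, rev = TRUE reverses the condition, so that the records of the placebo arm are excluded rather than retained.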
dataLBAnnotTreatment <- filterData(
data = dataLBAnnot,
filters = list(var = "ARM", value = "Placebo", rev = TRUE),
verbose = TRUE
)
## 354 records with ARM ('ARM') %in% 'Placebo' are filtered in the data.
knitr::kable(
unique(dataLBAnnotTreatment[, c("USUBJID", "ARM")]),
caption = paste("Subset of laboratory parameters",
"with placebo patients filtered out"
)
)

| | USUBJID | ARM |
|---|---|---|
| 1 | 01-701-1148 | Xanomeline High Dose |
| 397 | 01-701-1192 | Xanomeline Low Dose |
| 793 | 01-701-1211 | Xanomeline Low Dose |
| 1363 | 01-718-1371 | Xanomeline High Dose |
| 1615 | 01-718-1427 | Xanomeline High Dose |
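As a rough illustration of what a reversed filter does, the same kind of selection can be written in base R with a logical condition and its negation. This is a sketch on hypothetical toy data, not the code used by filterData.

```r
# Hypothetical toy data: one record per subject with its treatment arm.
labs <- data.frame(
  USUBJID = c("S1", "S2", "S3", "S4"),
  ARM = c("Placebo", "Xanomeline High Dose", "Placebo", "Xanomeline Low Dose"),
  stringsAsFactors = FALSE
)

# Condition: the record belongs to the placebo arm.
isPlacebo <- labs$ARM %in% "Placebo"

# Condition applied as stated: only placebo records are retained.
labsPlacebo <- labs[isPlacebo, , drop = FALSE]

# Reversed condition (comparable to rev = TRUE above):
# placebo records are excluded.
labsTreated <- labs[!isPlacebo, , drop = FALSE]
```

Using `%in%` rather than `==` means that a missing arm would count as "not placebo" instead of propagating NA into the row selection.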
The transformData function converts data to a different format.
For example, the laboratory data is converted from a long format, with one record per endpoint, visit and subject, to a wide format with one record per visit and subject, in which each endpoint is stored in its own set of columns.
eDishData <- transformData(
data = subset(dataLB, PARAMCD %in% c("ALT", "BILI")),
transformations = list(
type = "pivot_wider",
varsID = c("USUBJID", "VISIT"),
varsValue = c("LBSTRESN", "LBNRIND"),
varPivot = "PARAMCD"
),
verbose = TRUE,
labelVars = labelVars
)
## Warning in reshapeWide(data, idvar = idvar, timevar = timevar, varying = varying, : some constant variables
## (AVISIT,AVISITN,PARAM,PARAMN,AVAL,BASE,CHG,A1LO,A1HI,R2A1LO,R2A1HI,BR2A1LO,BR2A1HI,ANL01FL,ALBTRVAL,LBSEQ) are really varying
## Warning in reshapeWide(data, idvar = idvar, timevar = timevar, varying = varying, : multiple rows match for PARAMCD=BILI: first taken
## Warning in reshapeWide(data, idvar = idvar, timevar = timevar, varying = varying, : multiple rows match for PARAMCD=ALT: first taken
## Data is converted to a wide format with variables: 'LBSTRESN', 'LBNRIND' for different: 'PARAMCD' by 'Unique Subject Identifier', 'Visit Name' pivoted to different columns.
knitr::kable(head(eDishData))

| | STUDYID | SUBJID | USUBJID | TRTP | TRTPN | TRTA | TRTAN | TRTSDT | TRTEDT | AGE | AGEGR1 | AGEGR1N | RACE | RACEN | SEX | COMP24FL | DSRAEFL | SAFFL | AVISIT | AVISITN | ADY | ADT | VISIT | VISITNUM | PARAM | PARAMN | PARCAT1 | AVAL | BASE | CHG | A1LO | A1HI | R2A1LO | R2A1HI | BR2A1LO | BR2A1HI | ANL01FL | ALBTRVAL | ANRIND | BNRIND | ABLFL | AENTMTFL | LBSEQ | DATASET | LBSTRESN.BILI | LBNRIND.BILI | LBSTRESN.ALT | LBNRIND.ALT |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Baseline | 0 | -9 | 2013-08-14 | SCREENING 1 | 1 | Bilirubin (umol/L) | 21 | CHEM | 8.55 | 8.55 | NA | 3 | 21 | 2.85 | 0.4071429 | 2.85 | 0.4071429 | | 22.95 | N | N | Y | | 6 | ADLBC | 8.55 | NORMAL | 34 | NORMAL |
| 40 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Week 2 | 2 | 14 | 2013-09-05 | WEEK 2 | 4 | Bilirubin (umol/L) | 21 | CHEM | 8.55 | 8.55 | 0.00 | 3 | 21 | 2.85 | 0.4071429 | 2.85 | 0.4071429 | | 22.95 | N | N | | | 43 | ADLBC | 8.55 | NORMAL | 41 | NORMAL |
| 76 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Week 4 | 4 | 28 | 2013-09-19 | WEEK 4 | 5 | Bilirubin (umol/L) | 21 | CHEM | 8.55 | 8.55 | 0.00 | 3 | 21 | 2.85 | 0.4071429 | 2.85 | 0.4071429 | | 22.95 | N | N | | | 78 | ADLBC | 8.55 | NORMAL | 35 | NORMAL |
| 112 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Week 6 | 6 | 42 | 2013-10-03 | WEEK 6 | 7 | Bilirubin (umol/L) | 21 | CHEM | 8.55 | 8.55 | 0.00 | 3 | 21 | 2.85 | 0.4071429 | 2.85 | 0.4071429 | | 22.95 | N | N | | | 108 | ADLBC | 8.55 | NORMAL | 31 | NORMAL |
| 148 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Week 8 | 8 | 57 | 2013-10-18 | WEEK 8 | 8 | Bilirubin (umol/L) | 21 | CHEM | 8.55 | 8.55 | 0.00 | 3 | 21 | 2.85 | 0.4071429 | 2.85 | 0.4071429 | | 22.95 | N | N | | | 138 | ADLBC | 8.55 | NORMAL | 31 | NORMAL |
| 184 | CDISCPILOT01 | 1148 | 01-701-1148 | Xanomeline High Dose | 81 | Xanomeline High Dose | 81 | 2013-08-23 | 2014-02-20 | 57 | <65 | 1 | WHITE | 1 | M | Y | | Y | Week 12 | 12 | 87 | 2013-11-17 | WEEK 12 | 9 | Bilirubin (umol/L) | 21 | CHEM | 6.84 | 8.55 | -1.71 | 3 | 21 | 2.28 | 0.3257143 | 2.85 | 0.4071429 | Y | 24.66 | N | N | | | 168 | ADLBC | 6.84 | NORMAL | 39 | NORMAL |
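The warnings printed above come from reshapeWide, an internal helper of stats::reshape, which suggests that the pivot relies on base R reshaping. They indicate that columns outside the identifier and value variables differ between the records that are collapsed into one row, and that several records match the same identifier combination, in which case the first one is taken. The sketch below, on hypothetical toy data, shows the same long-to-wide conversion with stats::reshape; because only the identifier, pivot and value columns are passed in, and each combination occurs once, no such warnings are expected here. It is an illustration, not the transformData implementation.

```r
# Hypothetical long-format laboratory data:
# one record per subject, visit and parameter.
labsLong <- data.frame(
  USUBJID = rep(c("S1", "S2"), each = 4),
  VISIT = rep(rep(c("SCREENING 1", "WEEK 2"), each = 2), times = 2),
  PARAMCD = rep(c("ALT", "BILI"), times = 4),
  LBSTRESN = c(34, 8.55, 41, 8.55, 20, 5.13, 22, 6.84),
  LBNRIND = "NORMAL",
  stringsAsFactors = FALSE
)

# Wide format: one record per subject and visit,
# with one set of value columns per parameter.
labsWide <- reshape(
  data = labsLong,
  direction = "wide",
  idvar = c("USUBJID", "VISIT"),       # comparable to varsID
  timevar = "PARAMCD",                 # comparable to varPivot
  v.names = c("LBSTRESN", "LBNRIND")   # comparable to varsValue
)

names(labsWide)
```

The resulting columns follow the `<value>.<parameter>` naming pattern seen in the table above, for example `LBSTRESN.ALT` and `LBNRIND.BILI`.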
The processData function executes all the pre-processing steps described in the previous sections in a single call.
dataLBAnnotTreatment2 <- processData(
data = dataLB,
processing = list(
list(annotate = list(data = dataDM, vars = c("ETHNIC", "ARM"))),
list(filter = list(var = "ARM", value = "Placebo", rev = TRUE))
),
verbose = TRUE
)
identical(dataLBAnnotTreatment, dataLBAnnotTreatment2)
## [1] TRUE
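The idea of chaining pre-processing steps can be sketched in base R by storing each step as a function and applying them one after the other. The names `labs`, `subjects` and `steps` are hypothetical, and this is only an illustration of the pattern, not how processData is implemented. It also shows why the order matters in the example above: the ARM variable used by the filter only exists once the annotation step has run.

```r
# Hypothetical toy data.
labs <- data.frame(
  USUBJID = c("S1", "S1", "S2"),
  PARAMCD = c("ALT", "BILI", "ALT"),
  stringsAsFactors = FALSE
)
subjects <- data.frame(
  USUBJID = c("S1", "S2"),
  ARM = c("Placebo", "Xanomeline Low Dose"),
  stringsAsFactors = FALSE
)

# Each pre-processing step takes a data.frame and returns a data.frame.
steps <- list(
  # step 1: annotate with the treatment arm (left join on USUBJID)
  annotate = function(d) merge(d, subjects, by = "USUBJID", all.x = TRUE, sort = FALSE),
  # step 2: exclude placebo records (requires ARM from step 1)
  filter = function(d) d[!d$ARM %in% "Placebo", , drop = FALSE]
)

# Apply the steps in the order in which they are listed.
result <- Reduce(function(d, step) step(d), steps, labs)
```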
R version 4.3.3 (2024-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.4 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so; LAPACK version 3.10.0

locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=C LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

time zone: Etc/UTC
tzcode source: system (glibc)

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] clinUtils_0.1.5 clinDataReview_1.5.1 knitr_1.46

loaded via a namespace (and not attached):
[1] plotly_4.10.4 sass_0.4.9 utf8_1.2.4 generics_0.1.3 tidyr_1.3.1 xml2_1.3.6 stringi_1.8.3 jsonvalidate_1.3.2
[9] hms_1.1.3 digest_0.6.35 magrittr_2.0.3 evaluate_0.23 grid_4.3.3 bookdown_0.39 fastmap_1.1.1 plyr_1.8.9
[17] jsonlite_1.8.8 httr_1.4.7 purrr_1.0.2 fansi_1.0.6 crosstalk_1.2.1 viridisLite_0.4.2 scales_1.3.0 lazyeval_0.2.2
[25] jquerylib_0.1.4 cli_3.6.2 rlang_1.1.3 munsell_0.5.1 cachem_1.0.8 yaml_2.3.8 tools_4.3.3 parallel_4.3.3
[33] dplyr_1.1.4 colorspace_2.1-0 ggplot2_3.5.0 DT_0.33 forcats_1.0.0 vctrs_0.6.5 R6_2.5.1 lifecycle_1.0.4
[41] stringr_1.5.1 htmlwidgets_1.6.4 pkgconfig_2.0.3 pillar_1.9.0 bslib_0.7.0 gtable_0.3.4 data.table_1.15.4 glue_1.7.0
[49] Rcpp_1.0.12 haven_2.5.4 xfun_0.43 tibble_3.2.1 tidyselect_1.2.1 htmltools_0.5.8.1 rmarkdown_2.26 compiler_4.3.3