Type: Package
Title: Retrieve and Analyze Clinical Trials Data from Public Registers
Version: 1.24.0
Imports: jsonlite, xml2, nodbi (≥ 0.10.7), rvest, stringi, lubridate, jqr, dplyr, zip, V8, readr, rlang, htmlwidgets, stringdist, tidyr, httr2
URL: https://cran.r-project.org/package=ctrdata, https://rfhb.github.io/ctrdata/
BugReports: https://github.com/rfhb/ctrdata/issues
Description: A system for querying, retrieving and analyzing protocol- and results-related information on clinical trials from four public registers, the 'European Union Clinical Trials Register' ('EUCTR', https://www.clinicaltrialsregister.eu/), 'ClinicalTrials.gov' (https://clinicaltrials.gov/, also translating queries from the retired classic interface), the 'ISRCTN' (http://www.isrctn.com/) and the 'European Union Clinical Trials Information System' ('CTIS', https://euclinicaltrials.eu/). Trial information is downloaded, converted and stored in a database ('PostgreSQL', 'SQLite', 'DuckDB' or 'MongoDB'; via package 'nodbi'). Protocols, statistical analysis plans, informed consent sheets and other documents in registers associated with trials can also be downloaded. Other functions implement trial concepts canonically across registers, identify deduplicated records, easily find and extract variables (fields) of interest even from complex nested data as used by the registers, merge variables and update queries. The package can be used for monitoring, meta- and trend-analysis of the design and conduct as well as of the results of clinical trials across registers.
License: MIT + file LICENSE
RoxygenNote: 7.3.2
Suggests: devtools, knitr, rmarkdown, RSQLite, mongolite, tinytest (≥ 1.2.1), RPostgres, duckdb, httr, tibble, clipr, chromote
VignetteBuilder: knitr
NeedsCompilation: no
Encoding: UTF-8
Language: en-GB
Packaged: 2025-07-20 20:09:01 UTC; ralfherold
Author: Ralf Herold [aut, cre] (ORCID iD), Marek Kubica [cph] (node-xml2js library), Ivan Bozhanov [cph] (jstree library)
Maintainer: Ralf Herold <ralf.herold@mailbox.org>
Repository: CRAN
Date/Publication: 2025-07-21 07:40:02 UTC

ctrdata: Retrieve and Analyze Clinical Trials Data from Public Registers

Description

A system for querying, retrieving and analyzing protocol- and results-related information on clinical trials from four public registers, the 'European Union Clinical Trials Register' ('EUCTR', https://www.clinicaltrialsregister.eu/), 'ClinicalTrials.gov' (https://clinicaltrials.gov/, also translating queries from the retired classic interface), the 'ISRCTN' (http://www.isrctn.com/) and the 'European Union Clinical Trials Information System' ('CTIS', https://euclinicaltrials.eu/). Trial information is downloaded, converted and stored in a database ('PostgreSQL', 'SQLite', 'DuckDB' or 'MongoDB'; via package 'nodbi'). Protocols, statistical analysis plans, informed consent sheets and other documents in registers associated with trials can also be downloaded. Other functions implement trial concepts canonically across registers, identify deduplicated records, easily find and extract variables (fields) of interest even from complex nested data as used by the registers, merge variables and update queries. The package can be used for monitoring, meta- and trend-analysis of the design and conduct as well as of the results of clinical trials across registers.

Author(s)

Maintainer: Ralf Herold ralf.herold@mailbox.org (ORCID)

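Other contributors:

Marek Kubica (node-xml2js library) [copyright holder]
Ivan Bozhanov (jstree library) [copyright holder]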

See Also

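Useful links:

https://cran.r-project.org/package=ctrdata
https://rfhb.github.io/ctrdata/
Report bugs at https://github.com/rfhb/ctrdata/issues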


Check and prepare nodbi connection object for ctrdata

Description

Check and prepare nodbi connection object for ctrdata

Usage

ctrDb(con)

Arguments

con

A database connection object, created with nodbi. See section '1 - Database connection' in ctrdata.

Value

Connection object as list, with collection element under root


Find synonyms of an active substance

Description

An active substance can be identified by a recommended international nonproprietary name (INN), a trade or product name, or one or more company codes. To find likely synonyms, the function retrieves from CTGOV2 the field protocolSection.armsInterventionsModule.interventions.otherNames. Note that this field does not seem to be based on choices from a dictionary but may be filled in manually; it is thus not free of error and its content needs to be checked.

Usage

ctrFindActiveSubstanceSynonyms(activesubstance = "", verbose = FALSE)

Arguments

activesubstance

An active substance, in an atomic character vector

verbose

Print number of studies found in CTGOV2 for 'activesubstance'

Value

A character vector of the active substance (input parameter) and synonyms, or NULL if active substance was not found and may be invalid

Examples

## Not run: 

ctrFindActiveSubstanceSynonyms(activesubstance = "imatinib")
#  [1] "imatinib"          "CGP 57148"         "CGP 57148B"
#  [4] "CGP57148B"         "Gleevec"           "GLIVEC"
#  [7] "Imatinib"          "Imatinib Mesylate" "NSC 716051"
# [10] "ST1571"            "STI 571"           "STI571"

## End(Not run)
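
Because the otherNames field may be filled in manually, the returned vector can contain spelling variants of the same name. The following is a minimal base R sketch, not part of ctrdata, of how such a vector might be screened before use. It reuses the output shown above; the normalisation rule (ignore case and non-alphanumeric characters) and all variable names are assumptions for illustration.

```r
# Output of the example above, copied as a plain character vector
synonyms <- c(
  "imatinib", "CGP 57148", "CGP 57148B", "CGP57148B", "Gleevec", "GLIVEC",
  "Imatinib", "Imatinib Mesylate", "NSC 716051", "ST1571", "STI 571", "STI571"
)

# Assumed normalisation: lower case, keep only letters and digits
key <- tolower(gsub("[^[:alnum:]]", "", synonyms))

# Group spelling variants that share a normalised key
variants <- split(synonyms, key)

# Keys that occur with more than one spelling
variants[lengths(variants) > 1L]

# Keys that differ by a single edit may be typing errors worth reviewing
d <- utils::adist(names(variants))
close_pairs <- which(d == 1L & upper.tri(d), arr.ind = TRUE)
data.frame(
  a = names(variants)[close_pairs[, "row"]],
  b = names(variants)[close_pairs[, "col"]]
)
```

Whether two close keys refer to the same substance remains a judgement for the user; the sketch only points to candidates.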


Generates queries that work across registers

Description

From high-level search terms provided by the user, generate specific queries for each register with which ctrdata works, see ctrdata-registers. Search terms that are expanded to concepts such as from MeSH and MedDRA by the search implementations in registers include the 'intervention' and 'condition'. Logical operators only work with 'searchPhrase'.

Usage

ctrGenerateQueries(
  searchPhrase = NULL,
  condition = NULL,
  intervention = NULL,
  phase = NULL,
  population = NULL,
  recruitment = NULL,
  startBefore = NULL,
  startAfter = NULL,
  completedBefore = NULL,
  completedAfter = NULL,
  onlyMedIntervTrials = TRUE,
  onlyWithResults = FALSE,
  countries = NULL
)

Arguments

searchPhrase

String with optional logical operators ("AND", "OR") that will be searched in selected fields of registers that can handle logical operators (general or title fields); it should not include quotation marks

condition

String with condition / disease

intervention

String with intervention

phase

String, e.g. "phase 2" (note that "phase 2+3" is a specific category, not the union set of "phase 2" and "phase 3")

population

String indicating which participants can be recruited, e.g. "P" (paediatric), "A" (adult), "P+A" (adult and paediatric), "E" (elderly) or "P+A+E"

recruitment

String, one of "ongoing", "completed", "other" (which includes "ended early" but this cannot be searched; use trial concept f.statusRecruitment to identify this status)

startBefore

String that can be interpreted as date (for EUCTR, when trial was first registered)

startAfter

String that can be interpreted as date (for EUCTR, when trial was first registered)

completedBefore

String that can be interpreted as date (does not work with EUCTR)

completedAfter

String that can be interpreted as date (does not work with EUCTR)

onlyMedIntervTrials

Logical, default TRUE, which indicates if queries should search only for medicine interventional clinical trials

onlyWithResults

Logical, default FALSE, which indicates if queries should search only for trials with results

countries

Vector of country names, two- or three-letter ISO 3166 codes

Value

Named vector of URLs for finding trials in the registers and as input to functions ctrLoadQueryIntoDb and ctrOpenSearchPagesInBrowser

Examples


urls <- ctrGenerateQueries(
  intervention = "antibody",
  phase = "phase 3",
  startAfter = "2000-01-01")

# open queries in register web interface
sapply(urls, ctrOpenSearchPagesInBrowser)

urls <- ctrGenerateQueries(
  searchPhrase = "antibody AND covid",
  recruitment = "completed")

# find research platform and platform trials
urls <- ctrGenerateQueries(
  searchPhrase = paste0(
    "basket OR platform OR umbrella OR master protocol OR ",
    "multiarm OR multistage OR subprotocol OR substudy OR ",
    "multi-arm OR multi-stage OR sub-protocol OR sub-study"),
  startAfter = "01/31/2010",
  countries = c("DE", "US", "United Kingdom"))

# open queries in register web interface
sapply(urls, ctrOpenSearchPagesInBrowser)

## Not run: 
# count trials found
sapply(urls, ctrLoadQueryIntoDb, only.count = TRUE)

# load queries into database collection
dbc <- nodbi::src_sqlite(collection = "my_collection")
sapply(urls, ctrLoadQueryIntoDb, con = dbc)

## End(Not run)
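
A search phrase such as the one in the platform trial example above can also be assembled from a vector of terms. This is a small base R sketch with illustrative variable names; the check for quotation marks reflects the note on searchPhrase in the arguments.

```r
terms <- c("basket", "platform", "umbrella", "master protocol",
           "multi-arm", "multi-stage", "sub-protocol", "sub-study")

# Join the terms with the logical operator OR
search_phrase <- paste(terms, collapse = " OR ")

# searchPhrase should not include quotation marks
stopifnot(!grepl("[\"']", search_phrase))

search_phrase
```

The resulting string could then be passed on, for example as ctrGenerateQueries(searchPhrase = search_phrase).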


Get register name and query parameters from search URL

Description

Extracts query parameters and register name from parameter 'url' or from the clipboard, into which the URL of a register search was copied.

Usage

ctrGetQueryUrl(url = "", register = "")

Arguments

url

URL such as from the browser address bar. If not specified, clipboard contents will be checked for a suitable URL. For automatically copying the user's query of a register in a web browser to the clipboard, see here. Can also contain a query term such as from dbQueryHistory()["query-term"]. Can also be an identifier of a trial, which based on its format will indicate to which register it relates.

register

Optional name of register (one of "EUCTR", "CTGOV2", "ISRCTN" or "CTIS") in case 'url' is a query term but not a full URL

Value

A data frame (or tibble, if tibble is loaded) with column names 'query-term' and 'query-register'. The data frame (or tibble) can be passed as such as parameter 'queryterm' to ctrLoadQueryIntoDb and as parameter 'url' to ctrOpenSearchPagesInBrowser.
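
To illustrate what the query parameters in a search URL look like, the following base R sketch splits the ISRCTN search URL used elsewhere in this manual into named, decoded parameters. It is only an illustration of the URL structure under simple assumptions (every parameter has a value, "+" encodes a space) and is not how ctrGetQueryUrl is implemented.

```r
url <- paste0(
  "https://www.isrctn.com/search?",
  "q=neuroblastoma+OR+lymphoma&filters=phase%3APhase+III")

# Everything after the first "?" is the query string
query <- sub("^[^?]*\\?", "", url)

# Split into "name=value" pairs, then into names and values
pairs <- strsplit(strsplit(query, "&", fixed = TRUE)[[1]], "=", fixed = TRUE)
raw_values <- vapply(pairs, function(p) p[2], character(1))

# Decode "+" as a space, then percent-encoded characters
decoded <- vapply(
  gsub("+", " ", raw_values, fixed = TRUE),
  utils::URLdecode, character(1), USE.NAMES = FALSE)
names(decoded) <- vapply(pairs, function(p) p[1], character(1))

decoded
```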

Examples


# user copied into the clipboard the URL from
# the address bar of the browser that shows results
# from a query in one of the trial registers
if (interactive()) try(ctrGetQueryUrl(), silent = TRUE)

# extract query parameters from search result URL
# (URL was cut for the purpose of formatting only)
ctrGetQueryUrl(
    url = paste0(
        "https://classic.clinicaltrials.gov/ct2/results?",
        "cond=&term=AREA%5BMaximumAge%5D+RANGE%5B0+days%2C+28+days%5D",
        "&type=Intr&rslt=&age_v=&gndr=&intr=Drugs%2C+Investigational",
        "&titles=&outc=&spons=&lead=&id=&cntry=&state=&city=&dist=",
        "&locn=&phase=2&rsub=&strd_s=01%2F01%2F2015&strd_e=01%2F01%2F2016",
        "&prcd_s=&prcd_e=&sfpd_s=&sfpd_e=&rfpd_s=&rfpd_e=&lupd_s=&lupd_e=&sort="
    )
)

# other examples
ctrGetQueryUrl("https://www.clinicaltrialsregister.eu/ctr-search/trial/2007-000371-42/results")
ctrGetQueryUrl("https://euclinicaltrials.eu/ctis-public/view/2022-500041-24-00")
ctrGetQueryUrl("https://classic.clinicaltrials.gov/ct2/show/NCT01492673?cond=neuroblastoma")
ctrGetQueryUrl("https://clinicaltrials.gov/ct2/show/NCT01492673?cond=neuroblastoma")
ctrGetQueryUrl("https://clinicaltrials.gov/study/NCT01467986?aggFilters=ages:child")
ctrGetQueryUrl("https://www.isrctn.com/ISRCTN70039829")

# using identifiers of single trials
ctrGetQueryUrl("70039829")
ctrGetQueryUrl("ISRCTN70039829")
ctrGetQueryUrl("NCT00617929")
ctrGetQueryUrl("2022-501142-30-00")
ctrGetQueryUrl("2012-003632-23")
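
The format of a trial identifier indicates the register to which it relates. A minimal base R sketch of this idea follows; guess_register is a hypothetical helper, and its regular expressions are assumptions derived from the identifiers shown in this manual, not the rules used by ctrdata.

```r
# Hypothetical helper, not a ctrdata function
guess_register <- function(id) {
  # Assumed identifier formats, based on the examples in this manual
  patterns <- c(
    CTGOV2 = "^NCT[0-9]{8}$",
    ISRCTN = "^(ISRCTN)?[0-9]{8}$",
    CTIS   = "^[0-9]{4}-[0-9]{6}-[0-9]{2}-[0-9]{2}$",
    EUCTR  = "^[0-9]{4}-[0-9]{6}-[0-9]{2}(-[A-Z]{2}|-3RD)?$"
  )
  hit <- names(patterns)[vapply(patterns, grepl, logical(1), x = id)]
  # Return NA if no format, or more than one format, matches
  if (length(hit) == 1L) hit else NA_character_
}

ids <- c("70039829", "ISRCTN70039829", "NCT00617929",
         "2022-501142-30-00", "2012-003632-23")
registers <- vapply(ids, guess_register, character(1))
registers
```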


Load and store register trial information

Description

Retrieves information on clinical trials from registers and stores it in a collection in a database. Main function of ctrdata for accessing registers. A collection can store trial information from different queries or different registers. Query details are stored in the collection and can be accessed using dbQueryHistory. A previous query can be re-run, which replaces or adds trial records while keeping any user annotations of trial records.

Usage

ctrLoadQueryIntoDb(
  queryterm = NULL,
  register = "",
  querytoupdate = NULL,
  forcetoupdate = FALSE,
  euctrresults = FALSE,
  euctrresultshistory = FALSE,
  euctrprotocolsall = TRUE,
  ctgov2history = FALSE,
  ctishistory = FALSE,
  documents.path = NULL,
  documents.regexp = "prot|sample|statist|sap_|p1ar|p2ars|icf|ctalett|lay|^[0-9]+ ",
  annotation.text = "",
  annotation.mode = "append",
  only.count = FALSE,
  con = NULL,
  verbose = FALSE
)

Arguments

queryterm

Either a string with the full URL of a search query in a register, or the data frame returned by ctrGetQueryUrl or dbQueryHistory, or an '_id' in the format of one of the trial registers, or, together with register, a string with query elements of a search URL. The query details are recorded in the collection for later use, e.g. to update records. For "CTIS", queryterm can be an empty string to obtain all trial records. For automatically copying the user's query of a register in a web browser to the clipboard, see here.

register

String with abbreviation of register to query, either "EUCTR", "CTGOV2", "ISRCTN" or "CTIS". Not needed if queryterm has a full URL to query results, or has a single trial identifier, or comes from ctrGetQueryUrl or dbQueryHistory.

querytoupdate

Either the word "last", or the row number of a query in the data frame returned by dbQueryHistory that should be run to retrieve any trial records that are new or were updated since this query was last run. This parameter takes precedence over queryterm. For "EUCTR" and "CTIS", updates are available only for the last seven days; the query is run again if more time has passed since it was last run.

forcetoupdate

If TRUE, run again the query given in querytoupdate, irrespective of when it was run last. Default is FALSE.

euctrresults

If TRUE, also load available results when retrieving and loading trials from EUCTR. This slows down this function. (For "CTGOV2" and "CTIS", available results are always retrieved and loaded into the collection.)

euctrresultshistory

If TRUE, load results and also the available history of results publication in "EUCTR". This is somewhat time-consuming. Default is FALSE.

euctrprotocolsall

If TRUE, load all available records of protocol-related data (that is, versions from all EU Member States and any third country where the trial is conducted); if FALSE, only a single record per trial is loaded, to accelerate loading. Default is TRUE, but only for backwards consistency; for new collections, FALSE is the recommended setting, unless there are questions about differences between Member States' protocol versions of a trial such as dates or outcomes of an authorisation decision or an ethics opinion, global status and end.

ctgov2history

For trials from CTGOV2, retrieve historic versions of the record. Default is FALSE, because this is a time-consuming operation. Use n for n versions from all versions (recommended), 1 for the first (original) version, -1 for the last-but-one version, "n:m" for the nth to the mth versions, or TRUE for all versions of the trial record to be retrieved. Note that for register CTIS, historic versions were available in the 'applications' field only before the register's relaunch on 2024-06-17.

ctishistory

If TRUE, and only when using querytoupdate, move the current CTIS record into an array history within the record, which holds one or more historic versions, before updating the rest of the record from CTIS. Default is FALSE, because this is a time-consuming operation. See "Historic versions" in vignette("ctrdata_summarise").

documents.path

If this is a relative or absolute path to a directory that exists or can be created, save any documents into it that are directly available from the register ("EUCTR", "CTGOV2", "ISRCTN", "CTIS") such as PDFs on results, analysis plans, spreadsheets, patient information sheets, assessments or product information. Default is NULL, which disables saving documents. For "EUCTR", sets euctrresults = TRUE since documents are available only with results.

documents.regexp

Regular expression, case insensitive, to select documents by filename, if saving documents is requested (see documents.path). If set to NULL, empty placeholder files are saved for every document that could be saved, which is useful to get an overview on the number and types of documents available for download. Default is "prot|sample|statist|sap_|p1ar|p2ars|icf|ctalett|lay|^[0-9]+ ". Used with "CTGOV2", "ISRCTN" and "CTIS" (for "EUCTR", all documents are downloaded since they are few and have non-canonical filenames).

annotation.text

Text to be included in the field "annotation" in the records retrieved with the query that is to be loaded into the collection. The contents of the field "annotation" for a trial record are preserved, e.g. when running this function again and loading a record of a trial that already has an annotation, see parameter annotation.mode.

annotation.mode

One of "append" (default), "prepend" or "replace" for new annotation.text with respect to any existing annotation for the records retrieved with the query that is to be loaded into the collection.

only.count

Set to TRUE to return only the number of trial records found in the register for the query. Does not load trial information into the database. Default is FALSE.

con

A database connection object, created with nodbi. See section '1 - Database connection' in ctrdata.

verbose

If TRUE, prints additional information (default FALSE).

Value

A list with elements 'n' (number of trial records newly imported or updated), 'success' (a vector of _id's of successfully loaded records), 'failed' (a vector of identifiers of records that failed to load) and 'queryterm' (the query term used). The returned list has several attributes (including database and collection name, as well as the query history of this database collection) to facilitate documentation.
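
The three values of annotation.mode described in the arguments can be pictured with a small base R sketch. The helper combine_annotation is hypothetical and not a ctrdata function; in particular, the separator between existing and new text is an assumption made only for this illustration.

```r
# Hypothetical helper, not a ctrdata function; 'sep' is an assumption
combine_annotation <- function(existing, new,
                               mode = c("append", "prepend", "replace"),
                               sep = " ") {
  mode <- match.arg(mode)
  # Nothing to combine with if there is no existing annotation
  if (!nzchar(existing)) return(new)
  switch(mode,
    append  = paste(existing, new, sep = sep),
    prepend = paste(new, existing, sep = sep),
    replace = new
  )
}

appended  <- combine_annotation("reviewed 2024", "rerun 2025", mode = "append")
prepended <- combine_annotation("reviewed 2024", "rerun 2025", mode = "prepend")
replaced  <- combine_annotation("reviewed 2024", "rerun 2025", mode = "replace")
c(appended, prepended, replaced)
```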

Examples

## Not run: 

dbc <- nodbi::src_sqlite(collection = "my_collection")

# Retrieve protocol- and results-related information
# on two specific trials identified by their EU number
ctrLoadQueryIntoDb(
  queryterm = "2005-001267-63+OR+2008-003606-33",
  register = "EUCTR",
  euctrresults = TRUE,
  con = dbc
)

# Count ongoing interventional cancer trials involving children
# Note this query is a classical CTGOV query and is translated
# to a corresponding query for the current CTGOV2 web interface
ctrLoadQueryIntoDb(
  queryterm = "cond=cancer&recr=Open&type=Intr&age=0",
  register = "CTGOV",
  only.count = TRUE,
  con = dbc
)

# Retrieve all information on more than 40 trials
# that are labelled as phase 3 and that mention
# either neuroblastoma or lymphoma from ISRCTN,
# into the same collection as used before
ctrLoadQueryIntoDb(
  queryterm = paste0(
    "https://www.isrctn.com/search?",
    "q=neuroblastoma+OR+lymphoma&filters=phase%3APhase+III"),
  con = dbc
)

# Retrieve information on trials in CTIS mentioning neonates
ctrLoadQueryIntoDb(
  queryterm = paste0("https://euclinicaltrials.eu/ctis-public/",
  "search#searchCriteria={%22containAll%22:%22%22,",
  "%22containAny%22:%22neonates%22,%22containNot%22:%22%22}"),
  con = dbc
)

## End(Not run)
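
To preview which documents the default documents.regexp would select, the expression can be tried on file names with base R. The file names below are hypothetical and serve only as an illustration; matching is case insensitive, as described for the parameter.

```r
# Default value of documents.regexp, as shown under Usage
documents_regexp <-
  "prot|sample|statist|sap_|p1ar|p2ars|icf|ctalett|lay|^[0-9]+ "

# Hypothetical file names, for illustration only
file_names <- c(
  "Protocol_v2.pdf",
  "SAP_final.pdf",
  "ICF_adults.pdf",
  "12 Results summary.pdf",
  "investigator_brochure.pdf"
)

# TRUE for file names that the expression selects
selected <- grepl(documents_regexp, file_names, ignore.case = TRUE)
setNames(selected, file_names)
```

In this sketch, only the last file name is not selected, because it contains none of the alternatives in the expression.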


Open register to show query results or search page

Description

Open advanced search pages of register(s), or execute search in browser

Usage

ctrOpenSearchPagesInBrowser(url = "", register = "", copyright = FALSE)

Arguments

url

URL of search results page to show in the browser. To open the browser with a previous search, the output of ctrGetQueryUrl or dbQueryHistory can be used. Can be left as empty string (default) to open the advanced search page of register.

register

Register(s) to open, "EUCTR", "CTGOV2", "ISRCTN" or "CTIS". Default is empty string, and this opens the advanced search page of the registers (including the expert search page in the case of CTGOV).

copyright

(Optional) If set to TRUE, opens only the copyright pages of all registers.

Value

(String) Full URL corresponding to the shortened url in conjunction with register if any, or invisibly TRUE if no url is specified.

Examples


# Open all and check copyrights before using registers
ctrOpenSearchPagesInBrowser(copyright = TRUE)

# Open specific register advanced search page
ctrOpenSearchPagesInBrowser(register = "CTGOV2")
ctrOpenSearchPagesInBrowser(register = "CTIS")
ctrOpenSearchPagesInBrowser(register = "EUCTR")
ctrOpenSearchPagesInBrowser(register = "ISRCTN")

# Open all queries that were loaded into demo collection
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dbh <- dbQueryHistory(con = dbc)

for (r in seq_len(nrow(dbh))) {
    ctrOpenSearchPagesInBrowser(dbh[r, ])
}


Show full structure and all data of a trial

Description

If used interactively, the function shows a widget of all data in the trial as a tree of field names and values. The widget opens in the default browser. Field names and values can be searched and selected. Selected fields can be copied to the clipboard for use with function dbGetFieldsIntoDf. The trial is retrieved with ctrLoadQueryIntoDb if no database con is provided or if the trial is not in database con. For use in a Shiny app, see output and render functions in source code here.

Usage

ctrShowOneTrial(identifier = NULL, con = NULL)

Arguments

identifier

A trial identifier string

con

A database connection object, created with nodbi. See section '1 - Database connection' in ctrdata.

Details

This is the widget for CTIS trial 2022-501142-30-00:

[Figure: ctrdata_ctrShowOneTrial.jpg]

Value

Invisibly, the trial data for constructing an HTML widget.

Examples


dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

# get sample of identifiers of trials in database
sample(dbFindIdsUniqueTrials(con = dbc), 5L)

# all such identifiers work
id <- "2014-003556-31"
id <- "2014-003556-31-SE"
id <- "76463425"
id <- "ISRCTN76463425"
id <- "NCT03431558"
id <- "2022-501142-30-00"

# note these ids also work with
# ctrGetQueryUrl(url = id) and
# ctrLoadQueryIntoDb(queryterm = id, ...)

# show widget for user to explore and search content as well as to
# select fields of interest and to click on "Copy names of selected
# fields to clipboard..." to use them with dbGetFieldsIntoDf()
ctrShowOneTrial(identifier = id, con = dbc)


Getting started, database connection, function overview

Description

ctrdata is a package for aggregating and analysing data on clinical studies, and for obtaining documents, from public trial registers.

1 - Define a database connection

Package ctrdata retrieves trial data and stores it in a database collection. The connection is specified using nodbi, which allows using different database backends in an identical way. A database connection object is specified once and then can be used as parameter con in subsequent calls of ctrdata functions. Specifying collection = "<my trial data collection's name>" indicates the table in the database that package ctrdata should use.

Database     Connection object
SQLite       dbc <- nodbi::src_sqlite(dbname = "my_db", collection = "my_coll")
DuckDB       dbc <- nodbi::src_duckdb(dbname = "my_db", collection = "my_coll")
MongoDB      dbc <- nodbi::src_mongo(db = "my_db", collection = "my_coll")
PostgreSQL   dbc <- nodbi::src_postgres(dbname = "my_db"); dbc[["collection"]] <- "my_coll"

2 - Load information from clinical trial registers

ctrGenerateQueries (generate from simple user input specific queries for registers EUCTR, CTIS, CTGOV2 and ISRCTN), ctrOpenSearchPagesInBrowser (open queries in browser), see script (automatically copy user search in any register to clipboard), see ctrdata-registers for details on registers and how to search, ctrLoadQueryIntoDb (load trial records found with query into database collection).

3 - Use database with downloaded trial information

ctrShowOneTrial (show widget to explore structure, fields and data of a trial), dbFindFields (find names of fields of interest in trial records in a collection), dbGetFieldsIntoDf (create a data frame with fields of interest and calculated trial concepts from collection), ctrdata-trial-concepts (calculate pre-defined trial concepts for every register), dbFindIdsUniqueTrials (get de-duplicated identifiers of clinical trials' records to subset a data frame).

4 - Operate on a trial data frame from dbGetFieldsIntoDf

dfTrials2Long (convert fields with nested elements into long format), dfName2Value (get values for variable(s) of interest), ctrdata-trial-concepts (calculate pre-defined trial concepts for every register).

Author(s)

Ralf Herold ralf.herold@mailbox.org

See Also

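Useful links:

https://cran.r-project.org/package=ctrdata
https://rfhb.github.io/ctrdata/
Report bugs at https://github.com/rfhb/ctrdata/issues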


Information on clinical trial registers

Description

Information on the four clinical trial registers from which package ctrdata can retrieve, aggregate and analyse protocol- and result-related information as well as documents, last updated 2025-05-31.

1 - Overview

2 - Notable changes

CTGOV "classic" was retired on 2024-06-25; ctrdata subsequently translates CTGOV queries to CTGOV2 queries. The new website ("CTGOV2") can be used with ctrdata since 2023-08-27. Database collections created with CTGOV queries can still be used since functions in ctrdata continue to support them. CTIS was relaunched on 2024-06-17, changing the data structure and search syntax, to which ctrdata was updated. CTIS can be used with ctrdata since 2023-03-25. EUCTR removed search parameter status= as of February 2025. More information on changes: here.

3 - References

Material                             EUCTR             CTGOV2            ISRCTN   CTIS
About                                link              link              link     link
Terms & conditions, disclaimer       link              link              link     link
How to search                        link              link              link     link
Search interface                     link              link              link     link
Expert / advanced search             link              link              link     link
Glossary / related information       link              link              link     link
FAQ / caveats / examples             link              link, link, link  link     link
Data dictionaries / structure        link, link, link  link, link, link  link     link (see XLSX files)
Example* (see below)                 link              link              link     link

Some registers expand entered search terms using dictionaries (example).

4 - Example and ctrdata motivation

See vignette("ctrdata_summarise") for several other examples.

*This example is an expert search for interventional trials primarily with neonates, investigating treatments for infectious conditions. It shows that searches in the web interface of most registers are not sufficient to identify the trials of interest:

To address this issue, trials can be retrieved with ctrLoadQueryIntoDb into a database collection and in a second step trials of interest can be selected based on values of relevant fields, for example:
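
A minimal base R sketch of this second step is shown here, using a mock data frame in place of the output of dbGetFieldsIntoDf. The column names max_age_days and condition and all values are hypothetical; real field names differ between registers and can be found with dbFindFields. The age limit of 28 days follows the neonatal age range used in the ctrGetQueryUrl example.

```r
# Mock data standing in for trial records retrieved from a collection
trials <- data.frame(
  id = c("trial-1", "trial-2", "trial-3", "trial-4"),
  max_age_days = c(28, 6570, 14, 28),
  condition = c("Neonatal sepsis", "Asthma", "Candida infection", "Jaundice"),
  stringsAsFactors = FALSE
)

# Neonates: upper age limit of at most 28 days
is_neonatal <- trials$max_age_days <= 28

# Infectious conditions: simple keyword match on the condition text
is_infectious <- grepl("sepsis|infect", trials$condition, ignore.case = TRUE)

# Trials of interest fulfil both criteria
selected_ids <- trials$id[is_neonatal & is_infectious]
selected_ids
```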

ctrdata supports users with pre-defined ctrdata-trial-concepts and these cover the example above, and with functions dbFindFields and ctrShowOneTrial for finding fields of interest and reviewing data structure, respectively.

Author(s)

Ralf Herold ralf.herold@mailbox.org


Trial concepts implemented across registers

Description

ctrdata includes (since version 1.21.0) functions that implement selected trial concepts. Concepts of clinical trials, such as their start or status of recruitment, require analysing several fields against various pre-defined values. The structure and value sets of fields differ between all ctrdata-registers. In this situation, the implemented trial concepts simplify and accelerate a user's analysis workflow and also increase analysis consistency.

Details

The implementation of trial concepts in ctrdata has not been validated with any formal approach, but has been checked for plausibility and against expectations. The implementation is based on current understanding, on public data models and on scientific papers, as relevant. As with other R functions, call help for a trial concept, e.g. help("f.startDate"), or print its implementation code by entering the name of the function as a command, e.g. f.startDate. Please raise an issue here to ask about or improve a trial concept.

The following trial concepts can be used by referencing their name when calling dbGetFieldsIntoDf (parameter calculate). Concepts will continue to be refined and added; last updated 2025-07-18.

Author(s)

Ralf Herold ralf.herold@mailbox.org


Find names of fields in the database collection

Description

Given part of the name of a field of interest to the user, this function returns the full field names used in records that were previously loaded into a collection (using ctrLoadQueryIntoDb). Only names of fields that have a value in the collection can be returned. Set sample = FALSE to force screening all records in the collection for field names, see below. See ctrShowOneTrial to interactively find fields.

Usage

dbFindFields(namepart = ".*", con, sample = TRUE, verbose = FALSE)

Arguments

namepart

A character string (can be a regular expression, including Perl-style) to be searched among all field names (keys) in the collection, case-insensitive. The default '".*"' lists all fields.

con

A database connection object, created with nodbi. See section '1 - Database connection' in ctrdata.

sample

If TRUE (default), uses a sample of only 5 trial records per register to identify fields, to rapidly return a possibly incomplete set of field names. If FALSE, uses all trial records in the collection, which takes more time with more trials but ensures that the names of all fields in the collection are returned.

verbose

If TRUE, prints additional information (default FALSE).

Details

The full names of child fields are returned in dot notation (e.g., clinical_results.outcome_list.outcome.measure.class_list.class.title). In addition, names of parent fields (e.g., clinical_results) are returned. Data in parent fields is typically complex (nested), see dfTrials2Long for easily handling it. For field definitions of the registers, see "Definition" in ctrdata-registers. Note: When dbFindFields is first called after ctrLoadQueryIntoDb, it will take a moment.
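
The relation between nested data, child field names in dot notation and parent field names can be sketched in base R with a mock record. The record below is hypothetical and its field path is shortened from the example above; the code illustrates the naming scheme only and is not the implementation of dbFindFields.

```r
# Mock nested record, for illustration only
record <- list(
  start_date = "2015-01-01",
  clinical_results = list(
    outcome_list = list(
      outcome = list(measure = list(title = "Overall survival"))
    )
  )
)

# unlist() joins the names of nested elements with dots
child_fields <- names(unlist(record))

# Derive parent fields by cumulatively joining the parts of each path
parents_of <- function(path) {
  parts <- strsplit(path, ".", fixed = TRUE)[[1]]
  vapply(seq_along(parts),
         function(i) paste(parts[seq_len(i)], collapse = "."),
         character(1))
}
all_fields <- unique(unlist(lapply(child_fields, parents_of)))
all_fields

# Case-insensitive search with a regular expression, as with namepart
found <- grep("DATE", all_fields, ignore.case = TRUE, value = TRUE, perl = TRUE)
found
```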

Value

Vector of strings with full names of field(s) found, ordered by register and alphabet, see examples. Names of the vector are the names of the register holding the respective fields. The field names can be fed into dbGetFieldsIntoDf to extract the data for the field(s) from the collection into a data frame.

Examples


dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dbFindFields(namepart = "date", con = dbc)[1:5]

# view all 1880+ fields from all registers:

allFields <- dbFindFields(con = dbc, sample = FALSE)

if (interactive()) View(data.frame(
  register = names(allFields),
  field = allFields))


Get identifiers of deduplicated trial records

Description

Records for a clinical trial can be loaded from more than one register into a collection. This function returns deduplicated identifiers for all trials in the collection, respecting the register(s) preferred by the user. All registers are recording identifiers also from other registers, which are used by this function to provide a vector of identifiers of deduplicated trials.

Usage

dbFindIdsUniqueTrials(
  preferregister = c("CTGOV2", "EUCTR", "CTGOV", "ISRCTN", "CTIS"),
  prefermemberstate = "BE",
  include3rdcountrytrials = TRUE,
  con,
  verbose = FALSE
)

Arguments

preferregister

A vector of the order of preference for registers from which to generate unique _id's, default c("CTGOV2", "EUCTR", "CTGOV", "ISRCTN", "CTIS")

prefermemberstate

Code of single EU Member State for which records should be returned. If not available, a record for BE or, lacking this, any random Member State's record for the trial will be returned. For a list of codes of EU Member States, please see vector countriesEUCTR. Specifying "3RD" will return the Third Country record of trials, where available.

include3rdcountrytrials

A logical value if trials should be retained that are conducted exclusively in third countries, that is, outside the European Union. Ignored if prefermemberstate is set to "3RD".

con

A database connection object, created with nodbi. See section '1 - Database connection' in ctrdata.

verbose

If TRUE, prints out the fields of registers used to find corresponding trial records

Details

Note that the content of records may differ between registers (and, for "EUCTR", between records for different Member States). Such differences are not considered by this function.

Note that the trial concept f.isUniqueTrial (which uses this function and adds column ".isUniqueTrial") can be calculated at the time of creating a data frame with dbGetFieldsIntoDf, which may often be the preferred approach.

Value

A named vector with strings of keys (field "_id") of records in the collection that represent unique trials, where names correspond to the register of the record.
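
How a preference order of registers leads to one record per trial can be sketched in base R with mock data. This is a simplification under stated assumptions: the column same_trial is a hypothetical key that already links records of the same trial, whereas the function itself derives such links from the identifiers of other registers that are recorded in each register. All identifiers below are made up.

```r
# Mock records; 'same_trial' is a hypothetical key linking records of a trial
records <- data.frame(
  id = c("ctgov2-A", "euctr-A", "euctr-B", "isrctn-B", "ctis-C"),
  register = c("CTGOV2", "EUCTR", "EUCTR", "ISRCTN", "CTIS"),
  same_trial = c("A", "A", "B", "B", "C"),
  stringsAsFactors = FALSE
)

# Default order of preference, as shown under Usage
preferregister <- c("CTGOV2", "EUCTR", "CTGOV", "ISRCTN", "CTIS")

# Sort records by register preference, then keep the first record per trial
ranked <- records[order(match(records$register, preferregister)), ]
unique_records <- ranked[!duplicated(ranked$same_trial), ]

# Named vector: values are record identifiers, names are registers
unique_ids <- setNames(unique_records$id, unique_records$register)
unique_ids
```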

Examples


dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dbFindIdsUniqueTrials(con = dbc)[1:10]

# alternative as of ctrdata version 1.21.0,
# using defaults of dbFindIdsUniqueTrials()
df <- dbGetFieldsIntoDf(
  fields = "keyword",
  calculate = "f.isUniqueTrial",
  con = dbc)

# using base R
df[df[[".isUniqueTrial"]], ]

## Not run: 
library(dplyr)
df %>% filter(.isUniqueTrial)

## End(Not run)


Create data frame of specified fields or trial concepts from database collection

Description

Fields in the collection are retrieved from all records into a data frame (or tibble). The function uses the field names to appropriately type the values that it returns, harmonising original values (e.g., "Yes" to 'TRUE', "false" to 'FALSE', "Information not present in EudraCT" to 'NA', date strings to dates or time differences, number strings to numbers). Trial concepts are calculated for all records and included in the return value.
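The kind of harmonisation described above can be pictured with a few lines of base R. This is only a sketch under assumptions: harmoniseLogical is a hypothetical helper and not part of ctrdata, the mapping of "no" is assumed by analogy, and the function's actual rules depend on the field names.

```r
# Hypothetical helper, for illustration only; not part of ctrdata
harmoniseLogical <- function(x) {
  out <- rep(NA, length(x))
  # "yes" / "true" are taken from the description above;
  # treating "no" like "false" is an assumption
  out[tolower(x) %in% c("yes", "true")] <- TRUE
  out[tolower(x) %in% c("no", "false")] <- FALSE
  out
}

raw <- c("Yes", "false", "Information not present in EudraCT")
harmonised <- harmoniseLogical(raw)
harmonised

# Date strings and number strings can be typed with base R
startDate <- as.Date("2020-03-15")
count <- as.numeric("42")
```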

Usage

dbGetFieldsIntoDf(fields = "", calculate = "", con, verbose = FALSE)

Arguments

fields

Vector of one or more strings, with names of sought fields. See function dbFindFields for how to find names of fields and ctrShowOneTrial for interactively selecting field names. Dot path notation ("field.subfield") without indices is supported. If compatibility with nodbi::src_postgres is needed, specify fewer than 50 fields, or use parent fields such as '"a.b"' instead of 'c("a.b.c.d", "a.b.c.e")' and then access sought fields with dfTrials2Long followed by dfName2Value or with other R functions.

calculate

Vector of one or more strings, which are names of functions to calculate certain trial concepts from fields in the collection, across different registers. See ctrdata-trial-concepts for available functions.

con

A database connection object, created with nodbi. See section '1 - Database connection' in ctrdata.

verbose

If TRUE, prints additional information (default FALSE).

Details

Within a given trial record, a field can be hierarchical and structured, that is, nested. The function simplifies the structure of nested data and may concatenate multiple strings in a field using " / " (see example). It may also widen the returned data frame with additional columns that were recursively expanded from simply nested data (e.g., "externalRefs" to columns "externalRefs.doi", "externalRefs.eudraCTNumber" etc.). For alternative ways of handling complex nested data, see dfTrials2Long and dfName2Value for extracting the sought variable(s).
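As an illustration of concatenating multiple strings with " / ", here is a base R sketch on a toy list column. The data are made up, and returning NA for records without a value is an assumption made for this sketch, not documented behaviour.

```r
# Toy list column: each element holds the values of one record
keywordList <- list(
  c("oncology", "paediatric"),
  "vaccine",
  character(0)
)

# Collapse multiple strings per record using " / ";
# in this sketch, records without a value become NA
keywordFlat <- vapply(
  keywordList,
  function(x) if (length(x)) paste(x, collapse = " / ") else NA_character_,
  character(1)
)
keywordFlat
```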

Value

A data frame (or tibble, if tibble is loaded) with columns corresponding to the sought fields. A column with the record '_id' will always be included. The maximum number of rows of the returned data frame is equal to the number of trial records in the database collection; there are fewer rows if, for some records, none of the fields has a value.

Examples


dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

# get fields that are nested within another field
# and can have multiple values within the nested field
dbGetFieldsIntoDf(
  fields = "b1_sponsor.b31_and_b32_status_of_the_sponsor",
  con = dbc)

# fields that are lists of string values are
# returned by concatenating values with " / "
dbGetFieldsIntoDf(
  fields = "keyword",
  con = dbc)

# calculate new field(s) from data across trials
df <- dbGetFieldsIntoDf(
  fields = "keyword",
  calculate = c("f.statusRecruitment", "f.isUniqueTrial", "f.startDate"),
  con = dbc)

table(df$.statusRecruitment, exclude = NULL)

## Not run: 
library(dplyr)
library(ggplot2)

df %>%
  filter(.isUniqueTrial) %>%
  count(.statusRecruitment)

df %>%
  filter(.isUniqueTrial) %>%
  ggplot() +
  stat_ecdf(aes(
    x = .startDate,
    colour = .statusRecruitment))

## End(Not run)


Show history of queries loaded into a database collection

Description

Show history of queries loaded into a database collection

Usage

dbQueryHistory(con, verbose = FALSE)

Arguments

con

A database connection object, created with nodbi. See section '1 - Database connection' in ctrdata.

verbose

If TRUE, prints additional information (default FALSE).

Value

A data frame (or tibble, if tibble is loaded) with columns: 'query-timestamp', 'query-register', 'query-records' (note: this is the number of records loaded when last executing ctrLoadQueryIntoDb, not the total record number) and 'query-term', with one row for each time that ctrLoadQueryIntoDb loaded trial records into this collection.
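The column names of this data frame contain hyphens and are therefore not syntactic R names. The sketch below uses a toy data frame with the documented column names (all values are made up) to show how such columns can be accessed.

```r
# Toy data frame with the documented column names; values are made up
history <- data.frame(
  "query-timestamp" = c("2025-01-01 10:00:00", "2025-02-01 10:00:00"),
  "query-register"  = c("EUCTR", "CTGOV2"),
  "query-records"   = c(120L, 45L),
  "query-term"      = c("term-one", "term-two"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hyphenated names require [[ ]] or backticks
history[["query-records"]]
history$`query-register`

# Row of the most recent loading
stamps <- as.POSIXct(history[["query-timestamp"]], tz = "UTC")
latest <- history[which.max(as.numeric(stamps)), ]
latest[["query-register"]]
```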

Examples


dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dbQueryHistory(con = dbc)


Merge variables, keeping type where possible, optionally relevel factors

Description

Merge variables in a data frame such as returned by dbGetFieldsIntoDf into a new variable, and optionally also map its values to new levels. See ctrdata-trial-concepts for pre-defined cross-register concepts that are already implemented based on merging fields from different registers and calculating a new field.

Usage

dfMergeVariablesRelevel(df = NULL, colnames = "", levelslist = NULL)

Arguments

df

A data.frame with the variables (columns) to be merged into one vector.

colnames

A vector of names of columns in 'df' that hold the variables to be merged, or a selection of columns as per select.

levelslist

A named list with one element for each new value, where the name is the new value and the element is a vector of the old values to be replaced by it (optional).

Value

A vector, with the type of the columns to be merged
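The idea of merging columns and then mapping old values to new levels can be sketched in base R as follows. This is not the implementation of dfMergeVariablesRelevel: the toy columns, the rule "take the first non-missing value per row" and the level names are all assumptions for illustration.

```r
# Toy columns from different registers; values are illustrative only
df <- data.frame(
  colA = c("Yes", NA, NA),
  colB = c(NA, "false", NA),
  colC = c(NA, NA, "No"),
  stringsAsFactors = FALSE
)

# Merge: take the first non-missing value per row (assumed rule)
merged <- apply(df, 1, function(r) {
  r <- r[!is.na(r)]
  if (length(r)) r[[1]] else NA_character_
})

# Named list: each name is a new level,
# each element holds the old values that it replaces
levelslist <- list(
  "Healthy volunteers"    = c("Yes", "true"),
  "No healthy volunteers" = c("No", "false")
)

# Relevel by looking up each old value in the list
lookup <- setNames(
  rep(names(levelslist), lengths(levelslist)),
  unlist(levelslist, use.names = FALSE)
)
releveled <- factor(unname(lookup[merged]), levels = names(levelslist))
releveled
```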

Examples


dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

df <- dbGetFieldsIntoDf(
  fields = c(
    "protocolSection.eligibilityModule.healthyVolunteers",
    "f31_healthy_volunteers",
    "eligibility.healthy_volunteers"
  ),
  con = dbc
)

table(
  dfMergeVariablesRelevel(
    df = df,
    colnames = 'matches("healthy")'
))


Get value for variable of interest

Description

Get information for variable of interest (e.g., clinical endpoints) from long data frame of protocol- or result-related trial information as returned by dfTrials2Long. Parameters 'valuename', 'wherename' and 'wherevalue' are matched using Perl regular expressions and ignoring case.

Usage

dfName2Value(df, valuename = "", wherename = "", wherevalue = "")

Arguments

df

A data frame (or tibble) with four columns ('_id', 'identifier', 'name', 'value') as returned by dfTrials2Long

valuename

A character string for the name of the field that holds the value of the variable of interest (e.g., a summary measure such as "endPoints.*tendencyValue.value")

wherename

(optional) A character string to identify the variable of interest among those that repeatedly occur in a trial record (e.g., "endPoints.endPoint.title")

wherevalue

(optional) A character string with the value of the variable identified by 'wherename' (e.g., "response")

Value

A data frame (or tibble, if tibble is loaded) that includes the values of interest, with columns '_id', 'identifier', 'name', 'value' and 'where' (with the contents of 'wherevalue' found at 'wherename'). Contents of 'value' are strings unless all its elements are numbers. The 'identifier' is generated by function dfTrials2Long to identify matching elements, e.g., endpoint descriptions and measurements.
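The matching logic (Perl regular expressions, ignoring case, with 'identifier' linking a description to its measurements) can be approximated in base R. The sketch below runs on a made-up long data frame; the simple identifiers "1" and "2" are placeholders, and the code is not the function's actual implementation.

```r
# Toy long data frame with the four documented columns; content is made up
dflong <- data.frame(
  "_id" = c("T1", "T1", "T1", "T1"),
  identifier = c("1", "1", "2", "2"),
  name = c(
    "endPoints.endPoint.title",
    "endPoints.endPoint.tendencyValue.value",
    "endPoints.endPoint.title",
    "endPoints.endPoint.tendencyValue.value"
  ),
  value = c("Overall Response Rate", "0.45", "Adverse events", "12"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Rows where 'wherename' matches the name and 'wherevalue' matches the value
isWhere <-
  grepl("endPoints.endPoint.title", dflong$name, perl = TRUE, ignore.case = TRUE) &
  grepl("response", dflong$value, perl = TRUE, ignore.case = TRUE)

# Identifiers of the matching elements
hits <- dflong$identifier[isWhere]

# Rows where 'valuename' matches and the identifier is among the hits
isValue <-
  grepl("endPoints.*tendencyValue.value", dflong$name, perl = TRUE, ignore.case = TRUE) &
  dflong$identifier %in% hits

result <- dflong[isValue, ]
result$value
```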

Examples


dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dfwide <- dbGetFieldsIntoDf(
    fields = c(
        ## ctgov - typical results fields
        # "clinical_results.baseline.analyzed_list.analyzed.count_list.count",
        # "clinical_results.baseline.group_list.group",
        # "clinical_results.baseline.analyzed_list.analyzed.units",
        "clinical_results.outcome_list.outcome",
        "study_design_info.allocation",
        ## euctr - typical results fields
        # "trialInformation.fullTitle",
        # "baselineCharacteristics.baselineReportingGroups.baselineReportingGroup",
        # "trialChanges.hasGlobalInterruptions",
        # "subjectAnalysisSets",
        # "adverseEvents.seriousAdverseEvents.seriousAdverseEvent",
        "endPoints.endPoint",
        "subjectDisposition.recruitmentDetails"
    ), con = dbc
)

dflong <- dfTrials2Long(df = dfwide)

## get values for the endpoint 'response'
dfName2Value(
    df = dflong,
    valuename = paste0(
        "clinical_results.*measurement.value|",
        "clinical_results.*outcome.measure.units|",
        "endPoints.endPoint.*tendencyValue.value|",
        "endPoints.endPoint.unit"
    ),
    wherename = paste0(
        "clinical_results.*outcome.measure.title|",
        "endPoints.endPoint.title"
    ),
    wherevalue = "response"
)


Convert data frame with trial records into long format

Description

The function works with protocol- and results-related information. It converts lists and other values that are in a data frame returned by dbGetFieldsIntoDf into individual rows of a long data frame. From the resulting long data frame, values of interest can be selected using dfName2Value. The function is particularly useful for fields with complex content, such as node field "clinical_results" from CTGOV, which dbGetFieldsIntoDf returns as a multiply nested list and for which this function then converts every observation of every (leaf) field into a row of its own.

Usage

dfTrials2Long(df)

Arguments

df

Data frame (or tibble) with columns including the trial identifier (_id) and one or more variables as obtained from dbGetFieldsIntoDf

Value

A data frame (or tibble, if tibble is loaded) with the four columns: '_id', 'identifier', 'name', 'value'
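To give an intuition for the long format, the base R sketch below flattens a toy nested record so that every leaf field becomes one row with a dot-path name. The record and its field names are placeholders, and the 'identifier' column that dfTrials2Long generates is left out of this simplified illustration.

```r
# Toy nested record; field names are placeholders
record <- list(
  outcome = list(
    title = "Response",
    measure = list(units = "participants", value = "12")
  )
)

# unlist() flattens the nesting and builds dot-path names for leaf fields
flat <- unlist(record)

long <- data.frame(
  "_id" = "T1",
  name = names(flat),
  value = unname(flat),
  check.names = FALSE,
  stringsAsFactors = FALSE
)
long
```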

Examples


dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials",
  flags = RSQLite::SQLITE_RO)

dfwide <- dbGetFieldsIntoDf(
  fields = "clinical_results.participant_flow",
  con = dbc)

dfTrials2Long(df = dfwide)


Calculate type of assignment to intervention in a study

Description

Calculate type of assignment to intervention in a study

Usage

f.assignmentType(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.assignmentType', which is a factor with levels 'R' (randomised assignment) and 'NR' (all other types of assignment).

Examples

# fields needed
f.assignmentType()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  fields = "ctrname",
  calculate = "f.assignmentType",
  con = dbc)
trialsDf


Calculate type of control data collected in a study

Description

Trial concept calculated: type of internal control. ICH E10 lists as types of control: placebo concurrent control, no-treatment concurrent control, dose-response concurrent control, active (positive) concurrent control, external (including historical) control, multiple control groups. Dose-controlled trials are currently not identified. External (including historical) controls are so far not identified in specific register fields. Cross-over designs, where identifiable, have active controls.

Usage

f.controlType(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.controlType', which is a factor with levels 'none', 'no-treatment', 'placebo', 'active', 'placebo+active' and 'other'.

Examples

# fields needed
f.controlType()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  fields = "ctrname",
  calculate = "f.controlType",
  con = dbc)
trialsDf


Calculate the external links referenced from a study record

Description

Trial concept calculated: Calculates the links e.g. to publications or other external files referenced from a study record. Requires loading results-related information for EUCTR. Note that documents stored in registers can be downloaded directly, see ctrLoadQueryIntoDb.

Usage

f.externalLinks(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and new column '.externalLinks' (character).

Examples

# fields needed
f.externalLinks()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.externalLinks",
  con = dbc)
trialsDf


Calculate if a study's results are available

Description

Trial concept calculated: Calculates if results have been recorded in the register, as structured data, reports or publications, for example. Requires loading results-related information for EUCTR.

Usage

f.hasResults(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and new column '.hasResults' (logical).

Examples

# fields needed
f.hasResults()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.hasResults",
  con = dbc)
trialsDf


Calculate if study is a medicine-interventional study

Description

Trial concept calculated: Calculates if record is a medicine-interventional trial, investigating one or more medicines, whether biological or not. For EUCTR and CTIS, this corresponds to all records as per the definition of the EU Clinical Trial Regulation. For CTGOV and CTGOV2, this is based on drug or biological as type of intervention, and interventional as type of study. For ISRCTN, this is based on drug or biological as type of intervention, and interventional as type of study.

Usage

f.isMedIntervTrial(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.isMedIntervTrial', a logical.

Examples

# fields needed
f.isMedIntervTrial()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.isMedIntervTrial",
  con = dbc)
trialsDf


Calculate if record is unique for a study

Description

Trial concept calculated: Applies function dbFindIdsUniqueTrials() with its defaults.

Usage

f.isUniqueTrial(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.isUniqueTrial', a logical.

Examples

# fields needed
f.isUniqueTrial()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.isUniqueTrial",
  con = dbc)
trialsDf


Calculate if study is likely a platform trial or not

Description

Trial concept calculated: platform trial, research platform. As operational definition, at least one of these criteria is true: a. trial has "platform", "basket", "umbrella", "multi.?arm", "multi.?stage" or "master protocol" in its title or description (for ISRCTN, this is the only criterion; some trials in EUCTR lack data in English), b. trial has more than 2 active arms with different investigational medicines, after excluding comparator, auxiliary and placebo medicines (calculated with f.numTestArmsSubstances; not used for ISRCTN because it cannot be calculated precisely), c. trial has more than 2 periods, after excluding safety run-in, screening, enrolling, extension and follow-up periods (for CTGOV and CTGOV2, this criterion requires results-related data).
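Criterion a. can be illustrated with a regular expression built from the terms listed above. The titles below are made up, matching case-insensitively is an assumption, and the sketch covers only this one criterion, not the full concept.

```r
# Made-up trial titles
titles <- c(
  "A multi-arm, multi-stage trial of treatments",
  "Master Protocol for targeted therapies",
  "A randomised trial of drug A versus placebo"
)

# Terms of criterion a., combined into one alternation
pattern <- "platform|basket|umbrella|multi.?arm|multi.?stage|master protocol"

# Case-insensitive matching is an assumption of this sketch
matchesCriterionA <- grepl(pattern, titles, ignore.case = TRUE)
matchesCriterionA
```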

Usage

f.likelyPlatformTrial(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Details

For EUCTR, requires that results have been included in the collection, using 'ctrLoadQueryIntoDb(queryterm = ..., euctrresults = TRUE, con = ...)'. Requires packages dplyr and stringdist to be installed; stringdist is used for evaluating terms in brackets in the trial title, where trials may be related if the term similarity is 0.77 or higher.

Publication references considered: EU-PEARL WP2 2020 and Williams RJ et al. 2022, doi:10.1136/bmj-2021-067745

Value

data frame with columns '_id' and '.likelyPlatformTrial', a logical, and two complementary columns, each with lists of identifiers: '.likelyRelatedTrials' (based on other identifiers provided in the trial record, including 'associatedClinicalTrials' from CTIS; listing identifiers whether or not the trial with the other identifier is in the database collection) and '.maybeRelatedTrials' (based on similar short terms in the first set of brackets or before a colon in the trial title; only listing identifiers from the database collection).

Examples

# fields needed
f.likelyPlatformTrial()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.likelyPlatformTrial",
  con = dbc)
trialsDf


Calculate number of sites of a study

Description

Trial concept calculated: number of the sites where the trial is conducted. EUCTR lacks information on number of sites outside of the EEA; for each non-EEA country mentioned, at least one site is assumed.

Usage

f.numSites(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.numSites', an integer.

Examples

# fields needed
f.numSites()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.numSites",
  con = dbc)
trialsDf


Calculate number of arms or groups with investigational medicines in a study

Description

Trial concept calculated: number of active arms with different investigational medicines, after excluding non-active comparator, auxiliary and placebo arms / medicines. For ISRCTN, this is imprecise because arms are not identified in a field. Most registers provide no or only limited information on phase 1 trials, so that this number typically cannot be calculated for these trials. Requires package stringdist to be installed; stringdist is used for evaluating names of active substances, which are considered similar when the similarity is 0.8 or higher.

Usage

f.numTestArmsSubstances(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.numTestArmsSubstances', an integer

Examples

# fields needed
f.numTestArmsSubstances()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.numTestArmsSubstances",
  con = dbc)
trialsDf


Calculate details of a primary endpoint of a study

Description

Trial concept calculated: full description of the primary endpoint, concatenating with " == " its title, description, time frame of assessment. The details vary by register. The text description can be used for identifying trials of interest or for analysing trends in primary endpoints, which among the set of all endpoints are most often used for determining the number of participants sought for the study.

Usage

f.primaryEndpointDescription(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.primaryEndpointDescription', which is a list (that is, one or more items in one vector per row; the background is that some trials have several endpoints as primary).
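Since the column is a list and each item joins its parts with " == ", the parts can be separated again with base R. The sketch uses a made-up list column; the order of the parts (title, description, time frame) follows the description above, but the details vary by register.

```r
# Made-up list column: one vector of primary endpoints per trial
endpoints <- list(
  "Title A == Description A == 12 weeks",
  c(
    "Title B == Description B == 6 months",
    "Title C == Description C == 6 months"
  )
)

# Number of primary endpoints per trial
nPrimary <- lengths(endpoints)
nPrimary

# Split the endpoints of the second trial into their parts
parts <- strsplit(endpoints[[2]], " == ", fixed = TRUE)

# First part of each endpoint (here, the title)
firstParts <- vapply(parts, `[`, character(1), 1)
firstParts
```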

Examples

# fields needed
f.primaryEndpointDescription()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.primaryEndpointDescription",
  con = dbc)
trialsDf


Calculate details of a study's primary endpoint analysis and testing

Description

Trial concept calculated: Calculates several results-related elements of the primary analysis of the primary endpoint. Requires loading results-related information. For CTIS and ISRCTN, such information is not available in structured format. Recommended to be combined with .controlType, .sampleSize etc. for analyses.

Usage

f.primaryEndpointResults(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and new columns: '.primaryEndpointFirstPvalue' (discarding any inequality indicator, e.g. <=), '.primaryEndpointFirstPmethod' (normalised string, e.g. chisquared), '.primaryEndpointFirstPsize' (number included in test, across assignment groups).
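What "discarding any inequality indicator" means for a p-value can be shown with a small base R sketch. The input strings are made up and the regular expression is an assumption for illustration, not the expression used by the function.

```r
# Made-up p-value strings as they might appear in a register
rawP <- c("<=0.05", "< 0.001", "0.2", "=0.03")

# Remove leading inequality or equality signs and white space, then convert
pValue <- as.numeric(sub("^[<>=[:space:]]+", "", rawP))
pValue
```

Note that after this step, a value such as 0.05 no longer shows whether it was reported as exact or as an upper bound.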

Examples

# fields needed
f.primaryEndpointResults()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.primaryEndpointResults",
  con = dbc)
trialsDf


Calculate date of results of a study

Description

Trial concept calculated: earliest date of results as recorded in the register. At that date, results may have been incomplete and may have been changed later. For EUCTR, requires that results and preferably also their history of publication have been included in the collection, using ctrLoadQueryIntoDb(queryterm = ..., euctrresultshistory = TRUE, con = ...). Cannot be calculated for ISRCTN, which does not have a corresponding field.

Usage

f.resultsDate(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.resultsDate', a date.

Examples

# fields needed
f.resultsDate()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.resultsDate",
  con = dbc)
trialsDf


Calculate sample size of a study

Description

Trial concept calculated: sample size of the trial, preferring results-related (achieved recruitment) over protocol-related information (planned sample size).
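The stated preference can be pictured as taking the achieved number where it exists and falling back to the planned number otherwise. A minimal base R sketch with made-up vectors (achieved and planned are hypothetical names, not fields of any register):

```r
# Made-up numbers per trial; NA means the information is not available
achieved <- c(100L, NA, 250L)  # results-related: achieved recruitment
planned  <- c(120L, 80L, NA)   # protocol-related: planned sample size

# Prefer the achieved number, fall back to the planned number
sampleSize <- ifelse(is.na(achieved), planned, achieved)
sampleSize
```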

Usage

f.sampleSize(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.sampleSize', an integer.

Examples

# fields needed
f.sampleSize()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.sampleSize",
  con = dbc)
trialsDf


Calculate type of sponsor of a study

Description

Trial concept calculated: type or class of the sponsor(s) of the study. No specific field is available in ISRCTN; thus, sponsor type is set to 'other'. Note: If there are several sponsors, sponsor type is deemed 'mixed' if there are both commercial and non-commercial sponsors.

Usage

f.sponsorType(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.sponsorType', which is a factor with levels 'for profit', 'not for profit', 'mixed' (both not-for-profit and for-profit sponsors) or 'other'.

Examples

# fields needed
f.sponsorType()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.sponsorType",
  con = dbc)
trialsDf


Calculate start date of a study

Description

Trial concept calculated: start of the trial, based on the documented or planned start of recruitment, or on the date of opinion of the competent authority.

Usage

f.startDate(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.startDate', a date.

Examples

# fields needed
f.startDate()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  fields = "ctrname",
  calculate = "f.startDate",
  con = dbc)
trialsDf


Calculate status of recruitment of a study

Description

Trial concept calculated: status of recruitment at the time of loading the trial records. Maps the categories that are in fields which specify the state of recruitment. Simplifies the status into three categories plus 'other'.

Usage

f.statusRecruitment(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.statusRecruitment', which is a factor with levels 'ongoing' (includes active, not yet recruiting; temporarily halted; suspended; authorised, not started and similar), 'completed' (includes ended; ongoing, recruitment ended), 'ended early' (includes prematurely ended, terminated early) and 'other' (includes revoked, withdrawn, planned, stopped).

Examples

# fields needed
f.statusRecruitment()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.statusRecruitment",
  con = dbc)
trialsDf


Calculate objectives of a study

Description

Trial concept calculated: objectives of the trial, by searching for text fragments found in fields describing its purpose, objective, background or hypothesis, after applying .isMedIntervTrial, because the text fragments are tailored to medicinal product interventional trials. This is a simplification, and it is expected that the criteria will be further refined. The text fragments only apply to English.

Usage

f.trialObjectives(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.trialObjectives', which is a string with letters separated by a space, such as E (efficacy, including cure, survival, effectiveness); A (activity, including response, remission, seroconversion); S (safety); PK; PD (including biomarker); D (dose-finding, determining recommended dose); LT (long-term); and FU (follow-up).
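Because the codes are separated by a space, splitting the string is a more reliable way to test for one objective than searching for a substring. A base R sketch with made-up values:

```r
# Made-up values of the kind described above
objectives <- c("E S", "PK PD", "E S LT", NA)

# Split into individual codes and test for exact membership
codes <- strsplit(objectives, " ", fixed = TRUE)
hasSafety <- vapply(codes, function(x) "S" %in% x, logical(1))
hasDose   <- vapply(codes, function(x) "D" %in% x, logical(1))
hasSafety
hasDose

# A substring search would wrongly find "D" within "PD"
naiveDose <- grepl("D", objectives, fixed = TRUE)
naiveDose
```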

Examples

# fields needed
f.trialObjectives()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.trialObjectives",
  con = dbc)
trialsDf


Calculate phase of a clinical trial

Description

Trial concept calculated: phase of a clinical trial as per ICH E8(R1).

Usage

f.trialPhase(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.trialPhase', which is an ordered factor with levels 'phase 1', 'phase 1+2', 'phase 2', 'phase 2+3', 'phase 2+4', 'phase 3', 'phase 3+4', 'phase 1+2+3', 'phase 4', 'phase 1+2+3+4'.
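An ordered factor allows comparisons between phases. The sketch below rebuilds such a factor in base R from the levels listed above, in the order given; the phase values are made up.

```r
# Levels in the order listed above
phaseLevels <- c(
  "phase 1", "phase 1+2", "phase 2", "phase 2+3", "phase 2+4",
  "phase 3", "phase 3+4", "phase 1+2+3", "phase 4", "phase 1+2+3+4"
)

# Made-up phases of four trials
phase <- factor(
  c("phase 2", "phase 3", "phase 1+2", "phase 4"),
  levels = phaseLevels,
  ordered = TRUE
)

# Comparisons follow the order of the levels
atLeastPhase3 <- phase >= "phase 3"
atLeastPhase3

# Highest phase among these trials
highest <- as.character(max(phase))
highest
```

Note that with this order of levels, 'phase 1+2+3' compares as higher than 'phase 3'.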

Examples

# fields needed
f.trialPhase()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.trialPhase",
  con = dbc)
trialsDf


Calculate in- and exclusion criteria and age groups

Description

Trial concept calculated: inclusion and exclusion criteria as well as age groups that can participate in a trial, based on protocol-related information. Since CTGOV uses a single text field for eligibility criteria, text extraction is used to separate in- and exclusion criteria. (See dfMergeVariablesRelevel with an example for healthy volunteers.)

Usage

f.trialPopulation(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and new columns: '.trialPopulationAgeGroup' (factor, "P", "A", "P+A", "E", "A+E", "P+A+E"), '.trialPopulationInclusion' (string), '.trialPopulationExclusion' (string).

Examples

# fields needed
f.trialPopulation()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.trialPopulation",
  con = dbc)
trialsDf


Calculate the title of a study

Description

Trial concept calculated: scientific or full title of the study.

Usage

f.trialTitle(df = NULL)

Arguments

df

data frame such as from dbGetFieldsIntoDf. If 'NULL', prints fields needed in 'df' for calculating this trial concept, which can be used with dbGetFieldsIntoDf.

Value

data frame with columns '_id' and '.trialTitle', a string.

Examples

# fields needed
f.trialTitle()

# apply trial concept when creating data frame
dbc <- nodbi::src_sqlite(
  dbname = system.file("extdata", "demo.sqlite", package = "ctrdata"),
  collection = "my_trials", flags = RSQLite::SQLITE_RO)
trialsDf <- dbGetFieldsIntoDf(
  calculate = "f.trialTitle",
  con = dbc)
trialsDf
