A cohort is a set of people who fulfill a certain set of criteria for a period of time.
In omopgenerics we defined the cohort_table class, which allows us to represent individuals in a cohort.
A cohort_table is created using the newCohortTable() function and is defined by three elements:
A cohort table.
A cohort set.
A cohort attrition.
The cohort set and the cohort attrition will be instantiated in the database every time that we create a new cohort table. You can access the cohort set of a cohort_table using the function settings():
settings(cdm$cohort)
You can access the cohort attrition of a cohort_table using the function attrition():
attrition(cdm$cohort)
The cohort attrition table is also used to obtain the counts of each cohort. They can be seen with the function cohortCount(). This function relies entirely on the cohort attrition attribute and does not perform any actual calculation.
cohortCount(cdm$cohort)
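As a rough illustration of why no calculation on the cohort records is needed, the sketch below derives counts from an attrition table using base R only. It assumes that the final counts of a cohort are those of its last attrition step (the row with the highest reason_id); the data and the helper countsFromAttrition() are hypothetical and are not part of omopgenerics.

```r
# Hypothetical attrition table for two cohorts (illustrative data only).
attritionTable <- data.frame(
  cohort_definition_id = c(1L, 1L, 2L),
  number_records = c(10L, 8L, 5L),
  number_subjects = c(9L, 7L, 5L),
  reason_id = c(1L, 2L, 1L),
  reason = c("Initial qualifying events", "Prior observation", "Initial qualifying events"),
  excluded_records = c(0L, 2L, 0L),
  excluded_subjects = c(0L, 2L, 0L)
)

# Hypothetical helper: keep, for each cohort, the row of its last attrition step.
countsFromAttrition <- function(x) {
  lastStep <- ave(x$reason_id, x$cohort_definition_id, FUN = max)
  keep <- c("cohort_definition_id", "number_records", "number_subjects")
  final <- x[x$reason_id == lastStep, keep]
  rownames(final) <- NULL
  final
}

counts <- countsFromAttrition(attritionTable)
print(counts)
```

In this toy example, cohort 1 ends with 8 records and 7 subjects, and cohort 2 with 5 records and 5 subjects.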
Each of the elements that define a cohort table has to fulfill certain criteria.
A cohort set must be a table with:
Lower case column names.
At least cohort_definition_id, cohort_name columns (cohortColumns("cohort_set")).
cohort_name: it must contain unique cohort names (currently they are converted to snake case).
cohort_definition_id: it must contain unique cohort ids. All the ids present in the cohort table must be present in the cohort set, and the same ids must be present in the cohort attrition.
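The cohort set rules above can be sketched as a small base R check. The function checkCohortSet() and its messages are hypothetical and only mirror the rules as listed here; they are not the validation that omopgenerics runs internally.

```r
# Hypothetical helper: returns a character vector of problems (empty if none).
# `cohortIds` are the cohort_definition_id values found in the cohort table.
checkCohortSet <- function(cohortSet, cohortIds) {
  required <- c("cohort_definition_id", "cohort_name")
  problems <- character()
  if (!identical(names(cohortSet), tolower(names(cohortSet)))) {
    problems <- c(problems, "column names must be lower case")
  }
  if (!all(required %in% names(cohortSet))) {
    # Remaining checks need the required columns, so stop here.
    return(c(problems, "missing required columns"))
  }
  if (anyDuplicated(cohortSet$cohort_name) > 0) {
    problems <- c(problems, "cohort names must be unique")
  }
  if (anyDuplicated(cohortSet$cohort_definition_id) > 0) {
    problems <- c(problems, "cohort ids must be unique")
  }
  if (!all(cohortIds %in% cohortSet$cohort_definition_id)) {
    problems <- c(problems, "ids in the cohort table missing from cohort set")
  }
  problems
}

goodSet <- data.frame(
  cohort_definition_id = c(1L, 2L),
  cohort_name = c("asthma", "copd")
)
badSet <- data.frame(
  cohort_definition_id = c(1L, 2L),
  cohort_name = c("asthma", "asthma")
)

goodProblems <- checkCohortSet(goodSet, cohortIds = c(1L, 2L))
badProblems <- checkCohortSet(badSet, cohortIds = c(1L, 2L, 3L))
```

Here goodProblems is empty, while badProblems reports the duplicated name and the id 3 that is missing from the cohort set.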
A cohort attrition must be a table with:
Lower case column names.
At least cohort_definition_id, number_records, number_subjects, reason_id, reason, excluded_records, excluded_subjects columns (cohortColumns("cohort_attrition")).
cohort_definition_id: it must contain cohort ids. All the ids present in the cohort table must be present in the cohort attrition, and the same ids must be present in the cohort set.
Each pair of cohort_definition_id and reason_id must be unique.
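The two attrition-specific rules (unique pairs of cohort_definition_id and reason_id, and ids that agree with the cohort set) can be sketched in base R as follows. checkCohortAttrition() is a hypothetical helper, and the toy tables only carry the columns needed for these two checks.

```r
# Hypothetical helper: returns a character vector of problems (empty if none).
checkCohortAttrition <- function(attritionTable, cohortSet) {
  problems <- character()
  # Each (cohort_definition_id, reason_id) pair may appear only once.
  pairs <- attritionTable[, c("cohort_definition_id", "reason_id")]
  if (anyDuplicated(pairs) > 0) {
    problems <- c(problems, "duplicated cohort_definition_id and reason_id pair")
  }
  # The same ids must be present in the attrition and in the cohort set.
  sameIds <- setequal(
    attritionTable$cohort_definition_id,
    cohortSet$cohort_definition_id
  )
  if (!sameIds) {
    problems <- c(problems, "cohort ids differ between attrition and cohort set")
  }
  problems
}

cohortSet <- data.frame(
  cohort_definition_id = c(1L, 2L),
  cohort_name = c("asthma", "copd")
)
validAttrition <- data.frame(
  cohort_definition_id = c(1L, 1L, 2L),
  reason_id = c(1L, 2L, 1L)
)
invalidAttrition <- data.frame(
  cohort_definition_id = c(1L, 1L, 3L),
  reason_id = c(1L, 1L, 1L)
)

validProblems <- checkCohortAttrition(validAttrition, cohortSet)
invalidProblems <- checkCohortAttrition(invalidAttrition, cohortSet)
```

The second table fails both checks: cohort 1 repeats reason_id 1, and its ids (1 and 3) do not match those of the cohort set (1 and 2).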
A cohort table must fulfill the following:
It comes from a cdm_reference (extracted via cdm$cohort).
It has the same source as this cdm_reference.
Lower case column names.
At least cohort_definition_id, subject_id, cohort_start_date, cohort_end_date columns (cohortColumns("cohort")).
There is no record with NA value in the required columns.
There is no record with cohort_start_date after cohort_end_date.
There is no overlap between records. A person can be in a cohort several times (several records with the same subject_id), but they can’t enter the cohort again (cohort_start_date) before leaving it (cohort_end_date). That is, an individual can’t be in the same cohort more than once at the same time. This rule is applied at the cohort_definition_id level, so records with different cohort_definition_id can overlap.
The individual must be in observation during the whole period between cohort_start_date and cohort_end_date (both included).
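The record-level rules (no NA values, no start date after end date, no overlap within a cohort) can be sketched in base R as below. The helpers hasMissing(), hasStartAfterEnd() and hasOverlap() are hypothetical names, and treating a shared day as an overlap is an assumption based on both dates being included in the record. The observation period rule is not covered, since it needs a second table.

```r
required <- c("cohort_definition_id", "subject_id",
              "cohort_start_date", "cohort_end_date")

# Illustrative data: subject 1 is twice in cohort 1 and once in cohort 2.
cohort <- data.frame(
  cohort_definition_id = c(1L, 1L, 1L, 2L),
  subject_id = c(1L, 1L, 2L, 1L),
  cohort_start_date = as.Date(c("2020-01-01", "2020-03-01", "2020-01-01", "2020-01-15")),
  cohort_end_date = as.Date(c("2020-02-01", "2020-04-01", "2020-06-01", "2020-03-15"))
)

hasMissing <- function(x) any(is.na(x[, required]))

hasStartAfterEnd <- function(x) any(x$cohort_start_date > x$cohort_end_date)

hasOverlap <- function(x) {
  # Sort so that records of the same cohort and subject are consecutive
  # and ordered by start date.
  x <- x[order(x$cohort_definition_id, x$subject_id, x$cohort_start_date), ]
  n <- nrow(x)
  if (n < 2) return(FALSE)
  sameGroup <- x$cohort_definition_id[-1] == x$cohort_definition_id[-n] &
    x$subject_id[-1] == x$subject_id[-n]
  # A record overlaps if it starts on or before the previous record's end.
  any(sameGroup & x$cohort_start_date[-1] <= x$cohort_end_date[-n])
}

validOverlap <- hasOverlap(cohort)

# Move the second record of subject 1 in cohort 1 inside the first one.
overlapping <- cohort
overlapping$cohort_start_date[2] <- as.Date("2020-01-20")
invalidOverlap <- hasOverlap(overlapping)
```

The first table passes even though subject 1 is in cohorts 1 and 2 at the same time, because the comparison is only made within the same cohort_definition_id. The modified table fails because the subject re-enters cohort 1 before leaving it.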
You can bind two or more cohort tables using the method bind(). The only constraint is that cohort names must be unique across the different cohort tables. You have to provide a name for the new cohort table.
cdm <- bind(cdm$cohort1, cdm$cohort2, cdm$cohort3, name = "my_new_cohort")
cdm$my_new_cohort
settings(cdm$my_new_cohort)
attrition(cdm$my_new_cohort)
cohortCount(cdm$my_new_cohort)
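The uniqueness constraint on cohort names can be checked before binding. The sketch below works on plain data frames shaped like cohort sets; namesAreUnique() is a hypothetical helper, not a function of omopgenerics. With real cohort tables, the cohort sets would be obtained with settings().

```r
# Hypothetical cohort sets of three cohort tables (illustrative data only).
set1 <- data.frame(cohort_definition_id = 1:2, cohort_name = c("asthma", "copd"))
set2 <- data.frame(cohort_definition_id = 1L, cohort_name = "diabetes")
set3 <- data.frame(cohort_definition_id = 1L, cohort_name = "asthma")

# TRUE when no cohort name is repeated across the supplied cohort sets.
namesAreUnique <- function(...) {
  allNames <- unlist(lapply(list(...), function(s) s$cohort_name))
  anyDuplicated(allNames) == 0
}

canBind12 <- namesAreUnique(set1, set2)  # TRUE: all names differ
canBind13 <- namesAreUnique(set1, set3)  # FALSE: "asthma" appears twice
```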
You can export the metadata of a cohort_table using the function summary():
summary(cdm$cohort)
This will provide a summarised_result object with the metadata of the cohort (cohort set, cohort counts and cohort attrition).