Recoding CHMS medication variables

library(chmsflow)
library(recodeflow)
library(dplyr)

1. Introduction

chmsflow provides 16 functions that classify medications from ATC codes recorded in CHMS clinic data. Each function checks whether a respondent is taking a specific drug class and returns 1 (yes) or 0 (no), with haven::tagged_na() codes for missing or not-applicable responses.

Available medication variables

Variable Drug class ATC prefix Cycles 3–6 function Cycles 1–2 function
ace_med ACE inhibitors C09 is_ace_inhibitor() is_ace_med_cycles1to2()
bb_med Beta blockers C07 is_beta_blocker() is_bb_med_cycles1to2()
ccb_med Calcium channel blockers C08 is_calcium_channel_blocker() is_ccb_med_cycles1to2()
diur_med Diuretics C03 is_diuretic() is_diur_med_cycles1to2()
misc_htn_med Other antihypertensives mixed is_other_antihtn_med() is_misc_htn_med_cycles1to2()
any_htn_med Any antihypertensive combined is_any_antihtn_med() is_any_htn_med_cycles1to2()
nsaid_med NSAIDs M01A is_nsaid() is_nsaid_med_cycles1to2()
diab_med Diabetes medications A10 is_diabetes_med() is_diab_med_cycles1to2()

Cycle differences

Medication data is structured differently across CHMS cycles:

2. When to use medication recoding

If your analysis requires medication variables, always perform medication recoding first, before recoding any other variables. Two downstream health outcome variables depend on medication status:

3. Workflow

The workflow is the same for all cycles: recode medication variables and merge into the main cycle dataset using recode_meds_cycles1to2() or recode_meds_cycles3to6(), then derive health outcomes using recode_after_meds(). Use recode_after_meds() instead of rec_with_table() – it automatically excludes medication-specific rows from variable_details so pre-computed medication columns are passed through rather than re-derived.

3.1 Cycles 1–2

Cycles 1–2 medication data uses uppercase column names (CLINICID, ATC_101A, etc.). recode_meds_cycles1to2() normalizes these internally.

Step 1 – Recode medication variables and merge with main cycle data. Requires: cycle1, cycle1_meds.

cycle1 <- recode_meds_cycles1to2(cycle1, cycle1_meds, c("any_htn_med", "diab_med"))

Step 2 – Derive diabetes status. Requires: cycle1 from Step 1.

cycle1_diab_data <- recode_after_meds(
  cycle1,
  c("lab_hba1", "diab_a1c", "diab_med", "ccc_51", "diab_status")
)
head(select(cycle1_diab_data, clinicid, diab_status))
  clinicid diab_status
1        1           1
2        2           2
3        3        <NA>
4        4           1
5        5           1
6        6           2

Step 3 – Derive hypertension status. Requires: cycle1 from Step 1.

cycle1_htn_data <- recode_after_meds(
  cycle1,
  c(
    # Blood pressure (raw + adjusted)
    "bpmdpbps", "bpmdpbpd", "sbp_adj_mmhg", "dbp_adj_mmhg",
    # Medication inputs (merged in Step 1)
    "any_htn_med", "ccc_32",
    # Diabetes chain (input to htn functions)
    "lab_hba1", "diab_a1c", "ccc_51", "diab_med", "diab_status",
    # CVD chain
    "ccc_61", "ccc_63", "ccc_81", "cvd_status",
    # CKD chain
    "lab_bcre", "pgdcgt", "clc_sex", "clc_age", "gfr_ml_min", "ckd_status",
    # Hypertension outcomes
    "htn_status", "htn_adj_status", "htn_control_status", "htn_control_adj_status"
  )
)
head(select(cycle1_htn_data, clinicid, htn_status, htn_adj_status))
  clinicid htn_status htn_adj_status
1        1          1              1
2        2          1              1
3        3          1              1
4        4          1              1
5        5          1              1
6        6          1              1

3.2 Cycles 3–6

Step 1 – Recode medication variables and merge with main cycle data. Requires: cycle3, cycle3_meds.

cycle3 <- recode_meds_cycles3to6(cycle3, cycle3_meds, c("any_htn_med", "diab_med"))

Step 2 – Derive diabetes status. Requires: cycle3 from Step 1.

cycle3_diab_data <- recode_after_meds(
  cycle3,
  c("lab_hba1", "diab_a1c", "diab_med", "ccc_51", "diab_status")
)
head(select(cycle3_diab_data, clinicid, diab_status))
  clinicid diab_status
1        1           1
2        2           1
3        3           1
4        4           1
5        5           1
6        6           1

Step 3 – Derive hypertension status. Requires: cycle3 from Step 1.

cvd_status, diab_status, and ckd_status are intermediate inputs to the hypertension functions. Their full input chains must also be listed so recode_after_meds() can derive them.

cycle3_htn_data <- recode_after_meds(
  cycle3,
  c(
    # Blood pressure (raw + adjusted)
    "bpmdpbps", "bpmdpbpd", "sbp_adj_mmhg", "dbp_adj_mmhg",
    # Medication inputs (merged in Step 1)
    "any_htn_med", "ccc_32",
    # Diabetes chain (input to htn functions)
    "lab_hba1", "diab_a1c", "ccc_51", "diab_med", "diab_status",
    # CVD chain
    "ccc_61", "ccc_63", "ccc_81", "cvd_status",
    # CKD chain
    "lab_bcre", "pgdcgt", "clc_sex", "clc_age", "gfr_ml_min", "ckd_status",
    # Hypertension outcomes
    "htn_status", "htn_adj_status", "htn_control_status", "htn_control_adj_status"
  )
)
head(select(cycle3_htn_data, clinicid, htn_status, htn_adj_status))
  clinicid htn_status htn_adj_status
1        1          1              1
2        2          1              1
3        3          1              1
4        4          1              1
5        5          1              1
6        6          1              1

4. Advanced: using individual classification functions

The is_* functions underlie the wrapper functions and are available directly for custom workflows – for example, deriving a single drug class without the full pipeline, or integrating classification logic into your own aggregation steps.

Each function accepts an ATC code and a time-last-taken value and returns 1, 0, or a tagged_na() code:

# Single medication classification
is_beta_blocker("C07AA05", 1) # returns 1
[1] 1
is_ace_inhibitor("C09AA02", 1) # returns 1
[1] 1
is_diabetes_med("A10BA02", 1) # returns 1
[1] 1

Cycle format differences

Cycles 1–2 – one row per respondent with up to 80 atc_*/mhr_* column pairs. The is_*_med_cycles1to2() variants accept named arguments for each slot:

# Classification using cycles 1--2 wide-format columns
is_ace_med_cycles1to2(atc_101a = "C09AA02", mhr_101b = 1) # returns 1
[1] 1
is_ace_med_cycles1to2(atc_101a = "C09AA02", mhr_101b = 6) # returns 0 (not taken recently)
[1] 0

Cycles 3–6 – one row per medication per respondent with two columns: meucatc (ATC code) and npi_25b (time last taken). Classify per row, then aggregate across rows per respondent:

cycle3_meds |>
  mutate(ace_med = is_ace_inhibitor(meucatc, npi_25b)) |>
  aggregate_meds_by_person(variables = "ace_med")
# A tibble: 50 × 2
   clinicid ace_med
      <int>   <dbl>
 1        1       1
 2        2       1
 3        3       1
 4        4       1
 5        5       1
 6        6       1
 7        7       1
 8        8       1
 9        9       1
10       10       1
# ℹ 40 more rows

Warning

Avoid using as.numeric(as.character(.x)) to aggregate medication columns. That pattern strips tagged_na("a") (valid skip) and tagged_na("b") (missing/refused) distinctions, collapsing them into plain NA. Use aggregate_meds_by_person() instead – it preserves tagged-NA semantics across the aggregation.

Next steps

mirror server hosted at Truenetwork, Russian Federation.