Getting Started with climatehealth

What is climatehealth?

The climatehealth package provides R functions for calculating climate–health indicators following the statistical framework developed under the SOSCHI (Standards for Official Statistics on Climate–Health Interactions) project. It covers indicators for six climate-health topic areas:

Topic	Lead
Temperature-related health effects	ONS
Health effects of wildfires	ONS
Mental health (suicides and heat)	ONS
Water-borne diseases (diarrhoea)	AIMS
Health effects of air pollution	AIMS
Vector-borne diseases (malaria)	RIPS/AIMS

Each topic has a dedicated analysis function that takes a data file path and column mappings, fits the appropriate statistical models, and returns results and optional plots.

Installation

From CRAN

install.packages("climatehealth")

From GitHub (latest development version)

install.packages("remotes")
remotes::install_github("onssoschi/climatehealth")

Optional dependencies

Two indicators depend on optional packages: malaria requires INLA and diarrhoea requires terra. INLA is not distributed on CRAN, so install these packages separately, and only if you need them, using the package's helper functions:

climatehealth::install_INLA()
climatehealth::install_terra()
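
Before calling the installers, it can help to check which optional packages are already available. The following is a minimal sketch using base R's requireNamespace(); it makes no assumptions about how the climatehealth helpers work internally.

```r
# Check which optional packages are already installed.
# requireNamespace() returns TRUE/FALSE without attaching the package.
optional_pkgs <- c("INLA", "terra")
installed <- vapply(optional_pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Names of the packages that still need installing (possibly none).
missing_pkgs <- names(installed)[!installed]
```

If missing_pkgs is empty, the corresponding install helper can be skipped.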

Once installed, load the package:

library(climatehealth)

Package workflow

All six indicator functions follow the same pattern:

1. Provide a path to your input CSV.
2. Map your column names to the function’s expected arguments (or use the defaults if your data already matches them).
3. Choose optional extras: covariates, meta-analysis, output saving.
4. Inspect the returned list for model results, plots, and summary tables.

your_data.csv  -->  indicator_do_analysis()  -->  results list
                                              -->  figures (optional)
                                              -->  CSV outputs (optional)
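
To make the input side of this workflow concrete, the sketch below writes a small CSV with the column names used in the examples in this guide (date, region, tmean, deaths, population). The values are random placeholders and the file name is hypothetical. A file this small is only suitable for checking the format, not for fitting models.

```r
# Build a small synthetic daily dataset with the column names used in this guide.
# The values are random placeholders, not real observations.
set.seed(1)
dates   <- seq(as.Date("2020-01-01"), by = "day", length.out = 10)
regions <- c("North", "South")

# One row per date and region.
toy <- expand.grid(date = dates, region = regions, stringsAsFactors = FALSE)
toy$tmean      <- round(rnorm(nrow(toy), mean = 15, sd = 5), 1)
toy$deaths     <- rpois(nrow(toy), lambda = 20)
toy$population <- 100000

csv_path <- file.path(tempdir(), "toy_climate_health.csv")
write.csv(toy, csv_path, row.names = FALSE)

# Read the header back to confirm it matches the column mappings you plan to pass.
names(read.csv(csv_path))
```

The resulting csv_path is the kind of value that would be supplied as data_path.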

Your first analysis: temperature and mortality

temp_mortality_do_analysis() estimates the association between ambient temperature and mortality using a distributed lag non-linear model (DLNM).

res <- climatehealth::temp_mortality_do_analysis(
  data_path        = "path/to/your/data.csv",
  date_col         = "date",
  region_col       = "region",
  temperature_col  = "tmean",
  health_outcome_col = "deaths",
  population_col   = "population",
  meta_analysis    = FALSE,
  save_fig         = FALSE,
  save_csv         = FALSE
)

The returned object res is a named list. Common fields include:

res$data_raw          # the input data as loaded
res$analysis_results  # model coefficients and confidence intervals
res$meta_results      # pooled estimates (when meta_analysis = TRUE)
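
A quick way to explore a returned list is to look at its names and a one-level structure summary. The sketch below uses a stand-in list with the field names shown above; the real contents come from the package, and the assumption that an unused field may be NULL or absent is ours.

```r
# Stand-in for a returned results list; real contents come from the package.
res <- list(
  data_raw         = data.frame(date = as.Date("2020-01-01"), deaths = 12),
  analysis_results = NULL,
  meta_results     = NULL
)

names(res)               # which fields are present
str(res, max.level = 1)  # one-level overview without printing everything

# A field may be NULL or absent (assumed here for meta_results when no
# meta-analysis was run), so test before using it.
has_meta <- !is.null(res$meta_results)
```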

Adding covariates

Pass extra column names via independent_cols (continuous exposures) and control_cols (factors such as day-of-week or public holidays):

res <- climatehealth::temp_mortality_do_analysis(
  data_path          = "path/to/your/data.csv",
  date_col           = "date",
  region_col         = "region",
  temperature_col    = "tmean",
  health_outcome_col = "deaths",
  population_col     = "population",
  independent_cols   = c("humidity", "ozone"),
  control_cols       = c("dow", "holiday_flag"),
  meta_analysis      = FALSE,
  save_fig           = FALSE,
  save_csv           = FALSE
)
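
If your data does not yet contain calendar controls such as dow or holiday_flag, they can be derived from the date column before writing the CSV. This is a sketch in base R; the holiday calendar is hypothetical, and whether the package expects these columns as factors, integers, or strings is not stated here, so check the function help page.

```r
# Derive calendar control variables from a date column.
df <- data.frame(date = as.Date(c("2024-01-01", "2024-01-06", "2024-01-07")))

# as.POSIXlt()$wday is locale-independent: 0 = Sunday, ..., 6 = Saturday.
df$dow <- factor(as.POSIXlt(df$date)$wday, levels = 0:6,
                 labels = c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"))

# Hypothetical holiday calendar; replace with the official list for your country.
holidays <- as.Date(c("2024-01-01"))
df$holiday_flag <- as.integer(df$date %in% holidays)

print(df)
```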

Pooling across regions with meta-analysis

Set meta_analysis = TRUE to pool region-level estimates into a single national estimate:

res <- climatehealth::temp_mortality_do_analysis(
  data_path          = "path/to/your/data.csv",
  date_col           = "date",
  region_col         = "region",
  temperature_col    = "tmean",
  health_outcome_col = "deaths",
  population_col     = "population",
  country            = "National",
  meta_analysis      = TRUE,
  save_fig           = FALSE,
  save_csv           = FALSE
)
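
To illustrate what pooling means, the sketch below applies the simplest form of meta-analysis, fixed-effect inverse-variance weighting, to two made-up region-level estimates. This guide does not describe the pooling method the package uses, so treat this as a conceptual illustration rather than a reproduction of its output.

```r
# Fixed-effect inverse-variance pooling of region-level estimates.
# Illustrative numbers only (for example, log relative risks and standard errors).
est <- c(North = 0.10, South = 0.20)
se  <- c(North = 0.10, South = 0.10)

w         <- 1 / se^2               # precision weights
pooled    <- sum(w * est) / sum(w)  # precision-weighted mean
pooled_se <- sqrt(1 / sum(w))       # standard error of the pooled estimate
ci        <- pooled + c(-1, 1) * qnorm(0.975) * pooled_se  # 95% interval
```

With equal standard errors the pooled estimate is the plain average of the two inputs (about 0.15), and its standard error is smaller than either regional one.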

The six indicators

Air pollution

air_pollution_do_analysis() estimates attributable mortality burden from PM2.5 exposure. By default it expects columns named date, region, pm25, deaths, population, humidity, precipitation, tmax, and wind_speed.

res <- climatehealth::air_pollution_do_analysis(
  data_path       = "path/to/your/data.csv",
  save_outputs    = FALSE,
  run_descriptive = TRUE,
  run_power       = TRUE
)

Compare against multiple PM2.5 reference thresholds in a single run:

res <- climatehealth::air_pollution_do_analysis(
  data_path         = "path/to/your/data.csv",
  reference_standards = list(
    list(value = 15, name = "WHO"),
    list(value = 25, name = "National")
  ),
  save_outputs = FALSE,
  run_power    = TRUE
)
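
When there are many thresholds, the nested list can be built from a named vector instead of typed by hand. The helper name make_reference_standards is hypothetical; the sketch only reproduces the list-of-lists shape shown in the example above.

```r
# Hypothetical helper: turn a named numeric vector into the list-of-lists
# shape used for reference_standards in the example above.
make_reference_standards <- function(thresholds) {
  unname(Map(
    function(v, n) list(value = v, name = n),
    unname(thresholds),
    names(thresholds)
  ))
}

standards <- make_reference_standards(c(WHO = 15, National = 25))
str(standards)
```

The result can then be passed as reference_standards = standards.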

Wildfires

wildfire_do_analysis() estimates the health impact of wildfire smoke exposure.

res <- climatehealth::wildfire_do_analysis(
  data_path          = "path/to/your/data.csv",
  date_col           = "date",
  region_col         = "region",
  exposure_col       = "pm25_fire",
  health_outcome_col = "respiratory_admissions",
  population_col     = "population",
  meta_analysis      = FALSE,
  save_fig           = FALSE,
  save_csv           = FALSE
)

Mental health (suicides and heat)

suicides_heat_do_analysis() models the association between temperature and suicide counts.

res <- climatehealth::suicides_heat_do_analysis(
  data_path          = "path/to/your/data.csv",
  date_col           = "date",
  region_col         = "region",
  temperature_col    = "tmean",
  health_outcome_col = "suicides",
  population_col     = "population",
  meta_analysis      = FALSE,
  save_fig           = FALSE,
  save_csv           = FALSE
)

Water-borne diseases (diarrhoea)

diarrhea_do_analysis() estimates climate-driven diarrhoea burden. It requires the terra package (see Installation above).

res <- climatehealth::diarrhea_do_analysis(
  data_path          = "path/to/your/data.csv",
  date_col           = "date",
  region_col         = "region",
  temperature_col    = "tmean",
  health_outcome_col = "diarrhea_cases",
  population_col     = "population",
  meta_analysis      = FALSE,
  save_fig           = FALSE,
  save_csv           = FALSE
)

Vector-borne diseases (malaria)

malaria_do_analysis() requires the INLA package (see Installation above).

res <- climatehealth::malaria_do_analysis(
  data_path          = "path/to/your/data.csv",
  date_col           = "date",
  region_col         = "region",
  temperature_col    = "tmean",
  health_outcome_col = "malaria_cases",
  population_col     = "population",
  meta_analysis      = FALSE,
  save_fig           = FALSE,
  save_csv           = FALSE
)

Descriptive statistics

Before running an indicator analysis, use run_descriptive_stats() to explore your data: distributions, correlations, missing values, outliers, and seasonal patterns.

df <- read.csv("path/to/your/data.csv")

desc <- climatehealth::run_descriptive_stats(
  data               = df,
  output_path        = "path/to/output/folder",
  aggregation_column = "region",
  dependent_col      = "deaths",
  independent_cols   = c("tmean", "humidity", "rainfall"),
  plot_corr_matrix   = TRUE,
  plot_dist          = TRUE,
  plot_na_counts     = TRUE,
  plot_scatter       = TRUE,
  plot_box           = TRUE,
  create_base_dir    = TRUE
)
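
A quick base R check of missing values and gaps in the date sequence can complement this report, for example while you are still assembling the data. The sketch below uses a small synthetic data frame and does not call the package.

```r
# Small synthetic frame with one missing value and one missing day.
df_check <- data.frame(
  date   = as.Date(c("2020-01-01", "2020-01-02", "2020-01-04")),
  tmean  = c(4.2, NA, 5.1),
  deaths = c(10L, 12L, 9L)
)

# Missing values per column.
na_counts <- colSums(is.na(df_check))

# Days absent from an otherwise daily series.
all_days     <- seq(min(df_check$date), max(df_check$date), by = "day")
missing_days <- all_days[!all_days %in% df_check$date]

print(na_counts)
print(missing_days)
```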

Add units for cleaner plot labels, and enable time-series and rate calculations:

desc <- climatehealth::run_descriptive_stats(
  data               = df,
  output_path        = "path/to/output/folder",
  aggregation_column = "region",
  population_col     = "population",
  dependent_col      = "deaths",
  independent_cols   = c("tmean", "humidity", "rainfall"),
  units = c(
    deaths    = "count",
    tmean     = "C",
    humidity  = "%",
    rainfall  = "mm"
  ),
  timeseries_col     = "date",
  plot_corr_matrix   = TRUE,
  plot_dist          = TRUE,
  plot_ma            = TRUE,
  ma_days            = 30,
  plot_seasonal      = TRUE,
  plot_regional      = TRUE,
  plot_total         = TRUE,
  detect_outliers    = TRUE,
  calculate_rate     = TRUE,
  create_base_dir    = TRUE
)

The returned list includes the paths of the folders where the outputs were written:

desc$run_output_path      # folder where all outputs were saved
desc$region_output_paths  # per-region output sub-folders

Saving outputs

Every indicator function accepts save_fig and save_csv arguments (or save_outputs for air pollution). Set these to TRUE and supply output_folder_path to write results to disk. The function creates a timestamped sub-folder automatically.

res <- climatehealth::temp_mortality_do_analysis(
  data_path          = "path/to/your/data.csv",
  date_col           = "date",
  region_col         = "region",
  temperature_col    = "tmean",
  health_outcome_col = "deaths",
  population_col     = "population",
  meta_analysis      = TRUE,
  save_fig           = TRUE,
  save_csv           = TRUE,
  output_folder_path = "path/to/output/folder"
)
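
Because each run writes to its own sub-folder, it can be handy to locate the most recent one programmatically. The sketch below picks the newest sub-folder by modification time. The folder names are stand-ins, since the package's timestamp naming scheme is not described here.

```r
# Locate the most recently modified sub-folder of an output directory.
out_dir <- file.path(tempdir(), "climatehealth_outputs")  # stand-in output folder
dir.create(file.path(out_dir, "example_run"), recursive = TRUE, showWarnings = FALSE)

subdirs <- list.dirs(out_dir, full.names = TRUE, recursive = FALSE)
mtimes  <- as.numeric(file.info(subdirs)$mtime)
latest  <- subdirs[which.max(mtimes)]

# List whatever files that run produced (none for this stand-in folder).
list.files(latest, recursive = TRUE)
```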

Error handling

The package uses structured conditions. You can catch them with tryCatch:

result <- tryCatch(
  climatehealth::temp_mortality_do_analysis(
    data_path          = "path/to/your/data.csv",
    date_col           = "wrong_column_name",
    health_outcome_col = "deaths",
    population_col     = "population"
  ),
  climate_error = function(e) {
    message("climatehealth error: ", conditionMessage(e))
    NULL
  }
)
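
The mechanism behind this pattern is base R's classed conditions: tryCatch dispatches on the class vector of the signalled condition. The sketch below reproduces it with a stand-in constructor, toy_climate_error, which is hypothetical; the package's real conditions may carry additional classes or fields.

```r
# Stand-in constructor: builds a condition carrying the class "climate_error".
# The package's real conditions may carry extra classes or fields.
toy_climate_error <- function(msg) {
  structure(
    class = c("climate_error", "error", "condition"),
    list(message = msg, call = NULL)
  )
}

result <- tryCatch(
  stop(toy_climate_error("column 'wrong_column_name' not found")),
  climate_error = function(e) {
    message("handled: ", conditionMessage(e))
    NULL                       # fall back to NULL so the script can continue
  },
  error = function(e) stop(e)  # any other error is re-raised unchanged
)
```

Listing the specific handler first means package errors are handled gracefully while unrelated errors still stop the script.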

Inside a handler, use is_climate_error() to test whether the caught condition came from this package:

climatehealth::is_climate_error(e)  # e is the condition object passed to the handler

Next steps

Full example scripts for each indicator are in system.file("examples", package = "climatehealth").
Function reference: see ?temp_mortality_do_analysis and related help pages.
Methodology documents for each SOSCHI topic are linked from the SOSCHI project website.
Report issues by emailing climate.health@ons.gov.uk or via the Contact Us page.