The climatehealth package provides R functions for calculating climate–health indicators following the statistical framework developed under the SOSCHI (Standards for Official Statistics on Climate–Health Interactions) project. It covers indicators for six climate-health topic areas:
| Topic | Lead |
|---|---|
| Temperature-related health effects | ONS |
| Health effects of wildfires | ONS |
| Mental health (suicides and heat) | ONS |
| Water-borne diseases (diarrhoea) | AIMS |
| Health effects of air pollution | AIMS |
| Vector-borne diseases (malaria) | RIPS/AIMS |
Each topic has a dedicated analysis function that takes a data file path and column mappings, fits the appropriate statistical models, and returns results and optional plots.
All six indicator functions follow the same pattern:
your_data.csv --> indicator_do_analysis() --> results list
                                          --> figures (optional)
                                          --> CSV outputs (optional)
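To try the pattern above without real data, the following base R sketch writes a small synthetic CSV with the column names used in the examples in this document (date, region, tmean, deaths, population). The values, region names, and file location are illustrative assumptions, not a format prescribed by the package.

```r
# Build a tiny synthetic daily data set for two regions (illustrative values only).
set.seed(42)
dates   <- seq(as.Date("2020-01-01"), by = "day", length.out = 60)
regions <- c("North", "South")

df <- data.frame(
  date   = rep(dates, times = length(regions)),
  region = rep(regions, each = length(dates)),
  stringsAsFactors = FALSE
)

# Smooth seasonal temperature plus noise, Poisson death counts, fixed populations.
day_of_year   <- as.numeric(format(df$date, "%j"))
df$tmean      <- round(10 + 5 * sin(2 * pi * day_of_year / 365) + rnorm(nrow(df)), 1)
df$deaths     <- rpois(nrow(df), lambda = 20)
df$population <- ifelse(df$region == "North", 500000, 750000)

# Write to a temporary location; this path can then be passed as data_path.
csv_path <- file.path(tempdir(), "your_data.csv")
write.csv(df, csv_path, row.names = FALSE)
```

The resulting csv_path plays the role of "path/to/your/data.csv" in the calls shown in this document.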
temp_mortality_do_analysis() estimates the association
between ambient temperature and mortality using a distributed lag
non-linear model (DLNM).
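For intuition about the distributed-lag part, the sketch below fits a plain, linear, unconstrained distributed lag model in base R on synthetic data. It is not a DLNM and not the package's implementation: a DLNM additionally models a non-linear exposure-response through a cross-basis over exposure and lag. All variable names and simulated values are assumptions made for illustration.

```r
library(splines)  # ships with R; used for the smooth time trend

set.seed(1)
n     <- 365
day   <- seq_len(n)
tmean <- 15 + 10 * sin(2 * pi * day / 365) + rnorm(n, sd = 2)
deaths <- rpois(n, lambda = exp(3 + 0.01 * tmean))

# Matrix of lagged temperature: column k holds temperature k days earlier.
max_lag <- 3
lagged <- sapply(0:max_lag, function(l) c(rep(NA, l), head(tmean, n - l)))
colnames(lagged) <- paste0("lag", 0:max_lag)

# Quasi-Poisson regression with one coefficient per lag and a smooth time trend.
# Rows with missing lags (the first max_lag days) are dropped automatically.
fit <- glm(deaths ~ lagged + ns(day, df = 4), family = quasipoisson())

# Per-lag log relative risks and their sum (the cumulative effect over all lags).
lag_effects <- coef(fit)[grep("^lagged", names(coef(fit)))]
cumulative  <- sum(lag_effects)
```

Summing the lag coefficients gives the overall log relative risk per degree across the lag window, which is the quantity a DLNM generalises to a curve.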
res <- climatehealth::temp_mortality_do_analysis(
data_path = "path/to/your/data.csv",
date_col = "date",
region_col = "region",
temperature_col = "tmean",
health_outcome_col = "deaths",
population_col = "population",
meta_analysis = FALSE,
save_fig = FALSE,
save_csv = FALSE
)
The returned object res is a named list. Common fields include:
res$data_raw # the input data as loaded
res$analysis_results # model coefficients and confidence intervals
res$meta_results # pooled estimates (when meta_analysis = TRUE)
Pass extra column names via independent_cols (continuous exposures) and control_cols (factors such as day-of-week or public holidays):
res <- climatehealth::temp_mortality_do_analysis(
data_path = "path/to/your/data.csv",
date_col = "date",
region_col = "region",
temperature_col = "tmean",
health_outcome_col = "deaths",
population_col = "population",
independent_cols = c("humidity", "ozone"),
control_cols = c("dow", "holiday_flag"),
meta_analysis = FALSE,
save_fig = FALSE,
save_csv = FALSE
)
Set meta_analysis = TRUE to pool region-level estimates into a single national estimate:
res <- climatehealth::temp_mortality_do_analysis(
data_path = "path/to/your/data.csv",
date_col = "date",
region_col = "region",
temperature_col = "tmean",
health_outcome_col = "deaths",
population_col = "population",
country = "National",
meta_analysis = TRUE,
save_fig = FALSE,
save_csv = FALSE
)
air_pollution_do_analysis() estimates the mortality burden attributable to PM2.5 exposure. By default it expects columns named date, region, pm25, deaths, population, humidity, precipitation, tmax, and wind_speed.
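Because the defaults rely on exact column names, it can help to check a data frame against that list before calling the function. The following is a minimal base R sketch; the example data frame is made up and deliberately lacks two columns, and the check itself is not part of the package.

```r
# Default column names listed above for air_pollution_do_analysis().
required <- c("date", "region", "pm25", "deaths", "population",
              "humidity", "precipitation", "tmax", "wind_speed")

# Illustrative data frame that is missing precipitation and wind_speed.
df <- data.frame(
  date       = as.Date("2020-01-01") + 0:2,
  region     = "A",
  pm25       = c(12, 18, 25),
  deaths     = c(3, 4, 5),
  population = 1e5,
  humidity   = c(60, 65, 70),
  tmax       = c(28, 30, 31)
)

# Report any expected columns that are absent.
missing_cols <- setdiff(required, names(df))
if (length(missing_cols) > 0) {
  message("Missing columns: ", paste(missing_cols, collapse = ", "))
}
```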
res <- climatehealth::air_pollution_do_analysis(
data_path = "path/to/your/data.csv",
save_outputs = FALSE,
run_descriptive = TRUE,
run_power = TRUE
)
Compare against multiple PM2.5 reference thresholds in a single run:
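The arithmetic behind a threshold comparison can be sketched in base R. The example assumes a log-linear exposure-response with a hypothetical coefficient beta and made-up daily values. It does not call the package and does not show which argument the package uses for thresholds, so consult ?air_pollution_do_analysis for the actual interface.

```r
# Hypothetical log-linear coefficient: 8% higher mortality risk per 10 ug/m3 PM2.5.
beta <- log(1.08) / 10

# Made-up daily PM2.5 concentrations (ug/m3) and death counts.
pm25   <- c(8, 12, 20, 35, 50)
deaths <- c(100, 110, 120, 130, 150)

# Reference thresholds to compare (ug/m3); illustrative values.
thresholds <- c(5, 15, 25)

# For each threshold: attributable fraction on days above it, times deaths, summed.
attributable <- sapply(thresholds, function(ref) {
  excess <- pmax(pm25 - ref, 0)      # exposure above the reference level
  af     <- 1 - exp(-beta * excess)  # attributable fraction per day
  sum(af * deaths)                   # attributable deaths over the period
})
names(attributable) <- paste0("ref_", thresholds)
```

A stricter (lower) reference threshold always yields a larger attributable burden, because more of each day's exposure counts as excess.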
wildfire_do_analysis() estimates the health impact of
wildfire smoke exposure.
suicides_heat_do_analysis() models the association
between temperature and suicide counts.
diarrhea_do_analysis() estimates climate-driven
diarrhoea burden.
malaria_do_analysis() requires the INLA
package (see Installation above).
res <- climatehealth::malaria_do_analysis(
data_path = "path/to/your/data.csv",
date_col = "date",
region_col = "region",
temperature_col = "tmean",
health_outcome_col = "malaria_cases",
population_col = "population",
meta_analysis = FALSE,
save_fig = FALSE,
save_csv = FALSE
)
Before running an indicator analysis, use run_descriptive_stats() to explore your data: distributions, correlations, missing values, outliers, and seasonal patterns.
df <- read.csv("path/to/your/data.csv")
desc <- climatehealth::run_descriptive_stats(
data = df,
output_path = "path/to/output/folder",
aggregation_column = "region",
dependent_col = "deaths",
independent_cols = c("tmean", "humidity", "rainfall"),
plot_corr_matrix = TRUE,
plot_dist = TRUE,
plot_na_counts = TRUE,
plot_scatter = TRUE,
plot_box = TRUE,
create_base_dir = TRUE
)
Add units for cleaner plot labels, and enable time-series and rate calculations:
desc <- climatehealth::run_descriptive_stats(
data = df,
output_path = "path/to/output/folder",
aggregation_column = "region",
population_col = "population",
dependent_col = "deaths",
independent_cols = c("tmean", "humidity", "rainfall"),
units = c(
deaths = "count",
tmean = "C",
humidity = "%",
rainfall = "mm"
),
timeseries_col = "date",
plot_corr_matrix = TRUE,
plot_dist = TRUE,
plot_ma = TRUE,
ma_days = 30,
plot_seasonal = TRUE,
plot_regional = TRUE,
plot_total = TRUE,
detect_outliers = TRUE,
calculate_rate = TRUE,
create_base_dir = TRUE
)
The returned list includes the paths where the outputs were written:
desc$run_output_path # folder where all outputs were saved
desc$region_output_paths # per-region output sub-folders
Every indicator function accepts save_fig and save_csv arguments (or save_outputs for air pollution). Set these to TRUE and supply output_folder_path to write results to disk. The function creates a timestamped sub-folder automatically.
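To give a sense of what a timestamped sub-folder looks like on disk, here is a base R sketch. The "run_" prefix, the timestamp format, and the file name are assumptions for illustration; the package's actual naming scheme may differ.

```r
# Base output folder (a temporary directory stands in for your own path).
output_folder_path <- file.path(tempdir(), "climatehealth_outputs")

# Build a sub-folder name from the current time, e.g. run_20240131_142530.
stamp   <- format(Sys.time(), "%Y%m%d_%H%M%S")
run_dir <- file.path(output_folder_path, paste0("run_", stamp))
dir.create(run_dir, recursive = TRUE, showWarnings = FALSE)

# Write an illustrative results table into the run folder.
results <- data.frame(region = "North", estimate = 1.05)
write.csv(results, file.path(run_dir, "results.csv"), row.names = FALSE)

# List what was written.
list.files(run_dir)
```

Keeping each run in its own folder means repeated analyses never overwrite earlier outputs.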
res <- climatehealth::temp_mortality_do_analysis(
data_path = "path/to/your/data.csv",
date_col = "date",
region_col = "region",
temperature_col = "tmean",
health_outcome_col = "deaths",
population_col = "population",
meta_analysis = TRUE,
save_fig = TRUE,
save_csv = TRUE,
output_folder_path = "path/to/output/folder"
)
The package signals structured (classed) conditions, which you can catch with tryCatch:
result <- tryCatch(
climatehealth::temp_mortality_do_analysis(
data_path = "path/to/your/data.csv",
date_col = "wrong_column_name",
health_outcome_col = "deaths",
population_col = "population"
),
climate_error = function(e) {
message("climatehealth error: ", conditionMessage(e))
NULL
}
)
Use is_climate_error() to test whether a caught condition came from this package:
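As a self-contained illustration of how classed conditions behave in base R, the sketch below signals a condition carrying a climate_error class and tests for it with inherits(). The constructor and the helper looks_like_climate_error() are hypothetical stand-ins; the sketch does not call is_climate_error(), whose exact signature should be checked in the package help.

```r
# Hypothetical constructor for a condition with a custom class.
# The class vector mirrors the climate_error handler name used with tryCatch.
new_climate_error <- function(message, call = NULL) {
  structure(
    class = c("climate_error", "error", "condition"),
    list(message = message, call = call)
  )
}

# Hypothetical stand-in for a predicate such as is_climate_error().
looks_like_climate_error <- function(cond) inherits(cond, "climate_error")

# Catch any error and return the condition object for inspection.
caught <- tryCatch(
  stop(new_climate_error("column 'wrong_column_name' not found")),
  error = function(e) e
)
other <- tryCatch(stop("plain error"), error = function(e) e)

looks_like_climate_error(caught)  # TRUE: carries the custom class
looks_like_climate_error(other)   # FALSE: an ordinary simpleError
```

Testing the class inside a generic error handler lets you treat package errors and unrelated failures differently.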
system.file("examples", package = "climatehealth").
?temp_mortality_do_analysis and related help pages.