Help for package metasurvey

Title:

Reproducible Survey Data Processing with Step Pipelines

Version:

0.0.21

URL:

https://metasurveyr.github.io/metasurvey/, https://github.com/metasurveyr/metasurvey

BugReports:

https://github.com/metasurveyr/metasurvey/issues

Description:

Provides a step-based pipeline for reproducible survey data processing, building on the 'survey' package for complex sampling designs. Supports rotating panels with bootstrap replicate weights, and provides a recipe system for sharing and reproducing data transformation workflows across survey editions.

License:

GPL (≥ 3)

Imports:

data.table (≥ 1.14.2), cli (≥ 3.0.0), glue (≥ 1.6.0), lifecycle (≥ 1.0.0), jsonlite (≥ 1.7.2), R6 (≥ 2.5.0), survey (≥ 4.2.1), methods

Suggests:

archive, convey, httr2 (≥ 1.0.0), haven, openxlsx, visNetwork (≥ 2.0.9), roxygen2 (≥ 7.1.2), testthat (≥ 3.0.0), tibble (≥ 3.1.3), dplyr (≥ 1.0.7), knitr (≥ 1.33), foreign (≥ 0.8-81), rmarkdown (≥ 2.11), parallel, rio (≥ 0.5.27), here (≥ 1.0.1), gt (≥ 0.10.0), magrittr, shiny (≥ 1.7.0), bslib (≥ 0.5.0), bsicons (≥ 0.1.0), htmltools (≥ 0.5.0), xml2 (≥ 1.3.0), eph, PNADcIBGE, ipumsr

Config/testthat/edition:

Encoding:

UTF-8

RoxygenNote:

7.3.3

Depends:

R (≥ 4.1)

VignetteBuilder:

knitr

NeedsCompilation:

Packaged:

2026-02-20 17:09:34 UTC; mauroloprete

Author:

Mauro Loprete [aut, cre], Natalia da Silva [aut], Fabricio Machado [aut]

Maintainer:

Mauro Loprete <mauro.loprete@icloud.com>

Repository:

CRAN

Date/Publication:

2026-02-25 10:30:07 UTC

metasurvey: Survey Processing with Meta-Programming

Description

The metasurvey package provides a framework for processing complex survey data using meta-programming techniques. It builds on the survey package for design-based estimation and adds steps, recipes, and workflows for reproducible survey analysis.

Key Features

Survey Objects and Classes:

Survey: Basic survey object for cross-sectional data
RotativePanelSurvey: Panel survey with implantation and follow-up
PoolSurvey: Pool of multiple surveys for time series analysis

Steps and Workflows:

step_compute: Create computed variables
step_recode: Recode variables with multiple conditions
workflow: Execute estimation workflows with variance adjustment

Recipes and Reproducibility:

recipe: Create reusable recipe objects
bake_recipes: Apply recipes to survey data
get_recipe: Retrieve recipes from repository

Data Loading and Weights:

load_survey: Load single survey data
load_panel_survey: Load panel survey data
add_weight: Add survey weights
add_replicate: Add bootstrap/jackknife replicates

Quality Assessment:

evaluate_cv: Evaluate coefficient of variation quality
Built-in variance estimation with multiple engines

Supported Survey Types

The package includes built-in support for several survey types:

ECH: Encuesta Continua de Hogares (Uruguay)
EAII: Encuesta de Actividad, Innovación e I+D
EAI: Encuesta de Actividades de Innovación
Generic survey types with flexible configuration

Workflow Example

library(metasurvey)
# Note: examples use base R pipe (|>) to avoid extra dependencies

# Load survey data
survey_data <- load_survey(
  data_path = "path/to/data.csv",
  svy_edition = "2023",
  svy_type = "ech",
  svy_weight = add_weight(annual = "weight_var")
)

# Add processing steps
processed_survey <- survey_data |>
  step_recode("employed", status == 1 ~ 1, .default = 0) |>
  step_compute(unemployment_rate = unemployed / labor_force)

# Apply steps and run workflow
final_survey <- bake_steps(processed_survey)

results <- workflow(
  survey = list(final_survey),
  survey::svytotal(~employed),
  survey::svymean(~unemployment_rate),
  estimation_type = "annual"
)

Meta-Programming Features

The package leverages R's meta-programming capabilities to:

Generate survey code dynamically based on metadata
Create reusable workflows that adapt to different survey structures
Validate and harmonize variable definitions across time periods
Automatically handle complex variance estimation procedures

Author(s)

Mauro Loprete <mauro.loprete@icloud.com>, Natalia da Silva <natalia.dasilva@fcea.edu.uy>, Fabricio Machado <fabricio.mch.slv@gmail.com>

References

Lumley, T. (2020). "survey: analysis of complex survey samples". R package version 4.0.

Extract downloaded ANDA file (handles ZIP, RAR, CSV, SAV)

Description

Extract downloaded ANDA file (handles ZIP, RAR, CSV, SAV)

Usage

.anda_extract_file(raw_path, dest_dir, label)

Arguments

raw_path

Path to the raw downloaded file

dest_dir

Destination directory

label

Label for naming extracted files

Value

Path to the extracted data file

Find the main data file in an extracted directory

Description

Find the main data file in an extracted directory

Usage

.anda_find_data_file(dir, label)

Arguments

dir

Character path to extracted directory

label

Character label for error messages

Value

Character path to the data file

Parse ANDA download page for resource titles and IDs

Description

Parse ANDA download page for resource titles and IDs

Usage

.anda_parse_resources(html, catalog_id)

Arguments

html

Character HTML content

catalog_id

Integer catalog ID

Value

data.frame with columns: id, title

Select resources matching the requested type

Description

Select resources matching the requested type

Usage

.anda_select_resource(resources, resource, edition)

Arguments

resources

data.frame from .anda_parse_resources

resource

Character resource type

edition

Character edition year

Value

data.frame subset of matching resources

PoolSurvey Class

Description

This class represents a collection of surveys grouped by specific periods (e.g., monthly, quarterly, annual). It provides methods to access and manipulate the grouped surveys.

Value

An object of class PoolSurvey.

Public fields

surveys: A list containing the grouped surveys.

Methods

Method `new()`

Initializes a new instance of the PoolSurvey class.

Usage

PoolSurvey$new(surveys)

Arguments

surveys: A list containing the grouped surveys.

Method `get_surveys()`

Retrieves surveys for a specific period.

Usage

PoolSurvey$get_surveys(period = NULL)

Arguments

period: A string specifying the period to retrieve (e.g., "monthly", "quarterly").

Returns

A list of surveys for the specified period.

Method `print()`

Prints metadata about the PoolSurvey object.

Usage

PoolSurvey$print()

Method `clone()`

The objects of this class are cloneable with this method.

Usage

PoolSurvey$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

s1 <- Survey$new(
  data = data.table::data.table(id = 1:3, w = 1),
  edition = "2023", type = "test", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
s2 <- Survey$new(
  data = data.table::data.table(id = 4:6, w = 1),
  edition = "2023", type = "test", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
pool <- PoolSurvey$new(list(annual = list("group1" = list(s1, s2))))
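
The list passed to PoolSurvey$new() in the example above is nested as period, then group, then a list of surveys. This manual does not show how get_surveys() walks that structure, so the base-R sketch below only illustrates one way a lookup by period could work on such a list. The helper lookup_period() is hypothetical, and character placeholders stand in for Survey objects.

```r
# Placeholders stand in for Survey objects (assumption for illustration).
surveys <- list(
  annual = list(group1 = list("s1", "s2")),
  quarterly = list(q1 = list("s3"), q2 = list("s4"))
)

# Hypothetical helper: return every group for one period, or the whole
# structure when no period is given.
lookup_period <- function(surveys, period = NULL) {
  if (is.null(period)) {
    return(surveys)
  }
  if (!period %in% names(surveys)) {
    stop("Unknown period: ", period)
  }
  surveys[[period]]
}

annual <- lookup_period(surveys, "annual")
names(annual)          # group names within the period
length(annual$group1)  # number of surveys in that group
```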

Recipe R6 class

Description

Recipe R6 class

Format

An R6 class generator (R6ClassGenerator)

Details

R6 class representing a reproducible data transformation recipe for surveys. It encapsulates metadata, declared dependencies, and a list of transformation steps to be applied to a Survey object.

Value

An object of class Recipe.

Methods

$new(name, edition, survey_type, default_engine, depends_on, user, description, steps, id, doi, topic): Class constructor.
$doc(): Auto-generate documentation from recipe steps. Returns a list with metadata, input_variables, output_variables, and pipeline information.
$validate(svy): Validate that a survey object has all required input variables.

Public fields

name: Descriptive name of the recipe (character).
edition: Target edition/period (character or Date).
survey_type: Survey type (character), e.g., "ech", "eaii".
default_engine: Default evaluation engine (character).
depends_on: Vector/list of dependencies declared by the steps.
user: Author/owner (character).
description: Recipe description (character).
id: Unique identifier (character/numeric).
steps: List of step calls that make up the workflow.
doi: DOI or external identifier (character|NULL).
bake: Logical flag indicating whether it has been applied.
topic: Recipe topic (character|NULL).
step_objects: List of Step R6 objects (list|NULL), used for documentation generation.
categories: List of RecipeCategory objects for classification.
downloads: Integer download/usage count.
certification: RecipeCertification object (default community).
user_info: RecipeUser object or NULL.
version: Recipe version string.
depends_on_recipes: List of recipe IDs that must be applied before this one.
data_source: List with S3 bucket info (s3_bucket, s3_prefix, file_pattern, provider) or NULL.
labels: List with variable and value labels (var_labels, val_labels) or NULL.

Methods

Public methods

Recipe$new()
Recipe$increment_downloads()
Recipe$certify()
Recipe$add_category()
Recipe$remove_category()
Recipe$to_list()
Recipe$doc()
Recipe$validate()
Recipe$clone()

Method `new()`

Create a Recipe object

Usage

Recipe$new(
  name,
  edition,
  survey_type,
  default_engine,
  depends_on,
  user,
  description,
  steps,
  id,
  doi = NULL,
  topic = NULL,
  step_objects = NULL,
  cached_doc = NULL,
  categories = list(),
  downloads = 0L,
  certification = NULL,
  user_info = NULL,
  version = "1.0.0",
  depends_on_recipes = list(),
  data_source = NULL,
  labels = NULL
)

Arguments

name: Descriptive name of the recipe (character)
edition: Target edition/period (character or Date)
survey_type: Survey type (character), e.g., "ech", "eaii"
default_engine: Default evaluation engine (character)
depends_on: Vector or list of declared dependencies
user: Author or owner of the recipe (character)
description: Detailed description of the recipe (character)
steps: List of step calls that make up the workflow
id: Unique identifier (character or numeric)
doi: DOI or external identifier (character or NULL)
topic: Recipe topic (character or NULL)
step_objects: List of Step R6 objects (optional, used for doc generation)
cached_doc: Pre-computed documentation (optional, used when loading from JSON)
categories: List of RecipeCategory objects (optional)
downloads: Integer download count (default 0)
certification: RecipeCertification object (optional, default community)
user_info: RecipeUser object (optional)
version: Recipe version string (default "1.0.0")
depends_on_recipes: List of recipe IDs that must be applied before this one (optional)
data_source: List with S3 bucket info (optional)
labels: List with var_labels and val_labels (optional)

Method `increment_downloads()`

Increment the download counter

Usage

Recipe$increment_downloads()

Method `certify()`

Certify the recipe at a given level

Usage

Recipe$certify(user, level)

Arguments

user: RecipeUser who is certifying
level: Character certification level ("reviewed" or "official")

Method `add_category()`

Add a category to the recipe

Usage

Recipe$add_category(category)

Arguments

category: RecipeCategory to add

Method `remove_category()`

Remove a category by name

Usage

Recipe$remove_category(name)

Arguments

name: Character category name to remove

Method `to_list()`

Serialize Recipe to a plain list suitable for JSON/API publishing. Steps are encoded as character strings via deparse().

Usage

Recipe$to_list()

Returns

A named list with all recipe fields.
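
Encoding steps with deparse() turns each unevaluated call into text that can be written to JSON and parsed back later. The sketch below shows that round trip in base R. It is not the package's implementation, and encode_steps() and decode_steps() are hypothetical helper names.

```r
# Two unevaluated step calls. The step names come from this manual, but
# nothing is evaluated here, so metasurvey does not need to be loaded.
steps <- list(
  quote(step_compute(rate = unemployed / labor_force)),
  quote(step_recode("employed", status == 1 ~ 1, .default = 0))
)

# Encode each call as a single string. deparse() may split a long call
# over several lines, so the pieces are pasted back together.
encode_steps <- function(steps) {
  vapply(
    steps,
    function(s) paste(deparse(s), collapse = " "),
    character(1)
  )
}

# Decode the strings back into unevaluated calls.
decode_steps <- function(strings) {
  lapply(strings, function(s) parse(text = s, keep.source = FALSE)[[1]])
}

encoded <- encode_steps(steps)
restored <- decode_steps(encoded)

is.character(encoded)   # the encoded form is plain text
is.call(restored[[1]])  # the decoded form is a call again
```

Because the steps travel as text, a consumer of the serialized recipe has to parse them before they can be applied to a survey.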

Method `doc()`

Auto-generate documentation from recipe steps

Usage

Recipe$doc()

Returns

A list with metadata, input_variables, output_variables, and pipeline information

Method `validate()`

Validate that a survey has all required input variables

Usage

Recipe$validate(svy)

Arguments

svy: A Survey object

Returns

TRUE if valid, otherwise stops with error listing missing variables
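
The check described here amounts to comparing the variables a recipe needs with the columns present in the survey data. A minimal base-R sketch of that comparison follows. The helper check_required_vars() and the toy data are hypothetical, and the error wording is an assumption rather than the package's actual message.

```r
# Hypothetical helper: compare required variable names with the columns
# actually present in the data.
check_required_vars <- function(data, required) {
  missing_vars <- setdiff(required, names(data))
  if (length(missing_vars) > 0) {
    stop(
      "Missing required variables: ",
      paste(missing_vars, collapse = ", "),
      call. = FALSE
    )
  }
  TRUE
}

toy <- data.frame(status = c(1, 2), weight_var = c(10, 12))

# All required variables are present.
ok <- check_required_vars(toy, c("status", "weight_var"))

# A missing variable raises an error that names it.
msg <- tryCatch(
  check_required_vars(toy, c("status", "labor_force")),
  error = function(e) conditionMessage(e)
)
msg
```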

Method `clone()`

The objects of this class are cloneable with this method.

Usage

Recipe$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

# Use the recipe() constructor:
svy <- survey_empty(type = "ech", edition = "2023")
r <- recipe(
  name = "Example", user = "Test", svy = svy,
  description = "Example recipe"
)

RecipeBackend

Description

Backend-agnostic factory for recipe storage and retrieval. Supports "local" (JSON-backed RecipeRegistry) and "api" (remote plumber API) backends.

Public fields

type: Character backend type ("local" or "api").

Methods

Public methods

RecipeBackend$new()
RecipeBackend$publish()
RecipeBackend$search()
RecipeBackend$get()
RecipeBackend$increment_downloads()
RecipeBackend$rank()
RecipeBackend$filter()
RecipeBackend$list_all()
RecipeBackend$save()
RecipeBackend$load()
RecipeBackend$clone()

Method `new()`

Create a new RecipeBackend

Usage

RecipeBackend$new(type, path = NULL)

Arguments

type: Character. "local" or "api".
path: Character. File path for local backend (optional).

Method `publish()`

Publish a recipe to the backend

Usage

RecipeBackend$publish(recipe)

Arguments

recipe: Recipe object

Method `search()`

Search recipes

Usage

RecipeBackend$search(query)

Arguments

query: Character search string

Returns

List of matching Recipe objects

Method `get()`

Get a recipe by id

Usage

RecipeBackend$get(id)

Arguments

id: Recipe id

Returns

Recipe object or NULL

Method `increment_downloads()`

Increment download count for a recipe

Usage

RecipeBackend$increment_downloads(id)

Arguments

id: Recipe id

Method `rank()`

Rank recipes by downloads

Usage

RecipeBackend$rank(n = NULL)

Arguments

n: Integer max to return

Returns

List of Recipe objects

Method `filter()`

Filter recipes by criteria

Usage

RecipeBackend$filter(
  survey_type = NULL,
  edition = NULL,
  category = NULL,
  certification_level = NULL
)

Arguments

survey_type: Character or NULL
edition: Character or NULL
category: Character or NULL
certification_level: Character or NULL

Returns

List of matching Recipe objects
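
Every criterion of filter() defaults to NULL. A natural reading, assumed here rather than confirmed by this manual, is that a NULL criterion places no constraint and the remaining criteria are combined with a logical AND. The base-R sketch below illustrates that reading with plain lists standing in for Recipe objects; filter_recipes() and ids() are hypothetical helpers.

```r
# Plain lists stand in for Recipe objects (assumption).
recipes <- list(
  list(id = "r_001", survey_type = "ech", edition = "2023"),
  list(id = "r_002", survey_type = "ech", edition = "2022"),
  list(id = "r_003", survey_type = "eaii", edition = "2023")
)

# Hypothetical helper: a NULL criterion is treated as "do not filter".
filter_recipes <- function(recipes, survey_type = NULL, edition = NULL) {
  keep <- function(r) {
    (is.null(survey_type) || identical(r$survey_type, survey_type)) &&
      (is.null(edition) || identical(r$edition, edition))
  }
  Filter(keep, recipes)
}

# Small helper to show which recipes survived the filter.
ids <- function(x) vapply(x, function(r) r$id, character(1))

ids(filter_recipes(recipes, survey_type = "ech"))
ids(filter_recipes(recipes, survey_type = "ech", edition = "2023"))
ids(filter_recipes(recipes))
```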

Method `list_all()`

List all recipes

Usage

RecipeBackend$list_all()

Returns

List of Recipe objects

Method `save()`

Save local backend to disk

Usage

RecipeBackend$save()

Method `load()`

Load local backend from disk

Usage

RecipeBackend$load()

Method `clone()`

The objects of this class are cloneable with this method.

Usage

RecipeBackend$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

RecipeCategory

Description

Standardized taxonomy for classifying recipes by domain. Supports hierarchical categories with parent-child relationships.

Value

An object of class RecipeCategory.

Methods

$new(name, description, parent): Constructor for creating a new category
$is_subcategory_of(ancestor_name): Check if this category is a subcategory of another
$get_path(): Get full hierarchical path
$equals(other): Check equality by name
$to_list(): Serialize to list for JSON
$print(...): Print category information
$from_list(lst): Class method to reconstruct from list (see details)

Public fields

name: Character. Category identifier.
description: Character. Human-readable description.
parent: RecipeCategory or NULL. Parent category for hierarchy.

Methods

Public methods

RecipeCategory$new()
RecipeCategory$is_subcategory_of()
RecipeCategory$get_path()
RecipeCategory$equals()
RecipeCategory$to_list()
RecipeCategory$print()
RecipeCategory$from_list()
RecipeCategory$clone()

Method `new()`

Create a new RecipeCategory

Usage

RecipeCategory$new(name, description, parent = NULL)

Arguments

name: Character. Category identifier (non-empty string).
description: Character. Description of the category.
parent: RecipeCategory or NULL. Parent category.

Method `is_subcategory_of()`

Check if this category is a subcategory of another

Usage

RecipeCategory$is_subcategory_of(ancestor_name)

Arguments

ancestor_name: Character. Name of the potential ancestor category.

Returns

Logical

Method `get_path()`

Get full hierarchical path

Usage

RecipeCategory$get_path()

Returns

Character string with slash-separated path

Method `equals()`

Check equality by name

Usage

RecipeCategory$equals(other)

Arguments

other: RecipeCategory to compare with.

Returns

Logical

Method `to_list()`

Serialize to list for JSON

Usage

RecipeCategory$to_list()

Returns

List representation

Method `print()`

Print category

Usage

RecipeCategory$print(...)

Arguments

...: Additional arguments (not used)

Method `from_list()`

Deserialize a RecipeCategory from a list

Usage

RecipeCategory$from_list(lst)

Arguments

lst: List with name, description, parent fields, or NULL

Returns

RecipeCategory object or NULL

Method `clone()`

The objects of this class are cloneable with this method.

Usage

RecipeCategory$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

# Use recipe_category() for the public API:
cat <- recipe_category(
  "economics", "Economic indicators"
)
sub <- recipe_category(
  "labor_market", "Labor market",
  parent = "economics"
)
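
The hierarchy behind get_path() and is_subcategory_of() can be pictured as a chain of parent links. The base-R sketch below walks such a chain using plain lists in place of RecipeCategory objects. The helper names are hypothetical, and the root-first ordering of the path is an assumption; this manual only states that the path is slash-separated.

```r
# Plain lists stand in for RecipeCategory objects (assumption).
new_category <- function(name, description, parent = NULL) {
  list(name = name, description = description, parent = parent)
}

# Walk up the parent chain and join the names with slashes, root first.
category_path <- function(category) {
  parts <- character(0)
  node <- category
  while (!is.null(node)) {
    parts <- c(node$name, parts)
    node <- node$parent
  }
  paste(parts, collapse = "/")
}

# TRUE when any ancestor (not the category itself) has the given name.
has_ancestor <- function(category, ancestor_name) {
  node <- category$parent
  while (!is.null(node)) {
    if (identical(node$name, ancestor_name)) {
      return(TRUE)
    }
    node <- node$parent
  }
  FALSE
}

economics <- new_category("economics", "Economic indicators")
labor <- new_category("labor_market", "Labor market", parent = economics)

category_path(labor)
has_ancestor(labor, "economics")
has_ancestor(economics, "labor_market")
```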

Deserialize a RecipeCategory from a list

Description

Class method to reconstruct a RecipeCategory from its list representation.

Arguments

lst

List with name, description, parent fields, or NULL.

Value

RecipeCategory object or NULL

RecipeCertification

Description

Quality certification for recipes with three tiers: community (default), reviewed (peer-reviewed by institutional member), and official (certified by institution).

Value

An object of class RecipeCertification.

Public fields

level: Character. Certification level.
certified_by: RecipeUser or NULL. The certifying user.
certified_at: POSIXct. Timestamp of certification.
notes: Character or NULL. Additional notes.

Methods

Public methods

RecipeCertification$new()
RecipeCertification$numeric_level()
RecipeCertification$is_at_least()
RecipeCertification$to_list()
RecipeCertification$print()
RecipeCertification$clone()

Method `new()`

Create a new RecipeCertification

Usage

RecipeCertification$new(
  level,
  certified_by = NULL,
  notes = NULL,
  certified_at = NULL
)

Arguments

level: Character. One of "community", "reviewed", "official".
certified_by: RecipeUser or NULL. Required for reviewed/official.
notes: Character or NULL. Additional notes.
certified_at: POSIXct or NULL. Auto-set if NULL.

Method `numeric_level()`

Get numeric level for ordering (1=community, 2=reviewed, 3=official)

Usage

RecipeCertification$numeric_level()

Returns

Integer

Method `is_at_least()`

Check if certification is at least a given level

Usage

RecipeCertification$is_at_least(level)

Arguments

level: Character. Level to compare against.

Returns

Logical

Method `to_list()`

Serialize to list for JSON

Usage

RecipeCertification$to_list()

Returns

List representation

Method `print()`

Print certification badge

Usage

RecipeCertification$print(...)

Arguments

...: Additional arguments (not used)

Method `clone()`

The objects of this class are cloneable with this method.

Usage

RecipeCertification$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

# Use recipe_certification() for the public API:
cert <- recipe_certification()
inst <- recipe_user("IECON", type = "institution")
official <- recipe_certification("official", certified_by = inst)
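
The ordering used by numeric_level() and is_at_least() (1 = community, 2 = reviewed, 3 = official) can be illustrated without the package. The sketch below is a base-R stand-in, not the class's actual code; level_rank() and meets_level() are hypothetical names.

```r
# Ordering taken from the numeric_level() description in this manual.
cert_levels <- c(community = 1L, reviewed = 2L, official = 3L)

# Hypothetical counterpart of numeric_level().
level_rank <- function(level) {
  if (!level %in% names(cert_levels)) {
    stop("Unknown certification level: ", level)
  }
  cert_levels[[level]]
}

# Hypothetical counterpart of is_at_least().
meets_level <- function(current, required) {
  level_rank(current) >= level_rank(required)
}

meets_level("official", "reviewed")
meets_level("community", "reviewed")
meets_level("reviewed", "reviewed")
```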

RecipeRegistry

Description

Local JSON-backed catalog for recipe discovery, ranking, and filtering.

Methods

Public methods

RecipeRegistry$new()
RecipeRegistry$register()
RecipeRegistry$unregister()
RecipeRegistry$search()
RecipeRegistry$filter()
RecipeRegistry$rank_by_downloads()
RecipeRegistry$rank_by_certification()
RecipeRegistry$get()
RecipeRegistry$list_all()
RecipeRegistry$save()
RecipeRegistry$load()
RecipeRegistry$list_by_user()
RecipeRegistry$list_by_institution()
RecipeRegistry$stats()
RecipeRegistry$print()
RecipeRegistry$clone()

Method `new()`

Create a new empty RecipeRegistry

Usage

RecipeRegistry$new()

Method `register()`

Register a recipe in the catalog

Usage

RecipeRegistry$register(recipe)

Arguments

recipe: Recipe object to register

Method `unregister()`

Remove a recipe from the catalog by id

Usage

RecipeRegistry$unregister(recipe_id)

Arguments

recipe_id: Recipe id to remove

Method `search()`

Search recipes by name or description (case-insensitive)

Usage

RecipeRegistry$search(query)

Arguments

query: Character search query

Returns

List of matching Recipe objects
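
A case-insensitive search over name and description can be sketched in base R by lower-casing both the query and the fields before matching. Whether the registry matches substrings literally or as regular expressions is not stated in this manual; the sketch assumes literal substring matching. search_recipes() is a hypothetical helper and plain lists stand in for Recipe objects.

```r
# Plain lists stand in for Recipe objects (assumption).
recipes <- list(
  list(id = "r_001", name = "Labor market", description = "Unemployment rate"),
  list(id = "r_002", name = "Income", description = "Household income"),
  list(id = "r_003", name = "Poverty", description = "Uses LABOR income")
)

# Hypothetical helper: literal, case-insensitive match on either field.
search_recipes <- function(recipes, query) {
  q <- tolower(query)
  hit <- function(r) {
    grepl(q, tolower(r$name), fixed = TRUE) ||
      grepl(q, tolower(r$description), fixed = TRUE)
  }
  Filter(hit, recipes)
}

found <- search_recipes(recipes, "labor")
vapply(found, function(r) r$id, character(1))
```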

Method `filter()`

Filter recipes by criteria

Usage

RecipeRegistry$filter(
  survey_type = NULL,
  edition = NULL,
  category = NULL,
  certification_level = NULL
)

Arguments

survey_type: Character survey type or NULL
edition: Character edition or NULL
category: Character category name or NULL
certification_level: Character certification level or NULL

Returns

List of matching Recipe objects

Method `rank_by_downloads()`

Rank recipes by download count (descending)

Usage

RecipeRegistry$rank_by_downloads(n = NULL)

Arguments

n: Integer max number to return, or NULL for all

Returns

List of Recipe objects sorted by downloads

Method `rank_by_certification()`

Rank recipes by certification level then downloads

Usage

RecipeRegistry$rank_by_certification(n = NULL)

Arguments

n: Integer max number to return, or NULL for all

Returns

List of Recipe objects sorted by cert level then downloads
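
Ranking by certification level and then by downloads is a two-key sort. The base-R sketch below assumes both keys are sorted in descending order (highest certification first, ties broken by most downloads), which this manual does not state explicitly. A data.frame stands in for the list of Recipe objects, and rank_recipes() is a hypothetical helper.

```r
# Level ordering taken from RecipeCertification$numeric_level().
cert_levels <- c(community = 1L, reviewed = 2L, official = 3L)

# A data.frame stands in for a list of Recipe objects (assumption).
recipes <- data.frame(
  id = c("r_001", "r_002", "r_003", "r_004"),
  certification = c("community", "official", "reviewed", "official"),
  downloads = c(120L, 5L, 40L, 30L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sort by level, then downloads, both descending.
rank_recipes <- function(recipes, n = NULL) {
  lvl <- unname(cert_levels[recipes$certification])
  ranked <- recipes[order(-lvl, -recipes$downloads), , drop = FALSE]
  if (!is.null(n)) {
    ranked <- head(ranked, n)
  }
  ranked
}

rank_recipes(recipes)$id
rank_recipes(recipes, n = 2)$id
```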

Method `get()`

Get a single recipe by id

Usage

RecipeRegistry$get(recipe_id)

Arguments

recipe_id: Recipe id

Returns

Recipe object or NULL

Method `list_all()`

List all registered recipes

Usage

RecipeRegistry$list_all()

Returns

List of all Recipe objects

Method `save()`

Save the registry catalog to a JSON file

Usage

RecipeRegistry$save(path)

Arguments

path: Character file path

Method `load()`

Load a registry catalog from a JSON file

Usage

RecipeRegistry$load(path)

Arguments

path: Character file path

Method `list_by_user()`

List recipes by author user name

Usage

RecipeRegistry$list_by_user(user_name)

Arguments

user_name: Character user name

Returns

List of matching Recipe objects

Method `list_by_institution()`

List recipes by institution (including members)

Usage

RecipeRegistry$list_by_institution(institution_name)

Arguments

institution_name: Character institution name

Returns

List of matching Recipe objects

Method `stats()`

Get registry statistics

Usage

RecipeRegistry$stats()

Returns

List with total, by_category, by_certification counts

Method `print()`

Print registry summary

Usage

RecipeRegistry$print(...)

Arguments

...: Additional arguments (not used)

Method `clone()`

The objects of this class are cloneable with this method.

Usage

RecipeRegistry$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

RecipeUser

Description

User identity for the recipe ecosystem. Supports three account types: individual, institutional_member, and institution.

Value

An object of class RecipeUser.

Public fields

name: Character. User or institution name.
email: Character or NULL. Email address.
user_type: Character. One of "individual", "institutional_member", "institution".
affiliation: Character or NULL. Organizational affiliation.
institution: RecipeUser or NULL. Parent institution (for institutional_member).
url: Character or NULL. Institution URL.
verified: Logical. Whether the account is verified.
review_status: Character. One of "approved", "pending", "rejected".

Methods

Public methods

RecipeUser$new()
RecipeUser$trust_level()
RecipeUser$can_certify()
RecipeUser$to_list()
RecipeUser$print()
RecipeUser$clone()

Method `new()`

Create a new RecipeUser

Usage

RecipeUser$new(
  name,
  user_type,
  email = NULL,
  affiliation = NULL,
  institution = NULL,
  url = NULL,
  verified = FALSE,
  review_status = "approved"
)

Arguments

name: Character. User or institution name.
user_type: Character. One of "individual", "institutional_member", "institution".
email: Character or NULL. Email address.
affiliation: Character or NULL. Organizational affiliation.
institution: RecipeUser or NULL. Parent institution for institutional_member.
url: Character or NULL. Institution URL.
verified: Logical. Whether account is verified.
review_status: Character. "approved", "pending", or "rejected".

Method `trust_level()`

Get trust level (1 = individual, 2 = institutional_member, 3 = institution)

Usage

RecipeUser$trust_level()

Returns

Integer trust level

Method `can_certify()`

Check if user can certify at a given level

Usage

RecipeUser$can_certify(level)

Arguments

level: Character. Certification level ("reviewed" or "official").

Returns

Logical

Method `to_list()`

Serialize to list for JSON

Usage

RecipeUser$to_list()

Returns

List representation

Method `print()`

Print user card

Usage

RecipeUser$print(...)

Arguments

...: Additional arguments (not used)

Method `clone()`

The objects of this class are cloneable with this method.

Usage

RecipeUser$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

# Use recipe_user() for the public API:
user <- recipe_user("Juan Perez", email = "juan@example.com")
inst <- recipe_user("IECON", type = "institution")
member <- recipe_user(
  "Maria",
  type = "institutional_member",
  institution = inst
)
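
The three account types map to the trust levels returned by trust_level(). How can_certify() uses them is not spelled out in this manual. The base-R sketch below assumes that "reviewed" needs at least an institutional member and "official" needs an institution, following the tier descriptions under RecipeCertification. All helper names are hypothetical.

```r
# Trust levels as described for RecipeUser$trust_level().
trust_levels <- c(individual = 1L, institutional_member = 2L, institution = 3L)

# Hypothetical counterpart of trust_level().
user_trust <- function(user_type) {
  if (!user_type %in% names(trust_levels)) {
    stop("Unknown user type: ", user_type)
  }
  trust_levels[[user_type]]
}

# ASSUMPTION: minimum trust needed per certification level.
required_trust <- c(reviewed = 2L, official = 3L)

# Hypothetical counterpart of can_certify().
may_certify <- function(user_type, level) {
  if (!level %in% names(required_trust)) {
    return(FALSE)
  }
  user_trust(user_type) >= required_trust[[level]]
}

may_certify("individual", "reviewed")
may_certify("institutional_member", "reviewed")
may_certify("institutional_member", "official")
may_certify("institution", "official")
```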

RecipeWorkflow R6 class

Description

RecipeWorkflow R6 class

Format

An R6 class generator (R6ClassGenerator)

Details

R6 class representing a publishable workflow that captures statistical estimations applied to survey data. Workflows reference the recipes they use and document the estimation calls made.

Value

An object of class RecipeWorkflow.

Methods

$new(...): Class constructor.
$doc(): Generate documentation for the workflow.
$to_list(): Serialize to a plain list for JSON export.
$increment_downloads(): Increment the download counter.
$add_category(category): Add a category.
$certify(user, level): Certify the workflow.

Public fields

id: Unique identifier (character).
name: Descriptive name (character).
description: Workflow description (character).
user: Author/owner (character).
user_info: RecipeUser object or NULL.
survey_type: Survey type (character).
edition: Survey edition (character).
estimation_type: Character vector of estimation types used.
recipe_ids: Character vector of recipe IDs referenced.
calls: List of deparsed call strings.
call_metadata: List of lists with type, formula, by, description fields.
categories: List of RecipeCategory objects.
downloads: Integer download count.
certification: RecipeCertification object.
version: Version string.
doi: DOI or external identifier (character|NULL).
created_at: Creation timestamp (character).
weight_spec: Named list with weight configuration per periodicity (list|NULL).

Methods

Public methods

RecipeWorkflow$new()
RecipeWorkflow$doc()
RecipeWorkflow$to_list()
RecipeWorkflow$increment_downloads()
RecipeWorkflow$certify()
RecipeWorkflow$add_category()
RecipeWorkflow$remove_category()
RecipeWorkflow$clone()

Method `new()`

Create a RecipeWorkflow object

Usage

RecipeWorkflow$new(
  id = NULL,
  name,
  description = "",
  user = "Unknown",
  user_info = NULL,
  survey_type = "Unknown",
  edition = "Unknown",
  estimation_type = character(0),
  recipe_ids = character(0),
  calls = list(),
  call_metadata = list(),
  categories = list(),
  downloads = 0L,
  certification = NULL,
  version = "1.0.0",
  doi = NULL,
  created_at = NULL,
  weight_spec = NULL
)

Arguments

id: Unique identifier
name: Descriptive name
description: Workflow description
user: Author name
user_info: RecipeUser object or NULL
survey_type: Survey type
edition: Survey edition
estimation_type: Character vector of estimation types
recipe_ids: Character vector of recipe IDs
calls: List of deparsed call strings
call_metadata: List of call metadata lists
categories: List of RecipeCategory objects
downloads: Integer download count
certification: RecipeCertification or NULL
version: Version string
doi: DOI or NULL
created_at: Timestamp string or NULL (auto-generated)
weight_spec: Named list with weight configuration per periodicity

Method `doc()`

Generate documentation for this workflow

Usage

RecipeWorkflow$doc()

Returns

List with meta, recipe_ids, estimations, and estimation_types

Method `to_list()`

Serialize to a plain list for JSON export

Usage

RecipeWorkflow$to_list()

Returns

A list suitable for jsonlite::write_json

Method `increment_downloads()`

Increment the download counter

Usage

RecipeWorkflow$increment_downloads()

Method `certify()`

Certify the workflow at a given level

Usage

RecipeWorkflow$certify(user, level)

Arguments

user: RecipeUser who is certifying
level: Character certification level

Method `add_category()`

Add a category to the workflow

Usage

RecipeWorkflow$add_category(category)

Arguments

category: RecipeCategory to add

Method `remove_category()`

Remove a category by name

Usage

RecipeWorkflow$remove_category(name)

Arguments

name: Character category name to remove

Method `clone()`

The objects of this class are cloneable with this method.

Usage

RecipeWorkflow$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

wf <- RecipeWorkflow$new(
  name = "Labor workflow", description = "Unemployment rate",
  user = "test", survey_type = "ech", edition = "2023",
  estimation_type = "annual", recipe_ids = "r_001",
  calls = list("svymean(~desocupado, na.rm = TRUE)")
)
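
Because calls are stored as deparsed strings, fields such as type and formula in call_metadata can in principle be recovered by parsing the string. The base-R sketch below does this for the call used in the example above. It is an illustration, not the package's parser, and describe_call() is a hypothetical helper that fills only two of the documented metadata fields.

```r
call_string <- "svymean(~desocupado, na.rm = TRUE)"

# Hypothetical helper: parse the string without evaluating it, then read
# the function name and the first formula argument.
describe_call <- function(call_string) {
  cl <- parse(text = call_string, keep.source = FALSE)[[1]]
  args <- as.list(cl)[-1]
  is_formula <- vapply(
    args,
    function(a) is.call(a) && identical(a[[1]], as.name("~")),
    logical(1)
  )
  formula <- NA_character_
  if (any(is_formula)) {
    formula <- deparse(args[[which(is_formula)[1]]])
  }
  list(type = deparse(cl[[1]]), formula = formula)
}

meta <- describe_call(call_string)
meta$type
meta$formula
```

Nothing in the string is evaluated, so the sketch runs without the survey package and without a variable named desocupado.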

RotativePanelSurvey Class

Description

This class represents a rotating panel survey, which consists of an implantation survey and one or more follow-up surveys. It provides methods to access and manipulate survey data, steps, recipes, workflows, and designs.

Value

An object of class RotativePanelSurvey.

Public fields

implantation: A survey object representing the implantation survey.
follow_up: A list of survey objects representing the follow-up surveys.
type: A string indicating the type of the survey.
default_engine: A string specifying the default engine used for processing.
steps: A list of steps applied to the survey.
recipes: A list of recipes associated with the survey.
workflows: A list of workflows associated with the survey.
design: A design object for the survey.
periodicity: A list containing the periodicity of the implantation and follow-up surveys.

Methods

Public methods

RotativePanelSurvey$new()
RotativePanelSurvey$get_implantation()
RotativePanelSurvey$get_follow_up()
RotativePanelSurvey$get_type()
RotativePanelSurvey$get_default_engine()
RotativePanelSurvey$get_steps()
RotativePanelSurvey$get_recipes()
RotativePanelSurvey$get_workflows()
RotativePanelSurvey$get_design()
RotativePanelSurvey$print()
RotativePanelSurvey$clone()

Method `new()`

Initializes a new instance of the RotativePanelSurvey class.

Usage

RotativePanelSurvey$new(
  implantation,
  follow_up,
  type,
  default_engine,
  steps,
  recipes,
  workflows,
  design
)

Arguments

implantation: A survey object representing the implantation survey.
follow_up: A list of survey objects representing the follow-up surveys.
type: A string indicating the type of the survey.
default_engine: A string specifying the default engine used for processing.
steps: A list of steps applied to the survey.
recipes: A list of recipes associated with the survey.
workflows: A list of workflows associated with the survey.
design: A design object for the survey.

Method `get_implantation()`

Retrieves the implantation survey.

Usage

RotativePanelSurvey$get_implantation()

Returns

A survey object representing the implantation survey.

Method `get_follow_up()`

Retrieves the follow-up surveys.

Usage

RotativePanelSurvey$get_follow_up(
  index = length(self$follow_up),
  monthly = NULL,
  quarterly = NULL,
  semiannual = NULL,
  annual = NULL
)

Arguments

index: An integer specifying the index of the follow-up survey to retrieve.
monthly: A vector of integers specifying monthly intervals.
quarterly: A vector of integers specifying quarterly intervals.
semiannual: A vector of integers specifying semiannual intervals.
annual: A vector of integers specifying annual intervals.

Returns

A list of follow-up surveys matching the specified criteria.

Method `get_type()`

Retrieves the type of the survey.

Usage

RotativePanelSurvey$get_type()

Returns

A string indicating the type of the survey.

Method `get_default_engine()`

Retrieves the default engine used for processing.

Usage

RotativePanelSurvey$get_default_engine()

Returns

A string specifying the default engine.

Method `get_steps()`

Retrieves the steps applied to the survey.

Usage

RotativePanelSurvey$get_steps()

Returns

A list containing the steps for the implantation and follow-up surveys.

Method `get_recipes()`

Retrieves the recipes associated with the survey.

Usage

RotativePanelSurvey$get_recipes()

Returns

A list of recipes.

Method `get_workflows()`

Retrieves the workflows associated with the survey.

Usage

RotativePanelSurvey$get_workflows()

Returns

A list of workflows.

Method `get_design()`

Retrieves the design object for the survey.

Usage

RotativePanelSurvey$get_design()

Returns

A design object.

Method `print()`

Prints metadata about the RotativePanelSurvey object.

Usage

RotativePanelSurvey$print()

Method `clone()`

The objects of this class are cloneable with this method.

Usage

RotativePanelSurvey$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

impl <- Survey$new(
  data = data.table::data.table(id = 1:5, w = 1),
  edition = "2023", type = "ech", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
fu1 <- Survey$new(
  data = data.table::data.table(id = 1:5, w = 1),
  edition = "2023_01", type = "ech", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
panel <- RotativePanelSurvey$new(
  implantation = impl, follow_up = list(fu1),
  type = "ech", default_engine = "data.table",
  steps = list(), recipes = list(), workflows = list(), design = NULL
)

Step Class: Represents a step in a survey workflow

Description

The Step class is used to define and manage individual steps in a survey workflow. Each step can include operations such as recoding variables, computing new variables, or validating dependencies.

Details

The Step class is part of the survey workflow system and is designed to encapsulate all the information and operations required for a single step in the workflow. Steps can be chained together to form a complete workflow.

Public fields

name: The name of the step.
edition: The edition of the survey associated with the step.
survey_type: The type of survey associated with the step.
type: The type of operation performed by the step (e.g., "compute", "recode").
new_var: The name of the new variable created by the step, if applicable.
exprs: A list of expressions defining the step's operations.
call: The function call associated with the step.
svy_before: Deprecated. Always NULL to prevent memory retention chains. Kept for backwards compatibility.
default_engine: The default engine used for processing the step.
depends_on: A list of variables that the step depends on.
comment: Comments or notes about the step.
bake: A logical value indicating whether the step has been executed.

Methods

Public methods

Step$new()
Step$clone()

Method `new()`

Create a new Step object

Usage

Step$new(
  name,
  edition,
  survey_type,
  type,
  new_var,
  exprs,
  call,
  svy_before,
  default_engine,
  depends_on,
  comment = NULL,
  bake = !lazy_default(),
  comments = NULL
)

Arguments

name: The name of the step.
edition: The edition of the survey associated with the step.
survey_type: The type of survey associated with the step.
type: The type of operation performed by the step (e.g., "compute" or "recode").
new_var: The name of the new variable created by the step, if applicable.
exprs: A list of expressions defining the step's operations.
call: The function call associated with the step.
svy_before: Deprecated. Ignored (always set to NULL) to prevent memory retention chains.
default_engine: The default engine used for processing the step.
depends_on: A list of variables that the step depends on.
comment: Comments or notes about the step.
bake: A logical value indicating whether the step has been executed.
comments: Deprecated. Use comment instead.

Method `clone()`

The objects of this class are cloneable with this method.

Usage

Step$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Survey R6 class

Description

Survey R6 class

Format

An R6 class generator (R6ClassGenerator)

Details

R6 class that encapsulates survey data, metadata (type, edition, periodicity), sampling design (simple/replicate), steps/recipes/workflows, and utilities to manage them.

shallow_clone() copies only the data and reuses the design objects and other metadata. It is much faster than clone(deep = TRUE), but the design objects are shared between the original and the copy.
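The trade-off between copied data and shared design can be illustrated with a minimal base R sketch. It does not reproduce the package's implementation: the environments orig and shallow are hypothetical stand-ins for Survey objects, and a plain environment stands in for a design object.

```r
# Hypothetical stand-ins: environments have reference semantics, like R6 objects.
orig <- new.env()
orig$data <- data.frame(id = 1:3)
orig$design <- new.env()
orig$design$label <- "design A"

# "Shallow" copy: data frames have value semantics, so the data behaves as a
# copy, while the design environment is the very same object in both.
shallow <- new.env()
shallow$data <- orig$data
shallow$design <- orig$design

# Changing the copy's data leaves the original untouched.
d <- shallow$data
d$id[1] <- 99L
shallow$data <- d

# Changing the shared design is visible from both objects.
shallow$design$label <- "design B"

print(orig$data$id[1])
print(orig$design$label)
```

In this sketch the original keeps its first id, but its design label changes, which is the sharing behaviour to keep in mind before modifying a design reached through a shallow copy.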

Value

R6 class generator for Survey.

Main methods

$new(data, edition, type, psu, strata, engine, weight, design = NULL, steps = NULL, recipes = list()): Constructor.
$set_data(data): Set data.
$set_edition(edition): Set edition.
$set_type(type): Set type.
$set_weight(weight): Set weight specification.
$print(): Print summarized metadata.
$add_step(step): Add a step and invalidate design.
$add_recipe(recipe): Add a recipe (validates type and dependencies).
$add_workflow(workflow): Add a workflow.
$bake(): Apply recipes and return updated Survey.
$ensure_design(): Lazily initialize the sampling design.
$update_design(): Update design variables with current data.
$shallow_clone(): Efficient copy (shares design, copies data).

Public fields

data: Survey data (data.frame/data.table).
edition: Reference edition or period.
type: Survey type, e.g., "ech" (character).
periodicity: Periodicity detected by validate_time_pattern.
default_engine: Default engine (character).
weight: List with weight specifications per estimation type.
steps: List of steps applied to the survey.
recipes: List of Recipe objects associated.
workflows: List of workflows.
design: List of survey design objects (survey/surveyrep).
psu: Primary Sampling Unit specification (formula or character).
strata: Stratification variable name (character or NULL).
design_initialized: Logical flag for lazy design initialization.
provenance: Data lineage metadata (see provenance()).

Active bindings

design_active: Deprecated. Use ensure_design() instead.

Methods

Public methods

Survey$new()
Survey$get_data()
Survey$get_edition()
Survey$get_type()
Survey$set_data()
Survey$set_edition()
Survey$set_type()
Survey$set_weight()
Survey$print()
Survey$add_step()
Survey$add_recipe()
Survey$add_workflow()
Survey$bake()
Survey$head()
Survey$str()
Survey$set_design()
Survey$ensure_design()
Survey$update_design()
Survey$shallow_clone()
Survey$clone()

Method `new()`

Create a Survey object

Usage

Survey$new(
  data,
  edition,
  type,
  psu = NULL,
  strata = NULL,
  engine,
  weight,
  design = NULL,
  steps = NULL,
  recipes = list()
)

Arguments

data: Survey data
edition: Edition or period
type: Survey type (character)
psu: PSU variable or formula (optional)
strata: Stratification variable name (optional)
engine: Default engine
weight: Weight specification(s) per estimation type
design: Pre-built design (optional)
steps: Initial steps list (optional)
recipes: List of Recipe (optional)

Method `get_data()`

Return the underlying data

Usage

Survey$get_data()

Method `get_edition()`

Return the survey edition/period

Usage

Survey$get_edition()

Method `get_type()`

Return the survey type identifier

Usage

Survey$get_type()

Method `set_data()`

Set the underlying data

Usage

Survey$set_data(data)

Arguments

data: New survey data

Method `set_edition()`

Set the survey edition/period

Usage

Survey$set_edition(edition)

Arguments

edition: New edition or period

Method `set_type()`

Set the survey type

Usage

Survey$set_type(type)

Arguments

type: New type identifier

Method `set_weight()`

Set weight specification(s) per estimation type

Usage

Survey$set_weight(weight)

Arguments

weight: Weight specification list or character

Method `print()`

Print summarized metadata to console

Usage

Survey$print()

Method `add_step()`

Add a step and invalidate design

Usage

Survey$add_step(step)

Arguments

step: Step object

Method `add_recipe()`

Add a recipe

Usage

Survey$add_recipe(recipe, bake = lazy_default())

Arguments

recipe: Recipe object
bake: Whether to bake lazily (internal flag)

Method `add_workflow()`

Add a workflow to the survey

Usage

Survey$add_workflow(workflow)

Arguments

workflow: Workflow object

Method `bake()`

Apply recipes and return updated Survey

Usage

Survey$bake()

Method `head()`

Return the head of the underlying data

Usage

Survey$head()

Method `str()`

Display the structure of the underlying data

Usage

Survey$str()

Method `set_design()`

Set the survey design object

Usage

Survey$set_design(design)

Arguments

design: Survey design object or list

Method `ensure_design()`

Ensure survey design is initialized (lazy initialization)

Usage

Survey$ensure_design()

Returns

Invisibly returns self
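
The interplay between add_step() (which invalidates the design), ensure_design(), and the design_initialized flag follows a lazy-initialization pattern. The base R sketch below illustrates that pattern only; it is not the package's code, and make_lazy_survey, build_count, and the placeholder design string are hypothetical.

```r
make_lazy_survey <- function() {
  state <- new.env()
  state$steps <- list()
  state$design <- NULL
  state$design_initialized <- FALSE
  state$build_count <- 0L  # hypothetical counter, only to make rebuilds visible

  add_step <- function(step) {
    state$steps <- c(state$steps, list(step))
    state$design_initialized <- FALSE  # any new step invalidates the design
    invisible(state)
  }

  ensure_design <- function() {
    if (!state$design_initialized) {
      # Placeholder for an expensive design construction
      state$design <- paste("design built after", length(state$steps), "step(s)")
      state$build_count <- state$build_count + 1L
      state$design_initialized <- TRUE
    }
    invisible(state)
  }

  list(state = state, add_step = add_step, ensure_design = ensure_design)
}

svy <- make_lazy_survey()
svy$ensure_design()          # builds the design
svy$ensure_design()          # no-op: already initialized
svy$add_step("compute x2")   # invalidates the design
svy$ensure_design()          # rebuilds once
print(svy$state$build_count)
```

The design is built only when it is needed and only rebuilt after something has changed, which is the motivation for deferring it.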

Method `update_design()`

Update design variables using current data and weight

Usage

Survey$update_design()

Method `shallow_clone()`

Create a shallow copy of the Survey (optimized for performance)

Usage

Survey$shallow_clone()

Returns

New Survey object with copied data but shared design

Method `clone()`

The objects of this class are cloneable with this method.

Usage

Survey$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Examples

dt <- data.table::data.table(id = 1:5, x = rnorm(5), w = rep(1, 5))
svy <- Survey$new(
  data = dt, edition = "2023", type = "test",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
svy

WorkflowBackend

Description

Backend-agnostic factory for workflow storage and retrieval. Supports "local" (JSON-backed WorkflowRegistry) and "api" (remote plumber API) backends.

Public fields

type: Character backend type ("local" or "api").

Methods

Public methods

WorkflowBackend$new()
WorkflowBackend$publish()
WorkflowBackend$search()
WorkflowBackend$get()
WorkflowBackend$increment_downloads()
WorkflowBackend$find_by_recipe()
WorkflowBackend$rank()
WorkflowBackend$filter()
WorkflowBackend$list_all()
WorkflowBackend$save()
WorkflowBackend$load()
WorkflowBackend$clone()

Method `new()`

Create a new WorkflowBackend

Usage

WorkflowBackend$new(type, path = NULL)

Arguments

type: Character. "local" or "api".
path: Character. File path for local backend (optional).

Method `publish()`

Publish a workflow to the backend

Usage

WorkflowBackend$publish(wf)

Arguments

wf: RecipeWorkflow object

Method `search()`

Search workflows

Usage

WorkflowBackend$search(query)

Arguments

query: Character search string

Returns

List of matching RecipeWorkflow objects

Method `get()`

Get a workflow by id

Usage

WorkflowBackend$get(id)

Arguments

id: Workflow id

Returns

RecipeWorkflow object or NULL

Method `increment_downloads()`

Increment download count for a workflow

Usage

WorkflowBackend$increment_downloads(id)

Arguments

id: Workflow id

Method `find_by_recipe()`

Find workflows that reference a specific recipe

Usage

WorkflowBackend$find_by_recipe(recipe_id)

Arguments

recipe_id: Character recipe ID

Returns

List of RecipeWorkflow objects

Method `rank()`

Rank workflows by downloads

Usage

WorkflowBackend$rank(n = NULL)

Arguments

n: Integer max to return

Returns

List of RecipeWorkflow objects

Method `filter()`

Filter workflows by criteria

Usage

WorkflowBackend$filter(
  survey_type = NULL,
  edition = NULL,
  recipe_id = NULL,
  certification_level = NULL
)

Arguments

survey_type: Character or NULL
edition: Character or NULL
recipe_id: Character or NULL
certification_level: Character or NULL

Returns

List of matching RecipeWorkflow objects

Method `list_all()`

List all workflows

Usage

WorkflowBackend$list_all()

Returns

List of RecipeWorkflow objects

Method `save()`

Save local backend to disk

Usage

WorkflowBackend$save()

Method `load()`

Load local backend from disk

Usage

WorkflowBackend$load()

Method `clone()`

The objects of this class are cloneable with this method.

Usage

WorkflowBackend$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

WorkflowRegistry

Description

Local JSON-backed catalog for workflow discovery, ranking, and filtering.

Methods

Public methods

WorkflowRegistry$new()
WorkflowRegistry$register()
WorkflowRegistry$unregister()
WorkflowRegistry$search()
WorkflowRegistry$filter()
WorkflowRegistry$find_by_recipe()
WorkflowRegistry$rank_by_downloads()
WorkflowRegistry$get()
WorkflowRegistry$list_all()
WorkflowRegistry$save()
WorkflowRegistry$load()
WorkflowRegistry$stats()
WorkflowRegistry$print()
WorkflowRegistry$clone()

Method `new()`

Create a new empty WorkflowRegistry

Usage

WorkflowRegistry$new()

Method `register()`

Register a workflow in the catalog

Usage

WorkflowRegistry$register(wf)

Arguments

wf: RecipeWorkflow object to register

Method `unregister()`

Remove a workflow from the catalog by id

Usage

WorkflowRegistry$unregister(workflow_id)

Arguments

workflow_id: Workflow id to remove

Method `search()`

Search workflows by name or description (case-insensitive)

Usage

WorkflowRegistry$search(query)

Arguments

query: Character search query

Returns

List of matching RecipeWorkflow objects
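
A case-insensitive search over name and description can be sketched in base R as follows. The catalog data frame and search_catalog() are hypothetical, and treating the query as a literal substring is an assumption; the registry may match differently.

```r
# Hypothetical catalog entries, for illustration only
catalog <- data.frame(
  id = c("w_1", "w_2", "w_3"),
  name = c("Labor Market Rates", "Income quintiles", "labor force by region"),
  description = c("Employment indicators", "Household income", "Regional breakdown"),
  stringsAsFactors = FALSE
)

search_catalog <- function(catalog, query) {
  q <- tolower(query)
  # Lower-case both sides so the match ignores case
  hit <- grepl(q, tolower(catalog$name), fixed = TRUE) |
    grepl(q, tolower(catalog$description), fixed = TRUE)
  catalog[hit, , drop = FALSE]
}

res <- search_catalog(catalog, "LABOR")
print(res$id)
```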

Method `filter()`

Filter workflows by criteria

Usage

WorkflowRegistry$filter(
  survey_type = NULL,
  edition = NULL,
  recipe_id = NULL,
  certification_level = NULL
)

Arguments

survey_type: Character survey type or NULL
edition: Character edition or NULL
recipe_id: Character recipe ID or NULL (find workflows using this recipe)
certification_level: Character certification level or NULL

Returns

List of matching RecipeWorkflow objects

Method `find_by_recipe()`

Find workflows that reference a specific recipe

Usage

WorkflowRegistry$find_by_recipe(recipe_id)

Arguments

recipe_id: Character recipe ID

Returns

List of RecipeWorkflow objects referencing this recipe

Method `rank_by_downloads()`

Rank workflows by download count (descending)

Usage

WorkflowRegistry$rank_by_downloads(n = NULL)

Arguments

n: Integer max number to return, or NULL for all

Returns

List of RecipeWorkflow objects sorted by downloads
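
The ranking rule (descending download count, optionally truncated to n) can be sketched in base R. The download counts and rank_ids() are hypothetical and stand in for the registry's internal state.

```r
# Hypothetical download counts keyed by workflow id
downloads <- c(w_1 = 12L, w_2 = 40L, w_3 = 7L)

rank_ids <- function(downloads, n = NULL) {
  ranked <- names(sort(downloads, decreasing = TRUE))
  # n = NULL keeps everything; otherwise keep the first n ids
  if (is.null(n)) ranked else head(ranked, n)
}

print(rank_ids(downloads))
print(rank_ids(downloads, n = 2))
```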

Method `get()`

Get a single workflow by id

Usage

WorkflowRegistry$get(workflow_id)

Arguments

workflow_id: Workflow id

Returns

RecipeWorkflow object or NULL

Method `list_all()`

List all registered workflows

Usage

WorkflowRegistry$list_all()

Returns

List of all RecipeWorkflow objects

Method `save()`

Save the registry catalog to a JSON file

Usage

WorkflowRegistry$save(path)

Arguments

path: Character file path

Method `load()`

Load a registry catalog from a JSON file

Usage

WorkflowRegistry$load(path)

Arguments

path: Character file path

Method `stats()`

Get registry statistics

Usage

WorkflowRegistry$stats()

Returns

List with total, by_survey_type, by_certification counts

Method `print()`

Print registry summary

Usage

WorkflowRegistry$print(...)

Arguments

...: Additional arguments (not used)

Method `clone()`

The objects of this class are cloneable with this method.

Usage

WorkflowRegistry$clone(deep = FALSE)

Arguments

deep: Whether to make a deep clone.

Add a category to a recipe

Description

Pipe-friendly function to add a category to a Recipe object. Accepts either a category name (string) or a RecipeCategory object.

Usage

add_category(recipe, category, description = "")

Arguments

recipe

A Recipe object.

category

Character category name or RecipeCategory object.

description

Character. Description for the category (used when category is a string). Defaults to empty.

Value

The modified Recipe object (invisibly for piping).

Examples

r <- recipe(
  name = "Example", user = "Test",
  svy = survey_empty(type = "ech", edition = "2023"),
  description = "Example recipe"
)
r <- r |>
  add_category("labor_market", "Labor market indicators") |>
  add_category("income")

Add a recipe to a Survey

Description

Tidy wrapper for svy$add_recipe(recipe).

Usage

add_recipe(svy, recipe, bake = lazy_default())

Arguments

svy

Survey object

recipe

A Recipe object

bake

Logical; whether to bake immediately (default: lazy_default())

Value

The Survey object (invisibly), modified in place

Examples

svy <- survey_empty(type = "ech", edition = "2023")
r <- recipe(
  name = "Example", user = "test",
  svy = svy, description = "Example"
)
svy <- add_recipe(svy, r)

Configure replicate weights for variance estimation

Description

This function configures replicate weights (bootstrap, jackknife, etc.) that allow estimation of variance for complex statistics in surveys with complex sampling designs. It is essential for obtaining correct standard errors in population estimates.

Usage

add_replicate(
  weight,
  replicate_pattern,
  replicate_path = NULL,
  replicate_id = NULL,
  replicate_type
)

Arguments

weight

String with the name of the main weight variable in the survey (e.g., "pesoano", "pesomes")

replicate_pattern

String with regex pattern to identify replicate weight columns. Example: "wr\\d+" for columns wr1, wr2, etc. (the backslash must be doubled inside an R string).

replicate_path

Path to the file containing replicate weights. If NULL, the replicate weights are assumed to be in the main dataset

replicate_id

Named vector specifying how to join the main dataset with the replicate file. Format: c("main_var" = "replicate_var")

replicate_type

Type of replication used. Options: "bootstrap", "jackknife", "BRR" (Balanced Repeated Replication)
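
The replicate_id format c("main_var" = "replicate_var") can be read as a join specification: the name is the key column in the main dataset and the value is the key column in the replicate file. The base R sketch below shows that reading with merge(); the data are invented, and the package may perform the join differently.

```r
# Hypothetical data, for illustration only
main <- data.frame(ID_HOGAR = c(1, 2, 3), pesoano = c(10, 20, 30))
reps <- data.frame(ID = c(3, 1, 2), wr1 = c(31, 11, 21), wr2 = c(29, 9, 19))

replicate_id <- c("ID_HOGAR" = "ID")

joined <- merge(
  main, reps,
  by.x = names(replicate_id),   # key column in the main dataset
  by.y = unname(replicate_id)   # key column in the replicate file
)
print(joined)
```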

Details

Replicate weights are essential for:

Correctly estimating variance in complex designs
Calculating appropriate confidence intervals
Obtaining reliable coefficients of variation
Performing valid statistical tests

The regex pattern must exactly match the replicate weight column names in the file. For example, if columns are named "wr001", "wr002", etc., use the pattern "wr\\d+".
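
Before passing a pattern to add_replicate(), it can help to check which column names it selects. The sketch below uses base R only and invented column names; how the package applies the pattern internally is not shown here.

```r
# Hypothetical column names, for illustration only
cols <- c("ID", "pesoano", "wr001", "wr002", "wr150", "wrx")

# In R source the backslash is doubled: "wr\\d+" is the regex wr\d+
pattern <- "wr\\d+"
replicate_cols <- grep(pattern, cols, value = TRUE)
print(replicate_cols)

# An unanchored pattern also matches inside longer names; anchoring with
# ^ and $ restricts the match to the whole column name.
strict_cols <- grep("^wr\\d+$", c(cols, "old_wr001"), value = TRUE)
print(strict_cols)
```

Writing the pattern with a single backslash ("wr\d+") is a parse error in R, because \d is not a recognized string escape.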

This function is typically used within add_weight() for more complex weight configurations.

Value

List with replicate configuration that will be used by the sampling design for variance estimation

Examples

# Basic configuration with external file
annual_replicates <- add_replicate(
  weight = "pesoano",
  replicate_pattern = "wr\\d+",
  replicate_path = "bootstrap_weights_2023.xlsx",
  replicate_id = c("ID_HOGAR" = "ID"),
  replicate_type = "bootstrap"
)

# With replicates in same dataset
integrated_replicates <- add_replicate(
  weight = "main_weight",
  replicate_pattern = "rep_\\d{3}",
  replicate_type = "jackknife"
)

# Use within add_weight
weight_config <- add_weight(
  annual = add_replicate(
    weight = "pesoano",
    replicate_pattern = "wr\\d+",
    replicate_path = "bootstrap_annual.xlsx",
    replicate_id = c("numero" = "ID_HOGAR"),
    replicate_type = "bootstrap"
  ),
  monthly = "pesomes"
)

Configure weights by periodicity for Survey objects

Description

This function creates a weight structure that allows specifying different weight variables according to estimation periodicity. It is essential for proper functioning of workflows with multiple temporal estimation types.

Usage

add_weight(monthly = NULL, annual = NULL, quarterly = NULL, biannual = NULL)

Arguments

monthly

String with monthly weight variable name, or replicate list created with add_replicate for monthly weights

annual

String with annual weight variable name, or replicate list for annual weights

quarterly

String with quarterly weight variable name, or replicate list for quarterly weights

biannual

String with biannual weight variable name, or replicate list for biannual weights

Details

This function is fundamental for surveys that require different weights according to temporal estimation type. For example, Uruguay's ECH has specific weights for monthly, quarterly, and annual estimations.

Each parameter can be:

A simple string with the weight variable name
A replicate structure created with add_replicate() for bootstrap or jackknife estimations

Weights are automatically selected in workflow() according to the specified estimation_type parameter.
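
The selection rule can be pictured as a lookup in a named list keyed by periodicity. The base R sketch below is an illustration only: pick_weight() is hypothetical, and the field names inside the replicate entry are assumptions, not the documented structure returned by add_replicate().

```r
# Hypothetical weight configuration: a plain string or a replicate-style list
weights <- list(
  monthly = "pesomes",
  annual = list(weight = "pesoano", replicate_type = "bootstrap")
)

pick_weight <- function(weights, estimation_type) {
  spec <- weights[[estimation_type]]
  if (is.null(spec)) {
    stop("No weight configured for: ", estimation_type)
  }
  # A replicate-style entry carries the main weight name in a field
  if (is.list(spec)) spec$weight else spec
}

print(pick_weight(weights, "monthly"))
print(pick_weight(weights, "annual"))
```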

Value

Named list with weight configuration by periodicity, which will be used by load_survey and workflow to automatically select the appropriate weight

Examples

# Basic configuration with simple weight variables
ech_weights <- add_weight(
  monthly = "pesomes",
  quarterly = "pesotri",
  annual = "pesoano"
)

# With bootstrap replicates for variance estimation
weights_with_replicates <- add_weight(
  monthly = add_replicate(
    weight = "pesomes",
    replicate_pattern = "wr\\d+",
    replicate_path = "monthly_replicate_weights.xlsx",
    replicate_id = c("ID_HOGAR" = "ID"),
    replicate_type = "bootstrap"
  ),
  annual = "pesoano"
)

Search ANDA5 catalog

Description

Queries the ANDA5 REST API to list available survey datasets.

Usage

anda_catalog_search(
  keyword = "ECH",
  base_url = "https://www4.ine.gub.uy/Anda5",
  limit = 50
)

Arguments

keyword

Character search term (e.g., "ECH")

base_url

Character base URL of the ANDA5 instance

limit

Integer max results to return

Value

A data.frame with columns: id, title, year_start, year_end

Download ECH microdata from ANDA5

Description

Downloads microdata files for a given ECH edition from INE Uruguay's ANDA5 catalog. Automatically accepts the terms of use, parses available resources, and downloads the appropriate file.

For editions >= 2022, ANDA provides separate files for implantation, monthly follow-ups, and bootstrap replicate weights. Use the resource parameter to select which file to download.

Usage

anda_download_microdata(
  edition,
  resource = "implantation",
  dest_dir = tempdir(),
  base_url = "https://www4.ine.gub.uy/Anda5"
)

Arguments

edition

Character year (e.g., "2023")

resource

Character type of resource to download. One of:

"implantation": (default) Main implantation file. For editions < 2022, downloads the main microdata file.
"monthly": Monthly follow-up files (editions >= 2022 only). Returns a character vector of paths, one per month.
"bootstrap_annual": Annual bootstrap replicate weights.
"bootstrap_monthly": Monthly bootstrap replicate weights.
"bootstrap_quarterly": Quarterly bootstrap replicate weights.
"bootstrap_semestral": Semiannual (semestral) bootstrap replicate weights.
"poverty": Poverty line microdata (Microdatos_LP).

dest_dir

Character directory where to save files. Defaults to a temporary directory.

base_url

Character base URL of the ANDA5 instance

Value

Character path (or vector of paths for monthly) to the downloaded file(s), ready to pass to load_survey() or data.table::fread().

Examples

## Not run: 
path <- anda_download_microdata("2023", resource = "implantation")
svy <- load_survey(path, svy_type = "ech", svy_edition = "2023")

## End(Not run)

Fetch DDI XML from ANDA5 catalog

Description

Downloads the DDI (Data Documentation Initiative) XML file for a given catalog entry from INE Uruguay's ANDA5 catalog.

Usage

anda_fetch_ddi(
  catalog_id,
  base_url = "https://www4.ine.gub.uy/Anda5",
  dest_file = NULL
)

Arguments

catalog_id

Integer catalog ID (e.g., 767 for ECH 2024)

base_url

Character base URL of the ANDA5 instance

dest_file

Character path where to save the XML file. If NULL, uses a temporary file.

Value

Character path to the downloaded DDI XML file

List available ECH editions in ANDA5

Description

Returns the known ECH editions available in INE Uruguay's ANDA5 catalog.

Usage

anda_list_editions()

Value

A data.frame with columns: edition, catalog_id

Parse variables from a DDI XML file

Description

Reads a DDI Codebook XML file and extracts variable-level metadata: name, label, type (discrete/continuous), and value labels.

Usage

anda_parse_variables(ddi_xml_path)

Arguments

ddi_xml_path

Character path to the DDI XML file

Value

A list of variable metadata, each with: name, label, type, value_labels (named list or NULL), description

Get detailed metadata for a single ANDA variable

Description

Get detailed metadata for a single ANDA variable

Usage

anda_variable_detail(survey_type = "ech", var_name)

Arguments

survey_type

Character survey type (default "ech")

var_name

Character variable name

Value

A list with: name, label, type, value_labels, description. NULL if not found.

Query ANDA variable metadata from the API

Description

Fetches variable metadata (labels, types, value labels) from the metasurvey API's ANDA endpoint.

Usage

anda_variables(survey_type = "ech", var_names = NULL)

Arguments

survey_type

Character survey type (default "ech")

var_names

Character vector of variable names to look up. If NULL, returns all variables for the survey type.

Value

A data.frame with columns: name, label, type

Examples

## Not run: 
anda_variables("ech", c("pobpcoac", "e27"))

## End(Not run)

API Client for metasurvey

Description

HTTP client for the metasurvey REST API (plumber).

The metasurvey API is a server deployed separately (Docker, Cloud Run, etc.). Users of the R package interact with it through these client functions.

Setup

# 1. Point to the deployed API
configure_api(url = "https://metasurvey-api.example.com")

# 2. Register or login
api_register("Ana Garcia", "ana@example.com", "password123")
api_login("ana@example.com", "password123")

# 3. Use the API
api_list_recipes(survey_type = "ech")
api_publish_recipe(my_recipe)

Add a comment to a recipe

Description

Post a text comment on a recipe. Requires authentication.

Usage

api_comment_recipe(id, text)

Arguments

id

Recipe ID

text

Comment text (max 2000 characters)

Value

Invisibly, the API response.

Examples

## Not run: 
api_comment_recipe("r_1739654400_742", "Great recipe!")

## End(Not run)

Add a comment to a workflow

Description

Post a text comment on a workflow. Requires authentication.

Usage

api_comment_workflow(id, text)

Arguments

id

Workflow ID

text

Comment text (max 2000 characters)

Value

Invisibly, the API response.

Examples

## Not run: 
api_comment_workflow("w_1739654400_123", "Very useful workflow!")

## End(Not run)

Delete a comment

Description

Delete a comment by its ID. Only the comment author can delete it. Requires authentication.

Usage

api_delete_comment(comment_id)

Arguments

comment_id

Comment ID

Value

Invisibly, the API response.

Examples

## Not run: 
api_delete_comment("c_abc123")

## End(Not run)

Increment recipe download counter

Description

Increment recipe download counter

Usage

api_download_recipe(id)

Arguments

id

Recipe ID

Increment workflow download counter

Description

Increment workflow download counter

Usage

api_download_workflow(id)

Arguments

id

Workflow ID

Get ANDA variable metadata from the API

Description

Get ANDA variable metadata from the API

Usage

api_get_anda_variables(survey_type = "ech", var_names = NULL)

Arguments

survey_type

Character survey type (default "ech")

var_names

Character vector of variable names. If NULL, returns all.

Value

A list of variable metadata objects

Examples

## Not run: 
api_get_anda_variables("ech", c("pobpcoac", "e27"))

## End(Not run)

Get recipe(s) by ID

Description

Retrieve one or more recipes from the API by their IDs.

Usage

api_get_recipe(id)

Arguments

id

Character vector of recipe ID(s). If length > 1, returns a list.

Value

A single Recipe object (or NULL) when length(id) == 1. A list of Recipe objects when length(id) > 1 (NULLs are dropped).
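
The return shape depends on the length of id, which matters when writing code that consumes the result. The base R sketch below mimics that contract with an in-memory list instead of the remote API; store, get_one(), and get_recipes() are hypothetical.

```r
# Hypothetical in-memory store standing in for the remote API
store <- list(r_1 = "recipe one", r_2 = "recipe two")

get_one <- function(id) store[[id]]  # NULL when the id is unknown

get_recipes <- function(id) {
  if (length(id) == 1) {
    return(get_one(id))              # single object, or NULL
  }
  found <- lapply(id, get_one)
  Filter(Negate(is.null), found)     # list with NULLs dropped
}

single <- get_recipes("r_1")
missing_one <- get_recipes("r_9")
several <- get_recipes(c("r_1", "r_9", "r_2"))
print(length(several))
```

Callers that always want a list can wrap a single-id result themselves, checking for NULL first.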

Examples

## Not run: 
api_get_recipe("r_1739654400_742")

## End(Not run)

Get comments for a recipe

Description

List all comments on a recipe, sorted by creation date.

Usage

api_get_recipe_comments(id)

Arguments

id

Recipe ID

Value

List of comment objects.

Examples

## Not run: 
api_get_recipe_comments("r_1739654400_742")

## End(Not run)

Get recipes that depend on a recipe

Description

Find all recipes whose depends_on_recipes field references the given recipe ID (backlinks).

Usage

api_get_recipe_dependents(id)

Arguments

id

Recipe ID

Value

List of recipe summaries (id, name, user).

Examples

## Not run: 
api_get_recipe_dependents("r_1739654400_742")

## End(Not run)

Get star summary for a recipe

Description

Returns average rating, count, and the current user's rating (if authenticated).

Usage

api_get_recipe_stars(id)

Arguments

id

Recipe ID

Value

List with average, count, and optionally user_value.

Examples

## Not run: 
api_get_recipe_stars("r_1739654400_742")

## End(Not run)

Get a single workflow by ID

Description

Retrieve a workflow from the API by its ID.

Usage

api_get_workflow(id)

Arguments

id

Workflow ID

Value

RecipeWorkflow object or NULL

Examples

## Not run: 
api_get_workflow("w_1739654400_123")

## End(Not run)

Get comments for a workflow

Description

List all comments on a workflow, sorted by creation date.

Usage

api_get_workflow_comments(id)

Arguments

id

Workflow ID

Value

List of comment objects.

Examples

## Not run: 
api_get_workflow_comments("w_1739654400_123")

## End(Not run)

Get star summary for a workflow

Description

Returns average rating, count, and the current user's rating (if authenticated).

Usage

api_get_workflow_stars(id)

Arguments

id

Workflow ID

Value

List with average, count, and optionally user_value.

Examples

## Not run: 
api_get_workflow_stars("w_1739654400_123")

## End(Not run)

List recipes from API

Description

Fetch recipes with optional search and filters.

Usage

api_list_recipes(
  search = NULL,
  survey_type = NULL,
  topic = NULL,
  certification = NULL,
  user = NULL,
  limit = 50,
  offset = 0
)

Arguments

search

Text search (matches name/description)

survey_type

Filter by survey type (e.g., "ech")

topic

Filter by topic

certification

Filter by certification level

user

Filter by author email

limit

Maximum results (default 50)

offset

Skip first N results (default 0)

Value

List of Recipe objects
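
Because results are paged with limit and offset, retrieving everything means requesting pages until a short page comes back. The base R sketch below shows that loop against a fake in-memory endpoint; fetch_page(), collect_all(), and the ids are hypothetical, and fetch_page() merely stands in for a call such as api_list_recipes(limit = ..., offset = ...).

```r
# Hypothetical stand-in for a paged endpoint: 120 fake recipe ids
all_ids <- sprintf("r_%03d", 1:120)

fetch_page <- function(limit = 50, offset = 0) {
  if (offset >= length(all_ids)) {
    return(character(0))
  }
  all_ids[seq(offset + 1, min(offset + limit, length(all_ids)))]
}

collect_all <- function(limit = 50) {
  out <- character(0)
  offset <- 0
  repeat {
    page <- fetch_page(limit = limit, offset = offset)
    out <- c(out, page)
    if (length(page) < limit) break  # short (or empty) page: nothing left
    offset <- offset + limit
  }
  out
}

ids <- collect_all()
print(length(ids))
```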

Examples

## Not run: 
configure_api("https://metasurvey-api.example.com")
recipes <- api_list_recipes(survey_type = "ech")

## End(Not run)

List workflows from API

Description

Fetch workflows with optional search and filters.

Usage

api_list_workflows(
  search = NULL,
  survey_type = NULL,
  recipe_id = NULL,
  user = NULL,
  limit = 50,
  offset = 0
)

Arguments

search

Text search

survey_type

Filter by survey type

recipe_id

Filter workflows that reference this recipe

user

Filter by author email

limit

Maximum results

offset

Skip first N results

Value

List of RecipeWorkflow objects

Examples

## Not run: 
api_list_workflows(survey_type = "ech")

## End(Not run)

Login

Description

Authenticate with the metasurvey API. On success the JWT token is stored automatically.

Usage

api_login(email, password)

Arguments

email

Email address

password

Password

Value

Invisibly, the API response.

Examples

## Not run: 
api_login("ana@example.com", "s3cret")

## End(Not run)

Logout

Description

Clear the stored API token from memory and the environment.

Usage

api_logout()

Value

Invisibly, NULL.

Examples


api_logout()

Get current user profile

Description

Returns profile info for the currently authenticated user.

Usage

api_me()

Value

List with user fields (name, email, user_type, etc.)

Examples

## Not run: 
api_me()

## End(Not run)

Publish a recipe

Description

Publish a Recipe object to the API. Requires authentication (call api_login() first).

Usage

api_publish_recipe(recipe)

Arguments

recipe

A Recipe object

Value

Invisibly, the API response with the assigned ID.

Examples

## Not run: 
api_publish_recipe(my_recipe)

## End(Not run)

Publish a workflow

Description

Publish a RecipeWorkflow object to the API. Requires authentication.

Usage

api_publish_workflow(workflow)

Arguments

workflow

A RecipeWorkflow object

Value

Invisibly, the API response.

Examples

## Not run: 
api_publish_workflow(my_workflow)

## End(Not run)

Refresh JWT token

Description

Request a new JWT token using the current (still valid) token. The new token is stored automatically. This is called internally by api_request() when the current token is close to expiry (within 5 minutes).
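
The "close to expiry" rule can be sketched as a comparison between the remaining lifetime and a five-minute window. The function needs_refresh() below is hypothetical, and how the client actually obtains the expiry time is not covered here.

```r
needs_refresh <- function(expires_at, now = Sys.time(), window_secs = 5 * 60) {
  remaining <- as.numeric(difftime(expires_at, now, units = "secs"))
  # Refresh only while the token is still valid but inside the window
  remaining > 0 && remaining <= window_secs
}

now <- as.POSIXct("2026-01-01 12:00:00", tz = "UTC")
print(needs_refresh(now + 120, now))   # 2 minutes left
print(needs_refresh(now + 3600, now))  # 1 hour left
print(needs_refresh(now - 10, now))    # already expired
```

An expired token is excluded in this sketch because the refresh request relies on the current token still being valid.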

Usage

api_refresh_token()

Value

The new token string (invisibly), or NULL if refresh fails.

Examples

## Not run: 
api_refresh_token()

## End(Not run)
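
The 5-minute refresh window described above can be sketched as a timestamp comparison. This is illustrative only: token_needs_refresh() and its arguments are hypothetical, and this help page does not show how metasurvey reads the expiry time from the token.

```r
# Hypothetical check: TRUE when the token is still valid but expires
# within `window_secs` seconds.
token_needs_refresh <- function(expires_at, now = Sys.time(),
                                window_secs = 5 * 60) {
  remaining <- as.numeric(difftime(expires_at, now, units = "secs"))
  remaining > 0 && remaining <= window_secs
}

now <- as.POSIXct("2026-01-01 12:00:00", tz = "UTC")
soon <- token_needs_refresh(now + 120, now)   # expires in 2 minutes
later <- token_needs_refresh(now + 3600, now) # expires in 1 hour
c(soon = soon, later = later)
```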

Register a new user

Description

Create an account on the metasurvey API. On success the JWT token is stored automatically via options(metasurvey.api_token).

Usage

api_register(
  name,
  email,
  password,
  user_type = "individual",
  institution = NULL
)

Arguments

name

Display name

email

Email address

password

Password

user_type

One of "individual", "institutional_member", "institution"

institution

Institution name (required for "institutional_member")

Value

Invisibly, the API response (list with ok, token, user).

Examples

## Not run: 
configure_api("https://metasurvey-api.example.com")
api_register("Ana Garcia", "ana@example.com", "s3cret")

## End(Not run)

Rate a recipe

Description

Give a star rating (1-5) to a recipe. Requires authentication. Each user has at most one rating per recipe; a subsequent call replaces the earlier rating (upsert).

Usage

api_star_recipe(id, value)

Arguments

id

Recipe ID

value

Integer rating from 1 to 5

Value

Invisibly, the API response.

Examples

## Not run: 
api_star_recipe("r_1739654400_742", 5)

## End(Not run)

Rate a workflow

Description

Give a star rating (1-5) to a workflow. Requires authentication.

Usage

api_star_workflow(id, value)

Arguments

id

Workflow ID

value

Integer rating from 1 to 5

Value

Invisibly, the API response.

Examples

## Not run: 
api_star_workflow("w_1739654400_123", 4)

## End(Not run)

Increment download counter (generic)

Description

Increment download counter (generic)

Usage

api_track_download(type, id)

Arguments

type

"recipe" or "workflow"

id

Object ID

Bake recipes

Description

Bake recipes

Usage

bake_recipes(svy)

Arguments

svy

Survey object

Value

Survey object with all recipes applied

Examples


dt <- data.table::data.table(id = 1:20, x = rnorm(20), w = runif(20, 0.5, 2))
svy <- Survey$new(
  data = dt, edition = "2023", type = "demo",
  psu = NULL, engine = "data.table",
  weight = add_weight(annual = "w")
)
r <- recipe(
  name = "Demo", user = "test", svy = svy,
  description = "Demo recipe"
)
svy <- add_recipe(svy, r)
processed <- bake_recipes(svy)

Execute all pending steps

Description

Iterates over all pending (lazy) steps attached to a Survey or RotativePanelSurvey and executes them sequentially, mutating the underlying data.table. Each step is validated before execution (checks that required variables exist).

Usage

bake_steps(svy)

Arguments

svy

A Survey or RotativePanelSurvey object with pending steps

Details

Steps are executed in the order they were added. Each step's expressions can reference variables created by previous steps.

For RotativePanelSurvey objects, steps are applied to both the implantation and all follow-up surveys.

Value

The same object with all steps materialized in the data and each step marked as bake = TRUE.

Examples

dt <- data.table::data.table(id = 1:5, age = c(15, 30, 45, 50, 70), w = 1)
svy <- Survey$new(
  data = dt, edition = "2023", type = "test",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
svy <- step_compute(svy, age2 = age * 2)
svy <- bake_steps(svy)
get_data(svy)
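
The lazy-step idea (store expressions, validate, then evaluate in order) can be sketched in base R. This is a simplified illustration under the assumption that each step creates one column; bake_sketch() and pending are hypothetical names and do not reflect the package internals.

```r
# Pending steps stored as unevaluated expressions, in insertion order.
pending <- list(
  age2 = quote(age * 2),
  age4 = quote(age2 * 2) # depends on the column created by the first step
)

bake_sketch <- function(data, steps) {
  for (new_col in names(steps)) {
    expr <- steps[[new_col]]
    # Validate: every variable used by the step must already exist.
    missing_vars <- setdiff(all.vars(expr), names(data))
    if (length(missing_vars) > 0) {
      stop("Missing variables: ", paste(missing_vars, collapse = ", "))
    }
    data[[new_col]] <- eval(expr, data)
  }
  data
}

baked <- bake_sketch(data.frame(age = c(15, 30, 45)), pending)
baked$age4
```

Because steps run in the order they were added, the second expression can use age2 even though it did not exist in the input data.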

Display survey design information

Description

Pretty-prints the sampling design configuration for each estimation type in a Survey object, showing PSU, strata, weights, and other design elements in a color-coded, readable format.

Usage

cat_design(self)

Arguments

self

Survey object containing design information

Details

This function displays design information including:

Primary Sampling Units (PSU/clusters)
Stratification variables
Weight variables for each estimation type
Finite Population Correction (FPC) if used
Calibration formulas if applied
Overall design type classification

Output is color-coded for better readability in supporting terminals.

Value

Invisibly returns NULL; called for side effect of printing design info

Examples


dt <- data.table::data.table(id = 1:20, x = rnorm(20), w = runif(20, 0.5, 2))
svy <- Survey$new(
  data = dt, edition = "2023", type = "demo",
  psu = NULL, engine = "data.table",
  weight = add_weight(annual = "w")
)
cat_design(svy)

cat_design_type

Description

Return a text description of the design type for a named design in a Survey object.

Usage

cat_design_type(self, design_name)

Arguments

self

Object of class Survey

design_name

Name of design

Value

Character string describing the design type, or "None".

Examples


dt <- data.table::data.table(id = 1:20, x = rnorm(20), w = runif(20, 0.5, 2))
svy <- Survey$new(
  data = dt, edition = "2023", type = "demo",
  psu = NULL, engine = "data.table",
  weight = add_weight(annual = "w")
)
svy$ensure_design()
cat_design_type(svy, "annual")

cat_estimation_default

Description

cat_estimation_default

Usage

cat_estimation.default(estimation, call)

Arguments

estimation

Estimation

call

Call

cat_recipes

Description

Return the names of the recipes attached to a Survey object as a single character string.

Usage

cat_recipes(self)

Arguments

self

Object of class Survey

Value

Character string listing recipe names, or "None".

Certify a recipe

Description

Pipe-friendly function to certify a Recipe at a given quality level.

Usage

certify_recipe(recipe, user, level)

Arguments

recipe

A Recipe object.

user

RecipeUser who is certifying.

level

Character. Certification level: "reviewed" or "official".

Value

The modified Recipe object.

Examples

r <- recipe(
  name = "Example", user = "Test",
  svy = survey_empty(type = "ech", edition = "2023"),
  description = "Example recipe"
)
inst <- recipe_user("IECON", type = "institution")
r <- r |> certify_recipe(inst, "official")

Collect lines inside a loop body until closing brace

Description

Collect lines inside a loop body until closing brace

Usage

collect_loop_body(lines, start_idx)

Arguments

lines

All source lines

start_idx

Index of the line with opening brace

Value

List with body (character vector) and end_idx

Configure metasurvey API

Description

Set API base URL and optionally load stored credentials. The URL can also be set via the METASURVEY_API_URL environment variable, and the token via METASURVEY_TOKEN.

Usage

configure_api(url)

Arguments

url

API base URL (e.g., "https://metasurvey-api.example.com")

Value

Invisibly, the previous URL (for restoring).

Examples


configure_api(url = "https://metasurvey-api.example.com")

Default recipe categories

Description

Returns a list of standard built-in categories for recipe classification.

Usage

default_categories()

Value

List of RecipeCategory objects

Examples

cats <- default_categories()
vapply(cats, function(c) c$name, character(1))

Set default engine

Description

Sets a default engine for loading surveys. If an engine is already configured, it keeps it; otherwise, it sets "data.table" as the default engine.

Usage

default_engine(.engine = "data.table")

Arguments

.engine

Character vector with the name of the default engine. By default, "data.table" is used.
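
The "keep the configured engine, otherwise set a default" behaviour can be sketched with base R options. The option name demo.engine, the helper set_engine_if_unset(), and the value "other_engine" are made up for this sketch; where metasurvey actually stores the engine is not stated here.

```r
# "demo.engine" is a made-up option name used only for this sketch.
set_engine_if_unset <- function(.engine = "data.table") {
  if (is.null(getOption("demo.engine"))) {
    options(demo.engine = .engine)
  }
  invisible(getOption("demo.engine"))
}

options(demo.engine = NULL)         # start unconfigured
set_engine_if_unset()               # nothing configured: sets "data.table"
set_engine_if_unset("other_engine") # already configured: nothing changes
engine <- getOption("demo.engine")
engine
```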

Evaluate estimation with Coefficient of Variation

Description

Evaluate estimation with Coefficient of Variation

Usage

evaluate_cv(cv)

Arguments

cv

Numeric coefficient of variation value.

Value

Character string with the quality category (e.g. "Excellent", "Good").

Examples

evaluate_cv(3) # "Excellent"
evaluate_cv(12) # "Good"
evaluate_cv(30) # "Use with caution"

Substitute loop variable in body lines

Description

Substitute loop variable in body lines

Usage

expand_loop_body(loopvar, values, body)

Arguments

loopvar

Loop variable name

values

Character vector of values

body

Character vector of body lines

Value

Character vector of expanded lines
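
A minimal sketch of the substitution, assuming the body refers to the loop variable with the usual STATA local-macro syntax (a backtick, the name, then an apostrophe). expand_loop_body_sketch() is a hypothetical stand-in, not the package function.

```r
# Replace every `loopvar' reference with each value in turn.
expand_loop_body_sketch <- function(loopvar, values, body) {
  placeholder <- paste0("`", loopvar, "'")
  unlist(lapply(values, function(v) {
    gsub(placeholder, v, body, fixed = TRUE)
  }))
}

expanded <- expand_loop_body_sketch("i", c("1", "2"), "gen y`i' = x`i' * 2")
expanded
```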

Expand a numlist specification into numeric values

Description

Expand a numlist specification into numeric values

Usage

expand_numlist(spec)

Arguments

spec

Numlist string like "1/14" or "81/89"

Value

Character vector of values
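
As an illustration, a "first/last" numlist can be expanded with seq(). The sketch below handles only that one form and assumes integer bounds; expand_numlist_sketch() is a hypothetical name.

```r
# Expand "a/b" into the character values a, a + 1, ..., b.
expand_numlist_sketch <- function(spec) {
  bounds <- as.integer(strsplit(spec, "/", fixed = TRUE)[[1]])
  as.character(seq(bounds[1], bounds[2]))
}

numlist <- expand_numlist_sketch("81/89")
numlist
```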

Expand STATA foreach and forvalues loops

Description

Detects foreach/forvalues blocks and unrolls them by substituting loop variables into body lines.

Usage

expand_stata_loops(lines)

Arguments

lines

Character vector of source lines

Value

Character vector with loops expanded

Expand a STATA variable range to individual variable names

Description

STATA allows variable ranges like suma1-suma4 which means suma1 suma2 suma3 suma4. This function detects and expands such ranges by incrementing the trailing numeric suffix.

Usage

expand_var_range(spec)

Arguments

spec

Variable specification (may contain ranges with -)

Value

Character vector of individual variable names
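
The range expansion described above can be sketched as follows. The sketch assumes both ends share the same prefix and end in digits, and returns the input unchanged otherwise; expand_var_range_sketch() is a hypothetical name, and the real function may cover more cases.

```r
expand_var_range_sketch <- function(spec) {
  parts <- strsplit(spec, "-", fixed = TRUE)[[1]]
  has_suffix <- grepl("[0-9]+$", parts)
  if (length(parts) != 2 || !all(has_suffix)) {
    return(spec)
  }
  prefix <- sub("[0-9]+$", "", parts)
  if (prefix[1] != prefix[2]) {
    return(spec)
  }
  # Trailing numeric suffix of each end of the range.
  suffix <- as.integer(regmatches(parts, regexpr("[0-9]+$", parts)))
  paste0(prefix[1], seq(suffix[1], suffix[2]))
}

vars <- expand_var_range_sketch("suma1-suma4")
vars
```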

Launch the Recipe Explorer Shiny App

Description

Opens an interactive web application to explore, search, and browse metasurvey recipes with visual documentation cards. Supports user registration and login via MongoDB Atlas.

Usage

explore_recipes(port = NULL, host = "127.0.0.1", launch.browser = TRUE)

Arguments

port

Integer port number, or NULL for automatic.

host

Character. The host to listen on. Defaults to "127.0.0.1" for local use. Set to "0.0.0.0" for server deployments (Railway, etc.).

launch.browser

Logical. Open the app in a browser?

Value

NULL (called for side effect of launching the app).

Examples

## Not run: 
# Local / RStudio viewer
explore_recipes()

# Server deployment (Railway, Docker, etc.)
explore_recipes(host = "0.0.0.0", port = 3838, launch.browser = FALSE)

## End(Not run)

Extract surveys by periodicity from a rotating panel

Description

Extracts subsets of surveys from a RotativePanelSurvey object based on temporal criteria. Allows obtaining surveys for different types of analysis (monthly, quarterly, annual) respecting the rotating panel's temporal structure.

Usage

extract_surveys(
  RotativePanelSurvey,
  index = NULL,
  monthly = NULL,
  annual = NULL,
  quarterly = NULL,
  biannual = NULL,
  use.parallel = FALSE
)

Arguments

RotativePanelSurvey

A RotativePanelSurvey object containing the rotating panel surveys organized temporally

index

Integer vector specifying survey indices to extract. If a single value, returns that survey; if a vector, returns a list

monthly

Integer vector specifying which months to extract for monthly analysis (1-12)

annual

Integer vector specifying which years to extract for annual analysis

quarterly

Integer vector specifying which quarters to extract for quarterly analysis (1-4)

biannual

Integer vector specifying which semesters to extract for biannual analysis (1-2)

use.parallel

Logical indicating whether to use parallel processing for intensive operations. Default FALSE

Details

This function is essential for working with rotating panels because:

Enables periodicity-based analysis: Extract data for different types of temporal estimations
Preserves temporal structure: Respects temporal relationships between different panel waves
Optimizes memory: Only loads surveys needed for the analysis
Facilitates comparisons: Extract specific periods for comparative analysis
Supports parallelization: For operations with large data volumes

Extraction criteria are interpreted according to survey frequency:

For monthly ECH: monthly=c(1,3,6) extracts January, March and June
For annual analysis: annual=1 typically extracts the first available year
For quarterly analysis: quarterly=c(1,4) extracts Q1 and Q4

If no criteria are specified, the function returns the implantation survey with a warning.

Value

A list of Survey objects matching the specified criteria, or a single Survey object if a single index is specified

Examples

## Not run: 
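# Load rotating panel (paths and weight variable names are placeholders)
panel_ech <- load_panel_survey(
  path_implantation = "path/to/implantation.csv",
  path_follow_up = "path/to/follow_up",
  svy_type = "ech",
  svy_weight_implantation = add_weight(annual = "w_ano"),
  svy_weight_follow_up = add_weight(monthly = "w_monthly")
)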

# Extract specific monthly surveys
ech_q1 <- extract_surveys(
  panel_ech,
  monthly = c(1, 2, 3) # January, February, March
)

# Extract by index
ech_first <- extract_surveys(panel_ech, index = 1)
ech_several <- extract_surveys(panel_ech, index = c(1, 3, 6))

# Quarterly analysis
ech_Q1_Q4 <- extract_surveys(
  panel_ech,
  quarterly = c(1, 4)
)

# Annual analysis (typically all surveys for the year)
ech_annual <- extract_surveys(
  panel_ech,
  annual = 1
)

# With parallel processing for large volumes
ech_full <- extract_surveys(
  panel_ech,
  monthly = 1:12,
  use.parallel = TRUE
)

# Use in workflow
results <- workflow(
  survey = extract_surveys(panel_ech, quarterly = c(1, 2)),
  svymean(~unemployed, na.rm = TRUE),
  estimation_type = "quarterly"
)

## End(Not run)

Extract time pattern

Description

Extract time pattern

Usage

extract_time_pattern(svy_edition)

Arguments

svy_edition

Survey edition string (e.g. "2023", "2023-06", "2023_Q1").

Value

List with components: periodicity, year, month (when applicable).

Examples

# Annual edition
extract_time_pattern("2023")

# Monthly edition
extract_time_pattern("2023-06")
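
For intuition, the annual and monthly forms can be told apart with two regular expressions. This sketch covers only "YYYY" and "YYYY-MM"; parse_edition_sketch() is a hypothetical name, and the periodicity labels it returns are assumptions rather than the values extract_time_pattern() produces.

```r
parse_edition_sketch <- function(edition) {
  if (grepl("^[0-9]{4}$", edition)) {
    return(list(periodicity = "annual", year = as.integer(edition)))
  }
  if (grepl("^[0-9]{4}-[0-9]{2}$", edition)) {
    return(list(
      periodicity = "monthly",
      year = as.integer(substr(edition, 1, 4)),
      month = as.integer(substr(edition, 6, 7))
    ))
  }
  stop("Unrecognized edition format: ", edition)
}

monthly <- parse_edition_sketch("2023-06")
str(monthly)
```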

Filter recipes by criteria

Description

Filter recipes in the active backend by survey type, edition, category, or certification level.

Usage

filter_recipes(
  survey_type = NULL,
  edition = NULL,
  category = NULL,
  certification_level = NULL
)

Arguments

survey_type

Character survey type or NULL.

edition

Character edition or NULL.

category

Character category name or NULL.

certification_level

Character certification level or NULL.

Value

List of matching Recipe objects.

Examples

set_backend("local", path = tempfile(fileext = ".json"))
ech_recipes <- filter_recipes(survey_type = "ech")
length(ech_recipes)

Filter workflows by criteria

Description

Filter workflows in the active backend by survey type, edition, recipe ID, or certification level.

Usage

filter_workflows(
  survey_type = NULL,
  edition = NULL,
  recipe_id = NULL,
  certification_level = NULL
)

Arguments

survey_type

Character survey type or NULL (default NULL).

edition

Character edition or NULL (default NULL).

recipe_id

Character recipe ID or NULL (default NULL). Find workflows using this recipe.

certification_level

Character certification level or NULL (default NULL).

Value

List of matching RecipeWorkflow objects.

Examples

set_workflow_backend("local", path = tempfile(fileext = ".json"))
ech_wf <- filter_workflows(survey_type = "ech")
length(ech_wf)

Find workflows that use a specific recipe

Description

Cross-reference query: find all workflows that reference a given recipe ID.

Usage

find_workflows_for_recipe(recipe_id)

Arguments

recipe_id

Character recipe ID to search for.

Value

List of RecipeWorkflow objects that reference this recipe.

Examples

set_workflow_backend("local", path = tempfile(fileext = ".json"))
wfs <- find_workflows_for_recipe("recipe_001")
length(wfs)

Get recipe backend

Description

Returns the currently configured recipe backend. Defaults to "api" if not configured.

Usage

get_backend()

Value

RecipeBackend object

Examples

set_backend("local", path = tempfile(fileext = ".json"))
backend <- get_backend()
backend

get_data

Description

Get data from survey

Usage

get_data(svy)

Arguments

svy

Survey object

Value

A data.table (or data.frame) containing the survey microdata.

Examples

dt <- data.table::data.table(
  id = 1:5, age = c(25, 30, 45, 50, 60),
  w = rep(1, 5)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "ech",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
head(get_data(svy))

Get the current survey data engine

Description

Retrieves the currently configured engine for loading surveys from system options or environment variables.

Usage

get_engine()

Value

Character vector with the name of the configured engine.

Examples

get_engine()

Get follow-up surveys from a rotating panel

Description

Extracts one or more follow-up surveys (waves after the implantation) from a RotativePanelSurvey object. Follow-up surveys represent subsequent data collections and are essential for longitudinal and temporal change analysis.

Usage

get_follow_up(
  RotativePanelSurvey,
  index = seq_along(RotativePanelSurvey$follow_up)
)

Arguments

RotativePanelSurvey

A RotativePanelSurvey object from which to extract the follow-up surveys

index

Integer vector specifying which follow-up surveys to extract. Defaults to all available follow-ups (seq_along(RotativePanelSurvey$follow_up)). Can be a single index or a vector of indices

Details

Follow-up surveys are fundamental in rotating panels because:

Enable longitudinal analysis: Track the same units over time
Capture temporal changes: Evolution of economic, social, and demographic variables
Maintain representativeness: Each wave preserves population representativeness through controlled rotation
Optimize resources: Reuse information from previous waves to reduce collection costs
Facilitate comparisons: Consistent temporal structure for trend analysis

In rotating panels like ECH:

Each follow-up wave covers a specific period (monthly/quarterly)
Units rotate gradually maintaining temporal overlap
Indices correspond to the chronological collection order
Each follow-up maintains methodological consistency with implantation

Value

A list of Survey objects corresponding to the specified follow-up surveys. If a single index is specified, returns a list with one element

Examples

impl <- Survey$new(
  data = data.table::data.table(id = 1:5, w = 1),
  edition = "2023", type = "test", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
fu1 <- Survey$new(
  data = data.table::data.table(id = 1:5, w = 1),
  edition = "2023_01", type = "test", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
fu2 <- Survey$new(
  data = data.table::data.table(id = 1:5, w = 1),
  edition = "2023_02", type = "test", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
panel <- RotativePanelSurvey$new(
  implantation = impl, follow_up = list(fu1, fu2),
  type = "test", default_engine = "data.table",
  steps = list(), recipes = list(), workflows = list(), design = NULL
)
get_follow_up(panel, index = 1)
get_follow_up(panel)

Get implantation survey from a rotating panel

Description

Extracts the implantation (baseline) survey from a RotativePanelSurvey object. The implantation survey represents the first data collection wave and is essential for establishing the baseline and structural characteristics of the panel.

Usage

get_implantation(RotativePanelSurvey)

Arguments

RotativePanelSurvey

A RotativePanelSurvey object from which to extract the implantation survey

Details

The implantation survey is special in a rotating panel because:

Establishes the baseline: Defines initial characteristics of all panel units
Contains the full sample: Includes all units that will participate in the different panel waves
Defines temporal structure: Establishes rotation and follow-up patterns
Configures metadata: Contains information about periodicity, key variables, and stratification
Serves as tracking reference: Basis for unit tracking in subsequent waves

This function is essential for analysis requiring:

Temporal comparisons from the baseline
Analysis of the complete panel structure
Configuration of longitudinal models
Evaluation of sampling design quality

Value

A Survey object containing the implantation survey with all its metadata, data, and design configuration

Examples

impl <- Survey$new(
  data = data.table::data.table(id = 1:5, w = 1),
  edition = "2023", type = "test", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
fu <- Survey$new(
  data = data.table::data.table(id = 1:5, w = 1),
  edition = "2023_01", type = "test", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
panel <- RotativePanelSurvey$new(
  implantation = impl, follow_up = list(fu),
  type = "test", default_engine = "data.table",
  steps = list(), recipes = list(), workflows = list(), design = NULL
)
get_implantation(panel)

get_metadata

Description

Get metadata from survey

Usage

get_metadata(self)

Arguments

self

Object of class Survey

Value

NULL (called for side effect: prints metadata to console).

Examples

dt <- data.table::data.table(
  id = 1:5, age = c(25, 30, 45, 50, 60),
  w = rep(1, 5)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "ech",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
get_metadata(svy)

Get recipe from repository or API

Description

This function retrieves data transformation recipes from the metasurvey repository or API, based on specific criteria such as survey type, edition, and topic. It is the primary way to access predefined and community-validated recipes.

Usage

get_recipe(
  svy_type = NULL,
  svy_edition = NULL,
  topic = NULL,
  allowMultiple = TRUE
)

Arguments

svy_type

String specifying the survey type. Examples: "ech", "eaii", "eai", "eph"

svy_edition

String specifying the survey edition. Supported formats: "YYYY", "YYYYMM", "YYYY-YYYY"

topic

String specifying the recipe topic. Examples: "labor_market", "poverty", "income", "demographics"

allowMultiple

Logical indicating whether multiple recipes are allowed. If FALSE and multiple matches exist, returns the most recent one

Details

This function is essential for:

Accessing official recipes: Get recipes validated and maintained by specialized teams
Reproducibility: Ensure different users apply the same standard transformations
Automation: Integrate recipes into automatic pipelines
Collaboration: Share methodologies between teams and organizations
Versioning: Access different recipe versions according to edition

The function queries the metasurvey API to retrieve recipes. Internet connection is required. If the API is unavailable or you need to work offline:

Working Offline:

Don't call get_recipe() - work directly with steps
Set options(metasurvey.skip_recipes = TRUE) to disable API calls
Load recipes from local files using read_recipe()
Create custom recipes with recipe()

Search criteria are combined with AND operator, so all specified criteria must match for a recipe to be returned.
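
The AND semantics can be illustrated offline with a small in-memory catalog. Everything in this sketch (catalog, match_recipes(), the recipe names) is hypothetical; it mimics the matching rule only and does not query the API.

```r
catalog <- list(
  list(name = "Labor 2023", svy_type = "ech", svy_edition = "2023",
       topic = "labor_market"),
  list(name = "Poverty 2023", svy_type = "ech", svy_edition = "2023",
       topic = "poverty"),
  list(name = "Labor 2022", svy_type = "ech", svy_edition = "2022",
       topic = "labor_market")
)

match_recipes <- function(catalog, svy_type = NULL, svy_edition = NULL,
                          topic = NULL) {
  # Criteria left as NULL are ignored; the rest must all match.
  criteria <- Filter(Negate(is.null), list(
    svy_type = svy_type, svy_edition = svy_edition, topic = topic
  ))
  keep <- vapply(catalog, function(rec) {
    all(vapply(names(criteria), function(k) {
      identical(rec[[k]], criteria[[k]])
    }, logical(1)))
  }, logical(1))
  catalog[keep]
}

hits <- match_recipes(catalog, svy_type = "ech", topic = "labor_market")
hit_names <- vapply(hits, function(rec) rec$name, character(1))
hit_names
```

Adding svy_edition = "2023" to the call narrows the result to a single entry, since every supplied criterion has to match.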

Value

Recipe object or list of Recipe objects according to the specified criteria and the value of allowMultiple

Examples

## Not run: 
# Get specific recipe for ECH 2023
ech_recipe <- get_recipe(
  svy_type = "ech",
  svy_edition = "2023"
)

# Recipe for specific topic
labor_recipe <- get_recipe(
  svy_type = "ech",
  svy_edition = "2023",
  topic = "labor_market"
)

# Allow multiple recipes
available_recipes <- get_recipe(
  svy_type = "eaii",
  svy_edition = "2019-2021",
  allowMultiple = TRUE
)

# Use recipe in load_survey
ech_with_recipe <- load_survey(
  path = "ech_2023.dta",
  svy_type = "ech",
  svy_edition = "2023",
  recipes = get_recipe("ech", "2023"),
  bake = TRUE
)

# Working offline - don't use recipes
ech_offline <- load_survey(
  path = "ech_2023.dta",
  svy_type = "ech",
  svy_edition = "2023",
  svy_weight = add_weight(annual = "PESOANO")
)

# Disable recipe API globally
options(metasurvey.skip_recipes = TRUE)
# Now get_recipe() will return NULL with a warning

# For year ranges
panel_recipe <- get_recipe(
  svy_type = "ech_panel",
  svy_edition = "2020-2023"
)

## End(Not run)

get_steps

Description

Get steps from survey

Usage

get_steps(svy)

Arguments

svy

Survey object

Value

List of Step objects

Examples

dt <- data.table::data.table(
  id = 1:5, age = c(25, 30, 45, 50, 60),
  w = rep(1, 5)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "ech",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
svy <- step_compute(svy, age2 = age * 2)
get_steps(svy) # list of Step objects

Get workflow backend

Description

Returns the currently configured workflow backend. Defaults to "local" if not configured.

Usage

get_workflow_backend()

Value

WorkflowBackend object

Examples

backend <- get_workflow_backend()

Group dates

Description

Group dates

Usage

group_dates(dates, type = c("monthly", "quarterly", "biannual"))

Arguments

dates

Vector of Date objects.

type

Grouping type: "monthly", "quarterly", or "biannual".

Value

Integer vector of group indices (e.g. 1-12 for monthly, 1-4 for quarterly).

Examples

dates <- as.Date(c(
  "2023-01-15", "2023-04-20",
  "2023-07-10", "2023-11-05"
))
group_dates(dates, "quarterly")
group_dates(dates, "biannual")
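
For reference, calendar quarter and semester indices can be derived from the month number with integer division. This base R sketch assumes the groups follow the calendar year; it is not taken from the group_dates() source.

```r
dates <- as.Date(c(
  "2023-01-15", "2023-04-20",
  "2023-07-10", "2023-11-05"
))
month <- as.integer(format(dates, "%m"))

quarter <- (month - 1L) %/% 3L + 1L  # months 1-3 -> 1, 4-6 -> 2, ...
semester <- (month - 1L) %/% 6L + 1L # months 1-6 -> 1, 7-12 -> 2

quarter
semester
```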

Check if survey has a design

Description

Check if survey has a design

Usage

has_design(svy)

Arguments

svy

A Survey object.

Value

Logical.

Examples

svy <- survey_empty(type = "test", edition = "2023")
has_design(svy) # FALSE

Check if survey has recipes

Description

Check if survey has recipes

Usage

has_recipes(svy)

Arguments

svy

A Survey or RotativePanelSurvey object.

Value

Logical.

Examples

svy <- survey_empty(type = "test", edition = "2023")
has_recipes(svy) # FALSE

Check if survey has steps

Description

Check if survey has steps

Usage

has_steps(svy)

Arguments

svy

A Survey or RotativePanelSurvey object.

Value

Logical.

Examples

svy <- survey_empty(type = "test", edition = "2023")
has_steps(svy) # FALSE

Check if all steps are baked

Description

Returns TRUE when every step attached to the survey has been executed (bake == TRUE), or when there are no steps.

Usage

is_baked(svy)

Arguments

svy

A Survey or RotativePanelSurvey object.

Value

Logical.

Examples

svy <- survey_empty(type = "test", edition = "2023")
is_baked(svy) # TRUE (no steps)
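
The "no steps counts as baked" rule follows naturally from all() over an empty logical vector. A minimal sketch, assuming each step is a list-like object with a logical bake field; all_baked() is a hypothetical name.

```r
all_baked <- function(steps) {
  all(vapply(steps, function(s) isTRUE(s$bake), logical(1)))
}

no_steps <- all_baked(list())
mixed <- all_baked(list(list(bake = TRUE), list(bake = FALSE)))
c(no_steps = no_steps, mixed = mixed)
```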

Detect if a STATA replace RHS is a constant value

Description

Returns TRUE if the expression is a simple numeric constant, string literal, or negative number. Used to decide between step_recode (constants) and step_compute (expressions).

Usage

is_constant_rhs(expr)

Arguments

expr

STATA expression string

Value

Logical
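
One way to approximate this check is with two regular expressions, one for (possibly negative) numbers and one for double-quoted strings. The sketch below is an assumption about what "constant" means here; is_constant_rhs_sketch() is a hypothetical name, and the package function may accept other literal forms.

```r
is_constant_rhs_sketch <- function(expr) {
  expr <- trimws(expr)
  is_number <- grepl("^-?[0-9]+(\\.[0-9]+)?$", expr)
  is_string <- grepl('^"[^"]*"$', expr)
  is_number || is_string
}

checks <- vapply(
  c("5", "-1.5", '"abc"', "x + 1"),
  is_constant_rhs_sketch,
  logical(1)
)
checks
```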

Join broken expression lines after block comment removal

Description

After removing /* */ block comments, expressions can be split across lines. This function re-joins lines where the previous line ends with an operator or open paren, or the next line starts with an operator or close paren.

Usage

join_broken_expressions(lines)

Arguments

lines

Character vector of source lines

Value

Character vector with broken expressions re-joined

Join STATA continuation lines

Description

STATA uses /// for line continuation. This function joins continued lines into single commands.

Usage

join_continuation_lines(lines)

Arguments

lines

Character vector of source lines

Value

Character vector with continuations joined
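
A simplified sketch of the joining logic, assuming the /// marker is the last thing on a continued line (STATA also allows a trailing comment after it, which this sketch ignores). join_continuation_sketch() is a hypothetical name.

```r
join_continuation_sketch <- function(lines) {
  out <- character(0)
  buffer <- ""
  for (line in lines) {
    if (grepl("///\\s*$", line)) {
      # Drop the marker and keep accumulating the command.
      buffer <- paste0(buffer, trimws(sub("///\\s*$", "", line)), " ")
    } else {
      out <- c(out, paste0(buffer, trimws(line)))
      buffer <- ""
    }
  }
  # A dangling continuation at the end of the input is kept as-is.
  if (nzchar(buffer)) out <- c(out, trimws(buffer))
  out
}

joined <- join_continuation_sketch(c(
  "gen total = a + ///",
  "    b + ///",
  "    c",
  "drop a"
))
joined
```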

Lazy processing

Description

Lazy processing

Usage

lazy_default()

Value

Logical indicating the current lazy processing setting.

Examples

# Check current lazy processing default
lazy_default()

List all recipes

Description

List all recipes from the active backend.

Usage

list_recipes()

Value

List of all Recipe objects.

Examples

set_backend("local", path = tempfile(fileext = ".json"))
all <- list_recipes()
length(all)

List all workflows

Description

List all workflows from the active workflow backend.

Usage

list_workflows()

Value

List of all RecipeWorkflow objects.

Examples

set_workflow_backend("local", path = tempfile(fileext = ".json"))
all <- list_workflows()
length(all)

Read panel survey files from different formats and create a RotativePanelSurvey object

Description

Read panel survey files from different formats and create a RotativePanelSurvey object

Usage

load_panel_survey(
  path_implantation,
  path_follow_up,
  svy_type,
  svy_weight_implantation,
  svy_weight_follow_up,
  svy_strata = NULL,
  ...
)

Arguments

path_implantation

Path to the implantation survey file. Supported formats: csv, xlsx, dta, sav, and rds

path_follow_up

Path to a directory that contains only valid follow-up survey files, or a character vector of file paths.

svy_type

String with the survey type. Supported types: "ech" (Encuesta Continua de Hogares, Uruguay), "eph" (Encuesta Permanente de Hogares, Argentina), "eai" (Encuesta de Actividades de Innovación, Uruguay)

svy_weight_implantation

List with implantation survey weight information, specifying the periodicity and the name of the weight variable. Using the helper function add_weight() is recommended.

svy_weight_follow_up

List with follow-up survey weight information, specifying the periodicity and the name of the weight variable. Using the helper function add_weight() is recommended.

svy_strata

Stratification variable name (character or NULL). Passed to Survey$new(strata = ...).

...

Further arguments to be passed to load_panel_survey

Value

RotativePanelSurvey object

Examples

## Not run: 
# example code
path_dir <- here::here("example-data", "ech", "ech_2023")
ech_2023 <- load_panel_survey(
  path_implantation = file.path(
    path_dir,
    "ECH_implantacion_2023.csv"
  ),
  path_follow_up = file.path(
    path_dir,
    "seguimiento"
  ),
  svy_type = "ECH_2023",
  svy_weight_implantation = add_weight(
    annual = "W_ANO"
  ),
  svy_weight_follow_up = add_weight(
    monthly = add_replicate(
      "W",
      replicate_path = file.path(
        path_dir,
        c(
          "Pesos replicados Bootstrap mensuales enero_junio 2023",
          "Pesos replicados Bootstrap mensuales julio_diciembre 2023"
        ),
        c(
          "Pesos replicados mensuales enero_junio 2023",
          "Pesos replicados mensuales Julio_diciembre 2023"
        )
      ),
      replicate_id = c("ID" = "ID"),
      replicate_pattern = "wr[0-9]+",
      replicate_type = "bootstrap"
    )
  )
)

## End(Not run)
## Not run: 
# Example of loading a panel survey
panel_survey <- load_panel_survey(
  path_implantation = "path/to/implantation.csv",
  path_follow_up = "path/to/follow_up",
  svy_type = "ech",
  svy_weight_implantation = add_weight(annual = "w_ano"),
  svy_weight_follow_up = add_weight(monthly = "w_monthly")
)
print(panel_survey)

## End(Not run)

Load survey from file and create Survey object

Description

This function reads survey files in multiple formats and creates a Survey object with all necessary metadata for subsequent analysis. Supports various survey types with specific configurations for each one.

Usage

load_survey(
  path = NULL,
  svy_type = NULL,
  svy_edition = NULL,
  svy_weight = NULL,
  svy_psu = NULL,
  svy_strata = NULL,
  ...,
  bake = FALSE,
  recipes = NULL
)

Arguments

path

Path to the survey file. Supports multiple formats: csv, xlsx, dta (Stata), sav (SPSS), rds (R). If NULL, survey arguments must be specified to create an empty object

svy_type

Survey type as string. Supported types:

"ech": Encuesta Continua de Hogares (Uruguay)
"eph": Encuesta Permanente de Hogares (Argentina)
"eai": Encuesta de Actividades de Innovación (Uruguay)
"eaii": Encuesta de Actividades de Innovación e I+D (Uruguay)

svy_edition

Survey edition as string. Supports different temporal patterns:

"YYYY": Year (e.g., "2023")
"YYYYMM" or "MMYYYY": Year-month (e.g., "202301" or "012023")
"YYYY-YYYY": Year range (e.g., "2020-2022")

svy_weight

List with weight information specifying the periodicity and the weight variable name. Use the helper function add_weight() to build it

svy_psu

Primary sampling unit (PSU) variable as string

svy_strata

Stratification variable name as string (optional). Used in survey::svydesign() for stratified sampling designs.

...

Additional arguments passed to specific reading functions

bake

Logical indicating whether recipes are processed automatically when loading data. Defaults to FALSE

recipes

Recipe object obtained with get_recipe. If bake=TRUE, these recipes are applied automatically

Details

The function automatically detects file format and uses the appropriate reader. For each survey type, it applies specific configurations such as standard variables, data types, and validations.

When bake=TRUE is specified, recipes are applied immediately after loading the data, creating an analysis-ready object.

If no path is provided, an empty Survey object is created that can be used to build step pipelines without initial data.

Value

Survey object with structure:

data: Survey data
metadata: Information about type, edition, weights
steps: History of applied transformations
recipes: Available recipes
design: Sample design information

Examples

## Not run: 
# Load ECH 2023 with recipes
ech_2023 <- load_survey(
  path = "data/ech_2023.csv",
  svy_type = "ech",
  svy_edition = "2023",
  svy_weight = add_weight(annual = "pesoano"),
  recipes = get_recipe("ech", "2023"),
  bake = TRUE
)

# Load monthly survey
ech_january <- load_survey(
  path = "data/ech_202301.dta",
  svy_type = "ech",
  svy_edition = "202301",
  svy_weight = add_weight(monthly = "pesomes")
)

# Create empty object for pipeline
pipeline <- load_survey(
  svy_type = "ech",
  svy_edition = "2023"
) %>%
  step_compute(new_var = operation)

# With included example data
ech_example <- load_survey(
  path = load_survey_example("ech", "ech_2022"),
  svy_type = "ech",
  svy_edition = "2022",
  svy_weight = add_weight(annual = "pesoano")
)

## End(Not run)

Load survey example data

Description

Downloads and loads example survey data from the metasurvey data repository. This function provides access to sample datasets for testing and demonstration purposes, including ECH (Continuous Household Survey) and other survey types.

Usage

load_survey_example(svy_type, svy_edition)

Arguments

svy_type

Character string specifying the survey type (e.g., "ech")

svy_edition

Character string specifying the survey edition/year (e.g., "2023")

Details

This function downloads example data from the official metasurvey data repository on GitHub. The data is cached locally in a temporary file to avoid repeated downloads in the same session.
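The caching behaviour described above can be sketched as a check for an existing file before fetching. This is an assumption about the general pattern, not the package's code: cached_path() and fake_download() are hypothetical, and the download is simulated so the example runs offline.

```r
# Hypothetical caching helper; `fetch` stands in for the real download step.
cached_path <- function(key, fetch, cache_dir = tempdir()) {
  dest <- file.path(cache_dir, paste0(key, ".csv"))
  if (!file.exists(dest)) {
    fetch(dest) # only runs on a cache miss
  }
  dest
}

# Simulated "download" that counts how often it is invoked
n_downloads <- 0
fake_download <- function(dest) {
  n_downloads <<- n_downloads + 1
  writeLines(c("id,w", "1,1"), dest)
}

cache_dir <- tempfile("metasurvey_demo_")
dir.create(cache_dir)
p1 <- cached_path("ech_2023", fake_download, cache_dir)
p2 <- cached_path("ech_2023", fake_download, cache_dir) # reuses the cached file
```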

Available survey types and editions can be found at: https://github.com/metasurveyr/metasurvey_data

Value

Character string with the path to the downloaded CSV file containing the example survey data

Examples

## Not run: 
# Load ECH 2023 example data
ech_path <- load_survey_example("ech", "2023")

# Use with load_survey
ech_data <- load_survey(
  path = load_survey_example("ech", "2023"),
  svy_type = "ech",
  svy_edition = "2023"
)

## End(Not run)

Parse a STATA destring command

Description

Parse a STATA destring command

Usage

parse_destring_args(args, options = NULL)

Arguments

args

Arguments string after "destring"

options

Options string

Value

List with var_name, replace (logical), force (logical), gen_var

Parse a STATA .do file into structured commands

Description

Reads a .do file and returns a list of parsed command objects. Handles comment stripping, line continuation, loop expansion, and command tokenization.

Usage

parse_do_file(do_file, encoding = "latin1")

Arguments

do_file

Path to a STATA .do file

encoding

File encoding (default "latin1" for legacy STATA files)

Value

A list of StataCommand lists, each with fields: cmd, args, if_clause, options, raw_line, line_num, capture

Examples


tf <- tempfile(fileext = ".do")
writeLines(c("gen age2 = edad^2", "replace sexo = 1 if sexo == ."), tf)
cmds <- parse_do_file(tf)
length(cmds)
cmds[[1]]$cmd
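
The comment stripping and line continuation mentioned in the description can be illustrated with a small base R sketch. It is not the implementation used by parse_do_file(): preprocess_do_lines() is a hypothetical helper that handles only "///" continuations, full-line "*" comments, and trailing "//" comments, and it ignores block comments.

```r
# Hypothetical preprocessing sketch for .do file lines.
preprocess_do_lines <- function(lines) {
  # Remember which lines continue, then drop the "///" marker and what follows
  continues <- grepl("///", lines, fixed = TRUE)
  lines <- sub("\\s*///.*$", "", lines)
  # Strip trailing "//" comments and blank out full-line "*" comments
  lines <- sub("\\s*//.*$", "", lines)
  lines[grepl("^\\s*\\*", lines)] <- ""
  # Join each continued line with the line that follows it
  out <- character(0)
  buffer <- ""
  for (i in seq_along(lines)) {
    buffer <- trimws(paste(buffer, trimws(lines[i])))
    if (!continues[i]) {
      if (nzchar(buffer)) out <- c(out, buffer)
      buffer <- ""
    }
  }
  if (nzchar(buffer)) out <- c(out, buffer)
  out
}

cleaned <- preprocess_do_lines(c(
  "* compute squared age",
  "gen age2 = ///",
  "  edad^2  // quadratic term",
  "replace sexo = 1 if sexo == ."
))
```

Here cleaned holds two logical commands: the joined gen statement and the replace statement.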

Parse a STATA egen command

Description

Parse a STATA egen command

Usage

parse_egen_args(args, by_group = NULL, options = NULL)

Arguments

args

Arguments string after "egen"

by_group

By-group from bysort prefix or by() option

options

Options string

Value

List with var_name, func, func_arg, by_group

Parse a STATA gen command into variable name and expression

Description

Parse a STATA gen command into variable name and expression

Usage

parse_gen_args(args)

Arguments

args

The arguments part of a gen command (after "gen")

Value

List with var_name and expr

Parse a STATA mvencode command

Description

Parse a STATA mvencode command

Usage

parse_mvencode_args(args, options = NULL)

Arguments

args

Arguments string after "mvencode"

options

Options string

Value

List with var_names and mv_value

Parse a STATA recode command

Description

Handles patterns like:

recode var (old=new) (old=new)
recode var (old=new), gen(newvar)
recode var .=0
recode var 23/38=22

Usage

parse_recode_args(args, options = NULL)

Arguments

args

Arguments string after "recode"

options

Options string (may contain gen())

Value

List with var_name, gen_var (or NULL), and mappings list
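
As an illustration of the parenthesised pattern, the sketch below extracts "(old=new)" pairs with regular expressions. extract_recode_pairs() is hypothetical, the old/new layout of each mapping is an assumption, and bare rules such as ".=0" or "23/38=22" are deliberately not handled.

```r
# Hypothetical sketch: pull "(old=new)" pairs out of a recode argument string.
extract_recode_pairs <- function(args) {
  rules <- regmatches(args, gregexpr("\\(([^()=]+)=([^()]+)\\)", args))[[1]]
  rules <- gsub("^\\(|\\)$", "", rules)        # drop surrounding parentheses
  parts <- strsplit(rules, "=", fixed = TRUE)
  list(
    var_name = trimws(sub("\\(.*$", "", args)), # text before the first rule
    mappings = lapply(parts, function(p) {
      list(old = trimws(p[1]), new = trimws(p[2]))
    })
  )
}

res <- extract_recode_pairs("sexo (1=0) (2=1)")
```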

Parse a STATA replace command into variable name and expression

Description

Parse a STATA replace command into variable name and expression

Usage

parse_replace_args(args)

Arguments

args

The arguments part of a replace command

Value

List with var_name and expr

Parse a single STATA command line into structured data

Description

Parse a single STATA command line into structured data

Usage

parse_stata_command(line, line_num = NA_integer_)

Arguments

line

A single STATA command string

line_num

Original line number (for error reporting)

Value

A list with cmd, args, if_clause, options, raw_line, line_num, capture, or NULL for empty/skippable lines

Parse STATA label commands from source lines

Description

Extracts variable labels, value label definitions, and value label assignments from label commands.

Usage

parse_stata_labels(lines)

Arguments

lines

Character vector of source lines (already comment-stripped)

Value

A list with var_labels (named list) and val_labels (named list of named lists)

Examples


lines <- c(
  'label variable edad "Age in years"',
  'label define sexo_lbl 1 "Male" 2 "Female"',
  "label values sexo sexo_lbl"
)
labels <- parse_stata_labels(lines)
labels$var_labels
labels$val_labels

Print method for Recipe objects

Description

Displays a formatted recipe card showing metadata, required variables, pipeline steps, and produced variables.

Usage

## S3 method for class 'Recipe'
print(x, ...)

Arguments

x

A Recipe object

...

Additional arguments (currently unused)

Value

Invisibly returns the Recipe object

Examples

rec <- Recipe$new(
  id = "r1", name = "Example", user = "tester",
  edition = "2023", survey_type = "test",
  default_engine = "data.table", depends_on = list(),
  description = "Demo recipe", steps = list()
)
print(rec)

Print method for RecipeWorkflow objects

Description

Print method for RecipeWorkflow objects

Usage

## S3 method for class 'RecipeWorkflow'
print(x, ...)

Arguments

x

A RecipeWorkflow object

...

Additional arguments (unused)

Value

Invisibly returns the object

Examples

wf <- RecipeWorkflow$new(
  id = "w1", name = "Example Workflow", user = "tester",
  edition = "2023", survey_type = "test",
  recipe_ids = "r1",
  calls = list(), description = "Demo"
)
print(wf)

Print provenance information

Description

Print provenance information

Usage

## S3 method for class 'metasurvey_provenance'
print(x, ...)

Arguments

x

A metasurvey_provenance list.

...

Additional arguments (unused).

Value

Invisibly returns x.

Examples

s <- survey_empty("ech", "2023")
s <- set_data(s, data.table::data.table(age = 18:65))
s <- step_compute(s, age2 = age * 2)
s <- bake_steps(s)
print(provenance(s))

Print provenance diff

Description

Print provenance diff

Usage

## S3 method for class 'metasurvey_provenance_diff'
print(x, ...)

Arguments

x

A metasurvey_provenance_diff list.

...

Additional arguments (unused).

Value

Invisibly returns x.

Examples

s1 <- survey_empty("ech", "2022")
s1 <- set_data(s1, data.table::data.table(age = 18:65))
s1 <- step_compute(s1, age2 = age * 2)
s1 <- bake_steps(s1)

s2 <- survey_empty("ech", "2023")
s2 <- set_data(s2, data.table::data.table(age = 20:70))
s2 <- step_compute(s2, age2 = age * 2)
s2 <- bake_steps(s2)

diff_result <- provenance_diff(provenance(s1), provenance(s2))
print(diff_result)

Get provenance from a survey or workflow result

Description

Returns the provenance metadata recording the full data lineage: source file, step history with row counts, and environment info.

Usage

provenance(x, ...)

## S3 method for class 'Survey'
provenance(x, ...)

## S3 method for class 'data.table'
provenance(x, ...)

## Default S3 method:
provenance(x, ...)

Arguments

x

A Survey object or a data.table from workflow().

...

Additional arguments (unused).

Value

A metasurvey_provenance list, or NULL if no provenance is available.

Examples

svy <- Survey$new(
  data = data.table::data.table(id = 1:10, age = 20:29, w = 1),
  edition = "2023", type = "test", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
provenance(svy)

Compare two provenance objects

Description

Shows differences between two provenance records, useful for comparing processing across survey editions.

Usage

provenance_diff(prov1, prov2)

Arguments

prov1

First provenance list.

prov2

Second provenance list.

Value

A metasurvey_provenance_diff list with detected differences.

Examples

svy1 <- Survey$new(
  data = data.table::data.table(id = 1:5, w = rep(1, 5)),
  edition = "2023", type = "test",
  engine = "data.table", weight = add_weight(annual = "w")
)
svy2 <- Survey$new(
  data = data.table::data.table(id = 1:5, w = rep(2, 5)),
  edition = "2024", type = "test",
  engine = "data.table", weight = add_weight(annual = "w")
)
provenance_diff(provenance(svy1), provenance(svy2))

Export provenance to JSON

Description

Serializes a provenance object to JSON format, optionally writing to a file.

Usage

provenance_to_json(prov, path = NULL)

Arguments

prov

A metasurvey_provenance list.

path

File path to write JSON. If NULL, returns the JSON string.

Value

JSON string (invisibly if path is provided).

Examples

svy <- Survey$new(
  data = data.table::data.table(id = 1:5, w = rep(1, 5)),
  edition = "2023", type = "test",
  engine = "data.table", weight = add_weight(annual = "w")
)
prov <- provenance(svy)
provenance_to_json(prov)

Publish Recipe

Description

Publishes a Recipe object to the active backend (local JSON registry or remote API).

Usage

publish_recipe(recipe)

Arguments

recipe

A Recipe object.

Value

The Recipe object (invisibly).

Examples

set_backend("local", path = tempfile(fileext = ".json"))
r <- recipe(
  name = "Example", user = "Test",
  svy = survey_empty(type = "ech", edition = "2023"),
  description = "Example recipe"
)
publish_recipe(r)
length(list_recipes())

Publish a workflow to the active backend

Description

Publishes a RecipeWorkflow to the configured workflow backend.

Usage

publish_workflow(wf)

Arguments

wf

A RecipeWorkflow object.

Value

NULL (called for side effect).

Examples

set_workflow_backend("local", path = tempfile(fileext = ".json"))
wf <- RecipeWorkflow$new(
  name = "Example", description = "Test",
  survey_type = "ech", edition = "2023",
  recipe_ids = "r_001", estimation_type = "svymean"
)
publish_workflow(wf)

Rank recipes by downloads

Description

Get the top recipes ranked by download count from the active backend.

Usage

rank_recipes(n = NULL)

Arguments

n

Integer. Maximum number of recipes to return, or NULL for all.

Value

List of Recipe objects sorted by downloads (descending).

Examples

set_backend("local", path = tempfile(fileext = ".json"))
top10 <- rank_recipes(n = 10)

Rank workflows by downloads

Description

Get the top workflows ranked by download count.

Usage

rank_workflows(n = NULL)

Arguments

n

Integer or NULL (default NULL). Maximum number to return, or NULL for all.

Value

List of RecipeWorkflow objects sorted by downloads.

Examples

set_workflow_backend("local", path = tempfile(fileext = ".json"))
top5 <- rank_workflows(n = 5)
length(top5)

Read Recipe

Description

Reads a Recipe object from a JSON file.

Usage

read_recipe(file)

Arguments

file

A character string specifying the file path.

Details

This function reads a JSON file and decodes it into a Recipe object.

Value

A Recipe object.

Examples

r <- recipe(
  name = "Example", user = "Test",
  svy = survey_empty(type = "ech", edition = "2023"),
  description = "Example recipe"
)
f <- tempfile(fileext = ".json")
save_recipe(r, f)
r2 <- read_recipe(f)
r2

Read a RecipeWorkflow from a JSON file

Description

Read a RecipeWorkflow from a JSON file

Usage

read_workflow(file)

Arguments

file

Character file path

Value

A RecipeWorkflow object

Examples

wf <- RecipeWorkflow$new(
  name = "Example", description = "Test",
  survey_type = "ech", edition = "2023",
  recipe_ids = "r_001", estimation_type = "svymean"
)
f <- tempfile(fileext = ".json")
save_workflow(wf, f)
wf2 <- read_workflow(f)

Create a survey data transformation recipe

Description

Creates a Recipe object that encapsulates a sequence of data transformations that can be applied to surveys in a reproducible manner. Recipes allow documenting, sharing, and reusing data processing workflows.

Usage

recipe(...)

Arguments

...

Required metadata and optional steps. Required parameters:

name: Descriptive name for the recipe
user: User/author creating the recipe
svy: Base Survey object (use survey_empty() for generic recipes)
description: Detailed description of the recipe's purpose

Optional parameters include data transformation steps.

Details

Recipes are essential for:

Reproducibility: Ensure transformations are applied consistently
Documentation: Keep a record of what transformations are performed and why
Collaboration: Share workflows between users and teams
Versioning: Maintain different processing versions for different editions
Automation: Apply complex transformations automatically

Steps included in the recipe can be any combination of step_compute, step_recode, or other transformation steps.

Recipes can be saved with save_recipe(), loaded with read_recipe(), and applied automatically with bake_recipes().

Value

A Recipe object containing metadata, transformation steps, dependency information, and default engine configuration.

Examples

# Basic recipe without steps
r <- recipe(
  name = "Basic ECH Indicators",
  user = "Analyst",
  svy = survey_empty(type = "ech", edition = "2023"),
  description = "Basic labor indicators for ECH 2023"
)
r


# Recipe with steps using local data
dt <- data.table::data.table(
  id = 1:50, age = sample(18:65, 50, TRUE),
  income = runif(50, 1000, 5000), w = runif(50, 0.5, 2)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "demo",
  psu = NULL, engine = "data.table",
  weight = add_weight(annual = "w")
)
svy <- svy |>
  step_compute(income_cat = ifelse(income > 3000, "high", "low")) |>
  step_recode(age_group, age < 30 ~ "young", .default = "adult")
r2 <- recipe(
  name = "Demo", user = "test", svy = svy,
  description = "Demo recipe", steps = get_steps(svy)
)
r2

Create a recipe category

Description

Creates a RecipeCategory object for classifying recipes.

Usage

recipe_category(name, description = "", parent = NULL)

Arguments

name

Character. Category identifier (e.g. "labor_market").

description

Character. Human-readable description. Defaults to empty.

parent

RecipeCategory object or character parent category name (default NULL). If a string is provided, it creates a parent category with that name.

Value

A RecipeCategory object.

Examples

cat <- recipe_category("labor_market", "Labor market indicators")

# With parent hierarchy
sub <- recipe_category(
  "employment", "Employment stats",
  parent = "labor_market"
)

Create a recipe certification

Description

Creates a RecipeCertification object. Typically you would use certify_recipe to certify a recipe in a pipeline instead.

Usage

recipe_certification(level = "community", certified_by = NULL, notes = NULL)

Arguments

level

Character. One of "community" (default), "reviewed", or "official".

certified_by

RecipeUser or NULL (default NULL). Required for reviewed/official.

notes

Character or NULL (default NULL). Additional notes.

Value

A RecipeCertification object.

Examples

# Default community certification
cert <- recipe_certification()

# Official certification
inst <- recipe_user("IECON", type = "institution")
cert <- recipe_certification("official", certified_by = inst)

Convert a Recipe to JSON

Description

Converts a Recipe object to its JSON representation.

Usage

recipe_to_json(recipe)

Arguments

recipe

A Recipe object

Value

A JSON object

Create a recipe user

Description

Creates a RecipeUser object with a simple functional interface.

Usage

recipe_user(
  name,
  type = "individual",
  email = NULL,
  affiliation = NULL,
  institution = NULL,
  url = NULL,
  verified = FALSE
)

Arguments

name

Character. User or institution name.

type

Character. One of "individual" (default), "institutional_member", or "institution".

email

Character or NULL (default NULL). Email address.

affiliation

Character or NULL (default NULL). Organizational affiliation.

institution

RecipeUser object or character institution name. Required for "institutional_member" type. If a string is provided, it creates an institution user with that name automatically.

url

Character or NULL (default NULL). Institution URL.

verified

Logical (default FALSE). Whether the account is verified.

Value

A RecipeUser object.

Examples

# Individual user
user <- recipe_user("Juan Perez", email = "juan@example.com")

# Institution
inst <- recipe_user(
  "Instituto de Economia",
  type = "institution", verified = TRUE
)

# Member linked to institution
member <- recipe_user(
  "Maria",
  type = "institutional_member",
  institution = inst
)

# Member with institution name shortcut
member2 <- recipe_user(
  "Pedro",
  type = "institutional_member",
  institution = "IECON"
)

Remove a category from a recipe

Description

Pipe-friendly function to remove a category from a Recipe by name.

Usage

remove_category(recipe, name)

Arguments

recipe

A Recipe object.

name

Character. Category name to remove.

Value

The modified Recipe object.

Examples

r <- recipe(
  name = "Example", user = "Test",
  svy = survey_empty(type = "ech", edition = "2023"),
  description = "Example recipe"
)
r <- r |>
  add_category("labor_market") |>
  remove_category("labor_market")

Reproduce a workflow from its published specification

Description

Given a RecipeWorkflow (typically fetched from the registry), downloads the data, resolves the weight configuration, fetches referenced recipes, and returns a Survey object ready for workflow() estimation.

Usage

reproduce_workflow(wf, data_path = NULL, dest_dir = tempdir())

Arguments

wf

RecipeWorkflow object

data_path

Character path to survey microdata. If NULL, attempts to download from ANDA for ECH surveys.

dest_dir

Character directory for downloaded files

Value

Survey object with recipes applied and weight configuration set

Examples

## Not run: 
wf <- api_get_workflow("w_123")
svy <- reproduce_workflow(wf)

## End(Not run)

Resolve a portable weight specification to a usable weight configuration

Description

Converts the portable weight_spec from a RecipeWorkflow back into the format expected by load_survey() and add_weight(). For replicate weights with ANDA sources, automatically downloads the replicate file.

Usage

resolve_weight_spec(weight_spec, dest_dir = tempdir())

Arguments

weight_spec

Named list from RecipeWorkflow$weight_spec

dest_dir

Character directory for downloaded files (default: tempdir())

Value

Named list compatible with add_weight() output

Examples

## Not run: 
wf <- api_get_workflow("w_123")
weight <- resolve_weight_spec(wf$weight_spec)

## End(Not run)

Save Recipe

Description

Saves a Recipe object to a file in JSON format.

Usage

save_recipe(recipe, file)

Arguments

recipe

A Recipe object.

file

A character string specifying the file path.

Details

This function encodes the Recipe object and writes it to a JSON file.

Value

NULL.

Examples

r <- recipe(
  name = "Example", user = "Test",
  svy = survey_empty(type = "ech", edition = "2023"),
  description = "Example recipe"
)
f <- tempfile(fileext = ".json")
save_recipe(r, f)

Save a RecipeWorkflow to a JSON file

Description

Save a RecipeWorkflow to a JSON file

Usage

save_workflow(wf, file)

Arguments

wf

A RecipeWorkflow object

file

Character file path

Value

NULL (called for side effect)

Examples

wf <- RecipeWorkflow$new(
  name = "Example", description = "Test",
  survey_type = "ech", edition = "2023",
  recipe_ids = "r_001", estimation_type = "svymean"
)
f <- tempfile(fileext = ".json")
save_workflow(wf, f)

Search recipes

Description

Search for recipes by name or description in the active backend.

Usage

search_recipes(query)

Arguments

query

Character search string.

Value

List of matching Recipe objects.

Examples

set_backend("local", path = tempfile(fileext = ".json"))
r <- recipe(
  name = "Labor Market", user = "Test",
  svy = survey_empty(type = "ech", edition = "2023"),
  description = "Labor market indicators"
)
publish_recipe(r)
results <- search_recipes("labor")
length(results)

Search workflows

Description

Search for workflows by name or description in the active workflow backend.

Usage

search_workflows(query)

Arguments

query

Character search string.

Value

List of matching RecipeWorkflow objects.

Examples

set_workflow_backend("local", path = tempfile(fileext = ".json"))
results <- search_workflows("labor market")
length(results)

Set recipe backend

Description

Configure the active recipe backend via options.

Usage

set_backend(type, path = NULL)

Arguments

type

Character. "local" or "api" (also accepts "mongo" for backward compatibility).

path

Character. File path for local backend.

Value

Invisibly, the RecipeBackend object created.

Examples

set_backend("local", path = tempfile(fileext = ".json"))

Set data on a Survey

Description

Tidy wrapper for svy$set_data(data).

Usage

set_data(svy, data, .copy = FALSE)

Arguments

svy

Survey object

data

A data.frame or data.table with survey microdata

.copy

Logical; if TRUE, clone the Survey before modifying (default FALSE)

Value

The Survey object (invisibly). If .copy=TRUE, returns a new clone.

Examples

dt <- data.table::data.table(id = 1:5, x = rnorm(5), w = rep(1, 5))
svy <- Survey$new(
  data = dt, edition = "2023", type = "test",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
new_dt <- data.table::data.table(id = 1:3, x = rnorm(3), w = rep(1, 3))
svy <- set_data(svy, new_dt)

Configure the survey data engine

Description

Configures the engine to be used for loading surveys. Checks if the provided engine is supported, sets the default engine if none is specified, and generates a message indicating the configured engine. If the engine is not supported, it throws an error.

Usage

set_engine(.engine = show_engines())

Arguments

.engine

Character vector with the name of the engine to configure. By default, the engine returned by the show_engines() function is used.

Value

Invisibly, the previous engine name (for restoring).

Examples


old <- set_engine("data.table")
get_engine()

Set lazy processing

Description

Set lazy processing

Usage

set_lazy_processing(lazy)

Arguments

lazy

Logical. If TRUE, steps are deferred until bake_steps() is called.

Value

Invisibly, the previous value.

Examples

old <- lazy_default()
set_lazy_processing(FALSE)
lazy_default() # now FALSE
set_lazy_processing(old) # restore
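
Returning the previous value invisibly makes it easy to change a setting temporarily and restore it afterwards. The sketch below shows that pattern with base R options only. The option name demo.flag and both helpers are hypothetical and stand in for the package's own setters.

```r
# Hypothetical setter that mirrors the "return the previous value" convention.
set_demo_flag <- function(value) {
  old <- getOption("demo.flag", default = TRUE)
  options(demo.flag = value)
  invisible(old)
}

# Run some code with the flag switched off, restoring it even on error.
with_flag_off <- function(code) {
  old <- set_demo_flag(FALSE)
  on.exit(set_demo_flag(old), add = TRUE)
  force(code) # `code` is evaluated here, while the flag is FALSE
}

inside <- with_flag_off(getOption("demo.flag"))
after <- getOption("demo.flag")
```

Using on.exit() means the previous value is restored even if the wrapped code fails.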

Set data copy option

Description

Configures whether survey operations should create copies of the data or modify existing data in place. This setting affects memory usage and performance across the metasurvey package.

Usage

set_use_copy(use_copy)

Arguments

use_copy

Logical value: TRUE to create data copies (safer), FALSE to modify data in place (more efficient)

Details

Setting use_copy affects all subsequent survey operations:

TRUE (default): Operations create data copies, preserving original data
FALSE: Operations modify data in place, reducing memory usage

Use FALSE for large datasets where memory is a concern, but ensure you don't need the original data after operations.

Value

Invisibly, the previous value (for restoring).

Examples

# Set to use copies (default behavior)
set_use_copy(TRUE)
use_copy_default()

# Set to modify in place for better performance
set_use_copy(FALSE)
use_copy_default()

# Reset to default
set_use_copy(TRUE)

Set user info on a recipe

Description

Pipe-friendly function to assign a RecipeUser to a Recipe.

Usage

set_user_info(recipe, user)

Arguments

recipe

A Recipe object.

user

A RecipeUser object.

Value

The modified Recipe object.

Examples

r <- recipe(
  name = "Example", user = "Test",
  svy = survey_empty(type = "ech", edition = "2023"),
  description = "Example recipe"
)
user <- recipe_user("Juan Perez", email = "juan@example.com")
r <- r |> set_user_info(user)

Set version on a recipe

Description

Pipe-friendly function to set the version string on a Recipe.

Usage

set_version(recipe, version)

Arguments

recipe

A Recipe object.

version

Character version string (e.g. "2.0.0").

Value

The modified Recipe object.

Examples

r <- recipe(
  name = "Example", user = "Test",
  svy = survey_empty(type = "ech", edition = "2023"),
  description = "Example recipe"
)
r <- r |> set_version("2.0.0")

Set workflow backend

Description

Configure the active workflow backend via options.

Usage

set_workflow_backend(type, path = NULL)

Arguments

type

Character. "local" or "api" (also accepts "mongo" for backward compatibility).

path

Character. File path for local backend.

Value

Invisibly, the WorkflowBackend object created.

Examples

set_workflow_backend("local", path = tempfile(fileext = ".json"))

List available survey data engines

Description

Returns a character vector of available engines that can be used for loading surveys.

Usage

show_engines()

Value

Character vector with the names of the available engines.

Examples

show_engines()

Create computation steps for survey variables

Description

Creates a step that computes new variables from expressions evaluated against the survey data. Variable dependencies are detected automatically, and all computations are validated before execution.

Usage

step_compute(
  svy = NULL,
  ...,
  .by = NULL,
  .copy = use_copy_default(),
  comment = "Compute step",
  .level = "auto",
  use_copy = deprecated()
)

Arguments

svy

A Survey or RotativePanelSurvey object. If NULL, creates a step that can be applied later using the pipe operator (%>%)

...

Computation expressions with automatic optimization. Names are assigned using new_var = expression

.by

Vector of variables to group computations by. The system automatically validates these variables exist before execution

.copy

Logical indicating whether to create a copy of the object before applying transformations. Defaults to use_copy_default()

comment

Descriptive text for the step for documentation and traceability. Compatible with Markdown syntax. Defaults to "Compute step"

.level

For RotativePanelSurvey objects (default "auto"), specifies the level where computations are applied: "implantation", "follow_up", "quarter", "month", or "auto"

use_copy

Use .copy instead.

Details

Lazy evaluation (default): By default, steps are recorded but not executed until bake_steps() is called. This allows building a full pipeline before materializing any changes. Set options(metasurvey.lazy_processing = FALSE) to apply steps immediately.

Expression processing: Expressions are evaluated using data.table's := operator. Variable dependencies are detected automatically via all.vars(). Missing variables are caught before execution.
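
A minimal sketch of dependency detection with all.vars(), assuming the check compares the referenced names with the column names of the data. check_dependencies() is a hypothetical helper, not a metasurvey function.

```r
# Hypothetical helper: list the variables an expression needs and fail
# early if any of them is absent from the data.
check_dependencies <- function(expr, data) {
  needed <- all.vars(expr) # function names such as ifelse are not included
  missing_vars <- setdiff(needed, names(data))
  if (length(missing_vars) > 0) {
    stop("Missing variables: ", paste(missing_vars, collapse = ", "))
  }
  invisible(needed)
}

dt <- data.frame(id = 1:3, age = c(25, 30, 45))
deps <- check_dependencies(quote(ifelse(age > 28, 1, 0)), dt)
err <- tryCatch(
  check_dependencies(quote(income / age), dt),
  error = function(e) conditionMessage(e)
)
```

Here deps is "age", and err carries the message naming income as missing.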

Grouped computations: Use .by to compute aggregated values (e.g., group means) that are automatically joined back to the data.
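
The effect of a grouped computation can be illustrated in base R with ave(), which returns one value per row so that the group statistic lines up with the original data. This is an analogue for illustration only; metasurvey itself operates on data.table objects, and the column names are made up.

```r
# Group mean attached to every row of its group (base R analogue of `.by`)
households <- data.frame(
  region = c("a", "a", "b", "b"),
  income = c(10, 20, 30, 50)
)
households$mean_income <- ave(households$income, households$region, FUN = mean)
```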

For RotativePanelSurvey objects, .level controls where computations are applied:

"auto" (default): applies to both implantation and follow-ups
"implantation": household/dwelling level only
"follow_up": individual/person level only

Value

Same type of input object (Survey or RotativePanelSurvey) with new computed variables and the step added to the history

Examples

# Basic computation
dt <- data.table::data.table(
  id = 1:5, age = c(25, 30, 45, 50, 60), w = 1
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "test",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
svy <- svy |> step_compute(age_squared = age^2, comment = "Age squared")
svy <- bake_steps(svy)
get_data(svy)


# ECH example: labor indicator
# ech <- ech |>
#   step_compute(
#     unemployed = ifelse(POBPCOAC %in% 3:5, 1, 0),
#     comment = "Unemployment indicator")

Filter rows from survey data

Description

Creates a step that filters (subsets) rows from the survey data based on logical conditions. Multiple conditions are combined with AND.

Usage

step_filter(
  svy,
  ...,
  .by = NULL,
  .copy = use_copy_default(),
  comment = "Filter step",
  .level = "auto"
)

Arguments

svy

A Survey or RotativePanelSurvey object.

...

Logical expressions evaluated against the data. Each must return a logical vector. Multiple conditions are combined with AND.

.by

Optional grouping variable(s) for within-group filtering.

.copy

Whether to operate on a copy (default: use_copy_default()).

comment

Descriptive text for the step (default "Filter step").

.level

For RotativePanelSurvey objects, the level at which the filter is applied (default "auto"): "implantation", "follow_up", or "auto" (both).

Details

Lazy evaluation (default): Like all steps, filter is recorded but not executed until bake_steps() is called.
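
The AND-combination of multiple conditions can be sketched in base R by evaluating each condition against the data and reducing the results with &. filter_rows() is a hypothetical helper that applies the filter immediately. It does not reproduce the lazy recording done by step_filter().

```r
# Hypothetical sketch: keep rows for which every condition is TRUE.
filter_rows <- function(data, ...) {
  conds <- eval(substitute(alist(...))) # capture conditions unevaluated
  caller <- parent.frame()
  masks <- lapply(conds, function(cond) eval(cond, data, caller))
  keep <- Reduce(`&`, masks, rep(TRUE, nrow(data)))
  data[keep, , drop = FALSE]
}

people <- data.frame(
  id = 1:5,
  age = c(15, 25, 35, 70, 40),
  employed = c(FALSE, TRUE, FALSE, TRUE, TRUE)
)
working_age <- filter_rows(people, age >= 18, age < 65, employed)
```

Only rows 2 and 5 satisfy all three conditions.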

Value

The survey object with rows filtered and the step recorded.

Examples

svy <- Survey$new(
  data = data.table::data.table(
    id = 1:10, age = c(15, 25, 35, 45, 55, 65, 75, 20, 30, 40), w = 1
  ),
  edition = "2023", type = "test", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
svy <- svy |> step_filter(age >= 18) |> bake_steps()
nrow(get_data(svy))

Join external data into survey (step)

Description

Creates a step that joins additional data into a Survey or RotativePanelSurvey.

Usage

step_join(
  svy,
  x,
  by = NULL,
  type = c("left", "inner", "right", "full"),
  suffixes = c("", ".y"),
  .copy = use_copy_default(),
  comment = "Join step",
  use_copy = deprecated(),
  lazy = lazy_default(),
  record = TRUE
)

Arguments

svy

A Survey or RotativePanelSurvey object. If NULL, returns a step call

x

A data.frame/data.table or a Survey to join into svy

by

Character vector of join keys. Named vector for different names between svy and x (names are keys in svy, values are keys in x). If NULL, tries to infer common column names

type

Join type: "left" (default), "inner", "right", or "full"

suffixes

Length-2 character vector of suffixes for conflicting columns from svy and x respectively. Defaults to c("", ".y")

.copy

Whether to operate on a copy (default: use_copy_default())

comment

Optional description for the step (default "Join step").

use_copy

Use .copy instead.

lazy

Internal. Whether to delay execution (default lazy_default()).

record

Internal. Whether to record the step (default TRUE).

Details

Lazy evaluation (default): By default, steps are recorded but not executed until bake_steps() is called.

Supports left, inner, right, and full joins. Allows named by mapping (e.g., c("id" = "code")) or a simple character vector. Conflicting column names are resolved by appending suffixes to the right-hand side columns.
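
The named by mapping and the default suffixes can be illustrated with base R merge(). This is an analogue, not the code step_join() runs: the names of by give the key in the survey data, the values give the key in the joined table, and the data frames are invented for the example.

```r
survey_data <- data.frame(id = 1:3, a = c("x", "y", "z"), b = c(1, 2, 3))
lookup <- data.frame(code = c(1, 2), b = c(10, 20))

by <- c("id" = "code") # key in survey_data = key in lookup

# Left join: keep every survey row; the conflicting column "b" from the
# right-hand table receives the ".y" suffix.
joined <- merge(
  survey_data, lookup,
  by.x = names(by), by.y = unname(by),
  all.x = TRUE,
  suffixes = c("", ".y")
)
```

The unmatched survey row (id 3) is kept, with NA in b.y.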

Value

Modified survey object with the join recorded as a step. With lazy evaluation (the default), the join is applied when bake_steps() is called. For RotativePanelSurvey objects, the join is applied to the implantation survey and to every follow-up survey.

Examples

# With data.frame
s <- Survey$new(
  data = data.table::data.table(id = 1:3, w = 1, a = c("x", "y", "z")),
  edition = "2023", type = "ech", psu = NULL, engine = "data.table",
  weight = add_weight(annual = "w")
)
info <- data.frame(id = c(1, 2), b = c(10, 20))
s2 <- step_join(s, info, by = "id", type = "left")
s2 <- bake_steps(s2)

# With another Survey
s_right <- Survey$new(
  data = data.table::data.table(id = c(2, 3), b = c(200, 300), w2 = 1),
  edition = "2023", type = "ech", psu = NULL, engine = "data.table",
  weight = add_weight(annual = "w2")
)
s3 <- step_join(s, s_right, by = c("id" = "id"), type = "inner")
s3 <- bake_steps(s3)

Create recoding steps for categorical variables

Description

Creates a step that builds a new variable from an ordered sequence of recoding conditions. All conditional expressions are validated before execution.

Usage

step_recode(
  svy,
  new_var,
  ...,
  .default = NA_character_,
  .name_step = NULL,
  ordered = FALSE,
  .copy = use_copy_default(),
  comment = "Recode step",
  .to_factor = FALSE,
  .level = "auto",
  use_copy = deprecated()
)

Arguments

svy

A Survey or RotativePanelSurvey object. If NULL, creates a step that can be applied later using the pipe operator (%>%)

new_var

Name of the new variable to create (unquoted)

...

Sequence of two-sided formulas defining recoding rules. Left-hand side (LHS) is a conditional expression, right-hand side (RHS) defines the replacement value. Format: condition ~ value

.default

Default value assigned when no condition is met. Defaults to NA_character_

.name_step

Custom name for the step in the history. By default (NULL) the name is auto-generated from the variable name. Use comment for user-facing documentation instead.

ordered

Logical indicating whether the new variable should be an ordered factor. Defaults to FALSE

.copy

Logical indicating whether to create a copy of the object before applying transformations. Defaults to use_copy_default()

comment

Descriptive text for the step for documentation and traceability. Compatible with Markdown syntax. Defaults to "Recode step"

.to_factor

Logical indicating whether the new variable should be converted to a factor. Defaults to FALSE

.level

For RotativePanelSurvey objects (default "auto"), specifies the level where recoding is applied: "implantation", "follow_up", "quarter", "month", or "auto"

use_copy

Deprecated. Use .copy instead.

Details

Lazy evaluation (default): By default, steps are recorded but not executed until bake_steps() is called. This allows building a full pipeline before materializing any changes.
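The record-now, run-later idea can be sketched in a few lines of base R. This is a toy model rather than the package's implementation: record_step() and bake() are hypothetical helpers, and the pipeline is a plain list holding the data and the unevaluated step expressions.

```r
# Hypothetical helper: store the expression without evaluating it
record_step <- function(pipeline, expr) {
  pipeline$steps <- c(pipeline$steps, list(substitute(expr)))
  pipeline
}

# Hypothetical helper: evaluate every stored expression in order,
# each time exposing the current data under the name `data`
bake <- function(pipeline) {
  for (step in pipeline$steps) {
    pipeline$data <- eval(step, list(data = pipeline$data))
  }
  pipeline
}

p <- list(data = data.frame(age = c(10, 40)), steps = list())
p <- record_step(p, transform(data, age2 = age^2))
names(p$data)   # unchanged: the step has only been recorded
p <- bake(p)
names(p$data)   # the new column appears after baking
```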

Condition evaluation: Conditions are two-sided formulas evaluated in order. The first matching condition determines the assigned value. If no condition matches, .default is used.

Condition examples:

Simple: variable == 1 ~ "Yes"
Complex: age >= 18 & income > 12000 ~ "High"
Vectorized: variable %in% c(1,2,3) ~ "Group A"
Logical: !is.na(education) ~ "Has education"
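The first-match rule described above can be sketched in base R as follows. This illustrates the semantics only and is not the package's code. Treating a condition that evaluates to NA as "no match" is an assumption of this sketch.

```r
age <- c(10, 25, 45, 60, 70, NA)

# Rules in priority order; the second rule is also TRUE for age 10,
# but the first rule has already claimed that row
rules <- list(
  list(when = age < 18,  value = "Under 18"),
  list(when = age < 65,  value = "Working age"),
  list(when = age >= 65, value = "Senior")
)

age_group <- rep("Unknown", length(age))  # plays the role of .default
assigned  <- rep(FALSE, length(age))

for (rule in rules) {
  # Only rows not yet assigned, with a condition that is TRUE (not NA)
  hit <- !assigned & !is.na(rule$when) & rule$when
  age_group[hit] <- rule$value
  assigned <- assigned | hit
}
age_group
```

Because rows are claimed by the earliest matching rule, narrower conditions should be listed before broader ones.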

Value

An object of the same type as the input (Survey or RotativePanelSurvey) with the recode step added to its history. The new variable is materialized when bake_steps() is called.

Examples

# Basic recode: categorize ages
dt <- data.table::data.table(
  id = 1:6, age = c(10, 25, 45, 60, 70, 80), w = 1
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "test",
  psu = NULL, engine = "data.table",
  weight = add_weight(annual = "w")
)
svy <- svy |>
  step_recode(
    age_group,
    age < 18 ~ "Under 18",
    age >= 18 & age < 65 ~ "Working age",
    age >= 65 ~ "Senior",
    .default = "Unknown"
  )
svy <- bake_steps(svy)
get_data(svy)


# ECH example: labor force status
# ech <- ech |>
#   step_recode(labor_status,
#     POBPCOAC == 2 ~ "Employed",
#     POBPCOAC %in% 3:5 ~ "Unemployed",
#     .default = "Missing")

Remove variables from survey data (step)

Description

Creates a step that removes one or more variables from the survey data when baked.

Usage

step_remove(
  svy,
  ...,
  vars = NULL,
  .copy = use_copy_default(),
  comment = "Remove variables",
  use_copy = deprecated(),
  lazy = lazy_default(),
  record = TRUE
)

Arguments

svy

A Survey or RotativePanelSurvey object

...

Unquoted variable names to remove, or a character vector

vars

Character vector of variable names to remove. Alternative to ... for programmatic use.

.copy

Whether to operate on a copy (default: use_copy_default())

comment

Descriptive text for the step for documentation and traceability (default "Remove variables").

use_copy

Deprecated. Use .copy instead.

lazy

Internal. Whether to delay execution (default lazy_default()).

record

Internal. Whether to record the step (default TRUE).

Details

Lazy evaluation (default): By default, steps are recorded but not executed until bake_steps() is called.

Variables can be specified in two ways:

Unquoted names: step_remove(svy, age, income)
Character vector: step_remove(svy, vars = c("age", "income"))

Variables that don't exist in the data produce a warning (not an error), allowing pipelines to be robust to missing columns.

Value

Survey object with the specified variables removed (or queued for removal).

Examples

dt <- data.table::data.table(
  id = 1:5, age = c(25, 30, 45, 50, 60),
  w = rep(1, 5)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "ech",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
svy2 <- step_remove(svy, age)
svy2 <- bake_steps(svy2)
"age" %in% names(get_data(svy2)) # FALSE

Rename variables in survey data (step)

Description

Creates a step that renames variables in the survey data when baked.

Usage

step_rename(
  svy,
  ...,
  mapping = NULL,
  .copy = use_copy_default(),
  comment = "Rename variables",
  use_copy = deprecated(),
  lazy = lazy_default(),
  record = TRUE
)

Arguments

svy

A Survey or RotativePanelSurvey object

...

Pairs in the form new_name = old_name (unquoted).

mapping

A named character vector of the form c(new_name = "old_name"). Alternative to ... for programmatic use.

.copy

Whether to operate on a copy (default: use_copy_default())

comment

Descriptive text for the step for documentation and traceability (default "Rename variables").

use_copy

Deprecated. Use .copy instead.

lazy

Internal. Whether to delay execution (default lazy_default()).

record

Internal. Whether to record the step (default TRUE).

Details

Lazy evaluation (default): By default, steps are recorded but not executed until bake_steps() is called.

Variables can be renamed in two ways:

Unquoted pairs: step_rename(svy, new_name = old_name)
Named character vector: step_rename(svy, mapping = c(new_name = "old_name"))

Variables that don't exist in the data cause an error, unlike step_remove() which issues a warning.
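The two policies for missing columns (warn when removing, fail when renaming) can be contrasted with a minimal base R sketch. The helpers drop_vars() and rename_vars() are hypothetical and operate on a plain data.frame; they only mirror the documented behaviour, not the package internals.

```r
# Hypothetical helper: removing an unknown column only warns
drop_vars <- function(df, vars) {
  unknown <- setdiff(vars, names(df))
  if (length(unknown) > 0) {
    warning("Skipping unknown variable(s): ",
            paste(unknown, collapse = ", "), call. = FALSE)
  }
  df[, setdiff(names(df), vars), drop = FALSE]
}

# Hypothetical helper: renaming an unknown column is an error.
# `mapping` has the form c(new_name = "old_name")
rename_vars <- function(df, mapping) {
  unknown <- setdiff(unname(mapping), names(df))
  if (length(unknown) > 0) {
    stop("Cannot rename unknown variable(s): ",
         paste(unknown, collapse = ", "), call. = FALSE)
  }
  names(df)[match(mapping, names(df))] <- names(mapping)
  df
}

df <- data.frame(id = 1:3, age = c(25, 30, 45))
dropped <- suppressWarnings(drop_vars(df, c("age", "income")))
names(dropped)
renamed <- rename_vars(df, c(edad = "age"))
names(renamed)
err <- tryCatch(rename_vars(df, c(x = "nope")),
                error = function(e) conditionMessage(e))
err
```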

Value

Survey object with the specified variables renamed (or queued for renaming).

Examples

dt <- data.table::data.table(
  id = 1:5, age = c(25, 30, 45, 50, 60),
  w = rep(1, 5)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "ech",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
svy2 <- step_rename(svy, edad = age)
svy2 <- bake_steps(svy2)
"edad" %in% names(get_data(svy2)) # TRUE

Validate data during the step pipeline

Description

Creates a non-mutating step that checks data invariants when bake_steps is called. Each check is a logical expression evaluated row-wise against the survey data. If any row fails a check, the pipeline stops (or warns).

Usage

step_validate(
  svy,
  ...,
  .action = c("stop", "warn"),
  .min_n = NULL,
  .copy = use_copy_default(),
  comment = "Validate step"
)

Arguments

svy

A Survey or RotativePanelSurvey object

...

Logical expressions evaluated against the data. Each must return a logical vector with one value per row. Named expressions use the name in error messages; unnamed expressions use the deparsed code. Examples: income > 0, !is.na(age), sex %in% c(1, 2).

.action

What to do when a check fails: "stop" (default) raises an error, "warn" issues a warning and continues.

.min_n

Minimum number of rows required. Checked before row-level expressions.

.copy

Whether to operate on a copy (default: use_copy_default())

comment

Descriptive text for the step for documentation and traceability (default "Validate step").

Details

Lazy evaluation (default): Like all steps, validation checks are recorded but not executed until bake_steps is called. This means step_validate can reference variables created by preceding step_compute calls.

The validate step does not modify the data in any way. It only inspects the current state of the data.table and raises an error or warning if any check fails.
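A non-mutating row-wise check can be sketched in base R as below. check_rows() is a hypothetical helper written for this illustration, not a package function. It mirrors the documented arguments (.action, .min_n, named versus unnamed checks) and assumes that a check evaluating to NA counts as a failure.

```r
check_rows <- function(data, ..., .action = c("stop", "warn"), .min_n = NULL) {
  .action <- match.arg(.action)
  signal <- if (.action == "stop") stop else warning

  # Row-count check runs before any row-level expression
  if (!is.null(.min_n) && nrow(data) < .min_n) {
    signal(sprintf("Expected at least %s rows, found %s", .min_n, nrow(data)),
           call. = FALSE)
  }

  checks <- as.list(substitute(list(...)))[-1]  # unevaluated expressions
  labels <- names(checks)
  if (is.null(labels)) labels <- rep("", length(checks))

  for (i in seq_along(checks)) {
    # Named checks use their name; unnamed ones use the deparsed code
    label <- if (nzchar(labels[i])) labels[i] else
      paste(deparse(checks[[i]]), collapse = " ")
    ok <- eval(checks[[i]], data, parent.frame())
    n_bad <- sum(is.na(ok) | !ok)
    if (n_bad > 0) {
      signal(sprintf("Check '%s' failed for %d row(s)", label, n_bad),
             call. = FALSE)
    }
  }
  invisible(data)  # data is returned untouched
}

df <- data.frame(age = c(25, 30, 45), income = c(1000, NA, 3000))
check_rows(df, age > 0, .min_n = 3)  # passes silently
msg <- tryCatch(check_rows(df, has_income = !is.na(income)),
                error = function(e) conditionMessage(e))
msg
```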

Value

The survey object with a validate step recorded (no data mutation).

Examples

dt <- data.table::data.table(
  id = 1:5, age = c(25, 30, 45, 50, 60),
  income = c(1000, 2000, 3000, 4000, 5000), w = 1
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "test",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)

# Validate that all ages are positive, income is not NA, and there are at least 3 rows
svy <- svy |>
  step_validate(age > 0, !is.na(income), .min_n = 3) |>
  bake_steps()

Convert a list of steps to a recipe

Description

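Builds a Recipe object from a list of recorded steps together with descriptive metadata (name, user, description, and optionally a DOI and topic).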

Usage

steps_to_recipe(
  name,
  user,
  svy = survey_empty(type = "eaii", edition = "2019-2021"),
  description,
  steps,
  doi = NULL,
  topic = NULL
)

Arguments

name

A character string with the name of the recipe

user

A character string identifying the user who authored the recipe

svy

A Survey object. Defaults to an empty survey created with survey_empty(type = "eaii", edition = "2019-2021")

description

A character string with the description of the recipe

steps

A list with the steps of the recipe

doi

A character string with the DOI of the recipe

topic

A character string with the topic of the recipe

Value

A Recipe object

Examples


dt <- data.table::data.table(
  id = 1:20, age = sample(18:65, 20, TRUE),
  w = runif(20, 0.5, 2)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "demo",
  psu = NULL, engine = "data.table",
  weight = add_weight(annual = "w")
)
svy <- step_compute(svy, age2 = age^2)
my_recipe <- steps_to_recipe(
  name = "age_vars", user = "analyst",
  svy = svy, description = "Age-derived variables",
  steps = get_steps(svy)
)
my_recipe

Strip STATA comments from source lines

Description

Removes single-line comments (// and * at start) and multi-line block comments (/* ... */).

Usage

strip_stata_comments(lines)

Arguments

lines

Character vector of source lines

Value

Character vector with comments removed
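The three comment forms can be removed with regular expressions, as in the following base R sketch. strip_comments_sketch() is a hypothetical stand-in written for illustration and may differ from the package function in edge cases: for instance, it does not protect // sequences inside quoted strings, and it drops lines left empty.

```r
strip_comments_sketch <- function(lines) {
  text <- paste(lines, collapse = "\n")
  # Block comments, possibly spanning lines; (?s) lets "." match newlines
  text <- gsub("(?s)/\\*.*?\\*/", "", text, perl = TRUE)
  out <- strsplit(text, "\n", fixed = TRUE)[[1]]
  out <- out[!grepl("^\\s*\\*", out)]   # whole-line comments starting with *
  out <- sub("\\s*//.*$", "", out)      # trailing // comments
  out[nzchar(trimws(out))]              # drop lines left empty
}

src <- c(
  "* header comment",
  "gen x = 1 // inline note",
  "/* block",
  "still block */",
  "drop z"
)
cleaned <- strip_comments_sketch(src)
cleaned
```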

survey_empty

Description

Create an empty survey

Usage

survey_empty(
  edition = NULL,
  type = NULL,
  weight = NULL,
  engine = NULL,
  psu = NULL,
  strata = NULL
)

Arguments

edition

Edition of survey

type

Type of survey

weight

Weight of survey

engine

Engine of survey

psu

PSU variable or formula (optional)

strata

Stratification variable name (optional)

Value

Survey object

Examples

empty <- survey_empty(edition = "2023", type = "test")
empty_typed <- survey_empty(edition = "2023", type = "ech")

survey_to_data_frame

Description

Convert survey to data.frame

Usage

survey_to_data_frame(svy)

Arguments

svy

Survey object

Value

data.frame

Examples

dt <- data.table::data.table(
  id = 1:5, age = c(25, 30, 45, 50, 60),
  w = rep(1, 5)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "ech",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
df <- survey_to_data_frame(svy)
class(df) # "data.frame"

Convert survey to data.table

Description

Extracts the survey microdata as a data.table.

Usage

survey_to_datatable(svy)

survey_to_data.table(svy)

Arguments

svy

Survey object.

Value

A data.table.

Examples

dt <- data.table::data.table(
  id = 1:5, age = c(25, 30, 45, 50, 60),
  w = rep(1, 5)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "ech",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
result <- survey_to_datatable(svy)
data.table::is.data.table(result) # TRUE

survey_to_tibble

Description

Convert survey to tibble

Usage

survey_to_tibble(svy)

Arguments

svy

Survey object

Value

A tibble (tbl_df) containing the survey data.

Examples


dt <- data.table::data.table(
  id = 1:5, age = c(25, 30, 45, 50, 60),
  w = rep(1, 5)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "ech",
  psu = NULL, engine = "data.table", weight = add_weight(annual = "w")
)
tbl <- survey_to_tibble(svy)
class(tbl)

Translate a STATA expression to an R expression string

Description

Converts STATA-specific syntax (inrange, inlist, missing values, etc.) to equivalent R expressions suitable for data.table evaluation.

Usage

translate_stata_expr(expr)

Arguments

expr

STATA expression string

Value

R expression string
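To give a sense of the kind of rewriting involved, here is a small regex-based sketch in base R. translate_sketch() is hypothetical and covers only three simple patterns; the exact strings produced by translate_stata_expr() are not documented here and may differ.

```r
translate_sketch <- function(expr) {
  # inrange(x, a, b)  ->  (x >= a & x <= b)
  expr <- gsub(
    "inrange\\(\\s*(\\w+)\\s*,\\s*([^,]+?)\\s*,\\s*([^)]+?)\\s*\\)",
    "(\\1 >= \\2 & \\1 <= \\3)", expr, perl = TRUE
  )
  # inlist(x, v1, v2, ...)  ->  x %in% c(v1, v2, ...)
  expr <- gsub(
    "inlist\\(\\s*(\\w+)\\s*,\\s*([^)]+?)\\s*\\)",
    "\\1 %in% c(\\2)", expr, perl = TRUE
  )
  # x == .  (STATA system missing)  ->  is.na(x); ".5" and similar are left alone
  expr <- gsub("(\\w+)\\s*==\\s*\\.(?![0-9])", "is.na(\\1)", expr, perl = TRUE)
  expr
}

r1 <- translate_sketch("inrange(edad, 14, 64) & inlist(sexo, 1, 2)")
r1
r2 <- translate_sketch("sexo == .")
r2
```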

Analyze transpilation coverage for STATA do-files

Description

Reports what percentage of commands in a .do file (or directory of files) can be automatically transpiled vs require manual review.

Usage

transpile_coverage(path, recursive = TRUE)

Arguments

path

Path to a .do file or directory of .do files

recursive

If TRUE and path is a directory, search subdirectories

Value

A data.frame with columns: file, total_commands, translated, skipped, manual_review, coverage_pct

Examples


tf <- tempfile(fileext = ".do")
writeLines(c("gen x = 1", "replace x = 2 if y == 3", "drop z"), tf)
transpile_coverage(tf)

Transpile a STATA .do file to metasurvey steps

Description

Parses a STATA .do file and translates its commands into metasurvey step call strings suitable for use in Recipe objects.

Usage

transpile_stata(do_file, survey_type = "ech", user = "iecon", strict = FALSE)

Arguments

do_file

Path to a STATA .do file

survey_type

Survey type (default "ech")

user

Author name for the recipe

strict

If TRUE, stops on untranslatable commands; if FALSE, inserts MANUAL_REVIEW comments as warnings

Value

A list with:

steps: character vector of step call strings
labels: list with var_labels and val_labels (if label commands found)
warnings: character vector of MANUAL_REVIEW items
stats: list with command counts

Examples


tf <- tempfile(fileext = ".do")
writeLines(c("gen age2 = edad^2", "replace sexo = 1 if sexo == ."), tf)
result <- transpile_stata(tf)
result$steps
result$stats

Transpile and group do-files by thematic module

Description

Processes all do-files in a year directory and groups them into thematic Recipe objects (demographics, income, etc.).

Usage

transpile_stata_module(year_dir, year, user = "iecon", output_dir = NULL)

Arguments

year_dir

Path to a year directory (e.g., "do_files_iecon/2022")

year

Year of the edition (character or numeric)

user

Author name

output_dir

Directory to write JSON recipes (NULL = no file output)

Value

A named list of Recipe objects, one per thematic module

Examples

## Not run: 
# Requires a directory of .do files organized by year
recipes <- transpile_stata_module("do_files_iecon/2022", year = 2022)
names(recipes)

## End(Not run)

Get data copy option

Description

Retrieves the current setting for the use_copy option, which controls whether survey operations create copies of the data or modify in place.

Usage

use_copy_default()

Details

The use_copy option affects memory usage and performance:

TRUE: Creates copies, safer but uses more memory
FALSE: Modifies in place, more efficient but requires caution
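The trade-off comes from data.table's reference semantics, which the following snippet demonstrates directly with data.table (it does not call any metasurvey function). Assigning a data.table to a new name does not copy it, so := through either name changes the same table; data.table::copy() creates an independent table.

```r
library(data.table)

original <- data.table(id = 1:3)

alias <- original              # no copy: both names refer to the same table
alias[, flag := TRUE]          # modifies in place
"flag" %in% names(original)    # the original sees the new column

safe <- copy(original)         # explicit deep copy
safe[, extra := 1L]
"extra" %in% names(original)   # the original is unaffected
```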

Value

Logical value indicating whether to use data copies (TRUE) or modify data in place (FALSE). Default is TRUE.

Examples

# Check current setting
current_setting <- use_copy_default()
print(current_setting)

Check valid recipe

Description

Check valid recipe

Usage

validate_recipe(svy_type, svy_edition, recipe_svy_edition, recipe_svy_type)

Arguments

svy_type

Type of survey

svy_edition

Edition of the survey

recipe_svy_edition

Edition of the recipe

recipe_svy_type

Type of the recipe

Value

Logical

Validate time pattern

Description

Validate time pattern

Usage

validate_time_pattern(svy_type = NULL, svy_edition = NULL)

Arguments

svy_type

Survey type (e.g. "ech").

svy_edition

Survey edition string (e.g. "2023", "2023-06").

Value

List with components: svy_type, svy_edition (parsed), svy_periodicity.

Examples

validate_time_pattern(svy_type = "ech", svy_edition = "2023")
validate_time_pattern(
  svy_type = "ech", svy_edition = "2023-06"
)

View graph

Description

View graph

Usage

view_graph(svy, init_step = "Load survey")

Arguments

svy

Survey object

init_step

Initial step label (default: "Load survey")

Value

A visNetwork interactive graph of the survey processing steps.

Examples


dt <- data.table::data.table(
  id = 1:5, age = c(25, 30, 45, 50, 60),
  w = rep(1, 5)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "ech",
  psu = NULL, engine = "data.table",
  weight = add_weight(annual = "w")
)
svy <- step_compute(svy, age2 = age * 2)
view_graph(svy)

Execute estimation workflow for surveys

Description

Executes a sequence of statistical estimations on Survey objects, applying functions from the survey package with appropriate metadata. Different survey types and periodicities are handled automatically.

Usage

workflow(svy, ..., estimation_type = "monthly")

Arguments

svy

A list of Survey objects, or a PoolSurvey. Even for a single survey, wrap it in list(): workflow(svy = list(my_survey), ...). Each survey must have a properly configured sample design.

...

Calls to survey package functions (such as svymean, svytotal, svyratio, etc.) that will be executed sequentially

estimation_type

Type of estimation (default "monthly") that determines which weight to use. Options: "monthly", "quarterly", "annual", or vector with multiple types

Details

The function automatically selects the appropriate sample design according to the specified estimation_type. For each Survey in the input list, it executes all functions specified in ... and combines the results.

Supported estimation types:

"monthly": Monthly estimations
"quarterly": Quarterly estimations
"annual": Annual estimations

For PoolSurvey objects, it uses a specialized methodology that handles pooling of multiple surveys.

Value

data.table with results from all estimations, including columns:

stat: Name of estimated statistic
value: Estimation value
se: Standard error
cv: Coefficient of variation
estimation_type: Type of estimation used
survey_edition: Survey edition
Other columns depending on estimation type
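The relationship between the value, se, and cv columns can be checked by hand. The sketch below uses made-up numbers and computes the coefficient of variation as se divided by the absolute estimate. The normal-approximation interval is an assumption added for illustration; the package may construct intervals differently.

```r
# Hypothetical estimates, shaped like the value/se columns described above
est <- data.frame(
  stat  = c("mean_income", "unemployment_rate"),
  value = c(2500, 0.08),
  se    = c(50, 0.006)
)

# Coefficient of variation: standard error relative to the estimate
est$cv <- est$se / abs(est$value)

# 95% normal-approximation confidence interval (assumed, for illustration)
z <- qnorm(1 - (1 - 0.95) / 2)
est$ci_lower <- est$value - z * est$se
est$ci_upper <- est$value + z * est$se
est
```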

Examples

# Simple estimation with a test survey
dt <- data.table::data.table(
  x = rnorm(100), g = sample(c("a", "b"), 100, TRUE),
  w = rep(1, 100)
)
svy <- Survey$new(
  data = dt, edition = "2023", type = "test",
  psu = NULL, engine = "data.table",
  weight = add_weight(annual = "w")
)
result <- workflow(
  svy = list(svy),
  survey::svymean(~x, na.rm = TRUE),
  estimation_type = "annual"
)


# ECH example with domain estimations
# result <- workflow(
#   svy = list(ech_2023),
#   svyby(~unemployed, ~region, svymean, na.rm = TRUE),
#   estimation_type = "annual")

Construct a RecipeWorkflow from a plain list

Description

Construct a RecipeWorkflow from a plain list

Usage

workflow_from_list(lst)

Arguments

lst

A list (typically from JSON) with workflow fields

Value

A RecipeWorkflow object

Examples

lst <- list(
  name = "example", user = "test", survey_type = "ech",
  edition = "2023", estimation_type = "svymean"
)
wf <- workflow_from_list(lst)

Create publication-quality table from workflow results

Description

Formats a workflow() result as a gt table with confidence intervals, CV quality classification, and provenance-based source notes.

Usage

workflow_table(
  result,
  ci = 0.95,
  digits = 2,
  compare_by = NULL,
  show_cv = TRUE,
  show_se = TRUE,
  title = NULL,
  subtitle = NULL,
  source_note = TRUE,
  locale = "en",
  theme = "publication"
)

Arguments

result

A data.table from workflow().

ci

Confidence level for intervals (default 0.95). Set to NULL to hide confidence intervals.

digits

Number of decimal places (default 2).

compare_by

Column name to pivot for side-by-side comparison (e.g., "survey_edition").

show_cv

Logical; show CV column with quality classification.

show_se

Logical; show SE column.

title

Table title. Auto-generated if NULL.

subtitle

Table subtitle. Auto-generated if NULL.

source_note

Logical; show provenance footer.

locale

Locale for number formatting ("en" or "es").

theme

Table theme: "publication" (clean) or "minimal".

Details

CV quality classification follows INE/CEPAL standards:

Excellent: CV < 5%
Very good: 5-10%
Good: 10-15%
Acceptable: 15-25%
Use with caution: 25-35%
Do not publish: >= 35%
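These bands map naturally onto cut() in base R. The helper classify_cv() below is hypothetical and only illustrates the thresholds. It assumes the CV is expressed in percent and that each band includes its lower bound and excludes its upper bound, which the list above states explicitly only for the first and last bands.

```r
classify_cv <- function(cv_pct) {
  cut(
    cv_pct,
    breaks = c(0, 5, 10, 15, 25, 35, Inf),
    labels = c("Excellent", "Very good", "Good",
               "Acceptable", "Use with caution", "Do not publish"),
    right = FALSE   # intervals are [lower, upper)
  )
}

quality <- classify_cv(c(3.2, 5, 12.5, 30, 35))
as.character(quality)
```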

Value

A gt_tbl object. Export via gt::gtsave() to .html, .docx, .pdf, .png, or .rtf. Falls back to knitr::kable() if gt is not installed.

Examples

svy <- Survey$new(
  data = data.table::data.table(
    x = rnorm(100), g = sample(c("a", "b"), 100, TRUE), w = rep(1, 100)
  ),
  edition = "2023", type = "test", psu = NULL,
  engine = "data.table", weight = add_weight(annual = "w")
)
result <- workflow(
  list(svy), survey::svymean(~x, na.rm = TRUE),
  estimation_type = "annual"
)

if (requireNamespace("gt", quietly = TRUE)) {
  workflow_table(result)
}
