genesysr Tutorial

Matija Obreza & Nora Castaneda

2025-08-19

Querying Genesys PGR

Genesys PGR is the global database on plant genetic resources maintained ex situ in national, regional and international genebanks around the world.

genesysr uses the Genesys API to query Genesys data. The API is accessible at https://api.genesys-pgr.org.

Accessing data with genesysr is similar to downloading data in CSV or Excel format and loading it into R.

For the impatient

Accession passport data is retrieved with the get_accessions function.

Accessing Genesys requires authentication so the first thing to do is to login:

## Setup: use Genesys Sandbox environment
# genesysr::setup_sandbox() # Use this to connect to our test environment https://sandbox.genesys-pgr.org
# genesysr::setup_production() # This is initialized by default when loading genesysr

# Open a browser: login to Genesys and authorize access
genesysr::user_login()

The database is queried by providing a filter (see Filters below) and the list of passport data fields that you wish to download from Genesys. A basic list of MCPD descriptors (“INSTCODE”, “ACCENUMB”, “DOI”, “HISTORIC”, “GENUS”, “SPECIES”, “SUBTAXA”, “SAMPSTAT”) is used if you don’t specify your own list.

# Retrieve accessions for genus *Musa*
musa <- get_accessions(filters = list(taxonomy = list(genus = list('Musa'))))

# Retrieve all accession data for the Musa International Transit Center, Bioversity International
itc <- get_accessions(list(institute = list(code = list('BEL084'))))

# Retrieve all accession data for the Musa International Transit Center, Bioversity International (BEL084) and the International Center for Tropical Agriculture (COL003)
some <- get_accessions(list(institute = list(code = list('BEL084','COL003'))))

genesysr provides utility functions to create filter objects using Multi-Crop Passport Descriptors (MCPD) definitions:

# Retrieve data by country of origin (MCPD)
get_accessions(mcpd_filter(ORIGCTY = list("DEU", "SVN")))

Processing fetched data

Passport data follows MCPD standard and where multiple values are possible, they will be separated by a semicolon ;.

Example: Column “STORAGE” may include 11;12 or a single 11.

Filters

The filter object is a named list() where names match a Genesys filter and the value specifies the criteria to match.

The records returned by Genesys match all filters provided (AND operation), while individual filters allow for specifying multiple criteria (OR operation):

# (GENUS == Musa) AND ((ORIGCTY == NGA) OR (ORIGCTY == CIV))
filter <- list(taxonomy = list(genus = c('Musa'), species = c('aa')), countryOfOrigin = list(iso3 = c('NGA', 'CIV')))

# OR
filter <- list();
filter$taxonomy$genus = list('Musa')
filter$taxonomy$species = list('aa')
filter$countryOfOrigin$iso3 = list('NGA', 'CIV')

# See filter object as JSON
jsonlite::toJSON(filters)

There are a number of filtering options to retrieve data from Genesys. Best explore how filtering works on the actual website https://www.genesys-pgr.org/a/overview by inspecting the HTTP requests sent by your browser to the API server and then replicating them here.

Taxonomy

taxonomy$genus filters by a list of genera.

filters <- list(taxonomy = list(genus = list('Hordeum', 'Musa')))
# Print
jsonlite::toJSON(filters)

taxonomy$species filters by a list of species.

filters <- list(taxonomy = list(genus = list('Hordeum'), species = list('vulgare')))
# Print
jsonlite::toJSON(filters)

Origin of material

countryOfOrigin$iso3 filters by ISO3 code of country of origin of PGR material.

# Material originating from Germany (DEU) and France (FRA)
filters <- list(countryOfOrigin = list(iso3 = list('DEU', 'FRA')))

geo.latitude and geo.longitude filters by latitude/longitude (in decimal format) of the collecting site.

filters <- list(geo = list(latitude = genesysr::range(-10, 30), longitude = genesysr::range(30, 50)))

Holding institute

institute$code filters by a list of FAO WIEWS institute codes of the holding institutes.

# Filter for ITC (BEL084) and CIAT (COL003)
list(institute = list(code = list('BEL084', 'COL003')))

institute$country$iso3 filters by a list of ISO3 country codes of country of the holding institute.

# Filter for genebanks in Slovenia (SVN) and Belgium (BEL)
list(institute = list(country = list(iso3 = list('SVN', 'BEL'))))

Selecting columns

Step-by-step example

Let’s take a look of all the process of fetching accession passport data from Genesys.

  1. Load genesysr
library(genesysr)
  1. Setup using user credentials
setup_sandbox()
user_login()
  1. Fetch basic data
musa <- genesysr::get_accessions(list(taxonomy = list(genus = list('Musa'))))
  1. Download columns of interest
# Fetch only accession number, storage and taxonomic data for *Musa* accessions
musa <- genesysr::get_accessions(list(taxonomy = list(genus = list('Musa'))), fields = list("ACCENUMB", "STORAGE", "GENUS", "SPECIES", "SUBTAXA"))

The following column names are available:

Column Description
INSTCODE FAO WIEWS code of the genebank managing the material
ACCENUMB Accession number
DOI DOI of the accession
HISTORIC Flag indicating if the accession record is historical (true) or active (false)
CURATION Type of curation applied to this accession
GENUS Genus
SPECIES Specific epithet
SPAUTHOR Species authority
SUBTAXA Subtaxon information at the most detailed taxonomic level
SUBTAUTHOR Subtaxon authority
GRIN_TAXON_ID GRIN Taxonomy ID of the taxon
GRIN_NAME Taxon name according to GRIN Taxonomy
GRIN_AUTHOR Taxon authority
CROPNAME Crop name(s) as provided by the genebank
CROPCODE Crop code used by Genesys
SAMPSTAT Biological status of the accession
ACQDATE Acquisition date
ACCENAME Accession name
ORIGCTY Country of provenance of the material
COLLSITE Site of collecting
DECLATITUDE Latitude of the collecting site
DECLONGITUDE Longitude of the collecting site
COORDUNCERT Coordinate uncertainty in meters
COORDDATUM Coordinate datum
GEOREFMETH Georeferencing method
ELEVATION Elevation of the collecting site
COLLDATE Collecting date
COLLSRC Collecting source
COLLNUMB Collecting number
COLLCODE FAO WIEWS code of the institute that originally collected the material
COLLNAME Name of the institute that collected the material
COLLINSTADDRESS Address of the institute that collected the material
COLLMISSID Collecting mission name/identifier
DONORCODE FAO WIEWS code of the institute from which this accession was acquired
DONORNAME Name of the institute from which this accession was acquired
DONORNUMB Accession number at the donor institute
OTHERNUMB Other numbers/identifiers associated with this accession
BREDCODE FAO WIES code of the institute that developed/bred this material
BREDNAME Name of the institute that developed this material
ANCEST Ancestral data or pedigree information
DUPLSITE FAO WIEWS codes of institutes where this accession is safety duplicated by the genebank
STORAGE Types of germplasm storage
MLSSTAT Status of the accession in the Multilateral System of the ITPGRFA
ACCEURL Accession URL
REMARKS Notes and remarks
DATAPROVIDERID Database ID of this record in genebank’s own database
PDCI Passport Data Completeness Index for this accession
UUID UUID assigned to this record by Genesys
LASTMODIFIED Date when this record was last updated in Genesys

Downloading all non-historical records

Please use sparingly!

accessions <- get_accessions(filters = c(historic = list('false')))

mirror server hosted at Truenetwork, Russian Federation.