Type: Package
Title: Simple Transcriptome Meta-Analysis for Identifying Stress-Responsive Genes
Version: 0.1.2
Author: Yusuke Fukuda [aut, cre], Atsushi Fukushima [aut]
Maintainer: Yusuke Fukuda <s823631038@kpu.ac.jp>
Description: Stress Response score (SRscore) is a stress responsiveness measure for transcriptome datasets and is based on the vote-counting method. The SRscore evaluates and scores genes on the basis of the consistency of the direction of their regulation (Up-regulation, Down-regulation, or No change) under stress conditions across multiple analyzed research projects. This package is based on the HN-score (score based on the ratio of gene expression between hypoxic and normoxic conditions) proposed by Tamura and Bono (2022) <doi:10.3390/life12071079>, and implements both the original method and an extended calculation method described in Fukuda et al. (2025) <doi:10.1093/plphys/kiaf105>.
License: MIT + file LICENSE
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.3
Imports: dplyr, tidyr, utils, rlang
Suggests: knitr, rmarkdown, testthat (≥ 3.0.0), ggplot2, tibble, ComplexHeatmap, clusterProfiler, org.At.tair.db, BiocStyle, RColorBrewer, genefilter, DT
Config/testthat/edition: 3
VignetteBuilder: knitr
Depends: R (≥ 3.5.0)
NeedsCompilation: no
Packaged: 2026-01-07 09:36:19 UTC; kpufukuda
Repository: CRAN
Date/Publication: 2026-01-08 18:50:19 UTC

Reproducing HN-scores from HN-ratios Using the SRscore Package

Description

The HN-score is a scoring metric derived from the HN-ratio, which represents the gene expression ratio between hypoxic and normoxic conditions, and was originally proposed by Tamura and Bono (2022) doi:10.3390/life12071079. It is publicly available on figshare doi:10.6084/m9.figshare.20055086. HNscore is provided as a data frame containing HN-scores calculated from logHNratioHypoxia and is implemented as test data in the SRscore package. To reduce data size, HNscore includes HN-scores for a subset of 1,000 genes extracted from the original dataset.

Usage

HNscore

Format

A data frame with 1000 rows and 11 variables:

Transcript_id_At

Transcript ID in Arabidopsis thaliana

Upregulated

Total number of times HNratio exceeds 2

Downregulated

Total number of times HNratio is below 0.5

Unchanged

Total number of times HNratio is between 0.5 and 2

All

Maximum possible HNscore

HN.score

HN-score

Gene_name_At

Gene name in Arabidopsis thaliana

Gene_description_At

Gene description in Arabidopsis thaliana

Protein_id_Hs

Protein ID in Homo sapiens

Gene_name_Hs

Gene name in Homo sapiens

Gene_description_Hs

Gene description in Homo sapiens
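
The count columns above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the matrix hn_ratio and its values are made up, and it assumes that the HN-score is the number of up-regulated calls minus the number of down-regulated calls, with strict inequalities at the cut-offs. It also shows that the cut-offs 2 and 0.5 on the ratio scale give the same calls as +1 and -1 on the log2 scale.

```r
# Hypothetical HN-ratios (hypoxia / normoxia) for 3 genes in 4 experiments
hn_ratio <- matrix(
  c(4.0, 2.5, 1.0, 3.0,
    0.2, 0.4, 1.1, 0.3,
    1.0, 0.9, 1.2, 2.1),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("exp", 1:4))
)

# Vote counting on the ratio scale
up <- rowSums(hn_ratio > 2)
dn <- rowSums(hn_ratio < 0.5)

# The same calls on the log2 scale, where the cut-offs become +1 and -1
log_hn <- log2(hn_ratio)
up_log <- rowSums(log_hn > 1)
dn_log <- rowSums(log_hn < -1)

counts <- data.frame(
  Upregulated   = up,
  Downregulated = dn,
  Unchanged     = ncol(hn_ratio) - up - dn,
  All           = ncol(hn_ratio),
  HN.score      = up - dn   # assumed definition: up votes minus down votes
)
counts
```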

Source

Tamura, Keita, and Hidemasa Bono. 2022. “Meta-Analysis of RNA Sequencing Data of Arabidopsis and Rice Under Hypoxia.” Life 12 (7).


Metadata of abscisic acid stress microarray dataset in Arabidopsis thaliana

Description

MetadataABA is the metadata for the experimental dataset of Arabidopsis thaliana under ABA stress conditions. The metadata define the pairs to be compared between the control sample group and the treated sample group.

Usage

MetadataABA

Format

A data frame with 19 rows and 5 variables:

Series

Research project ID

control_sample

control sample ID

treated_sample

treatment sample ID

treatment

treatment condition

tissue

tissue name


Metadata of Hypoxia stress RNA-Seq dataset in Arabidopsis thaliana

Description

MetadataHypoxia is the metadata for the RNA-Seq data used in the study by Tamura and Bono (2022) doi:10.3390/life12071079. It is publicly available on figshare doi:10.6084/m9.figshare.20055086.

Usage

MetadataHypoxia

Format

A data frame with 29 rows and 4 variables:

study_accession

Research project ID

run_accession..hypoxia.

RNA-Seq run accession ID of the hypoxia sample

run_accession..normoxia.

RNA-Seq run accession ID of the normoxia sample

Treatment

treatment condition


Test data for the SRscore package

Description

SRGA is a reference test dataset that integrates standardized SRscores across 11 stress conditions as reported in Fukuda et al. (2025) doi:10.1093/plphys/kiaf105. Because SRscore scales differ by stress type, SRscores were standardized using z-scores. This dataset is provided solely for demonstrating and testing template matching (Pavlidis and Noble, 2001) doi:10.1186/gb-2001-2-10-research0042 workflows implemented in the SRscore package and is not intended to introduce a new analysis method. To reduce file size, the dataset includes SRscores for a subset of 1,000 genes.
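
The two ideas in this description, per-condition z-score standardization and template matching, can be sketched in base R as follows. This is only an illustration under stated assumptions: the scores table and the template vector are made up, and template matching is approximated here by the Pearson correlation between each gene's standardized profile and the template. It does not reproduce the package's own workflow.

```r
# Hypothetical raw scores for 4 genes across 3 stress conditions
scores <- data.frame(
  ABA  = c(10, 0, -5, 2),
  Cold = c(3, 1, -2, 0),
  Heat = c(-1, 8, 0, 1),
  row.names = paste0("gene", 1:4)
)

# Standardize each condition (column) to z-scores so that scales are comparable
z <- scale(as.matrix(scores))

# Hypothetical template: responsive to ABA and cold, not to heat
template <- c(ABA = 1, Cold = 1, Heat = 0)

# Correlate every gene's standardized profile with the template
match_r <- apply(z, 1, function(profile) cor(profile, template))

# Genes ranked from best to worst match
sort(match_r, decreasing = TRUE)
```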

Usage

SRGA

Format

A data frame with 1000 rows and 13 variables:

ensembl_gene_id

Ensembl gene ID

ABA

SRscore derived from ABA dataset

Cold

SRscore derived from cold dataset

DC3000

SRscore derived from DC3000 dataset

Drought

SRscore derived from drought dataset

Heat

SRscore derived from heat dataset

High-light

SRscore derived from high-light dataset

Hypoxia

SRscore derived from hypoxia dataset

Osmotic

SRscore derived from osmotic dataset

Oxidation

SRscore derived from oxidation dataset

Salt

SRscore derived from salt dataset

Wound

SRscore derived from wound dataset

SYMBOL

Gene symbol


Test data for calculating SRratio using sample data

Description

A data frame containing SRratio (Stress Response ratio) calculated from TranscriptomeABA.

Usage

SRratio_test

Format

An object of class data.frame with 1000 rows and 20 columns.

Details

Column components :

Ensembl gene id + 19 treatment sample id


Test data for calculating SRscore using sample data

Description

A data frame containing SRscore (Stress Response score) calculated from SRratio_test.

Usage

SRscore_test

Format

A data frame with 1000 rows and 6 variables:

ensembl_gene_id

Ensembl gene ID

up

Total number of times SRratio exceeds 2

dn

Total number of times SRratio is below -2

unchange

Total number of times SRratio is between -2 and 2

all

Maximum possible SRscore

score

SRscore with absolute value 2 as threshold


Hypoxia stress RNA-Seq data in Arabidopsis thaliana

Description

TPMHypoxia is the RNA-Seq data used in the study by Tamura and Bono (2022) doi:10.3390/life12071079. The quantitative RNA-Seq data, which were calculated as transcripts per million (TPM), are available at figshare doi:10.6084/m9.figshare.20055086.

Usage

TPMHypoxia

Format

An object of class data.frame with 1000 rows and 59 columns.

Details

Column components :

Ensembl gene id + 58 sample id (control : 29, treatment : 29)


Abscisic acid stress microarray dataset in Arabidopsis thaliana

Description

This is a gene expression matrix for Arabidopsis under ABA stress conditions. The first column is the gene ID column, all others are sample ID columns. The expression data are read as raw data (CEL files) and summarized and normalized by Robust Multi-array Average (RMA). To keep the file size small, the data is limited to 1,000 genes.

Usage

TranscriptomeABA

Format

An object of class data.frame with 1000 rows and 39 columns.

Details

Column components :

Ensembl gene id + 38 sample id (control : 19, treatment : 19)


Calculate the Stress Response ratio (SRratio)

Description

This function computes the Stress Response ratio (SRratio) for paired variables in a dataset. It supports both log2-transformed and non-log2-transformed data and calculates the mean SRratio for grouped variables.

Usage

calcSRratio(.data, var1, var2, pair, is.log2 = NA)

Arguments

.data

A data frame containing expression values for a series of arrays, with rows corresponding to genes and columns to samples.

var1

A character vector containing column names of control samples.

var2

A character vector containing column names of treatment samples.

pair

A data frame with control samples and treatment samples.

is.log2

A logical value (TRUE, FALSE) or NA indicating whether the data in .data is log2-transformed:

  • If TRUE, the SR ratio is calculated as the difference between the target and reference variables.

  • If FALSE, the SR ratio is calculated as the log2-transformed ratio: log2((target + 1) / (reference + 1)).

  • If NA (default), the user will be prompted interactively to confirm whether the data is log2-transformed.
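
The two calculation modes can be compared with a minimal base-R sketch. The vectors control and treated are made-up values, and reading "target" as the treated sample and "reference" as the control sample is an assumption of this sketch, not something the text above states explicitly.

```r
# Hypothetical expression values for 2 genes: one control and one treated sample
control <- c(geneA = 3, geneB = 7)
treated <- c(geneA = 15, geneB = 3)

# Raw (not log2-transformed) input, as with is.log2 = FALSE.
# The pseudocount of 1 avoids division by zero and log2(0).
ratio_raw <- log2((treated + 1) / (control + 1))

# Input that is already log2-transformed, as with is.log2 = TRUE.
# The log2 ratio then reduces to a simple difference.
log_control <- log2(control + 1)
log_treated <- log2(treated + 1)
ratio_log <- log_treated - log_control

# Both routes give the same log2 ratio
all.equal(ratio_raw, ratio_log)
```

Passing log2-transformed data with is.log2 = FALSE (or the reverse) would apply the wrong formula, which is presumably why the default asks for confirmation instead of guessing.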

Value

A data frame containing:

Examples

var1 <- "control_sample"
var2 <- "treated_sample"
grp <- "Series"

ebg <- expand_by_group(MetadataABA, grp, var1, var2)

SRratio <- calcSRratio(TranscriptomeABA, var1, var2, ebg, is.log2 = TRUE)


Create a Data Frame of SRscore

Description

SRscore is a score assigned to each gene based on its expression profiles across different research projects. SRratio is required to calculate SRscore.

Usage

calcSRscore(srratio, threshold = c(-1, 1))

Arguments

srratio

A data frame of SRratio.

threshold

A vector of length 2 (x, y) indicating threshold values. c(-1, 1) is default.

Value

A data frame containing results.
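
A rough sketch of vote counting with a two-sided threshold is given below. The helper vote_count() is hypothetical and is not part of the package; the output column names follow SRscore_test, and the sketch assumes strict inequalities and a score defined as up votes minus down votes.

```r
# Hypothetical log2 SRratio table: one ID column plus one column per comparison
srratio <- data.frame(
  ensembl_gene_id = c("g1", "g2", "g3"),
  s1 = c(1.5, -2.0, 0.2),
  s2 = c(2.0, -1.2, -0.3),
  s3 = c(0.4, -0.1, 1.1),
  stringsAsFactors = FALSE
)

# Illustrative helper (an assumption, not the package's code)
vote_count <- function(x, threshold = c(-1, 1)) {
  m  <- as.matrix(x[, -1, drop = FALSE])
  up <- rowSums(m > threshold[2])   # votes above the upper threshold
  dn <- rowSums(m < threshold[1])   # votes below the lower threshold
  data.frame(
    ensembl_gene_id = x[[1]],
    up = up,
    dn = dn,
    unchange = ncol(m) - up - dn,
    all = ncol(m),
    score = up - dn,
    stringsAsFactors = FALSE
  )
}

res <- vote_count(srratio, threshold = c(-1, 1))
res
```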

Examples

grp <- "Series"
var1 <- "control_sample"
var2 <- "treated_sample"

ebg <- expand_by_group(MetadataABA,
                       grp,
                       var1,
                       var2)

SRratio <- calcSRratio(TranscriptomeABA,
                       var1,
                       var2,
                       ebg,
                       is.log2 = TRUE)

head(calcSRscore(SRratio, threshold = c(-1, 1)))


Aggregate the results of the three functions into a single list

Description

The SRscore calculation is divided into three major steps, and a function is provided for each step (see the documentation of each function for details). directly_calcSRscore() aggregates the results of the three functions into a single list.

Usage

directly_calcSRscore(
  .data1,
  grp,
  var1,
  var2,
  .data2,
  is.log2 = NA,
  threshold = c(-1, 1)
)

Arguments

.data1

A data frame containing the two variables you want to compare, as well as the variables of the group to which they belong.

grp

Column name of groups.

var1

Column name of first variable.

var2

Column name of second variable.

.data2

A data frame containing expression values for a series of arrays, with rows corresponding to genes and columns to samples.

is.log2

A logical specifying whether .data2 is log2-transformed.

threshold

A vector of length 2 (x, y) indicating threshold values. c(-1, 1) is default.

Value

A list containing the results of the three functions.

See Also

expand_by_group()

calcSRratio()

calcSRscore()

Examples

grp <- "Series"
var1 <- "control_sample"
var2 <- "treated_sample"

ls <- directly_calcSRscore(MetadataABA,
                           grp,
                           var1,
                           var2,
                           TranscriptomeABA,
                           is.log2 = TRUE,
                           threshold = c(-1, 1))
lapply(ls, head)


Create a data frame from all combinations between two specified variables within each group

Description

expand_by_group() generates all combinations (Cartesian product) of two specified variables within each group in your dataframe.
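
The within-group Cartesian product can be sketched in base R with split() and expand.grid(). The meta table below is made up, and taking the unique IDs per group is an assumption of this sketch; the package function may handle duplicates and column order differently.

```r
# Hypothetical metadata: project P1 has two sample pairs, P2 has one
meta <- data.frame(
  Series         = c("P1", "P1", "P2"),
  control_sample = c("c1", "c2", "c3"),
  treated_sample = c("t1", "t2", "t3"),
  stringsAsFactors = FALSE
)

# For each group, combine every control ID with every treated ID
pairs_by_group <- lapply(split(meta, meta$Series), function(d) {
  data.frame(
    Series = d$Series[1],
    expand.grid(
      control_sample = unique(d$control_sample),
      treated_sample = unique(d$treated_sample),
      stringsAsFactors = FALSE
    ),
    stringsAsFactors = FALSE
  )
})

pairs <- do.call(rbind, pairs_by_group)
rownames(pairs) <- NULL
pairs
```

Here P1 contributes 2 x 2 = 4 combinations and P2 contributes 1, so combinations are never formed across research projects.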

Usage

expand_by_group(.data, grp, var1, var2)

Arguments

.data

A data frame.

grp

A column name indicating the group.

var1

A column name indicating the control.

var2

A column name indicating the treatment.

Value

Returns a data frame containing all combinations of the specified variables for each group. The structure of the returned data frame includes:

Examples

grp <- "Series"
var1 <- "control_sample"
var2 <- "treated_sample"

ebg <- expand_by_group(MetadataABA,
                       grp,
                       var1,
                       var2)

unique_series <- unique(MetadataABA$Series)

lapply(unique_series,
       function(x) subset(ebg, Series == x))



Find the expression ratio for each experimental sample for the specified gene

Description

This function retrieves SRratio (Stress Response ratio) values for one or more specified genes across all experimental samples and combines them with the corresponding sample metadata. In addition, the corresponding SRscore (Stress Response score) values for the specified genes are returned. The output is intended for downstream inspection and visualization of gene-level expression patterns across experimental conditions.

Usage

find_diffexp(genes, srratio, srscore, metadata)

Arguments

genes

A character vector of gene IDs

srratio

A data frame of SRratio

srscore

A data frame of SRscore

metadata

A data frame of metadata

Value

A data frame of metadata with the SRratio values for the specified gene IDs appended at the end
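
The core lookup-and-join idea can be sketched in base R as below. This is not the package's implementation: the srratio and metadata tables are made up, only a single gene is handled, and the sketch assumes that the SRratio columns are named by treated sample ID.

```r
# Hypothetical SRratio table: gene ID column plus one column per treated sample
srratio <- data.frame(
  ensembl_gene_id = c("g1", "g2"),
  t1 = c(1.8, -0.2),
  t2 = c(2.4, 0.1),
  stringsAsFactors = FALSE
)

# Hypothetical sample metadata keyed by treated sample ID
metadata <- data.frame(
  Series         = c("P1", "P2"),
  treated_sample = c("t1", "t2"),
  treatment      = c("ABA 1h", "ABA 3h"),
  stringsAsFactors = FALSE
)

gene <- "g1"
gene_row <- srratio[srratio$ensembl_gene_id == gene, -1, drop = FALSE]

# Reshape the single-gene row into long form: one row per treated sample
long <- data.frame(
  treated_sample = names(gene_row),
  SRratio        = unlist(gene_row, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Attach the sample metadata by treated sample ID
result <- merge(metadata, long, by = "treated_sample")
result
```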

Examples

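vr1 <- "control_sample"
vr2 <- "treated_sample"
grp <- "Series"

# Argument order follows the Usage: expand_by_group(.data, grp, var1, var2)
ebg <- expand_by_group(MetadataABA,
                       grp,
                       vr1,
                       vr2)

# TranscriptomeABA is RMA-normalized, so it is treated as log2-transformed
SRratio <- calcSRratio(TranscriptomeABA,
                       vr1,
                       vr2,
                       ebg,
                       is.log2 = TRUE)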

SRscore <- calcSRscore(SRratio)

set.seed(1)
find_diffexp(sample(SRratio$ensembl_gene_id, 1), SRratio, SRscore, MetadataABA)


Reproducing HN-scores from HN-ratios Using the SRscore Package

Description

The HN-ratio, which quantifies gene expression changes between hypoxic and normoxic conditions across multiple experiments, was originally proposed by Tamura and Bono (2022) doi:10.3390/life12071079. It is publicly available on figshare doi:10.6084/m9.figshare.20055086. In the SRscore package, the HN-ratio is introduced solely as an intermediate quantity required to compute HN-scores. logHNratioHypoxia is a data frame containing log2-transformed HN-ratios. To reduce data size, logHNratioHypoxia includes HN-ratios for a subset of 1,000 genes extracted from the original dataset.

Usage

logHNratioHypoxia

Format

An object of class data.frame with 1000 rows and 30 columns.

Details

Column components :

Ensembl gene id + 29 treatment sample id

Source

Tamura, Keita, and Hidemasa Bono. 2022. “Meta-Analysis of RNA Sequencing Data of Arabidopsis and Rice Under Hypoxia.” Life 12 (7).


Plot the Distribution of SRscore Values

Description

This function visualizes the distribution of SRscore values using a barplot. Values equal to 0 are excluded from the plot by design because they typically represent genes without detectable stress response activity.

Usage

plot_SRscore_distr(srscore, log = FALSE)

Arguments

srscore

A data.frame containing at least one column named score, which represents the SRscore values to be plotted.

log

Logical (default: FALSE). If TRUE, the y-axis of the barplot is shown in logarithmic scale (log = "y").

Details

The function provides both a linear-scale plot and a log-scale version, which is particularly useful when the frequency of SRscore values spans a wide range.

The function performs the following steps:

Value

This function returns NULL invisibly and produces a barplot as a side effect.
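
The behaviour described above can be approximated in base R as follows. This is a sketch of the idea (drop zeros, tabulate, draw a barplot), not the function's actual source; the axis labels are assumptions, and pdf(NULL) is used only so that the sketch also runs without an interactive graphics device.

```r
df <- data.frame(score = c(-5, -3, -3, 1, 2, 2, 2, 4, 5, 5, 0))

# Drop zeros, then count how often each remaining score occurs
nonzero <- df$score[df$score != 0]
freq <- table(nonzero)

pdf(NULL)  # null device: nothing is written to disk

# Linear-scale frequencies
barplot(freq, xlab = "SRscore", ylab = "Number of genes")

# Log-scale y-axis, useful when frequencies span a wide range
barplot(freq, log = "y", xlab = "SRscore", ylab = "Number of genes")

invisible(dev.off())
freq
```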

Examples

# Example SRscore data
df <- data.frame(score = c(-5, -3, -3, 1, 2, 2, 2, 4, 5, 5, 0))

# Linear-scale plot
plot_SRscore_distr(df)

# Log-scale plot
plot_SRscore_distr(df, log = TRUE)


Plot Ranked SRscore Values with Threshold-Based Highlighting

Description

This function visualizes SRscore values sorted in descending order and colors each point based on user-defined thresholds. Genes with SRscore above the upper threshold are colored red (up-regulated), those below the lower threshold are colored blue (down-regulated), and values within the range are shown in black.

Usage

plot_SRscore_rank(srscore, threshold = c(1, -1))

Arguments

srscore

A data.frame containing at least a column named score, representing SRscore values for genes.

threshold

A numeric vector of length 2 specifying c(upper_threshold, lower_threshold). Default is c(1, -1).

Details

The function performs the following:

Value

Invisibly returns the sorted SRscore vector. The function produces a scatter plot as a side effect.
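
The ranking and colouring logic can be sketched in base R as below. It is an illustration only: strict inequalities at the thresholds, the point style, and the dashed threshold lines are assumptions of this sketch, and pdf(NULL) is used so that it runs without an interactive graphics device.

```r
df <- data.frame(
  gene  = paste0("Gene", 1:10),
  score = c(-5, -3, -1, 0, 0.5, 1.2, 2, 3, 4, 5),
  stringsAsFactors = FALSE
)
threshold <- c(1, -1)  # c(upper_threshold, lower_threshold)

# Sort in descending order, then colour each point by its threshold class
sorted <- sort(df$score, decreasing = TRUE)
point_col <- ifelse(sorted > threshold[1], "red",
             ifelse(sorted < threshold[2], "blue", "black"))

pdf(NULL)  # null device: nothing is written to disk
plot(seq_along(sorted), sorted, col = point_col, pch = 16,
     xlab = "Rank", ylab = "SRscore")
abline(h = threshold, lty = 2)  # mark both thresholds
invisible(dev.off())

table(point_col)
```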

Examples

df <- data.frame(
  gene = paste0("Gene", 1:10),
  score = c(-5, -3, -1, 0, 0.5, 1.2, 2, 3, 4, 5)
)

# Basic usage
plot_SRscore_rank(df)

# Custom thresholds
plot_SRscore_rank(df, threshold = c(2, -2))


Barplot of Top Genes Ranked by Absolute SRscore

Description

This function selects the top top_n genes with the largest absolute SRscore values and visualizes their SRscores using a barplot. The function is useful for quickly identifying genes with the strongest positive or negative stress responses.

Usage

plot_SRscore_top(srscore, top_n = 20)

Arguments

srscore

A data.frame containing at least a column named score, representing the SRscore values for genes.

top_n

Integer (default: 20). The number of top genes to plot, ranked by |SRscore|.

Details

The function performs the following steps:

The barplot displays:

Graphical parameters are temporarily modified, and restored automatically using on.exit() to avoid affecting the user's plotting environment.

Value

Invisibly returns the data.frame of selected top genes (after sorting). A barplot is produced as a side effect.
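
Selecting genes by absolute score and restoring graphical parameters with on.exit() can be sketched as follows. The helper draw_top() is hypothetical, the margin values and the horizontal layout are assumptions, and the ordering (largest absolute score first) is one plausible reading of "after sorting".

```r
df <- data.frame(
  gene  = paste0("Gene", 1:10),
  score = c(-12, -6, -3, 1, 2, 3, 5, 8, 10, 11),
  stringsAsFactors = FALSE
)
top_n <- 5

# Order by absolute score (largest first) and keep the first top_n rows
ord <- order(abs(df$score), decreasing = TRUE)
top <- df[ord[seq_len(min(top_n, nrow(df)))], ]

# Hypothetical plotting helper that leaves the user's par() settings untouched
draw_top <- function(top) {
  old_par <- par(mar = c(5, 6, 2, 1))  # wider left margin for gene labels
  on.exit(par(old_par), add = TRUE)    # restore settings when the function exits
  barplot(top$score, names.arg = top$gene, horiz = TRUE, las = 1,
          xlab = "SRscore")
  invisible(top)
}

pdf(NULL)  # null device: nothing is written to disk
draw_top(top)
invisible(dev.off())

top
```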

Examples

# Example data.frame of SRscore
df <- data.frame(
  gene = paste0("Gene", 1:10),
  score = c(-12, -6, -3, 1, 2, 3, 5, 8, 10, 11)
)

# Plot top 5 genes by |SRscore|
plot_SRscore_top(df, top_n = 5)


Test data to create data frames from all combinations between two specified variables within each group using sample data

Description

Test data to create data frames from all combinations between two specified variables within each group using sample data

Usage

sample_pair_test

Format

A data frame with 71 rows and 2 variables:

control_sample

Control Sample ID

treated_sample

Treated Sample ID
