The purpose of this vignette is to describe the structure and methods
of objects of class ir. ir objects are used by
the ‘ir’ package to store spectra and their metadata. This vignette
could be helpful if you want to better understand how the ‘ir’ package
works, how to handle metadata, how to manipulate ir
objects, or if you want to construct a subclass based on the
ir class.
This vignette does not give an overview of how to use the ‘ir’
package, of the functions for spectral preprocessing, or of how to plot
ir objects. For this, see the vignette Introduction to the ‘ir’ package.
This vignette has three parts:
1. The ir class
2. Subsetting and modifying ir objects
3. Special functions to manipulate ir objects
In part The ir class, I will
describe the structure of ir objects and list the methods
available for them.
In part Subsetting and
modifying ir objects, I will show how ir
objects can be subsetted and modified (including with tidyverse
functions).
In part Special
functions to manipulate ir objects, I will present some
more specialized functions to manipulate the data in ir
objects (including the spectra).
To follow this vignette, you have to install the ‘ir’ package as described in the Readme file and you have to load it:
library(ir)
The ir class
Objects of class ir are in principle data frames (or
tibbles):
ir_sample_data
Each row represents one measurement of a spectrum. An
ir object must have a column spectra, which is a
list of data frames, each element representing a spectrum.
Besides this, ir objects may have additional columns
with metadata. This is useful to analyze spectra of samples in an
integrated way with other data, for example nitrogen content (see part
Subsetting and modifying
ir objects).
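The idea of keeping spectra and metadata in the same rows can be illustrated with plain base R. The following is only a sketch with made-up values (the object toy, its columns, and the nitrogen contents are hypothetical and not taken from the ‘ir’ package). It shows a data frame with a list column and how filtering on metadata keeps the matching spectra:

```r
# Hypothetical toy data (not from the 'ir' package): two samples, each with
# a short spectrum stored as a data frame with columns x and y.
toy <- data.frame(
  id_sample = c("a", "b"),
  nitrogen = c(1.2, 0.8)  # made-up nitrogen contents
)

# A list column: one data frame (spectrum) per row of toy.
toy$spectra <- list(
  data.frame(x = 4000:3998, y = c(0.10, 0.12, 0.15)),
  data.frame(x = 4000:3998, y = c(0.20, 0.21, 0.19))
)

# Filtering on metadata keeps the matching spectrum in the same row.
high_n <- toy[toy$nitrogen > 1, ]
high_n$spectra[[1]]
```

Because the spectrum travels with its row, there is no need to keep a separate lookup between metadata and spectra.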
The spectra column is a list of data frames, each
element representing a spectrum. Each data frame has one row for each
intensity value measured for a spectral channel (“x axis value”,
e.g. wavenumber), a column x storing the wavenumber
values, and a column y storing the respective intensity
values. No additional columns are allowed:
head(ir_sample_data$spectra[[1]])
#> # A tibble: 6 × 2
#> x y
#> <int> <dbl>
#> 1 4000 0.000361
#> 2 3999 0.000431
#> 3 3998 0.000501
#> 4 3997 0.000571
#> 5 3996 0.000667
#> 6 3995 0.000704
If there is no spectrum available for a sample, an empty data frame is used as a placeholder:
d <- ir_sample_data
d$spectra[[1]] <- d$spectra[[1]][0, ]
d$spectra[[1]]
#> # A tibble: 0 × 2
#> # … with 2 variables: x <int>, y <dbl>
ir_normalize(d, method = "area")
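The rules for elements of the spectra column (exactly the columns x and y, no duplicated “x axis values”, empty data frames allowed as placeholders) can be written down as a small check in base R. This is a sketch under those stated rules only. The helper is_valid_spectrum is hypothetical and is not how the ‘ir’ package validates its objects:

```r
# Hypothetical helper (not part of the 'ir' package): checks the rules
# described above for one element of the spectra column.
is_valid_spectrum <- function(s) {
  is.data.frame(s) &&
    ncol(s) == 2 &&                       # no additional columns
    all(c("x", "y") %in% colnames(s)) &&  # columns must be named x and y
    !anyDuplicated(s$x)                   # no duplicated "x axis values"
}

ok <- data.frame(x = 4000:3995, y = c(0.36, 0.43, 0.50, 0.57, 0.67, 0.70))
candidates <- list(
  ok = ok,
  empty = ok[0, ],                         # placeholder for a missing spectrum
  duplicated_x = rbind(ok, ok),
  extra_column = cbind(ok, z = 1),
  wrong_names = setNames(ok, c("a", "b"))
)
checks <- vapply(candidates, is_valid_spectrum, logical(1))
checks
```

Only the first two candidates pass. The other three correspond to the kinds of unsupported elements that make an object lose its ir class.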
Currently, the following methods are available for ir
objects:
methods(class = "ir")
#> [1] $ $<- Ops [ [<- [[ [[<- cbind
#> [9] filter ir_as_ir max median min plot range rbind
#> [17] rep
#> see '?methods' for accessing help and source code
Subsetting and modifying ir objects
Since ir objects are data frames, subsetting and
modifying works the same way as for data frames. For example, specific
rows (= measurements) can be selected:
ir_sample_data[5:10, ]
The advantage of storing spectra in a list column is that spectral data, metadata, and other data can be filtered simultaneously.
One exception is that the spectra column must not be removed
while subsetting. If it is removed, the ir class
attribute is dropped:
d1 <- ir_sample_data
# Dropping a metadata column keeps the ir class
class(d1[, setdiff(colnames(d1), "id_sample")])
#> [1] "ir" "tbl_df" "tbl" "data.frame"
# Dropping the spectra column removes the ir class
d1$spectra <- NULL
class(d1)
#> [1] "tbl_df" "tbl" "data.frame"
Another exception is that when the spectra column
contains unsupported elements (e.g. wrong column names, additional
columns, duplicated “x axis values”), the object also loses its
ir class:
d2 <- ir_sample_data
d2$spectra[[1]] <- rep(d2$spectra[[1]], 2)
class(d2)
#> [1] "tbl_df" "tbl" "data.frame"
d3 <- ir_sample_data
colnames(d3$spectra[[1]]) <- c("a", "b")
class(d3)
#> [1] "tbl_df" "tbl" "data.frame"
Tidyverse methods for manipulating ir objects are also supported. For
example, we can use mutate to add new variables and we can
use pipes (%>%) to make writing and reading code
easier:
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
d <- ir_sample_data
d <-
  d %>%
  mutate(a = rnorm(n = nrow(.)))
head(d)
As another example, we can summarize spectra for defined
groups (here, the maximum intensity value for each “x axis value” within
each unique sample_type value):
library(purrr)
library(ggplot2)
d2 <-
  d %>%
  group_by(sample_type) %>%
  summarize(
    spectra = {
      res <- map_dfc(spectra, function(.x) .x[, 2, drop = TRUE])
      spectra[[1]] %>%
        dplyr::mutate(
          y =
            res %>%
            rowwise() %>%
            mutate(y = max(c_across(everything()))) %>%
            pull(y)
        ) %>%
        list()
    },
    .groups = "drop"
  )
plot(d2) +
  facet_wrap(~ sample_type)
Special functions to manipulate ir objects
There are some more special functions to manipulate ir
objects which are not described in the vignette Introduction to the ‘ir’ package. These
will be described here.
Sometimes, it is useful to replicate one or multiple measurements.
This can be done with the rep() method for ir
objects. For example, we can replicate the second spectrum in
ir_sample_data:
ir_sample_data %>%
  slice(2) %>%
  rep(20)
The ‘ir’ package supports arithmetic operations with
spectra, i.e. addition, subtraction, multiplication, and division of
intensity values with the same “x axis values”. For example, we can
subtract the third spectrum in ir_sample_data from the
second:
ir_sample_data %>%
  slice(2) %>%
  ir_subtract(y = ir_sample_data[3, ]) %>%
  dplyr::mutate(id_sample = "subtraction_result") %>%
  rbind(ir_sample_data[2:3, ]) %>%
  plot() + facet_wrap(~ id_sample)
Note that all metadata of the first argument (x) will be
retained, but not those of the second (y). This is why we had to
manually change id_sample before rbinding the
other spectra above. Note also that while x can contain multiple
spectra, y must contain either exactly one spectrum or the
same number of spectra as x, in which case spectra of
matching rows are subtracted (added, multiplied, divided):
# This will not work: x has one row, y has two
ir_sample_data %>%
  slice(6) %>%
  ir_add(y = ir_sample_data[3:4, ])
#> Error in `ir_add()`:
#> ! `y` must either have only one row or as many rows as `x`.
# But this will: y has only one row
ir_sample_data %>%
  slice(2:6) %>%
  ir_add(y = ir_sample_data[3, ])
#> # A tibble: 5 × 7
#> id_measurement id_sample sample_type sample_comment klason_lignin
#> * <int> <chr> <chr> <chr> <units>
#> 1 2 GN 11-400 needles Cupressocyparis leylandii … 0.339405
#> 2 3 GN 11-407 needles Juniperus chinensis Chines… 0.267552
#> 3 4 GN 11-411 needles Metasequoia glyptostroboid… 0.350016
#> 4 5 GN 11-416 needles Pinus strobus Torulosa 0.331100
#> 5 6 GN 11-419 needles Pseudolarix amabili Golden… 0.279360
#> # … with 2 more variables: holocellulose <units>, spectra <list>
Note that arithmetic operations are also available as infix operators, i.e. it is possible to compute:
ir_sample_data[2, ] + ir_sample_data[3, ]
#> # A tibble: 1 × 7
#> id_measurement id_sample sample_type sample_comment klason_lignin
#> * <int> <chr> <chr> <chr> <units>
#> 1 2 GN 11-400 needles Cupressocyparis leylandii … 0.339405
#> # … with 2 more variables: holocellulose <units>, spectra <list>
ir_sample_data[2, ] - ir_sample_data[3, ]
#> # A tibble: 1 × 7
#> id_measurement id_sample sample_type sample_comment klason_lignin
#> * <int> <chr> <chr> <chr> <units>
#> 1 2 GN 11-400 needles Cupressocyparis leylandii … 0.339405
#> # … with 2 more variables: holocellulose <units>, spectra <list>
ir_sample_data[2, ] * ir_sample_data[3, ]
#> # A tibble: 1 × 7
#> id_measurement id_sample sample_type sample_comment klason_lignin
#> * <int> <chr> <chr> <chr> <units>
#> 1 2 GN 11-400 needles Cupressocyparis leylandii … 0.339405
#> # … with 2 more variables: holocellulose <units>, spectra <list>
ir_sample_data[2, ] / ir_sample_data[3, ]
#> # A tibble: 1 × 7
#> id_measurement id_sample sample_type sample_comment klason_lignin
#> * <int> <chr> <chr> <chr> <units>
#> 1 2 GN 11-400 needles Cupressocyparis leylandii … 0.339405
#> # … with 2 more variables: holocellulose <units>, spectra <list>
Many more functions and options to handle and process spectra are
available in the ‘ir’ package. These are described in the documentation.
In the documentation, you can also read more details about the functions
and options presented here.
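The row-matching rule for arithmetic operations described above (y has either one spectrum or as many spectra as x) can be sketched in base R on plain lists of data frames. This is only an illustration of the rule. The function add_spectra is hypothetical and is not the implementation of ir_add(), and the requirement of identical x values is an assumption made for this sketch:

```r
# Hypothetical helper (not the implementation of ir_add()): adds the
# intensity values of spectra stored as lists of data frames with
# columns x and y.
add_spectra <- function(x, y) {
  if (!length(y) %in% c(1L, length(x))) {
    stop("`y` must either have only one element or as many elements as `x`.")
  }
  if (length(y) == 1L) y <- rep(y, length(x))  # recycle the single spectrum
  Map(function(sx, sy) {
    stopifnot(identical(sx$x, sy$x))  # assumed: identical "x axis values"
    sx$y <- sx$y + sy$y               # element-wise addition of intensities
    sx
  }, x, y)
}

s1 <- data.frame(x = 3:1, y = c(0.1, 0.2, 0.3))
s2 <- data.frame(x = 3:1, y = c(1, 1, 1))

# One spectrum in y is added to every spectrum in x.
res <- add_spectra(list(s1, s1), list(s2))
res[[2]]$y

# Two spectra in y cannot be matched to one spectrum in x.
bad <- try(add_spectra(list(s1), list(s2, s2)), silent = TRUE)
inherits(bad, "try-error")
```

Only the intensity values of y are used here, which mirrors the statement that the metadata of the second argument are not retained.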
To learn more about how ir objects can be plotted and about
the spectral preprocessing functions, see the vignette Introduction to the ‘ir’ package.
The data contained in the csv file used in this vignette
are derived from Hodgkins et al. (2018).
#> R version 4.2.0 (2022-04-22 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 22000)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=C LC_CTYPE=German_Germany.utf8
#> [3] LC_MONETARY=German_Germany.utf8 LC_NUMERIC=C
#> [5] LC_TIME=German_Germany.utf8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] ggplot2_3.3.5 purrr_0.3.4 dplyr_1.0.8 ir_0.2.1
#>
#> loaded via a namespace (and not attached):
#> [1] tidyselect_1.1.2 xfun_0.30 bslib_0.3.1
#> [4] lpSolve_5.6.15 lattice_0.20-45 limSolve_1.5.6
#> [7] colorspace_2.0-3 hyperSpec_0.100.0 vctrs_0.4.1
#> [10] generics_0.1.2 testthat_3.1.3 htmltools_0.5.2
#> [13] yaml_2.3.5 utf8_1.2.2 rlang_1.0.2
#> [16] jquerylib_0.1.4 pillar_1.7.0 withr_2.5.0
#> [19] glue_1.6.2 RColorBrewer_1.1-3 jpeg_0.1-9
#> [22] lifecycle_1.0.1 stringr_1.4.0 munsell_0.5.0
#> [25] gtable_0.3.0 evaluate_0.15 labeling_0.4.2
#> [28] latticeExtra_0.6-29 knitr_1.39 fastmap_1.1.0
#> [31] SparseM_1.81 fansi_1.0.3 highr_0.9
#> [34] scales_1.2.0 jsonlite_1.8.0 farver_2.1.0
#> [37] brio_1.1.3 png_0.1-7 digest_0.6.29
#> [40] stringi_1.7.6 baseline_1.3-1 rbibutils_2.2.8
#> [43] grid_4.2.0 quadprog_1.5-8 Rdpack_2.3
#> [46] cli_3.3.0 tools_4.2.0 magrittr_2.0.3
#> [49] sass_0.4.1 lazyeval_0.2.2 tibble_3.1.6
#> [52] tidyr_1.2.0 crayon_1.5.1 pkgconfig_2.0.3
#> [55] MASS_7.3-56 ellipsis_0.3.2 xml2_1.3.3
#> [58] rmarkdown_2.13 rstudioapi_0.13 R6_2.5.1
#> [61] compiler_4.2.0