Title: | Optimizing Acoustic Signal Detection |
Version: | 1.0.3 |
Maintainer: | Marcelo Araya-Salas <marcelo.araya@ucr.ac.cr> |
Description: | Facilitates the automatic detection of acoustic signals, providing functions to diagnose and optimize the performance of detection routines. Detections from other software can also be explored and optimized. This package has been peer-reviewed by rOpenSci. Araya-Salas et al. (2022) <doi:10.1101/2022.12.13.520253>. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Encoding: | UTF-8 |
URL: | https://docs.ropensci.org/ohun/, https://github.com/ropensci/ohun/ |
BugReports: | https://github.com/ropensci/ohun/issues/ |
VignetteBuilder: | knitr |
RoxygenNote: | 7.3.2 |
Language: | en-US |
Imports: | tuneR, warbleR (≥ 1.1.32), cli, methods, stats, utils, seewave (≥ 2.0.1), fftw, rlang, sf, igraph, checkmate, ggplot2 |
Depends: | R (≥ 3.2.1) |
Suggests: | knitr, rmarkdown, testthat, viridis, Sim.DiffProc, vdiffr |
NeedsCompilation: | no |
Packaged: | 2025-07-22 15:43:14 UTC; m |
Author: | Marcelo Araya-Salas |
Repository: | CRAN |
Date/Publication: | 2025-07-22 16:30:54 UTC |
ohun: Optimizing sound event detection
Description
ohun is intended to facilitate the automated detection of sound events, providing functions to diagnose and optimize detection routines. Detections from other software can also be explored and optimized.
Details
The main features of the package are:
The use of reference annotations for detection optimization and diagnostics
The use of signal detection theory diagnostic parameters to evaluate detection performance
The batch processing of sound files to improve computational performance
The package offers functions for:
Energy-based detection
Template-based detection
Diagnosing detection performance
Improving detection by adjusting parameters to optimize accuracy
All functions allow the parallelization of tasks, which distributes the tasks among several processors to improve computational efficiency. The package works on sound files in '.wav', '.mp3', '.flac' and '.wac' format.
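As a quick orientation, a typical workflow strings these functions together; the sketch below uses the example data shipped with the package and parameter values taken from the function examples later in this document:
# minimal workflow sketch using the example data shipped with the package
# (parameter values taken from the energy_detector() examples below)
library(ohun)
data("lbh1", "lbh2", "lbh_reference")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# energy-based detection (set cores > 1 to parallelize across sound files)
detection <- energy_detector(
  files = c("lbh1.wav", "lbh2.wav"), path = tempdir(),
  threshold = 60, smooth = 6.8, bp = c(2, 9), hop.size = 6.8,
  min.duration = 90, cores = 1
)
# diagnose performance against the reference annotations
diagnose_detection(reference = lbh_reference, detection = detection)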
License: GPL (>= 2)
Author(s)
Marcelo Araya-Salas
Maintainer: Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
See Also
Useful links:
Report bugs at https://github.com/ropensci/ohun/issues/
Remove ambiguous detections
Description
consensus_detection
removes ambiguous detections
Usage
consensus_detection(
detection,
by = "overlap",
filter = "max",
cores = 1,
pb = TRUE
)
Arguments
detection |
Data frame or selection table (using the warbleR package's format, see selection_table) with the detections to be filtered, typically the output of label_detection. |
by |
Character vector of length 1 indicating a column in 'detection' that will be used to filter detections. Must refer to a numeric column. Default is 'overlap', which is returned by label_detection. |
filter |
Character vector of length 1 indicating the criterion used to filter the column referred to by the 'by' argument. Current options are 'max' (maximum) and 'min' (minimum). Default is 'max'. |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
pb |
Logical argument to control progress bar. Default is TRUE. |
Details
This function removes ambiguous detections, keeping only the one that maximizes the criterion given by 'filter'. By default it keeps the detection with the highest overlap with the reference signal. It works on the output of label_detection. It is mostly useful when several detections match the same reference, as in the case of template detection with multiple templates (see template_detector). Keep in mind that the argument 'solve.ambiguous' must be set to FALSE (in label_detection) to keep those ambiguous detections.
Value
A data frame or selection table (if 'detection' was also a selection table, warbleR package's format, see selection_table) as in 'detection' but with ambiguous detections (split and merged positives) removed.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr).
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
label_detection
, template_detector
Examples
{
# load example data
data("lbh1", "lbh_reference")
# save sound files
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
# template for the first sound file in 'lbh_reference'
templ1 <- lbh_reference[lbh_reference$sound.files == "lbh1.wav" & lbh_reference$selec == 11, ]
# generate template correlations
tc <- template_correlator(
templates = templ1, path = tempdir(),
files = "lbh1.wav"
)
# template detection
td <- template_detector(template.correlations = tc, threshold = 0.35)
# this detection generates 2 split positives
diagnose_detection(
reference = lbh_reference[lbh_reference$sound.files == "lbh1.wav", ],
detection = td,
solve.ambiguous = FALSE
)
# label detection
ltd <- label_detection(
reference = lbh_reference[lbh_reference$sound.files == "lbh1.wav", ],
detection = td,
solve.ambiguous = FALSE
)
# now they can be filtered to keep the detection with the highest score for each split
ftd <- consensus_detection(ltd, by = "scores")
# splits must be 0
diagnose_detection(
reference = lbh_reference[lbh_reference$sound.files == "lbh1.wav", ],
detection = ftd,
solve.ambiguous = FALSE
)
}
Evaluate the performance of a sound event detection procedure
Description
diagnose_detection
evaluates the performance of a sound event detection procedure comparing the output selection table to a reference selection table
Usage
diagnose_detection(
reference,
detection,
by.sound.file = FALSE,
time.diagnostics = FALSE,
cores = 1,
pb = TRUE,
path = NULL,
by = NULL,
macro.average = FALSE,
min.overlap = 0.5,
solve.ambiguous = TRUE
)
Arguments
reference |
Data frame or 'selection.table' (following the warbleR package format) with the reference selections (start and end of the sound events) that will be used to evaluate the performance of the detection, represented by those selections in 'detection'. Must contain at least the following columns: "sound.files", "selec", "start" and "end". It must contain the reference selections that will be used for detection optimization. |
detection |
Data frame or 'selection.table' with the detections (start and end of the sound events) that will be compared against the 'reference' selections. Must contain at least the following columns: "sound.files", "selec", "start" and "end". It can contain data for additional sound files not found in 'reference'. In this case the routine assumes that no sound events are found in those files, so detections from those files are all counted as false positives. |
by.sound.file |
Logical argument to control whether performance diagnostics are summarized across sound files (when FALSE, default) or returned separately for each sound file (when TRUE). |
time.diagnostics |
Logical argument to control if diagnostics related to the duration of the sound events ("mean.duration.true.positives", "mean.duration.false.positives", "mean.duration.false.negatives" and "proportional.duration.true.positives") are returned (if TRUE). Default is FALSE. |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
pb |
Logical argument to control progress bar. Default is TRUE. |
path |
Character string containing the directory path where the sound files are located. If supplied then duty cycle (fraction of a sound file in which sounds were detected) is also returned. This feature is more helpful for tuning an energy-based detection. Default is NULL. |
by |
Character vector with the name of a column in 'reference' for splitting diagnostics. Diagnostics will be returned separately for each level in 'by'. Default is NULL. |
macro.average |
Logical argument to control if diagnostics are first calculated for each sound file and then averaged across sound files, which can minimize the effect of unbalanced sample sizes between sound files. If FALSE (default) diagnostics are calculated on all detections pooled across sound files. |
min.overlap |
Numeric. Controls the minimum amount of overlap required for a detection and a reference sound for it to be counted as true positive. Default is 0.5. Overlap is measured as intersection over union. Only used if |
solve.ambiguous |
Logical argument to control whether ambiguous detections (i.e. split and merged positives) are solved using maximum bipartite graph matching. Default is TRUE. |
Details
The function evaluates the performance of a sound event detection procedure by comparing its output selection table to a reference selection table in which all sound events of interest have been selected. The function takes any overlap between detected sound events and target sound events as true positives. Note that all sound files located in the supplied 'path' will be analyzed even if not all of them are listed in 'reference'. When several possible matching pairs of sound event and detections are found, the optimal set of matching pairs is found through maximum bipartite matching (using the R package igraph). Priority for assigning a detection to a reference is given by the amount of time overlap. 'splits' and 'merge.positives' are also counted (i.e. counted twice) as 'true.positives'. Therefore "true.positives + false.positives = detections".
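For reference, the overlap used to match detections to reference sounds is the intersection over union of their time ranges; a minimal sketch of that computation (hypothetical helper, not part of the package API):
# intersection over union between two time ranges (hypothetical helper;
# the package computes this internally)
iou <- function(start1, end1, start2, end2) {
  intersection <- max(0, min(end1, end2) - max(start1, start2))
  union <- (end1 - start1) + (end2 - start2) - intersection
  intersection / union
}
# e.g. a detection at 1.0-1.5 s against a reference at 1.2-1.6 s
iou(1.0, 1.5, 1.2, 1.6) # 0.5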
Value
A data frame including the following detection performance diagnostics:
- detections: total number of detections.
- true.positives: number of sound events in 'reference' that correspond to any detection. Matching is defined as some degree of overlap in time. In a perfect detection routine it should be equal to the number of rows in 'reference'.
- false.positives: number of detections that don't match (i.e. don't overlap with) any of the sound events in 'reference'. In a perfect detection routine it should be 0.
- false.negatives: number of sound events in 'reference' that were not detected (not found in 'detection'). In a perfect detection routine it should be 0.
- splits: number of detections overlapping reference sounds that also overlap with other detections. In a perfect detection routine it should be 0.
- merges: number of detections that overlap with two or more reference sounds. In a perfect detection routine it should be 0.
- mean.duration.true.positives: mean duration of true positives (in ms). Only included when time.diagnostics = TRUE.
- mean.duration.false.positives: mean duration of false positives (in ms). Only included when time.diagnostics = TRUE.
- mean.duration.false.negatives: mean duration of false negatives (in ms). Only included when time.diagnostics = TRUE.
- overlap: mean intersection over union overlap of true positives.
- proportional.duration.true.positives: ratio of the duration of true positives to the duration of sound events in 'reference'. In a perfect detection routine it should be 1. Based only on true positives that were not split or merged.
- duty.cycle: proportion of a sound file in which sounds were detected. Only included when time.diagnostics = TRUE and path is supplied. Useful when conducting energy-based detection, as a perfect detection can be obtained with a very low amplitude threshold, which will detect everything but will also produce a duty cycle close to 1.
- recall: proportion of sound events in 'reference' that were detected. In a perfect detection routine it should be 1.
- precision: proportion of detections that correspond to sound events in 'reference'. In a perfect detection routine it should be 1.
- f.score: combines recall and precision as their harmonic mean, providing a single value for evaluating performance (see the worked example after this list). In a perfect detection routine it should be 1.
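A worked example of how recall, precision and f.score follow from the counts above (hypothetical counts in plain R, not a package call):
# hypothetical counts from a detection routine
true.positives <- 18
false.positives <- 2
false.negatives <- 1
recall <- true.positives / (true.positives + false.negatives)    # ~0.95
precision <- true.positives / (true.positives + false.positives) # 0.90
f.score <- 2 * recall * precision / (recall + precision)         # ~0.92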
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
optimize_energy_detector
, optimize_template_detector
Examples
{
# load data
data("lbh_reference")
# perfect detection
diagnose_detection(reference = lbh_reference, detection = lbh_reference)
# missing one in detection
diagnose_detection(reference = lbh_reference, detection = lbh_reference[-1, ])
# an extra one in detection
diagnose_detection(reference = lbh_reference[-1, ], detection = lbh_reference)
# with time diagnostics
diagnose_detection(
reference = lbh_reference[-1, ],
detection = lbh_reference, time.diagnostics = TRUE
)
# and an extra sound file in reference
diagnose_detection(
reference = lbh_reference,
detection =
lbh_reference[lbh_reference$sound.files != "lbh1.wav", ]
)
# and an extra sound file in detection
diagnose_detection(
reference =
lbh_reference[lbh_reference$sound.files != "lbh1.wav", ],
detection = lbh_reference
)
# and an extra sound file in detection, by sound file
dd <- diagnose_detection(
reference =
lbh_reference[lbh_reference$sound.files != "lbh1.wav", ],
detection = lbh_reference, time.diagnostics = TRUE, by.sound.file = TRUE
)
# get summary
summarize_diagnostic(dd)
}
Detects the start and end of sound events
Description
energy_detector
detects the start and end of sound events based on energy and time attributes
Usage
energy_detector(
files = NULL,
envelopes = NULL,
path = ".",
hop.size = 11.6,
wl = NULL,
thinning = 1,
bp = NULL,
smooth = 5,
threshold = 5,
peak.amplitude = 0,
hold.time = 0,
min.duration = 0,
max.duration = Inf,
cores = 1,
pb = TRUE
)
Arguments
files |
Character vector indicating the sound files that will be analyzed. Optional. If 'files' and 'envelopes' are not supplied then the function will work on all sound files (in the supported formats) found in 'path'. Supported file formats: '.wav', '.mp3', '.flac' and '.wac'. |
envelopes |
An object of class 'envelopes' (generated by get_envelopes). If supplied the function works on the supplied envelopes instead of computing them from the sound files. Default is NULL. |
path |
Character string containing the directory path where the sound files are located. The current working directory is used as default. |
hop.size |
A numeric vector of length 1 specifying the time window duration (in ms). Default is 11.6 ms, which is equivalent to 512 wl for a 44.1 kHz sampling rate. Ignored if 'wl' is supplied. |
wl |
A numeric vector of length 1 specifying the window length of the spectrogram. Default is NULL. If supplied, 'hop.size' is ignored. |
thinning |
Numeric vector of length 1 in the range 0~1 indicating the proportional reduction of the number of
samples used to represent amplitude envelopes (i.e. the thinning of the envelopes). Usually amplitude envelopes have many more samples
than those needed to accurately represent amplitude variation in time, which affects the size of the
output (usually very large R objects / files). Default is |
bp |
Numeric vector of length 2 giving the lower and upper limits of a frequency bandpass filter (in kHz). Default is NULL (i.e. no filtering). |
smooth |
A numeric vector of length 1 to smooth the amplitude envelope with a sum smooth function. It controls the time 'neighborhood' (in ms) in which amplitude samples are smoothed (i.e. averaged with neighboring samples). Default is 5. 0 means no smoothing is applied. Note that smoothing is applied before thinning (see 'thinning' argument). The function |
threshold |
Numeric vector of length 1 with a value between 0 and 100 specifying the amplitude threshold for detecting sound event occurrences. Amplitude is represented as a percentage so 0 and 100 represent the lowest amplitude and highest amplitude respectively. Default is 5. |
peak.amplitude |
Numeric vector of length 1 with the minimum peak amplitude value. Detections below that value are excluded. Peak amplitude is the maximum sound pressure level (in decibels) across the sound event (see |
hold.time |
Numeric vector of length 1. Specifies the time range (in ms) at which selections will be merged (i.e. if 2 selections are separated by less than the specified 'hold.time' they will be merged into a single selection). Default is 0. |
min.duration |
Numeric vector of length 1 giving the shortest duration (in ms) of the sound events to be detected. It removes sound events below that threshold. If 'hold.time' is supplied sound events are first merged and then filtered by duration. Default is 0 (i.e. no filtering based on minimum duration). |
max.duration |
Numeric vector of length 1 giving the longest duration (in ms) of the sound events to be detected. It removes sound events above that threshold. If 'hold.time' is supplied sound events are first merged and then filtered by duration. Default is Inf. |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
pb |
Logical argument to control progress bar. Default is TRUE. |
Details
This function detects the time position of target sound events based on energy and time thresholds. It first detects all sounds above a given energy threshold (argument 'threshold'). If 'hold.time' is supplied then detected sounds are merged if necessary. The detected sounds are then filtered based on duration attributes ('min.duration' and 'max.duration'). If 'peak.amplitude' is higher than 0 then only those sound events with a higher peak amplitude are kept. Band pass filtering ('bp'), thinning ('thinning') and envelope smoothing ('smooth') are applied (if supplied) before threshold detection.
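The core of the routine is thresholding the amplitude envelope and reading off runs of samples above the threshold; a conceptual sketch with a toy envelope (not the package internals):
# toy envelope expressed as a percentage of the maximum amplitude
envelope <- c(1, 2, 8, 9, 7, 2, 1, 1, 6, 8, 3, 1)
threshold <- 5
above <- envelope > threshold
# runs of consecutive samples above the threshold become candidate events
runs <- rle(above)
ends <- cumsum(runs$lengths)
starts <- ends - runs$lengths + 1
cbind(start = starts[runs$values], end = ends[runs$values])
# the real function then applies hold.time merging and duration filters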
Value
The function returns a 'selection_table' (warbleR package's format, see selection_table) or a data frame (if sound files can't be found) containing the start and end of each sound event by sound file. If no sound event was detected for a sound file it is not included in the output data frame.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
Examples
# Save example files into temporary working directory
data("lbh1", "lbh2", "lbh_reference")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# using smoothing and minimum duration
detec <- energy_detector(files = c("lbh1.wav", "lbh2.wav"),
path = tempdir(), threshold = 6, smooth = 6.8,
bp = c(2, 9), hop.size = 3, min.duration = 0.05)
# diagnose detection
diagnose_detection(reference = lbh_reference,
detection = detec)
# without declaring 'files'
detec <- energy_detector(path = tempdir(), threshold = 60, smooth = 6.8,
bp = c(2, 9), hop.size = 6.8, min.duration = 90)
# diagnose detection
diagnose_detection(reference = lbh_reference,
detection = detec)
# using hold time
detec <- energy_detector(threshold = 10, hold.time = 150,
bp = c(2, 9), hop.size = 6.8, path = tempdir())
# diagnose detection
diagnose_detection(reference = lbh_reference, detection = detec)
# calculate envelopes first
envs <- get_envelopes(bp = c(2, 9), hop.size = 6.8, path = tempdir())
# then run detection providing 'envelopes' (but no 'files')
detec <- energy_detector(envelopes = envs, threshold = 10, hold.time = 150, min.duration = 50)
# diagnose detection
diagnose_detection(reference = lbh_reference, detection = detec, time.diagnostics = TRUE)
# USING OTHER SOUND FILE FORMAT (flac program must be installed)
# first convert files to flac
warbleR::wav_2_flac(path = tempdir())
# change sound file extension to flac
flac_reference <- lbh_reference
flac_reference$sound.files <- gsub(".wav", ".flac", flac_reference$sound.files)
# run detection
detec <- energy_detector(files = c("lbh1.flac", "lbh2.flac"), path = tempdir(), threshold = 60,
smooth = 6.8, bp = c(2, 9), hop.size = 6.8, min.duration = 90)
# diagnose detection
diagnose_detection(reference = flac_reference, detection = detec)
Alternative name for summarize_acoustic_data
Description
Alternative name for summarize_acoustic_data
Usage
feature_acoustic_data(path = ".", digits = 2)
Details
See summarize_acoustic_data for documentation. feature_acoustic_data will be deprecated in future versions.
Alternative name for summarize_reference
Description
Alternative name for summarize_reference
Usage
feature_reference(
reference,
path = NULL,
by.sound.file = FALSE,
units = c("ms", "kHz"),
digits = 2
)
Details
See summarize_reference for documentation. feature_reference will be deprecated in future versions.
Alternative name for consensus_detection
Description
Alternative name for consensus_detection
Usage
filter_detection(
detection,
by = "overlap",
filter = "max",
cores = 1,
pb = TRUE
)
Details
See consensus_detection for documentation. filter_detection will be deprecated in future versions.
Extract absolute amplitude envelopes
Description
get_envelopes
extracts absolute amplitude envelopes to speed up energy detection
Usage
get_envelopes(
path = ".",
files = NULL,
bp = NULL,
hop.size = 11.6,
wl = NULL,
cores = 1,
thinning = 1,
pb = TRUE,
smooth = 5,
normalize = TRUE
)
Arguments
path |
Character string containing the directory path where the sound files are located. The current working directory is used as default. |
files |
Character vector indicating the sound files that will be analyzed. Supported file formats: '.wav', '.mp3', '.flac' and '.wac'. If not supplied the function will work on all sound files (in the supported formats) in 'path'. |
bp |
Numeric vector of length 2 giving the lower and upper limits of a frequency bandpass filter (in kHz). Default is NULL (i.e. no filtering). |
hop.size |
A numeric vector of length 1 specifying the time window duration (in ms). Default is 11.6 ms, which is equivalent to 512 wl for a 44.1 kHz sampling rate. Ignored if 'wl' is supplied. |
wl |
A numeric vector of length 1 specifying the window length of the spectrogram. Default is NULL. If supplied, 'hop.size' is ignored. |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
thinning |
Numeric vector of length 1 in the range 0-1 indicating the proportional reduction of the number of samples used to represent amplitude envelopes (i.e. the thinning of the envelopes). Usually amplitude envelopes have many more samples than those needed to accurately represent amplitude variation in time, which affects the size of the output (usually very large R objects / files). Default is 1 (i.e. no thinning). |
pb |
Logical argument to control progress bar. Default is TRUE. |
smooth |
A numeric vector of length 1 to smooth the amplitude envelope
with a sum smooth function. It controls the time 'neighborhood' (in ms) in which amplitude samples are smoothed (i.e. averaged with neighboring samples). Default is 5. 0 means no smoothing is applied. Note that smoothing is applied before thinning (see 'thinning' argument). The function |
normalize |
Logical argument to control if envelopes are normalized to a 0-1 range. Default is TRUE. |
Details
This function extracts the absolute amplitude envelopes of sound files. The output can be used to explore and manipulate envelopes before running energy_detector.
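As a toy illustration of what the 'smooth' and 'thinning' arguments do to an envelope (conceptual only, not the package internals):
set.seed(123)
# toy envelope: a slow modulation plus noise
env <- abs(sin(seq(0, 10, length.out = 1000))) + rnorm(1000, sd = 0.05)
# smoothing: average each sample with its neighbors (moving average)
smoothed <- as.numeric(stats::filter(env, rep(1 / 50, 50), sides = 2))
smoothed <- smoothed[!is.na(smoothed)] # drop edge samples without a full window
# thinning: keep roughly 10% of the samples
thinned <- approx(seq_along(smoothed), smoothed, n = round(length(smoothed) / 10))$y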
Value
An object of class 'envelopes'.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr).
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
Examples
{
# Save to temporary working directory
data(list = c("lbh1", "lbh2"))
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# get raw absolute amplitude envelopes
envs <- get_envelopes(path = tempdir())
# extract segment for the first sound event in the first sound file
x <- envs[[1]]$envelope
# and plot it
plot(x[(length(x) / 9):(length(x) / 4)], type = "l", xlab = "samples", ylab = "amplitude")
# smoothing envelopes
envs <- get_envelopes(path = tempdir(), smooth = 6.8)
x <- envs[[1]]$envelope
plot(x[(length(x) / 9):(length(x) / 4)], type = "l", xlab = "samples", ylab = "amplitude")
# smoothing and thinning
envs <- get_envelopes(path = tempdir(), thinning = 1 / 10, smooth = 6.8)
x <- envs[[1]]$envelope
plot(x[(length(x) / 9):(length(x) / 4)], type = "l", xlab = "samples", ylab = "amplitude")
# no normalization
envs <- get_envelopes(path = tempdir(), thinning = 1 / 10, smooth = 6.8, normalize = FALSE)
x <- envs[[1]]$envelope
plot(x[(length(x) / 9):(length(x) / 4)],
  type = "l", xlab = "samples", ylab = "amplitude"
)
}
Find templates representative of the structural variation of sound events
Description
get_templates
finds the sound events that are closest to the centroid of the acoustic space (i.e. close to the average acoustic structure) in a reference table.
Usage
get_templates(
reference,
acoustic.space = NULL,
path = ".",
n.sub.spaces = 1,
plot = TRUE,
color = "#21908C4D",
...
)
Arguments
reference |
Selection table (using the warbleR package's format, see selection_table) with the sound events from which templates will be selected. |
acoustic.space |
Numeric matrix or data frame with the two dimensions of a custom acoustic space to be used for finding templates. If not supplied the acoustic space is calculated internally (default). Optional. Note that the function assumes that 'reference' and 'acoustic.space' refer to the same sound events in the same order. |
path |
Character string containing the directory path where the sound files are located. The current working directory is used as default. |
n.sub.spaces |
Integer vector of length 1 with the number of sub-spaces to split the total acoustic space. If n.sub.spaces = 1 (default) only the sound event closest to the centroid of the acoustic space is returned; higher values split the space into that many sub-spaces, each contributing an additional template (see Details). |
plot |
Logical to control if the plot is created. Default is TRUE. |
color |
Character string with the point color. Default is '#21908C4D'. |
... |
Additional arguments to be passed to |
Details
This function finds sound events (from a reference table) that are representative of the acoustic structure variation of all sound events. This is done by finding the events closest to the centroid of the acoustic space. If the acoustic space is not supplied ('acoustic.space' argument) the function estimates it by measuring several acoustic features with the function spectro_analysis (features related to energy distribution in the frequency and time domains as well as features of the dominant frequency contours, see spectro_analysis for more details) and summarizing them with a Principal Component Analysis (after z-transforming the parameters) using the function prcomp. Acoustic features with missing values are removed before running the Principal Component Analysis. The rationale is that a sound event close to the average structure is more likely to share structural features with most events across the acoustic space than a sound event in the periphery of the space.
If only 1 template is required the function returns the sound event closest to the acoustic space centroid. If more than 1 template is required, additional sound events are returned that are representative of the acoustic space. To do this, the function defines sub-spaces as equal-size slices of a circle centered at the centroid of the acoustic space. A column 'template' is included in the output selection table that identifies each template. Custom acoustic spaces can be supplied with the argument 'acoustic.space'. Notice that the function aims to partition spaces in which sounds are more or less homogeneously distributed; when clear clusters are present in the acoustic space, those clusters might not match the sub-spaces defined by the function.
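The "closest to the centroid" logic can be sketched with a generic two-dimensional acoustic space (made-up coordinates; get_templates builds the space from spectro_analysis plus PCA when 'acoustic.space' is not supplied):
set.seed(1)
# made-up 2-D acoustic space: one row per sound event in the reference
acoustic.space <- cbind(PC1 = rnorm(20), PC2 = rnorm(20))
centroid <- colMeans(acoustic.space)
# Euclidean distance of each sound event to the centroid
dists <- sqrt(rowSums(sweep(acoustic.space, 2, centroid)^2))
which.min(dists) # index of the event that would serve as the 'mean' template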
Value
The function returns a 'selection_table' (warbleR package's format, see selection_table) or a data frame (if sound files can't be found) containing the start and end of the sound events selected as templates.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr). Implements a modified version of the timer function from seewave.
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
Examples
{
# Save example files into temporary working directory
data("lbh1", "lbh2", "lbh_reference")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# get a single mean template
template <- get_templates(reference = lbh_reference, path = tempdir())
# get 3 templates
template <- get_templates(reference = lbh_reference, n.sub.spaces = 3, path = tempdir())
}
Label detections from a sound event detection procedure
Description
label_detection
labels detections from a sound event detection procedure by comparing the output selection table to a reference selection table
Usage
label_detection(
reference,
detection,
cores = 1,
pb = TRUE,
min.overlap = 0.5,
by = NULL,
solve.ambiguous = TRUE
)
Arguments
reference |
Data frame or 'selection.table' (following the warbleR package format) with the reference selections (start and end of the sound events) that will be used to evaluate the performance of the detection, represented by those selections in 'detection'. Must contain at least the following columns: "sound.files", "selec", "start" and "end". It must contain the reference selections that will be used for detection optimization. |
detection |
Data frame or 'selection.table' with the detections (start and end of the sound events) that will be compared against the 'reference' selections. Must contain at least the following columns: "sound.files", "selec", "start" and "end". It can contain data for additional sound files not found in 'reference'. In this case the routine assumes that no sound events are found in those files, so detections from those files are all counted as false positives. |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
pb |
Logical argument to control progress bar. Default is TRUE. |
min.overlap |
Numeric. Controls the minimum amount of overlap required for a detection and a reference sound for it to be counted as true positive. Default is 0.5. Overlap is measured as intersection over union. Only used if |
by |
Character vector with the name of a categorical column in 'reference' for running a stratified analysis. Labels will be returned separately for each level in 'by'. Default is NULL. |
solve.ambiguous |
Logical argument to control whether ambiguous detections (i.e. split and merged positives) are solved using maximum bipartite graph matching. Default is TRUE. |
Details
The function identifies the rows in the output of a detection routine as true or false positives. This is achieved by comparing the data frame to a reference selection table in which all sound events of interest have been selected.
Value
A data frame or selection table (if 'detection' was also a selection table, warbleR package's format, see selection_table
) including three additional columns: 'detection.class', which indicates the class of each detection; 'reference', which identifies the event in the 'reference' table that was detected; and 'overlap', which refers to the amount of overlap with the reference sound. See diagnose_detection
for a description of the labels used in 'detection.class'. The output data frame also contains an additional data frame with the overlap for each pair of overlapping detection/reference. Overlap is measured as intersection over union.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
diagnose_detection
, summarize_diagnostic
Examples
{
# load data
data("lbh_reference")
# an extra one in detection (1 false positive)
label_detection(reference = lbh_reference[-1, ], detection = lbh_reference)
# missing one in detection (all true positives)
label_detection(reference = lbh_reference, detection = lbh_reference[-1, ])
# perfect detection (all true positives)
label_detection(reference = lbh_reference, detection = lbh_reference)
# and an extra sound file in reference (all true positives)
label_detection(
reference = lbh_reference, detection =
lbh_reference[lbh_reference$sound.files != "lbh1.wav", ]
)
# and an extra sound file in detection (some false positives)
label_detection(
reference =
lbh_reference[lbh_reference$sound.files != "lbh1.wav", ],
detection = lbh_reference
)
# duplicate 1 detection row (to get 2 splits)
detec <- lbh_reference[c(1, seq_len(nrow(lbh_reference))), ]
detec$selec[1] <- 1.2
label_detection(
reference = lbh_reference,
detection = detec
)
# merge 2 detections (to get split and merge)
Y <- lbh_reference
Y$end[1] <- 1.2
label_detection(reference = lbh_reference, detection = Y)
# remove split to get only merge
Y <- Y[-2, ]
label_detection(reference = lbh_reference, detection = Y)
}
Plot a labeled spectrogram
Description
label_spectro
plots a spectrogram along with amplitude envelopes or cross-correlation scores
Usage
label_spectro(
wave,
reference = NULL,
detection = NULL,
envelope = FALSE,
threshold = NULL,
smooth = 5,
collevels = seq(-100, 0, 5),
palette = viridis::viridis,
template.correlation = NULL,
line.x.position = 2,
hop.size = NULL,
...
)
Arguments
wave |
A 'wave' class object. |
reference |
Data frame or 'selection.table' (following the warbleR package format) with the reference selections (start and end of the sound events). Must contain at least the following columns: "sound.files", "selec", "start" and "end". |
detection |
Data frame or 'selection.table' with the detections (start and end of the sound events). Must contain at least the following columns: "sound.files", "selec", "start" and "end". |
envelope |
Logical to control whether the amplitude envelope is plotted. Default is FALSE. |
threshold |
A numeric vector of length 1 indicating the amplitude or correlation threshold to plot over the envelope or correlation scores, respectively. Default is NULL. |
smooth |
A numeric vector of length 1 to smooth the amplitude envelope with a sum smooth function. It controls the time range (in ms) in which amplitude samples are smoothed (i.e. averaged with neighboring samples). Default is 5. 0 means no smoothing is applied. |
collevels |
Numeric sequence of negative numbers to control color partitioning and amplitude values that are shown (as in |
palette |
Function with the color palette to be used on the spectrogram (as in |
template.correlation |
List extracted from the output of template_correlator. Default is NULL. |
line.x.position |
Numeric vector of length 1 with the position in the frequency axis (so in kHz) of the lines highlighting sound events. Default is 2. |
hop.size |
A numeric vector of length 1 specifying the time window duration (in ms). Default is 11.6 ms, which is equivalent to 512 'wl' for a 44.1 kHz sampling rate. |
... |
Additional arguments to be passed to |
Details
This function plots spectrograms annotated with the position of sound events. It was created for the graphs included in the vignette, and is probably only useful for that or for very short recordings. It only works on a single 'wave' object at a time.
Value
A spectrogram along with lines highlighting the position of sound events in 'reference' and/or 'detection'. If supplied, the amplitude envelope or correlation scores are also plotted below the spectrogram.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr).
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
energy_detector
, template_correlator
, template_detector
Examples
{
# load example data
data("lbh1", "lbh_reference")
# adding labels
label_spectro(
wave = lbh1,
reference = lbh_reference[lbh_reference$sound.files == "lbh1.wav", ],
wl = 200, ovlp = 50, flim = c(1, 10)
)
# adding envelope
label_spectro(
wave = lbh1,
detection = lbh_reference[lbh_reference$sound.files == "lbh1.wav", ],
wl = 200, ovlp = 50, flim = c(1, 10)
)
# see the package vignette for more examples
}
Long-billed hermit recording
Description
lbh1
a wave object with long-billed hermit (Phaethornis longirostris) songs extracted from xeno-canto's '154138' recording.
Usage
data(lbh1)
Format
An object of class Wave
of length 110250.
Source
Marcelo Araya-Salas
Long-billed hermit recording
Description
lbh2
a wave object with long-billed hermit (Phaethornis longirostris) songs extracted from xeno-canto's '154129' recording.
Usage
data(lbh2)
Format
An object of class Wave
of length 110250.
Source
Marcelo Araya-Salas
Example data frame of a selection table including all sound events of interests
Description
lbh_reference
is a data frame containing the start, end, bottom and top frequency of all songs in the 'lbh1.wav' and 'lbh2.wav' recordings.
Usage
data(lbh_reference)
Format
A data frame with 19 rows and 6 variables:
- sound.files: recording names
- selec: selection numbers within recording
- start: start times of selected sound event
- end: end times of selected sound event
- bottom.freq: lower limit of frequency range
- top.freq: upper limit of frequency range
Details
A data frame containing the start, end, low and high frequency of Phaethornis longirostris (Long-billed Hermit) songs from the 2 example sound files included in this package ('lbh1' and 'lbh2'). These two files are clips extracted from xeno-canto recordings '154138' and '154129', respectively.
Source
Marcelo Araya-Salas, ohun
Merge overlapping selections
Description
merge_overlaps
merges several overlapping selections into a single selection
Usage
merge_overlaps(X, pb = TRUE, cores = 1)
Arguments
X |
Data frame or 'selection.table' (following the warbleR package format) with selections (start and end of the sound events). Must contain at least the following columns: "sound.files", "selec", "start" and "end". |
pb |
Logical argument to control progress bar. Default is TRUE. |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
Details
The function finds time-overlapping selections in reference tables and collapses them into a single selection. It can be useful for preparing reference tables to be used in an energy detection routine. In such cases overlapping selections are expected to be detected as a single sound. Therefore, merging them can be useful to prepare references in a format that represents a more realistic expectation of what a perfect energy detection routine would look like.
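The underlying operation is standard interval merging; a conceptual sketch for a single sound file (made-up times, not the package internals):
# selections with the first two overlapping in time
x <- data.frame(start = c(0.1, 0.3, 1.0), end = c(0.5, 0.6, 1.4))
x <- x[order(x$start), ]
merged <- x[1, ]
for (i in 2:nrow(x)) {
  if (x$start[i] <= merged$end[nrow(merged)]) {
    # overlaps the last merged selection: extend it
    merged$end[nrow(merged)] <- max(merged$end[nrow(merged)], x$end[i])
  } else {
    merged <- rbind(merged, x[i, ])
  }
}
merged # the first two selections collapse into a single 0.1-0.6 selection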
Value
If any time-overlapping selections are found the function returns a data frame in which the overlapping selections are collapsed into single selections.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
summarize_diagnostic
, label_detection
Examples
{
# load data
data("lbh_reference")
# nothing to merge
merge_overlaps(lbh_reference)
# create artificial overlapping selections
lbh_ref2 <- rbind(as.data.frame(lbh_reference[c(3, 10), ]), lbh_reference[c(3, 10), ])
lbh_ref2$selec <- seq_len(nrow(lbh_ref2))
merge_overlaps(lbh_ref2)
}
Optimize energy-based sound event detection
Description
Optimize energy-based sound event detection under different detection parameter values
Usage
optimize_energy_detector(
reference,
files = NULL,
threshold = 5,
peak.amplitude = 0,
hop.size = 11.6,
wl = NULL,
smooth = 5,
hold.time = 0,
min.duration = NULL,
max.duration = NULL,
thinning = 1,
cores = 1,
pb = TRUE,
by.sound.file = FALSE,
bp = NULL,
path = ".",
previous.output = NULL,
envelopes = NULL,
macro.average = FALSE,
min.overlap = 0.5
)
Arguments
reference |
Selection table (using the warbleR package's format, see selection_table) with the reference selections (start and end of the sound events) that will be used to evaluate detection performance. |
files |
Character vector indicating the sound files that will be analyzed. Optional. If not supplied the function will work on the sound files in 'reference'. It can be used to include sound files with no target sound events. Supported file formats: '.wav', '.mp3', '.flac' and '.wac'. |
threshold |
A numeric vector specifying the amplitude threshold for detecting sound events (in %). Default is 5. Several values can be supplied for optimization. |
peak.amplitude |
Numeric vector of length 1 with the minimum peak amplitude value. A detection below that value would be excluded. Peak amplitude is the maximum sound pressure level (in decibels) across the sound event (see |
hop.size |
A numeric vector of length 1 specifying the time window duration (in ms). Default is 11.6 ms, which is equivalent to 512 wl for a 44.1 kHz sampling rate. Ignored if 'wl' is supplied. |
wl |
A numeric vector of length 1 specifying the window length of the spectrogram. Default is NULL. If supplied, 'hop.size' is ignored. |
smooth |
A numeric vector of length 1 to smooth the amplitude envelope
with a sum smooth function. It controls the time 'neighborhood' (in ms) in which amplitude samples are smoothed (i.e. averaged with neighboring samples). Default is 5. 0 means no smoothing is applied. Note that smoothing is applied before thinning (see 'thinning' argument). The function |
hold.time |
Numeric vector of length 1. Specifies the time range (in ms) at which selections will be merged (i.e. if 2 selections are separated by less than the specified 'hold.time' they will be merged into a single selection). Default is 0. |
min.duration |
Numeric vector giving the shortest duration (in ms) of the sound events to be detected. It removes sound events below that threshold. Several values can be supplied for optimization. |
max.duration |
Numeric vector giving the longest duration (in ms) of the sound events to be detected. It removes sound events above that threshold. Several values can be supplied for optimization. |
thinning |
Numeric vector in the range 0~1 indicating the proportional reduction of the number of
samples used to represent amplitude envelopes (i.e. the thinning of the envelopes). Usually amplitude envelopes have many more samples
than those needed to accurately represent amplitude variation in time, which affects the size of the
output (usually very large R objects / files). Default is |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
pb |
Logical argument to control progress bar and messages. Default is TRUE. |
by.sound.file |
Logical argument to control whether performance diagnostics are summarized across sound files (when FALSE, default) or returned separately for each sound file (when TRUE). |
bp |
Numeric vector of length 2 giving the lower and upper limits of a frequency bandpass filter (in kHz). Default is NULL (i.e. no filtering). |
path |
Character string containing the directory path where the sound files are located. The current working directory is used as default. |
previous.output |
Data frame with the output of a previous run of this function. This will be used to include previous results in the new output and avoid recalculating detection performance for parameter combinations previously evaluated. |
envelopes |
An object of class 'envelopes' (generated by get_envelopes). If supplied the function works on the supplied envelopes instead of computing them from the sound files. Default is NULL. |
macro.average |
Logical argument to control if diagnostics are first calculated for each sound file and then averaged across sound files, which can minimize the effect of unbalanced sample sizes between sound files. If FALSE (default) diagnostics are calculated on all detections pooled across sound files. |
min.overlap |
Numeric. Controls the minimum amount of overlap required for a detection and a reference sound for it to be counted as true positive. Default is 0.5. Overlap is measured as intersection over union. |
Details
This function takes a selection data frame or 'selection_table' ('reference') and estimates the detection performance of an energy detector under different detection parameter combinations. This is done by comparing the position in time of the detections to those of the reference selections in 'reference'. The function returns several diagnostic metrics that allow users to determine which parameter values produce a detection that more closely matches the selections in 'reference'. Those parameters can later be used for performing a more efficient detection using energy_detector.
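Conceptually, every vector of tuning values supplied is expanded into a grid of parameter combinations and each combination is diagnosed against 'reference'; a sketch of that grid (the function builds and evaluates it internally):
# grid of combinations implied by the vectors used in the examples below
expand.grid(
  threshold = c(6, 10, 15),
  smooth = c(7, 10),
  thinning = c(0.1, 0.01)
)
# 12 combinations, each one run through the detector and diagnosed as in
# diagnose_detection()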
Value
A data frame in which each row shows the result of a detection job with a particular combination of tuning parameters (included in the data frame). It also includes the following diagnostic metrics:
- true.positives: number of sound events in 'reference' that correspond to any detection. Matching is defined as some degree of overlap in time. In a perfect detection routine it should be equal to the number of rows in 'reference'.
- false.positives: number of detections that don't match any of the sound events in 'reference'. In a perfect detection routine it should be 0.
- false.negatives: number of sound events in 'reference' that were not detected (not found in 'detection'). In a perfect detection routine it should be 0.
- splits: number of detections overlapping reference sounds that also overlap with other detections. In a perfect detection routine it should be 0.
- merges: number of detections that overlap with two or more reference sounds. In a perfect detection routine it should be 0.
- mean.duration.true.positives: mean duration of true positives (in ms). Only included when time.diagnostics = TRUE.
- mean.duration.false.positives: mean duration of false positives (in ms). Only included when time.diagnostics = TRUE.
- mean.duration.false.negatives: mean duration of false negatives (in ms). Only included when time.diagnostics = TRUE.
- overlap: mean intersection over union overlap of true positives.
- proportional.duration.true.positives: ratio of the duration of true positives to the duration of sound events in 'reference'. In a perfect detection routine it should be 1. Based only on true positives that were not split or merged. Only included when time.diagnostics = TRUE.
- duty.cycle: proportion of a sound file in which sounds were detected. Only included when time.diagnostics = TRUE and path is supplied.
- recall: proportion of sound events in 'reference' that were detected. In a perfect detection routine it should be 1.
- precision: proportion of detections that correspond to sound events in 'reference'. In a perfect detection routine it should be 1 (see the sketch after this list for one way to pick the best-performing parameter combination).
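One way to use the output is to rank parameter combinations by how well they balance recall and precision; a hedged sketch (assumes 'opt_ed' holds the output of optimize_energy_detector(), as in the examples below):
# harmonic mean of recall and precision (F score) for each combination
f1 <- 2 * opt_ed$recall * opt_ed$precision / (opt_ed$recall + opt_ed$precision)
# parameter combination with the best balance
opt_ed[which.max(f1), ]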
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr).
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
Examples
# Save example files into temporary working directory
data("lbh1", "lbh2", "lbh_reference")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# using smoothing and minimum duration
optimize_energy_detector(
reference = lbh_reference, path = tempdir(),
threshold = c(6, 10), smooth = 6.8, bp = c(2, 9), hop.size = 6.8,
min.duration = 90
)
# with thinning and smoothing
optimize_energy_detector(
reference = lbh_reference, path = tempdir(),
threshold = c(6, 10, 15), smooth = c(7, 10), thinning = c(0.1, 0.01),
bp = c(2, 9), hop.size = 6.8, min.duration = 90
)
# by sound file
(opt_ed <- optimize_energy_detector(
reference = lbh_reference,
path = tempdir(), threshold = c(6, 10, 15), smooth = 6.8, bp = c(2, 9),
hop.size = 6.8, min.duration = 90, by.sound.file = TRUE
))
# summarize
summarize_diagnostic(opt_ed)
# using hold time
(op_ed <- optimize_energy_detector(
reference = lbh_reference,
threshold = 10, hold.time = c(100, 150), bp = c(2, 9), hop.size = 6.8,
path = tempdir()
))
# including previous output in new call
optimize_energy_detector(
reference = lbh_reference, threshold = 10,
hold.time = c(50, 200), previous.output = op_ed, smooth = 6.8,
bp = c(2, 9), hop.size = 7, path = tempdir()
)
# having an extra file in 'files' (simulating a file that should have no detections)
sub_reference <- lbh_reference[lbh_reference$sound.files != "lbh1.wav", ]
optimize_energy_detector(
reference = sub_reference, files = unique(lbh_reference$sound.files),
threshold = 10, hold.time = c(1, 150), bp = c(2, 9), smooth = 6.8,
hop.size = 7, path = tempdir()
)
Optimize acoustic template detection
Description
optimize_template_detector
optimizes acoustic template detection
Usage
optimize_template_detector(
template.correlations,
reference,
threshold,
cores = 1,
pb = TRUE,
by.sound.file = FALSE,
previous.output = NULL,
macro.average = FALSE,
min.overlap = 0.5
)
Arguments
template.correlations |
An object of class 'template_correlations' (generated by template_correlator). |
reference |
Data frame or 'selection.table' (following the warbleR package format) with the reference selections (start and end of the sound events) that will be used to evaluate the performance of the detection, represented by those selections in 'detection'. Must contain at least the following columns: "sound.files", "selec", "start" and "end". It must contain the reference selections that will be used for detection optimization. |
threshold |
Numeric vector of length > 1 with values between 0 and 1 specifying the correlation threshold for detecting sound event occurrences (i.e. correlation peaks). Must be supplied. Several values should be supplied for optimization. |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
pb |
Logical argument to control progress bar and messages. Default is TRUE. |
by.sound.file |
Logical to control if diagnostics are calculated for each sound file independently (if TRUE) or summarized across all sound files (if FALSE, default). |
previous.output |
Data frame with the output of a previous run of this function. This will be used to include previous results in the new output and avoid recalculating detection performance for parameter combinations previously evaluated. |
macro.average |
Logical argument to control if diagnostics are first calculated for each sound file and then averaged across sound files, which can minimize the effect of unbalanced sample sizes between sound files. If FALSE (default) diagnostics are calculated on all detections pooled across sound files. |
min.overlap |
Numeric. Controls the minimum amount of overlap required for a detection and a reference sound for it to be counted as true positive. Default is 0.5. Overlap is measured as intersection over union. |
Details
This function takes a reference data frame or 'selection_table' ('reference') and the output of template_correlator and estimates the detection performance under different detection parameter combinations. This is done by comparing the position in time of the detections to those of the reference selections. The function returns several diagnostic metrics that allow users to determine which parameter values produce a detection that more closely matches the selections in 'reference'. Those parameters can later be used for performing a more efficient detection using template_detector. Supported file formats: '.wav', '.mp3', '.flac' and '.wac'.
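The intended workflow is to scan a range of thresholds, pick the best-performing one, and rerun the detector with it; a hedged sketch (assumes 'tc' is created as in the examples below and that the output includes a 'threshold' column alongside the diagnostics listed under Value):
opt <- optimize_template_detector(
  template.correlations = tc,
  reference = lbh_reference[lbh_reference$sound.files == "lbh2.wav", ],
  threshold = seq(0.1, 0.9, by = 0.1)
)
# one possible criterion: the threshold maximizing the product of recall and precision
best <- opt$threshold[which.max(opt$recall * opt$precision)]
template_detector(template.correlations = tc, threshold = best)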
Value
A data frame in which each row shows the result of a detection job for each cutoff value, including the following diagnostic metrics:
- true.positives: number of sound events in 'reference' that correspond to any detection. Matching is defined as some degree of overlap in time. In a perfect detection routine it should be equal to the number of rows in 'reference'.
- false.positives: number of detections that don't match any of the sound events in 'reference'. In a perfect detection routine it should be 0.
- false.negatives: number of sound events in 'reference' that were not detected (not found in 'detection'). In a perfect detection routine it should be 0.
- splits: number of detections overlapping reference sounds that also overlap with other detections. In a perfect detection routine it should be 0.
- merges: number of sound events in 'detection' that overlap with more than one sound event in 'reference'. In a perfect detection routine it should be 0.
- recall: proportion of sound events in 'reference' that were detected. In a perfect detection routine it should be 1.
- precision: proportion of detections that correspond to sound events in 'reference'. In a perfect detection routine it should be 1.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr).
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
optimize_energy_detector
, template_correlator
, template_detector
Examples
{
# Save sound files to temporary working directory
data("lbh1", "lbh2", "lbh_reference")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# template for the second sound file in 'lbh_reference'
templ <- lbh_reference[11, ]
# generate template correlations
tc <- template_correlator(templates = templ, path = tempdir(),
files = "lbh2.wav")
# using 2 thresholds
optimize_template_detector(template.correlations = tc, reference =
lbh_reference[lbh_reference$sound.files == "lbh2.wav", ],
threshold = c(0.2, 0.5))
# using several thresholds
optimize_template_detector(template.correlations = tc,
reference = lbh_reference[lbh_reference$sound.files == "lbh2.wav", ],
threshold = seq(0.5, 0.9, by = 0.05))
# template for the first and second sound file in 'lbh_reference'
templ <- lbh_reference[c(1, 11), ]
# generate template correlations
tc <- template_correlator(templates = templ, path = tempdir(),
files = c("lbh1.wav", "lbh2.wav"))
optimize_template_detector(template.correlations = tc, reference =
lbh_reference, threshold = seq(0.5, 0.7, by = 0.1))
# showing diagnostics by sound file
optimize_template_detector(template.correlations = tc, reference =
lbh_reference,
threshold = seq(0.5, 0.7, by = 0.1), by.sound.file = TRUE)
}
Plot detection and reference annotations
Description
plot_detection
plots reference and detection annotations next to each other along the time axis to visualize how well they match
Usage
plot_detection(
reference,
detection,
mid.point = FALSE,
size = 20,
positions = c(1, 2)
)
Arguments
reference |
Data frame or 'selection.table' (following the warbleR package format) with the reference selections (start and end of the sound events) that will be used to evaluate the performance of the detection, represented by those selections in 'detection'. Must contain at least the following columns: "sound.files", "selec", "start" and "end". It must contain the reference selections that will be used for detection optimization. |
detection |
Data frame or 'selection.table' with the detections (start and end of the sound events) that will be compared against the 'reference' selections. Must contain at least the following columns: "sound.files", "selec", "start" and "end". It can contain data for additional sound files not found in 'reference'. In this case the routine assumes that no sound events are found in those files, so detections from those files are all counted as false positives. |
mid.point |
Logical argument to control if each annotation is shown as a rectangle of fixed width centered at the mid point of its time position (if TRUE) or spanning its actual start and end (if FALSE, default). |
size |
Numeric. Controls the size of the rectangles if mid.point = TRUE. Default is 20. |
positions |
Numeric. Controls the vertical position of the rectangles representing annotations. Default is c(1, 2). This can be used to bring the reference and detection annotations closer together on the vertical axis. Note that the height of the rectangles is 0.5. |
Details
The function helps to visualize the match between reference and detection annotations by plotting them next to each other as rectangles along the time axis. If the annotations contain data for several sound files, each sound file is plotted in its own panel. The plot can be further modified by users with regular ggplot2 syntax.
Value
A ggplot graph (i.e. an object of class "ggplot").
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
label_spectro
, diagnose_detection
Examples
{
# load data
data("lbh_reference")
# mid point and regular size
plot_detection(
reference = lbh_reference[-14, ],
detection = lbh_reference[-1, ], mid.point = TRUE
)
# mid point and larger size
plot_detection(
reference = lbh_reference[-14, ],
detection = lbh_reference[-1, ], mid.point = TRUE, size = 25
)
# true time rectangles
plot_detection(
reference = lbh_reference[-14, ],
detection = lbh_reference[-1, ]
)
# use position to make reference and detection annotations overlap vertically
plot_detection(
reference = lbh_reference[-14, ],
detection = lbh_reference[-1, ], positions = c(1, 1.4)
)
# modified using ggplot
gg_pd <- plot_detection(
reference = lbh_reference[-14, ],
detection = lbh_reference[-1, ], positions = c(1, 1.4)
)
gg_pd + ggplot2::theme_classic(base_size = 25)
}
Class 'envelopes': list of absolute amplitude envelopes
Description
Class for absolute amplitude envelopes
Usage
## S3 method for class 'envelopes'
print(x, ...)
Arguments
x |
Object of class |
... |
further arguments passed to or from other methods. Ignored when printing envelopes. |
Details
An object of class envelopes
created by get_envelopes
is a list containing the absolute amplitude envelopes of the sound files along with their metadata.
Value
Prints a summary of an object of class 'envelopes'.
See Also
get_envelopes
print method for class template_correlations
Description
print method for class template_correlations
Usage
## S3 method for class 'template_correlations'
print(x, ...)
Arguments
x |
Object of class |
... |
further arguments passed to or from other methods. Ignored when printing 'template_correlations' class objects. |
Value
Prints a summary of an object of class 'template_correlations'.
Reassemble annotations from clips
Description
reassemble_detection
reassembles detections made on clips so they refer to the original sound files
Usage
reassemble_detection(detection, Y, cores = 1, pb = TRUE)
Arguments
detection |
Data frame or selection table (using the warbleR package's format, see |
Y |
Data frame with the start and end of the clips in the original sound files. Must contain the columns "original.sound.files", "sound.files" (clip files), "start" and "end". |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
pb |
Logical argument to control progress bar. Default is |
Details
This function takes detections made on clips created with split_acoustic_data
and adjusts their time information so they refer back to the original (unsplit) sound files.
Value
A data frame or selection table (if 'detection' was also a selection table, warbleR package's format, see selection_table
) as in 'detection' but with start and end times adjusted so they refer back to the original (unsplit) sound files.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr).
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
label_detection
, split_acoustic_data
Examples
{
# load example data
data("lbh1", "lbh2", "lbh_reference")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
## if X is a data frame #####
df_ref <- as.data.frame(lbh_reference)
# get split annotations
split_df_ref <- split_acoustic_data(X = df_ref,
only.sels = TRUE, sgmt.dur = 1.5,
path = tempdir(), pb = FALSE,
files = c("lbh1.wav", "lbh2.wav"))
# get clip information
Y <- split_acoustic_data(sgmt.dur = 1.5,
path = tempdir(), pb = FALSE,
output.path = tempdir(), files = c("lbh1.wav", "lbh2.wav"))
# reassemble annotations
tc <- reassemble_detection(detection = split_df_ref,
Y = Y, pb = FALSE)
# start and end are the same as in the original unsplit data
df_ref <- df_ref[order(df_ref$sound.files, df_ref$start), ]
all(tc$end == df_ref$end)
all(tc$start == df_ref$start)
### if X is a selection table ##
# split annotations and files
split_lbh_reference <- split_acoustic_data(X = lbh_reference,
sgmt.dur = 1.5, path = tempdir(),
output.path = tempdir(),
files = c("lbh1.wav", "lbh2.wav"))
# reassemble annotations
tc <- reassemble_detection(detection = split_lbh_reference,
Y = attributes(split_lbh_reference)$clip.info)
# start and end are the same as in the original unsplit data
lbh_reference <- lbh_reference[order(lbh_reference$sound.files, lbh_reference$start), ]
all(tc$end == lbh_reference$end)
all(tc$start == lbh_reference$start)
}
Splits sound files and associated annotations
Description
split_acoustic_data
splits sound files (and corresponding selection tables) in shorter clips
Usage
split_acoustic_data(
path = ".",
sgmt.dur = 10,
sgmts = NULL,
files = NULL,
cores = 1,
pb = TRUE,
only.sels = FALSE,
output.path = file.path(path, "clips"),
X = NULL
)
Arguments
path |
Directory path where sound files are found. The current working directory is used as default. |
sgmt.dur |
Numeric. Duration (in s) of the segments into which sound files will be split. Sound files shorter than 'sgmt.dur' won't be split. Ignored if 'sgmts' is supplied. |
sgmts |
Numeric. Number of segments in which to split each sound file. If supplied 'sgmt.dur' is ignored. |
files |
Character vector indicating the subset of files that will be split. Supported file formats: '.wav', '.mp3', '.flac' and '.wac'. If not supplied the function will work on all sound files (in the supported formats) in 'path'. |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
pb |
Logical argument to control progress bar. Default is |
only.sels |
Logical argument to control if only the data frame is returned (no wave files are saved). Default is |
output.path |
Directory path where the output files will be saved. If not supplied then a subfolder called 'clips' will be created within the supplied 'path'. |
X |
'selection_table' object or a data frame with columns
for sound file name (sound.files), selection number (selec), and start and end time of signal
(start and end). If supplied, the data frame/selection table is modified to reflect the position of the selections in the new sound files. Note that some selections could be split between 2 segments. To deal with this, a 'split.sels' column is added to the data frame in which those selections are labeled as 'split'. Default is |
Details
This function aims to reduce the size of sound files in order to simplify processes that are limited by sound file size (large files can be problematic for functions such as energy_detector
).
Value
Wave files for each segment (i.e. clips) in the working directory (if only.sels = FALSE
, named as 'sound.file.name-#.wav'). Clips are saved in .wav format. If 'X' is not supplied the function returns a data frame containing the name of the original sound files (original.sound.files), the name of the segments (sound.files) and the start and end of the segments in the original files. If 'X' is supplied then a data frame with the position of the selections in the newly created clips is returned instead. However, if 'X' is a 'selection table' and the clips have been saved, a data frame with the position of the clips in the original sound files is also returned as an attribute of the output selection table ("clip.info"). The output annotation data contains the position of the annotations in the new clips, with an additional column, 'split.sels', that informs users whether annotations have been split into multiple clips ('split') or not (NA
). For split annotations the 'selec' column will contain the original 'selec' id plus an additional index (selec-index) so users can still identify which annotation each split came from. Sound files in 'path' that are not referenced in 'X' will still be split. The function may not work properly with very short segments (< 1 s).
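A minimal sketch of the behavior described above, using the example data (the call mirrors the Examples section; 'split_ref' is just an illustrative object name):
# split both example files and their annotations into 1.5 s clips
data("lbh1", "lbh2", "lbh_reference")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
split_ref <- split_acoustic_data(X = lbh_reference, sgmt.dur = 1.5,
  path = tempdir(), output.path = tempdir(),
  files = c("lbh1.wav", "lbh2.wav"))
# annotations split across two clips are flagged in 'split.sels'
table(split_ref$split.sels, useNA = "ifany")
# clip positions in the original files are stored in the "clip.info" attribute
head(attributes(split_ref)$clip.info)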
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
Examples
{
# load data and save to temporary working directory
data("lbh1", "lbh2")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# split files into 1 s clips
split_acoustic_data(sgmt.dur = 1, path = tempdir(), files = c("lbh1.wav", "lbh2.wav"))
# Check this folder
tempdir()
}
Summarize information about file format in an acoustic data set
Description
summarize_acoustic_data
summarizes information about file format in an acoustic data set
Usage
summarize_acoustic_data(path = ".", digits = 2)
Arguments
path |
Character string containing the directory path where the sound files are located. Default is |
digits |
Numeric vector of length 1 with the number of decimals to include. Default is 2. |
Details
The function summarizes information about file format in an acoustic data set. It provides information about the number of files, file formats, sampling rates, bit depths, channels, duration and file size (in MB). For file format, sampling rate, bit depth and number of channels the function includes the number of files for each value (e.g. '44.1 kHz (2)' means 2 files with a sampling rate of 44.1 kHz).
Value
The function prints a summary of the format of the files in an acoustic data set.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
Examples
{
# load data and save example files into temporary working directory
data("lbh1", "lbh2", "lbh_reference")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# summary across sound files
summarize_acoustic_data(path = tempdir())
}
Summarize detection diagnostics
Description
summarize_diagnostic
summarizes detection diagnostics
Usage
summarize_diagnostic(
diagnostic,
time.diagnostics = FALSE,
macro.average = FALSE
)
Arguments
diagnostic |
A data frame with the output of a detection optimization function ( |
time.diagnostics |
Logical argument to control if diagnostics related to the duration of the sound events ("mean.duration.true.positives", "mean.duration.false.positives", "mean.duration.false.negatives" and "proportional.duration.true.positives") are returned (if |
macro.average |
Logical argument to control if diagnostics are first calculated for each sound file and then averaged across sound files, which can minimize the effect of unbalanced sample sizes between sound files. If |
Details
The function summarizes a detection diagnostic data frame in which diagnostic parameters are shown split by (typically) a categorical column, usually sound files. This function is used internally by diagnose_detection
. Detections counted as 'splits' or 'merges' are still counted as 'true.positives' (i.e. a reference sound can be counted more than once), so that "true.positives + false.positives = detections".
Value
A data frame, similar to the output of a detection optimization function (diagnose_detection
, optimize_energy_detector
, optimize_template_detector
) including the following detection performance diagnostics:
- detections: total number of detections.
- true.positives: number of sound events in 'reference' that correspond to any detection. Matching is defined as some degree of overlap in time. In a perfect detection routine it should be equal to the number of rows in 'reference'.
- false.positives: number of detections that don't match (i.e. don't overlap with) any of the sound events in 'reference'. In a perfect detection routine it should be 0.
- false.negatives: number of sound events in 'reference' that were not detected (not found in 'detection'). In a perfect detection routine it should be 0.
- splits: number of detections overlapping reference sounds that also overlap with other detections. In a perfect detection routine it should be 0.
- merges: number of detections that overlap with two or more reference sounds. In a perfect detection routine it should be 0.
- mean.duration.true.positives: mean duration of true positives (in s). Only included when time.diagnostics = TRUE.
- mean.duration.false.positives: mean duration of false positives (in ms). Only included when time.diagnostics = TRUE.
- mean.duration.false.negatives: mean duration of false negatives (in ms). Only included when time.diagnostics = TRUE.
- overlap: mean intersection-over-union overlap of true positives.
- proportional.duration.true.positives: ratio of the duration of true positives to the duration of sound events in 'reference'. In a perfect detection routine it should be 1. Based only on true positives that were not split or merged.
- duty.cycle: proportion of a sound file in which sounds were detected. Only included when time.diagnostics = TRUE and path is supplied. Useful when conducting energy-based detection, as a seemingly perfect detection can be obtained with a very low amplitude threshold that detects everything but produces a duty cycle close to 1.
- recall: proportion of sound events in 'reference' that were detected. In a perfect detection routine it should be 1.
- precision: proportion of detections that correspond to sound events in 'reference'. In a perfect detection routine it should be 1.
- f.score: combines recall and precision as their harmonic mean, providing a single value for evaluating performance (see the sketch after this list). In a perfect detection routine it should be 1.
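As a quick numeric check of the 'f.score' definition above, the value can be recomputed from 'recall' and 'precision' as their harmonic mean (a sketch using the example data; small differences may arise from rounding or from how values are averaged across sound files):
# diagnose by sound file and then summarize
data("lbh_reference")
diag <- diagnose_detection(reference = lbh_reference,
  detection = lbh_reference[-1, ], by.sound.file = TRUE)
summ_diag <- summarize_diagnostic(diagnostic = diag)
# harmonic mean of recall and precision
f_manual <- 2 * summ_diag$recall * summ_diag$precision /
  (summ_diag$recall + summ_diag$precision)
# compare with the reported f.score
c(manual = f_manual, reported = summ_diag$f.score)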
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
Mesaros, A., Heittola, T., & Virtanen, T. (2016). Metrics for polyphonic sound event detection. Applied Sciences, 6(6), 162.
See Also
Examples
{
# load example selection tables
data("lbh_reference")
# run diagnose_detection() by sound file
diag <- diagnose_detection(
reference = lbh_reference,
detection = lbh_reference[-1, ], by.sound.file = TRUE
)
# summarize
summarize_diagnostic(diagnostic = diag)
# should be the same as this:
diagnose_detection(
reference = lbh_reference,
detection = lbh_reference[-1, ], by.sound.file = FALSE
)
}
Summarize temporal and frequency dimensions of annotations and gaps
Description
summarize_reference
summarizes temporal and frequency dimensions of annotations and gaps
Usage
summarize_reference(
reference,
path = NULL,
by.sound.file = FALSE,
units = c("ms", "kHz"),
digits = 2
)
Arguments
reference |
Data frame or 'selection.table' (following the warbleR package format) with the reference selections (start and end of the sound events) that will be used to evaluate the performance of the detection, represented by those selections in 'detection'. Must contained at least the following columns: "sound.files", "selec", "start" and "end". If frequency range columns are included ("bottom.freq" and "top.freq") these are also used to characterize reference selections. |
path |
Character string containing the directory path where the sound files are located. If supplied then duty cycle and peak frequency features are returned. These features are more helpful for tuning an energy-based detection. Default is |
by.sound.file |
Logical argument to control whether features are summarized across sound files (when |
units |
A character vector of length 2 with the units to be used for time and frequency parameters, in that order. Default is |
digits |
Numeric vector of length 1 with the number of decimals to include. Default is 2. |
Details
The function extracts quantitative features from reference tables that can inform the range of values to be used in an energy-based detection optimization routine. Features related to selection duration can be used to set the 'max.duration' and 'min.duration' values, frequency-related features can inform bandpass values, gap-related features inform hold time values, and duty cycle can be used to evaluate performance.
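A hedged sketch of how the summary can guide an energy-based detection routine; the numeric values passed to energy_detector() below are illustrative placeholders only (not derived from the example data), and its documentation should be checked for the expected units:
# save example files and summarize the reference annotations
data("lbh1", "lbh2", "lbh_reference")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
summarize_reference(reference = lbh_reference, path = tempdir())
# plug plausible duration and bandpass values into the detector
# (placeholder values; see ?energy_detector)
detec <- energy_detector(files = c("lbh1.wav", "lbh2.wav"), path = tempdir(),
  threshold = 50, bp = c(2, 10), min.duration = 90, max.duration = 250)
# evaluate the resulting detection against the reference
diagnose_detection(reference = lbh_reference, detection = detec)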
Value
The function returns the mean, minimum and maximum duration of selections and gaps (time intervals between selections), as well as the number of annotations per sound file. If frequency range columns are included in the reference table (i.e. "bottom.freq" and "top.freq") the minimum bottom frequency ('min.bottom.freq') and the maximum top frequency ('max.top.freq') are also estimated. Finally, if the path to the sound files in 'reference' is supplied, the duty cycle (fraction of a sound file corresponding to target sound events) and peak amplitude (highest amplitude in a detection) are also returned. If 'by.sound.file = FALSE' a matrix with features in rows is returned. Otherwise a data frame is returned in which each row corresponds to a sound file. By default, time features are returned in 'ms' while frequency features are in 'kHz' (but see the 'units' argument).
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
optimize_energy_detector
, optimize_template_detector
Examples
{
# load data and save example files into temporary working directory
data("lbh1", "lbh2", "lbh_reference")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# summary across sound files
summarize_reference(reference = lbh_reference, path = tempdir())
# summary by sound file
summarize_reference(reference = lbh_reference, by.sound.file = TRUE, path = tempdir())
}
Acoustic templates correlator using time-frequency cross-correlation
Description
template_correlator
estimates templates cross-correlation across multiple sound files.
Usage
template_correlator(
templates,
files = NULL,
hop.size = 11.6,
wl = NULL,
ovlp = 0,
wn = "hanning",
cor.method = "pearson",
cores = 1,
path = ".",
pb = TRUE,
type = "fourier",
fbtype = "mel",
...
)
Arguments
templates |
'selection_table', 'extended_selection_table' (warbleR package's formats, see |
files |
Character vector with the names of the files on which to run the cross-correlation with the supplied template(s). Supported file formats: '.wav', '.mp3', '.flac' and '.wac'. If not supplied the function will work on all sound files (in the supported formats) in 'path'. |
hop.size |
A numeric vector of length 1 specifying the time window duration (in ms). Default is 11.6 ms, which is equivalent to 512 wl for a 44.1 kHz sampling rate. Ignored if 'wl' is supplied. |
wl |
A numeric vector of length 1 specifying the window length of the spectrogram. Default is |
ovlp |
Numeric vector of length 1 specifying % of overlap between two
consecutive windows, as in |
wn |
A character vector of length 1 specifying the window name as in |
cor.method |
A character vector of length 1 specifying the correlation method as in |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
path |
Character string containing the directory path where the sound files are located. The current working directory is used as default. |
pb |
Logical argument to control progress bar. Default is |
type |
A character vector of length 1 specifying the type of cross-correlation: "fourier" (i.e. spectrographic cross-correlation using Fourier transform; internally using |
fbtype |
Character vector indicating the auditory frequency scale to use: "mel", "bark", "htkmel", "fcmel". |
... |
Additional arguments to be passed to |
Details
This function calculates the similarity of acoustic templates across sound files by means of time-frequency cross-correlation. Fourier spectrograms or time-frequency representations from auditory scales (including cepstral coefficients) can be used. Several templates can be run over several sound files. Note that template-based detection is divided into two steps: template correlation (using this function) and template detection (i.e. peak detection, which infers occurrences from peak correlation scores, using the function template_detector
). Hence the output of this function (an object of class 'template_correlations') must be passed to template_detector
to infer sound event occurrences. optimize_template_detector
can be used to optimize template detection.
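A compact sketch of this two-step workflow using the example data (the threshold values here are arbitrary illustration choices):
# step 1: correlate a template against a sound file
data("lbh1", "lbh_reference")
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
templ <- lbh_reference[1, ]
tc <- template_correlator(templates = templ, files = "lbh1.wav",
  path = tempdir())
# step 2: infer sound event occurrences from correlation peaks
td <- template_detector(template.correlations = tc, threshold = 0.4)
# optionally, evaluate several thresholds against the reference annotations
optimize_template_detector(template.correlations = tc,
  reference = lbh_reference[lbh_reference$sound.files == "lbh1.wav", ],
  threshold = seq(0.3, 0.6, by = 0.1))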
Value
The function returns an object of class 'template_correlations' which is a list with the correlation scores for each combination of templates and files. 'template_correlations' objects must be used to infer sound event occurrences using template_detector
or to graphically explore template correlations across sound files using full_spectrograms
.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
Khanna H., Gaunt S.L.L. & McCallum D.A. (1997). Digital spectrographic cross-correlation: tests of recall. Bioacoustics 7(3): 209-234.
Lyon, R. H., & Ordubadi, A. (1982). Use of cepstra in acoustical signal analysis. Journal of Mechanical Design, 104(2), 303-306.
See Also
energy_detector
, template_detector
, optimize_template_detector
Examples
{
# load example data
data("lbh1", "lbh2", "lbh_reference")
# save sound files
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# create template
templ <- lbh_reference[4, ]
templ2 <- warbleR::selection_table(templ,
extended = TRUE,
path = tempdir()
)
# fourier spectrogram
(tc_fr <- template_correlator(templates = templ, path = tempdir(), type = "fourier"))
# mel auditory spectrograms
(tc_ma <- template_correlator(templates = templ, path = tempdir(), type = "mel-auditory"))
# mfcc spectrograms
(tc_mfcc <- template_correlator(templates = templ, path = tempdir(), type = "mfcc"))
# similar results (but not exactly the same) are found with the 3 methods
# these are the correlation of the correlation vectors
# fourier vs mel-auditory
cor(
tc_fr$`lbh2.wav-4/lbh2.wav`$correlation.scores,
tc_ma$`lbh2.wav-4/lbh2.wav`$correlation.scores
)
# fourier vs mfcc
cor(
tc_fr$`lbh2.wav-4/lbh2.wav`$correlation.scores,
tc_mfcc$`lbh2.wav-4/lbh2.wav`$correlation.scores
)
# mel-auditory vs mfcc
cor(
tc_ma$`lbh2.wav-4/lbh2.wav`$correlation.scores,
tc_mfcc$`lbh2.wav-4/lbh2.wav`$correlation.scores
)
# using an extended selection table
templ_est <- warbleR::selection_table(templ,
extended = TRUE,
path = tempdir()
)
tc_fr_est <- template_correlator(templates = templ_est, path = tempdir(), type = "fourier")
# produces the same result as templates in a regular data frame
cor(
tc_fr$`lbh2.wav-4/lbh2.wav`$correlation.scores,
tc_fr_est$`lbh2.wav_4-1/lbh2.wav`$correlation.scores
)
}
Acoustic template detection from time-frequency cross-correlations
Description
template_detector
finds sound event occurrences in cross-correlation vectors from template_correlator
Usage
template_detector(
template.correlations,
cores = 1,
threshold,
pb = TRUE,
verbose = TRUE
)
Arguments
template.correlations |
object of class 'template_correlations' generated by |
cores |
Numeric. Controls whether parallel computing is applied. It specifies the number of cores to be used. Default is 1 (i.e. no parallel computing). |
threshold |
Numeric vector of length 1 with a value between 0 and 1 specifying the correlation threshold for detecting sound event occurrences (i.e. correlation peaks). Must be supplied. Correlation scores are constrained to values between 0 and 1 (negative scores are converted to 0); 0 and 1 represent the lowest and highest similarity to the template, respectively. |
pb |
Logical argument to control progress bar. Default is |
verbose |
Logical argument to control if some summary messages are printed to the console. |
Details
This function infers sound event occurrences from cross-correlation scores along sound files. Correlation scores must be generated first using template_correlator
. The output is a data frame (or selection table if sound files are still found in the original path supplied to template_correlator
, using the warbleR package's format, see selection_table
) containing the start and end of the detected sound events as well as the cross-correlation score ('scores' column) for each detection. Note that the detected sounds are assumed to have the same duration as the template, so their start and end correspond to the correlation peak position +/- half the template duration.
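A small sketch of this timing convention (the peak time below is a hypothetical value, not an output of the function):
# detections span half the template duration on each side of the correlation peak
data("lbh_reference")
templ <- lbh_reference[1, ]
templ_dur <- templ$end - templ$start  # template duration (in s)
peak_time <- 2.3                      # hypothetical correlation peak time (in s)
c(start = peak_time - templ_dur / 2, end = peak_time + templ_dur / 2)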
Value
The function returns a 'selection_table' (warbleR package's formats, see selection_table
) or data frame (if sound files can't be found) with the start, end and correlation score for the
detected sound events.
Author(s)
Marcelo Araya-Salas (marcelo.araya@ucr.ac.cr)
References
Araya-Salas, M., Smith-Vidaurre, G., Chaverri, G., Brenes, J. C., Chirino, F., Elizondo-Calvo, J., & Rico-Guevara, A. (2023). ohun: An R package for diagnosing and optimizing automatic sound event detection. Methods in Ecology and Evolution, 14, 2259–2271. https://doi.org/10.1111/2041-210X.14170
See Also
energy_detector
, template_correlator
, optimize_template_detector
Examples
{
# load example data
data("lbh1", "lbh2", "lbh_reference")
# save sound files
tuneR::writeWave(lbh1, file.path(tempdir(), "lbh1.wav"))
tuneR::writeWave(lbh2, file.path(tempdir(), "lbh2.wav"))
# template for the first sound file in 'lbh_reference'
templ1 <- lbh_reference[1, ]
# generate template correlations
tc <- template_correlator(templates = templ1, path = tempdir(), files = "lbh1.wav")
# template detection
td <- template_detector(template.correlations = tc, threshold = 0.4)
# diagnose detection
diagnose_detection(
reference =
lbh_reference[lbh_reference$sound.files == "lbh1.wav", ],
detection = td
)
# template from 'lbh_reference' used to detect in both sound files,
# which have similar song types
templ2 <- lbh_reference[4, ]
# generate template correlations
tc <- template_correlator(
templates = templ2, path = tempdir(),
files = c("lbh1.wav", "lbh2.wav")
)
# template detection
td <- template_detector(template.correlations = tc, threshold = 0.3)
# diagnose detection
diagnose_detection(reference = lbh_reference, detection = td)
}