Type: | Package |
Title: | Hydrologic Model Evaluation and Time-Series Tools |
Version: | 1.1.3 |
Date: | 2024-08-26 |
Description: | Facilitates the analysis and evaluation of hydrologic model output and time-series data with functions focused on comparison of modeled (simulated) and observed data, period-of-record statistics, and trends. |
URL: | https://code.usgs.gov/hymett/hymett |
BugReports: | https://code.usgs.gov/hymett/hymett/-/issues |
Depends: | R (≥ 3.6.0) |
Imports: | checkmate, dplyr, EnvStats, lmomco, lubridate, plyr, rlang, stats, tibble, zoo |
Suggests: | knitr, rmarkdown, roxygen2, testthat |
License: | CC0 |
LazyLoad: | yes |
LazyData: | yes |
VignetteBuilder: | knitr |
BuildVignettes: | true |
Copyright: | This software is in the public domain because it contains materials that originally came from the U.S. Geological Survey, an agency of the U.S. Department of Interior. For more information, see the official USGS copyright policy at http://www.usgs.gov/visual-id/credit_usgs.html#copyright |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
NeedsCompilation: | no |
Packaged: | 2024-08-28 14:59:29 UTC; cpenn |
Author: | Colin Penn |
Maintainer: | Colin Penn <cpenn@usgs.gov> |
Repository: | CRAN |
Date/Publication: | 2024-08-28 23:00:11 UTC |
Hydrologic Model Evaluation and Time-series Tools
Description
Facilitates the analysis and evaluation of hydrologic model output and time-series data with functions focused on comparison of modeled (simulated) and observed data, period-of-record statistics, and trends.
Details
Please see doi:10.5066/P9FNXEWI for more details.
Calculates Kendall's Tau, Spearman's Rho, Pearson Correlation
Description
Calculates Kendall's Tau, Spearman's Rho, Pearson Correlation, and p-values as a wrapper to the stats::cor.test function. Output is a tidy-style data.frame.
Usage
GOF_correlation_tests(mod, obs, na.rm = TRUE, ...)
Arguments
mod |
'numeric' vector. Modeled or simulated values. Must be same length as |
obs |
'numeric' vector. Observed or comparison values. Must be same length as |
na.rm |
'boolean' |
... |
Further arguments to be passed to or from |
Details
See stats::cor.test for more details and further arguments to be passed to or from methods. Defaults are used.
Value
A tibble (tibble::tibble) with test statistic values and p-values.
See Also
Examples
GOF_correlation_tests(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
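As a rough illustration of the wrapper idea described above, the following base-R sketch runs stats::cor.test once per method with default settings and stacks the results into a data.frame. The helper name cor_tests_sketch, the output column names, and the input vectors are hypothetical and are not taken from the package source.

```r
# Hypothetical helper (not the package source): one cor.test call per method.
cor_tests_sketch <- function(mod, obs) {
  methods <- c("pearson", "spearman", "kendall")
  rows <- lapply(methods, function(m) {
    ct <- stats::cor.test(x = mod, y = obs, method = m)  # defaults otherwise
    data.frame(
      method  = m,
      value   = unname(ct$estimate),  # r, rho, or tau
      p_value = ct$p.value
    )
  })
  do.call(rbind, rows)
}

mod <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)  # made-up modeled values
obs <- c(2.0, 4.0, 6.0, 8.0, 10.0, 12.5)  # made-up observed values
res <- cor_tests_sketch(mod, obs)
print(res)
```

Because both made-up vectors increase monotonically, the rank-based statistics (rho and tau) equal 1 here, while Pearson's r is slightly below 1.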
Calculate Kling–Gupta Efficiency (KGE)
Description
Calculate Kling–Gupta Efficiency (KGE), or modified KGE (KGE'), between modeled (simulated) and observed values.
Usage
GOF_kling_gupta_efficiency(mod, obs, modified = FALSE, na.rm = TRUE)
Arguments
mod |
'numeric' vector. Modeled or simulated values. Must be same length as |
obs |
'numeric' vector. Observed or comparison values. Must be same length as |
modified |
'boolean' |
na.rm |
'boolean' |
Value
Value of computed KGE or KGE'.
References
Kling, H., Fuchs, M. and Paulin, M., 2012. Runoff conditions in the upper Danube basin under an
ensemble of climate change scenarios: Journal of Hydrology, v. 424-425, p. 264-277.
[Also available at https://doi.org/10.1016/j.jhydrol.2012.01.011.]
Gupta, H.V., Kling, H., Yilmaz, K.K., and Martinez, G.G., 2009. Decomposition of the mean
squared error and NSE performance criteria: Implications for improving hydrological modelling:
Journal of Hydrology, v. 377, no. 1-2, p. 80-91.
[Also available at https://doi.org/10.1016/j.jhydrol.2009.08.003.]
Examples
GOF_kling_gupta_efficiency(
mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs
)
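For readers who want to see the arithmetic, here is a minimal base-R sketch of the KGE as published by Gupta and others (2009) and the modified form of Kling and others (2012). The function name kge_sketch and the sample vectors are hypothetical; the package's own implementation may differ in details such as NA handling.

```r
# Hypothetical helper: KGE from its three components.
kge_sketch <- function(mod, obs, modified = FALSE) {
  r    <- stats::cor(mod, obs)     # linear correlation
  beta <- mean(mod) / mean(obs)    # bias ratio
  if (modified) {
    # KGE': variability ratio based on coefficients of variation
    var_ratio <- (stats::sd(mod) / mean(mod)) / (stats::sd(obs) / mean(obs))
  } else {
    # KGE: variability ratio based on standard deviations
    var_ratio <- stats::sd(mod) / stats::sd(obs)
  }
  1 - sqrt((r - 1)^2 + (var_ratio - 1)^2 + (beta - 1)^2)
}

obs <- c(10, 12, 30, 55, 41, 20, 14)  # made-up observed values
mod <- c(11, 13, 27, 50, 44, 22, 13)  # made-up modeled values

kge_sketch(mod, obs)
kge_sketch(mod, obs, modified = TRUE)
kge_sketch(obs, obs)  # a perfect match gives 1
```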
Calculates mean absolute error (MAE).
Description
Calculates mean absolute error (MAE) between modeled (simulated) and observed values. Error is defined as modeled minus observed.
Usage
GOF_mean_absolute_error(mod, obs, na.rm = TRUE)
Arguments
mod |
'numeric' vector. Modeled or simulated values. Must be same length as |
obs |
'numeric' vector. Observed or comparison values. Must be same length as |
na.rm |
'boolean' |
Details
The absolute value of each modeled-observed pair error is calculated, then the mean of those values taken. Values returned are in units of input data.
Value
Value of calculated mean absolute error (MAE).
Examples
GOF_mean_absolute_error(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
Calculates mean error.
Description
Calculates mean error between modeled (simulated) and observed values. Error is defined as modeled minus observed.
Usage
GOF_mean_error(mod, obs, na.rm = TRUE)
Arguments
mod |
'numeric' vector. Modeled or simulated values. Must be same length as |
obs |
'numeric' vector. Observed or comparison values. Must be same length as |
na.rm |
'boolean' |
Details
Values returned are in units of input data.
Value
Value of calculated mean error.
Examples
GOF_mean_error(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
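The difference between mean error and mean absolute error can be seen in a few lines of base R. This sketch uses made-up vectors and follows the definition stated above (error is modeled minus observed); it is not the package source.

```r
mod <- c(5, 8, 12, 20)  # made-up modeled values
obs <- c(6, 8, 10, 21)  # made-up observed values

err <- mod - obs                     # error defined as modeled minus observed
me  <- mean(err, na.rm = TRUE)       # signed errors can cancel each other
mae <- mean(abs(err), na.rm = TRUE)  # absolute errors cannot cancel

print(c(mean_error = me, mean_absolute_error = mae))
```

Here the signed errors (-1, 0, 2, -1) cancel to a mean error of 0, while the mean absolute error is 1, which is why the two metrics are usually reported together.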
Calculate Nash–Sutcliffe Efficiency (NSE)
Description
Calculate Nash–Sutcliffe Efficiency (NSE) (with options for modified NSE) between modeled (simulated) and observed values.
Usage
GOF_nash_sutcliffe_efficiency(mod, obs, j = 2, na.rm = TRUE)
Arguments
mod |
'numeric' vector. Modeled or simulated values. Must be same length as |
obs |
'numeric' vector. Observed or comparison values. Must be same length as |
j |
'numeric' value. Exponent value for modified NSE (mNSE) equation. Default value is
|
na.rm |
'boolean' |
Value
Value of computed NSE or mNSE.
References
Krause, P., Boyle, D.P., and Base, F., 2005. Comparison of different efficiency criteria for
hydrological model assessment: Advances in Geosciences, v. 5, p. 89-97.
[Also available at https://doi.org/10.5194/adgeo-5-89-2005.]
Legates, D.R., and McCabe, G.J., 1999, Evaluating the use of "goodness-of-fit" measures in hydrologic and hydroclimatic model validation: Water Resources Research, v. 35, no. 1, p. 233-241. [Also available at https://doi.org/10.1029/1998WR900018.]
Nash, J.E. and Sutcliffe, J.V., 1970, River flow forecasting through conceptual models part I: A discussion of principles: Journal of Hydrology, v. 10, no. 3, p. 282-290. [Also available at https://doi.org/10.1016/0022-1694(70)90255-6.]
Examples
GOF_nash_sutcliffe_efficiency(
mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs
)
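The role of the exponent j can be sketched in base R as follows, using the modified NSE form discussed by Krause and others (2005), in which j = 2 recovers the standard NSE. The function name nse_sketch and the data are hypothetical.

```r
# Hypothetical helper: (modified) Nash-Sutcliffe efficiency with exponent j.
nse_sketch <- function(mod, obs, j = 2) {
  1 - sum(abs(obs - mod)^j) / sum(abs(obs - mean(obs))^j)
}

obs <- c(10, 12, 30, 55, 41, 20, 14)  # made-up observed values
mod <- c(11, 13, 27, 50, 44, 22, 13)  # made-up modeled values

nse_sketch(mod, obs)          # standard NSE (j = 2)
nse_sketch(mod, obs, j = 1)   # modified NSE, less sensitive to large errors
nse_sketch(rep(mean(obs), length(obs)), obs)  # predicting the mean gives 0
```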
Calculates percent bias.
Description
Calculates percent bias between modeled (simulated) and observed values.
Usage
GOF_percent_bias(mod, obs, na.rm = TRUE)
Arguments
mod |
'numeric' vector. Modeled or simulated values. Must be same length as |
obs |
'numeric' vector. Observed or comparison values. Must be same length as |
na.rm |
'boolean' |
Details
Values returned are in percent.
Value
Value of calculated percent bias as percent.
Examples
GOF_percent_bias(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
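A minimal sketch of a common percent bias formulation is shown below. It assumes the same sign convention as the other error metrics in this manual (modeled minus observed, so positive values indicate overestimation); that convention and the sample data are assumptions, not confirmed from the package source.

```r
mod <- c(11, 22, 33)  # made-up modeled values
obs <- c(10, 20, 30)  # made-up observed values

# Assumed formulation: total error as a percentage of total observed volume
pb <- 100 * sum(mod - obs) / sum(obs)
print(pb)
```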
Calculate root-mean-square error with options to normalize
Description
Calculate root-mean-square error (RMSE) between modeled (simulated) and observed values. Error is defined as modeled minus observed.
Usage
GOF_rmse(
mod,
obs,
normalize = c("none", "mean", "range", "stdev", "iqr", "iqr-1", "iqr-2", "iqr-3",
"iqr-4", "iqr-5", "iqr-6", "iqr-7", "iqr-8", "iqr-9", NULL),
na.rm = TRUE
)
Arguments
mod |
'numeric' vector. Modeled or simulated values. Must be same length as |
obs |
'numeric' vector. Observed or comparison values. Must be same length as |
normalize |
'character' value. Option to normalize the root-mean-square error (NRMSE) by
several normalizing options. Default is |
na.rm |
'boolean' |
Value
'numeric' value of computed root-mean-square error (RMSE) or normalized root-mean-square error (NRMSE)
Examples
# RMSE
GOF_rmse(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
# NRMSE
GOF_rmse(
mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs, normalize = 'stdev'
)
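The relationship between RMSE and its normalized variants can be sketched in base R. The helper below covers only a subset of the normalize options and assumes the normalizing statistic is computed from the observed values; both the helper name rmse_sketch and that assumption are hypothetical.

```r
# Hypothetical helper: RMSE with a few normalization choices.
rmse_sketch <- function(mod, obs, normalize = c("none", "mean", "range", "stdev")) {
  normalize <- match.arg(normalize)
  rmse <- sqrt(mean((mod - obs)^2))
  denom <- switch(normalize,
    none  = 1,
    mean  = mean(obs),          # assumed: statistics taken from obs
    range = diff(range(obs)),
    stdev = stats::sd(obs)
  )
  rmse / denom
}

mod <- c(3, 5, 7, 9)  # made-up modeled values
obs <- c(1, 3, 5, 7)  # made-up observed values

rmse_sketch(mod, obs)                      # RMSE in units of the input data
rmse_sketch(mod, obs, normalize = "mean")  # dimensionless NRMSE
```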
Calculate Goodness-of-fit metrics and output into table
Description
Calculate Goodness-of-fit (GOF) metrics for correlation, Kling–Gupta efficiency, mean absolute error, mean error, Nash–Sutcliffe efficiency, percent bias, root-mean-square error, normalized root-mean-square error, and volumetric efficiency, and output into a table.
Usage
GOF_summary(
mod,
obs,
metrics = c("cor", "kge", "mae", "me", "nse", "pb", "rmse", "nrmse", "ve"),
censor_threshold = NULL,
censor_symbol = NULL,
na.rm = TRUE,
kge_modified = FALSE,
nse_j = 2,
rmse_normalize = c("mean", "range", "stdev", "iqr", "iqr-1", "iqr-2", "iqr-3", "iqr-4",
"iqr-5", "iqr-6", "iqr-7", "iqr-8", "iqr-9", NULL),
...
)
Arguments
mod |
'numeric' vector. Modeled or simulated values. Must be same length as |
obs |
'numeric' vector. Observed or comparison values. Must be same length as |
metrics |
'character' vector. Which GOF metrics should be computed and output. Default is
|
censor_threshold |
'numeric' value. Threshold to censor values on utilizing
|
censor_symbol |
'character' string. Inequality symbol to censor values based on
|
na.rm |
'boolean' |
kge_modified |
'boolean' |
nse_j |
'numeric' value. Exponent value for modified NSE (mNSE) equation, utilized if
|
rmse_normalize |
'character' value. Normalize option for NRMSE, utilized if "nrmse" option
is in parameter |
... |
Further arguments to be passed to or from |
Details
See GOF_correlation_tests, GOF_kling_gupta_efficiency, GOF_mean_absolute_error, GOF_mean_error, GOF_nash_sutcliffe_efficiency, GOF_percent_bias, GOF_rmse, and GOF_volumetric_efficiency.
Value
A tibble (see tibble::tibble) with GOF metrics.
See Also
censor_values, GOF_correlation_tests, GOF_kling_gupta_efficiency, GOF_mean_absolute_error, GOF_mean_error, GOF_nash_sutcliffe_efficiency, GOF_percent_bias, GOF_rmse, GOF_volumetric_efficiency
Examples
GOF_summary(mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs)
Calculate Volumetric Efficiency
Description
Calculate Volumetric efficiency (VE) between modeled (simulated) and observed values. VE is defined as the fraction of water delivered at the proper time (Criss and Winston, 2008).
Usage
GOF_volumetric_efficiency(mod, obs, na.rm = TRUE)
Arguments
mod |
'numeric' vector. Modeled or simulated values. Must be same length as |
obs |
'numeric' vector. Observed or comparison values. Must be same length as |
na.rm |
'boolean' |
Details
Volumetric efficiency was proposed in order to circumvent some problems associated with the
Nash–Sutcliffe efficiency. It ranges from 0 to 1 and represents the fraction of water
delivered at the proper time; its complement represents the fractional volumetric mismatch
(Criss and Winston, 2008).
Value
Value of computed Volumetric efficiency.
References
Criss, R.E. and Winston, W.E., 2008, Do Nash values have value? Discussion and alternate
proposals: Hydrological Processes, v. 22, p. 2723-2725.
[Also available at https://doi.org/10.1002/hyp.7072.]
Zambrano-Bigiarini, M., 2020, hydroGOF: Goodness-of-fit functions for comparison of simulated and observed hydrological time series: R package version 0.4-0, accessed September 16, 2020, at https://github.com/hzambran/hydroGOF. [Also available at https://doi.org/10.5281/zenodo.839854.]
Examples
GOF_volumetric_efficiency(
mod = example_mod$streamflow_cfs, obs = example_obs$streamflow_cfs
)
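The definition from Criss and Winston (2008) can be written in one line of base R. The sketch below uses made-up values and also shows the complement, which is the fractional volumetric mismatch mentioned in Details.

```r
mod <- c(8, 12, 10)   # made-up modeled values
obs <- c(10, 10, 10)  # made-up observed values

# VE = 1 - sum(|modeled - observed|) / sum(observed)
ve <- 1 - sum(abs(mod - obs)) / sum(obs)
mismatch <- 1 - ve  # fractional volumetric mismatch

print(c(VE = ve, mismatch = mismatch))
```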
Calculate the 50th and 90th percentiles of a streamflow time series
Description
This function computes the 50th and 90th percentiles of a streamflow time series from annual n-day high flow values and returns a data.frame in the format of other period-of-record (POR) metrics.
Usage
POR_apply_annual_hiflow_stats(annual_max, quantile_type = 8)
Arguments
annual_max |
'numeric' vector or data.frame. Vector or data.frame with columns of annual n-day maximum streamflows. |
quantile_type |
'numeric' value. The distribution type used in the |
Details
Annual maximum of n-day moving averages can be computed during the pre-processing step using
preproc_precondition_data and calc_annual_flow_stats, or preproc_main, for both
observed and modeled data.
Value
Data.frame of 0.5 and 0.9 non-exceedance probabilities (50th and 90th percentiles),
with metric names if annual_max is a data.frame with columns named by metric.
See Also
quantile, preproc_precondition_data, calc_annual_flow_stats, preproc_main
Examples
POR_apply_annual_hiflow_stats(annual_max = example_annual[ , c("high_q1", "high_q30")])
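The underlying calculation can be approximated with stats::quantile applied to each column of annual maxima. In this sketch the data.frame values and the row labels p50 and p90 are made up for illustration; only the probabilities (0.5 and 0.9) and the default quantile type (8) come from the text above.

```r
# Made-up annual n-day maximum streamflows, one column per metric
annual_max <- data.frame(
  high_q1  = c(120, 95, 210, 160, 130, 300, 180),
  high_q30 = c(60, 55, 90, 75, 66, 110, 80)
)

# 0.5 and 0.9 non-exceedance probabilities for every column
hiflow <- sapply(
  annual_max, stats::quantile,
  probs = c(0.5, 0.9), type = 8, names = FALSE
)
rownames(hiflow) <- c("p50", "p90")  # hypothetical labels
print(hiflow)
```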
Calculate 10-year and 2-year return periods of a streamflow time series
Description
Calculates 10-year and 2-year return periods of a streamflow time series from annual n-day low streamflow values and returns a data.frame in the format of other period-of-record (POR) metrics.
Usage
POR_apply_annual_lowflow_stats(annual_min)
Arguments
annual_min |
'numeric' vector or data.frame. Vector or data.frame with columns of annual n-day minimum streamflows. |
Details
POR_apply_annual_lowflow_stats is a helper function that applies the POR_calc_lp3_quantile
function to the data.frame of n-day moving averages, which can be computed during the pre-processing
step using preproc_precondition_data and calc_annual_flow_stats, or preproc_main, for
both observed and modeled data. This function returns a data.frame with the 10-year and 2-year
return period streamflows for each n-day low streamflow in the input data.frame.
Value
data.frame with 10-year and 2-year return period of n-day streamflows.
See Also
POR_calc_lp3_quantile, preproc_precondition_data, calc_annual_flow_stats, preproc_main
Examples
POR_apply_annual_lowflow_stats(annual_min = example_annual[ , c("low_q1", "low_q30")])
Calculates lag-one autocorrelation (AR1) coefficient for a time series
Description
Calculates lag-one autocorrelation (AR1) coefficient for a time series.
Usage
POR_calc_AR1(data = NULL, Date, value, time_step = c("daily", "monthly"))
Arguments
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'numeric' vector of Dates corresponding to each |
value |
'numeric' vector of values (often streamflow) when |
time_step |
'character' value. Either |
Details
The function calculates the lag-one autocorrelation (AR1) coefficient for a time series using the
stats::ar function. When applied to an observed or modeled time series of streamflow, the
POR_deseasonalize function can be applied to the raw data prior to running the
POR_calc_AR1 function.
Value
The calculated lag-one autocorrelation (AR1) coefficient.
References
Farmer, W.H., Archfield, S.A., Over, T.M., Hay, L.E., LaFontaine, J.H., and Kiang, J.E., 2014, A comparison of methods to predict historical daily streamflow time series in the southeastern United States: U.S. Geological Survey Scientific Investigations Report 2014–5231, 34 p. [Also available at https://doi.org/10.3133/sir20145231.]
See Also
Examples
POR_calc_AR1(data = example_obs, Date = "Date", value = "streamflow_cfs")
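A minimal sketch of fitting an AR(1) coefficient with stats::ar follows. It uses a simulated series and the default (Yule-Walker) fitting method; whether the package uses the same method or other settings is an assumption.

```r
set.seed(1)
# Simulated series with a known lag-one autocorrelation of 0.7
x <- as.numeric(stats::arima.sim(model = list(ar = 0.7), n = 500))

# Force a first-order model instead of letting AIC choose the order
fit <- stats::ar(x, aic = FALSE, order.max = 1)
ar1 <- as.numeric(fit$ar)[1]
print(ar1)
```

The estimate should land near the simulated value of 0.7; for streamflow, the series would typically be deseasonalized first, as described in Details.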
Calculate the seasonal amplitude and phase of a daily time series
Description
Calculates the seasonal amplitude and phase of a daily time series.
Usage
POR_calc_amp_and_phase(
data = NULL,
Date,
value,
time_step = c("daily", "monthly")
)
Arguments
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'numeric' vector of Dates corresponding to each |
value |
'numeric' vector of values (often streamflow) when |
time_step |
'character' value. Either |
Value
A data.frame with calculated seasonal amplitude and phase
References
Farmer, W.H., Archfield, S.A., Over, T.M., Hay, L.E., LaFontaine, J.H., and Kiang, J.E., 2014, A comparison of methods to predict historical daily streamflow time series in the southeastern United States: U.S. Geological Survey Scientific Investigations Report 2014–5231, 34 p. [Also available at https://doi.org/10.3133/sir20145231.]
Examples
POR_calc_amp_and_phase(data = example_obs, Date = "Date", value = "streamflow_cfs")
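One standard way to estimate a seasonal amplitude and phase is to regress the series on annual sine and cosine terms. The sketch below illustrates that general technique on a synthetic series; it is not necessarily the formulation used by the package, and the phase convention (radians, relative to the first date) is an assumption.

```r
dates <- seq(as.Date("2001-01-01"), as.Date("2004-12-31"), by = "day")
t_yr  <- as.numeric(dates - dates[1]) / 365.25  # elapsed time in years

# Synthetic series: mean 20, amplitude 8, phase 1 radian
value <- 20 + 8 * cos(2 * pi * t_yr - 1)

# value = b1 + b2 * cos(2*pi*t) + b3 * sin(2*pi*t)
fit <- stats::lm(value ~ cos(2 * pi * t_yr) + sin(2 * pi * t_yr))
b   <- unname(stats::coef(fit))

amplitude <- sqrt(b[2]^2 + b[3]^2)
phase     <- atan2(b[3], b[2])
print(c(amplitude = amplitude, phase = phase))
```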
Calculate quantile from fitted log-Pearson type III distribution
Description
Calculate the specified flow quantile from a fitted log-Pearson type III distribution from a time series of n-day low flows.
Usage
POR_calc_lp3_quantile(annual_min, p)
Arguments
annual_min |
'numeric' vector. Vector of minimum annual n-day mean flows. |
p |
'numeric' value of exceedance probabilities. Quantile of fitted distribution that is
returned ( |
Details
POR_calc_lp3_quantile fits a log-Pearson type III distribution to a series of annual n-day
flows and returns the quantile of a user-specified probability using calc_qlpearsonIII. This
represents a theoretical return period for that n-day flow.
Value
Specified quantile from the fitted log-Pearson type 3 distribution.
References
Asquith, W.H., Kiang, J.E., and Cohn, T.A., 2017, Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities: U.S. Geological Survey Scientific Investigation Report 2017–5038, 93 p. [Also available at https://doi.org/10.3133/sir20175038.]
See Also
Examples
POR_calc_lp3_quantile(annual_min = example_annual$low_q1, p = 0.1)
Removes seasonal trends from a daily or monthly time series.
Description
Removes seasonal trends from a daily or monthly time series. Daily data are deseasonalized by subtracting monthly mean values. Monthly data are deseasonalized by subtracting mean monthly values.
Usage
POR_deseasonalize(data = NULL, Date, value, time_step = c("daily", "monthly"))
Arguments
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'numeric' vector of Dates corresponding to each |
value |
'numeric' vector of values (often streamflow) when |
time_step |
'character' value. Either |
Details
The deseasonalize function removes seasonal trends from a daily or monthly time series
and returns a deseasonalized time series, which can be used in the POR_calc_AR1 function.
Value
Deseasonalized values.
See Also
Examples
POR_deseasonalize(data = example_obs, Date = "Date", value = "streamflow_cfs")
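The daily case can be sketched in base R by subtracting, from each value, the mean of all values that fall in the same calendar month. This reading of "monthly mean values" (pooled across years) and the synthetic data are assumptions.

```r
dates <- seq(as.Date("2000-01-01"), as.Date("2003-12-31"), by = "day")
doy   <- as.numeric(format(dates, "%j"))
value <- 100 + 50 * sin(2 * pi * doy / 365.25)  # synthetic seasonal signal

month <- format(dates, "%m")

# ave() returns the group mean aligned with each element (default FUN = mean)
deseason <- value - ave(value, month)

# After deseasonalizing, every calendar month averages to (numerically) zero
print(round(tapply(deseason, month, mean), 10))
```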
Calculates various metrics that describe the distribution of a time series of streamflow
Description
Calculates various metrics that describe the distribution of a time series of streamflow, which can be of any time step.
Usage
POR_distribution_metrics(value, quantile_type = 8, na.rm = TRUE)
Arguments
value |
'numeric' vector of values (assumed to be streamflow) at any time step. |
quantile_type |
'numeric' value. The distribution type used in the |
na.rm |
'boolean' |
Details
Metrics computed include:
p_n
Flow-duration curve (FDC) percentile, where n = 1, 5, 10, 25, 50, 75, 90, 95, and 99
POR_mean
Period of record mean
POR_sd
Period of record standard deviation
POR_cv
Period of record coefficient of variation
POR_min
Period of record minimum
POR_max
Period of record maximum
LCV
L-moment coefficient of variation
Lskew
L-moment skewness
Lkurtosis
L-moment kurtosis
Value
A data.frame with FDC quantiles and distribution metrics. See Details.
References
Farmer, W.H., Archfield, S.A., Over, T.M., Hay, L.E., LaFontaine, J.H., and Kiang, J.E., 2014, A comparison of methods to predict historical daily streamflow time series in the southeastern United States: U.S. Geological Survey Scientific Investigations Report 2014–5231, 34 p. [Also available at https://doi.org/10.3133/sir20145231.]
Asquith, W.H., Kiang, J.E., and Cohn, T.A., 2017, Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities: U.S. Geological Survey Scientific Investigation Report 2017–5038, 93 p. [Also available at https://doi.org/10.3133/sir20175038.]
Asquith, W.H., 2021, lmomco—L-moments, censored L-moments, trimmed L-moments,
L-comoments, and many distributions. R package version 2.3.7, Texas Tech University,
Lubbock, Texas.
See Also
Examples
POR_distribution_metrics(value = example_obs$streamflow_cfs)
Calculate benchmark Kling–Gupta efficiency (KGE) values from day-of-year (DOY) observations
Description
Calculate benchmark Kling–Gupta efficiency (KGE) values from daily observed time-series data
Usage
benchmark_KGE_DOY(obs_preproc)
Arguments
obs_preproc |
'data.frame' of daily observational data, preprocessed as output from |
Details
This function calculates a "benchmark" KGE value (see Knoben and others, 2020) from a daily
observed data time-series. First, the interannual mean and median are calculated for each day of
the calendar year. Next, the interannual mean and median values are joined to each corresponding
day in the observation time series. Finally, a KGE value (GOF_kling_gupta_efficiency) is
calculated comparing the mean or median value repeated time series to the daily observational
time series. These benchmark KGE values can be used as comparisons for modeled (simulated)
calibration results.
Value
A data.frame with columns "KGE_DOY_mean" and "KGE_DOY_median".
References
Knoben, W.J.M, Freer, J.E., Peel, M.C., Fowler, K.J.A, Woods, R.A., 2020. A Brief Analysis of
Conceptual Model Structure Uncertainty Using 36 Models and 559 Catchments: Water Resources
Research, v. 56.
[Also available at https://doi.org/10.1029/2019WR025975.]
Examples
benchmark_KGE_DOY(obs_preproc = example_preproc)
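The three steps in Details (day-of-year statistics, join, KGE) can be sketched in base R as follows. The synthetic series, the local kge helper, and the use of ave() for the join are illustrative assumptions; the package works from its preprocessed data.frame instead.

```r
set.seed(42)
dates <- seq(as.Date("2001-01-01"), as.Date("2004-12-31"), by = "day")
doy   <- as.integer(format(dates, "%j"))
obs   <- 50 + 30 * sin(2 * pi * doy / 365.25) + stats::rnorm(length(dates), sd = 5)

# Steps 1 and 2: interannual mean/median per day of year, aligned to each day
doy_mean   <- ave(obs, doy, FUN = mean)
doy_median <- ave(obs, doy, FUN = stats::median)

# Step 3: KGE (Gupta and others, 2009) of each repeated series against obs
kge <- function(mod, obs) {
  r     <- stats::cor(mod, obs)
  alpha <- stats::sd(mod) / stats::sd(obs)
  beta  <- mean(mod) / mean(obs)
  1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2)
}

benchmark <- data.frame(
  KGE_DOY_mean   = kge(doy_mean, obs),
  KGE_DOY_median = kge(doy_median, obs)
)
print(benchmark)
```

A calibrated model would normally be expected to score above these benchmark values, since the benchmark uses no information beyond the seasonal cycle of the observations.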
Calculate annual flow statistics from daily data
Description
Calculate annual flow statistics from daily data
Usage
calc_annual_flow_stats(
data = NULL,
Date,
year_group,
Q,
Q3 = NA_real_,
Q7 = NA_real_,
Q30 = NA_real_,
jd = NA_integer_,
calc_high = FALSE,
calc_low = FALSE,
calc_percentiles = FALSE,
calc_monthly = FALSE,
calc_WSCVD = FALSE,
longitude = NA,
calc_ICVD = FALSE,
zero_threshold = 33,
quantile_type = 8,
na.action = c("na.omit", "na.pass")
)
Arguments
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'Date' or 'character' vector when |
year_group |
'numeric' vector when |
Q |
'numeric' vector when |
Q3 |
'numeric' vector when |
Q7 |
'numeric' vector when |
Q30 |
'numeric' vector when |
jd |
'numeric' vector when |
calc_high |
'boolean' value. Calculate high flow statistics for years in |
calc_low |
'boolean' value. Calculate low flow statistics for years in |
calc_percentiles |
'boolean' value. Calculate percentiles for years in |
calc_monthly |
'boolean' value. Calculate monthly statistics for years in |
calc_WSCVD |
'boolean' value. Calculate winter-spring center volume date for years in
|
longitude |
'numeric' value. Site longitude in North American Datum of 1983 (NAD83),
required in WSCVD calculation. Default is |
calc_ICVD |
'boolean' value. Calculate inverse center volume date for years in |
zero_threshold |
'numeric' value as percentage. The percentage of years of a statistic that
need to be zero in order for it to be deemed a zero flow site for that statistic. For use in
trend calculation. See Details on attributes. Default is |
quantile_type |
'numeric' value. The distribution type used in the |
na.action |
'character' string indicating na.action passed to |
Details
year_group is commonly water year, climate year, or calendar year.
Default annual statistics returned:
annual_mean
annual mean in year_group
annual_sd
annual standard deviation in year_group
annual_sum
annual sum in year_group
If calc_high/low are selected, annual statistics returned:
1-, 3-, 7-, and 30-day high/low and Julian date (jd) of n-day high/low.
high_qn
where n = 1, 3, 7, and 30
high_qn_jd
where n = 1, 3, 7, and 30
low_qn
where n = 1, 3, 7, and 30
low_qn_jd
where n = 1, 3, 7, and 30
If calc_percentiles is selected, annual statistics returned:
1, 5, 10, 25, 50, 75, 90, 95, 99 percentile based on daily streamflow.
annual_n_percentile
where n = 1, 5, 10, 25, 50, 75, 90, 95, and 99
If calc_monthly is selected, annual statistics returned:
Monthly mean, standard deviation, max, min, percent of annual for each month in year_group.
month_mean
monthly mean, where month = month.abb
month_sd
monthly standard deviation, where month = month.abb
month_max
monthly maximum, where month = month.abb
month_min
monthly minimum, where month = month.abb
month_percent_annual
monthly percent of annual, where month = month.abb
If calc_WSCVD is selected, Julian date of annual winter-spring center volume date is returned.
Longitude (in NAD83 datum) is used to determine the ending month of spring: July for longitudes
west of -95 degrees, May for longitudes east of -95 degrees. See References
(Dudley and others, 2017). Commonly calculated when year_group is water year.
WSCVD
Julian date of winter-spring center volume
If calc_ICVD is selected, Julian date of annual inverse center volume date is returned.
Commonly calculated when year_group is climate year.
ICVD
Julian date of inverse center volume date
Attribute: zero_flow_years
A data.frame with each annual statistic calculated, the percentage of years where the
statistic = 0, a flag indicating if the percentage is over the zero_threshold parameter,
and the number of years with a zero value. Columns in zero_flow_years:
annual_stat
annual statistic
percent_zeros
percentage of years with 0 statistic value
over_threshold
boolean if percentage is over threshold
number_years
number of years with 0 value statistic
The zero_flow_years attribute can be useful in trend calculation, where a trend may not be
appropriate to calculate with many zero flow years.
Value
A tibble (see tibble::tibble) with annual statistics depending on options selected.
See Details.
References
Dudley, R.W., Hodgkins, G.A, McHale, M.R., Kolian, M.J., Renard, B., 2017, Trends in snowmelt-related streamflow timing in the conterminous United States: Journal of Hydrology, v. 547, p. 208-221. [Also available at https://doi.org/10.1016/j.jhydrol.2017.01.051.]
See Also
Examples
calc_annual_flow_stats(data = example_preproc, Date = "Date", year_group = "WY", Q = "value")
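The default annual statistics (mean, standard deviation, and sum per year group) can be sketched with base R's tapply. The synthetic flows and the water-year rule used here (October through September, labeled by the ending calendar year) are assumptions for illustration.

```r
dates <- seq(as.Date("2010-10-01"), as.Date("2012-09-30"), by = "day")
doy   <- as.integer(format(dates, "%j"))
Q     <- 40 + 25 * sin(2 * pi * doy / 365.25)  # synthetic daily streamflow

# Assumed water-year rule: months 10-12 belong to the following year's WY
WY <- as.integer(format(dates, "%Y")) +
  as.integer(as.integer(format(dates, "%m")) >= 10)

annual <- data.frame(
  WY          = as.integer(names(tapply(Q, WY, mean))),
  annual_mean = as.numeric(tapply(Q, WY, mean)),
  annual_sd   = as.numeric(tapply(Q, WY, stats::sd)),
  annual_sum  = as.numeric(tapply(Q, WY, sum))
)
print(annual)
```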
Calculate trend in annual statistics
Description
Calculate trend in annual statistics
Usage
calc_annual_stat_trend(data = NULL, year, value, ...)
Arguments
data |
'data.frame'. Optional data.frame input, with columns containing |
year |
'numeric' vector when |
value |
'numeric' vector when |
... |
further arguments to be passed to or from |
Details
This function is a wrapper for EnvStats::kendallTrendTest with the passed equation
value ~ year. The returned values include Mann-Kendall test statistic and p-value,
Theil-Sen slope and intercept values, and trend details (Millard, 2013; Helsel and others, 2020).
z_stat
Mann-Kendall test statistic, returned directly from EnvStats::kendallTrendTest
p_value
z_stat p-value, returned directly from EnvStats::kendallTrendTest
sen_slope
Sen slope in units value per year, returned directly from EnvStats::kendallTrendTest
intercept
Sen slope intercept, returned directly from EnvStats::kendallTrendTest
trend_mag
Trend magnitude over entire period, in units of value, calculated as sen_slope * (max(year) - min(year))
val_beg/end
Calculated value at beginning or end of period, calculated as sen_slope * year + intercept
val_perc_change
Percentage change over period, calculated as (val_end - val_beg) / val_beg * 100
Value
A tibble (see tibble::tibble) with test statistic, p-value, trend coefficients, and
trend calculations. See Details.
References
Millard, S.P., 2013, EnvStats: An R Package for Environmental Statistics: New York, New York, Springer, 291 p. [Also available at https://doi.org/10.1007/978-1-4614-8456-1.]
Helsel, D.R., Hirsch, R.M., Ryberg, K.R., Archfield, S.A., and Gilroy, E.J., 2020, Statistical methods in water resources: U.S. Geological Survey Techniques and Methods, book 4, chap. A3, 458 p. [Also available at https://doi.org/10.3133/tm4a3.]
See Also
Examples
calc_annual_stat_trend(data = example_annual, year = "WY", value = "annual_mean")
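The derived trend quantities listed in Details can be reproduced from a slope and intercept in a few lines. In this base-R sketch the Sen slope is computed as the median of all pairwise slopes, and the intercept as median(value) - sen_slope * median(year); that intercept formula is an assumption about EnvStats behavior, and the helper name trend_sketch is hypothetical. The Mann-Kendall test itself is not reproduced here.

```r
# Hypothetical helper: Theil-Sen slope plus the derived trend quantities.
trend_sketch <- function(year, value) {
  pairs  <- utils::combn(length(year), 2)  # all index pairs (i < j)
  slopes <- (value[pairs[2, ]] - value[pairs[1, ]]) /
    (year[pairs[2, ]] - year[pairs[1, ]])
  sen_slope <- stats::median(slopes)
  # Assumed intercept definition (not confirmed from EnvStats source)
  intercept <- stats::median(value) - sen_slope * stats::median(year)
  val_beg <- sen_slope * min(year) + intercept
  val_end <- sen_slope * max(year) + intercept
  data.frame(
    sen_slope       = sen_slope,
    intercept       = intercept,
    trend_mag       = sen_slope * (max(year) - min(year)),
    val_beg         = val_beg,
    val_end         = val_end,
    val_perc_change = (val_end - val_beg) / val_beg * 100
  )
}

year  <- 2001:2010
value <- c(12, 14, 13, 17, 18, 18, 21, 22, 25, 24)  # made-up annual statistic
trend <- trend_sketch(year, value)
print(trend)
```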
Calculate logistic regression in annual statistics with zero values
Description
Calculate logistic regression (Everitt and Hothorn, 2009) in annual statistics with zero values. A model fit to compute the probability of a zero flow annual statistic.
Usage
calc_logistic_regression(data = NULL, year, value, ...)
Arguments
data |
'data.frame'. Optional data.frame input, with columns containing |
year |
'numeric' vector when |
value |
'numeric' vector when |
... |
further arguments to be passed to or from |
Details
This function is a wrapper for stats::glm(y ~ year, family = stats::binomial(link="logit"))
with y = 1 when value = 0 (for example a zero flow annual statistic) and y = 0 otherwise.
The returned values include
p_value
Probability value of the explanatory (year) variable in the logistic model
stdErr_slope
Standard error of the regression slope (log odds per year)
odds_ratio
Exponential of the explanatory coefficient (year coefficient)
prob_beg/end
Logistic regression predicted (fitted) values at the beginning and ending year.
prob_change
Change in probability from beginning to end.
For example, an odds ratio of 1.05 means that the odds of a zero-flow year (versus a non-zero year) increase by a factor of 1.05 (5 percent) per year.
Value
A tibble (see tibble::tibble) with logistic regression p-value, standard error of
slope, odds ratio, beginning and ending probability, and probability change. See Details.
References
Everitt, B.S., and Hothorn, T., 2009, A Handbook of Statistical Analyses Using R, 2nd Ed.: Boca Raton, Florida, Chapman and Hall/CRC, 376 p.
See Also
Examples
calc_logistic_regression(data = example_annual, year = "WY", value = "annual_mean")
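The quantities listed in Details can be pulled from a stats::glm fit as sketched below. The annual values are made up so that zeros become more frequent over time, and the variable names simply mirror the returned column names; this is an illustration of the described model, not the package source.

```r
year <- 1991:2020
# Made-up annual statistic with more zero values later in the record
value <- c(5, 4, 6, 3, 0, 5, 4, 2, 0, 3,
           2, 0, 4, 0, 1, 0, 3, 0, 0, 2,
           0, 1, 0, 0, 2, 0, 0, 0, 1, 0)

y   <- as.integer(value == 0)  # 1 for a zero-flow statistic, 0 otherwise
fit <- stats::glm(y ~ year, family = stats::binomial(link = "logit"))

coefs        <- summary(fit)$coefficients
p_value      <- coefs["year", "Pr(>|z|)"]
stdErr_slope <- coefs["year", "Std. Error"]
odds_ratio   <- exp(coefs["year", "Estimate"])

# Fitted probabilities of a zero value in the first and last year
prob <- stats::predict(
  fit, newdata = data.frame(year = range(year)), type = "response"
)
prob_beg    <- prob[[1]]
prob_end    <- prob[[2]]
prob_change <- prob_end - prob_beg

print(c(p_value = p_value, odds_ratio = odds_ratio, prob_change = prob_change))
```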
Quantile of Pearson Type III distribution for log-transformed data
Description
Quantile of Pearson Type III distribution for log-transformed data
Usage
calc_qlpearsonIII(p, meanlog = 0, sdlog = 1, skew = 0)
Arguments
p |
Vector of non-exceedance probabilities, between 0 and 1, to calculate quantiles. |
meanlog |
Vector of mean of the distribution of the log-transformed data. |
sdlog |
Vector of standard deviation of the distribution of the log-transformed data. |
skew |
Vector of skewness of the distribution of the log-transformed data. |
Details
calc_qpearsonIII and calc_qlpearsonIII are functions to fit a log-Pearson type III
distribution from a given mean, standard deviation, and skew. This source code is replicated,
unchanged, from the smwrBase package in order to reduce the dependency on that package.
Value
Quantiles for the described distribution
References
Asquith, W.H., Kiang, J.E., and Cohn, T.A., 2017, Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities: U.S. Geological Survey Scientific Investigation Report 2017–5038, 93 p. [Also available at https://doi.org/10.3133/sir20175038.]
Lorenz, D.L., 2015, smwrBase—An R package for managing hydrologic data, version 1.1.1: U.S.
Geological Survey Open-File Report 2015–1202, 7 p.
[Also available at https://doi.org/10.3133/ofr20151202.]
See Also
Examples
calc_qlpearsonIII(0.1)
Quantile of Pearson Type III distribution
Description
Quantile of Pearson Type III distribution
Usage
calc_qpearsonIII(p, mean = 0, sd = 1, skew = 0)
Arguments
p |
Vector of non-exceedance probabilities, between 0 and 1, to calculate quantiles. |
mean |
Vector of means of the distribution of the data. |
sd |
Vector of standard deviation of the distribution of the data. |
skew |
Vector of skewness of the distribution of the data. |
Details
calc_qpearsonIII and calc_qlpearsonIII are functions to fit a log-Pearson type III
distribution from a given mean, standard deviation, and skew. This source code is replicated,
unchanged, from the smwrBase package in order to reduce the dependency on that package.
Value
Quantiles for the described distribution
References
Asquith, W.H., Kiang, J.E., and Cohn, T.A., 2017, Application of at-site peak-streamflow frequency analyses for very low annual exceedance probabilities: U.S. Geological Survey Scientific Investigation Report 2017–5038, 93 p. [Also available at https://doi.org/10.3133/sir20175038.]
Lorenz, D.L., 2015, smwrBase—An R package for managing hydrologic data, version 1.1.1: U.S.
Geological Survey Open-File Report 2015–1202, 7 p.
[Also available at https://doi.org/10.3133/ofr20151202.]
Examples
calc_qpearsonIII(0.1)
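For intuition, a Pearson type III quantile can be obtained from a shifted (and, for negative skew, reflected) gamma distribution. The sketch below is an independent illustration of that textbook relationship for a scalar skew; it is not the replicated smwrBase source, and the helper names qp3_sketch and qlp3_sketch, as well as the use of natural logarithms in the log variant, are assumptions.

```r
# Hypothetical helper: Pearson type III quantile for scalar mean, sd, skew.
qp3_sketch <- function(p, mean = 0, sd = 1, skew = 0) {
  if (abs(skew) < 1e-6) {
    return(stats::qnorm(p, mean = mean, sd = sd))  # zero skew: normal limit
  }
  shape <- 4 / skew^2            # gamma shape fixes the skewness
  scale <- sd * abs(skew) / 2    # gamma scale fixes the standard deviation
  loc   <- mean - 2 * sd / skew  # shift fixes the mean
  if (skew > 0) {
    loc + stats::qgamma(p, shape = shape, scale = scale)
  } else {
    loc - stats::qgamma(1 - p, shape = shape, scale = scale)  # reflected gamma
  }
}

# Assumed log variant: quantile in log space, back-transformed with exp()
qlp3_sketch <- function(p, meanlog = 0, sdlog = 1, skew = 0) {
  exp(qp3_sketch(p, mean = meanlog, sd = sdlog, skew = skew))
}

qp3_sketch(0.1)               # equals qnorm(0.1) when skew is 0
qp3_sketch(0.1, skew = 0.5)
qlp3_sketch(0.1, skew = 0.5)
```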
Censor values above or below a threshold
Description
Replaces values in a vector with NA when above or below a censor level.
Values that satisfy the comparison value censor_symbol censor_threshold are censored;
for example, with the defaults (values lte 0 set to NA), all values <= 0 are replaced with NA.
Usage
censor_values(
value,
censor_threshold = 0,
censor_symbol = c("lte", "lt", "gt", "gte")
)
Arguments
value |
'numeric' vector. Values to censor. |
censor_threshold |
'numeric' value. Threshold to censor values on. Default is 0. |
censor_symbol |
'character' string. |
Value
'numeric' vector with censored values replaced with NA
Examples
censor_values(value = seq.int(1, 10, 1), censor_threshold = 5)
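The censoring rule can be illustrated with a small base-R sketch. The function censor_sketch is a hypothetical re-implementation written for this illustration, not the package source. It assumes that "lte", "lt", "gt", and "gte" mean <=, <, >, and >=, and that the first option is the default.

```r
# Hypothetical illustration of threshold censoring (not the package code).
censor_sketch <- function(value, censor_threshold = 0,
                          censor_symbol = c("lte", "lt", "gt", "gte")) {
  censor_symbol <- match.arg(censor_symbol)  # defaults to "lte"
  censored <- switch(censor_symbol,
    lte = value <= censor_threshold,
    lt  = value <  censor_threshold,
    gt  = value >  censor_threshold,
    gte = value >= censor_threshold
  )
  # which() drops NA comparisons, so existing NA values are left as they are.
  value[which(censored)] <- NA
  value
}

censor_sketch(c(-1, 0, 2.5, 7))                                  # -1 and 0 become NA
censor_sketch(1:10, censor_threshold = 5, censor_symbol = "gt")  # 6 to 10 become NA
```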
Example Annual Observations
Description
An example dataset with daily observed streamflow processed to annual water year values.
Usage
example_annual
Format
A data.frame with the following variables:
WY
water year
annual_mean
annual mean
annual_sd
annual standard deviation
annual_sum
annual sum
high_q1
annual maximum of daily mean
high_q3
annual maximum of 3-day mean
high_q7
annual maximum of 7-day mean
high_q30
annual maximum of 30-day mean
high_q1_jd
Julian day of annual maximum of daily mean
high_q3_jd
Julian day of annual maximum of 3-day mean
high_q7_jd
Julian day of annual maximum of 7-day mean
high_q30_jd
Julian day of annual maximum of 30-day mean
low_q7
annual minimum of 7-day mean
low_q30
annual minimum of 30-day mean
low_q3
annual minimum of 3-day mean
low_q1
annual minimum of daily mean
low_q7_jd
Julian day of annual minimum of 7-day mean
low_q30_jd
Julian day of annual minimum of 30-day mean
low_q3_jd
Julian day of annual minimum of 3-day mean
low_q1_jd
Julian day of annual minimum of daily mean
annual_1_percentile
annual first percentile
annual_5_percentile
annual 5th percentile
annual_10_percentile
annual 10th percentile
annual_25_percentile
annual 25th percentile
annual_50_percentile
annual 50th percentile
annual_75_percentile
annual 75th percentile
annual_90_percentile
annual 90th percentile
annual_95_percentile
annual 95th percentile
annual_99_percentile
annual 99th percentile
Jan_mean
annual January mean
Jan_sd
annual January standard deviation
Jan_max
annual January maximum
Jan_min
annual January minimum
Jan_percent_annual
annual January percentage of annual sum
Feb_mean
annual February mean
Feb_sd
annual February standard deviation
Feb_max
annual February maximum
Feb_min
annual February minimum
Feb_percent_annual
annual February percentage of annual sum
Mar_mean
annual March mean
Mar_sd
annual March standard deviation
Mar_max
annual March maximum
Mar_min
annual March minimum
Mar_percent_annual
annual March percentage of annual sum
Apr_mean
annual April mean
Apr_sd
annual April standard deviation
Apr_max
annual April maximum
Apr_min
annual April minimum
Apr_percent_annual
annual April percentage of annual sum
May_mean
annual May mean
May_sd
annual May standard deviation
May_max
annual May maximum
May_min
annual May minimum
May_percent_annual
annual May percentage of annual sum
Jun_mean
annual June mean
Jun_sd
annual June standard deviation
Jun_max
annual June maximum
Jun_min
annual June minimum
Jun_percent_annual
annual June percentage of annual sum
Jul_mean
annual July mean
Jul_sd
annual July standard deviation
Jul_max
annual July maximum
Jul_min
annual July minimum
Jul_percent_annual
annual July percentage of annual sum
Aug_mean
annual August mean
Aug_sd
annual August standard deviation
Aug_max
annual August maximum
Aug_min
annual August minimum
Aug_percent_annual
annual August percentage of annual sum
Sep_mean
annual September mean
Sep_sd
annual September standard deviation
Sep_max
annual September maximum
Sep_min
annual September minimum
Sep_percent_annual
annual September percentage of annual sum
Oct_mean
annual October mean
Oct_sd
annual October standard deviation
Oct_max
annual October maximum
Oct_min
annual October minimum
Oct_percent_annual
annual October percentage of annual sum
Nov_mean
annual November mean
Nov_sd
annual November standard deviation
Nov_max
annual November maximum
Nov_min
annual November minimum
Nov_percent_annual
annual November percentage of annual sum
Dec_mean
annual December mean
Dec_sd
annual December standard deviation
Dec_max
annual December maximum
Dec_min
annual December minimum
Dec_percent_annual
annual December percentage of annual sum
WSV
winter-spring volume
wscvd
Julian date of winter-spring center volume
Details
Generated with example_obs
from
HyMETT::preproc_main(data = example_obs, Date = "Date", value = "streamflow_cfs", longitude = -68)$annual
See Also
Examples
str(example_annual)
Example Model Output
Description
An example dataset with daily modeled (simulated) streamflow.
Usage
example_mod
Format
A data.frame with the following variables:
date
date as 'character' column class.
streamflow_cfs
modeled streamflow in units of feet^3/second.
Date
date as 'Date' column class.
Details
Generated from example data available at system.file("extdata", "01013500_MOD.csv", package = "HyMETT")
References
Johnson, M., D. Blodgett, 2020, NOAA National Water Model Reanalysis Data at RENCI, HydroShare,
accessed September 17, 2020 at
https://doi.org/10.4211/hs.89b0952512dd4b378dc5be8d2093310f
Johnson, M., 2021, nwmHistoric: National Water Model Historic Data. R package version 0.0.0.9000, accessed September 17, 2020 at https://github.com/mikejohnson51/nwmHistoric
Examples
str(example_mod)
Example Model Output with zero flows
Description
An example dataset with daily modeled (simulated) streamflow that includes zero flows.
Usage
example_mod_zf
Format
A data.frame with the following variables:
date
date as 'character' column class.
streamflow_cfs
modeled streamflow in units of feet^3/second.
Date
date as 'Date' column class.
Details
Generated from example data available at system.file("extdata", "08202700_MOD.csv", package = "HyMETT")
References
Johnson, M., D. Blodgett, 2020, NOAA National Water Model Reanalysis Data at RENCI, HydroShare,
accessed September 17, 2020 at
https://doi.org/10.4211/hs.89b0952512dd4b378dc5be8d2093310f
Johnson, M., 2021, nwmHistoric: National Water Model Historic Data. R package version 0.0.0.9000, accessed September 17, 2020 at https://github.com/mikejohnson51/nwmHistoric
Examples
str(example_mod_zf)
Example Observations
Description
An example dataset with daily observed streamflow.
Usage
example_obs
Format
A data.frame with the following variables:
date
date as 'character' column class.
streamflow_cfs
observed streamflow in units of feet^3/second.
quality_cd
qualifier for value in
streamflow_cfs
(U.S. Geological Survey, 2020b)
Date
date as 'Date' column class.
Details
Generated from example data available at system.file("extdata", "01013500_OBS.csv", package = "HyMETT")
References
De Cicco, L.A., Hirsch, R.M., Lorenz, D., and Watkins, W.D., 2021, dataRetrieval: R packages for discovering and retrieving water data available from Federal hydrologic web services, accessed September 16, 2020 at https://doi.org/10.5066/P9X4L3GE.
U.S. Geological Survey, 2020a, USGS water data for the Nation: U.S. Geological Survey National
Water Information System database, accessed September 16, 2020, at
https://doi.org/10.5066/F7P55KJN.
U.S. Geological Survey, 2020b, Instantaneous and Daily Data-Value Qualification Codes, in USGS water data for the Nation: U.S. Geological Survey National Water Information System database, accessed September 16, 2020, at https://doi.org/10.5066/F7P55KJN. [information directly accessible at https://help.waterdata.usgs.gov/codes-and-parameters/instantaneous-value-qualification-code-uv_rmk_cd.]
Examples
str(example_obs)
Example Observations with zero flows
Description
An example dataset with daily observed streamflow that includes zero flows.
Usage
example_obs_zf
Format
A data.frame with the following variables:
date
date as 'character' column class.
streamflow_cfs
observed streamflow in units of feet^3/second.
quality_cd
qualifier for value in
streamflow_cfs
(U.S. Geological Survey, 2020b)
Date
date as 'Date' column class.
Details
Generated from example data available at system.file("extdata", "08202700_OBS.csv", package = "HyMETT")
References
De Cicco, L.A., Hirsch, R.M., Lorenz, D., and Watkins, W.D., 2021, dataRetrieval: R packages for discovering and retrieving water data available from Federal hydrologic web services, accessed September 16, 2020 at https://doi.org/10.5066/P9X4L3GE.
U.S. Geological Survey, 2020a, USGS water data for the Nation: U.S. Geological Survey National
Water Information System database, accessed September 16, 2020, at
https://doi.org/10.5066/F7P55KJN.
U.S. Geological Survey, 2020b, Instantaneous and Daily Data-Value Qualification Codes, in USGS water data for the Nation: U.S. Geological Survey National Water Information System database, accessed September 16, 2020, at https://doi.org/10.5066/F7P55KJN. [information directly accessible at https://help.waterdata.usgs.gov/codes-and-parameters/instantaneous-value-qualification-code-uv_rmk_cd.]
Examples
str(example_obs_zf)
Example Observations preprocessed
Description
An example dataset with daily observed streamflow preprocessed to include additional timing and n-day moving averages.
Usage
example_preproc
Format
A data.frame with the following variables:
Date
value
year
month
day
decimal_date
WY
Water Year: October 1 - September 30
CY
Climate Year: April 1 - March 31
Q3
3-Day Moving Average: computed at end of moving interval
Q7
7-Day Moving Average: computed at end of moving interval
Q30
30-Day Moving Average: computed at end of moving interval
jd
Julian date
Details
Generated with example_obs
from
HyMETT::preproc_main(data = example_obs, Date = "Date", value = "streamflow_cfs", longitude = -68)$daily
See Also
Examples
str(example_preproc)
Audit daily data for total days in year
Description
Audit daily data for total days in year. An audit is performed to inventory and flag missing days in daily data and help determine if further analyses are appropriate.
Usage
preproc_audit_data(
data = NULL,
Date,
value,
year_group,
use_specific_years = FALSE,
begin_year = NULL,
end_year = NULL,
days_cutoff = 360,
date_format = "%Y-%m-%d"
)
Arguments
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'Date' or 'character' vector when |
value |
'numeric' vector when |
year_group |
'numeric' vector when |
use_specific_years |
'boolean' value. Flag to clip data to a certain set of years in
|
begin_year |
'numeric' value. If |
end_year |
'numeric' value. If |
days_cutoff |
'numeric' value. Designating the number of days required for a year to be
counted as full. Default is |
date_format |
'character' string. Format of Date. Default is |
Details
Year grouping is commonly water year, climate year, or calendar year.
Value
A data.frame with year_group
, count (n, excluding NA
values)
of days in each year_group
, and a complete years 'boolean' flag.
See Also
preproc_fill_daily
, preproc_precondition_data
Examples
preproc_audit_data(
data = example_preproc, Date = "Date", value = "value", year_group = "WY"
)
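The idea behind the audit can be sketched in base R: count the non-NA daily values in each year group and compare the count with days_cutoff. The function audit_sketch and its output column names are hypothetical. The use of >= for the completeness flag is an assumption based on the description of days_cutoff.

```r
# Hypothetical illustration of a completeness audit (not the package code).
audit_sketch <- function(year_group, value, days_cutoff = 360) {
  # Number of non-NA daily values in each year group.
  n <- tapply(!is.na(value), year_group, sum)
  data.frame(
    year_group = as.numeric(names(n)),
    n = as.vector(n),
    complete = as.vector(n) >= days_cutoff  # ASSUMPTION: cutoff is inclusive
  )
}

# Toy record: all of 2019 (one value missing) and the first 100 days of 2020.
dates <- seq.Date(as.Date("2019-01-01"), as.Date("2020-04-09"), by = "1 day")
years <- as.numeric(format(dates, "%Y"))
flows <- rep(10, length(dates))
flows[5] <- NA

audit <- audit_sketch(years, flows)
audit
```

In this toy record, 2019 still passes the default cutoff of 360 days while the partial 2020 does not.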
Fills daily data with missing dates as NA values
Description
Fills daily data with missing dates as NA
values. Days that are
absent from the daily time series are inserted with a corresponding value of NA
.
Usage
preproc_fill_daily(
data = NULL,
Date,
value,
POR_start = NA,
POR_end = NA,
date_format = "%Y-%m-%d"
)
Arguments
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'Date' or 'character' vector when |
value |
'numeric' vector when |
POR_start |
'character' value. Optional period of record start. If not specified, defaults
to |
POR_end |
'character' value. Optional period of record end. If not specified, defaults to
|
date_format |
'character' string. Format of Date. Default is |
Details
Can be used prior to preproc_precondition_data
to fill daily data before computation
of n-day moving averages, or prior to preproc_audit_data
.
Value
A data.frame with Date
and value
, sequenced from POR_start
to POR_end
by 1 day.
See Also
preproc_audit_data
, preproc_precondition_data
Examples
Dates = c(seq.Date(as.Date("2020-01-01"), as.Date("2020-01-10"), by = "1 day"),
seq.Date(as.Date("2020-01-20"), as.Date("2020-01-31"), by = "1 day"))
values = c(seq.int(1, 22, 1))
preproc_fill_daily(Date = Dates, value = values)
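The filling step amounts to building the complete daily sequence and looking up each day in the original record. The base-R sketch below illustrates this with the same toy dates as the example. The function fill_daily_sketch is hypothetical, and it assumes the period of record runs from the first to the last date supplied.

```r
# Hypothetical illustration of gap filling (not the package code).
fill_daily_sketch <- function(Date, value) {
  # Complete daily sequence spanning the record.
  all_days <- seq.Date(min(Date), max(Date), by = "1 day")
  # match() returns NA for days absent from the record, so their value is NA.
  data.frame(Date = all_days, value = value[match(all_days, Date)])
}

Dates <- c(seq.Date(as.Date("2020-01-01"), as.Date("2020-01-10"), by = "1 day"),
           seq.Date(as.Date("2020-01-20"), as.Date("2020-01-31"), by = "1 day"))
filled <- fill_daily_sketch(Dates, seq_len(22))

nrow(filled)              # 31 days in the filled January record
sum(is.na(filled$value))  # 9 inserted days (January 11 to 19)
```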
A wrapper function for preproc_precondition_data, preproc_audit_data, and calc_annual_flow_stats
Description
A wrapper function for preproc_precondition_data
,
preproc_audit_data
, and
calc_annual_flow_stats
Usage
preproc_main(
data = NULL,
Date,
value,
date_format = "%Y-%m-%d",
year_group = c("WY", "CY", "year"),
use_specific_years = FALSE,
begin_year = NULL,
end_year = NULL,
days_cutoff = 360,
calc_high = TRUE,
calc_low = TRUE,
calc_percentiles = TRUE,
calc_monthly = TRUE,
calc_WSCVD = TRUE,
longitude = NA,
calc_ICVD = FALSE,
zero_threshold = 33,
quantile_type = 8,
na.action = c("na.omit", "na.pass")
)
Arguments
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'Date' or 'character' vector when |
value |
'numeric' vector when |
date_format |
'character' string. Format of Date. Default is |
year_group |
'character' value. Specify either |
use_specific_years |
'boolean' value. Flag to clip data to a certain set of years in
|
begin_year |
'numeric' value. If |
end_year |
'numeric' value. If |
days_cutoff |
'numeric' value. Designating the number of days required for a year to be
counted as full. Default is |
calc_high |
'boolean' value. Calculate high streamflow statistics for years in |
calc_low |
'boolean' value. Calculate low streamflow statistics for years in |
calc_percentiles |
'boolean' value. Calculate percentiles for years in |
calc_monthly |
'boolean' value. Calculate monthly statistics for years in |
calc_WSCVD |
'boolean' value. Calculate winter-spring center volume date for years in
|
longitude |
'numeric' value. Site longitude in NAD83, required in WSCVD calculation.
Default is |
calc_ICVD |
'boolean' value. Calculate inverse center volume date for years in
|
zero_threshold |
'numeric' value as percentage. The percentage of years of a statistic that
need to be zero in order for it to be deemed a zero streamflow site for that statistic. For
use in trend calculation. See Details on attributes. Default is |
quantile_type |
'numeric' value. The distribution type used in the |
na.action |
'character' string indicating na.action passed to |
Details
This is a wrapper function of preproc_precondition_data
, preproc_audit_data
, and
calc_annual_flow_stats
. Data are first passed to the precondition function, then audited,
then annual statistics are computed.
It also checks that the data have a daily time step.
Other time steps are currently not supported; for such data, the data.frame is returned without moving
averages computed.
Value
A list of three data.frames: the preconditioned data, the data audit, and the annual statistics.
See Also
preproc_audit_data
, preproc_precondition_data
,
calc_annual_flow_stats
Examples
preproc_main(data = example_obs, Date = "Date", value = "streamflow_cfs", longitude = -68)
Pre-conditions data with time information and n-day moving averages
Description
Pre-conditions data with time information and n-day moving averages, with options
to fill missing days with NA
values.
Usage
preproc_precondition_data(
data = NULL,
Date,
value,
date_format = "%Y-%m-%d",
fill_daily = TRUE
)
Arguments
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'Date' or 'character' vector when |
value |
'numeric' vector when |
date_format |
'character' string. Format of |
fill_daily |
'logical' value. Should gaps in |
Details
These columns are added to the data:
year
month
day
decimal_date
WY
Water Year: October 1 to September 30
CY
Climate Year: April 1 to March 31
Q3
3-Day Moving Average: computed at end of moving interval
Q7
7-Day Moving Average: computed at end of moving interval
Q30
30-Day Moving Average: computed at end of moving interval
jd
Julian date
This function also checks that the data have a daily time step. Gaps in daily
values should be filled with NA
so that n-day moving
averages are calculated correctly. To fill gaps, use fill_daily = TRUE
or preproc_fill_daily
. Other time steps are currently not
supported; for such data, the data.frame is returned without moving averages computed.
Value
A data.frame with Date, value, and additional columns with time and n-day moving average information.
See Also
Examples
preproc_precondition_data(data = example_obs, Date = "Date", value = "streamflow_cfs")
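A few of the added columns can be sketched in base R to show what "computed at end of moving interval" means in practice. The function precondition_sketch is hypothetical and is not the package code. It assumes that water years and climate years are labeled by the calendar year in which they end, and it uses stats::filter with sides = 1 as one possible way to get a trailing average.

```r
# Hypothetical illustration of year grouping and trailing n-day means.
precondition_sketch <- function(Date, value) {
  yr <- as.integer(format(Date, "%Y"))
  mo <- as.integer(format(Date, "%m"))
  # Trailing mean: the value on day i averages days i - n + 1 through i,
  # so the first n - 1 rows are NA, as is any window containing an NA.
  roll_mean <- function(x, n) {
    as.numeric(stats::filter(x, rep(1 / n, n), sides = 1))
  }
  data.frame(
    Date = Date,
    value = value,
    WY = ifelse(mo >= 10, yr + 1L, yr),  # ASSUMPTION: labeled by ending year
    CY = ifelse(mo >= 4, yr + 1L, yr),   # ASSUMPTION: labeled by ending year
    Q3 = roll_mean(value, 3),
    Q7 = roll_mean(value, 7),
    Q30 = roll_mean(value, 30)
  )
}

dates <- seq.Date(as.Date("2020-09-25"), as.Date("2021-04-05"), by = "1 day")
pre <- precondition_sketch(dates, seq_along(dates))
head(pre, 8)
```

Because a window containing NA yields NA, filling missing days first keeps a moving average from silently spanning a gap in the record.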
Validates that daily data do not contain gaps
Description
Validates that daily data do not contain gaps
Usage
preproc_validate_daily(
data = NULL,
Date = "Date",
value = "value",
date_format = "%Y-%m-%d"
)
Arguments
data |
'data.frame'. Optional data.frame input, with columns containing |
Date |
'Date' or 'character' vector when |
value |
'numeric' vector when |
date_format |
'character' string. Format of |
Details
Used to validate there are no gaps in the daily record before computing n-day moving averages in
preproc_precondition_data
or lag-1 autocorrelation in POR_calc_AR1
. If gaps are present,
preproc_fill_daily
can be used to fill them with NA
values.
Value
An error message listing the missing dates if gaps are present; otherwise nothing.
Examples
preproc_validate_daily(data = example_obs, Date = "Date", value = "streamflow_cfs")
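The gap check can be sketched in base R by comparing the supplied dates with the complete daily sequence over the same span. The function validate_daily_sketch and the wording of its error message are hypothetical and are not taken from the package.

```r
# Hypothetical illustration of a daily gap check (not the package code).
validate_daily_sketch <- function(Date) {
  Date <- sort(as.Date(Date))
  expected <- seq.Date(min(Date), max(Date), by = "1 day")
  missing_days <- expected[!expected %in% Date]
  if (length(missing_days) > 0) {
    stop("Missing dates: ", paste(format(missing_days), collapse = ", "))
  }
  invisible(NULL)  # nothing to report when the record is continuous
}

ok <- seq.Date(as.Date("2020-01-01"), as.Date("2020-01-05"), by = "1 day")
validate_daily_sketch(ok)  # continuous record: returns silently

# Drop one day and capture the resulting error message.
msg <- tryCatch(validate_daily_sketch(ok[-3]),
                error = function(e) conditionMessage(e))
msg
```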