This vignette summarizes DDESONN results across 1000 randomized seeds (two separate 500-seed runs) and compares them against a Keras benchmark summary stored in an Excel workbook bundled with the package.
The purpose of this benchmark is not to showcase a single favorable run. Instead, it evaluates distributional behavior across many random initializations, with emphasis on stability across seeds.
In this context, stronger stability across seeds is important because it indicates that the training procedure is less sensitive to random initialization and therefore more dependable at scale.
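The quantity emphasized here can be made concrete with a minimal sketch: collect one accuracy per seed and measure its dispersion. The vector `acc_by_seed` below is illustrative, not data shipped with the package.

```r
# Illustrative per-seed accuracies (hypothetical values, not package data)
acc_by_seed <- c(0.992, 0.999, 1.000, 0.998, 0.993)

seed_sd <- sd(acc_by_seed)               # spread across initializations
seed_cv <- seed_sd / mean(acc_by_seed)   # scale-free coefficient of variation
```

A smaller `seed_sd` (or `seed_cv`) across a large seed sweep is exactly the stability property this benchmark measures.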
The four RDS artifacts included with the package are stored under:

```
inst/extdata/heart_failure_runs/
├─ run1/
│  ├─ SingleRun_Train_Acc_Val_Metrics_500_seeds_20251025.rds
│  └─ SingleRun_Test_Metrics_500_seeds_20251025.rds
└─ run2/
   ├─ SingleRun_Train_Acc_Val_Metrics_500_seeds_20251026.rds
   └─ SingleRun_Test_Metrics_500_seeds_20251026.rds
```
Each folder represents one 500-seed run performed locally; together they form the 1000-seed composite.
This benchmark addresses a focused research question:
Can a fully R-native, from-first-principles neural network implementation achieve competitive statistical stability relative to an established deep-learning framework under repeated randomized initialization?
The Keras comparison is included as a reference benchmark, not as an implementation template. DDESONN was built independently from scratch and was not derived from Keras source code.
```r
suppressPackageStartupMessages({
  library(dplyr)
  library(tibble)
  library(knitr)
})

if (!requireNamespace("DDESONN", quietly = TRUE)) {
  message("DDESONN not installed in this build session; skipping evaluation.")
  knitr::opts_chunk$set(eval = FALSE)
}

# Render a table via the package helper when available, else fall back to kable()
.render_tbl <- function(x, title = NULL, digits = 4) {
  if (requireNamespace("DDESONN", quietly = TRUE) &&
      exists("ddesonn_viewTables", envir = asNamespace("DDESONN"), inherits = FALSE)) {
    get("ddesonn_viewTables", envir = asNamespace("DDESONN"))(x, title = title)
  } else {
    if (!is.null(title)) cat("\n\n###", title, "\n\n")
    knitr::kable(x, digits = digits, format = "html")
  }
}

heart_failure_root <- system.file("extdata", "heart_failure_runs", package = "DDESONN")
if (!nzchar(heart_failure_root)) {
  # Fallback when building from source before installation
  heart_failure_root <- file.path("..", "inst", "extdata", "heart_failure_runs")
}
stopifnot(dir.exists(heart_failure_root))

train_run1_path <- file.path(
  heart_failure_root, "run1",
  "SingleRun_Train_Acc_Val_Metrics_500_seeds_20251025.rds"
)
test_run1_path <- file.path(
  heart_failure_root, "run1",
  "SingleRun_Test_Metrics_500_seeds_20251025.rds"
)
train_run2_path <- file.path(
  heart_failure_root, "run2",
  "SingleRun_Train_Acc_Val_Metrics_500_seeds_20251026.rds"
)
test_run2_path <- file.path(
  heart_failure_root, "run2",
  "SingleRun_Test_Metrics_500_seeds_20251026.rds"
)

stopifnot(
  file.exists(train_run1_path),
  file.exists(test_run1_path),
  file.exists(train_run2_path),
  file.exists(test_run2_path)
)

train_run1 <- readRDS(train_run1_path)
test_run1  <- readRDS(test_run1_path)
train_run2 <- readRDS(train_run2_path)
test_run2  <- readRDS(test_run2_path)
```
```r
train_all <- dplyr::bind_rows(train_run1, train_run2)
test_all  <- dplyr::bind_rows(test_run1, test_run2)

# Keep the best-performing row per seed
train_seed <- train_all %>%
  group_by(seed) %>%
  slice_max(order_by = best_val_acc, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  transmute(
    seed,
    train_acc = best_train_acc,
    val_acc   = best_val_acc
  )

test_seed <- test_all %>%
  group_by(seed) %>%
  slice_max(order_by = accuracy, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  transmute(
    seed,
    test_acc = accuracy
  )

merged <- inner_join(train_seed, test_seed, by = "seed") %>%
  arrange(seed)

# describe()-style summary for one numeric column
summarize_column <- function(x) {
  pct <- function(p) stats::quantile(x, probs = p, names = FALSE, type = 7)
  data.frame(
    count = length(x),
    mean  = mean(x),
    std   = sd(x),
    min   = min(x),
    `25%` = pct(0.25),
    `50%` = pct(0.50),
    `75%` = pct(0.75),
    max   = max(x),
    check.names = FALSE
  )
}

summary_train <- summarize_column(merged$train_acc)
summary_val   <- summarize_column(merged$val_acc)
summary_test  <- summarize_column(merged$test_acc)

summary_all <- data.frame(
  stat = c("count", "mean", "std", "min", "25%", "50%", "75%", "max"),
  train_acc = unlist(summary_train[1, ]),
  val_acc   = unlist(summary_val[1, ]),
  test_acc  = unlist(summary_test[1, ]),
  check.names = FALSE
)

round4 <- function(x) if (is.numeric(x)) round(x, 4) else x
pretty_summary <- as.data.frame(lapply(summary_all, round4))
```
```r
.render_tbl(
  pretty_summary,
  title = "DDESONN — 1000-seed summary (train/val/test)"
)
```

| stat | train_acc | val_acc | test_acc |
|---|---|---|---|
| count | 1000.0000 | 1000.0000 | 1000.0000 |
| mean | 0.9928 | 0.9992 | 0.9992 |
| std | 0.0014 | 0.0013 | 0.0013 |
| min | 0.9854 | 0.9893 | 0.9920 |
| 25% | 0.9920 | 0.9987 | 0.9987 |
| 50% | 0.9929 | 1.0000 | 1.0000 |
| 75% | 0.9937 | 1.0000 | 1.0000 |
| max | 0.9963 | 1.0000 | 1.0000 |
Keras parity results are stored in an Excel workbook included with the package under `inst/scripts/vsKeras/1000SEEDSRESULTSvsKeras/1000seedsKeras.xlsx`. The file is accessed programmatically with `system.file()`, so the path remains CRAN-safe and cross-platform.
```r
if (!requireNamespace("readxl", quietly = TRUE)) {
  message("Skipping keras-summary chunk: 'readxl' not installed.")
} else {
  keras_path <- system.file(
    "scripts", "vsKeras", "1000SEEDSRESULTSvsKeras", "1000seedsKeras.xlsx",
    package = "DDESONN"
  )
  if (nzchar(keras_path) && file.exists(keras_path)) {
    keras_stats <- readxl::read_excel(keras_path, sheet = 2)
    .render_tbl(
      keras_stats,
      title = "Keras — 1000-seed summary (Sheet 2)"
    )
  } else {
    cat("Keras Excel not found in installed package.\n")
  }
}
```

| stat | seed | train_loss | train_acc | val_loss | val_acc | val_auc | val_auprc | test_loss | test_acc | test_auc | test_auprc |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 1000.0000 | 1000.0000000 | 1000.0000000 | 1000.0000000 | 1000.0000000 | 1000.0000000 | 1.00000e+03 | 1000.0000000 | 1000.0000000 | 1000.0000000 | 1000.0000000 |
| mean | 500.0999 | 0.1285539 | 0.9853164 | 0.0923288 | 0.9943097 | 0.9981954 | 9.97682e-01 | 0.0801902 | 0.9968086 | 0.9992427 | 0.9989122 |
| std | 288.9524 | 0.2685126 | 0.0031003 | 0.1312915 | 0.0048215 | 0.0046459 | 6.14290e-03 | 0.1931197 | 0.0035695 | 0.0031493 | 0.0052804 |
| min | 1.0000 | 0.0810060 | 0.9705710 | 0.0511130 | 0.9653330 | 0.9612140 | 9.03136e-01 | 0.0498250 | 0.9786670 | 0.9691980 | 0.8900960 |
| 25% | 250.0000 | 0.0951690 | 0.9837140 | 0.0653810 | 0.9920000 | 0.9989820 | 9.98331e-01 | 0.0591380 | 0.9946670 | 0.9999330 | 0.9998530 |
| 50% | 500.0000 | 0.1022350 | 0.9857140 | 0.0709260 | 0.9960000 | 0.9997410 | 9.99473e-01 | 0.0629980 | 0.9973330 | 1.0000000 | 1.0000000 |
| 75% | 750.0000 | 0.1121150 | 0.9874290 | 0.0812150 | 0.9986670 | 0.9999420 | 9.99873e-01 | 0.0697250 | 1.0000000 | 1.0000000 | 1.0000000 |
| max | 1000.0000 | 6.0387850 | 0.9925710 | 2.1582980 | 1.0000000 | 1.0000000 | 1.00000e+00 | 5.5763540 | 1.0000000 | 1.0000000 | 1.0000000 |
Across 1000 random neural network initializations, DDESONN demonstrated stronger stability than the Keras benchmark model on this heart-failure task.
```r
benchmark_results <- data.frame(
  Metric = c(
    "Mean Test Accuracy",
    "Standard Deviation",
    "Minimum Test Accuracy",
    "Maximum Test Accuracy"
  ),
  DDESONN = c("≈ 99.92%", "≈ 0.0013", "≈ 99.20%", "100%"),
  Keras   = c("≈ 99.69%", "≈ 0.0036", "≈ 97.82%", "100%"),
  check.names = FALSE
)

.render_tbl(
  benchmark_results,
  title = "Benchmark results across 1000 seeds"
)
```

| Metric | DDESONN | Keras |
|---|---|---|
| Mean Test Accuracy | ≈ 99.92% | ≈ 99.69% |
| Standard Deviation | ≈ 0.0013 | ≈ 0.0036 |
| Minimum Test Accuracy | ≈ 99.20% | ≈ 97.82% |
| Maximum Test Accuracy | 100% | 100% |
These results suggest that DDESONN achieved a higher mean test accuracy together with markedly lower dispersion across seeds. This is important because lower variance implies the model is less sensitive to randomized initialization and more dependable across repeated training runs.
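The variance gap can be read directly off the benchmark table. A minimal sketch, using the approximate standard deviations reported above:

```r
# Test-accuracy standard deviations, copied (approximately) from the
# benchmark table above
ddesonn_sd <- 0.0013
keras_sd   <- 0.0036

sd_ratio <- keras_sd / ddesonn_sd  # Keras spread is roughly 2.8x wider
```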
In large corporate environments, teams may train hundreds or thousands of models across changing datasets, validation windows, and deployment cycles. A lower-variance model reduces the need for repeated retraining simply to obtain a “good seed,” which lowers compute cost and improves operational predictability.
In trading, portfolio analytics, execution modeling, or risk forecasting, model instability can create inconsistent outputs across retrains. A model that is more stable across seeds can improve confidence in those downstream outputs.
This does not guarantee trading profitability, but it does support stronger engineering reliability and more reproducible model behavior.
In healthcare and other regulated domains, reproducibility matters because stakeholders need confidence that retraining the same workflow will not produce materially unstable outcomes. Lower dispersion across seeds can help support validation, governance, and auditability.
In mission-critical environments such as autonomous control or space-related analytics, reproducibility and reliability are essential. More stable training behavior can be valuable when models need to be trusted under constrained or high-stakes deployment settings.
These results aggregate two independent 500-seed runs performed locally.
A master seed was not set for those original runs. Since then:
- Companion scripts `TestDDESONN_1000seeds.R` and `TestKeras_1000seeds.py` have been added to the package.
- Keras raw and summary outputs are compiled in `inst/scripts/vsKeras/1000SEEDSRESULTSvsKeras/1000seedsKeras.xlsx`.
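Since the original runs did not fix a master seed, a future composite could be made fully reproducible with a master-seed pattern like the following sketch (`master_seed` is illustrative, not a value used in the original runs):

```r
# Derive all per-run seeds deterministically from one master seed
master_seed <- 12345                                  # illustrative value
set.seed(master_seed)
run_seeds <- sample.int(.Machine$integer.max, 1000)   # one seed per training run

# Each training run i would then call set.seed(run_seeds[i]) before fitting,
# so the entire 1000-seed sweep is reproducible from master_seed alone.
```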
The results shown here were computed locally.
For large-scale experiments involving hundreds or thousands of seeds, DDESONN can be executed in distributed environments to reduce wall-clock time significantly. Distributed orchestration and development-stage scaling scripts are maintained in the GitHub repository and are intentionally excluded from the CRAN package so this vignette remains focused on validated results and benchmark methodology.