Type: | Package |
Title: | Artificial Intelligence Systems and Observer Performance |
Version: | 2.1.2 |
Date: | 2022-11-08 |
Depends: | R (≥ 3.5.0) |
Imports: | bbmle, binom, dplyr, ggplot2, mvtnorm, numDeriv, openxlsx, readxl, Rcpp, stats, stringr, tools, utils |
Suggests: | testthat, knitr, kableExtra, rmarkdown |
LinkingTo: | Rcpp |
Description: | Analyzing the performance of artificial intelligence (AI) systems/algorithms characterized by a 'search-and-report' strategy. Historically observer performance has dealt with measuring radiologists' performances in search tasks, e.g., searching for lesions in medical images and reporting them, but the implicit location information has been ignored. The implemented methods apply to analyzing the absolute and relative performances of AI systems, comparing AI performance to a group of human readers or optimizing the reporting threshold of an AI system. In addition to performing historical receiver operating characteristic (ROC) analysis (localization information ignored), the software also performs free-response receiver operating characteristic (FROC) analysis, where lesion localization information is used. A book using the software has been published: Chakraborty DP: Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, Taylor-Francis LLC; 2017: https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840. Online updates to this book, which use the software, are at https://dpc10ster.github.io/RJafrocQuickStart/, https://dpc10ster.github.io/RJafrocRocBook/ and at https://dpc10ster.github.io/RJafrocFrocBook/. Supported data collection paradigms are the ROC, FROC and the location ROC (LROC). ROC data consists of a single rating per image, where a rating is the perceived confidence level that the image is that of a diseased patient. An ROC curve is a plot of true positive fraction vs. false positive fraction. FROC data consists of a variable number (zero or more) of mark-rating pairs per image, where a mark is the location of a reported suspicious region and the rating is the confidence level that it is a real lesion. 
LROC data consists of a rating and a location of the most suspicious region, for every image. Four models of observer performance, and curve-fitting software, are implemented: the binormal model (BM), the contaminated binormal model (CBM), the correlated contaminated binormal model (CORCBM), and the radiological search model (RSM). Unlike the binormal model, CBM, CORCBM and RSM predict 'proper' ROC curves that do not inappropriately cross the chance diagonal. Additionally, RSM parameters are related to search performance (not measured in conventional ROC analysis) and classification performance. Search performance refers to finding lesions, i.e., true positives, while simultaneously not finding false positive locations. Classification performance measures the ability to distinguish between true and false positive locations. Knowing these separate performances allows principled optimization of reader or AI system performance. This package supersedes Windows JAFROC (jackknife alternative FROC) software V4.2.1, https://github.com/dpc10ster/WindowsJafroc. Package functions are organized as follows. Data file related function names are preceded by 'Df', curve fitting functions by 'Fit', included data sets by 'dataset', plotting functions by 'Plot', significance testing functions by 'St', sample size related functions by 'Ss', data simulation functions by 'Simulate' and utility functions by 'Util'. Implemented are figures of merit (FOMs) for quantifying performance and functions for visualizing empirical or fitted operating characteristics: e.g., ROC, FROC, alternative FROC (AFROC) and weighted AFROC (wAFROC) curves. For fully crossed study designs significance testing of reader-averaged FOM differences between modalities is implemented via either Dorfman-Berbaum-Metz or the Obuchowski-Rockette methods. 
Also implemented is single treatment analysis, which allows comparison of performance of a group of radiologists to a specified value, or comparison of AI to a group of radiologists interpreting the same cases. Crossed-modality analysis is implemented wherein there are two crossed treatment factors and the aim is to determine performance in each treatment factor averaged over all levels of the second factor. Sample size estimation tools are provided for ROC and FROC studies; these use estimates of the relevant variances from a pilot study to predict required numbers of readers and cases in a pivotal study to achieve the desired power. Utility and data file manipulation functions allow data to be read in any of the currently used input formats, including Excel, and the results of the analysis can be viewed in text or Excel output files. The methods are illustrated with several included datasets from the author's collaborations. This update includes improvements to the code, some as a result of user-reported bugs and new feature requests, and others discovered during ongoing testing and code simplification. |
License: | GPL-3 |
LazyData: | true |
URL: | https://dpc10ster.github.io/RJafroc/ |
RoxygenNote: | 7.2.1 |
Encoding: | UTF-8 |
NeedsCompilation: | yes |
Packaged: | 2022-11-08 18:38:08 UTC; Dev |
Author: | Dev Chakraborty [cre, aut, cph], Peter Phillips [ctb], Xuetong Zhai [aut] |
Maintainer: | Dev Chakraborty <dpc10ster@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2022-11-08 19:10:02 UTC |
Artificial Intelligence Systems and Observer Performance
Description
RJafroc analyzes the performance of artificial intelligence (AI) systems/algorithms characterized
by a search-and-report strategy. Historically observer performance has dealt with measuring radiologists'
performances in search tasks, e.g., searching for lesions in medical images and reporting them, but the implicit
location information has been ignored. The methods here apply to any task involving searching for
and reporting arbitrary targets in images. The implemented methods apply
to analyzing the absolute and relative performances of AI systems, comparing AI performance to a group of human
readers or optimizing the reporting threshold of an AI system. In addition to performing historical receiver operating
characteristic (ROC) analysis (localization information ignored), the software also performs free-response receiver operating
characteristic (FROC) analysis, where the implicit lesion localization information is used. A book describing the
underlying methodology and which uses the software has been
published: Chakraborty DP: Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications
with R-Based Examples, Taylor-Francis LLC; 2017: https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840.
Online updates to this book, which use the software, are at
https://dpc10ster.github.io/RJafrocQuickStart/, https://dpc10ster.github.io/RJafrocRocBook/ and at
https://dpc10ster.github.io/RJafrocFrocBook/. Supported data collection paradigms are the ROC, FROC and the location ROC (LROC).
ROC data consists of a single rating per image, where a rating is the perceived confidence level that the image is that of a
diseased patient. An ROC curve is a plot of true positive fraction vs. false
positive fraction. FROC data consists of a variable number (zero or more) of mark-rating pairs per image, where a mark is the
location of a reported suspicious region and the rating is the confidence level that it is a real lesion. LROC data
consists of a rating and a location of the most suspicious region, for every image. Four models of observer performance,
and curve-fitting software, are implemented: the binormal model (BM), the contaminated binormal model (CBM), the
correlated contaminated binormal model (CORCBM), and the radiological search model (RSM). Unlike the binormal model, CBM,
CORCBM and RSM predict "proper" ROC curves that do not inappropriately cross the chance diagonal. Additionally, RSM
parameters are related to search performance (not measured in conventional ROC analysis) and classification performance.
Search performance refers to finding lesions, i.e., true positives, while simultaneously not finding false positive
locations. Classification performance measures the ability to distinguish between true and false positive locations. Knowing
these separate performances allows principled optimization of reader or AI system performance. This package supersedes Windows
JAFROC (jackknife alternative FROC) software V4.2.1, https://github.com/dpc10ster/WindowsJafroc. Package functions
are organized as follows. Data file related function names are preceded by Df, curve fitting functions by Fit, included
data sets by dataset, plotting functions by Plot, significance testing functions by St, sample size related functions
by Ss, data simulation functions by Simulate and utility functions by Util. Implemented are figures of merit (FOMs)
for quantifying performance and functions for visualizing empirical operating characteristics: e.g., ROC, FROC, alternative FROC
(AFROC) and weighted AFROC (wAFROC) curves. For fully crossed study designs significance testing of reader-averaged FOM
differences between modalities is implemented via both Dorfman-Berbaum-Metz and the Obuchowski-Rockette methods. Also
implemented are single treatment analyses, allowing comparison of performance of a group of radiologists to a specified
value, or comparison of AI to a group of radiologists/algorithms interpreting the same cases. Crossed-modality analysis is implemented
wherein there are two crossed treatment factors and the aim is to determine performance in each treatment factor averaged
over all levels of the second factor. Sample size estimation tools are provided for ROC and FROC studies; these use estimates
of the relevant variances from a pilot study to predict required numbers of readers and cases in a pivotal study to achieve the
desired power. Utility and data file manipulation functions allow data to be read in any of the currently used input
formats, including Excel, and the results of the analysis can be viewed in text or Excel output files. The methods are
illustrated with several included datasets from the author's collaborations. This update includes improvements to the code,
some as a result of user-reported bugs and new feature requests, and others discovered during ongoing testing and code simplification.
All changes are noted in NEWS.md.
Details
Package: | RJafroc |
Type: | Package |
Version: | 2.1.2 |
Date: | 2022-11-08 |
License: | GPL-3 |
URL: | https://dpc10ster.github.io/RJafroc/ |
Definitions and abbreviations
a: The separation or "a" parameter of the binormal model
AFROC curve: plot of LLF (ordinate) vs. FPF, where FPF is inferred using highest rating of NL marks on non-diseased cases
AFROC: alternative FROC, see Chakraborty 1989
AFROC1 curve: plot of LLF (ordinate) vs. FPF1, where FPF1 is inferred using highest rating of NL marks on ALL cases
alpha: The significance level \alpha of the test of the null hypothesis of no treatment effect
AUC: area under curve; e.g., ROC-AUC = area under ROC curve, an example of a FOM
b: The width or "b" parameter of the conventional binormal model
Binormal model: two unequal variance normal distributions, one at zero and one at mu, for modeling ROC ratings; sigma is the std. dev. ratio of diseased to non-diseased distributions
CAD: computer aided detection algorithm
CBM: contaminated binormal model (CBM): two equal variance normal distributions for modeling ROC ratings; the diseased distribution is bimodal, with a peak at zero and one at \mu; the integrated fraction at \mu is \alpha (not to be confused with \alpha of NH testing)
CI: The (1-\alpha) confidence interval for the stated statistic
Crossed modality: a dataset containing two modality (treatment) factors, with the levels of the two factors crossed, see paper by Thompson et al
DBM: Dorfman-Berbaum-Metz, a significance testing method for detecting a treatment effect in MRMC studies
DBMH: Hillis' modification of the DBM method
ddf: Denominator degrees of freedom of appropriate F-test; the corresponding ndf is I - 1
Empirical AUC: trapezoidal area under curve, same as the Wilcoxon statistic for ROC paradigm
FN: false negative, a diseased case classified as non-diseased
FOM: figure of merit, a quantitative measure of performance, performance metric
FP: false positive, a non-diseased case classified as diseased
FPF: number of FPs divided by number of non-diseased cases
FROC curve: plot of LLF (ordinate) vs. NLF
FROC: free-response ROC (a data collection paradigm where each image yields a random number, 0, 1, 2,..., of mark-rating pairs)
FRRC: Analysis that treats readers as fixed and cases as random factors
I: total number of modalities, indexed by i
image/case: used interchangeably; a case can consist of several images of the same patient in the same treatment
iMRMC: A text file format used for ROC data by FDA/CDRH researchers
individual: A single-treatment single-reader dataset.
Intrinsic: Used in connection with RSM; a parameter that is independent of the RSM \mu parameter, but whose meaning may not be as transparent as the corresponding physical parameter
J: number of readers, indexed by j
JAFROC file format: A .xlsx format file, applicable to ROC, ROI, FROC and LROC paradigms
JAFROC: jackknife AFROC: Windows software for analyzing observer performance data: no longer updated, replaced by current package; the name is a misnomer as the jackknife is used only for significance testing; alternatively, the bootstrap could be used; what distinguishes FROC from ROC analysis is the use of the AFROC-AUC as the FOM. With this change, the DBM or the OR method can be used for significance testing
K: total number of cases, K = K1 + K2, indexed by k
K1: total number of non-diseased cases, indexed by k1
K2: total number of diseased cases, indexed by k2
LL: lesion localization i.e., a mark that correctly locates an existing localized lesion; TP is a special case, when the proximity criterion is lax (i.e., "acceptance radius" is large)
LLF: number of LLs divided by the total number of lesions
LROC: location receiver operating characteristic, a data collection paradigm where each image yields a single rating and one location
lrc/MRMC: A text file format used for ROC data by University of Iowa researchers
mark: the location of a suspected diseased region
maxLL: maximum number of lesions per case in dataset
maxNL: maximum number of NL marks per case in dataset
MRMC: multiple reader multiple case (each reader interprets each case in each treatment, i.e. fully crossed study design)
ndf: Numerator degrees of freedom of appropriate F-test, usually number of treatments minus one
NH: The null hypothesis that all treatment effects are zero; rejected if the p-value is smaller than \alpha
NL: non-lesion localization, of which FP is a special case, i.e., a mark that does not correctly locate any existing localized lesion(s)
NLF: number of NLs divided by the total number of cases
Operating characteristic: A plot of normalized correct decisions on diseased cases along ordinate vs. normalized incorrect decisions on non-diseased cases
Operating point: A point on an operating characteristic, e.g., (FPF, TPF) represents an operating point on an ROC
OR: Obuchowski-Rockette, a significance testing method for detecting a treatment effect in MRMC studies
ORH: Hillis' modification of the OR method
Physical parameter: Used in connection with RSM; a parameter whose meaning is more transparent than the corresponding intrinsic parameter, but which depends on the RSM \mu parameter
Proximity criterion / acceptance radius: Used in connection with FROC (or LROC) data; the "nearness" criterion is used to determine if a mark is close enough to a lesion to be counted as a LL (or correct localization); otherwise it is counted as a NL (or incorrect localization)
p-value: the probability, under the null hypothesis, that the observed treatment effects, or larger, could occur by chance
Proper: a proper fit does not inappropriately fall below the chance diagonal, does not display a "hook" near the upper right corner
PROPROC: Metz's binormal model based fitting of proper ROC curves
RSM, Radiological Search Model: two unit variance normal distributions for modeling NL and LL ratings; four parameters, \mu, \nu', \lambda' and \zeta1
Rating: Confidence level assigned to a case; higher values indicate greater confidence in presence of disease; -Inf is allowed but NA is not allowed
Reader/observer/radiologist/CAD: used interchangeably
RJafroc: the current software
ROC: receiver operating characteristic, a data collection paradigm where each image yields a single rating and location information is ignored
ROC curve: plot of TPF (ordinate) vs. FPF, as threshold is varied; an example of an operating characteristic
ROCFIT: Metz software for binormal model based fitting of ROC data
ROI: region-of-interest (each case is divided into a number of ROIs and the reader assigns an ROC rating to each ROI)
RRFC: Analysis that treats readers as random and cases as fixed factors
RRRC: Analysis that treats both readers and cases as random factors
RSCORE-II: original software for binormal model based fitting of ROC data
RSM: Radiological search model, also method for fitting a proper ROC curve to ROC data
RSM-\zeta1: Lowest reporting threshold, determines if suspicious region is actually marked
RSM-\lambda: Intrinsic parameter of RSM corresponding to \lambda', independent of \mu
RSM-\lambda': Physical Poisson parameter of RSM, average number of latent NLs per case; depends on \mu
RSM-\mu: separation of the unit variance distributions of RSM
RSM-\nu: Intrinsic parameter of RSM, corresponding to \nu', independent of \mu
RSM-\nu': binomial parameter of RSM, probability that lesion is found
SE: sensitivity, same as TPF
Significance testing: determining the p-value of a statistical test
SP: specificity, same as 1-FPF
Threshold: Reporting criteria: if confidence exceeds a threshold value, report case as diseased, otherwise report non-diseased
TN: true negative, a non-diseased case classified as non-diseased
TP: true positive, a diseased case classified as diseased
TPF: number of TPs divided by number of diseased cases
Treatment/modality: used interchangeably, for example, computed tomography (CT) images vs. magnetic resonance imaging (MRI) images
wAFROC curve: plot of weighted LLF (ordinate) vs. FPF, where FPF is inferred using highest rating of NL marks on non-diseased cases ONLY
wAFROC1 curve: plot of weighted LLF (ordinate) vs. FPF1, where FPF1 is inferred using highest rating of NL marks on ALL cases
wAFROC1 FOM: weighted trapezoidal area under wAFROC1 curve: only use if there are zero non-diseased cases
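Several of the ROC definitions above (FPF, TPF, operating point, empirical AUC) can be made concrete with a few lines of base R. The following is a minimal sketch on made-up ratings; the function names `operating_point` and `wilcoxon_auc` are illustrative and are not part of RJafroc, and the convention that a case is reported as diseased when its rating meets or exceeds the threshold is an assumption of this example.

```r
# Toy ROC ratings (hypothetical values, not taken from any included dataset)
fp <- c(1, 2, 2, 3, 1)      # ratings of K1 = 5 non-diseased cases
tp <- c(3, 4, 2, 5, 4, 3)   # ratings of K2 = 6 diseased cases

# Operating point (FPF, TPF) at a given threshold.
# Assumption: a case is reported diseased if rating >= threshold.
operating_point <- function(fp, tp, threshold) {
  c(FPF = mean(fp >= threshold), TPF = mean(tp >= threshold))
}

# Empirical (trapezoidal) ROC-AUC computed as the Wilcoxon statistic:
# the fraction of (diseased, non-diseased) pairs that are correctly
# ordered, with ties counted as one half.
wilcoxon_auc <- function(fp, tp) {
  diffs <- outer(tp, fp, "-")
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

op  <- operating_point(fp, tp, threshold = 3)
auc <- wilcoxon_auc(fp, tp)
print(op)
print(auc)
```

Sensitivity and specificity at the same threshold follow directly as `op["TPF"]` and `1 - op["FPF"]`.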
Dataset
The dataset object has 3 list elements: $ratings, $lesions and $descriptions, where:
- dataset$ratings: contains 3 elements as sub-lists: $NL, $LL and $LL_IL; these describe the structure of the ratings;
- dataset$lesions: contains 3 elements as sub-lists: $perCase, $IDs and $weights; these describe the structure of the lesions;
- dataset$descriptions: contains 7 elements as sub-lists: $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID; these describe other characteristics of the dataset as detailed next.
Note: -Inf is used to indicate the ratings of unmarked lesions and/or missing values. As an example of the latter, if the maximum number of NLs in a dataset is 4, but some images have fewer than 4 NL marks, the corresponding "empty" positions would be filled with -Infs. Do not use NA to denote a missing rating.
Note: "dataset" in this package always represents R object(s) with the following structure(s):
General data structure, e.g., dataset02, an ROC dataset, and dataset05, an FROC dataset.
- ratings$NL: a float array with dimensions c(I, J, K, maxNL), containing the ratings of NL marks. The first K1 locations of the third index correspond to NL marks on non-diseased cases and the remaining locations correspond to NL marks on diseased cases. The 4th dimension allows for multiple NL marks on a case: the first index holds the first NL rating on the image, the second holds the second NL rating on the image, etc. The value of maxNL is determined by the case with the maximum number of NL marks in the dataset. For FROC datasets missing NL ratings are assigned the -Inf rating. For ROC datasets, FP ratings are assigned to the first K1 elements of NL[,,1:K1,1] and the remaining K2 elements of NL[,,(K1+1):K,1] are set to -Inf.
- ratings$LL: for non-LROC datasets a float array with dimensions c(I, J, K2, maxLL) containing the ratings of LL marks. The value of maxLL is determined by the maximum number of lesions per case in the dataset. Unmarked lesions are assigned the -Inf rating. For ROC datasets TP ratings are assigned to LL[,,1:K2,1]. For LROC datasets it is a float array with dimensions c(I, J, K2, 1) containing the ratings of correct localizations; otherwise the rating is recorded in the incorrect localization array described next.
- ratings$LL_IL: for LROC datasets the ratings of incorrect localization marks on abnormal cases. It is a float array with dimensions c(I, J, K2, 1). For non-LROC datasets this array is filled with NAs.
- lesions$perCase: an integer array with length K2, the number of lesions on each diseased case. The maximum value of this array equals maxLL. For example, dataset05$lesions$perCase[4] is 2, meaning the 4th diseased case has two lesions.
- lesions$IDs: an integer array with dimensions c(K2, maxLL), labeling (or naming) the lesions on the diseased cases. For example, dataset05$lesions$IDs[4,] is c(1,2,-Inf), meaning the 4th diseased case has two lesions, labeled 1 and 2.
- lesions$weights: a floating point array with dimensions c(K2, maxLL), representing the relative importance of detecting each lesion. The weights for an abnormal case must sum to unity. For example, dataset05$lesions$weights[4,] is c(0.5,0.5,-Inf), corresponding to equal weights (0.5) assigned to each of the two lesions in the case.
- descriptions$fileName: a character variable containing the file name of the source data for this dataset. This is generated automatically by the DfReadDataFile function used to read the file. For a simulated dataset it is set to "NA" (i.e., a character vector, not the variable NA).
- descriptions$type: a character variable describing the data type: "ROC", "LROC", "ROI" or "FROC".
- descriptions$name: a character variable containing the name of the dataset: e.g., "dataset02" or "dataset05". This is generated automatically by the DfReadDataFile function used to read the file.
- descriptions$truthTableStr: a c(I, J, L, maxLL+1) object. For normal cases elements c(I, J, L, 1) are filled with 1s if the corresponding interpretations occurred or NAs otherwise. For abnormal cases elements c(I, J, L, 2:(maxLL+1)) are filled with 1s if the corresponding interpretations occurred or NAs otherwise. This object is necessary for analyzing more complex designs, e.g., split-plot, as described next.
- descriptions$design: a character variable: "FCTRL", "SPLIT-PLOT-A" or "SPLIT-PLOT-C", corresponding to factorial, split-plot-A or split-plot-C designs. The A and C refer to subparts of Table VII in a Hillis 2014 publication.
- descriptions$modalityID: a character vector of length I, which labels/names the modalities in the dataset. For non-JAFROC data file formats, they must be unique integers.
- descriptions$readerID: a character vector of length J, which labels/names the readers in the dataset. For non-JAFROC data file formats, they must be unique integers.
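The array layout described above may be easier to follow with a small hand-built example. The sketch below uses only base R and made-up ratings; it mimics the shape of the $ratings and $lesions elements for a tiny FROC study (it omits $fileName and $truthTableStr) and is meant to illustrate the indexing and the -Inf fill convention, not to produce an object that is guaranteed to be accepted by the package functions.

```r
# Hypothetical study dimensions
I <- 2; J <- 3          # modalities, readers
K1 <- 4; K2 <- 3        # non-diseased and diseased cases
K <- K1 + K2
maxNL <- 2; maxLL <- 2

# Start with every slot "empty": -Inf marks unmarked lesions / unused NL slots
NL <- array(-Inf, dim = c(I, J, K, maxNL))
LL <- array(-Inf, dim = c(I, J, K2, maxLL))

# Modality 1, reader 1: one NL mark (rating 2.5) on the 2nd non-diseased case
NL[1, 1, 2, 1] <- 2.5
# Modality 1, reader 1: two NL marks on the 1st diseased case (third index K1 + 1)
NL[1, 1, K1 + 1, ] <- c(1.2, 3.1)
# Modality 1, reader 1: first lesion of the 1st diseased case marked, second unmarked
LL[1, 1, 1, 1] <- 4.0

# Lesion bookkeeping: the 1st diseased case has two lesions, the others one each
perCase <- c(2L, 1L, 1L)
IDs     <- rbind(c(1, 2), c(1, -Inf), c(1, -Inf))
weights <- rbind(c(0.5, 0.5), c(1, -Inf), c(1, -Inf))

toy <- list(
  ratings = list(NL = NL, LL = LL, LL_IL = NA),
  lesions = list(perCase = perCase, IDs = IDs, weights = weights),
  descriptions = list(type = "FROC", name = "toy", design = "FCTRL",
                      modalityID = c("1", "2"), readerID = c("1", "2", "3"))
)

# Number of NL marks per case for modality 1, reader 1 (finite entries only)
marks_per_case <- rowSums(is.finite(toy$ratings$NL[1, 1, , ]))
# Weights on each diseased case should sum to unity once the -Inf fill is ignored
w_sums <- apply(toy$lesions$weights, 1, function(w) sum(w[is.finite(w)]))
print(marks_per_case)
print(w_sums)
```

Filtering with `is.finite()` is one simple way to skip the -Inf placeholders; it would not work if NA were used for missing ratings, which is consistent with the rule that NA is not allowed.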
ROI data structure, e.g., datasetROI
Only changes from the previously described structure are described below:
- ratings$NL: a float array with dimensions c(I, J, K, Q) containing the ratings of each of Q quadrants for each non-diseased case.
- ratings$LL: a float array with dimensions c(I, J, K2, Q) containing the ratings of quadrants for each diseased case.
- lesions$perCase: this contains the locations, on abnormal cases, containing at least one lesion.
Crossed modality data structure, e.g., datasetCrossedModality
Only changes from the previously described structure are described below:
- ratings$NL: a float array with dimensions c(I1, I2, J, K, maxNL) containing the ratings of NL marks. Note the existence of two modality indices.
- LL: a float array with dimensions c(I1, I2, J, K2, maxLL) containing the ratings of all LL marks. Note the existence of two modality indices.
- modalityID1: corresponding to first modality factor.
- modalityID2: corresponding to second modality factor.
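With two modality indices, performance in one treatment factor is obtained by averaging over all levels of the other. The sketch below illustrates only that averaging step on a made-up array of figures of merit with dimensions c(I1, I2, J); the random values and variable names are assumptions for illustration, and the package's own crossed-modality analysis may organize the computation differently.

```r
set.seed(1)
I1 <- 2; I2 <- 3; J <- 4   # levels of factor 1, levels of factor 2, readers

# Hypothetical FOM values, one per (factor-1 level, factor-2 level, reader)
fom <- array(runif(I1 * I2 * J, min = 0.7, max = 0.9), dim = c(I1, I2, J))

# Performance in each level of factor 1, averaged over the levels of factor 2:
# keep dimensions 1 (factor 1) and 3 (reader), average over dimension 2
fom_factor1 <- apply(fom, c(1, 3), mean)   # I1 x J matrix

# Performance in each level of factor 2, averaged over the levels of factor 1
fom_factor2 <- apply(fom, c(2, 3), mean)   # I2 x J matrix

print(dim(fom_factor1))
print(dim(fom_factor2))
```

Each averaged matrix has the treatment-by-reader shape of an ordinary single-factor study.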
Df: Datafile Related Functions
- Df2RJafrocDataset: Convert a ratings array to a dataset object.
- DfBinDataset: Return a binned dataset.
- DfCreateCorCbmDataset: Create paired dataset for testing FitCorCbm.
- DfExtractDataset: Extract a subset of modalities and readers from a dataset.
- DfFroc2Roc: Convert an FROC dataset to a highest rating inferred ROC dataset.
- DfLroc2Roc: Convert an LROC dataset to a highest rating inferred ROC dataset.
- DfLroc2Froc: Simulates an "AUC-equivalent" FROC dataset from a supplied LROC dataset.
- DfFroc2Lroc: Simulates an "AUC-equivalent" LROC dataset from a supplied FROC dataset.
- DfReadCrossedModalities: Read a crossed-modalities data file.
- DfReadDataFile: Read a general data file.
- DfSaveDataFile: Save ROC data file in a different format.
- DfExtractCorCbmDataset: Extract two arms of a pairing from an MRMC ROC dataset suitable for using FitCorCbm.
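The idea behind a "highest rating inferred" ROC dataset can be sketched in base R: each case receives the rating of its highest rated mark, using NL marks only on non-diseased cases and both NL and LL marks on diseased cases. The code below is an illustration on made-up arrays laid out as ratings$NL and ratings$LL are described above; it is not the DfFroc2Roc implementation, and it leaves unmarked cases at -Inf, which is an assumption of this sketch.

```r
# Tiny hypothetical FROC arrays: 1 modality, 1 reader
I <- 1; J <- 1; K1 <- 2; K2 <- 2; K <- K1 + K2
maxNL <- 2; maxLL <- 2
NL <- array(-Inf, dim = c(I, J, K, maxNL))
LL <- array(-Inf, dim = c(I, J, K2, maxLL))

NL[1, 1, 1, ] <- c(1, 3)   # non-diseased case 1: two NL marks
                           # non-diseased case 2: unmarked
NL[1, 1, 3, 1] <- 2        # diseased case 1: one NL mark ...
LL[1, 1, 1, 1] <- 4        # ... and one LL mark
LL[1, 1, 2, ] <- c(1, 2)   # diseased case 2: two LL marks, no NL marks

# Inferred FP rating: highest NL rating on each non-diseased case
fp <- apply(NL[, , 1:K1, , drop = FALSE], c(1, 2, 3), max)

# Inferred TP rating: highest of all NL and LL ratings on each diseased case
nl_dis <- apply(NL[, , (K1 + 1):K, , drop = FALSE], c(1, 2, 3), max)
ll_dis <- apply(LL, c(1, 2, 3), max)
tp <- pmax(nl_dis, ll_dis)

print(fp[1, 1, ])   # 3 -Inf
print(tp[1, 1, ])   # 4 2
```

Using `drop = FALSE` keeps the modality and reader dimensions even when I or J equals one, so the same code works for larger studies.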
Fitting Functions
- FitBinormalRoc: Fit the binormal model to ROC data (R equivalent of ROCFIT or RSCORE).
- FitCbmRoc: Fit the contaminated binormal model (CBM) to ROC data.
- FitRsmRoc: Fit the radiological search model (RSM) to ROC data.
- FitCorCbm: Fit the correlated contaminated binormal model (CORCBM) to paired ROC data.
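For orientation, the curve that a binormal fit produces can be written down directly from the "a" and "b" parameters. The sketch below assumes the conventional parameterization (non-diseased ratings standard normal, so that FPF = Phi(-zeta) and TPF = Phi(a - b*zeta) at threshold zeta) and checks the closed-form AUC, Phi(a / sqrt(1 + b^2)), against numerical integration. The function names and parameter values are illustrative and are not taken from the package.

```r
# Binormal-model operating points for a vector of thresholds zeta
binormal_roc <- function(a, b, zeta) {
  data.frame(FPF = pnorm(-zeta), TPF = pnorm(a - b * zeta))
}

# Closed-form area under the binormal ROC curve
binormal_auc <- function(a, b) pnorm(a / sqrt(1 + b^2))

a <- 1.5; b <- 0.8                      # hypothetical parameter values
roc <- binormal_roc(a, b, zeta = seq(-3, 5, by = 0.5))

auc_analytic <- binormal_auc(a, b)
# Numerical check: AUC is the integral of TPF(zeta) weighted by the
# standard normal density of the non-diseased ratings
auc_numeric <- integrate(function(z) pnorm(a - b * z) * dnorm(z),
                         lower = -Inf, upper = Inf)$value

print(head(roc))
print(c(analytic = auc_analytic, numeric = auc_numeric))
```

When b differs from 1 this curve crosses the chance diagonal somewhere, which is the "improper" behavior that the CBM, CORCBM and RSM fits are designed to avoid.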
Plotting Functions
- PlotBinormalFit: Plot binormal-predicted ROC curve with provided BM parameters.
- PlotEmpiricalOperatingCharacteristics: Plot empirical operating characteristics for specified dataset.
- PlotRsmOperatingCharacteristics: Plot RSM-fitted ROC curves.
Simulation Functions
- SimulateFrocDataset: Simulates an uncorrelated FROC dataset using the RSM.
- SimulateRocDataset: Simulates an uncorrelated binormal model ROC dataset.
- SimulateCorCbmDataset: Simulates a paired ROC dataset using the correlated contaminated binormal model (CORCBM).
- SimulateLrocDataset: Simulates an uncorrelated LROC dataset.
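The RSM sampling scheme, as described in the definitions (Poisson number of latent NLs with mean \lambda', each lesion found with probability \nu', unit variance normal ratings separated by \mu, and marks reported only above \zeta1), can be sketched for a single reader and modality in base R. This is an illustration under those stated assumptions, not the code used by SimulateFrocDataset; the function `simulate_case` and the argument names `lambdaP` and `nuP` are hypothetical.

```r
# Simulate the mark-rating pairs of one diseased case under the RSM
simulate_case <- function(mu, lambdaP, nuP, zeta1, n_lesions) {
  n_nl  <- rpois(1, lambdaP)                        # latent NLs on this case
  nl    <- rnorm(n_nl, mean = 0, sd = 1)            # latent NL ratings
  found <- rbinom(n_lesions, size = 1, prob = nuP) == 1
  ll    <- rnorm(sum(found), mean = mu, sd = 1)     # latent LL ratings
  # Only latent sites whose rating exceeds zeta1 are actually marked
  list(NL = nl[nl > zeta1], LL = ll[ll > zeta1])
}

set.seed(42)
n_cases <- 2000
cases <- replicate(
  n_cases,
  simulate_case(mu = 2, lambdaP = 1, nuP = 0.8, zeta1 = -Inf, n_lesions = 1),
  simplify = FALSE
)

# End-point of the FROC curve: with zeta1 = -Inf every latent site is marked,
# so NLF should be close to lambda' and LLF close to nu'
NLF <- sum(vapply(cases, function(x) length(x$NL), integer(1))) / n_cases
LLF <- sum(vapply(cases, function(x) length(x$LL), integer(1))) / n_cases
print(c(NLF = NLF, LLF = LLF))
```

Raising `zeta1` above -Inf lowers both NLF and LLF, which is how the reporting threshold moves the operating point along the FROC curve.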
Sample size Functions
- SsPowerGivenJK: Calculate statistical power given numbers of readers J and cases K.
- SsPowerTable: Generate a power table.
- SsSampleSizeKGivenJ: Calculate number of cases K, for specified number of readers J, to achieve desired power for an ROC study.
Significance Testing Functions
- StSignificanceTesting: Perform significance testing, DBM or OR.
- StSignificanceTestingCadVsRad: Perform significance testing, CAD vs. radiologists.
- StSignificanceTestingCrossedModalities: Perform significance testing using crossed modalities analysis.
Miscellaneous and Utility Functions
- UtilAucBinormal: Binormal model AUC function.
- UtilAucCBM: CBM AUC function.
- UtilAucPROPROC: PROPROC AUC function.
- UtilAnalyticalAucsRSM: RSM ROC/AFROC AUC calculator.
- UtilFigureOfMerit: Calculate empirical figures of merit (FOMs) for specified dataset.
- UtilIntrinsic2RSM: Convert from intrinsic to physical RSM parameters.
- UtilLesionWeightsMatrix: Calculates the lesion weights matrix.
- UtilMeanSquares: Calculates the mean squares used in the DBMH and ORH methods.
- UtilOutputReport: Generate a formatted report file.
- UtilRSM2Intrinsic: Convert from physical RSM parameters to intrinsic RSM parameters.
- UtilPseudoValues: Return jackknife pseudovalues.
- UtilVarComponentsDBM: Utility for Dorfman-Berbaum-Metz variance components.
- UtilORVarComponentsFactorial: Utility for Obuchowski-Rockette variance components.
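Jackknife pseudovalues, which underlie the DBM method, follow a simple recipe: recompute the figure of merit with each case removed in turn, then form K * theta - (K - 1) * theta_(k). The sketch below applies this to the Wilcoxon AUC of one reader in one modality using made-up ratings; the function names are illustrative and this is not the UtilPseudoValues implementation.

```r
# Empirical AUC (Wilcoxon statistic), ties counted as one half
wilcoxon_auc <- function(fp, tp) {
  diffs <- outer(tp, fp, "-")
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

# Pseudovalues for all K = K1 + K2 cases:
# non-diseased cases first, then diseased cases
jackknife_pseudovalues <- function(fp, tp) {
  K1 <- length(fp); K2 <- length(tp); K <- K1 + K2
  theta <- wilcoxon_auc(fp, tp)
  theta_deleted <- c(
    vapply(seq_len(K1), function(k) wilcoxon_auc(fp[-k], tp), numeric(1)),
    vapply(seq_len(K2), function(k) wilcoxon_auc(fp, tp[-k]), numeric(1))
  )
  K * theta - (K - 1) * theta_deleted
}

fp <- c(1, 2, 2, 3, 1)      # hypothetical non-diseased ratings
tp <- c(3, 4, 2, 5, 4, 3)   # hypothetical diseased ratings

pseudo <- jackknife_pseudovalues(fp, tp)
print(pseudo)
print(c(mean_pseudo = mean(pseudo), auc = wilcoxon_auc(fp, tp)))
```

For the Wilcoxon statistic the mean of the pseudovalues reproduces the original AUC, which gives a quick sanity check on any implementation.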
Author(s)
Author: Dev Chakraborty dpc10ster@gmail.com.
Author: Xuetong Zhai xuetong.zhai@gmail.com.
Contributor: Peter Phillips peter.phillips@cumbria.ac.uk.
References
Basics of ROC
Metz, CE (1978). Basic principles of ROC analysis. In Seminars in nuclear medicine (Vol. 8, pp. 283–298). Elsevier.
Metz, CE (1986). ROC Methodology in Radiologic Imaging. Investigative Radiology, 21(9), 720.
Metz, CE (1989). Some practical issues of experimental design and data analysis in radiological ROC studies. Investigative Radiology, 24(3), 234.
Metz, CE (2008). ROC analysis in medical imaging: a tutorial review of the literature. Radiological Physics and Technology, 1(1), 2–12.
Wagner, R. F., Beiden, S. V, Campbell, G., Metz, CE, & Sacks, W. M. (2002). Assessment of medical imaging and computer-assist systems: lessons from recent experience. Academic Radiology, 9(11), 1264–77.
Wagner, R. F., Metz, CE, & Campbell, G. (2007). Assessment of medical imaging systems and computer aids: a tutorial review. Academic Radiology, 14(6), 723–48.
DBM/OR methods and extensions
Dorfman, DD, Berbaum, KS, & Metz, CE (1992). Receiver operating characteristic rating analysis: generalization to the population of readers and patients with the jackknife method. Investigative Radiology, 27(9), 723.
Obuchowski, NA, & Rockette, HE (1995). Hypothesis testing of diagnostic accuracy for multiple readers and multiple tests: an ANOVA approach with dependent observations. Communications in Statistics-Simulation and Computation, 24(2), 285–308.
Hillis, SL, Berbaum, KS, & Metz, CE (2008). Recent developments in the Dorfman-Berbaum-Metz procedure for multireader ROC study analysis. Academic Radiology, 15(5), 647–61.
Hillis, SL, Obuchowski, NA, & Berbaum, KS (2011). Power Estimation for Multireader ROC Methods: An Updated and Unified Approach. Acad Radiol, 18, 129–142.
Hillis, SL (2007). A comparison of denominator degrees of freedom methods for multiple observer ROC analysis. Statistics in Medicine, 26(3), 596–619.
FROC paradigm
Chakraborty DP. Maximum Likelihood analysis of free-response receiver operating characteristic (FROC) data. Med Phys. 1989;16(4):561–568.
Chakraborty, DP, & Berbaum, KS (2004). Observer studies involving detection and localization: modeling, analysis, and validation. Medical Physics, 31(8), 1–18.
Chakraborty, DP (2006). A search model and figure of merit for observer data acquired according to the free-response paradigm. Physics in Medicine and Biology, 51(14), 3449–62.
Chakraborty, DP (2006). ROC curves predicted by a model of visual search. Physics in Medicine and Biology, 51(14), 3463–82.
Chakraborty, DP (2011). New Developments in Observer Performance Methodology in Medical Imaging. Seminars in Nuclear Medicine, 41(6), 401–418.
Chakraborty, DP (2013). A Brief History of Free-Response Receiver Operating Characteristic Paradigm Data Analysis. Academic Radiology, 20(7), 915–919.
Chakraborty, DP, & Yoon, H.-J. (2008). Operating characteristics predicted by models for diagnostic tasks involving lesion localization. Medical Physics, 35(2), 435.
Thompson JD, Chakraborty DP, Szczepura K, et al. (2016) Effect of reconstruction methods and x-ray tube current-time product on nodule detection in an anthropomorphic thorax phantom: a crossed-modality JAFROC observer study. Medical Physics. 43(3):1265-1274.
Zhai X, Chakraborty DP. (2017) A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 2207–2222. doi: 10.1002/mp.12263.
Hillis SL, Chakraborty DP, Orton CG. ROC or FROC? It depends on the research question. Medical Physics. 2017.
Chakraborty DP, Nishikawa RM, Orton CG. Due to potential concerns of bias and conflicts of interest, regulatory bodies should not do evaluation methodology research related to their regulatory missions. Medical Physics. 2017.
Dobbins III JT, McAdams HP, Sabol JM, Chakraborty DP, et al. (2016) Multi-Institutional Evaluation of Digital Tomosynthesis, Dual-Energy Radiography, and Conventional Chest Radiography for the Detection and Management of Pulmonary Nodules. Radiology. 282(1):236-250.
Warren LM, Mackenzie A, Cooke J, et al. Effect of image quality on calcification detection in digital mammography. Medical Physics. 2012;39(6):3202-3213.
Chakraborty DP, Zhai X. On the meaning of the weighted alternative free-response operating characteristic figure of merit. Medical physics. 2016;43(5):2548-2557.
Chakraborty DP. (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples. Taylor-Francis, LLC.
Compute the chisquare goodness of fit statistic for ROC fitting model
Description
Compute the chisquare goodness of fit statistic for specified ROC data fitting model
Usage
ChisqrGoodnessOfFit(fpCounts, tpCounts, parameters, model, lesDistr)
Arguments
fpCounts |
The FP counts table |
tpCounts |
The TP counts table |
parameters |
The parameters of the model including cutoffs, see details |
model |
The fitting model: "BINORMAL", "CBM" or "RSM" |
lesDistr |
The lesion distribution matrix; not needed for "BINORMAL" or "CBM" models. Array [1:maxLL,1:2]. The probability mass function of the lesion distribution for diseased cases. The first column contains the actual numbers of lesions per case. The second column contains the fraction of diseased cases with the number of lesions specified in the first column. The second column must sum to unity. |
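To illustrate the lesDistr layout described above, the following base-R sketch builds the two-column matrix from a vector of lesion counts per diseased case. The vector lesionsPerCase and its values are hypothetical, and the sketch does not call any RJafroc function.

```r
# Hypothetical numbers of lesions on each of 10 diseased cases
lesionsPerCase <- c(1, 1, 2, 1, 3, 2, 1, 1, 2, 1)

# Column 1: distinct numbers of lesions per case
# Column 2: fraction of diseased cases with that many lesions
tab <- table(lesionsPerCase)
lesDistr <- cbind(as.numeric(names(tab)),
                  as.numeric(tab) / length(lesionsPerCase))

print(lesDistr)

# The second column is a probability mass function, so it must sum to unity
stopifnot(isTRUE(all.equal(sum(lesDistr[, 2]), 1)))
```

Here six of the ten cases have one lesion, three have two and one has three, so the matrix has three rows.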
Details
For model = "BINORMAL" the parameters are c(a,b,zetas). For model = "CBM" the parameters are c(mu,alpha,zetas). For model = "RSM" the parameters are c(mu,lambda,nu,zetas). Due to the sparsity of the data, in most cases the goodness of fit statistic cannot be calculated as the criterion of at least 5 counts in each cell (TP and FP) is usually not met. An exception dataset is shown below.
Value
The return value is a list with the following elements:
chisq |
The chi-square statistic |
pVal |
The p-value of the fit |
df |
The degrees of freedom |
Examples
## Test with TONY data for which chisqr can be calculated
ds <- DfFroc2Roc(dataset01)
fit <- FitBinormalRoc(ds, 2, 3) # trt 2 and rdr 3
## fitted a,b and zeta parameters from preceding line were used to call the
## function as shown below:
fpCounts = c(119, 30, 9, 19, 7, 1)
tpCounts = c(10, 11, 7, 16, 29, 16)
gfit = ChisqrGoodnessOfFit(fpCounts, tpCounts,
parameters = c(fit$a, fit$b, fit$zetas), model="BINORMAL")
gfit
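The idea behind the statistic can be sketched in base R for the binormal case. This is an illustration only, not the package's implementation. The parameter values are hypothetical rather than fitted. The sketch assumes the cutoffs are on the scale where non-diseased ratings follow N(0, b^2) and diseased ratings follow N(a, 1), and it uses the usual convention that the degrees of freedom equal the number of independent cells minus the number of estimated parameters. It does not check the five-count criterion.

```r
# Observed counts per rating bin (taken from the example above)
fpCounts <- c(119, 30, 9, 19, 7, 1)
tpCounts <- c(10, 11, 7, 16, 29, 16)

# Hypothetical binormal parameters, for illustration only (not fitted values)
a <- 1.3
b <- 0.6
zetas <- c(0.4, 0.6, 0.7, 1.0, 1.6)  # cutoffs, in increasing order

# Assumed model: non-diseased ~ N(0, b^2), diseased ~ N(a, 1)
cuts <- c(-Inf, zetas, Inf)
fpProb <- diff(pnorm(cuts, mean = 0, sd = b))  # bin probabilities, non-diseased
tpProb <- diff(pnorm(cuts, mean = a, sd = 1))  # bin probabilities, diseased

fpExpected <- sum(fpCounts) * fpProb
tpExpected <- sum(tpCounts) * tpProb

# Pearson statistic summed over the FP and TP cells
chisq <- sum((fpCounts - fpExpected)^2 / fpExpected) +
  sum((tpCounts - tpExpected)^2 / tpExpected)

# Independent cells minus estimated parameters (a, b and the cutoffs)
R <- length(fpCounts)
df <- 2 * (R - 1) - (2 + length(zetas))
pVal <- pchisq(chisq, df, lower.tail = FALSE)

list(chisq = chisq, pVal = pVal, df = df)
```

With six bins there are 10 independent cells and 7 parameters, leaving 3 degrees of freedom.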
Convert ratings arrays to an RJafroc dataset
Description
Converts ratings arrays, ROC or FROC, but not LROC, to an RJafroc dataset, thereby allowing the user to leverage the file I/O, plotting and analyses capabilities of RJafroc.
Usage
Df2RJafrocDataset(NL, LL, InputIsCountsTable = FALSE, ...)
Arguments
NL |
Non-lesion localizations array (or FP array for ROC data). |
LL |
Lesion localizations array (or TP array for ROC data). |
InputIsCountsTable |
If |
... |
Other elements of RJafroc dataset that may, depending on the
context, need to be specified. |
Details
The function "senses" the data type (ROC or FROC) from the absence
or presence of perCase.
ROC data can be NL[1:K1] and LL[1:K2], or
NL[1:I,1:J,1:K1] and LL[1:I,1:J,1:K2].
FROC data can be NL[1:K1,1:maxNL] and LL[1:K2,1:maxLL], or
NL[1:I,1:J,1:K1,1:maxNL] and LL[1:I,1:J,1:K2,1:maxLL].
Here maxNL/maxLL = maximum numbers of NLs/LLs, per case, over entire dataset.
Equal weights are assigned to every lesion (FROC data).
Consecutive characters/integers starting with "1" are assigned to IDs,
modalityID and readerID.
Value
A dataset with the structure described in RJafroc-package.
Examples
## Input as ratings arrays
set.seed(1);NL <- rnorm(5);LL <- rnorm(7)*1.5 + 2
dataset <- Df2RJafrocDataset(NL, LL)
## Input as counts tables
K1t <- c(30, 19, 8, 2, 1)
K2t <- c(5, 6, 5, 12, 22)
dataset <- Df2RJafrocDataset(K1t, K2t, InputIsCountsTable = TRUE)
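A counts table holds, for each rating bin, the number of cases that received that rating. The base-R sketch below shows how such a table relates to a per-case ratings vector. It illustrates the idea only and is not necessarily how Df2RJafrocDataset performs the conversion internally; the names fpRatings and tpRatings are introduced here for illustration.

```r
K1t <- c(30, 19, 8, 2, 1)   # non-diseased counts in bins 1..5
K2t <- c(5, 6, 5, 12, 22)   # diseased counts in bins 1..5

# Expand the counts to one integer rating per case
fpRatings <- rep(seq_along(K1t), times = K1t)
tpRatings <- rep(seq_along(K2t), times = K2t)

# Tabulating the ratings recovers the original counts
stopifnot(all(tabulate(fpRatings, nbins = length(K1t)) == K1t))
stopifnot(all(tabulate(tpRatings, nbins = length(K2t)) == K2t))

length(fpRatings)  # number of non-diseased cases
length(tpRatings)  # number of diseased cases
```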
Returns a binned dataset
Description
Bins continuous (i.e., floating point) or quasi-continuous (e.g., integers 0-100) ratings in a dataset and returns the corresponding binned dataset, in which the ratings are integers 1, 2, ..., with higher values representing greater confidence in the presence of disease.
Usage
DfBinDataset(dataset, desiredNumBins = 7, opChType)
Arguments
dataset |
The dataset to be binned, with structure as in |
desiredNumBins |
The desired number of bins. The default is 7. |
opChType |
The operating characteristic relevant to the binning operation:
|
Details
For small datasets the number of bins may be smaller than desiredNumBins.
The algorithm needs to know the type of operating characteristic
relevant to the binning operation. For ROC the bins are FP and TP counts, for
FROC the bins are NL and LL counts, for AFROC the bins are FP and LL counts,
and for wAFROC the bins are FP and wLL counts. Binning is generally
employed prior to fitting a statistical model, e.g., maximum likelihood, to the data.
This version chooses cutoffs so as to maximize empirical AUC (this yields a
unique choice of cutoffs which gives the reader the maximum deserved credit).
Value
The binned dataset
References
Miller GA (1956) The Magical Number Seven, Plus or Minus Two: Some limits on our capacity for processing information, The Psychological Review 63, 81-97
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Examples
binned <- DfBinDataset(dataset02, desiredNumBins = 3, opChType = "ROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "ROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "AFROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "wAFROC")
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 1)
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 2)
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 3)
## etc.
## takes longer than 5 sec on OSX
dataset <- SimulateRocDataset(I = 2, J = 5, K1 = 50, K2 = 70, a = 1, b = 0.5, seed = 123)
datasetB <- DfBinDataset(dataset, desiredNumBins = 7, opChType = "ROC")
fomOrg <- as.matrix(UtilFigureOfMerit(dataset, FOM = "Wilcoxon"))
print(fomOrg)
fomBinned <- as.matrix(UtilFigureOfMerit(datasetB, FOM = "Wilcoxon"))
print(fomBinned)
cat("mean, sd = ", mean(fomOrg), sd(fomOrg), "\n")
cat("mean, sd = ", mean(fomBinned), sd(fomBinned), "\n")
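The effect of binning on the empirical AUC can also be explored without the package. The base-R sketch below bins simulated continuous ratings into integer ratings and compares the Wilcoxon AUC before and after. It places the cutoffs at quantiles of the pooled ratings, which is a simpler stand-in and not the AUC-maximizing choice described in the Details; the helper wilcoxonAuc and the simulated ratings are hypothetical.

```r
set.seed(1)
fp <- rnorm(50)              # simulated non-diseased ratings
tp <- rnorm(70, mean = 1.5)  # simulated diseased ratings

# Empirical (Wilcoxon) AUC: P(tp > fp) + 0.5 * P(tp == fp)
wilcoxonAuc <- function(fp, tp) {
  mean(outer(tp, fp, ">") + 0.5 * outer(tp, fp, "=="))
}

# Quantile-based cutoffs on the pooled ratings (a stand-in; the package
# instead chooses cutoffs that maximize the empirical AUC)
desiredNumBins <- 5
probs <- seq(0, 1, length.out = desiredNumBins + 1)
cutoffs <- quantile(c(fp, tp), probs = probs[-c(1, length(probs))])

# Integer ratings 1, 2, ..., with higher values meaning greater confidence
fpBinned <- findInterval(fp, cutoffs) + 1
tpBinned <- findInterval(tp, cutoffs) + 1

c(original = wilcoxonAuc(fp, tp), binned = wilcoxonAuc(fpBinned, tpBinned))
```

Binning turns some strictly ordered pairs into ties, so the two AUC values generally differ slightly.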
Create paired dataset for testing FitCorCbm
Description
The paired dataset is generated using bivariate sampling; details are in the referenced publication.
Usage
DfCreateCorCbmDataset(
seed = 123,
K1 = 50,
K2 = 50,
desiredNumBins = 5,
muX = 1.5,
muY = 3,
alphaX = 0.4,
alphaY = 0.7,
rhoNor = 0.3,
rhoAbn2 = 0.8
)
Arguments
seed |
The seed variable, default is 123; set to NULL for truly random seed |
K1 |
The number of non-diseased cases, default is 50 |
K2 |
The number of diseased cases, default is 50 |
desiredNumBins |
The desired number of bins; default is 5 |
muX |
The CBM ‘mu’ parameter in condition X |
muY |
The CBM ‘mu’ parameter in condition Y |
alphaX |
The CBM ‘alpha’ parameter in condition X |
alphaY |
The CBM ‘alpha’ parameter in condition Y |
rhoNor |
The correlation of non-diseased case z-samples |
rhoAbn2 |
The correlation of diseased case z-samples, when disease is visible in both conditions |
Details
The ROC data is binned to 5 bins in each condition.
Value
The return value is the desired dataset, suitable for testing FitCorCbm.
References
Zhai X, Chakraborty DP (2017) A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.
Examples
## seed <- 1
## this gives unequal numbers of bins in X and Y conditions for 50/50 dataset
dataset <- DfCreateCorCbmDataset()
## this takes very long time!! used to show asymptotic convergence of ML estimates
## dataset <- DfCreateCorCbmDataset(K1 = 5000, K2 = 5000)
Extract two arms of a pairing from an MRMC ROC dataset
Description
Extract a paired dataset from a larger dataset. The pairing could be two readers in the same treatment, or different readers in different treatments, or the same reader in different treatments. If necessary, the data is binned to 5 bins in each condition.
Usage
DfExtractCorCbmDataset(dataset, trts = 1, rdrs = 1)
Arguments
dataset |
The original dataset from which the pairing is to be extracted |
trts |
A vector, maximum length 2, containing the indices of the treatment or treatments to be extracted |
rdrs |
A vector, maximum length 2, containing the indices of the reader or readers to be extracted |
Details
The desired pairing is contained in the vectors trts and rdrs.
If either has length one, the other must
have length two and the pairing is implicit. If both have length two, then one arm
is the first treatment paired with the second reader, and the other arm is the second
treatment paired with the first reader. Using this method any allowed pairing can be
extracted and analyzed by FitCorCbm.
The utility of this software is
in designing a ratings simulator that is statistically matched to a real dataset.
Value
A new dataset in which the number of treatments is one and the number of readers is two
Examples
## Extract the paired data corresponding to the second and third readers in the first treatment
## from the included ROC dataset
dataset11_23 <- DfExtractCorCbmDataset(dataset05, trts = 1, rdrs = c(2,3))
## Extract the paired data corresponding to the third reader in the first and second treatments
dataset12_33 <- DfExtractCorCbmDataset(dataset05, trts = c(1,2), rdrs = 3)
## Extract the data corresponding to the first reader in the first
## treatment paired with the data
## from the third reader in the second treatment
## (the bin indices are at different positions in the two arrays)
dataset12_13 <- DfExtractCorCbmDataset(dataset05,
trts = c(1,2), rdrs = c(1,3))
Extract a subset of treatments and readers from a dataset
Description
Extract a dataset consisting of a subset of treatments/readers from a larger dataset
Usage
DfExtractDataset(dataset, trts, rdrs)
Arguments
dataset |
The original dataset from which the subset is to be extracted |
trts |
A vector containing the indices of the treatments to be extracted. If this parameter is not supplied, all treatments are extracted. |
rdrs |
A vector containing the indices of the readers to be extracted. If this parameter is not supplied, all readers are extracted. |
Details
Note that trts and rdrs are vectors of indices,
not IDs. For example, if the ID of the first reader is "0", the
corresponding value in rdrs should be 1, not 0.
Value
A new dataset containing only the specified treatments and readers that were extracted from the original dataset
Examples
## Extract the data corresponding to the second reader in the
## first treatment from an included ROC dataset
ds1 <- DfExtractDataset(dataset05, trts = 1, rdrs = 2)
## Extract the data of the first and third readers in all
## treatments from the included ROC dataset
ds2 <- DfExtractDataset(dataset05, rdrs = c(1, 3))
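The distinction between indices and IDs can be seen with a plain array. The base-R sketch below uses a hypothetical three-dimensional ratings array and hypothetical reader IDs; it shows only the indexing idea and is not the package's extraction code.

```r
I <- 2; J <- 3; K <- 4
# Hypothetical ratings array indexed as [treatment, reader, case]
ratings <- array(seq_len(I * J * K), dim = c(I, J, K))
readerID <- c("0", "1", "2")   # reader IDs are labels; the first ID is "0"

# Select the first and third readers by index (1 and 3), not by ID
rdrs <- c(1, 3)
ratingsSub <- ratings[, rdrs, , drop = FALSE]  # drop = FALSE keeps all dimensions
readerIDSub <- readerID[rdrs]

dim(ratingsSub)
readerIDSub
```

The reader whose ID is "0" is selected with index 1, because R arrays are indexed from 1 regardless of the labels attached to them.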
Simulates an "AUC-equivalent" LROC dataset from an FROC dataset
Description
Simulates a multiple-treatment multiple-reader "AUC-equivalent" LROC dataset from a supplied FROC dataset.
Usage
DfFroc2Lroc(dataset)
Arguments
dataset |
The FROC dataset to be converted to LROC. |
Details
The FROC paradigm can have 0 or more marks per case. However, LROC is restricted to exactly one mark per case. For the NL array of the LROC data, for non-diseased cases, the highest rating of the FROC marks, or -Inf if there are no marks, is copied to case index k1 = 1 to k1 = K1 of the LROC dataset. For each diseased case, if the max LL rating exceeds the max NL rating, then the max LL rating is copied to the LL array, otherwise the max NL rating is copied to the LL_IL array. The max NL rating on each diseased case is then set to -Inf (since the LROC paradigm only allows one mark). The equivalent LROC dataset has the same HrAuc as the original FROC dataset. See example. The main use of this function is to test the significance testing functions using MRMC LROC datasets, which I currently don't have.
Value
The equivalent LROC dataset
Examples
lrocDataset <- DfFroc2Lroc(dataset05)
frocHrAuc <- UtilFigureOfMerit(dataset05, FOM = "HrAuc")
lrocWilcoxonAuc <- UtilFigureOfMerit(lrocDataset, FOM = "Wilcoxon")
## expect_equal(frocHrAuc, lrocWilcoxonAuc)
Convert an FROC dataset to an ROC dataset
Description
Convert an FROC dataset to a highest rating inferred ROC dataset
Usage
DfFroc2Roc(dataset)
Arguments
dataset |
The FROC dataset to be converted, |
Details
The first member of the ROC dataset is NL, whose 3rd dimension has
length (K1 + K2), the total number of cases. Ratings of cases (K1 + 1)
through (K1 + K2) are -Inf. This is because in an ROC dataset
FPs are only possible on non-diseased cases. The second member of the list is LL.
Its 3rd dimension has length K2, the number of diseased cases. This is
because TPs are only possible on diseased cases. For each case the
inferred ROC rating is the highest of all FROC ratings on that case. If a case has
no marks, a finite ROC rating, guaranteed to be smaller than the rating on
any marked case, is assigned to it. The dataset structure is shown below:
NL: Ratings array [1:I, 1:J, 1:(K1+K2), 1], of false positives, FPs
LL: Ratings array [1:I, 1:J, 1:K2, 1], of true positives, TPs
perCase: array [1:K2], number of lesions per diseased case
IDs: array [1:K2, 1], labels of lesions on diseased cases
weights: array [1:K2, 1], weights (or clinical importances) of lesions
dataType: "ROC", the data type
modalityID: [1:I] inherited modality labels
readerID: [1:J] inherited reader labels
Value
An ROC dataset with finite ratings in NL[,,1:K1,1] and LL[,,1:K2,1].
Examples
rocDataSet <- DfFroc2Roc(dataset05)
rocSpDataSet <- DfFroc2Roc(datasetFROCSpC)
## in the following example, because of the smaller number of cases,
## it is easy to see the process at work:
set.seed(1);K1 <- 3;K2 <- 5
mu <- 1;nu <- 0.5;lambda <- 2;zeta1 <- 0
lambda_i <- UtilRSM2Intrinsic(mu,lambda,nu)$lambda_i
nu_i <- UtilRSM2Intrinsic(mu,lambda,nu)$nu_i
Lmax <- 2;Lk2 <- floor(runif(K2, 1, Lmax + 1))
frocDataRaw <- SimulateFrocDataset(mu, lambda_i, nu_i, zeta1, I = 1, J = 1,
K1, K2, perCase = Lk2)
hrData <- DfFroc2Roc(frocDataRaw)
## print("frocDataRaw$ratings$NL[1,1,,] = ")
## print("hrData$ratings$NL[1,1,1:K1,] = ")
## print("frocDataRaw$ratings$LL[1,1,,] = ")
## print("hrData$ratings$LL[1,1,,] = ")
## following is the output
## [1] "frocDataRaw$ratings$NL[1,1,,] = "
## [,1] [,2] [,3] [,4]
## [1,] 2.4046534 0.7635935 -Inf -Inf
## [2,] -Inf -Inf -Inf -Inf
## [3,] 0.2522234 -Inf -Inf -Inf
## [4,] 0.4356833 -Inf -Inf -Inf
## [5,] -Inf -Inf -Inf -Inf
## [6,] -Inf -Inf -Inf -Inf
## [7,] -Inf -Inf -Inf -Inf
## [8,] 0.8041895 0.3773956 0.1333364 -Inf
## > ## print("hrData$ratings$NL[1,1,1:K1,] = ")
## [1] "hrData$ratings$NL[1,1,1:K1,] = "
## [1] 2.4046534 -Inf 0.2522234
## > ## print("frocDataRaw$ratings$LL[1,1,,] = ")
## [1] "frocDataRaw$ratings$LL[1,1,,] = "
## [,1] [,2]
## [1,] -Inf -Inf
## [2,] 1.5036080 -Inf
## [3,] 0.8442045 -Inf
## [4,] 1.0467262 -Inf
## [5,] -Inf -Inf
## > ## print("hrData$ratings$LL[1,1,,] = ")
## [1] "hrData$ratings$LL[1,1,,] = "
## [1] 0.4356833 1.5036080 0.8442045 1.0467262 0.8041895
## Note that the ratings of the first and the last diseased cases came from NL marks
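The highest-rating inference shown in the output above can be reproduced in base R. The sketch below copies the printed FROC ratings into plain matrices and takes the per-case maximum; it is an illustration of the rule, not the package's code. It leaves unmarked cases at -Inf, as in the printed output. Assigning them a finite value below every marked rating, as the Details section describes, would be a further step.

```r
K1 <- 3; K2 <- 5
# FROC ratings copied from the output above; -Inf denotes "no mark"
NL <- rbind(c(2.4046534, 0.7635935, -Inf, -Inf),
            c(-Inf, -Inf, -Inf, -Inf),
            c(0.2522234, -Inf, -Inf, -Inf),
            c(0.4356833, -Inf, -Inf, -Inf),
            c(-Inf, -Inf, -Inf, -Inf),
            c(-Inf, -Inf, -Inf, -Inf),
            c(-Inf, -Inf, -Inf, -Inf),
            c(0.8041895, 0.3773956, 0.1333364, -Inf))
LL <- rbind(c(-Inf, -Inf),
            c(1.5036080, -Inf),
            c(0.8442045, -Inf),
            c(1.0467262, -Inf),
            c(-Inf, -Inf))

# Non-diseased cases: highest NL rating on each case
hrNL <- apply(NL[1:K1, , drop = FALSE], 1, max)

# Diseased cases: highest of all NL and LL ratings on each case
hrLL <- pmax(apply(NL[(K1 + 1):(K1 + K2), , drop = FALSE], 1, max),
             apply(LL, 1, max))

hrNL
hrLL
```

On the first and last diseased cases no lesion was marked, so the inferred rating comes from the NL marks on those cases.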
Simulates an "AUC-equivalent" FROC dataset from an LROC dataset
Description
Simulates a multiple-treatment multiple-reader "AUC-equivalent" FROC dataset from a supplied LROC dataset, e.g., datasetCadLroc.
Usage
DfLroc2Froc(dataset)
Arguments
dataset |
The LROC dataset to be converted to FROC. |
Details
The LROC paradigm always yields a single mark per case. Therefore the equivalent FROC will also have only one mark per case. The NL arrays of the two datasets are identical. The LL array is created by copying the LLCl array of the LROC dataset to the LL array of the FROC dataset, from diseased case index k2 = 1 to k2 = K2. Additionally, the LLIl array of the LROC dataset is copied to the NL array of the FROC dataset, starting at case index k1 = K1+1 to k1 = K1+K2. Any zero ratings are replaced by -Inf. The equivalent FROC dataset has the same HrAuc as the original LROC dataset. See example. The main use of this function is to test the CAD significance testing functions using CAD FROC datasets, which I currently don't have.
Value
The equivalent FROC dataset
Examples
frocDataset <- DfLroc2Froc(datasetCadLroc)
lrocAuc <- UtilFigureOfMerit(datasetCadLroc, FOM = "Wilcoxon")
frocHrAuc <- UtilFigureOfMerit(frocDataset, FOM = "HrAuc")
Convert an LROC dataset to a ROC dataset
Description
Converts an LROC dataset to an ROC dataset
Usage
DfLroc2Roc(dataset)
Arguments
dataset |
The LROC dataset to be converted. |
Details
For the diseased cases one takes the maximum rating on each diseased case, which could be a LL ("true positive" correct localization) or a LL_IL ("true positive" incorrect localization) rating, whichever has the higher rating. For non-diseased cases the NL arrays are identical.
Value
An ROC dataset
Examples
rocDataSet <- DfLroc2Roc(datasetCadLroc)
Read a crossed-treatment data file
Description
Read a crossed-treatment data file, in which the two treatment factors are crossed
Usage
DfReadCrossedModalities(fileName, sequentialNames = FALSE)
Arguments
fileName |
A string specifying the name of the file that contains the dataset, which must be an extended-JAFROC format data file containing an additional treatment factor. |
sequentialNames |
If |
Details
The data format is similar to the JAFROC format (see RJafroc-package).
The difference is that there are two treatment factors. An example is TBA; see the
FROC book, https://dpc10ster.github.io/RJafrocFrocBook/
Value
A dataset with the specified structure, similar to a standard
RJafroc dataset (see RJafroc-package). Because of the extra treatment factor,
NL and LL are each five dimensional arrays. There are also two
treatment IDs: modalityID1 and modalityID2.
References
Thompson JD, Chakraborty DP, Szczepura K, et al. (2016) Effect of reconstruction methods and x-ray tube current-time product on nodule detection in an anthropomorphic thorax phantom: a crossed-treatment JAFROC observer study. Medical Physics. 43(3):1265-1274.
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Read a data file
Description
Read a disk file and create a ROC, FROC or LROC dataset object from it.
Usage
DfReadDataFile(
fileName,
format = "JAFROC",
newExcelFileFormat = FALSE,
lrocForcedMark = NA,
delimiter = ",",
sequentialNames = FALSE
)
Arguments
fileName |
A string specifying the name of the file. The file-extension must match the format specified below. |
format |
A string specifying the format of the data file.
It can be |
newExcelFileFormat |
Logical. Must be true to read LROC data.
This argument only applies to the |
lrocForcedMark |
Logical: For LROC dataset only: is a forced mark required
on every image? The default is |
delimiter |
The string delimiter to be used for the |
sequentialNames |
A logical variable: if |
Value
A dataset with the structure specified in RJafroc-package.
Note
The "MRMC" format is deprecated. For non-JAFROC formats four file
extensions (.csv, .txt, .lrc and .imrmc) are possible,
all of which are restricted to ROC data. Only the iMRMC format is actively
supported, i.e., files with extension .imrmc. Other formats (.csv,
.txt, .lrc) are deprecated. Such files can still be read by this
function and then saved to a JAFROC format file for further analysis within this
package. For non-JAFROC data file formats, the readerID and
modalityID fields must be unique integers.
Examples
fileName <- system.file("extdata", "toyFiles/ROC/rocCr.xlsx",
package = "RJafroc", mustWork = TRUE)
rdrArr1D <- DfReadDataFile(fileName, newExcelFileFormat = TRUE)
fileName <- system.file("extdata", "Roc.xlsx",
package = "RJafroc", mustWork = TRUE)
RocDataXlsx <- DfReadDataFile(fileName)
fileName <- system.file("extdata", "RocData.csv",
package = "RJafroc", mustWork = TRUE)
RocDataCsv<- DfReadDataFile(fileName, format = "MRMC")
fileName <- system.file("extdata", "RocData.imrmc",
package = "RJafroc", mustWork = TRUE)
RocDataImrmc<- DfReadDataFile(fileName, format = "iMRMC")
fileName <- system.file("extdata", "Froc.xlsx",
package = "RJafroc", mustWork = TRUE)
FrocDataXlsx <- DfReadDataFile(fileName, sequentialNames = TRUE)
Save ROC dataset in different formats
Description
Save ROC dataset in other formats so it can be analyzed with alternate software
Usage
DfSaveDataFile(
dataset,
fileName,
format = "MRMC",
dataDescription = "RJafroc dataset converted to imrmc format"
)
Arguments
dataset |
The dataset to be saved. |
fileName |
The file name of the output data file. The extension
of the data file must match the corresponding format, see |
format |
The format of the data file, which can be |
dataDescription |
An optional string variable describing the data file, the
default value is the variable name of |
Examples
## DfSaveDataFile(dataset = dataset02,
## fileName = "rocData2.csv", format = "MRMC")
## DfSaveDataFile(dataset = dataset02,
## fileName = "rocData2.lrc", format = "MRMC",
## dataDescription = "ExampleROCdata1")
## DfSaveDataFile(dataset = dataset02,
## fileName = "rocData2.txt", format = "MRMC",
## dataDescription = "ExampleROCdata2")
## DfSaveDataFile(dataset = dataset02,
## fileName = "dataset05.imrmc", format = "iMRMC",
## dataDescription = "ExampleROCdata3")
Save dataset object as a JAFROC format Excel file
Description
Save a dataset object as a JAFROC format Excel file
Usage
DfWriteExcelDataFile(dataset, fileName)
Arguments
dataset |
The dataset object, see |
fileName |
The file name to save to; the extension of the data file must be .xlsx |
Examples
##DfWriteExcelDataFile(dataset = dataset05, fileName = "rocData2.xlsx")
Fit the binormal model to selected treatment and reader in an ROC dataset
Description
Fit the binormal model-predicted ROC curve for a dataset. This is the R equivalent of ROCFIT or RSCORE
Usage
FitBinormalRoc(dataset, trt = 1, rdr = 1)
Arguments
dataset |
The ROC dataset |
trt |
The desired treatment, default is 1 |
rdr |
The desired reader, default is 1 |
Details
In the binormal model ratings (more accurately the latent decision variables)
from diseased cases are sampled from N(a,1) while ratings for
non-diseased cases are sampled from N(0,b^2). To avoid clutter, error
bars are only shown for the lowest and uppermost operating points. An FROC
dataset is internally converted to a highest rating inferred ROC dataset. Too
many bins containing zero counts will cause the algorithm to fail, so be sure
to bin the data appropriately to fewer bins, where each bin has at least one
count.
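Under the sampling model stated above, the ROC curve and its area follow directly from the normal distribution function. The base-R sketch below uses hypothetical values of a and b, sweeps a threshold to trace the curve, and compares the closed-form area pnorm(a / sqrt(1 + b^2)) with a trapezoidal estimate. It is a sketch of the model only and does not perform any fitting.

```r
a <- 1.5
b <- 0.8   # hypothetical binormal parameters

# Assumed model: diseased ~ N(a, 1), non-diseased ~ N(0, b^2)
zeta <- seq(-5, 7, length.out = 2001)   # sweep of thresholds
FPF <- pnorm(zeta, mean = 0, sd = b, lower.tail = FALSE)
TPF <- pnorm(zeta, mean = a, sd = 1, lower.tail = FALSE)

# Closed-form AUC: P(diseased rating > non-diseased rating);
# the difference of the two ratings is N(a, 1 + b^2)
aucFormula <- pnorm(a / sqrt(1 + b^2))

# Trapezoidal check; FPF decreases with zeta, so reverse both vectors
x <- rev(FPF)
y <- rev(TPF)
aucTrapz <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

c(formula = aucFormula, trapezoid = aucTrapz)
```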
Value
The returned value is a list with the following elements:
a |
The mean of the diseased distribution; the non-diseased distribution is assumed to have zero mean |
b |
The standard deviation of the non-diseased distribution. The diseased distribution is assumed to have unit standard deviation |
zetas |
The binormal model cutoffs, zetas or thresholds |
AUC |
The binormal model fitted ROC-AUC |
StdAUC |
The standard deviation of AUC |
NLLIni |
The initial value of the negative log-likelihood |
NLLFin |
The final value of the negative log-likelihood |
ChisqrFitStats |
The chisquare goodness of fit results |
covMat |
The covariance matrix of the parameters |
fittedPlot |
A ggplot2 object containing the
fitted operating characteristic along with the empirical operating
points. Use |
References
Dorfman DD, Alf E (1969) Maximum-Likelihood Estimation of Parameters of Signal-Detection Theory and Determination of Confidence Intervals - Rating-Method Data, Journal of Mathematical Psychology 6, 487-496.
Grey D, Morgan B (1972) Some aspects of ROC curve-fitting: normal and logistic models. Journal of Mathematical Psychology 9, 128-139.
Examples
## Test with an included ROC dataset
retFit <- FitBinormalRoc(dataset02);## print(retFit$fittedPlot)
## Test with an included FROC dataset; it needs to be binned
## as there are more than 5 discrete ratings levels
binned <- DfBinDataset(dataset05, desiredNumBins = 5, opChType = "ROC")
retFit <- FitBinormalRoc(binned);## print(retFit$fittedPlot)
## Test with single interior point data
fp <- c(rep(1,7), rep(2, 3))
tp <- c(rep(1,5), rep(2, 5))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitBinormalRoc(dataset);## print(retFit$fittedPlot)
## Test with two interior data points
fp <- c(rep(1,7), rep(2, 5), rep(3, 3))
tp <- c(rep(1,3), rep(2, 5), rep(3, 7))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitBinormalRoc(dataset);## print(retFit$fittedPlot)
## Test with TONY data for which chisqr can be calculated
ds <- DfFroc2Roc(dataset01)
retFit <- FitBinormalRoc(ds, 2, 3);## print(retFit$fittedPlot)
retFit$ChisqrFitStats
## Test with included degenerate ROC data
retFit <- FitBinormalRoc(datasetDegenerate);## print(retFit$fittedPlot)
Fit the contaminated binormal model (CBM) to selected treatment and reader in an ROC dataset
Description
Fit the CBM-predicted ROC curve for specified treatment and reader
Usage
FitCbmRoc(dataset, trt = 1, rdr = 1)
Arguments
dataset |
The dataset containing the data |
trt |
The desired treatment, default is 1 |
rdr |
The desired reader, default is 1 |
Details
In CBM ratings from diseased cases are sampled from a mixture distribution
with two components: (1) a normal distribution with mean mu and unit
variance, with integrated area alpha, and (2) a unit-normal
distribution with integrated area 1-alpha. Ratings for non-diseased
cases are sampled from a unit-normal distribution.
ChisqrFitStats is a list containing the chi-square value,
the p-value and the degrees of freedom.
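The mixture described above gives a simple expression for the ROC curve and its area. The base-R sketch below uses hypothetical values of mu and alpha, traces the curve by sweeping a threshold, and checks the area against a trapezoidal estimate. It illustrates the model only, not the fitting algorithm.

```r
mu <- 2
alpha <- 0.7   # hypothetical CBM parameters

zeta <- seq(-6, 8, length.out = 2001)   # sweep of thresholds

# Non-diseased cases: unit normal
FPF <- pnorm(zeta, lower.tail = FALSE)

# Diseased cases: N(mu, 1) with weight alpha, N(0, 1) with weight 1 - alpha
TPF <- alpha * pnorm(zeta, mean = mu, lower.tail = FALSE) +
  (1 - alpha) * pnorm(zeta, lower.tail = FALSE)

# The visible component contributes pnorm(mu / sqrt(2)) to the AUC,
# the invisible component contributes the chance value 0.5
aucFormula <- alpha * pnorm(mu / sqrt(2)) + (1 - alpha) * 0.5

# Trapezoidal check; FPF decreases with zeta, so reverse both vectors
x <- rev(FPF)
y <- rev(TPF)
aucTrapz <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)

c(formula = aucFormula, trapezoid = aucTrapz)
```

Because the invisible component coincides with the non-diseased distribution, TPF is never below FPF at any threshold.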
Value
The return value is a list with the following elements:
mu |
The mean of the visible diseased distribution (the non-diseased distribution has zero mean) |
alpha |
The proportion of diseased cases where the disease is visible |
zetas |
The cutoffs, zetas or thresholds |
AUC |
The AUC of the fitted ROC curve |
StdAUC |
The standard deviation of AUC |
NLLIni |
The initial value of the negative log-likelihood |
NLLFin |
The final value of the negative log-likelihood |
ChisqrFitStats |
The chisquare goodness of fit results |
covMat |
The covariance matrix of the parameters |
fittedPlot |
A ggplot2 object containing the fitted
operating characteristic along with the empirical operating points.
Use |
Note
This algorithm is very robust, much more so than the binormal model.
References
Dorfman DD, Berbaum KS (2000) A contaminated binormal model for ROC data: Part II. A formal model, Acad Radiol, 7:6, 427–437.
Examples
## CPU time 8.7 sec on Ubuntu (#13)
## Test with included ROC data
retFit <- FitCbmRoc(dataset02);## print(retFit$fittedPlot)
## Test with included degenerate ROC data (yes! CBM can fit such data)
retFit <- FitCbmRoc(datasetDegenerate);## print(retFit$fittedPlot)
## Test with single interior point data
fp <- c(rep(1,7), rep(2, 3))
tp <- c(rep(1,5), rep(2, 5))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitCbmRoc(dataset);## print(retFit$fittedPlot)
## Test with two interior data points
fp <- c(rep(1,7), rep(2, 5), rep(3, 3))
tp <- c(rep(1,3), rep(2, 5), rep(3, 7))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitCbmRoc(dataset);
## print(retFit$fittedPlot)
## Test with included ROC data (some bins have zero counts)
retFit <- FitCbmRoc(dataset02, 2, 1);## print(retFit$fittedPlot)
## Test with TONY data for which chisqr can be calculated
ds <- DfFroc2Roc(dataset01)
retFit <- FitCbmRoc(ds, 2, 3);## print(retFit$fittedPlot)
retFit$ChisqrFitStats
Fit CORCBM to a paired ROC dataset
Description
Fit the Correlated Contaminated Binormal Model (CORCBM) to a paired ROC dataset. The ROC dataset has to be formatted as a single treatment, two-reader dataset, even though the actual pairing may be different, see details.
Usage
FitCorCbm(dataset)
Arguments
dataset |
A paired ROC dataset |
Details
The conditions (X, Y) can be two readers interpreting images in the same
treatment, the same reader interpreting images in different treatments, or
different readers interpreting images in 2 different treatments. Function
DfExtractCorCbmDataset can be used to construct a dataset suitable for
FitCorCbm. With reference to the returned values, and assuming R bins
in condition X and L bins in condition Y,
FPCounts is the R x L matrix containing the counts for non-diseased cases,
TPCounts is the R x L matrix containing the counts for diseased cases;
muX, muY, alphaX, alphaY, rhoNor, rhoAbn2 are
the CORCBM parameters; aucX, aucY are the AUCs in the two conditions;
stdAucX, stdAucY are the corresponding standard errors; stdErr
contains the standard errors of the parameters of the model; areaStat,
areaPval, covMat are the area-statistic, the p-value and the covariance
matrix of the parameters. If a parameter approaches a limit, e.g., rhoNor
= 0.9999, it is held constant at near the limiting value and the covariance matrix
has one less dimension (along each edge) for each parameter that is held constant.
The indices of the parameters held fixed are in fitCorCbmRet$fixParam.
Value
The return value is a list containing three objects:
fitCorCbmRet |
list( |
stats |
list( |
fittedPlot |
The fitted plot with operating points, error bars, for both conditions |
References
Zhai X, Chakraborty DP (2017) A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.
Fit the radiological search model (RSM) to an ROC dataset
Description
Fit an RSM-predicted ROC curve to a binned single-treatment single-reader ROC dataset
Usage
FitRsmRoc(binnedRocData, lesDistr, trt = 1, rdr = 1)
Arguments
binnedRocData |
The binned ROC dataset containing the data |
lesDistr |
The lesion distribution 1D array. |
trt |
The selected treatment, default is 1 |
rdr |
The selected reader, default is 1 |
Details
If dataset is FROC, first convert it to ROC, using DfFroc2Roc. MLE ROC algorithms
require binned datasets. Use DfBinDataset to perform the binning prior to calling
this function.
In the RSM: (1) The (random) number of latent NLs per case is Poisson distributed
with mean parameter lambda, and the corresponding ratings are sampled from
N(0,1). (2) The (random) number of latent LLs per diseased case is
binomial distributed with success probability nu and trial size equal to
the number of lesions in the case, and the corresponding ratings are sampled from
N(mu,1). (3) A latent NL or LL is actually marked if its rating exceeds
the lowest threshold zeta1. To avoid clutter, error bars are only shown for the
lowest and uppermost operating points. Because of the extra parameter, and the
requirement to have five counts, the chi-square statistic often cannot be calculated.
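The three sampling steps can be sketched as a small base-R simulation. The parameter values, the lesion counts and the helper functions sampleNL, sampleLL and hr are hypothetical, and the sketch is not the package's simulator. It generates marks for a few cases and reduces each case to its highest rating, with -Inf standing for a case with no marks.

```r
set.seed(1)
mu <- 2; lambda <- 1; nu <- 0.8; zeta1 <- -0.5  # hypothetical RSM parameters
K1 <- 5                                         # number of non-diseased cases
lesionsPerCase <- c(1, 2, 1, 3, 2)              # lesions on each diseased case

# (1) Latent NLs on one case: Poisson number of sites, N(0,1) ratings;
# (3) only ratings exceeding zeta1 are marked
sampleNL <- function() {
  z <- rnorm(rpois(1, lambda))
  z[z > zeta1]
}

# (2) Latent LLs on one diseased case with L lesions: binomial number of
# sites, N(mu,1) ratings; (3) again only ratings above zeta1 are marked
sampleLL <- function(L) {
  z <- rnorm(rbinom(1, size = L, prob = nu), mean = mu)
  z[z > zeta1]
}

# Highest rating on a case; -Inf when the case has no marks
hr <- function(z) if (length(z) > 0) max(z) else -Inf

fpRatings <- replicate(K1, hr(sampleNL()))
tpRatings <- sapply(lesionsPerCase, function(L) hr(c(sampleNL(), sampleLL(L))))

fpRatings
tpRatings
```

Diseased cases can carry both NL and LL marks, which is why their highest rating is taken over the two sets combined.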
Value
The return value is a list with the following elements:
mu |
The mean of the diseased distribution relative to the non-diseased one |
lambda |
The Poisson parameter describing the distribution of latent NLs per case |
nu |
The binomial success probability describing the distribution of latent LLs per diseased case |
zetas |
The RSM cutoffs, zetas or thresholds |
AUC |
The RSM fitted ROC-AUC |
StdAUC |
The standard deviation of AUC |
NLLIni |
The initial value of the negative log-likelihood |
NLLFin |
The final value of the negative log-likelihood |
ChisqrFitStats |
The chisquare goodness of fit results |
covMat |
The covariance matrix of the parameters |
fittedPlot |
A ggplot2 object containing the fitted operating characteristic along with the empirical operating points. Use print to display the object. |
References
Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm. Phys Med Biol 51, 3449-3462.
Chakraborty DP (2006) ROC Curves predicted by a model of visual search. Phys Med Biol 51, 3463–3482.
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Examples
## Test with included ROC data (some bins have zero counts)
lesDistr <- UtilLesionDistrVector(dataset02)
retFit <- FitRsmRoc(dataset02, lesDistr)
## print(retFit$fittedPlot)
## Test with included degenerate ROC data
lesDistr <- UtilLesionDistrVector(datasetDegenerate)
retFit <- FitRsmRoc(datasetDegenerate, lesDistr)
## Test with single interior point data
fp <- c(rep(1,7), rep(2, 3))
tp <- c(rep(1,5), rep(2, 5))
binnedRocData <- Df2RJafrocDataset(fp, tp)
lesDistr <- UtilLesionDistrVector(binnedRocData)
retFit <- FitRsmRoc(binnedRocData, lesDistr)
## Test with two interior data points
fp <- c(rep(1,7), rep(2, 5), rep(3, 3))
tp <- c(rep(1,3), rep(2, 5), rep(3, 7))
binnedRocData <- Df2RJafrocDataset(fp, tp)
lesDistr <- UtilLesionDistrVector(binnedRocData)
retFit <- FitRsmRoc(binnedRocData, lesDistr)
## Test with three interior data points
fp <- c(rep(1,12), rep(2, 5), rep(3, 3), rep(4, 5)) #25
tp <- c(rep(1,3), rep(2, 5), rep(3, 7), rep(4, 10)) #25
binnedRocData <- Df2RJafrocDataset(fp, tp)
lesDistr <- UtilLesionDistrVector(binnedRocData)
retFit <- FitRsmRoc(binnedRocData, lesDistr)
## test for TONY data, i = 2 and j = 3
## only case permitting chi-square calculation
lesDistr <- UtilLesionDistrVector(dataset01)
rocData <- DfFroc2Roc(dataset01)
retFit <- FitRsmRoc(rocData, lesDistr, trt = 2, rdr = 3)
## print(retFit$fittedPlot)
retFit$ChisqrFitStats
Plot binormal fit
Description
Plot the binormal-predicted ROC curve with provided parameters
Usage
PlotBinormalFit(a, b)
Arguments
a |
vector: the mean(s) of the diseased distribution(s). |
b |
vector: the standard deviation(s) of the diseased distribution(s). |
Details
a and b must have the same length. The predicted ROC curve for each a and b pair will be plotted.
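As a rough illustration of what one curve per (a, b) pair means, the sketch below evaluates a binormal ROC curve in base R. It assumes the conventional binormal parameterization, TPF = Phi(a + b * Phi^-1(FPF)), with AUC = Phi(a / sqrt(1 + b^2)). The argument descriptions above are worded differently, so the exact mapping to PlotBinormalFit is an assumption; binormal_roc and binormal_auc are hypothetical helper names.

```r
# Sketch of binormal ROC curves, assuming the conventional (a, b) parameterization.
# binormal_roc and binormal_auc are hypothetical helpers, not package functions.
binormal_roc <- function(a, b, fpf = seq(0, 1, length.out = 101)) {
  tpf <- pnorm(a + b * qnorm(fpf))   # TPF as a function of FPF
  data.frame(FPF = fpf, TPF = tpf)
}
binormal_auc <- function(a, b) pnorm(a / sqrt(1 + b^2))

# One curve per (a, b) pair, mirroring the vectorized call in the Examples
curves <- Map(binormal_roc, c(1, 2), c(0.5, 0.5))
aucs <- binormal_auc(c(1, 2), c(0.5, 0.5))
```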
Value
A ggplot2 object of the plotted ROC curve(s) is returned. Use the print function to display the saved object.
Examples
binormalPlot <- PlotBinormalFit(c(1, 2), c(0.5, 0.5))
## print(binormalPlot)
Plot CBM fitted curve
Description
Plot the CBM-predicted ROC curve with provided CBM parameters
Usage
PlotCbmFit(mu, alpha)
Arguments
mu |
vector: the mean(s) of the z-samples of the diseased distribution(s) where the disease is visible |
alpha |
vector: the proportion(s) of the diseased distribution(s) where the disease is visible |
Details
mu and alpha must have equal length. The predicted ROC curve for each mu and alpha pair will be plotted.
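The CBM curve for a single (mu, alpha) pair can be sketched in base R as follows. The sketch assumes the usual contaminated binormal setup: non-diseased z-samples follow N(0,1), a fraction alpha of diseased cases (disease visible) follow N(mu,1), and the remaining diseased cases follow N(0,1). The helper names cbm_roc and cbm_auc are hypothetical.

```r
# Sketch of the CBM-predicted ROC curve under the stated assumptions.
# cbm_roc and cbm_auc are hypothetical helpers, not package functions.
cbm_roc <- function(mu, alpha, zeta = seq(-5, 8, by = 0.05)) {
  fpf <- pnorm(-zeta)                                          # non-diseased cases
  tpf <- alpha * pnorm(mu - zeta) + (1 - alpha) * pnorm(-zeta) # mixture for diseased cases
  data.frame(FPF = fpf, TPF = tpf)
}
# Area under the CBM curve: chance level plus the visible-disease contribution
cbm_auc <- function(mu, alpha) 0.5 + alpha * (pnorm(mu / sqrt(2)) - 0.5)

roc1 <- cbm_roc(mu = 1, alpha = 0.5)
auc1 <- cbm_auc(mu = 1, alpha = 0.5)
```

Because the diseased mixture never falls below the non-diseased distribution, the resulting curve stays on or above the chance diagonal.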
Value
A ggplot2 object of the plotted ROC curve(s)
References
Dorfman DD, Berbaum KS (2000) A contaminated binormal model for ROC data: Part II. A formal model, Acad Radiol 7, 427–437.
Examples
cbmPlot <- PlotCbmFit(c(1, 2), c(0.5, 0.5))
## print(cbmPlot)
Plot empirical operating characteristics, ROC, FROC or LROC
Description
Plot empirical operating characteristics (operating points connected by straight lines) for specified modalities and readers, or, if desired, plots (no operating points) averaged over specified modalities and / or readers.
Usage
PlotEmpiricalOperatingCharacteristics(
dataset,
trts = 1,
rdrs = 1,
opChType,
legend.position = c(0.8, 0.3),
maxDiscrete = 10
)
Arguments
dataset |
Dataset object. |
trts |
List or vector: integer indices of modalities to be plotted. Default is 1. |
rdrs |
List or vector: integer indices of readers to be plotted. Default is 1. |
opChType |
Type of operating characteristic to be plotted:
|
legend.position |
Where to position the legend. The default is c(0.8, 0.3), i.e., 0.8 rightward and 0.3 upward (the plot is a unit square). |
maxDiscrete |
maximum number of op. points in order to be considered discrete and to be displayed by symbols and connecting lines; any more points will be regarded as continuous and only connected by lines; default is 10. |
Details
The trts and rdrs are vectors or lists of integer indices, not the corresponding string IDs. For example, if the string ID of the first reader is "0", the value in rdrs should be 1, not 0. The legend will display the string IDs.
If both trts and rdrs are vectors, all combinations of modalities and readers are plotted. See Example 1.
If both trts and rdrs are lists, they must have the same length. Only the combinations of modality and reader at the same position in their respective lists are plotted. If some elements of the modalities and / or readers lists are vectors, the average operating characteristic over the implied modalities and / or readers is plotted. See Example 2.
For LROC datasets, opChType can be "ROC" or "LROC".
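Since trts and rdrs take integer indices rather than string IDs, it can help to translate IDs into positions before calling the function. The sketch below does this in base R with match; the readerIDs vector is typed in by hand from the reader IDs listed for dataset04 in Example 3.

```r
# String IDs of the readers, copied by hand from Example 3 (dataset04)
readerIDs <- c("1", "3", "4", "5")

# Translate the string IDs "3" and "4" into the integer indices expected by rdrs
rdrs <- match(c("3", "4"), readerIDs)
rdrs
# [1] 2 3
```

The result matches Example 2, where the 2nd and 3rd readers carry the string IDs "3" and "4".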
Value
A ggplot2 object containing the operating characteristic plot(s) and a data frame containing the points defining the operating characteristics.
Plot |
ggplot2 object. For continuous or averaged data, operating characteristics curves are plotted without showing operating points. For binned (individual) data, both operating points and connecting lines are shown. To avoid clutter, if there are more than 20 operating points, they are not shown. |
Points |
Data frame with four columns: abscissa, ordinate,
|
Examples
## Example 1
## Plot individual empirical ROC plots for all combinations of modalities
## 1 and 2 and readers 1, 2 and 3. Six operating characteristics are plotted.
ret <- PlotEmpiricalOperatingCharacteristics(dataset =
dataset02, trts = c(1:2), rdrs = c(1:3), opChType = "ROC")
## print(ret$Plot)
## Example 2
## Empirical wAFROC plots, consisting of
## three sub-plots:
## (1) sub-plot, red, with operating points, for the 1st modality (string ID "1") and the 2nd
## reader (string ID "3"), labeled "M:1 R:3"
## (2) sub-plot, green, no operating points, for the 2nd modality (string ID "2") AVERAGED
## over the 2nd and 3rd readers (string IDs "3" and "4"), labeled "M:2 R: 3 4"
## (3) sub-plot, blue, no operating points, AVERAGED over the first two modalities
## (string IDs "1" and "2") AND over the 1st, 2nd and 3rd readers
## (string IDs "1", "3" and "4"), labeled "M: 1 2 R: 1 3 4"
plotT <- list(1, 2, c(1:2))
plotR <- list(2, c(2:3), c(1:3))
ret <- PlotEmpiricalOperatingCharacteristics(dataset = dataset04, trts = plotT,
rdrs = plotR, opChType = "wAFROC")
## print(ret$Plot)
## Example 3
## Correspondences between indices and string identifiers for modalities and
## readers in this dataset (apparently reader "2" did not complete the study).
## names(dataset04$descriptions$readerID)
## [1] "1" "3" "4" "5"
RSM predicted operating characteristics, ROC pdfs and AUCs
Description
Visualize RSM predicted ROC, AFROC, wAFROC and FROC curves, and ROC pdfs, given equal-length arrays of search model parameters: mu, lambda, nu and zeta1.
Usage
PlotRsmOperatingCharacteristics(
mu,
lambda,
nu,
zeta1,
lesDistr = 1,
relWeights = 0,
OpChType = "ALL",
legendPosition = c(1, 0),
legendDirection = "horizontal",
legendJustification = c(0, 1),
nlfRange = NULL,
llfRange = NULL,
nlfAlpha = NULL
)
Arguments
mu |
Array: the RSM mu parameter. |
lambda |
Array: the RSM lambda parameter. |
nu |
Array: the RSM nu parameter. |
zeta1 |
Array, the lowest reporting threshold; if missing the default is an array of -3s. |
lesDistr |
Array: the probability mass function of the lesion distribution for diseased cases. The default is 1. See UtilLesionDistrVector. |
relWeights |
The relative weights of the lesions; a vector of
length equal to |
OpChType |
The type of operating characteristic desired: can be " |
legendPosition |
The positioning of the legend: " |
legendDirection |
Allows control on the direction of the legend;
|
legendJustification |
Where to position the legend, default is bottom right corner c(0,1) |
nlfRange |
This applies to FROC plot only. The x-axis range, e.g., c(0,2),
for FROC plot. Default is " |
llfRange |
This applies to FROC plot only. The y-axis range, e.g., c(0,1),
for FROC plot. Default is " |
nlfAlpha |
Upper limit of the integrated area under the FROC plot.
Default is " |
Details
RSM is the Radiological Search Model described in the book. This function is vectorized with respect to the first 4 arguments. The elements of lesDistr must sum to one. To indicate that all diseased cases contain 4 lesions, set lesDistr = c(0,0,0,1).
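A lesDistr vector can be built by hand from per-case lesion counts. The base-R sketch below uses a hypothetical perCase vector; element L of the result is the fraction of diseased cases with L lesions, so the elements sum to one. For an actual dataset object, UtilLesionDistrVector (referenced above) is the documented route.

```r
# Hypothetical numbers of lesions in ten diseased cases (illustrative only)
perCase <- c(1, 2, 2, 1, 4, 2, 1, 3, 2, 4)

# Fraction of diseased cases with 1, 2, ..., max(perCase) lesions
lesDistr <- tabulate(perCase, nbins = max(perCase)) / length(perCase)
lesDistr
# [1] 0.3 0.4 0.1 0.2

# All diseased cases containing exactly 4 lesions gives c(0, 0, 0, 1)
allFour <- tabulate(rep(4, 5), nbins = 4) / 5
```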
Value
A list containing five ggplot2 objects (ROCPlot, AFROCPlot, wAFROCPlot, FROCPlot and PDFPlot) and four area measures (each of which can have up to two elements): the areas under the search model predicted ROC, AFROC, wAFROC and FROC curves in up to two treatments.
ROCPlot |
The predicted ROC plots |
AFROCPlot |
The predicted AFROC plots |
wAFROCPlot |
The predicted wAFROC plots |
FROCPlot |
The predicted FROC plots |
PDFPlot |
The predicted ROC pdf plots, highest rating generated |
aucROC |
The predicted ROC AUCs, highest rating generated |
aucAFROC |
The predicted AFROC AUCs |
aucwAFROC |
The predicted wAFROC AUCs |
aucFROC |
The predicted FROC AUCs |
References
Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449-3462.
Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.
Chakraborty, DP, Yoon, HJ (2008) Operating characteristics predicted by models for diagnostic tasks involving lesion localization, Med Phys, 35:2, 435.
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples (CRC Press, Boca Raton, FL). https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Examples
## Following example is for mu = 2, lambda = 1, nu = 0.6, in one treatment and
## mu = 3, lambda = 1.5, nu = 0.8, in the other treatment. 20% of the diseased
## cases have a single lesion, 40% have two lesions, 10% have 3 lesions,
## and 30% have 4 lesions.
PlotRsmOperatingCharacteristics(mu = c(2, 3), lambda = c(1, 1.5), nu = c(0.6, 0.8),
lesDistr = c(0.2, 0.4, 0.1, 0.3), legendPosition = "bottom")
RSM predicted FROC ordinate
Description
RSM predicted FROC ordinate
Usage
RSM_LLF(z, mu, nu)
Arguments
z |
The z-sample value at which to evaluate the FROC ordinate. |
mu |
The RSM mu parameter. |
nu |
The RSM nu prime parameter. |
Value
yFROC
Examples
RSM_LLF(1,1,0.5)
RSM predicted FROC abscissa
Description
RSM predicted FROC abscissa
Usage
RSM_NLF(z, lambda)
Arguments
z |
The z-sample value at which to evaluate the FROC abscissa. |
lambda |
The RSM lambda parameter. |
Value
xFROC
Examples
RSM_NLF(1,1)
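The FROC abscissa and ordinate covered by this entry and the preceding one can be written down directly from the search model. The sketch below assumes that lambda is the Poisson mean of latent NLs per case and that nu is the probability that a lesion produces a latent LL; the package functions may apply a different parameterization internally, so the helper names rsm_nlf and rsm_llf are hypothetical and their values are not claimed to reproduce RSM_NLF and RSM_LLF.

```r
# Sketch of RSM-predicted FROC coordinates under the stated assumptions.
# rsm_nlf and rsm_llf are hypothetical helpers, not package functions.
rsm_nlf <- function(z, lambda) lambda * pnorm(-z)   # expected NL marks per case above z
rsm_llf <- function(z, mu, nu) nu * pnorm(mu - z)   # expected fraction of lesions marked above z

z <- seq(-3, 3, by = 0.5)
froc <- data.frame(NLF = rsm_nlf(z, lambda = 1),
                   LLF = rsm_llf(z, mu = 1, nu = 0.5))
```

Under these assumptions the curve ends at (lambda, nu) as z decreases to -Inf, instead of continuing indefinitely along the NLF axis.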
RSM predicted ROC-rating pdf for diseased cases
Description
RSM predicted ROC-rating pdf for diseased cases
Usage
RSM_pdfD(z, mu, lambda, nu, lesDistr)
Arguments
z |
The z-sample value at which to evaluate the pdf. |
mu |
The mu parameter of the RSM. |
lambda |
The RSM lambda parameter. |
nu |
The RSM nu parameter. |
lesDistr |
The lesion distribution 1D vector. |
Value
Examples
lesDistr <- c(0.5, 0.5)
RSM_pdfD(1,1,1,0.9, lesDistr)
lesDistr <- c(0.2, 0.3, 0.5)
RSM_pdfD(1,1,1,0.5, lesDistr)
RSM predicted ROC-rating pdf for non-diseased cases
Description
RSM predicted ROC-rating pdf for non-diseased cases
Usage
RSM_pdfN(z, lambda)
Arguments
z |
The z-sample value at which to evaluate the pdf. |
lambda |
The (physical) lambda parameter of the RSM. |
Value
Examples
RSM_pdfN(1,1)
RSM predicted wAFROC ordinate
Description
RSM predicted wAFROC ordinate
Usage
RSM_wLLF(zeta, mu, nu, lesDistr, relWeights)
Arguments
zeta |
The zeta value at which to evaluate the wAFROC ordinate. |
mu |
The RSM mu parameter. |
nu |
The RSM nu prime parameter. |
lesDistr |
Lesion distribution vector. |
relWeights |
Relative lesion weights vector. |
Value
wLLF
Examples
RSM_wLLF(1,1,0.5, lesDistr = c(0.5, 0.4, 0.1), relWeights = c(0.7, 0.2, 0.1))
RSM predicted ROC-abscissa as function of z
Description
RSM predicted ROC-abscissa as function of z
Usage
RSM_xROC(z, lambda)
Arguments
z |
The z-sample at which to evaluate the ROC-abscissa. |
lambda |
The (physical) lambda parameter of the RSM. |
Value
xROC, the abscissa of the ROC
Examples
RSM_xROC(c(-Inf,0.1,0.2,0.3),1)
RSM predicted ROC-ordinate as function of z
Description
RSM predicted ROC-ordinate as function of z
Usage
RSM_yROC(z, mu, lambda, nu, lesDistr)
Arguments
z |
The z-sample value at which to evaluate the ROC-ordinate. |
mu |
The mu parameter (of the RSM). |
lambda |
The RSM lambda parameter. |
nu |
The RSM nu parameter. |
lesDistr |
The 1D lesion distribution vector. |
Value
yROC, the ordinate of the ROC
Examples
lesDistr <- c(0.1,0.3,0.6)
RSM_yROC(c(-Inf,0.1,0.2,0.3), 1, 1, 0.9, lesDistr)
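The ROC abscissa and ordinate can be derived from the search model by asking when the highest rating on a case exceeds z. The sketch below assumes lambda is the Poisson mean of latent NLs per case, nu is the probability that a lesion yields a latent LL, NL ratings follow N(0,1), LL ratings follow N(mu,1), and all ratings are independent. The package functions may use a different parameterization, so rsm_xroc and rsm_yroc are hypothetical helpers and are not claimed to reproduce RSM_xROC and RSM_yROC.

```r
# Sketch of RSM-predicted ROC coordinates under the stated assumptions.
# rsm_xroc and rsm_yroc are hypothetical helpers, not package functions.

# FPF: probability that at least one latent NL on a non-diseased case exceeds z
rsm_xroc <- function(z, lambda) 1 - exp(-lambda * pnorm(-z))

# TPF: one minus the probability that no NL and no LL exceeds z,
# averaged over the lesion distribution lesDistr
rsm_yroc <- function(z, mu, lambda, nu, lesDistr) {
  L <- seq_along(lesDistr)                      # possible numbers of lesions per case
  sapply(z, function(zi) {
    pNoNL <- exp(-lambda * pnorm(-zi))          # no NL rated above zi
    pNoLL <- (1 - nu * pnorm(mu - zi))^L        # no LL rated above zi, for each L
    1 - pNoNL * sum(lesDistr * pNoLL)
  })
}

z <- c(-Inf, 0.1, 0.2, 0.3)
x <- rsm_xroc(z, lambda = 1)
y <- rsm_yroc(z, mu = 1, lambda = 1, nu = 0.9, lesDistr = c(0.1, 0.3, 0.6))
```

Under these assumptions the curve does not reach (1, 1): at z = -Inf the abscissa equals 1 - exp(-lambda), because some cases generate no marks at all.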
Simulate paired binned data for testing FitCorCbm
Description
Simulates a single-treatment, 2-reader binned ROC dataset according to the CORCBM model, for the purpose of testing the fitting program FitCorCbm.
Usage
SimulateCorCbmDataset(
seed = 123,
K1 = 50,
K2 = 50,
desiredNumBins = 5,
muX = 1.5,
muY = 3,
alphaX = 0.4,
alphaY = 0.7,
rhoNor = 0.3,
rhoAbn2 = 0.8
)
Arguments
seed |
The seed variable, default is 123; set to NULL for truly random seed |
K1 |
The number of non-diseased cases, default is 50 |
K2 |
The number of diseased cases, default is 50 |
desiredNumBins |
The desired number of bins; default is 5 |
muX |
The CBM mu parameter in condition X |
muY |
The CBM mu parameter in condition Y |
alphaX |
The CBM alpha parameter in condition X |
alphaY |
The CBM alpha parameter in condition Y |
rhoNor |
The correlation of non-diseased case z-samples |
rhoAbn2 |
The correlation of diseased case z-samples, when disease is visible in both conditions |
Details
X and Y refer to the two arms of the pairing. muX and alphaX refer to the univariate CBM parameters in condition X (muY and alphaY likewise in condition Y), rhoNor is the correlation of ratings of non-diseased cases and rhoAbn2 is the correlation of ratings of diseased cases when disease is visible in both conditions. The ROC data is binned into desiredNumBins bins (5 by default) in each condition. See referenced publication.
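To make the role of rhoNor concrete, the base-R sketch below draws correlated z-sample pairs for non-diseased cases and bins them into five rating bins. It covers only the non-diseased part of the model, and the quantile-based binning is an assumption chosen for illustration; the package's own binning rule is not described here.

```r
# Sketch: correlated ratings for K1 non-diseased cases in conditions X and Y.
set.seed(123)
K1 <- 50
rhoNor <- 0.3

z1 <- rnorm(K1)
z2 <- rnorm(K1)
x <- z1                                        # condition X z-samples, N(0, 1)
y <- rhoNor * z1 + sqrt(1 - rhoNor^2) * z2     # condition Y, N(0, 1), correlation rhoNor with x

# Assumed binning rule: five bins with cut points at pooled quantiles
breaks <- quantile(c(x, y), probs = seq(0, 1, length.out = 6))
binX <- cut(x, breaks = breaks, include.lowest = TRUE, labels = FALSE)
binY <- cut(y, breaks = breaks, include.lowest = TRUE, labels = FALSE)
```

With only 50 cases the sample correlation of x and y can differ noticeably from rhoNor, which is one reason the Examples also mention a much larger simulation.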
Value
The return value is the desired dataset, suitable for testing FitCorCbm
References
Zhai X, Chakraborty DP (2017) A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.
Examples
dataset <- SimulateCorCbmDataset()
## this takes very long
## dataset <- SimulateCorCbmDataset(K1 = 5000, K2 = 5000)
Simulates an MRMC uncorrelated FROC dataset using the RSM
Description
Simulates an uncorrelated MRMC FROC dataset for specified numbers of readers and treatments
Usage
SimulateFrocDataset(mu, lambda, nu, zeta1, I, J, K1, K2, perCase, seed = NULL)
Arguments
mu |
The mu parameter of the RSM |
lambda |
The RSM lambda parameter |
nu |
The RSM nu parameter |
zeta1 |
The lowest reporting threshold |
I |
The number of treatments |
J |
The number of readers |
K1 |
The number of non-diseased cases |
K2 |
The number of diseased cases |
perCase |
A K2 length array containing the numbers of lesions per diseased case |
seed |
The initial seed for the random number generator, the default is NULL. |
Details
See book chapters on the Radiological Search Model (RSM) for details. In this code correlations between ratings on the same case are assumed to be zero.
Value
The return value is an FROC dataset.
References
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Examples
set.seed(1)
K1 <- 5;K2 <- 7;
maxLL <- 2;perCase <- floor(runif(K2, 1, maxLL + 1))
mu <- 1;lambda <- 1;nu <- 0.99 ;zeta1 <- -1
I <- 2; J <- 5
frocDataRaw <- SimulateFrocDataset(
mu = mu, lambda = lambda, nu = nu, zeta1 = zeta1,
I = I, J = J, K1 = K1, K2 = K2, perCase = perCase )
## plot the data
ret <- PlotEmpiricalOperatingCharacteristics(frocDataRaw, opChType = "FROC")
## print(ret$Plot)
Simulates an "AUC-equivalent" FROC dataset from an LROC dataset
Description
Simulates a multiple-treatment multiple-reader "AUC-equivalent" FROC dataset from a supplied LROC dataset, e.g., datasetCadLroc.
Usage
SimulateFrocFromLrocDataset(dataset)
Arguments
dataset |
The LROC dataset to be converted to FROC. |
Details
The LROC paradigm always yields a single mark per case. Therefore the equivalent FROC will also have only one mark per case. The NL arrays of the two datasets are identical. The LL array is created by copying the LLCl array of the LROC dataset to the LL array of the FROC dataset, from diseased case index k2 = 1 to k2 = K2. Additionally, the LLIl array of the LROC dataset is copied to the NL array of the FROC dataset, from case index k1 = K1+1 to k1 = K1+K2. Any zero ratings are replaced by -Inf. The equivalent FROC dataset has the same HrAuc as the original LROC dataset. See example. The main use of this function is to test the CAD significance testing functions using CAD FROC datasets, which are not currently available to the author.
Value
The equivalent FROC dataset
Examples
frocDataset <- SimulateFrocFromLrocDataset(datasetCadLroc)
lrocAuc <- UtilFigureOfMerit(datasetCadLroc, FOM = "Wilcoxon")
frocHrAuc <- UtilFigureOfMerit(frocDataset, FOM = "HrAuc")
testthat::expect_equal(lrocAuc, frocHrAuc)
Simulates an uncorrelated LROC dataset using the RSM
Description
Simulates an uncorrelated LROC dataset for specified numbers of readers and treatments
Usage
SimulateLrocDataset(mu, lambda, nu, zeta1, I, J, K1, K2, lesionVector)
Arguments
mu |
The mu parameter of the RSM |
lambda |
The RSM lambda parameter |
nu |
The RSM nu parameter |
zeta1 |
The lowest reporting threshold |
I |
The number of treatments |
J |
The number of readers |
K1 |
The number of non-diseased cases |
K2 |
The number of diseased cases |
lesionVector |
A K2 length array containing the numbers of lesions per diseased case |
Details
See book chapters on the Radiological Search Model (RSM) for details. The approach is to first simulate an FROC dataset and then convert it to an LROC dataset. The correlations between FROC ratings on the same case are assumed to be zero.
Value
The return value is an LROC dataset.
References
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Examples
set.seed(1)
K1 <- 5
K2 <- 5
mu <- 2
lambda <- 1
lesionVector <- rep(1, 5)
nu <- 0.8
zeta1 <- -3
frocData <- SimulateFrocDataset(mu, lambda, nu, zeta1, I = 2, J = 5, K1, K2, lesionVector)
lrocData <- DfFroc2Lroc(frocData)
Simulates a binormal model ROC dataset
Description
Simulates an uncorrelated binormal model ROC factorial dataset
Usage
SimulateRocDataset(I = 1, J = 1, K1, K2, a, b, seed = NULL)
Arguments
I |
The number of modalities, default is 1 |
J |
The number of readers, default is 1 |
K1 |
The number of non-diseased cases |
K2 |
The number of diseased cases |
a |
The |
b |
The |
seed |
The initial seed, default is NULL, which results in a random seed |
Details
See book Chapter 6 for details
Value
An ROC dataset
References
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Examples
K1 <- 5;K2 <- 7;a <- 1.5;b <- 0.5
rocDataRaw <- SimulateRocDataset(K1 = K1, K2 = K2, a = a, b = b)
RSM fitted model for FROC sample size
Description
RSM fitted model for FROC sample size
Usage
SsFrocNhRsmModel(dataset, lesDistr)
Arguments
dataset |
The pilot dataset object representing a NH (ROC or FROC) dataset. |
lesDistr |
A 1D array containing the probability mass function of number of lesions per diseased case in the pivotal FROC study. |
Details
If the dataset is FROC, it is converted to an ROC dataset. The dataset is automatically binned. The search model is used to fit each treatment-reader combination. The median value of each parameter is computed and returned by the function (3 values). These are used to compute predicted wAFROC and ROC FOMs over a range of values of deltaMu, which are fitted by a straight line constrained to pass through the origin. The scale factor and R2 are returned. The scale factor is the value by which the ROC effect size must be multiplied to get the wAFROC effect size. See https://dpc10ster.github.io/RJafrocQuickStart/froc-sample-size.html for vignettes explaining the FROC sample size estimation procedure.
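The straight-line fit through the origin can be sketched with lm in base R. The effect-size numbers below are hypothetical placeholders, not output of the package; the slope of the intercept-free fit plays the role of the scale factor.

```r
# Hypothetical ROC and wAFROC effect sizes over a range of deltaMu values
# (illustrative numbers only, not produced by the package)
esRoc    <- c(0.010, 0.020, 0.030, 0.040, 0.050)
eswAfroc <- c(0.021, 0.038, 0.062, 0.079, 0.101)

# "0 +" removes the intercept, constraining the line to pass through the origin
fit <- lm(eswAfroc ~ 0 + esRoc)

scaleFactor <- unname(coef(fit))   # multiplies the ROC effect size
R2 <- summary(fit)$r.squared       # goodness of the straight-line fit
```

Note that for an intercept-free model, summary.lm reports R-squared relative to zero rather than to the mean of the response, so it is not directly comparable to the R-squared of a fit that includes an intercept.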
Value
A list containing:
-
mu
, the mu parameter of the NH model. -
lambda
, the lambda parameter of the NH model. -
nu
, the nu parameter of the NH model. -
scaleFactor
, the scaling factor that multiplies the ROC effect size to get wAFROC effect size. -
R2
, the R2 of the fit.
Examples
## Examples with CPU or elapsed time > 5s
## user system elapsed
## SsFrocNhRsmModel 8.102 0.023 8.135
## SsFrocNhRsmModel(DfExtractDataset(dataset04, trts = c(1,2)), c(0.69, 0.2, 0.11))
Statistical power for specified numbers of readers and cases
Description
Calculate the statistical power for specified numbers of readers J, cases K, analysis method and DBM or OR variance components
Usage
SsPowerGivenJK(
dataset,
...,
FOM,
J,
K,
effectSize = NULL,
method = "OR",
analysisOption = "RRRC",
LegacyCode = FALSE,
alpha = 0.05
)
Arguments
dataset |
The pilot dataset. If set to NULL then variance components must be supplied. |
... |
Optional variance components. These are needed if |
FOM |
The figure of merit |
J |
The number of readers in the pivotal study. |
K |
The number of cases in the pivotal study. |
effectSize |
The effect size to be used in the pivotal study. Default is NULL, which uses the observed effect size in the pilot dataset. Must be supplied if dataset is set to NULL and variance components are supplied. |
method |
"OR" (the default) or "DBM" (but see |
analysisOption |
Desired generalization, "RRRC" (the default), "FRRC", "RRFC" or "ALL". RRFC = random reader fixed case, etc. |
LegacyCode |
Logical, defaults to |
alpha |
The significance level, default is 0.05. |
Details
The default effectSize uses the observed effect size in the pilot study. A numeric value overrides the default value. This argument must be supplied if dataset = NULL and variance components (the ... arguments) are supplied.
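Power calculations of this kind are commonly expressed through a noncentral F distribution, as in the Hillis references below. The base-R sketch assumes the noncentrality parameter and the degrees of freedom are already known; how they follow from J, K, the effect size and the variance components is not shown, and the numbers used are hypothetical. The helper name f_test_power is not a package function.

```r
# Sketch: power of an F test given a noncentrality parameter (ncp) and
# degrees of freedom. f_test_power is a hypothetical helper.
f_test_power <- function(ncp, df1, df2, alpha = 0.05) {
  fCrit <- qf(1 - alpha, df1, df2)       # critical value under the null hypothesis
  1 - pf(fCrit, df1, df2, ncp = ncp)     # probability of exceeding it under the alternative
}

# Hypothetical inputs: two treatments (df1 = 1), 30 denominator df, ncp = 8
pw <- f_test_power(ncp = 8, df1 = 1, df2 = 30)
```

With ncp = 0 the function returns alpha, the probability of rejecting a true null hypothesis.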
Value
The expected statistical power.
Note
The procedure is valid for ROC studies only; for FROC studies see Vignettes 19.
References
Hillis SL, Berbaum KS (2004). Power Estimation for the Dorfman-Berbaum-Metz Method. Acad Radiol, 11, 1260–1273.
Hillis SL, Obuchowski NA, Berbaum KS (2011). Power Estimation for Multireader ROC Methods: An Updated and Unified Approach. Acad Radiol, 18, 129–142.
Hillis SL, Schartz KM (2018). Multireader sample size program for diagnostic studies: demonstration and methodology. Journal of Medical Imaging, 5(04).
Examples
## EXAMPLE 1: RRRC power
## specify 2-treatment ROC dataset and force DBM alg.
SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05,
J = 6, K = 251, method = "DBM", LegacyCode = TRUE) # RRRC is default
## EXAMPLE 1A: FRRC power
SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05,
J = 6, K = 251, method = "DBM", LegacyCode = TRUE, analysisOption = "FRRC")
## EXAMPLE 1B: RRFC power
SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05,
J = 6, K = 251, method = "DBM", LegacyCode = TRUE, analysisOption = "RRFC")
## EXAMPLE 2: specify NULL dataset & DBM var. comp. & force DBM-based alg.
vcDBM <- UtilVarComponentsDBM(dataset02, FOM = "Wilcoxon")$VarCom
SsPowerGivenJK(dataset = NULL, FOM = "Wilcoxon", J = 6, K = 251,
effectSize = 0.05, method = "DBM", LegacyCode = TRUE,
list(
VarTR = vcDBM["VarTR","Estimates"], # replace rhs with actual values as in 4A
VarTC = vcDBM["VarTC","Estimates"], # do:
VarErr = vcDBM["VarErr","Estimates"])) # do:
## EXAMPLE 3: specify 2-treatment ROC dataset and use OR-based alg.
SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05,
J = 6, K = 251)
## EXAMPLE 4: specify NULL dataset & OR var. comp. & use OR-based alg.
JStar <- length(dataset02$ratings$NL[1,,1,1])
KStar <- length(dataset02$ratings$NL[1,1,,1])
vcOR <- UtilORVarComponentsFactorial(dataset02, FOM = "Wilcoxon")$VarCom
SsPowerGivenJK(dataset = NULL, FOM = "Wilcoxon", effectSize = 0.05, J = 6,
K = 251, list(JStar = JStar, KStar = KStar,
VarTR = vcOR["VarTR","Estimates"], # replace rhs with actual values as in 4A
Cov1 = vcOR["Cov1","Estimates"], # do:
Cov2 = vcOR["Cov2","Estimates"], # do:
Cov3 = vcOR["Cov3","Estimates"], # do:
Var = vcOR["Var","Estimates"]))
## EXAMPLE 4A: specify NULL dataset & OR var. comp. & use OR-based alg.
SsPowerGivenJK(dataset = NULL, FOM = "Wilcoxon", effectSize = 0.05, J = 6,
K = 251, list(JStar = 5, KStar = 114,
VarTR = 0.00020040252,
Cov1 = 0.00034661371,
Cov2 = 0.00034407483,
Cov3 = 0.00023902837,
Var = 0.00080228827))
## EXAMPLE 5: specify NULL dataset & DBM var. comp. & use OR-based alg.
## The DBM var. comp. are converted internally to OR var. comp.
vcDBM <- UtilVarComponentsDBM(dataset02, FOM = "Wilcoxon")$VarCom
KStar <- length(dataset02$ratings$NL[1,1,,1])
SsPowerGivenJK(dataset = NULL, J = 6, K = 251, effectSize = 0.05,
method = "DBM", FOM = "Wilcoxon",
list(KStar = KStar, # replace rhs with actual values as in 5A
VarR = vcDBM["VarR","Estimates"], # do:
VarC = vcDBM["VarC","Estimates"], # do:
VarTR = vcDBM["VarTR","Estimates"], # do:
VarTC = vcDBM["VarTC","Estimates"], # do:
VarRC = vcDBM["VarRC","Estimates"], # do:
VarErr = vcDBM["VarErr","Estimates"]))
## EXAMPLE 5A: specify NULL dataset & DBM var. comp. & use OR-based alg.
SsPowerGivenJK(dataset = NULL, J = 6, K = 251, effectSize = 0.05,
method = "DBM", FOM = "Wilcoxon",
list(KStar = 114,
VarR = 0.00153499935,
VarC = 0.02724923428,
VarTR = 0.00020040252,
VarTC = 0.01197529621,
VarRC = 0.01226472859,
VarErr = 0.03997160319))
Power given J, K and Dorfman-Berbaum-Metz variance components
Description
Power given J, K and Dorfman-Berbaum-Metz variance components
Usage
SsPowerGivenJKDbmVarCom(
J,
K,
effectSize,
VarTR,
VarTC,
VarErr,
alpha = 0.05,
analysisOption = "RRRC"
)
Arguments
J |
The number of readers |
K |
The number of cases |
effectSize |
The effect size |
VarTR |
The treatment-reader DBM variance component |
VarTC |
The treatment-case DBM variance component |
VarErr |
The error-term DBM variance component |
alpha |
The size of the test (default = 0.05) |
analysisOption |
The desired generalization ("RRRC", "FRRC", "RRFC", "ALL") |
Details
The variance components are obtained using StSignificanceTesting with method = "DBM".
Value
A list object containing the estimated power and associated statistics for each desired generalization.
Examples
VarCom <- StSignificanceTesting(dataset02, FOM = "Wilcoxon", method = "DBM",
analysisOption = "RRRC")$ANOVA$VarCom
VarTR <- VarCom["VarTR",1]
VarTC <- VarCom["VarTC",1]
VarErr <- VarCom["VarErr",1]
ret <- SsPowerGivenJKDbmVarCom (J = 5, K = 100, effectSize = 0.05, VarTR,
VarTC, VarErr, analysisOption = "RRRC")
cat("RRRC power = ", ret$powerRRRC)
Power given J, K and Obuchowski-Rockette variance components
Description
Power given J, K and Obuchowski-Rockette variance components
Usage
SsPowerGivenJKOrVarCom(
J,
K,
KStar,
effectSize,
VarTR,
Cov1,
Cov2,
Cov3,
Var,
alpha = 0.05,
analysisOption = "RRRC"
)
Arguments
J |
The number of readers in the pivotal study |
K |
The number of cases in the pivotal study |
KStar |
The number of cases in the pilot study |
effectSize |
The effect size |
VarTR |
The treatment-reader OR variance component |
Cov1 |
The OR Cov1 covariance |
Cov2 |
The OR Cov2 covariance |
Cov3 |
The OR Cov3 covariance |
Var |
The OR pure variance term |
alpha |
The size of the test (default = 0.05) |
analysisOption |
The desired generalization ("RRRC", "FRRC", "RRFC", "ALL") |
Details
The variance components are obtained using StSignificanceTesting with method = "OR".
Value
A list object containing the estimated power and associated statistics for each desired generalization.
Examples
dataset <- dataset02 ## the pilot study
KStar <- length(dataset$ratings$NL[1,1,,1])
VarCom <- StSignificanceTesting(dataset, FOM = "Wilcoxon",
method = "OR", analysisOption = "RRRC")$ANOVA$VarCom
VarTR <- VarCom["VarTR",1]
Cov1 <- VarCom["Cov1",1]
Cov2 <- VarCom["Cov2",1]
Cov3 <- VarCom["Cov3",1]
Var <- VarCom["Var",1]
ret <- SsPowerGivenJKOrVarCom (J = 5, K = 100, KStar = KStar,
effectSize = 0.05, VarTR, Cov1, Cov2, Cov3, Var, analysisOption = "RRRC")
cat("RRRC power = ", ret$powerRRRC)
Generate a power table using the OR method
Description
Generate combinations of numbers of readers J and numbers of cases K for desired power and specified generalization(s)
Usage
SsPowerTable(
dataset,
FOM,
effectSize = NULL,
alpha = 0.05,
desiredPower = 0.8,
analysisOption = "RRRC"
)
Arguments
dataset |
The pilot ROC dataset to be used to extrapolate to the pivotal study. |
FOM |
The figure of merit. |
effectSize |
The effect size to be used in the pivotal study, default value is NULL. |
alpha |
The size of the test, default is 0.05. |
desiredPower |
The desired statistical power, default is 0.8. |
analysisOption |
Desired generalization, "RRRC" (the default), "FRRC", "RRFC" or "ALL". |
Details
The default effectSize uses the observed effect size in the pilot study. A supplied numeric value overrides the default value.
Value
A list containing up to 3 (depending on analysisOption) dataframes. Each dataframe contains 3 arrays:
numReaders |
The numbers of readers in the pivotal study. |
numCases |
The numbers of cases in the pivotal study. |
power |
The estimated statistical powers. |
Note
The procedure is valid for ROC studies only; for FROC studies see Vignettes 19.
Examples
## Examples with CPU or elapsed time > 5s
## user system elapsed
## SsPowerTable 20.033 0.037 20.077
## Example of sample size calculation with OR method
## SsPowerTable(dataset02, FOM = "Wilcoxon")
Number of cases, for specified number of readers, to achieve desired power
Description
Number of cases to achieve the desired power, for specified number of readers J, and specified DBM or ORH analysis method
Usage
SsSampleSizeKGivenJ(
dataset,
...,
J,
FOM,
effectSize = NULL,
method = "OR",
alpha = 0.05,
desiredPower = 0.8,
analysisOption = "RRRC",
LegacyCode = FALSE
)
Arguments
dataset |
The pilot dataset. If set to NULL then variance components must be supplied. |
... |
Optional variance components, VarTR, VarTC and VarErr. These are needed if dataset is not supplied. |
J |
The number of readers in the pivotal study. |
FOM |
The figure of merit. Not needed if variance components are supplied. |
effectSize |
The effect size to be used in the pivotal study. Default is NULL. Must be supplied if dataset is set to NULL and variance components are supplied. |
method |
"OR" (default) or "DBM". |
alpha |
The significance level of the study, default is 0.05. |
desiredPower |
The desired statistical power, default is 0.8. |
analysisOption |
Desired generalization, "RRRC" (the default), "FRRC", "RRFC" or "ALL". |
LegacyCode |
Logical, default is |
Details
effectSize = NULL uses the observed effect size in the pilot study. A numeric value overrides the default value. This argument must be supplied if dataset = NULL and variance components (the optional ... arguments) are supplied.
Value
A list of two elements:
K |
The minimum number of cases K in the pivotal study
to just achieve the desired statistical power, calculated
for each value of |
power |
The predicted statistical power. |
Note
The procedure is valid for ROC studies only; for FROC studies see Vignettes 19.
Examples
## the following two should give identical results
SsSampleSizeKGivenJ(dataset02, FOM = "Wilcoxon", effectSize = 0.05, J = 6, method = "DBM")
a <- UtilVarComponentsDBM(dataset02, FOM = "Wilcoxon")$VarCom
SsSampleSizeKGivenJ(dataset = NULL, J = 6, effectSize = 0.05, method = "DBM", LegacyCode = TRUE,
list(VarTR = a["VarTR",1],
VarTC = a["VarTC",1],
VarErr = a["VarErr",1]))
## the following two should give identical results
SsSampleSizeKGivenJ(dataset02, FOM = "Wilcoxon", effectSize = 0.05, J = 6, method = "OR")
a <- UtilORVarComponentsFactorial(dataset02, FOM = "Wilcoxon")$VarCom
KStar <- length(dataset02$ratings$NL[1,1,,1])
SsSampleSizeKGivenJ(dataset = NULL, J = 6, effectSize = 0.05, method = "OR",
list(KStar = KStar,
VarTR = a["VarTR",1],
Cov1 = a["Cov1",1],
Cov2 = a["Cov2",1],
Cov3 = a["Cov3",1],
Var = a["Var",1]))
for (J in 6:10) {
ret <- SsSampleSizeKGivenJ(dataset02, FOM = "Wilcoxon", J = J, analysisOption = "RRRC")
message("# of readers = ", J, " estimated # of cases = ", ret$K,
", predicted power = ", signif(ret$powerRRRC,3), "\n")
}
Performs DBM or OR significance testing for factorial or split-plot A,C datasets
Description
Performs Dorfman-Berbaum-Metz (DBM) or Obuchowski-Rockette (OR) significance testing for the specified dataset. Significance testing refers to analysis designed to assign a P-value, and other statistics, for rejecting the null hypothesis (NH) that the reader-averaged figure of merit (FOM) difference between treatments is zero. The results of the analysis are best visualized in the text or Excel-formatted files produced by UtilOutputReport.
Usage
StSignificanceTesting(
dataset,
FOM,
FPFValue = 0.2,
alpha = 0.05,
method = "DBM",
covEstMethod = "jackknife",
nBoots = 200,
analysisOption = "ALL",
tempOrgCode = FALSE
)
Arguments
dataset |
The dataset to be analyzed, see |
FOM |
The figure of merit, see |
FPFValue |
Only needed for |
alpha |
The significance level of the test of the null hypothesis that all treatment effects are zero; the default is 0.05 |
method |
The significance testing method to be used:
|
covEstMethod |
The covariance matrix estimation method
in
|
nBoots |
The number of bootstraps (defaults to 200), relevant only if
|
analysisOption |
Determines which factors are regarded as random vs. fixed:
|
tempOrgCode |
default FALSE; if TRUE, then code from version 0.0.1 of RJafroc is used (see RJafroc_0.0.1.tar). This is intended to check against errors that crept in subsequent to the original version as I attempted to improve the organization of the code and the output. As implicit in the name of this temporary flag, it will eventually be removed. |
Value
For method = "DBM" the returned list contains the following elements:
FOMs |
Contains |
ANOVA |
Contains |
RRRC |
Contains results of |
FRRC |
Contains results of |
RRFC |
Contains results of |
For method = "OR" the returned list contains the following elements:
FOMs |
Contains |
ANOVA |
Contains |
RRRC |
Contains results of |
FRRC |
Contains results of |
RRFC |
Contains results of |
References
Dorfman DD, Berbaum KS, Metz CE (1992) Receiver operating characteristic rating analysis: Generalization to the Population of Readers and Patients with the Jackknife method, Invest. Radiol. 27, 723-731.
Obuchowski NA, Rockette HE (1995) Hypothesis Testing of the Diagnostic Accuracy for Multiple Diagnostic Tests: An ANOVA Approach with Dependent Observations, Communications in Statistics: Simulation and Computation 24, 285-308.
Hillis SL (2014) A marginal-mean ANOVA approach for analyzing multireader multicase radiological imaging data, Statistics in medicine 33, 330-360.
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Examples
StSignificanceTesting(dataset02,FOM = "Wilcoxon", method = "DBM")
StSignificanceTesting(dataset02,FOM = "Wilcoxon", method = "OR")
## following is split-plot-c analysis using a simulated split-plot-c dataset
StSignificanceTesting(datasetFROCSpC, FOM = "wAFROC", method = "OR")
StSignificanceTesting(dataset05, FOM = "wAFROC")
StSignificanceTesting(dataset05, FOM = "HrAuc", method = "DBM")
StSignificanceTesting(dataset05, FOM = "SongA1", method = "DBM")
StSignificanceTesting(dataset05, FOM = "SongA2", method = "DBM")
StSignificanceTesting(dataset05, FOM = "wAFROC1", method = "DBM")
StSignificanceTesting(dataset05, FOM = "AFROC1", method = "DBM")
StSignificanceTesting(dataset05, FOM = "AFROC", method = "DBM")
Significance testing: standalone CAD vs. radiologists
Description
Comparing standalone CAD vs. at least two radiologists interpreting the same cases; standalone CAD means that all the designer-level mark-rating pairs generated by the CAD algorithm are available to the analyst, not just the one or two marks per case displayed to the radiologist (the latter are marks whose ratings exceed a pre-selected threshold). At the very minimum, location-level information, such as in the LROC paradigm, should be used. Ideally, the FROC paradigm should be used. A severe statistical power penalty is paid if one uses the ROC paradigm. See Standalone CAD vs Radiologists chapter, available via download link at site https://github.com/dpc10ster/RJafrocBook/blob/gh-pages/RJafrocBook.pdf
Usage
StSignificanceTestingCadVsRad(
dataset,
FOM,
FPFValue = 0.2,
method = "1T-RRRC",
alpha = 0.05,
plots = FALSE
)
Arguments
dataset |
The dataset to be analyzed; it must be a single-treatment dataset with at least three readers, where the first reader is CAD. |
FOM |
The desired FOM; for ROC data it must be |
FPFValue |
Only needed for |
method |
The desired analysis: "1T-RRFC","1T-RRRC" (the default) or "2T-RRRC", see manuscript for details. |
alpha |
Significance level of the test, defaults to 0.05. |
plots |
Flag, default is FALSE, i.e., a plot is not displayed. If TRUE, it displays the appropriate operating characteristic for all readers and CAD. |
Details
PCL is the probability of a correct localization.
The LROC is the plot of PCL (ordinate) vs. FPF.
For LROC data, FOM = "PCL" means the interpolated PCL value at the specified FPFValue. For FOM = "ALROC" the trapezoidal area under the LROC from FPF = 0 to FPF = FPFValue is used.
If method = "1T-RRRC" the first reader is assumed to be CAD. If method = "2T-RRRC" the first treatment is assumed to be CAD. The NH is that the FOM of CAD equals the average of the readers.
The method = "1T-RRRC" analysis uses an adaptation of the single-treatment multiple-reader Obuchowski-Rockette (OR) model described in a paper by Hillis (2007), section 5.3. It is characterized by 3 parameters VarR, Var and Cov2, where the latter two are estimated using the jackknife.
For method = "2T-RRRC" the analysis replicates the CAD data as many times as necessary so as to form one "treatment" of an MRMC pairing, the other "treatment" being the radiologists. Then standard ORH analysis is applied. The method is described in Kooi et al. It gives exactly the same final results (F-statistic, ddf and p-value) as "1T-RRRC" but the intermediate quantities are meaningless.
Value
If method = "1T-RRRC" the return value is a list with the following elements:
fomCAD |
The observed FOM for CAD. |
fomRAD |
The observed FOM array for the readers. |
avgRadFom |
The average FOM of the readers. |
avgDiffFom |
The mean of the difference FOM, RAD - CAD. |
ciAvgDiffFom |
The 95-percent CI of the average difference, RAD - CAD. |
varR |
The variance of the radiologists. |
varError |
The variance of the error term in the single-treatment multiple-reader OR model. |
cov2 |
The covariance of the error term. |
tstat |
The observed value of the t-statistic; its square is equivalent to an F-statistic. |
df |
The degrees of freedom of the t-statistic. |
pval |
The p-value for rejecting the NH. |
Plots |
If argument plots = TRUE, a ggplot object
containing empirical operating characteristics
corresponding to specified FOM. For example, if |
If method = "2T-RRRC" the return value is a list with the following elements:
fomCAD |
The observed FOM for CAD. |
fomRAD |
The observed FOM array for the readers. |
avgRadFom |
The average FOM of the readers. |
avgDiffFom |
The mean of the difference FOM, RAD - CAD. |
ciDiffFom |
A data frame containing the statistics associated with the average difference, RAD - CAD. |
ciAvgRdrEachTrt |
A data frame containing the statistics associated with the average FOM in each "treatment". |
varR |
The variance of the pure reader term in the OR model. |
varTR |
The variance of the treatment-reader term in the OR model. |
cov1 |
The covariance1 of the error term - same reader, different treatments. |
cov2 |
The covariance2 of the error term - different readers, same treatment. |
cov3 |
The covariance3 of the error term - different readers, different treatments. |
varError |
The variance of the pure error term in the OR model. |
FStat |
The observed value of the F-statistic. |
ndf |
The numerator degrees of freedom of the F-statistic. |
df |
The denominator degrees of freedom of the F-statistic. |
pval |
The p-value for rejecting the NH. |
Plots |
see above. |
References
Hillis SL (2007) A comparison of denominator degrees of freedom methods for multiple observer ROC studies, Statistics in Medicine. 26:596-619.
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Hupse R, Samulski M, Lobbes M, et al (2013) Standalone computer-aided detection compared to radiologists' performance for the detection of mammographic masses, Eur Radiol. 23(1):93-100.
Kooi T, Gubern-Merida A, et al. (2016) A comparison between a deep convolutional neural network and radiologists for classifying regions of interest in mammography. Paper presented at: International Workshop on Digital Mammography, Malmo, Sweden.
Examples
ret1M <- StSignificanceTestingCadVsRad (dataset09,
FOM = "Wilcoxon", method = "1T-RRRC")
StSignificanceTestingCadVsRad(datasetCadLroc,
FOM = "Wilcoxon", method = "1T-RRFC")
retLroc1M <- StSignificanceTestingCadVsRad (datasetCadLroc,
FOM = "PCL", method = "1T-RRRC", FPFValue = 0.05)
## test with fewer readers
dataset09a <- DfExtractDataset(dataset09, rdrs = seq(1:7))
ret1M7 <- StSignificanceTestingCadVsRad (dataset09a,
FOM = "Wilcoxon", method = "1T-RRRC")
datasetCadLroc7 <- DfExtractDataset(datasetCadLroc, rdrs = seq(1:7))
ret1MLroc7 <- StSignificanceTestingCadVsRad (datasetCadLroc7,
FOM = "PCL", method = "1T-RRRC", FPFValue = 0.05)
## takes longer than 5 sec on OSX
## retLroc2M <- StSignificanceTestingCadVsRad (datasetCadLroc,
## FOM = "PCL", method = "2T-RRRC", FPFValue = 0.05)
## ret2MLroc7 <- StSignificanceTestingCadVsRad (datasetCadLroc7,
## FOM = "PCL", method = "2T-RRRC", FPFValue = 0.05)
Perform significance testing using crossed treatments analysis
Description
Performs ORH analysis for specified crossed treatments dataset averaged over specified treatment factor
Usage
StSignificanceTestingCrossedModalities(
ds,
avgIndx,
FOM = "wAFROC",
alpha = 0.05,
analysisOption = "ALL"
)
Arguments
ds |
The crossed treatments dataset |
avgIndx |
The index of the treatment to be averaged over |
FOM |
|
alpha |
|
analysisOption |
Value
The returned list contains the same items as the list returned by StSignificanceTesting.
Examples
## read the built in dataset
retCrossed2 <- StSignificanceTestingCrossedModalities(datasetCrossedModality, 1)
RSM ROC/AFROC/wAFROC AUC calculator
Description
Returns the ROC, AFROC and wAFROC AUCs corresponding to specified RSM parameters. See also UtilAucPROPROC, UtilAucBinormal and UtilAucCBM.
Usage
UtilAnalyticalAucsRSM(mu, lambda, nu, zeta1 = -Inf, lesDistr, relWeights = 0)
Arguments
mu |
The mean of the Gaussian distribution for the ratings of latent LLs (continuous ratings of lesions that are found by the search mechanism). The NLs are assumed to be distributed as N(0,1). |
lambda |
The RSM lambda parameter. |
nu |
The RSM nu parameter. |
zeta1 |
The lowest reporting threshold, the default is |
lesDistr |
The lesion distribution 1D array, i.e., the probability mass function (pmf) of the numbers of lesions for diseased cases. |
relWeights |
The relative weights of the lesions; a vector of
length |
Value
A list containing the ROC, AFROC and wAFROC AUCs corresponding to the specified parameters
References
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449-3462.
Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.
Examples
mu <- 1;lambda <- 1;nu <- 0.9
lesDistr <- c(0.9, 0.1)
## i.e., 90% of dis. cases have one lesion, and 10% have two lesions
relWeights <- c(0.05, 0.95)
## i.e., lesion 1 has weight 5 percent while lesion two has weight 95 percent
UtilAnalyticalAucsRSM(mu, lambda, nu, zeta1 = -Inf, lesDistr)
UtilAnalyticalAucsRSM(mu, lambda, nu, zeta1 = -Inf, lesDistr, relWeights)
Binormal model AUC function
Description
Returns the Binormal model ROC-AUC corresponding to specified parameters. See also UtilAnalyticalAucsRSM, UtilAucPROPROC and UtilAucCBM.
Usage
UtilAucBinormal(a, b)
Arguments
a |
The |
b |
The |
Value
Binormal model-predicted ROC-AUC
References
Dorfman DD, Alf E (1969) Maximum-Likelihood Estimation of Parameters of Signal-Detection Theory and Determination of Confidence Intervals - Rating-Method Data, Journal of Mathematical Psychology. 6:487-496.
Examples
a <- 2;b <- 0.7
UtilAucBinormal(a,b)
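For orientation, the binormal-model ROC-AUC has the well-known closed form Phi(a / sqrt(1 + b^2)), where Phi is the standard normal CDF. The base-R sketch below evaluates that expression directly. The helper name binormal_auc is hypothetical and is not part of RJafroc; that UtilAucBinormal is implemented in exactly this way is an assumption.

```r
# Hypothetical helper (not part of RJafroc): closed-form binormal ROC-AUC,
# assuming the conventional (a, b) parameterization of the binormal model.
binormal_auc <- function(a, b) {
  stopifnot(b > 0)
  pnorm(a / sqrt(1 + b^2))
}

# Same parameter values as the example above.
auc <- binormal_auc(a = 2, b = 0.7)

# Sanity check: a = 0 means no separation, i.e., chance-level AUC of 0.5.
auc_chance <- binormal_auc(a = 0, b = 1)
```

Comparing auc with the value returned by UtilAucBinormal(2, 0.7) is a quick way to check whether the assumption holds for the installed version.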
CBM AUC function
Description
Returns the CBM ROC-AUC. See also UtilAnalyticalAucsRSM, UtilAucPROPROC and UtilAucBinormal.
Usage
UtilAucCBM(mu, alpha)
Arguments
mu |
The |
alpha |
The |
Value
CBM-predicted ROC-AUC for the specified parameters
References
Dorfman DD, Berbaum KS (2000) A contaminated binormal model for ROC data: Part II. A formal model, Acad Radiol 7:6 427–437.
Examples
mu <- 2;alpha <- 0.8
UtilAucCBM(mu,alpha)
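As a rough illustration, the contaminated binormal model (CBM) of Dorfman and Berbaum treats diseased cases as a mixture: with probability alpha the lesion is visible and the rating is drawn from N(mu, 1), otherwise it is drawn from the same N(0, 1) distribution as non-diseased cases. Under that reading, which is an assumption here because the argument descriptions above are truncated, the AUC is a weighted average of the two component AUCs. The helper name cbm_auc is hypothetical and is not part of RJafroc.

```r
# Hypothetical helper (not part of RJafroc): CBM ROC-AUC as a mixture,
# assuming alpha = fraction of diseased cases with a visible lesion and
# mu = separation of the visible-lesion distribution from N(0, 1).
cbm_auc <- function(mu, alpha) {
  stopifnot(alpha >= 0, alpha <= 1)
  # Invisible lesions contribute chance-level AUC (0.5);
  # visible lesions contribute the equal-variance binormal AUC.
  (1 - alpha) * 0.5 + alpha * pnorm(mu / sqrt(2))
}

# Same parameter values as the example above.
auc <- cbm_auc(mu = 2, alpha = 0.8)
```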
PROPROC AUC function
Description
Returns the PROPROC ROC-AUC corresponding to specified parameters. See also UtilAnalyticalAucsRSM, UtilAucBinormal and UtilAucCBM.
Usage
UtilAucPROPROC(c1, da)
Arguments
c1 |
The c-parameter of the PROPROC model; it is named c1 because c is the name of a built-in function in R. |
da |
The da-parameter of the PROPROC model. |
Value
PROPROC model-predicted ROC-AUC for the specified parameters
References
Metz CE, Pan X (1999) Proper Binormal ROC Curves: Theory and Maximum-Likelihood Estimation, J Math Psychol 43(1):1-33.
Examples
c1 <- .2;da <- 1.5
UtilAucPROPROC(c1,da)
Convert from DBM to OR variance components
Description
UtilDBM2ORVarCom converts from DBM variance components to OR variance components.
Usage
UtilDBM2ORVarCom(K, DBMVarCom)
Arguments
K |
Total number of cases |
DBMVarCom |
DBM variance components, a data.frame containing VarR, VarC, VarTR, VarTC, VarRC and VarErr |
Value
UtilDBM2ORVarCom returns the equivalent OR variance components.
Examples
DBMVarCom <- StSignificanceTesting(dataset02, FOM = "Wilcoxon", method = "DBM")$ANOVA$VarCom
UtilDBM2ORVarCom(114, DBMVarCom)
ORVarCom <- StSignificanceTesting(dataset02, FOM = "Wilcoxon", method = "OR")$ANOVA$VarCom
UtilOR2DBMVarCom(114, ORVarCom)
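The DBM-to-OR conversion can be sketched in base R using the relations commonly quoted from Hillis and co-workers. That UtilDBM2ORVarCom uses exactly these relations is an assumption; the helper name dbm_to_or and all numeric values are hypothetical and chosen only for illustration.

```r
# Hypothetical helper (not part of RJafroc): DBM -> OR variance components,
# assuming the standard relations between the two parameterizations.
dbm_to_or <- function(K, v) {
  c(VarR  = v[["VarR"]],
    VarTR = v[["VarTR"]],
    Cov1  = (v[["VarC"]] + v[["VarRC"]]) / K,
    Cov2  = (v[["VarC"]] + v[["VarTC"]]) / K,
    Cov3  = v[["VarC"]] / K,
    Var   = (v[["VarC"]] + v[["VarTC"]] + v[["VarRC"]] + v[["VarErr"]]) / K)
}

# Made-up DBM variance components for K = 114 cases.
K <- 114
dbm <- c(VarR = 0.0015, VarC = 0.064, VarTR = 0.0002,
         VarTC = 0.012, VarRC = 0.024, VarErr = 0.04)
or <- dbm_to_or(K, dbm)
```

Under these relations the combination Var - Cov1 - Cov2 + Cov3 equals VarErr / K, which gives a simple consistency check on any converted set of components.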
Calculate empirical figures of merit (FOMs) for specified dataset
Description
Calculate the specified empirical figure of merit for each treatment-reader combination in the ROC, FROC, ROI or LROC dataset
Usage
UtilFigureOfMerit(dataset, FOM = "wAFROC", FPFValue = 0.2)
Arguments
dataset |
The dataset to be analyzed, |
FOM |
The figure of merit; the default is |
FPFValue |
Only needed for |
Details
The allowed FOMs depend on the dataType field of the dataset object.
For dataset$descriptions$design = "SPLIT-PLOT-C", end-point based FOMs (e.g., "MaxLLF") are not allowed.
For dataset$descriptions$type = "ROC" only FOM = "Wilcoxon" is allowed.
For dataset$descriptions$type = "FROC" the following FOMs are allowed:
- FOM = "AFROC1" (use only if zero normal cases)
- FOM = "AFROC"
- FOM = "wAFROC1" (use only if zero normal cases)
- FOM = "wAFROC" (the default)
- FOM = "HrAuc"
- FOM = "SongA1"
- FOM = "SongA2"
- FOM = "HrSe" (an example of an end-point based FOM)
- FOM = "HrSp" (another example)
- FOM = "MaxLLF" (ditto)
- FOM = "MaxNLF" (ditto)
- FOM = "MaxNLFAllCases" (ditto)
- FOM = "ExpTrnsfmSp"
"MaxLLF", "MaxNLF" and "MaxNLFAllCases" correspond to ordinate, and abscissa, respectively, of the highest point on the FROC operating characteristic obtained by counting all the marks.
The "ExpTrnsfmSp" FOM is described in the paper by Popescu.
Given the large number of FOMs possible with FROC data, a recommendation is appropriate: use the wAFROC FOM whenever possible.
For dataType = "ROI" dataset only FOM = "ROI" is allowed.
For dataType = "LROC" dataset the following FOMs are allowed:
- FOM = "Wilcoxon" for ROC data inferred from LROC data
- FOM = "PCL" the probability of correct localization at specified FPFValue
- FOM = "ALROC" the area under the LROC from zero to specified FPFValue
FPFValue The FPF at which to evaluate PCL or ALROC; the default is 0.2; only needed for LROC data.
Value
A c(I, J) dataframe, where the row names are the modalityIDs of the treatments and the column names are the readerIDs of the readers.
References
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Chakraborty DP, Berbaum KS (2004) Observer studies involving detection and localization: modeling, analysis, and validation, Medical Physics, 31(8), 1–18.
Song T, Bandos AI, Rockette HE, Gur D (2008) On comparing methods for discriminating between actually negative and actually positive subjects with FROC type data, Medical Physics 35 1547–1558.
Popescu LM (2011) Nonparametric signal detectability evaluation using an exponential transformation of the FROC curve, Medical Physics, 38(10), 5690.
Obuchowski NA, Lieber ML, Powell KA (2000) Data Analysis for Detection and Localization of Multiple Abnormalities with Application to Mammography, Acad Radiol, 7:7 553–554.
Swensson RG (1996) Unified measurement of observer performance in detecting and localizing target objects on images, Med Phys 23:10, 1709–1725.
Examples
UtilFigureOfMerit(dataset02, FOM = "Wilcoxon") # ROC data
UtilFigureOfMerit(DfFroc2Roc(dataset01), FOM = "Wilcoxon") # FROC dataset, converted to ROC
UtilFigureOfMerit(dataset01) # FROC dataset, default wAFROC FOM
UtilFigureOfMerit(datasetCadLroc, FOM = "Wilcoxon") #LROC data
UtilFigureOfMerit(datasetCadLroc, FOM = "PCL") #LROC data
UtilFigureOfMerit(datasetCadLroc, FOM = "ALROC") #LROC data
UtilFigureOfMerit(datasetROI, FOM = "ROI") #ROI data
# these are meant to illustrate conditions which will throw an error
## UtilFigureOfMerit(dataset02, FOM = "wAFROC") #error
## UtilFigureOfMerit(dataset01, FOM = "Wilcoxon") #error
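To make the shape of the returned object concrete, the following base-R sketch computes the empirical (Wilcoxon) AUC for every treatment-reader combination of a tiny made-up ROC study and arranges the results as an I x J dataframe. The helper wilcoxon_fom, the arrays fp and tp, and the row and column labels are all hypothetical; the sketch does not use the RJafroc dataset structure.

```r
# Hypothetical helper (not part of RJafroc): empirical Wilcoxon AUC,
# i.e., the fraction of (diseased, non-diseased) pairs ranked correctly,
# counting ties as one half.
wilcoxon_fom <- function(non_diseased, diseased) {
  mean(outer(diseased, non_diseased,
             function(d, n) (d > n) + 0.5 * (d == n)))
}

# Made-up ratings: I = 2 treatments, J = 2 readers,
# 4 non-diseased cases (fp) and 3 diseased cases (tp).
I <- 2; J <- 2
fp <- array(c(1, 2, 1, 3, 2, 1, 2, 2, 1, 1, 3, 2, 2, 3, 1, 1), dim = c(I, J, 4))
tp <- array(c(4, 3, 5, 4, 3, 5, 4, 2, 5, 4, 3, 5), dim = c(I, J, 3))

# One FOM per treatment-reader combination.
foms <- matrix(NA_real_, I, J,
               dimnames = list(paste0("trt", 1:I), paste0("rdr", 1:J)))
for (i in 1:I) {
  for (j in 1:J) {
    foms[i, j] <- wilcoxon_fom(fp[i, j, ], tp[i, j, ])
  }
}
foms <- as.data.frame(foms)
```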
Convert from intrinsic to physical RSM parameters
Description
Convert intrinsic RSM parameters lambda_i and nu_i to the corresponding physical RSM parameters lambda_i' and nu_i'. The physical parameters are more meaningful but they depend on mu. The intrinsic parameters are independent of mu. See book for details.
Usage
UtilIntrinsic2RSM(mu, lambda_i, nu_i)
Arguments
mu |
The mean of the Gaussian distribution for the ratings of latent
LLs, i.e. continuous ratings of lesions that were found by the search
mechanism ~ N( |
lambda_i |
The intrinsic Poisson lambda_i parameter. |
nu_i |
The intrinsic Binomial nu_i parameter. |
Details
RSM is the Radiological Search Model described in the book.
See also UtilRSM2Intrinsic.
Value
A list containing \lambda and \nu
References
Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449–3462.
Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Examples
mu <- 2;lambda_i <- 20;nu_i <- 1.1512925
lambda <- UtilIntrinsic2RSM(mu, lambda_i, nu_i)$lambda
nu <- UtilIntrinsic2RSM(mu, lambda_i, nu_i)$nu
## note that the intrinsic values are only constrained to be positive, but the physical variable nu
## must obey 0 <= nu <= 1
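The numbers in the example above are consistent with the transformations lambda = lambda_i / mu and nu = 1 - exp(-nu_i * mu). The base-R sketch below applies them; that the package uses exactly these expressions is an assumption, and the helper name intrinsic_to_physical is hypothetical.

```r
# Hypothetical helper (not part of RJafroc): intrinsic -> physical RSM
# parameters, assuming lambda = lambda_i / mu and nu = 1 - exp(-nu_i * mu).
intrinsic_to_physical <- function(mu, lambda_i, nu_i) {
  stopifnot(mu > 0, lambda_i > 0, nu_i > 0)
  list(lambda = lambda_i / mu,
       nu     = 1 - exp(-nu_i * mu))
}

# Same parameter values as the example above.
phys <- intrinsic_to_physical(mu = 2, lambda_i = 20, nu_i = 1.1512925)
```

Because exp(-nu_i * mu) lies strictly between 0 and 1 for positive arguments, the resulting nu always satisfies the constraint 0 <= nu <= 1 stated in the comment above.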
Get the lesion distribution vector of a dataset
Description
The lesion distribution vector for a dataset.
Usage
UtilLesionDistrVector(dataset)
Arguments
dataset |
A dataset object. |
Details
Two characteristics of an FROC dataset, apart from ratings, affect the FOM: the distribution of lesions per case and the distribution of lesion weights. This function addresses the distribution of lesions per case. The distribution of weights is addressed in UtilLesionWeightsMatrix.
lesDistr is a 1D-array containing the fractions of diseased cases with each number of lesions per case in the dataset. For ROC or LROC data this vector is c(1), since all diseased cases contain one lesion.
For FROC data the length of the vector equals the maximum number of lesions per diseased case. The first entry is the fraction of dis. cases containing one lesion, the second entry is the fraction of dis. cases containing two lesions, etc.
See PlotRsmOperatingCharacteristics for a function that depends on lesDistr.
Value
lesDistr The 1D lesion distribution array.
Examples
UtilLesionDistrVector (dataset01) # FROC dataset
## [1] 0.93258427 0.06741573
UtilLesionDistrVector (dataset02) # ROC dataset
## 1
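The definition of lesDistr given in Details can be mimicked in base R by tabulating the number of lesions on each diseased case. This is only a sketch of the idea; the vector per_case is made up and the code does not read an RJafroc dataset object.

```r
# Made-up numbers of lesions on each of 8 diseased cases.
per_case <- c(1, 1, 2, 1, 3, 1, 2, 1)

# Entry n is the fraction of diseased cases containing n lesions;
# the vector length equals the maximum number of lesions per case.
les_distr <- tabulate(per_case, nbins = max(per_case)) / length(per_case)
```

For these made-up counts, five of eight cases have one lesion, two have two and one has three, so the fractions sum to one.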
Determine lesion weights distribution 2D matrix
Description
Determine the lesion weights distribution 2D matrix of a dataset or manually specify the lesion weights distribution 2D matrix.
Usage
UtilLesionWeightsMatrixDataset(dataset, relWeights = 0)
UtilLesionWeightsMatrixLesDistr(lesDistr, relWeights = 0)
Arguments
dataset |
A dataset object. |
relWeights |
The relative weights of the lesions: a unit sum vector of
length equal to the maximum number of lesions per dis. case. For example,
|
lesDistr |
A unit sum vector of length equal to the maximum number of
lesions per diseased case, specifying the relative frequency of lesions
per dis. case in the dataset. For example, |
Details
Two characteristics of an FROC dataset, apart from the ratings, affect the FOM: the distribution of lesions per case and the distribution of lesion weights. This function addresses the weights. The distribution of lesions is addressed in UtilLesionDistrVector.
See PlotRsmOperatingCharacteristics for a function that depends on lesWghtDistr.
The underlying assumption is that lesion 1 is the same type across all
diseased cases, lesion 2 is the same type across all diseased cases,
..., etc. This allows assignment of weights independent of the case index.
Value
lesWghtDistr The 2D lesion weights distribution matrix. The first column enumerates the number of lesions per case, while the remaining columns contain the weights. Missing values are filled with -Inf. Not to be confused with the lesionWeight list member in an FROC dataset, which enumerates the weights of lesions on individual cases.
Examples
UtilLesionWeightsMatrixDataset (dataset01) # FROC data
## [,1] [,2] [,3]
##[1,] 1 1.0 -Inf
##[2,] 2 0.5 0.5
UtilLesionWeightsMatrixDataset (dataset02) # ROC data
## [,1] [,2]
##[1,] 1 1
## Example 1: dataset with 1 to 4 lesions per case, with frequency as per first argument
UtilLesionWeightsMatrixLesDistr (c(0.6, 0.2, 0.1, 0.1), c(0.2, 0.4, 0.1, 0.3))
## [,1] [,2] [,3] [,4] [,5]
##[1,] 1 1.0000000 -Inf -Inf -Inf
##[2,] 2 0.3333333 0.6666667 -Inf -Inf
##[3,] 3 0.2857143 0.5714286 0.1428571 -Inf
##[4,] 4 0.2000000 0.4000000 0.1000000 0.3
## Explanation
##> c(0.2)/sum(c(0.2))
##[1] 1 ## (weights for cases with 1 lesion)
##> c(0.2, 0.4)/sum(c(0.2, 0.4))
##[1] 0.3333333 0.6666667 ## (weights for cases with 2 lesions)
##> c(0.2, 0.4, 0.1)/sum(c(0.2, 0.4, 0.1))
##[1] 0.2857143 0.5714286 0.1428571 ## (weights for cases with 3 lesions)
##> c(0.2, 0.4, 0.1, 0.3)/sum(c(0.2, 0.4, 0.1, 0.3))
##[1] 0.2000000 0.4000000 0.1000000 0.3 ## (weights for cases with 4 lesions)
## Example2 : dataset with *no* cases with 3 lesions per case
UtilLesionWeightsMatrixLesDistr (c(0.1, 0.7, 0.0, 0.2), c(0.4, 0.3, 0.2, 0.1))
## [,1] [,2] [,3] [,4]
##[1,] 1 1.0000000 -Inf -Inf
##[2,] 2 0.5714286 0.4285714 -Inf
##[3,] 4 0.5000000 0.3750000 0.125
## Explanation: note that row with 3 lesions per case does not occur
##> c(0.4)/sum(c(0.4))
##[1] 1 ## (weights for cases with 1 lesion)
##> c(0.4, 0.3)/sum(c(0.4, 0.3))
##[1] 0.5714286 0.4285714 ## (weights for cases with 2 lesions)
##> c(0.4, 0.3, 0.1)/sum(c(0.4, 0.3, 0.1))
##[1] 0.500 0.375 0.125 ## (weights for cases with 4 lesions)
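The renormalization spelled out in the explanation of Example 1 can be written as a short base-R loop. The sketch below reproduces only that example and assumes every entry of lesDistr is nonzero; it does not attempt to reproduce the row-dropping behavior shown in Example 2. The helper name lesion_weights_matrix is hypothetical and is not part of RJafroc.

```r
# Hypothetical helper (not part of RJafroc): build the weights matrix for the
# case where all entries of lesDistr are nonzero (as in Example 1).
lesion_weights_matrix <- function(lesDistr, relWeights) {
  stopifnot(all(lesDistr > 0), length(lesDistr) == length(relWeights))
  max_les <- length(lesDistr)
  m <- matrix(-Inf, nrow = max_les, ncol = max_les + 1)
  for (n in seq_len(max_les)) {
    m[n, 1] <- n  # first column: number of lesions per case
    # Cases with n lesions use the first n relative weights, renormalized
    # to unit sum; the remaining columns stay at -Inf.
    m[n, 2:(n + 1)] <- relWeights[1:n] / sum(relWeights[1:n])
  }
  m
}

w <- lesion_weights_matrix(c(0.6, 0.2, 0.1, 0.1), c(0.2, 0.4, 0.1, 0.3))
```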
Calculate mean squares for factorial dataset
Description
Calculates the mean squares used in the DBM and ORH methods for factorial dataset
Usage
UtilMeanSquares(dataset, FOM = "Wilcoxon", FPFValue = 0.2, method = "DBM")
Arguments
dataset |
The dataset to be analyzed, see |
FOM |
The figure of merit to be used in the calculation. The default
is |
FPFValue |
Only needed for |
method |
The method, in which the mean squares are calculated. The two
valid choices are |
Details
For the DBM method, msT, msTR, msTC, msTRC will not be available if the dataset contains only one treatment. Similarly, msR, msTR, msRC, msTRC will not be returned for a single-reader dataset.
For the ORH method, msT, msR, msTR will be returned for a multiple-reader multiple-treatment dataset. msT is not available for a single-treatment dataset, and msR is not available for a single-reader dataset.
Value
A list containing all possible mean squares
Examples
UtilMeanSquares(dataset02, FOM = "Wilcoxon")
UtilMeanSquares(dataset05, FOM = "wAFROC", method = "OR")
Convert from OR to DBM variance components
Description
UtilOR2DBMVarCom converts from OR to DBM variance components.
Usage
UtilOR2DBMVarCom(K, ORVarCom)
Arguments
K |
Total number of cases |
ORVarCom |
OR variance components, a data.frame containing VarR, VarTR, Cov1, Cov2, Cov3 and Var |
Value
UtilOR2DBMVarCom returns the equivalent DBM variance components.
Examples
DBMVarCom <- StSignificanceTesting(dataset02, FOM = "Wilcoxon", method = "DBM")$ANOVA$VarCom
UtilDBM2ORVarCom(114, DBMVarCom)
ORVarCom <- StSignificanceTesting(dataset02, FOM = "Wilcoxon", method = "OR")$ANOVA$VarCom
UtilOR2DBMVarCom(114, ORVarCom)
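The OR-to-DBM direction can be sketched in base R using the relations commonly quoted from Hillis and co-workers. That UtilOR2DBMVarCom uses exactly these relations is an assumption; the helper name or_to_dbm and all numeric values are hypothetical and chosen only for illustration.

```r
# Hypothetical helper (not part of RJafroc): OR -> DBM variance components,
# assuming the standard relations between the two parameterizations.
or_to_dbm <- function(K, v) {
  c(VarR   = v[["VarR"]],
    VarC   = K * v[["Cov3"]],
    VarTR  = v[["VarTR"]],
    VarTC  = K * (v[["Cov2"]] - v[["Cov3"]]),
    VarRC  = K * (v[["Cov1"]] - v[["Cov3"]]),
    VarErr = K * (v[["Var"]] - v[["Cov1"]] - v[["Cov2"]] + v[["Cov3"]]))
}

# Made-up OR variance components for K = 114 cases.
K <- 114
or_vc <- c(VarR = 0.0015, VarTR = 0.0002,
           Cov1 = 0.0003, Cov2 = 0.0004, Cov3 = 0.0002, Var = 0.0008)
dbm_vc <- or_to_dbm(K, or_vc)
```

Under these relations the four case-dependent DBM components (VarC, VarTC, VarRC and VarErr) add up to K times the OR Var, which gives a quick consistency check.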
Utility for estimating Obuchowski-Rockette variance components for factorial datasets
Description
Utility for estimating Obuchowski-Rockette variance components for factorial datasets
Usage
UtilORVarComponentsFactorial(
dataset,
FOM,
FPFValue = 0.2,
covEstMethod = "jackknife",
nBoots = 200,
seed = NULL
)
Arguments
dataset |
The factorial dataset object |
FOM |
The figure of merit |
FPFValue |
Only needed for |
covEstMethod |
The covariance estimation method, "jackknife" (the default) or "bootstrap" or "DeLong" (DeLong is applicable only for FOM = Wilcoxon). |
nBoots |
Only needed for bootstrap covariance estimation method. The number of bootstraps, defaults to 200. |
seed |
Only needed for the bootstrap covariance estimation method. The initial
seed for the random number generator, the default is |
Details
The variance components are obtained using StSignificanceTesting with method = "OR".
Value
A list object containing the following data.frames:
- foms: the figures of merit for different treatment-reader combinations
- TRanova: the OR treatment-reader ANOVA table
- VarCom: the OR variance-components Cov1, Cov2, Cov3, Var and correlations rho1, rho2 and rho3
- IndividualTrt: the individual treatment mean-squares, Var and Cov2 values
- IndividualRdr: the individual reader mean-squares, Var and Cov1 values
Examples
## use the default jackknife for covEstMethod
vc <- UtilORVarComponentsFactorial(dataset02, FOM = "Wilcoxon")
str(vc)
UtilORVarComponentsFactorial(dataset02, FOM = "Wilcoxon",
covEstMethod = "bootstrap", nBoots = 2000, seed = 100)$VarCom
UtilORVarComponentsFactorial(dataset02, FOM = "Wilcoxon", covEstMethod = "DeLong")$VarCom
Generate a text formatted report file or an Excel file
Description
Generates a formatted report of the analysis and saves it to a text or an Excel file
Usage
UtilOutputReport(
dataset,
ReportFileBaseName = NULL,
ReportFileExt = "txt",
method = "DBM",
FOM,
alpha = 0.05,
covEstMethod = "jackknife",
nBoots = 200,
sequentialNames = FALSE,
overWrite = FALSE,
analysisOption = "ALL"
)
Arguments
dataset |
The dataset object to be analyzed (not the file name),
see |
ReportFileBaseName |
The report file (with extension |
ReportFileExt |
The report file extension determines the type of output.
|
method |
The significance testing method, |
FOM |
The figure of merit; see |
alpha |
See |
covEstMethod |
See |
nBoots |
See |
sequentialNames |
A logical variable: if |
overWrite |
A |
analysisOption |
"RRRC", "FRRC", "RRFC" or "ALL"; see |
Details
A formatted report of the data analysis is written to the output file in either text or Excel format.
Value
StResult The object returned by StSignificanceTesting.
Examples
# text output is created in a temporary file
UtilOutputReport(dataset03, FOM = "Wilcoxon")
# Excel output is created in a temporary file
UtilOutputReport(dataset03, FOM = "Wilcoxon", ReportFileExt = "xlsx")
Pseudovalues for given dataset and FOM
Description
Returns centered jackknife pseudovalues AND jackknife FOM values, for factorial OR split-plot-a OR split-plot-c study designs
Usage
UtilPseudoValues(dataset, FOM, FPFValue = 0.2)
Arguments
dataset |
The dataset to be analyzed, see |
FOM |
The figure of merit to be used in the calculation.
The default is |
FPFValue |
Only needed for |
Value
A list containing two arrays containing the pseudovalues and the jackknife FOM values of the datasets (a third returned value is for internal use).
Note
Each returned array has dimension c(I,J,K), where K depends on the FOM: K1 for FOMs that are based on normal cases only, K2 for FOMs that are based on abnormal cases only, and K for FOMs that are based on normal and abnormal cases.
Examples
UtilPseudoValues(dataset05, FOM = "wAFROC")$jkFomValues[1,1,1:10]
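For readers unfamiliar with jackknife pseudovalues, the base-R sketch below works through the calculation for a single reader in a single treatment, using the Wilcoxon FOM on made-up ratings. The pseudovalue for case k is K * FOM - (K - 1) * FOM with case k removed. The centering step shown, which shifts the pseudovalues so that their mean equals the FOM, is an assumption about what "centered" means here; the helper wilcoxon_fom and all data are hypothetical.

```r
# Hypothetical helper (not part of RJafroc): empirical Wilcoxon AUC.
wilcoxon_fom <- function(non_diseased, diseased) {
  mean(outer(diseased, non_diseased,
             function(d, n) (d > n) + 0.5 * (d == n)))
}

# Made-up ratings for one reader in one treatment.
non_diseased <- c(1, 2, 1, 3, 2)
diseased <- c(3, 4, 2, 5)
K1 <- length(non_diseased)
K2 <- length(diseased)
K <- K1 + K2
fom <- wilcoxon_fom(non_diseased, diseased)

# Jackknife FOM values: recompute the FOM with case k removed.
jk_fom <- numeric(K)
for (k in seq_len(K)) {
  if (k <= K1) {
    jk_fom[k] <- wilcoxon_fom(non_diseased[-k], diseased)
  } else {
    jk_fom[k] <- wilcoxon_fom(non_diseased, diseased[-(k - K1)])
  }
}

# Pseudovalues, then centered so that their mean equals the FOM (assumption).
pseudo <- K * fom - (K - 1) * jk_fom
pseudo_centered <- pseudo + (fom - mean(pseudo))
```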
Convert from physical to intrinsic RSM parameters
Description
Convert physical RSM parameters \lambda_i' and \nu_i' to the intrinsic RSM parameters \lambda_i and \nu_i. The physical parameters are more meaningful but they depend on \mu. The intrinsic parameters are independent of \mu. See book for details.
Usage
UtilRSM2Intrinsic(mu, lambda, nu)
Arguments
mu |
The mean of the Gaussian distribution for the ratings of latent LLs,
i.e. continuous ratings of lesions that were found by the search mechanism
~ N( |
lambda |
The Poisson |
nu |
The |
Details
RSM is the Radiological Search Model described in the book. A latent mark becomes an actual mark if the corresponding rating exceeds the lowest reporting threshold zeta1. See also UtilIntrinsic2RSM.
Value
A list containing \lambda_i and \nu_i, the RSM search parameters.
References
Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449-3462.
Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.
Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840
Examples
mu <- 2;lambda <- 10;nu <- 0.9
lambda_i <- UtilRSM2Intrinsic(mu, lambda, nu)$lambda_i
nu_i <- UtilRSM2Intrinsic(mu, lambda, nu)$nu_i
## note that the intrinsic values are only constrained to be positive, e.g., nu_i is not constrained
## to be between 0 and 1.
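The conversion can also be sketched in base R. This is a minimal illustration, assuming the relations lambda_i = lambda * mu and nu_i = -log(1 - nu) / mu (equivalently lambda = lambda_i / mu and nu = 1 - exp(-mu * nu_i)). These relations are stated here as an assumption about the book's definitions and should be checked against the package source. The function names `physical2intrinsic` and `intrinsic2physical` are hypothetical and are not package functions.

```r
# Physical -> intrinsic (assumed relations, see lead-in).
physical2intrinsic <- function(mu, lambda, nu) {
  stopifnot(mu > 0, lambda > 0, nu > 0, nu < 1)
  list(lambda_i = lambda * mu, nu_i = -log(1 - nu) / mu)
}

# Intrinsic -> physical, the inverse mapping under the same assumption.
intrinsic2physical <- function(mu, lambda_i, nu_i) {
  stopifnot(mu > 0, lambda_i > 0, nu_i > 0)
  list(lambda = lambda_i / mu, nu = 1 - exp(-mu * nu_i))
}

intr <- physical2intrinsic(mu = 2, lambda = 10, nu = 0.9)
phys <- intrinsic2physical(mu = 2, intr$lambda_i, intr$nu_i)  # round trip
```

Under these assumed relations the round trip recovers the physical values, and nu_i exceeds 1 for these inputs, which illustrates that the intrinsic nu is not confined to the unit interval.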
Utility for Dorfman-Berbaum-Metz variance components
Description
Utility for Dorfman-Berbaum-Metz variance components
Usage
UtilVarComponentsDBM(dataset, FOM, FPFValue = 0.2)
Arguments
dataset |
The dataset object |
FOM |
The figure of merit |
FPFValue |
Only needed for |
Value
A list object containing the variance components.
Examples
UtilVarComponentsDBM(dataset02, FOM = "Wilcoxon")
TONY FROC dataset
Description
This is referred to in the book as the "TONY" dataset. It consists of 185 cases, 89 of which are diseased, interpreted in two treatments ("BT" = breast tomosynthesis and "DM" = digital mammography) by five radiologists using the FROC paradigm.
Usage
dataset01
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:2, 1:5, 1:185, 1:3], ratings of non-lesion localizations, NLs
ratings$LL: num [1:2, 1:5, 1:89, 1:2], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:89], number of lesions per diseased case
lesions$IDs: num [1:89, 1:2], numeric labels of lesions on diseased cases
lesions$weights: num [1:89, 1:2], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset01", base name of dataset in 'data' folder
descriptions$type: chr "FROC", the data type
descriptions$name: chr "TONY", the name of the dataset
descriptions$truthTableStr: num [1:2, 1:5, 1:185, 1:4] 1 1 1 1 ..., truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:2] "BT" "DM", treatment labels
descriptions$readerID: chr [1:5] "1" "2" "3" "4" ..., reader labels
References
Chakraborty DP, Svahn T (2011) Estimating the parameters of a model of visual search from ROC data: an alternate method for fitting proper ROC curves. PROC SPIE 7966.
Examples
str(dataset01)
PlotEmpiricalOperatingCharacteristics(dataset = dataset01, opChType = "wAFROC")$Plot
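The array dimensions listed under Format encode the study size: I treatments, J readers, K cases in total and K2 diseased cases. The following minimal sketch reads these counts off a mock list that has the same dimensions as the arrays described above. The object `mock` is a hypothetical stand-in filled with NA values, not the packaged dataset01.

```r
# Mock object mirroring the documented array dimensions (placeholder values only).
mock <- list(
  ratings = list(
    NL = array(NA_real_, dim = c(2, 5, 185, 3)),
    LL = array(NA_real_, dim = c(2, 5, 89, 2)),
    LL_IL = NA
  )
)

I  <- dim(mock$ratings$NL)[1]   # number of treatments
J  <- dim(mock$ratings$NL)[2]   # number of readers
K  <- dim(mock$ratings$NL)[3]   # total number of cases
K2 <- dim(mock$ratings$LL)[3]   # number of diseased cases
K1 <- K - K2                    # number of non-diseased cases
maxNL <- dim(mock$ratings$NL)[4]  # max NLs per case
maxLL <- dim(mock$ratings$LL)[4]  # max LLs per diseased case
counts <- c(I = I, J = J, K = K, K1 = K1, K2 = K2)
```

With the package loaded, the same expressions applied to dataset01 in place of `mock` should give the same counts, given the dimensions listed under Format.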
Van Dyke ROC dataset
Description
This is referred to in the book as the "VD" dataset. It consists of 114 cases, 45 of which are diseased, interpreted in two treatments ("0" = single spin echo MRI, "1" = cine-MRI) by five radiologists using the ROC paradigm. Each diseased case had an aortic dissection; the ROC paradigm generates one rating per case. It is often referred to in the ROC literature as the Van Dyke dataset and, along with the Franken dataset, has been widely used to illustrate advances in ROC methodology. The example below displays the ROC plot for the first treatment and first reader.
Usage
dataset02
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:2, 1:5, 1:114, 1], ratings of non-lesion localizations, NLs
ratings$LL: num [1:2, 1:5, 1:45, 1], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:45], number of lesions per diseased case
lesions$IDs: num [1:45, 1], numeric labels of lesions on diseased cases
lesions$weights: num [1:45, 1], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset02", base name of dataset in 'data' folder
descriptions$type: chr "ROC", the data type
descriptions$name: chr "VAN-DYKE", the name of the dataset
descriptions$truthTableStr: num [1:2, 1:5, 1:114, 1:2] 1 1 1 1 ..., truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:2] "0" "1", treatment labels
descriptions$readerID: chr [1:5] "0" "1" "2" ..., reader labels
References
Van Dyke CW, et al. Cine MRI in the diagnosis of thoracic aortic dissection. 79th RSNA Meetings. 1993.
Examples
str(dataset02)
PlotEmpiricalOperatingCharacteristics(dataset = dataset02, opChType = "ROC")$Plot
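Because the ROC paradigm yields one rating per case, the empirical ROC plot follows from thresholding the ratings at each observed value. The following is a minimal base-R sketch using hypothetical toy ratings (not the Van Dyke data). The helper `empirical_roc` is illustrative only and is not a package function.

```r
# Empirical ROC operating points for one reader in one treatment.
# zk1: ratings of non-diseased cases; zk2: ratings of diseased cases.
empirical_roc <- function(zk1, zk2) {
  thr <- sort(unique(c(zk1, zk2)), decreasing = TRUE)
  fpf <- sapply(thr, function(t) mean(zk1 >= t))  # false positive fraction
  tpf <- sapply(thr, function(t) mean(zk2 >= t))  # true positive fraction
  data.frame(FPF = c(0, fpf), TPF = c(0, tpf))    # prepend the (0, 0) point
}

zk1 <- c(1, 1, 2, 2, 3, 1)   # hypothetical non-diseased ratings
zk2 <- c(3, 4, 5, 2, 5)      # hypothetical diseased ratings
roc <- empirical_roc(zk1, zk2)

# Trapezoidal area under the empirical ROC curve.
auc <- sum(diff(roc$FPF) * (head(roc$TPF, -1) + tail(roc$TPF, -1)) / 2)
```

The trapezoidal area computed this way coincides with the Wilcoxon statistic for the same ratings, which is why "Wilcoxon" serves as the empirical ROC figure of merit.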
Franken ROC dataset
Description
This is referred to in the book as the "FR" dataset. It consists of 100 cases, 67 of which are diseased, interpreted in two treatments ("0" = conventional film radiographs, "1" = digitized images viewed on monitors) by four radiologists using the ROC paradigm. It is often referred to in the ROC literature as the Franken dataset and, along with the Van Dyke dataset, has been widely used to illustrate advances in ROC methodology.
Usage
dataset03
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:2, 1:4, 1:100, 1], ratings of non-lesion localizations, NLs
ratings$LL: num [1:2, 1:4, 1:67, 1], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:67], number of lesions per diseased case
lesions$IDs: num [1:67, 1], numeric labels of lesions on diseased cases
lesions$weights: num [1:67, 1], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset03", base name of dataset in 'data' folder
descriptions$type: chr "ROC", the data type
descriptions$name: chr "FRANKEN", the name of the dataset
descriptions$truthTableStr: num [1:2, 1:4, 1:100, 1:2], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:2] "TREAT1" "TREAT2", treatment labels
descriptions$readerID: chr [1:4] "READER_1" "READER_2" "READER_3" "READER_4", reader labels
References
Franken EA, et al. Evaluation of a Digital Workstation for Interpreting Neonatal Examinations: A Receiver Operating Characteristic Study. Investigative Radiology. 1992;27(9):732-737.
Examples
str(dataset03)
PlotEmpiricalOperatingCharacteristics(dataset = dataset03, opChType = "ROC")$Plot
Federica Zanca FROC dataset
Description
This is referred to in the book as the "FED" dataset. It consists of 200 mammograms, 100 of which contained one to three simulated microcalcifications, interpreted in five treatments (basically different image processing algorithms) by four radiologists using the FROC paradigm and a 5-point rating scale. The maximum number of NLs per case, over the entire dataset, was 7 and the dataset contained at least one diseased mammogram with 3 lesions. The Excel file containing this dataset is /inst/extdata/datasets/FZ_ALL.xlsx. The normal cases are labeled 100:199 while the abnormal cases are labeled 0:99.
Usage
dataset04
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:5, 1:4, 1:200, 1:7], ratings of non-lesion localizations, NLs
ratings$LL: num [1:5, 1:4, 1:100, 1:3], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:100], number of lesions per diseased case
lesions$IDs: num [1:100, 1:3], numeric labels of lesions on diseased cases
lesions$weights: num [1:100, 1:3], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset04", base name of dataset in 'data' folder
descriptions$type: chr "FROC", the data type
descriptions$name: chr "FEDERICA", the name of the dataset
descriptions$truthTableStr: num [1:5, 1:4, 1:200, 1:4], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:5] "1" "2" "3" "4" "5", treatment labels
descriptions$readerID: chr [1:4] "1" "3" "4" "5", reader labels
References
Zanca F et al. Evaluation of clinical image processing algorithms used in digital mammography. Medical Physics. 2009;36(3):765-775.
Examples
str(dataset04)
PlotEmpiricalOperatingCharacteristics(dataset = dataset04, opChType = "wAFROC")$Plot
John Thompson FROC dataset
Description
This is referred to in the book as the "JT" dataset. It consists of 92 cases, 47 of which are diseased, interpreted in two treatments ("1" = CT images acquired for attenuation correction, "2" = diagnostic CT images) by nine radiographers using the FROC paradigm. Each case was a slice of an anthropomorphic phantom; 47 slices had inserted nodular lesions (max 3 per slice). The maximum number of NLs per case, over the entire dataset, was 7.
Usage
dataset05
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:2, 1:9, 1:92, 1:7], ratings of non-lesion localizations, NLs
ratings$LL: num [1:2, 1:9, 1:47, 1:3], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:47], number of lesions per diseased case
lesions$IDs: num [1:47, 1:3], numeric labels of lesions on diseased cases
lesions$weights: num [1:47, 1:3], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset05", base name of dataset in 'data' folder
descriptions$type: chr "FROC", the data type
descriptions$name: chr "THOMPSON", the name of the dataset
descriptions$truthTableStr: num [1:2, 1:9, 1:92, 1:4], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:2] "1" "2", treatment labels
descriptions$readerID: chr [1:9] "1" "2" "3" "4" ..., reader labels
References
Thompson JD, Hogg P, et al. (2014) A Free-Response Evaluation Determining Value in the Computed Tomography Attenuation Correction Image for Revealing Pulmonary Incidental Findings: A Phantom Study. Academic Radiology, 21 (4): 538-545.
Examples
str(dataset05)
PlotEmpiricalOperatingCharacteristics(dataset = dataset05, opChType = "wAFROC")$Plot
Magnus FROC dataset
Description
This is referred to in the book as the "MAG" dataset (after Magnus Bath, who conducted the JAFROC analysis). It consists of 89 cases, 42 of which are diseased, interpreted in two treatments ("1" = conventional chest, "2" = chest tomosynthesis) by four radiologists using the FROC paradigm.
Usage
dataset06
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:2, 1:4, 1:89, 1:17], ratings of non-lesion localizations, NLs
ratings$LL: num [1:2, 1:4, 1:42, 1:15], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:42], number of lesions per diseased case
lesions$IDs: num [1:42, 1:15], numeric labels of lesions on diseased cases
lesions$weights: num [1:42, 1:15], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset06", base name of dataset in 'data' folder
descriptions$type: chr "FROC", the data type
descriptions$name: chr "MAGNUS", the name of the dataset
descriptions$truthTableStr: num [1:2, 1:4, 1:89, 1:16], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:2] "1" "2", treatment labels
descriptions$readerID: chr [1:4] "1" "2" "3" "4", reader labels
References
Vikgren J et al. Comparison of Chest Tomosynthesis and Chest Radiography for Detection of Pulmonary Nodules: Human Observer Study of Clinical Cases. Radiology. 2008;249(3):1034-1041.
Examples
str(dataset06)
PlotEmpiricalOperatingCharacteristics(dataset = dataset06, opChType = "wAFROC")$Plot
Lucy Warren FROC dataset
Description
This is referred to in the book as the "OPT" dataset (for OptiMam). It consists of 162 cases, 81 of which are diseased, interpreted in five treatments (see reference, basically different ways of acquiring the images) by seven radiologists using the FROC paradigm.
Usage
dataset07
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:5, 1:7, 1:162, 1:4], ratings of non-lesion localizations, NLs
ratings$LL: num [1:5, 1:7, 1:81, 1:3], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:81], number of lesions per diseased case
lesions$IDs: num [1:81, 1:3], numeric labels of lesions on diseased cases
lesions$weights: num [1:81, 1:3], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset07", base name of dataset in 'data' folder
descriptions$type: chr "FROC", the data type
descriptions$name: chr "LUCY-WARREN", the name of the dataset
descriptions$truthTableStr: num [1:5, 1:7, 1:162, 1:4], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:5] "1" "2" "3" "4" ..., treatment labels
descriptions$readerID: chr [1:7] "1" "2" "3" "4" ..., reader labels
References
Warren LM, Mackenzie A, Cooke J, et al. Effect of image quality on calcification detection in digital mammography. Medical Physics. 2012;39(6):3202-3213.
Examples
str(dataset07)
PlotEmpiricalOperatingCharacteristics(dataset = dataset07, opChType = "wAFROC")$Plot
Monica Penedo ROC dataset
Description
This is referred to in the book as the "PEN" dataset. It consists of 112 cases, 64 of which are diseased, interpreted in five treatments (basically different image compression algorithms) by five radiologists using the FROC paradigm (the inferred ROC dataset is included; the original FROC data is lost).
Usage
dataset08
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:5, 1:5, 1:112, 1], ratings of non-lesion localizations, NLs
ratings$LL: num [1:5, 1:5, 1:64, 1], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:64], number of lesions per diseased case
lesions$IDs: num [1:64, 1], numeric labels of lesions on diseased cases
lesions$weights: num [1:64, 1], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset08", base name of dataset in 'data' folder
descriptions$type: chr "ROC", the data type
descriptions$name: chr "PENEDO", the name of the dataset
descriptions$truthTableStr: num [1:5, 1:5, 1:112, 1:2], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:5] "0" "1" "2" "3" ..., treatment labels
descriptions$readerID: chr [1:5] "0" "1" "2" "3" ..., reader labels
References
Penedo et al. Free-Response Receiver Operating Characteristic Evaluation of Lossy JPEG2000 and Object-based Set Partitioning in Hierarchical Trees Compression of Digitized Mammograms. Radiology. 2005;237(2):450-457.
Examples
str(dataset08)
PlotEmpiricalOperatingCharacteristics(dataset = dataset08, opChType = "ROC")$Plot
Nico Karssemeijer ROC dataset (CAD vs. radiologists)
Description
This is referred to in the book as the "NICO" dataset. It consists of 200 mammograms, 80 of which contain one malignant mass, interpreted by a CAD system and nine radiologists using the LROC paradigm. The first reader is CAD. The highest rating was used to convert this to an ROC dataset. The original LROC data is datasetCadLroc. Analyzing this data requires methods described in the book, implemented in the function StSignificanceTestingCadVsRad.
Usage
dataset09
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1, 1:10, 1:200, 1], ratings of non-lesion localizations, NLs
ratings$LL: num [1, 1:10, 1:80, 1], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:80], number of lesions per diseased case
lesions$IDs: num [1:80, 1], numeric labels of lesions on diseased cases
lesions$weights: num [1:80, 1], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset09", base name of dataset in 'data' folder
descriptions$type: chr "ROC", the data type
descriptions$name: chr "NICO-CAD-ROC", the name of the dataset
descriptions$truthTableStr: num [1, 1:10, 1:200, 1:2], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr "1", treatment label(s)
descriptions$readerID: chr [1:10] "1" "2" "3" "4" ..., reader labels
References
Hupse R et al. Standalone computer-aided detection compared to radiologists' performance for the detection of mammographic masses. Eur Radiol. 2013;23(1):93-100.
Examples
str(dataset09)
PlotEmpiricalOperatingCharacteristics(dataset = dataset09, rdrs = 1:10, opChType = "ROC")$Plot
Mark Ruschin ROC dataset
Description
This is referred to in the book as the "RUS" dataset. It consists of 90 cases, 40 of which are diseased. The images were acquired at three dose levels, which can be regarded as treatments. Eight radiologists interpreted the cases using the FROC paradigm. These have been reduced to ROC data by using the highest ratings (the original FROC data is lost).
Usage
dataset10
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:3, 1:8, 1:90, 1], ratings of non-lesion localizations, NLs
ratings$LL: num [1:3, 1:8, 1:40, 1], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:40], number of lesions per diseased case
lesions$IDs: num [1:40, 1], numeric labels of lesions on diseased cases
lesions$weights: num [1:40, 1], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset10", base name of dataset in 'data' folder
descriptions$type: chr "ROC", the data type
descriptions$name: chr "RUSCHIN", the name of the dataset
descriptions$truthTableStr: num [1:3, 1:8, 1:90, 1:2], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:3] "1" "2" "3", treatment label(s)
descriptions$readerID: chr [1:8] "1" "2" "3" "4" ..., reader labels
References
Ruschin M, et al. Dose dependence of mass and microcalcification detection in digital mammography: free response human observer studies. Med Phys. 2007;34:400 - 407.
Examples
str(dataset10)
PlotEmpiricalOperatingCharacteristics(dataset = dataset10, opChType = "ROC")$Plot
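Highest-rating inference reduces the FROC marks on a case to a single ROC rating for that case. The following is a minimal base-R sketch for one treatment and one reader, using hypothetical toy arrays. It assumes that unmarked locations are coded as -Inf and that diseased cases occupy the last K2 rows of the NL array; both are assumptions made for this illustration, and `highest_rating` is not a package function.

```r
# NL: K x maxNL matrix of non-lesion localization ratings (all cases).
# LL: K2 x maxLL matrix of lesion localization ratings (diseased cases only).
highest_rating <- function(NL, LL) {
  K <- nrow(NL)
  K2 <- nrow(LL)
  K1 <- K - K2
  # Non-diseased cases: highest NL rating on the case.
  fp <- apply(NL[seq_len(K1), , drop = FALSE], 1, max)
  # Diseased cases: highest rating over both NLs and LLs on the case.
  tp <- pmax(
    apply(NL[K1 + seq_len(K2), , drop = FALSE], 1, max),
    apply(LL, 1, max)
  )
  list(FP = fp, TP = tp)
}

# Toy data: 4 cases, the last 2 diseased; -Inf marks an unused slot.
NL <- rbind(c(2, -Inf), c(-Inf, -Inf), c(1, 3), c(-Inf, -Inf))
LL <- rbind(c(4, -Inf), c(-Inf, 2))
roc <- highest_rating(NL, LL)
```

In this sketch a case with no marks keeps -Inf as its inferred rating, i.e., it counts as negative at every finite threshold.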
Dobbins 1 FROC dataset
Description
This is referred to in the book as the "DOB1" dataset. Dobbins et al. conducted a multi-institutional MRMC study to compare the performance of digital tomosynthesis (GE's VolumeRad device), dual-energy (DE) imaging, and conventional chest radiography for pulmonary nodule detection and management. All study images were obtained with a flat-panel detector developed by GE. The case set consisted of 158 subjects, of which 43 were non-diseased and the rest had 1 - 20 pulmonary nodules independently verified, using CT images, by 3 experts who did not participate in the observer study. The study used FROC paradigm data collection. There are 4 treatments labeled 1 - 4: conventional chest x-ray (CXR), CXR augmented with dual-energy (CXR+DE), VolumeRad digital tomosynthesis images, and VolumeRad augmented with DE (VolumeRad+DE).
Usage
dataset11
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:4, 1:5, 1:158, 1:4], ratings of non-lesion localizations, NLs
ratings$LL: num [1:4, 1:5, 1:115, 1:20], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:115], number of lesions per diseased case
lesions$IDs: num [1:115, 1:20], numeric labels of lesions on diseased cases
lesions$weights: num [1:115, 1:20], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset11", base name of dataset in 'data' folder
descriptions$type: chr "FROC", the data type
descriptions$name: chr "DOBBINS-1", the name of the dataset
descriptions$truthTableStr: num [1:4, 1:5, 1:158, 1:21], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:4] "1" "2" "3" "4", treatment label(s)
descriptions$readerID: chr [1:5] "1" "2" "3" "4" ..., reader labels
References
Dobbins III JT et al. Multi-Institutional Evaluation of Digital Tomosynthesis, Dual-Energy Radiography, and Conventional Chest Radiography for the Detection and Management of Pulmonary Nodules. Radiology. 2016;282(1):236-250.
Examples
str(dataset11)
Dobbins 2 ROC dataset
Description
This is referred to in the code as the "DOB2" dataset. It contains actionability ratings, i.e., whether further follow-up of the patient is recommended, on a scale of 1 (definitely not) to 5 (definitely yes), effectively an ROC dataset using a 5-point rating scale.
Usage
dataset12
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:4, 1:5, 1:152, 1], ratings of non-lesion localizations, NLs
ratings$LL: num [1:4, 1:5, 1:88, 1], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:88], number of lesions per diseased case
lesions$IDs: num [1:88, 1], numeric labels of lesions on diseased cases
lesions$weights: num [1:88, 1], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset12", base name of dataset in 'data' folder
descriptions$type: chr "ROC", the data type
descriptions$name: chr "DOBBINS-2", the name of the dataset
descriptions$truthTableStr: num [1:4, 1:5, 1:152, 1:2], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:4] "1" "2" "3" "4", treatment label(s)
descriptions$readerID: chr [1:5] "1" "2" "3" "4" ..., reader labels
References
Dobbins III JT et al. Multi-Institutional Evaluation of Digital Tomosynthesis, Dual-Energy Radiography, and Conventional Chest Radiography for the Detection and Management of Pulmonary Nodules. Radiology. 2016;282(1):236-250.
Examples
str(dataset12)
Dobbins 3 FROC dataset
Description
This is referred to in the code as the "DOB3" dataset. It is a subset of DOB1 that includes data for lesions not visible on CXR, but visible to the truth panel on all treatments.
Usage
dataset13
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:4, 1:5, 1:158, 1:4], ratings of non-lesion localizations, NLs
ratings$LL: num [1:4, 1:5, 1:106, 1:15], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:106], number of lesions per diseased case
lesions$IDs: num [1:106, 1:15], numeric labels of lesions on diseased cases
lesions$weights: num [1:106, 1:15], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset13", base name of dataset in 'data' folder
descriptions$type: chr "FROC", the data type
descriptions$name: chr "DOBBINS-3", the name of the dataset
descriptions$truthTableStr: num [1:4, 1:5, 1:158, 1:16], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:4] "1" "2" "3" "4", treatment label(s)
descriptions$readerID: chr [1:5] "1" "2" "3" "4" ..., reader labels
References
Dobbins III JT et al. Multi-Institutional Evaluation of Digital Tomosynthesis, Dual-Energy Radiography, and Conventional Chest Radiography for the Detection and Management of Pulmonary Nodules. Radiology. 2016;282(1):236-250.
Examples
str(dataset13)
Federica Zanca real (as opposed to inferred) ROC dataset
Description
This is referred to in the book as the "FZR" dataset. It is a real ROC study, conducted on the same images and using the same radiologists, on treatments "4" and "5" of dataset04. This was compared to highest rating inferred ROC data from dataset04 to conclude, erroneously, that the highest rating assumption is invalid. See book Section 13.6.2.
Usage
dataset14
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1:2, 1:4, 1:200, 1], ratings of non-lesion localizations, NLs
ratings$LL: num [1:2, 1:4, 1:100, 1], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:100], number of lesions per diseased case
lesions$IDs: num [1:100, 1], numeric labels of lesions on diseased cases
lesions$weights: num [1:100, 1], weights (or clinical importances) of lesions
descriptions$fileName: chr, "dataset14", base name of dataset in 'data' folder
descriptions$type: chr "ROC", the data type
descriptions$name: chr "FEDERICA-REAL-ROC", the name of the dataset
descriptions$truthTableStr: num [1:2, 1:4, 1:200, 1:2], truth table structure
descriptions$design: chr "FCTRL", study design, factorial dataset
descriptions$modalityID: chr [1:2] "4" "5", treatment label(s)
descriptions$readerID: chr [1:4] "1" "2" "3" "4", reader labels
References
Zanca F, Hillis SL, Claus F, et al (2012) Correlation of free-response and receiver-operating-characteristic area-under-the-curve estimates: Results from independently conducted FROC/ROC studies in mammography. Med Phys. 39(10):5917-5929.
Examples
str(dataset14)
Binned dataset suitable for checking FitCorCbm; seed = 123
Description
A binned dataset suitable for analysis by FitCorCbm. It was generated by DfCreateCorCbmDataset by setting the seed variable to 123. Note the formatting of the data as a single treatment two reader dataset, even though the actual pairing might be different, see FitCorCbm. The dataset is intentionally large so as to demonstrate the asymptotic convergence of ML estimates, produced by FitCorCbm, to the population values. The data was generated by the following argument values to DfCreateCorCbmDataset: seed = 123, K1 = 5000, K2 = 5000, desiredNumBins = 5, muX = 1.5, muY = 3, alphaX = 0.4, alphaY = 0.7, rhoNor = 0.3, rhoAbn2 = 0.8.
Usage
datasetBinned123
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1, 1:2, 1:10000, 1], ratings of non-lesion localizations, NLs
ratings$LL: num [1, 1:2, 1:5000, 1], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:5000], number of lesions per diseased case
lesions$IDs: num [1:5000, 1], numeric labels of lesions on diseased cases
lesions$weights: num [1:5000, 1], weights (or clinical importances) of lesions
descriptions$fileName: chr, "datasetBinned123", base name of dataset in 'data' folder
descriptions$type: chr "ROC", the data type
descriptions$name: chr "SIM-CORCBM-SEED-123", the name of the dataset
descriptions$truthTableStr: NA, truth table structure
descriptions$design: chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID: chr "1", treatment label(s)
descriptions$readerID: chr [1:2] "1" "2", reader labels
References
Zhai X, Chakraborty DP (2017). A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.
Examples
str(datasetBinned123)
Binned dataset suitable for checking FitCorCbm; seed = 124
Description
A binned dataset suitable for analysis by FitCorCbm. It was generated by DfCreateCorCbmDataset by setting the seed variable to 124. Otherwise similar to datasetBinned123.
Usage
datasetBinned124
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL: num [1, 1:2, 1:10000, 1], ratings of non-lesion localizations, NLs
ratings$LL: num [1, 1:2, 1:5000, 1], ratings of lesion localizations, LLs
ratings$LL_IL: NA, this placeholder is used only for LROC data
lesions$perCase: int [1:5000], number of lesions per diseased case
lesions$IDs: num [1:5000, 1], numeric labels of lesions on diseased cases
lesions$weights: num [1:5000, 1], weights (or clinical importances) of lesions
descriptions$fileName: chr, "datasetBinned124", base name of dataset in 'data' folder
descriptions$type: chr "ROC", the data type
descriptions$name: chr "SIM-CORCBM-SEED-124", the name of the dataset
descriptions$truthTableStr: NA, truth table structure
descriptions$design: chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID: chr "1", treatment label(s)
descriptions$readerID: chr [1:2] "1" "2", reader labels
References
Zhai X, Chakraborty DP (2017). A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.
Examples
str(datasetBinned124)
Binned dataset suitable for checking FitCorCbm; seed = 125
Description
A binned dataset suitable for analysis by FitCorCbm. It was generated by DfCreateCorCbmDataset by setting the seed variable to 125. Otherwise similar to datasetBinned123.
Usage
datasetBinned125
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL, num [1, 1:2, 1:10000, 1], ratings of non-lesion localizations, NLs
ratings$LL, num [1, 1:2, 1:5000, 1], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:5000], number of lesions per diseased case
lesions$IDs, num [1:5000, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:5000, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr "datasetBinned125", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "SIM-CORCBM-SEED-125", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr "1", treatment label(s)
descriptions$readerID, chr [1:2] "1" "2", reader labels
References
Zhai X, Chakraborty DP (2017). A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.
Examples
str(datasetBinned125)
Nico Karssemeijer LROC dataset (CAD vs. radiologists)
Description
This is the actual LROC data corresponding to dataset09, which was the inferred ROC data. Note that the LL field is split into two: LL, representing true positives where the lesions were correctly localized, and LL_IL, representing true positives where the lesions were incorrectly localized. The first reader is CAD and the remaining readers are radiologists.
Usage
datasetCadLroc
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL, num [1, 1:10, 1:200, 1], ratings of localizations on normal cases
ratings$LL, num [1, 1:10, 1:80, 1], ratings of correct localizations on abnormal cases
ratings$LL_IL, num [1, 1:10, 1:80, 1], ratings of incorrect localizations on abnormal cases
lesions$perCase, int [1:80], number of lesions per diseased case
lesions$IDs, num [1:80, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:80, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr "datasetCadLroc", base name of dataset in 'data' folder
descriptions$type, chr "LROC", the data type
descriptions$name, chr "NICO-CAD-LROC", the name of the dataset
descriptions$truthTableStr, num [1:2, 1:4, 1:200, 1:2], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr "1", treatment label(s)
descriptions$readerID, chr [1:10] "1" "2" "3" "4" ..., reader labels
References
Hupse R et al. Standalone computer-aided detection compared to radiologists' performance for the detection of mammographic masses. Eur Radiol. 2013;23(1):93-100.
Examples
str(datasetCadLroc)
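The relationship between the split LL / LL_IL fields and an inferred ROC rating can be illustrated with a small base-R sketch. It does not use the package; the toy values are hypothetical, and it assumes that on each abnormal case exactly one of LL and LL_IL holds a finite rating while the other is -Inf, so that the inferred ROC rating is the element-wise maximum of the two.

```r
# Toy LROC ratings: 1 treatment, 2 readers, 4 abnormal cases (hypothetical values).
# Rows are readers, columns are cases; -Inf marks the unused slot (an assumption).
LL_mat    <- rbind(c(3, -Inf, 5, -Inf),
                   c(-Inf, 4, 1, 2))
LL_IL_mat <- rbind(c(-Inf, 2, -Inf, 1),
                   c(3, -Inf, -Inf, -Inf))

# Same 4-D layout as in the Format section: [treatment, reader, case, 1]
LL    <- array(LL_mat,    dim = c(1, 2, 4, 1))
LL_IL <- array(LL_IL_mat, dim = c(1, 2, 4, 1))

# Inferred ROC rating on abnormal cases: whichever of the two slots is finite
rocLL <- pmax(LL, LL_IL)

# Fraction of abnormal cases with a correct localization, per reader
fracCorrect <- apply(LL, 2, function(x) mean(is.finite(x)))

print(rocLL[1, 1, , 1])
print(rocLL[1, 2, , 1])
print(fracCorrect)
```

Taking the maximum discards the localization information, which is why the inferred ROC dataset carries less information than the LROC dataset.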
Simulated FROC CAD vs. RAD dataset
Description
Simulated FROC CAD vs. RAD dataset suitable for checking code. It was generated from datasetCadLroc using SimulateFrocFromLrocData.R. The LROC paradigm always yields a single mark per case; therefore the equivalent FROC dataset also has only one mark per case. The NL arrays of the two datasets are identical. The LL array is created by copying the LL (correct localization) array of the LROC dataset to the LL array of the FROC dataset, from diseased case index k2 = 1 to k2 = K2. Additionally, the LL_IL array of the LROC dataset is copied to the NL array of the FROC dataset, from case index k1 = K1+1 to k1 = K1+K2. Any zero ratings are replaced by -Inf. The equivalent FROC dataset has the same HrAuc as the original LROC dataset. See example. The main use of this dataset and function is to test the CAD significance testing functions using CAD FROC datasets, which I currently don't have.
Usage
datasetCadSimuFroc
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL, num [1, 1:10, 1:200, 1], ratings of non-lesion localizations, NLs
ratings$LL, num [1, 1:10, 1:80, 1], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:80], number of lesions per diseased case
lesions$IDs, num [1:80, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:80, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr "datasetCadSimuFroc", base name of dataset in 'data' folder
descriptions$type, chr "LROC", the data type
descriptions$name, chr "NICO-CAD-LROC", the name of the dataset
descriptions$truthTableStr, num [1:2, 1:4, 1:200, 1:2], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr "1", treatment label(s)
descriptions$readerID, chr [1:10] "1" "2" "3" "4" ..., reader labels
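The LROC-to-FROC conversion described for datasetCadSimuFroc can be sketched in base R on toy arrays. This is one reading of the description, not the contents of SimulateFrocFromLrocData.R: the variable names and ratings are hypothetical, and it assumes the LROC NL array spans all K1 + K2 cases with -Inf in the slots for diseased cases.

```r
K1 <- 3  # number of non-diseased cases (toy value)
K2 <- 2  # number of diseased cases (toy value)

# Toy LROC arrays, 1 treatment and 1 reader: [treatment, reader, case, 1]
lrocNL   <- array(c(1, 0, 2, -Inf, -Inf), dim = c(1, 1, K1 + K2, 1))
lrocLL   <- array(c(4, -Inf),             dim = c(1, 1, K2, 1))  # correct localizations
lrocLLIL <- array(c(-Inf, 3),             dim = c(1, 1, K2, 1))  # incorrect localizations

# Step 1: start from the LROC NL array
frocNL <- lrocNL

# Step 2: copy the LL array for diseased cases k2 = 1..K2
frocLL <- lrocLL

# Step 3: incorrect localizations become NLs on diseased cases,
# i.e. case indices K1+1 .. K1+K2 of the NL array
frocNL[, , (K1 + 1):(K1 + K2), 1] <- lrocLLIL[, , 1:K2, 1]

# Step 4: zero ratings are treated as "no mark" and replaced by -Inf
frocNL[frocNL == 0] <- -Inf
frocLL[frocLL == 0] <- -Inf

print(as.vector(frocNL))
print(as.vector(frocLL))
```

Each case ends up with at most one finite rating across the NL and LL arrays, consistent with the single-mark-per-case property stated in the description.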
John Thompson crossed treatment FROC dataset
Description
This is a crossed treatment dataset, see book Section 18.5. There are two treatment factors. The first treatment factor modalityID1 can be "F" or "I", which represent two CT reconstruction algorithms. The second treatment factor modalityID2 can be "20" "40" "60" "80", which represent the mAs values of the image acquisition. The factors are fully crossed. The function StSignificanceTestingCrossedModalities analyzes such datasets.
Usage
datasetCrossedModality
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL, num [1:2, 1:4, 1:11, 1:68, 1:5], ratings of non-lesion localizations, NLs
ratings$LL, num [1:2, 1:4, 1:11, 1:34, 1:3], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:34], number of lesions per diseased case
lesions$IDs, num [1:34, 1:3], numeric labels of lesions on diseased cases
lesions$weights, num [1:34, 1:3], weights (or clinical importances) of lesions
descriptions$fileName, chr "datasetCrossedModality", base name of dataset in 'data' folder
descriptions$type, chr "FROC", the data type
descriptions$name, chr "THOMPSON-X-MOD", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr [1:2] "F" "I", treatment label(s)
descriptions$readerID, chr [1:4] "20" "40" "60" "80", reader labels
References
Thompson JD, Chakraborty DP, et al. (2016) Effect of reconstruction methods and x-ray tube current-time product on nodule detection in an anthropomorphic thorax phantom: a crossed-treatment JAFROC observer study. Medical Physics. 43(3):1265-1274.
Examples
str(datasetCrossedModality)
Simulated degenerate ROC dataset (for testing purposes)
Description
A simulated degenerate dataset. A degenerate dataset is defined as one with no interior operating points on the ROC plot. Such data tend to be observed with expert-level radiologists. This dataset is used to illustrate the robustness of two fitting models, namely CBM and RSM. The widely used binormal model and PROPROC fail on such datasets.
Usage
datasetDegenerate
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL, num [1, 1, 1:15, 1], ratings of non-lesion localizations, NLs
ratings$LL, num [1, 1, 1:10, 1], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:10], number of lesions per diseased case
lesions$IDs, num [1:10, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:10, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr "datasetDegenerate", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "SIM-DEGENERATE", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr "1", treatment label(s)
descriptions$readerID, chr "1", reader labels
Examples
str(datasetDegenerate)
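The definition of a degenerate dataset (no interior operating points on the ROC plot) can be made concrete with a short base-R sketch. The helpers opPoints and isDegenerateSketch and the toy ratings are hypothetical and not part of the package; "interior" is taken here to mean 0 < FPF < 1 and 0 < TPF < 1.

```r
# Empirical operating points: one (FPF, TPF) pair per distinct rating threshold
opPoints <- function(nl, ll) {
  thr <- sort(unique(c(nl, ll)), decreasing = TRUE)
  data.frame(
    thr = thr,
    FPF = vapply(thr, function(th) mean(nl >= th), numeric(1)),
    TPF = vapply(thr, function(th) mean(ll >= th), numeric(1))
  )
}

# A dataset is degenerate if no operating point lies strictly inside the unit square
isDegenerateSketch <- function(nl, ll) {
  op <- opPoints(nl, ll)
  !any(op$FPF > 0 & op$FPF < 1 & op$TPF > 0 & op$TPF < 1)
}

# 15 non-diseased cases all rated 1; 10 diseased cases rated 1 or 2
nlDegen <- rep(1, 15)
llDegen <- c(rep(1, 4), rep(2, 6))
print(opPoints(nlDegen, llDegen))
print(isDegenerateSketch(nlDegen, llDegen))

# For contrast: some non-diseased cases also rated 2 gives an interior point
nlInterior <- c(rep(1, 10), rep(2, 5))
print(isDegenerateSketch(nlInterior, llDegen))
```

In the first toy example the only non-trivial operating point sits on the FPF = 0 axis, which is the situation the description associates with expert-level readers.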
Simulated FROC SPLIT-PLOT-C dataset
Description
Simulated from the FED Excel dataset by successively ignoring readers 3:4, c(1,3:4), c(1:2,4), etc., to create a simulated split-plot Excel dataset. It was confirmed that the resulting dataset is read without error.
Usage
datasetFROCSpC
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL, num [1:2, 1:4, 1:200, 1:7], ratings of non-lesion localizations, NLs
ratings$LL, num [1:2, 1:4, 1:100, 1:3], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:100], number of lesions per diseased case
lesions$IDs, num [1:100, 1:3], numeric labels of lesions on diseased cases
lesions$weights, num [1:100, 1:3], weights (or clinical importances) of lesions
descriptions$fileName, chr "datasetFROCSpC", base name of dataset in 'data' folder
descriptions$type, chr "FROC", the data type
descriptions$name, chr "SIM-FROC-SPLIT-PLOT-C", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr [1:2] "4" "5", treatment label(s)
descriptions$readerID, chr [1:4] "1" "3" "4" "5", reader labels
Examples
str(datasetFROCSpC)
Simulated ROI dataset
Description
TBA. Simulated ROI dataset, assuming 4 ROIs per case, 5 readers, 50 non-diseased and 40 diseased cases.
Usage
datasetROI
Format
A list with 3 elements: $ratings, $lesions and $descriptions; $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.
ratings$NL, num [1:2, 1:5, 1:90, 1:4], ratings of non-lesion localizations, NLs
ratings$LL, num [1:2, 1:5, 1:40, 1:4], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:40], number of lesions per diseased case
lesions$IDs, num [1:40, 1:4], numeric labels of lesions on diseased cases
lesions$weights, num [1:40, 1:4], weights (or clinical importances) of lesions
descriptions$fileName, chr "datasetROI", base name of dataset in 'data' folder
descriptions$type, chr "ROI", the data type
descriptions$name, chr "SIM-ROI", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr [1:2] "1" "2", treatment label(s)
descriptions$readerID, chr [1:5] "1" "2" "3" "4" ..., reader labels
Examples
str(datasetROI)
Determine if a dataset is binned
Description
Determine if a dataset is binned
Usage
isBinnedDataset(dataset, maxUniqeRatings = 6)
Arguments
dataset | The dataset. |
maxUniqeRatings | For each treatment-reader combination, the maximum number of unique ratings allowed for it to be classified as binned; the default value is 6. |
Value
A logical [I x J] array, TRUE if the corresponding treatment-reader combination is binned, i.e., has at most maxUniqeRatings unique ratings, FALSE otherwise.
Examples
isBinnedDataset(dataset01)
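The binned criterion stated above (at most a given number of unique ratings per treatment-reader combination) can be sketched in base R on a toy list that mimics the dataset layout. The helper isBinnedSketch and the toy values are hypothetical; this is not the package's implementation, and excluding non-finite ratings such as -Inf from the count is an assumption.

```r
# Count unique finite ratings per treatment-reader combination and compare to a limit
isBinnedSketch <- function(dataset, maxUniqueRatings = 6) {
  NL <- dataset$ratings$NL
  LL <- dataset$ratings$LL
  I <- dim(NL)[1]  # number of treatments
  J <- dim(NL)[2]  # number of readers
  out <- array(FALSE, dim = c(I, J))
  for (i in seq_len(I)) {
    for (j in seq_len(J)) {
      r <- c(NL[i, j, , ], LL[i, j, , ])
      r <- r[is.finite(r)]  # assumption: -Inf means "no rating" and is not counted
      out[i, j] <- length(unique(r)) <= maxUniqueRatings
    }
  }
  out
}

# Toy dataset: 1 treatment, 2 readers, 20 cases in NL and 10 in LL
toy <- list(ratings = list(
  NL = array(-Inf, dim = c(1, 2, 20, 1)),
  LL = array(-Inf, dim = c(1, 2, 10, 1))
))
toy$ratings$NL[1, 1, , 1] <- rep(1:5, 4)   # reader 1: 5-point integer scale
toy$ratings$LL[1, 1, , 1] <- rep(1:5, 2)
toy$ratings$NL[1, 2, , 1] <- (1:20) / 10   # reader 2: 30 distinct values
toy$ratings$LL[1, 2, , 1] <- (21:30) / 10

binned <- isBinnedSketch(toy)
print(binned)
```

Reader 1 uses five distinct ratings and is classified as binned, while reader 2 uses 30 distinct values and is not.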
Check the validity of a dataset
Description
Checks the validity of the dataset.
Usage
isValidDataset(dataset)
Arguments
dataset | The dataset object to be checked. |
Value
TRUE if dataset is valid, FALSE otherwise.