Help for package RJafroc

Type:

Package

Title:

Artificial Intelligence Systems and Observer Performance

Version:

2.1.2

Date:

2022-11-08

Depends:

R (≥ 3.5.0)

Imports:

bbmle, binom, dplyr, ggplot2, mvtnorm, numDeriv, openxlsx, readxl, Rcpp, stats, stringr, tools, utils

Suggests:

testthat, knitr, kableExtra, rmarkdown

LinkingTo:

Rcpp

Description:

Analyzing the performance of artificial intelligence (AI) systems/algorithms characterized by a 'search-and-report' strategy. Historically, observer performance has dealt with measuring radiologists' performances in search tasks, e.g., searching for lesions in medical images and reporting them, but the implicit location information has been ignored. The implemented methods apply to analyzing the absolute and relative performances of AI systems, comparing AI performance to a group of human readers or optimizing the reporting threshold of an AI system. In addition to performing historical receiver operating characteristic (ROC) analysis (localization information ignored), the software also performs free-response receiver operating characteristic (FROC) analysis, where lesion localization information is used. A book using the software has been published: Chakraborty DP: Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, Taylor-Francis LLC; 2017: https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840. Online updates to this book, which use the software, are at https://dpc10ster.github.io/RJafrocQuickStart/, https://dpc10ster.github.io/RJafrocRocBook/ and at https://dpc10ster.github.io/RJafrocFrocBook/. Supported data collection paradigms are the ROC, FROC and the location ROC (LROC). ROC data consists of a single rating per image, where a rating is the perceived confidence level that the image is that of a diseased patient. An ROC curve is a plot of true positive fraction vs. false positive fraction. FROC data consists of a variable number (zero or more) of mark-rating pairs per image, where a mark is the location of a reported suspicious region and the rating is the confidence level that it is a real lesion. LROC data consists of a rating and a location of the most suspicious region, for every image. 
Four models of observer performance, and curve-fitting software, are implemented: the binormal model (BM), the contaminated binormal model (CBM), the correlated contaminated binormal model (CORCBM), and the radiological search model (RSM). Unlike the binormal model, CBM, CORCBM and RSM predict 'proper' ROC curves that do not inappropriately cross the chance diagonal. Additionally, RSM parameters are related to search performance (not measured in conventional ROC analysis) and classification performance. Search performance refers to finding lesions, i.e., true positives, while simultaneously not finding false positive locations. Classification performance measures the ability to distinguish between true and false positive locations. Knowing these separate performances allows principled optimization of reader or AI system performance. This package supersedes Windows JAFROC (jackknife alternative FROC) software V4.2.1, https://github.com/dpc10ster/WindowsJafroc. Package functions are organized as follows. Data file related function names are preceded by 'Df', curve fitting functions by 'Fit', included data sets by 'dataset', plotting functions by 'Plot', significance testing functions by 'St', sample size related functions by 'Ss', data simulation functions by 'Simulate' and utility functions by 'Util'. Implemented are figures of merit (FOMs) for quantifying performance and functions for visualizing empirical or fitted operating characteristics: e.g., ROC, FROC, alternative FROC (AFROC) and weighted AFROC (wAFROC) curves. For fully crossed study designs significance testing of reader-averaged FOM differences between modalities is implemented via either Dorfman-Berbaum-Metz or the Obuchowski-Rockette methods. Also implemented is single treatment analysis, which allows comparison of performance of a group of radiologists to a specified value, or comparison of AI to a group of radiologists interpreting the same cases. 
Crossed-modality analysis is implemented wherein there are two crossed treatment factors and the aim is to determine performance in each treatment factor averaged over all levels of the second factor. Sample size estimation tools are provided for ROC and FROC studies; these use estimates of the relevant variances from a pilot study to predict required numbers of readers and cases in a pivotal study to achieve the desired power. Utility and data file manipulation functions allow data to be read in any of the currently used input formats, including Excel, and the results of the analysis can be viewed in text or Excel output files. The methods are illustrated with several included datasets from the author's collaborations. This update includes improvements to the code, some as a result of user-reported bugs and new feature requests, and others discovered during ongoing testing and code simplification.

License:

GPL-3

LazyData:

true

URL:

https://dpc10ster.github.io/RJafroc/

RoxygenNote:

7.2.1

Encoding:

UTF-8

NeedsCompilation:

yes

Packaged:

2022-11-08 18:38:08 UTC; Dev

Author:

Dev Chakraborty [cre, aut, cph], Peter Phillips [ctb], Xuetong Zhai [aut]

Maintainer:

Dev Chakraborty <dpc10ster@gmail.com>

Repository:

CRAN

Date/Publication:

2022-11-08 19:10:02 UTC

Artificial Intelligence Systems and Observer Performance

Description

RJafroc analyzes the performance of artificial intelligence (AI) systems/algorithms characterized by a search-and-report strategy. Historically, observer performance has dealt with measuring radiologists' performances in search tasks, e.g., searching for lesions in medical images and reporting them, but the implicit location information has been ignored. The methods here apply to any task involving searching for and reporting arbitrary targets in images. The implemented methods apply to analyzing the absolute and relative performances of AI systems, comparing AI performance to a group of human readers or optimizing the reporting threshold of an AI system. In addition to performing historical receiver operating characteristic (ROC) analysis (localization information ignored), the software also performs free-response receiver operating characteristic (FROC) analysis, where the implicit lesion localization information is used. A book that describes the underlying methodology and uses the software has been published: Chakraborty DP: Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, Taylor-Francis LLC; 2017: https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840. Online updates to this book, which use the software, are at https://dpc10ster.github.io/RJafrocQuickStart/, https://dpc10ster.github.io/RJafrocRocBook/ and at https://dpc10ster.github.io/RJafrocFrocBook/. Supported data collection paradigms are the ROC, FROC and the location ROC (LROC). ROC data consists of a single rating per image, where a rating is the perceived confidence level that the image is that of a diseased patient. An ROC curve is a plot of true positive fraction vs. false positive fraction. 
FROC data consists of a variable number (zero or more) of mark-rating pairs per image, where a mark is the location of a reported suspicious region and the rating is the confidence level that it is a real lesion. LROC data consists of a rating and a location of the most suspicious region, for every image. Four models of observer performance, and curve-fitting software, are implemented: the binormal model (BM), the contaminated binormal model (CBM), the correlated contaminated binormal model (CORCBM), and the radiological search model (RSM). Unlike the binormal model, CBM, CORCBM and RSM predict "proper" ROC curves that do not inappropriately cross the chance diagonal. Additionally, RSM parameters are related to search performance (not measured in conventional ROC analysis) and classification performance. Search performance refers to finding lesions, i.e., true positives, while simultaneously not finding false positive locations. Classification performance measures the ability to distinguish between true and false positive locations. Knowing these separate performances allows principled optimization of reader or AI system performance. This package supersedes Windows JAFROC (jackknife alternative FROC) software V4.2.1, https://github.com/dpc10ster/WindowsJafroc. Package functions are organized as follows. Data file related function names are preceded by Df, curve fitting functions by Fit, included data sets by dataset, plotting functions by Plot, significance testing functions by St, sample size related functions by Ss, data simulation functions by Simulate and utility functions by Util. Implemented are figures of merit (FOMs) for quantifying performance and functions for visualizing empirical operating characteristics: e.g., ROC, FROC, alternative FROC (AFROC) and weighted AFROC (wAFROC) curves. 
For fully crossed study designs, significance testing of reader-averaged FOM differences between modalities is implemented via both the Dorfman-Berbaum-Metz and the Obuchowski-Rockette methods. Also implemented are single treatment analyses, allowing comparison of performance of a group of radiologists to a specified value, or comparison of AI to a group of radiologists/algorithms interpreting the same cases. Crossed-modality analysis is implemented wherein there are two crossed treatment factors and the aim is to determine performance in each treatment factor averaged over all levels of the second factor. Sample size estimation tools are provided for ROC and FROC studies; these use estimates of the relevant variances from a pilot study to predict required numbers of readers and cases in a pivotal study to achieve the desired power. Utility and data file manipulation functions allow data to be read in any of the currently used input formats, including Excel, and the results of the analysis can be viewed in text or Excel output files. The methods are illustrated with several included datasets from the author's collaborations. This update includes improvements to the code, some as a result of user-reported bugs and new feature requests, and others discovered during ongoing testing and code simplification. All changes are noted in NEWS.md.

Details

Package:	RJafroc
Type:	Package
Version:	2.1.2
Date:	2022-11-08
License:	GPL-3
URL:	https://dpc10ster.github.io/RJafroc/

Definitions and abbreviations

a: The separation or "a" parameter of the binormal model
AFROC curve: plot of LLF (ordinate) vs. FPF, where FPF is inferred using highest rating of NL marks on non-diseased cases
AFROC: alternative FROC, see Chakraborty 1989
AFROC1 curve: plot of LLF (ordinate) vs. FPF1, where FPF1 is inferred using highest rating of NL marks on ALL cases
alpha: The significance level \alpha of the test of the null hypothesis of no treatment effect
AUC: area under curve; e.g., ROC-AUC = area under ROC curve, an example of a FOM
b: The width or "b" parameter of the conventional binormal model
Binormal model: two unequal variance normal distributions, one at zero and one at mu, for modeling ROC ratings, sigma is the std. dev. ratio of diseased to non-diseased distributions
CAD: computer aided detection algorithm
CBM: contaminated binormal model (CBM): two equal variance normal distributions for modeling ROC ratings, the diseased distribution is bimodal, with a peak at zero and one at \mu, the integrated fraction at \mu is \alpha (not to be confused with \alpha of NH testing)
CI: The (1-\alpha) confidence interval for the stated statistic
Crossed modality: a dataset containing two modality (treatment) factors, with the levels of the two factors crossed, see paper by Thompson et al
DBM: Dorfman-Berbaum-Metz, a significance testing method for detecting a treatment effect in MRMC studies
DBMH: Hillis' modification of the DBM method
ddf: Denominator degrees of freedom of appropriate F-test; the corresponding ndf is I - 1
Empirical AUC: trapezoidal area under curve, same as the Wilcoxon statistic for ROC paradigm
FN: false negative, a diseased case classified as non-diseased
FOM: figure of merit, a quantitative measure of performance, performance metric
FP: false positive, a non-diseased case classified as diseased
FPF: number of FPs divided by number of non-diseased cases
FROC curve: plot of LLF (ordinate) vs. NLF
FROC: free-response ROC (a data collection paradigm where each image yields a random number, 0, 1, 2,..., of mark-rating pairs)
FRRC: Analysis that treats readers as fixed and cases as random factors
I: total number of modalities, indexed by i
image/case: used interchangeably; a case can consist of several images of the same patient in the same treatment
iMRMC: A text file format used for ROC data by FDA/CDRH researchers
individual: A single-treatment single-reader dataset.
Intrinsic: Used in connection with RSM; a parameter that is independent of the RSM \mu parameter, but whose meaning may not be as transparent as the corresponding physical parameter
J: number of readers, indexed by j
JAFROC file format: A .xlsx format file, applicable to ROC, ROI, FROC and LROC paradigms
JAFROC: jackknife AFROC: Windows software for analyzing observer performance data: no longer updated, replaced by current package; the name is a misnomer as the jackknife is used only for significance testing; alternatively, the bootstrap could be used; what distinguishes FROC from ROC analysis is the use of the AFROC-AUC as the FOM. With this change, the DBM or the OR method can be used for significance testing
K: total number of cases, K = K1 + K2, indexed by k
K1: total number of non-diseased cases, indexed by k1
K2: total number of diseased cases, indexed by k2
LL: lesion localization i.e., a mark that correctly locates an existing localized lesion; TP is a special case, when the proximity criterion is lax (i.e., "acceptance radius" is large)
LLF: number of LLs divided by the total number of lesions
LROC: location receiver operating characteristic, a data collection paradigm where each image yields a single rating and one location
lrc/MRMC: A text file format used for ROC data by University of Iowa researchers
mark: the location of a suspected diseased region
maxLL: maximum number of lesions per case in dataset
maxNL: maximum number of NL marks per case in dataset
MRMC: multiple reader multiple case (each reader interprets each case in each treatment, i.e. fully crossed study design)
ndf: Numerator degrees of freedom of appropriate F-test, usually number of treatments minus one
NH: The null hypothesis that all treatment effects are zero; rejected if the p-value is smaller than \alpha
NL: non-lesion localization, of which FP is a special case, i.e., a mark that does not correctly locate any existing localized lesion(s)
NLF: number of NLs divided by the total number of cases
Operating characteristic: A plot of normalized correct decisions on diseased cases along ordinate vs. normalized incorrect decisions on non-diseased cases
Operating point: A point on an operating characteristic, e.g., (FPF, TPF) represents an operating point on an ROC
OR: Obuchowski-Rockette, a significance testing method for detecting a treatment effect in MRMC studies
ORH: Hillis' modification of the OR method
Physical parameter: Used in connection with RSM; a parameter whose meaning is more transparent than the corresponding intrinsic parameter, but which depends on the RSM \mu parameter
Proximity criterion / acceptance radius: Used in connection with FROC (or LROC data); the "nearness" criterion is used to determine if a mark is close enough to a lesion to be counted as a LL (or correct localization); otherwise it is counted as a NL (or incorrect localization)
p-value: the probability, under the null hypothesis, that the observed treatment effects, or larger, could occur by chance
Proper: a proper fit does not inappropriately fall below the chance diagonal, does not display a "hook" near the upper right corner
PROPROC: Metz's binormal model based fitting of proper ROC curves
RSM, Radiological Search Model: two unit variance normal distributions for modeling NL and LL ratings; four parameters, \mu, \nu', \lambda' and \zeta1
Rating: Confidence level assigned to a case; higher values indicate greater confidence in presence of disease; -Inf is allowed but NA is not allowed
Reader/observer/radiologist/CAD: used interchangeably
RJafroc: the current software
ROC: receiver operating characteristic, a data collection paradigm where each image yields a single rating and location information is ignored
ROC curve: plot of TPF (ordinate) vs. FPF, as threshold is varied; an example of an operating characteristic
ROCFIT: Metz software for binormal model based fitting of ROC data
ROI: region-of-interest (each case is divided into a number of ROIs and the reader assigns an ROC rating to each ROI)
RRFC: Analysis that treats readers as random and cases as fixed factors
RRRC: Analysis that treats both readers and cases as random factors
RSCORE-II: original software for binormal model based fitting of ROC data
RSM: Radiological search model, also method for fitting a proper ROC curve to ROC data
RSM-\zeta1: Lowest reporting threshold, determines if suspicious region is actually marked
RSM-\lambda: Intrinsic parameter of RSM corresponding to \lambda', independent of \mu
RSM-\lambda': Physical Poisson parameter of RSM, average number of latent NLs per case; depends on \mu
RSM-\mu: separation of the unit variance distributions of RSM
RSM-\nu: Intrinsic parameter of RSM, corresponding to \nu', independent of \mu
RSM-\nu': binomial parameter of RSM, probability that lesion is found
SE: sensitivity, same as TPF
Significance testing: determining the p-value of a statistical test
SP: specificity, same as 1-FPF
Threshold: Reporting criteria: if confidence exceeds a threshold value, report case as diseased, otherwise report non-diseased
TN: true negative, a non-diseased case classified as non-diseased
TP: true positive, a diseased case classified as diseased
TPF: number of TPs divided by number of diseased cases
Treatment/modality: used interchangeably, for example, computed tomography (CT) images vs. magnetic resonance imaging (MRI) images
wAFROC curve: plot of weighted LLF (ordinate) vs. FPF, where FPF is inferred using highest rating of NL marks on non-diseased cases ONLY
wAFROC1 curve: plot of weighted LLF (ordinate) vs. FPF1, where FPF1 is inferred using highest rating of NL marks on ALL cases
wAFROC1 FOM: weighted trapezoidal area under AFROC1 curve: only use if there are zero non-diseased cases
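
Several of the definitions above (FPF, TPF, operating point, empirical AUC) can be illustrated with a few lines of base R. The following is a minimal sketch on made-up ratings, not package code: the helper names `wilcoxon_auc` and `binormal_auc` are hypothetical, and the binormal AUC uses the standard closed form Phi(a / sqrt(1 + b^2)) for the "a" and "b" parameters.

```r
# Toy ROC ratings (illustrative values only, not from any package dataset)
fp <- c(1, 2, 2, 3, 1, 4)      # ratings on K1 = 6 non-diseased cases
tp <- c(3, 5, 4, 2, 5, 4, 3)   # ratings on K2 = 7 diseased cases

# Wilcoxon statistic: fraction of (diseased, non-diseased) pairs that are
# correctly ordered, with ties counted as one half
wilcoxon_auc <- function(fp, tp) {
  d <- outer(tp, fp, "-")
  mean((d > 0) + 0.5 * (d == 0))
}

# Empirical operating points: report "diseased" if rating >= threshold
thresholds <- sort(unique(c(fp, tp)), decreasing = TRUE)
fpf <- c(0, sapply(thresholds, function(z) mean(fp >= z)))
tpf <- c(0, sapply(thresholds, function(z) mean(tp >= z)))

# Trapezoidal area under the empirical ROC curve
trapezoid_auc <- sum(diff(fpf) * (head(tpf, -1) + tail(tpf, -1)) / 2)

# Binormal-model AUC in terms of the a and b parameters
binormal_auc <- function(a, b) pnorm(a / sqrt(1 + b^2))

print(c(wilcoxon = wilcoxon_auc(fp, tp), trapezoid = trapezoid_auc))
```

The two empirical values printed agree, which is the sense in which the empirical AUC is "the same as the Wilcoxon statistic" for the ROC paradigm.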

Dataset

The dataset object has 3 list elements: $ratings, $lesions and $descriptions, where:

dataset$ratings: contains 3 elements as sub-lists: $NL, $LL and $LL_IL; these describe the structure of the ratings;
dataset$lesions: contains 3 elements as sub-lists: $perCase, $IDs and $weights; these describe the structure of the lesions;
dataset$descriptions: contains 7 elements as sub-lists: $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID; these describe other characteristics of the dataset as detailed next.

Note: -Inf is used to indicate the ratings of unmarked lesions and/or missing values. As an example of the latter, if the maximum number of NLs in a dataset is 4, but some images have fewer than 4 NL marks, the corresponding "empty" positions would be filled with -Infs. Do not use NA to denote a missing rating.
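
The padding convention in the note above can be sketched in base R as follows. The ratings and the helper `pad` are hypothetical and serve only to show why -Inf, rather than NA, is a convenient filler: comparisons and maxima remain well defined.

```r
# Hypothetical NL ratings for three cases with 2, 0 and 4 marks respectively
nl_marks <- list(c(2.1, 0.7), numeric(0), c(1.5, 3.2, 0.4, 2.8))
maxNL <- max(lengths(nl_marks))

# One row per case, padded on the right with -Inf (never NA)
pad <- function(x, n) c(x, rep(-Inf, n - length(x)))
nl <- t(sapply(nl_marks, pad, n = maxNL))

# -Inf never exceeds a finite threshold, so counting marks is straightforward
marks_per_case <- rowSums(is.finite(nl))

# The highest rating is defined for every case; an unmarked case yields -Inf
highest_rating <- apply(nl, 1, max)

print(nl)
```

With NA as the filler, `nl > threshold` would return NA for the empty positions and every downstream sum or maximum would need `na.rm = TRUE`.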

Note: "dataset" in this package always represents R object(s) with the following structure(s):

General data structure, e.g., `dataset02`, an ROC dataset, and `dataset05`, an FROC dataset.

ratings$NL: a float array with dimensions c(I, J, K, maxNL), containing the ratings of NL marks. The first K1 locations of the third index correspond to NL marks on non-diseased cases and the remaining locations correspond to NL marks on diseased cases. The 4th dimension allows for multiple NL marks on a case: the first index holds the first NL rating on the image, the second holds the second NL rating on the image, etc. The value of maxNL is determined by the case with the maximum number of NL marks in the dataset. For FROC datasets missing NL ratings are assigned the -Inf rating. For ROC datasets, FP ratings are assigned to the first K1 elements, NL[,,1:K1,1], and the remaining K2 elements, NL[,,(K1+1):K,1], are set to -Inf.
ratings$LL: for non-LROC datasets a float array with dimensions c(I, J, K2, maxLL) containing the ratings of LL marks. The value of maxLL is determined by the maximum number of lesions per case in the dataset. Unmarked lesions are assigned the -Inf rating. For ROC datasets TP ratings are assigned to LL[,,1:K2,1]. For LROC datasets it is a float array with dimensions c(I, J, K2, 1) containing the ratings of correct localizations, otherwise the rating is recorded in the incorrect localization array described next.
ratings$LL_IL: for LROC datasets the ratings of incorrect localization marks on abnormal cases. It is a float array with dimensions c(I, J, K2, 1). For non-LROC datasets this array is filled with NAs.
lesions$perCase: an integer array with length K2, the number of lesions on each diseased case. The maximum value of this array equals maxLL. For example, dataset05$lesions$perCase[4] is 2, meaning the 4th diseased case has two lesions.
lesions$IDs: an integer array with dimensions [K2, maxLL], labeling (or naming) the lesions on the diseased cases. For example, dataset05$lesions$IDs[4,] is c(1,2,-Inf), meaning the 4th diseased case has two lesions, labeled 1 and 2.
lesions$weights: a floating point array with dimensions c(K2, maxLL), representing the relative importance of detecting each lesion. The weights for an abnormal case must sum to unity. For example, dataset05$lesions$weights[4,] is c(0.5,0.5, -Inf), corresponding to equal weights (0.5) assigned to each of the two lesions in the case.
descriptions$fileName: a character variable containing the file name of the source data for this dataset. This is generated automatically by the DfReadDataFile function used to read the file. For a simulated dataset it is set to "NA" (i.e., a character vector, not the variable NA).
descriptions$type: a character variable describing the data type: "ROC", "LROC", "ROI" or "FROC".
descriptions$name: a character variable containing the name of the dataset: e.g., "dataset02" or "dataset05". This is generated automatically by the DfReadDataFile function used to read the file.
descriptions$truthTableStr: a c(I, J, L, maxLL+1) object. For normal cases elements c(I, J, L, 1) are filled with 1s if the corresponding interpretations occurred or NAs otherwise. For abnormal cases elements c(I, J, L, 2:(maxLL+1)) are filled with 1s if the corresponding interpretations occurred or NAs otherwise. This object is necessary for analyzing more complex designs, e.g., split-plot, as described next.
descriptions$design: a character variable: "FCTRL", "SPLIT-PLOT-A" or "SPLIT-PLOT-C", corresponding to factorial, split-plot-A or split-plot-C designs. The A and C refer to subparts of Table VII in a Hillis 2014 publication.
descriptions$modalityID: a character vector of length I, which labels/names the modalities in the dataset. For non-JAFROC data file formats, they must be unique integers.
descriptions$readerID: a character vector of length J, which labels/names the readers in the dataset. For non-JAFROC data file formats, they must be unique integers.
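
As a rough illustration of the lesion-related members described above, the following base R sketch builds toy `perCase`, `IDs` and `weights` objects for three diseased cases, plus an empty `LL` ratings array of matching shape. All values are made up, equal weighting is an assumption, and the list is only a structural mock-up, not a complete or validated dataset object.

```r
I <- 2; J <- 3                     # hypothetical numbers of modalities and readers
perCase <- c(1L, 2L, 3L)           # lesions on each of K2 = 3 diseased cases
K2 <- length(perCase)
maxLL <- max(perCase)

# Unused positions are filled with -Inf, as in the description above
IDs <- matrix(-Inf, nrow = K2, ncol = maxLL)
weights <- matrix(-Inf, nrow = K2, ncol = maxLL)
for (k in seq_len(K2)) {
  n <- perCase[k]
  IDs[k, seq_len(n)] <- seq_len(n)     # label lesions 1, 2, ...
  weights[k, seq_len(n)] <- 1 / n      # equal weights (an assumption)
}
lesions <- list(perCase = perCase, IDs = IDs, weights = weights)

# LL ratings array with every lesion initially unmarked
LL <- array(-Inf, dim = c(I, J, K2, maxLL))

# Check that the finite weights on each diseased case sum to unity
w <- lesions$weights
w[!is.finite(w)] <- 0
print(rowSums(w))
```

The second row reproduces the pattern quoted above for a case with two lesions: IDs of c(1, 2, -Inf) and weights of c(0.5, 0.5, -Inf).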

ROI data structure, example `datasetROI`

Only changes from the previously described structure are described below:

ratings$NL: a float array with dimensions c(I, J, K, Q) containing the ratings of each of Q quadrants for each non-diseased case.
ratings$LL: a float array with dimensions c(I, J, K2, Q) containing the ratings of quadrants for each diseased case.
lesions$perCase: this contains the locations, on abnormal cases, containing at least one lesion.

Crossed modality data structure, example `datasetCrossedModality`

Only changes from the previously described structure are described below:

ratings$NL: a float array with dimension c(I1, I2, J, K, maxNL) containing the ratings of NL marks. Note the existence of two modality indices.
ratings$LL: a float array with dimension c(I1, I2, J, K2, maxLL) containing the ratings of all LL marks. Note the existence of two modality indices.
modalityID1: corresponding to first modality factor.
modalityID2: corresponding to second modality factor.

Df: Datafile Related Functions

Df2RJafrocDataset: Convert a ratings array to a dataset object.
DfBinDataset: Return a binned dataset.
DfCreateCorCbmDataset: Create paired dataset for testing FitCorCbm.
DfExtractDataset: Extract a subset of modalities and readers from a dataset.
DfFroc2Roc: Convert an FROC dataset to a highest rating inferred ROC dataset.
DfLroc2Roc: Convert an LROC dataset to a highest rating inferred ROC dataset.
DfLroc2Froc: Simulates an "AUC-equivalent" FROC dataset from a supplied LROC dataset.
DfFroc2Lroc: Simulates an "AUC-equivalent" LROC dataset from a supplied FROC dataset.
DfReadCrossedModalities: Read a crossed-modalities data file.
DfReadDataFile: Read a general data file.
DfSaveDataFile: Save ROC data file in a different format.
DfExtractCorCbmDataset: Extract two arms of a pairing from an MRMC ROC dataset suitable for using FitCorCbm.
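
The idea behind a "highest rating inferred" ROC dataset can be sketched in base R for a single modality and reader. This is a conceptual illustration on made-up ratings and does not reproduce how DfFroc2Roc is implemented: each case receives the largest rating among all of its marks, and an unmarked case receives -Inf.

```r
K1 <- 2; K2 <- 2   # hypothetical numbers of non-diseased and diseased cases

# NL ratings, one row per case (first K1 rows are non-diseased), -Inf = no mark
nl <- rbind(c(1.2, -Inf),
            c(-Inf, -Inf),
            c(0.5, 2.0),
            c(-Inf, -Inf))

# LL ratings, one row per diseased case, -Inf = unmarked lesion
ll <- rbind(c(3.1, -Inf),
            c(-Inf, -Inf))

# Non-diseased cases: highest NL rating becomes the FP rating
fp <- apply(nl[1:K1, , drop = FALSE], 1, max)

# Diseased cases: highest rating over both NL and LL marks becomes the TP rating
nl_dis <- apply(nl[(K1 + 1):(K1 + K2), , drop = FALSE], 1, max)
tp <- pmax(nl_dis, apply(ll, 1, max))

print(list(fp = fp, tp = tp))
```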

Fitting Functions

FitBinormalRoc: Fit the binormal model to ROC data (R equivalent of ROCFIT or RSCORE).
FitCbmRoc: Fit the contaminated binormal model (CBM) to ROC data.
FitRsmRoc: Fit the radiological search model (RSM) to ROC data.
FitCorCbm: Fit the correlated contaminated binormal model (CORCBM) to paired ROC data.

Plotting Functions

PlotBinormalFit: Plot binormal-predicted ROC curve with provided BM parameters.
PlotEmpiricalOperatingCharacteristics: Plot empirical operating characteristics for specified dataset.
PlotRsmOperatingCharacteristics: Plot RSM-fitted ROC curves.

Simulation Functions

SimulateFrocDataset: Simulates an uncorrelated FROC dataset using the RSM.
SimulateRocDataset: Simulates an uncorrelated binormal model ROC dataset.
SimulateCorCbmDataset: Simulates a paired ROC dataset using the correlated contaminated binormal model (CORCBM).
SimulateLrocDataset: Simulates an uncorrelated LROC dataset.
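
To make the binormal simulation concrete, here is a minimal base R sketch under the usual parameterization (non-diseased ratings N(0, 1), diseased ratings with mean a/b and standard deviation 1/b). The function `simulate_binormal` is a hypothetical helper written for this illustration; it is not SimulateRocDataset and makes no claim about that function's arguments.

```r
# Hypothetical helper: draw continuous ROC ratings from the binormal model
simulate_binormal <- function(K1, K2, a, b) {
  list(fp = rnorm(K1),                           # non-diseased: N(0, 1)
       tp = rnorm(K2, mean = a / b, sd = 1 / b)) # diseased: N(a/b, 1/b^2)
}

set.seed(1)
a <- 1.5; b <- 0.8
sim <- simulate_binormal(K1 = 1000, K2 = 1000, a = a, b = b)

# Empirical (Wilcoxon) AUC of the simulated ratings
d <- outer(sim$tp, sim$fp, "-")
empirical_auc <- mean((d > 0) + 0.5 * (d == 0))

# Population AUC implied by the binormal parameters
population_auc <- pnorm(a / sqrt(1 + b^2))

print(c(empirical = empirical_auc, population = population_auc))
```

With a large number of cases the empirical value should lie close to the population value; the remaining difference is sampling variability.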

Sample size Functions

SsPowerGivenJK: Calculate statistical power given numbers of readers J and cases K.
SsPowerTable: Generate a power table.
SsSampleSizeKGivenJ: Calculate number of cases K, for specified number of readers J, to achieve desired power for an ROC study.

Significance Testing Functions

StSignificanceTesting: Perform significance testing, DBM or OR.
StSignificanceTestingCadVsRad: Perform significance testing, CAD vs. radiologists.
StSignificanceTestingCrossedModalities: Perform significance testing using crossed modalities analysis.

Miscellaneous and Utility Functions

UtilAucBinormal: Binormal model AUC function.
UtilAucCBM: CBM AUC function.
UtilAucPROPROC: PROPROC AUC function.
UtilAnalyticalAucsRSM: RSM ROC/AFROC AUC calculator.
UtilFigureOfMerit: Calculate empirical figures of merit (FOMs) for specified dataset.
UtilIntrinsic2RSM: Convert from intrinsic to physical RSM parameters.
UtilLesionWeightsMatrix: Calculates the lesion weights matrix.
UtilMeanSquares: Calculates the mean squares used in the DBMH and ORH methods.
UtilOutputReport: Generate a formatted report file.
UtilRSM2Intrinsic: Convert from RSM parameters to intrinsic RSM parameters.
UtilPseudoValues: Return jackknife pseudovalues.
UtilVarComponentsDBM: Utility for Dorfman-Berbaum-Metz variance components.
UtilORVarComponentsFactorial: Utility for Obuchowski-Rockette variance components.
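
Jackknife pseudovalues, which underlie the DBM method, can be sketched in base R for one modality and one reader. The code below uses the standard definition (K times the full-sample FOM minus K - 1 times the FOM with case k removed) with the Wilcoxon AUC as the FOM; the function names are hypothetical and the code is not taken from UtilPseudoValues.

```r
# Wilcoxon AUC used as the figure of merit
wilcoxon_auc <- function(fp, tp) {
  d <- outer(tp, fp, "-")
  mean((d > 0) + 0.5 * (d == 0))
}

# Pseudovalue for case k: K * theta - (K - 1) * theta with case k removed
jackknife_pseudovalues <- function(fp, tp) {
  K1 <- length(fp); K2 <- length(tp); K <- K1 + K2
  theta <- wilcoxon_auc(fp, tp)
  theta_minus <- c(
    sapply(seq_len(K1), function(k) wilcoxon_auc(fp[-k], tp)),  # drop a non-diseased case
    sapply(seq_len(K2), function(k) wilcoxon_auc(fp, tp[-k]))   # drop a diseased case
  )
  K * theta - (K - 1) * theta_minus
}

# Toy ratings (illustrative values only)
fp <- c(1, 2, 2, 3, 1, 4)
tp <- c(3, 5, 4, 2, 5, 4, 3)
pv <- jackknife_pseudovalues(fp, tp)

print(c(mean_pseudovalue = mean(pv), auc = wilcoxon_auc(fp, tp)))
```

For the Wilcoxon AUC the mean of the pseudovalues equals the full-sample FOM, which the printed values show.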

Author(s)

Author: Dev Chakraborty dpc10ster@gmail.com.
Author: Xuetong Zhai xuetong.zhai@gmail.com.
Contributor: Peter Phillips peter.phillips@cumbria.ac.uk.

References

Basics of ROC

Metz, CE (1978). Basic principles of ROC analysis. In Seminars in nuclear medicine (Vol. 8, pp. 283–298). Elsevier.

Metz, CE (1986). ROC Methodology in Radiologic Imaging. Investigative Radiology, 21(9), 720.

Metz, CE (1989). Some practical issues of experimental design and data analysis in radiological ROC studies. Investigative Radiology, 24(3), 234.

Metz, CE (2008). ROC analysis in medical imaging: a tutorial review of the literature. Radiological Physics and Technology, 1(1), 2–12.

Wagner, R. F., Beiden, S. V, Campbell, G., Metz, CE, & Sacks, W. M. (2002). Assessment of medical imaging and computer-assist systems: lessons from recent experience. Academic Radiology, 9(11), 1264–77.

Wagner, R. F., Metz, CE, & Campbell, G. (2007). Assessment of medical imaging systems and computer aids: a tutorial review. Academic Radiology, 14(6), 723–48.

DBM/OR methods and extensions

Dorfman, DD, Berbaum, KS, & Metz, CE (1992). Receiver operating characteristic rating analysis: generalization to the population of readers and patients with the jackknife method. Investigative Radiology, 27(9), 723.

Obuchowski, NA, & Rockette, HE (1994). Hypothesis testing of diagnostic accuracy for multiple readers and multiple tests: an ANOVA approach with dependent observations. Communications in Statistics-Simulation and Computation, 24(2), 285–308.

Hillis, SL, Berbaum, KS, & Metz, CE (2008). Recent developments in the Dorfman-Berbaum-Metz procedure for multireader ROC study analysis. Academic Radiology, 15(5), 647–61.

Hillis, SL, Obuchowski, NA, & Berbaum, KS (2011). Power Estimation for Multireader ROC Methods: An Updated and Unified Approach. Acad Radiol, 18, 129–142.

Hillis, SL (2007). A comparison of denominator degrees of freedom methods for multiple observer ROC analysis. Statistics in Medicine, 26(3), 596–619.

FROC paradigm

Chakraborty DP. Maximum Likelihood analysis of free-response receiver operating characteristic (FROC) data. Med Phys. 1989;16(4):561–568.

Chakraborty, DP, & Berbaum, KS (2004). Observer studies involving detection and localization: modeling, analysis, and validation. Medical Physics, 31(8), 1–18.

Chakraborty, DP (2006). A search model and figure of merit for observer data acquired according to the free-response paradigm. Physics in Medicine and Biology, 51(14), 3449–62.

Chakraborty, DP (2006). ROC curves predicted by a model of visual search. Physics in Medicine and Biology, 51(14), 3463–82.

Chakraborty, DP (2011). New Developments in Observer Performance Methodology in Medical Imaging. Seminars in Nuclear Medicine, 41(6), 401–418.

Chakraborty, DP (2013). A Brief History of Free-Response Receiver Operating Characteristic Paradigm Data Analysis. Academic Radiology, 20(7), 915–919.

Chakraborty, DP, & Yoon, H.-J. (2008). Operating characteristics predicted by models for diagnostic tasks involving lesion localization. Medical Physics, 35(2), 435.

Thompson JD, Chakraborty DP, Szczepura K, et al. (2016) Effect of reconstruction methods and x-ray tube current-time product on nodule detection in an anthropomorphic thorax phantom: a crossed-modality JAFROC observer study. Medical Physics. 43(3):1265-1274.

Zhai X, Chakraborty DP. (2017) A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. doi: 10.1002/mp.12263:2207–2222.

Hillis SL, Chakraborty DP, Orton CG. ROC or FROC? It depends on the research question. Medical Physics. 2017.

Chakraborty DP, Nishikawa RM, Orton CG. Due to potential concerns of bias and conflicts of interest, regulatory bodies should not do evaluation methodology research related to their regulatory missions. Medical Physics. 2017.

Dobbins III JT, McAdams HP, Sabol JM, Chakraborty DP, et al. (2016) Multi-Institutional Evaluation of Digital Tomosynthesis, Dual-Energy Radiography, and Conventional Chest Radiography for the Detection and Management of Pulmonary Nodules. Radiology. 282(1):236-250.

Warren LM, Mackenzie A, Cooke J, et al. Effect of image quality on calcification detection in digital mammography. Medical Physics. 2012;39(6):3202-3213.

Chakraborty DP, Zhai X. On the meaning of the weighted alternative free-response operating characteristic figure of merit. Medical Physics. 2016;43(5):2548-2557.

Chakraborty DP. (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples. Taylor-Francis, LLC.

Compute the chi-square goodness of fit statistic for ROC fitting model

Description

Compute the chi-square goodness of fit statistic for a specified ROC data fitting model

Usage

ChisqrGoodnessOfFit(fpCounts, tpCounts, parameters, model, lesDistr)

Arguments

fpCounts

The FP counts table

tpCounts

The TP counts table

parameters

The parameters of the model including cutoffs, see details

model

The fitting model: "BINORMAL", "CBM" or "RSM"

lesDistr

The lesion distribution matrix; not needed for "BINORMAL" or "CBM" models. Array [1:maxLL,1:2]. The probability mass function of the lesion distribution for diseased cases. The first column contains the actual numbers of lesions per case. The second column contains the fraction of diseased cases with the number of lesions specified in the first column. The second column must sum to unity.

Details

For model = "BINORMAL" the parameters are c(a,b,zetas). For model = "CBM" the parameters are c(mu,alpha,zetas). For model = "RSM" the parameters are c(mu,lambda,nu,zetas). Due to the sparsity of the data, in most cases the goodness of fit statistic cannot be calculated as the criterion of at least 5 counts in each cell (TP and FP) is usually not met. An exception dataset is shown below.
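As a rough illustration of the statistic described above, the following base-R sketch computes a Pearson chi-square for the binormal case only. It assumes the parameterization given under FitBinormalRoc (diseased ratings from N(a,1), non-diseased ratings from N(0,b^2)) and assumes that the degrees of freedom equal the number of bins minus 3. The helper name chisqBinormalSketch and the parameter values are hypothetical, no merging of sparse bins is attempted, and this is not the package's implementation.

```r
# Hypothetical helper: Pearson chi-square for a binormal fit (a sketch, not RJafroc code)
chisqBinormalSketch <- function(fpCounts, tpCounts, a, b, zetas) {
  stopifnot(length(fpCounts) == length(tpCounts),
            length(zetas) == length(fpCounts) - 1)
  cuts <- c(-Inf, zetas, Inf)                      # bin boundaries
  # Assumed convention: non-diseased ~ N(0, b^2), diseased ~ N(a, 1)
  fpProb <- diff(pnorm(cuts, mean = 0, sd = b))    # bin probabilities, non-diseased
  tpProb <- diff(pnorm(cuts, mean = a, sd = 1))    # bin probabilities, diseased
  fpExp <- sum(fpCounts) * fpProb                  # expected FP counts per bin
  tpExp <- sum(tpCounts) * tpProb                  # expected TP counts per bin
  chisq <- sum((fpCounts - fpExp)^2 / fpExp) +
    sum((tpCounts - tpExp)^2 / tpExp)
  # Assumed: 2(R - 1) free cells minus (R + 1) fitted parameters = R - 3
  df <- length(fpCounts) - 3
  pVal <- if (df > 0) pchisq(chisq, df, lower.tail = FALSE) else NA
  list(chisq = chisq, pVal = pVal, df = df)
}

# Counts as in the example of this section; a, b and zetas are made-up values,
# not the fitted values returned by FitBinormalRoc
fpCounts <- c(119, 30, 9, 19, 7, 1)
tpCounts <- c(10, 11, 7, 16, 29, 16)
res <- chisqBinormalSketch(fpCounts, tpCounts, a = 1.3, b = 0.7,
                           zetas = c(0.6, 1.0, 1.2, 1.8, 2.6))
res$df
```

In practice the parameters would come from a fit such as FitBinormalRoc rather than being chosen by hand.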

Value

The return value is a list with the following elements:

chisq

The chi-square statistic

pVal

The p-value of the fit

df

The degrees of freedom

Examples

## Test with TONY data for which chisqr can be calculated
ds <- DfFroc2Roc(dataset01)
fit <- FitBinormalRoc(ds, 2, 3) # trt 2 and rdr 3
## fitted a,b and zeta parameters from preceding line were used to call the
## function as shown below:
fpCounts = c(119,  30,   9,  19,   7,   1)
tpCounts = c(10, 11,  7, 16, 29, 16)
gfit = ChisqrGoodnessOfFit(fpCounts, tpCounts, 
parameters = c(fit$a, fit$b, fit$zetas), model="BINORMAL") 
gfit

Convert ratings arrays to an RJafroc dataset

Description

Converts ratings arrays, ROC or FROC, but not LROC, to an RJafroc dataset, thereby allowing the user to leverage the file I/O, plotting and analysis capabilities of RJafroc.

Usage

Df2RJafrocDataset(NL, LL, InputIsCountsTable = FALSE, ...)

Arguments

NL

Non-lesion localizations array (or FP array for ROC data).

LL

Lesion localizations array (or TP array for ROC data).

InputIsCountsTable

If TRUE, the NL and LL arrays are rating-counts tables, with common lengths equal to the number of ratings R. If FALSE, the default, these are arrays of lengths K1, the number of non-diseased cases, and K2, the number of diseased cases, respectively.

...

Other elements of RJafroc dataset that may, depending on the context, need to be specified. perCase must be specified if an FROC dataset is to be returned. It is a K2-length array specifying the numbers of lesions in each diseased case in the dataset.

Details

The function "senses" the data type (ROC or FROC) from the absence or presence of perCase.

ROC data can be NL[1:K1] and LL[1:K2] or NL[1:I,1:J,1:K1] and LL[1:I,1:J,1:K2].
FROC data can be NL[1:K1,1:maxNL] and LL[1:K2, 1:maxLL] or NL[1:I,1:J,1:K1,1:maxNL] and LL[1:I,1:J,1:K2,1:maxLL].

Here maxNL/maxLL = maximum numbers of NLs/LLs, per case, over entire dataset. Equal weights are assigned to every lesion (FROC data). Consecutive characters/integers starting with "1" are assigned to IDs, modalityID and readerID.
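The relation between the two ROC input forms (one rating per case versus a rating-counts table) can be sketched in base R. This is only an illustration of the idea, not the package's internal code, and it assumes that bin r of a counts table corresponds to integer rating r.

```r
# Rating-counts tables: number of cases at each rating 1..R
K1t <- c(30, 19, 8, 2, 1)    # non-diseased cases
K2t <- c(5, 6, 5, 12, 22)    # diseased cases

# Expand each table into one integer rating per case
fp <- rep(seq_along(K1t), times = K1t)   # length K1 = sum(K1t)
tp <- rep(seq_along(K2t), times = K2t)   # length K2 = sum(K2t)

# Tabulating the ratings recovers the original counts tables
fpCounts <- tabulate(fp, nbins = length(K1t))
tpCounts <- tabulate(tp, nbins = length(K2t))
```

Under this assumption, passing fp and tp with the default InputIsCountsTable = FALSE describes the same data as passing K1t and K2t with InputIsCountsTable = TRUE.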

Value

A dataset with the structure described in RJafroc-package.

Examples

## Input as ratings arrays
set.seed(1);NL <- rnorm(5);LL <- rnorm(7)*1.5 + 2
dataset <- Df2RJafrocDataset(NL, LL)

## Input as counts tables
K1t <- c(30, 19, 8, 2, 1)
K2t <- c(5,  6, 5, 12, 22)
dataset <- Df2RJafrocDataset(K1t, K2t, InputIsCountsTable = TRUE)

Returns a binned dataset

Description

Bins continuous (i.e., floating point) or quasi-continuous (e.g., integers 0-100) ratings in a dataset and returns the corresponding binned dataset, in which the ratings are integers 1, 2, ..., with higher values representing greater confidence in the presence of disease.

Usage

DfBinDataset(dataset, desiredNumBins = 7, opChType)

Arguments

dataset

The dataset to be binned, with structure as in RJafroc-package.

desiredNumBins

The desired number of bins. The default is 7.

opChType

The operating characteristic relevant to the binning operation: "ROC", "FROC", "AFROC", or "wAFROC".

Details

For small datasets the number of bins may be smaller than desiredNumBins. The algorithm needs to know the type of operating characteristic relevant to the binning operation. For ROC the bins are FP and TP counts, for FROC the bins are NL and LL counts, for AFROC the bins are FP and LL counts, and for wAFROC the bins are FP and wLL counts. Binning is generally employed prior to fitting a statistical model, e.g., by maximum likelihood, to the data. This version chooses the cutoffs so as to maximize the empirical AUC (this yields a unique choice of cutoffs which gives the reader the maximum deserved credit).
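The effect of binning on the empirical AUC can be explored with a small base-R sketch. The cutoffs here are equally spaced quantiles of the pooled ratings, a simple stand-in that is NOT the AUC-maximizing choice made by DfBinDataset. The helper wilcoxonAuc and all data values are hypothetical.

```r
# Empirical (Wilcoxon) AUC: P(tp > fp) + 0.5 * P(tp == fp)
wilcoxonAuc <- function(fp, tp) {
  d <- outer(tp, fp, "-")
  mean((d > 0) + 0.5 * (d == 0))
}

set.seed(1)
fp <- rnorm(50)               # continuous ratings, non-diseased cases
tp <- rnorm(70, mean = 1.5)   # continuous ratings, diseased cases

# Stand-in cutoff choice: equally spaced quantiles of the pooled ratings
nBins <- 7
cutoffs <- quantile(c(fp, tp), probs = seq_len(nBins - 1) / nBins)

# findInterval() returns 0..(nBins - 1); add 1 to get integer ratings 1..nBins
fpBinned <- findInterval(fp, cutoffs) + 1
tpBinned <- findInterval(tp, cutoffs) + 1

aucOrg <- wilcoxonAuc(fp, tp)
aucBinned <- wilcoxonAuc(fpBinned, tpBinned)
cat("original, binned AUC = ", aucOrg, aucBinned, "\n")
```

Binning turns some strictly ordered pairs into ties, so the binned AUC generally differs slightly from the original; a careful choice of cutoffs keeps that difference small.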

Value

The binned dataset

References

Miller GA (1956) The Magical Number Seven, Plus or Minus Two: Some limits on our capacity for processing information, The Psychological Review 63, 81-97

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples, CRC Press, Boca Raton, FL. https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples


binned <- DfBinDataset(dataset02, desiredNumBins = 3, opChType = "ROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "ROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "AFROC")
binned <- DfBinDataset(dataset05, desiredNumBins = 4, opChType = "wAFROC")
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 1)
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 2)
binned <- DfBinDataset(dataset05, opChType = "wAFROC", desiredNumBins = 3)
## etc.

 

## takes longer than 5 sec on OSX
dataset <- SimulateRocDataset(I = 2, J = 5, K1 = 50, K2 = 70, a = 1, b = 0.5, seed = 123)
datasetB <- DfBinDataset(dataset, desiredNumBins = 7, opChType = "ROC")
fomOrg <- as.matrix(UtilFigureOfMerit(dataset, FOM = "Wilcoxon"))
print(fomOrg)
fomBinned <- as.matrix(UtilFigureOfMerit(datasetB, FOM = "Wilcoxon"))
print(fomBinned)
cat("mean, sd = ", mean(fomOrg), sd(fomOrg), "\n")
cat("mean, sd = ", mean(fomBinned), sd(fomBinned), "\n")

Create paired dataset for testing `FitCorCbm`

Description

The paired dataset is generated using bivariate sampling; details are in the referenced publication.

Usage

DfCreateCorCbmDataset(
  seed = 123,
  K1 = 50,
  K2 = 50,
  desiredNumBins = 5,
  muX = 1.5,
  muY = 3,
  alphaX = 0.4,
  alphaY = 0.7,
  rhoNor = 0.3,
  rhoAbn2 = 0.8
)

Arguments

seed

The seed variable, default is 123; set to NULL for truly random seed

K1

The number of non-diseased cases, default is 50

K2

The number of diseased cases, default is 50

desiredNumBins

The desired number of bins; default is 5

muX

The CBM \mu parameter in condition X

muY

The CBM \mu parameter in condition Y

alphaX

The CBM \alpha parameter in condition X

alphaY

The CBM \alpha parameter in condition Y

rhoNor

The correlation of non-diseased case z-samples

rhoAbn2

The correlation of diseased case z-samples, when disease is visible in both conditions

Details

The ROC data is binned to desiredNumBins bins (default 5) in each condition.

Value

The return value is the desired dataset, suitable for testing FitCorCbm.

References

Zhai X, Chakraborty DP (2017) A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.

Examples

## seed <- 1 
## this gives unequal numbers of bins in X and Y conditions for 50/50 dataset
dataset <- DfCreateCorCbmDataset()


## this takes very long time!! used to show asymptotic convergence of ML estimates 
## dataset <- DfCreateCorCbmDataset(K1 = 5000, K2 = 5000)

Extract two arms of a pairing from an MRMC ROC dataset

Description

Extract a paired dataset from a larger dataset. The pairing could be two readers in the same treatment, or different readers in different treatments, or the same reader in different treatments. If necessary, the data is binned to 5 bins in each condition.

Usage

DfExtractCorCbmDataset(dataset, trts = 1, rdrs = 1)

Arguments

dataset

The original dataset from which the pairing is to be extracted

trts

A vector, of maximum length 2, containing the indices of the treatment or treatments to be extracted

rdrs

A vector, of maximum length 2, containing the indices of the reader or readers to be extracted

Details

The desired pairing is contained in the vectors trts and rdrs. If either has length one, the other must have length two and the pairing is implicit. If both are length two, then the pairing is that implied by the first treatment and the second reader, which is one arm, and the other arm is that implied by the second treatment paired with the first reader. Using this method any allowed pairing can be extracted and analyzed by FitCorCbm. The utility of this software is in designing a ratings simulator that is statistically matched to a real dataset.

Value

A new dataset in which the number of treatments is one and the number of readers is two

Examples



## Extract the paired data corresponding to the second and third readers in the first treatment
## from the included ROC dataset
dataset11_23 <- DfExtractCorCbmDataset(dataset05, trts = 1, rdrs = c(2,3))

## Extract the paired data corresponding to the third reader in the first and second treatments
dataset12_33 <- DfExtractCorCbmDataset(dataset05, trts = c(1,2), rdrs = 3)

## Extract the data corresponding to the first reader in the first
## treatment paired with the data
## from the third reader in the second treatment
## (the bin indices are at different positions in the two arrays)
dataset12_13 <- DfExtractCorCbmDataset(dataset05,
trts = c(1,2), rdrs = c(1,3))

Extract a subset of treatments and readers from a dataset

Description

Extract a dataset consisting of a subset of treatments/readers from a larger dataset

Usage

DfExtractDataset(dataset, trts, rdrs)

Arguments

dataset

The original dataset from which the subset is to be extracted

trts

A vector containing the indices of the treatments to be extracted. If this parameter is not supplied, all treatments are extracted.

rdrs

A vector containing the indices of the readers to be extracted. If this parameter is not supplied, all readers are extracted.

Details

Note that trts and rdrs are vectors of indices, not IDs. For example, if the ID of the first reader is "0", the corresponding value in rdrs should be 1, not 0.
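The distinction between IDs and indices can be illustrated with a short base-R sketch. The label vectors below are hypothetical stand-ins for the reader and treatment IDs of a dataset; match() converts labels to the positions that trts and rdrs expect.

```r
# Hypothetical reader and treatment labels, as they might appear in a data file
readerID <- c("0", "1", "2", "4")
modalityID <- c("BT", "DM")

# rdrs and trts expect positions in these vectors, not the labels themselves
rdrs <- match(c("0", "4"), readerID)   # positions of the readers labelled "0" and "4"
trts <- match("DM", modalityID)        # position of the treatment labelled "DM"
```

The resulting values could then be supplied as DfExtractDataset(dataset, trts = trts, rdrs = rdrs).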

Value

A new dataset containing only the specified treatments and readers that were extracted from the original dataset

Examples

## Extract the data corresponding to the second reader in the 
## first treatment from an included ROC dataset
ds1 <- DfExtractDataset(dataset05, trts = 1, rdrs = 2)

## Extract the data of the first and third readers in all
## treatments from the included ROC dataset
ds2 <- DfExtractDataset(dataset05, rdrs = c(1, 3))

Simulates an "AUC-equivalent" LROC dataset from an FROC dataset

Description

Simulates a multiple-treatment multiple-reader "AUC-equivalent" LROC dataset from a supplied FROC dataset.

Usage

DfFroc2Lroc(dataset)

Arguments

dataset

The FROC dataset to be converted to LROC.

Details

The FROC paradigm can have 0 or more marks per case. However, LROC is restricted to exactly one mark per case. For the NL array of the LROC data, for non-diseased cases, the highest rating of the FROC marks, or -Inf if there are no marks, is copied to case index k1 = 1 to k1 = K1 of the LROC dataset. For each diseased case, if the max LL rating exceeds the max NL rating, then the max LL rating is copied to the LL array, otherwise the max NL rating is copied to the LL_IL array. The max NL rating on each diseased case is then set to -Inf (since the LROC paradigm only allows one mark). The equivalent LROC dataset has the same HrAuc as the original FROC dataset. See example. The main use of this function is to test the significance testing functions using MRMC LROC datasets, which are not currently available to the author.
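The per-case logic described above can be sketched in base R for a single reader and treatment. The ratings matrices are made-up values with -Inf denoting an unused mark slot, and the variable names are hypothetical; this is an illustration of the rule, not the package's code.

```r
# Hypothetical FROC ratings; rows are cases, columns are mark slots
nlNonDis <- rbind(c(0.8, 0.2), c(-Inf, -Inf), c(1.1, -Inf))  # K1 = 3 non-diseased cases
nlDis    <- rbind(c(0.5, -Inf), c(2.0, 0.3), c(-Inf, -Inf))  # NL marks on K2 = 3 diseased cases
llDis    <- rbind(c(1.7, 0.9), c(1.2, -Inf), c(-Inf, -Inf))  # LL marks on the same cases

rowMax <- function(m) apply(m, 1, max)

lrocNL <- rowMax(nlNonDis)   # highest NL rating per non-diseased case, -Inf if unmarked
maxNL <- rowMax(nlDis)
maxLL <- rowMax(llDis)

correct <- maxLL > maxNL     # the best lesion mark outranks every NL on the case
lrocLL    <- ifelse(correct, maxLL, -Inf)   # correct-localization ratings
lrocLL_IL <- ifelse(correct, -Inf, maxNL)   # incorrect-localization ratings
```

Because pmax(lrocLL, lrocLL_IL) equals the highest FROC rating on each diseased case, the highest-rating ROC data, and hence HrAuc, is unchanged by the conversion in this sketch.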

Value

The equivalent LROC dataset

Examples


lrocDataset <- DfFroc2Lroc(dataset05)
frocHrAuc <- UtilFigureOfMerit(dataset05, FOM = "HrAuc")   
lrocWilcoxonAuc <- UtilFigureOfMerit(lrocDataset, FOM = "Wilcoxon")
## expect_equal(frocHrAuc, lrocWilcoxonAuc)

Convert an FROC dataset to an ROC dataset

Description

Convert an FROC dataset to a highest rating inferred ROC dataset

Usage

DfFroc2Roc(dataset)

Arguments

dataset

The FROC dataset to be converted, see RJafroc-package.

Details

The first member of the ROC dataset is NL, whose 3rd dimension has length (K1 + K2), the total number of cases. Ratings of cases (K1 + 1) through (K1 + K2) are -Inf. This is because in an ROC dataset FPs are only possible on non-diseased cases. The second member of the list is LL. Its 3rd dimension has length K2, the number of diseased cases. This is because TPs are only possible on diseased cases. For each case the inferred ROC rating is the highest of all FROC ratings on that case. If a case has no marks, a finite ROC rating, guaranteed to be smaller than the rating on any marked case, is assigned to it. The dataset structure is shown below:

NL Ratings array [1:I, 1:J, 1:(K1+K2), 1], of false positives, FPs
LL Ratings array [1:I, 1:J, 1:K2, 1], of true positives, TPs
perCase array [1:K2], number of lesions per diseased case
IDs array [1:K2, 1], labels of lesions on diseased cases
weights array [1:K2, 1], weights (or clinical importances) of lesions
dataType "ROC", the data type
modalityID [1:I] inherited modality labels
readerID [1:J] inherited reader labels

Value

An ROC dataset with finite ratings in NL[,,1:K1,1] and LL[,,1:K2,1].

Examples

rocDataSet <- DfFroc2Roc(dataset05)
rocSpDataSet <- DfFroc2Roc(datasetFROCSpC)

## in the following example, because of the smaller number of cases, 
## it is easy to see the process at work:

set.seed(1);K1 <- 3;K2 <- 5
mu <- 1;nu <- 0.5;lambda <- 2;zeta1 <- 0
lambda_i <- UtilRSM2Intrinsic(mu,lambda,nu)$lambda_i
nu_i <- UtilRSM2Intrinsic(mu,lambda,nu)$nu_i
Lmax <- 2;Lk2 <- floor(runif(K2, 1, Lmax + 1))
frocDataRaw <- SimulateFrocDataset(mu, lambda_i, nu_i, zeta1, I = 1, J = 1, 
K1, K2, perCase = Lk2)
hrData <- DfFroc2Roc(frocDataRaw)

## print("frocDataRaw$ratings$NL[1,1,,] = ")
## print("hrData$ratings$NL[1,1,1:K1,] = ")
## print("frocDataRaw$ratings$LL[1,1,,] = ")
## print("hrData$ratings$LL[1,1,,] = ")

## following is the output

## [1] "frocDataRaw$ratings$NL[1,1,,] = "
##           [,1]      [,2]      [,3] [,4]
## [1,] 2.4046534 0.7635935      -Inf -Inf
## [2,]      -Inf      -Inf      -Inf -Inf
## [3,] 0.2522234      -Inf      -Inf -Inf
## [4,] 0.4356833      -Inf      -Inf -Inf
## [5,]      -Inf      -Inf      -Inf -Inf
## [6,]      -Inf      -Inf      -Inf -Inf
## [7,]      -Inf      -Inf      -Inf -Inf
## [8,] 0.8041895 0.3773956 0.1333364 -Inf

## > ## print("hrData$ratings$NL[1,1,1:K1,] = ")
## [1] "hrData$ratings$NL[1,1,1:K1,] = "
## [1] 2.4046534      -Inf 0.2522234
## > ## print("frocDataRaw$ratings$LL[1,1,,] = ")
## [1] "frocDataRaw$ratings$LL[1,1,,] = "
##           [,1] [,2]
## [1,]      -Inf -Inf
## [2,] 1.5036080 -Inf
## [3,] 0.8442045 -Inf
## [4,] 1.0467262 -Inf
## [5,]      -Inf -Inf
## > ## print("hrData$ratings$LL[1,1,,] = ")
## [1] "hrData$ratings$LL[1,1,,] = "
## [1] 0.4356833 1.5036080 0.8442045 1.0467262 0.8041895
## Note that rating of the first and the last diseased case came from NL marks
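The highest-rating inference seen in the printed output above can be reproduced in base R from the printed FROC ratings alone. This is a sketch of the rule, not the package's code; unmarked cases are left at -Inf here, as in the printed output.

```r
# FROC ratings copied from the printed output above (K1 = 3, K2 = 5)
K1 <- 3; K2 <- 5
NL <- rbind(
  c(2.4046534, 0.7635935, -Inf, -Inf),
  c(-Inf, -Inf, -Inf, -Inf),
  c(0.2522234, -Inf, -Inf, -Inf),
  c(0.4356833, -Inf, -Inf, -Inf),
  c(-Inf, -Inf, -Inf, -Inf),
  c(-Inf, -Inf, -Inf, -Inf),
  c(-Inf, -Inf, -Inf, -Inf),
  c(0.8041895, 0.3773956, 0.1333364, -Inf)
)
LL <- rbind(
  c(-Inf, -Inf),
  c(1.5036080, -Inf),
  c(0.8442045, -Inf),
  c(1.0467262, -Inf),
  c(-Inf, -Inf)
)

maxNL <- apply(NL, 1, max)   # highest NL rating on each of the K1 + K2 cases
maxLL <- apply(LL, 1, max)   # highest LL rating on each diseased case

fp <- maxNL[1:K1]                              # inferred ROC ratings, non-diseased cases
tp <- pmax(maxNL[(K1 + 1):(K1 + K2)], maxLL)   # inferred ROC ratings, diseased cases
print(fp)
print(tp)
```

The first and last elements of tp come from NL marks on diseased cases, which is the point made in the final comment of the example.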

Simulates an "AUC-equivalent" FROC dataset from an LROC dataset

Description

Simulates a multiple-treatment multiple-reader "AUC-equivalent" FROC dataset from a supplied LROC dataset, e.g., datasetCadLroc.

Usage

DfLroc2Froc(dataset)

Arguments

dataset

The LROC dataset to be converted to FROC.

Details

The LROC paradigm always yields a single mark per case. Therefore the equivalent FROC will also have only one mark per case. The NL arrays of the two datasets are identical. The LL array is created by copying the LLCl array of the LROC dataset to the LL array of the FROC dataset, from diseased case index k2 = 1 to k2 = K2. Additionally, the LLIl array of the LROC dataset is copied to the NL array of the FROC dataset, from case index k1 = K1+1 to k1 = K1+K2. Any zero ratings are replaced by -Inf. The equivalent FROC dataset has the same HrAuc as the original LROC dataset. See example. The main use of this function is to test the CAD significance testing functions using CAD FROC datasets, which are not currently available to the author.

Value

The equivalent FROC dataset

Examples


frocDataset <- DfLroc2Froc(datasetCadLroc)
lrocAuc <- UtilFigureOfMerit(datasetCadLroc, FOM = "Wilcoxon")
frocHrAuc <- UtilFigureOfMerit(frocDataset, FOM = "HrAuc")

Convert an LROC dataset to a ROC dataset

Description

Converts an LROC dataset to an ROC dataset

Usage

DfLroc2Roc(dataset)

Arguments

dataset

The LROC dataset to be converted.

Details

For each diseased case one takes the maximum rating on that case, which could be an LL ("true positive" correct localization) or an LL_IL ("true positive" incorrect localization) rating, whichever is higher. For non-diseased cases the NL arrays are identical.

Value

An ROC dataset

Examples

rocDataSet <- DfLroc2Roc(datasetCadLroc)

Read a crossed-treatment data file

Description

Read a crossed-treatment data file, in which the two treatment factors are crossed

Usage

DfReadCrossedModalities(fileName, sequentialNames = FALSE)

Arguments

fileName

A string specifying the name of the file that contains the dataset, which must be an extended-JAFROC format data file containing an additional treatment factor.

sequentialNames

If TRUE, consecutive integers (starting from 1) will be used as the treatment and reader IDs. Otherwise, treatment and reader IDs in the original data file will be used. The default is FALSE.

Details

The data format is similar to the JAFROC format (see RJafroc-package). The difference is that there are two treatment factors. An example is still to be added (TBA); see the FROC book at https://dpc10ster.github.io/RJafrocFrocBook/

Value

A dataset with the specified structure, similar to a standard RJafroc dataset (see RJafroc-package). Because of the extra treatment factor, NL and LL are each five-dimensional arrays. There are also two treatment IDs: modalityID1 and modalityID2.

References

Thompson JD, Chakraborty DP, Szczepura K, et al. (2016) Effect of reconstruction methods and x-ray tube current-time product on nodule detection in an anthropomorphic thorax phantom: a crossed-treatment JAFROC observer study. Medical Physics. 43(3):1265-1274.

Read a data file

Description

Read a disk file and create a ROC, FROC or LROC dataset object from it.

Usage

DfReadDataFile(
  fileName,
  format = "JAFROC",
  newExcelFileFormat = FALSE,
  lrocForcedMark = NA,
  delimiter = ",",
  sequentialNames = FALSE
)

Arguments

fileName

A string specifying the name of the file. The file-extension must match the format specified below.

format

A string specifying the format of the data file. It can be "JAFROC", the default, which requires a .xlsx Excel file (not .xls), "MRMC" or "iMRMC". For "MRMC" the format is determined by the data file extension (.csv or .txt or .lrc) as specified in https://perception.lab.uiowa.edu/. For "iMRMC" the file extension is .imrmc and the format is described in https://code.google.com/archive/p/imrmc/. See following note for important information about deprecation of the "MRMC" format.

newExcelFileFormat

Logical. Must be TRUE to read LROC data. This argument only applies to the "JAFROC" format. The default is FALSE. If TRUE, the function accommodates 3 additional columns in the Truth worksheet. If FALSE, the original function (as in version 1.2.0) is used and the three extra columns, if present, cause an error.

lrocForcedMark

Logical, for LROC datasets only: whether a forced mark is required on every image. The default is NA. Set it to TRUE if a mark is required and to FALSE otherwise.

delimiter

The string delimiter to be used for the "MRMC" format ("," is the default), see https://perception.lab.uiowa.edu/. This parameter is not used when reading "JAFROC" or "iMRMC" data files.

sequentialNames

A logical variable: if TRUE, consecutive integers (starting from 1) will be used as the treatment and reader IDs (i.e., names). Otherwise, treatment and reader IDs in the original data file will be used.

Value

A dataset with the structure specified in RJafroc-package.

Note

The "MRMC" format is deprecated. For non-JAFROC formats four file extensions (.csv, .txt, .lrc and .imrmc) are possible, all of which are restricted to ROC data. Only the iMRMC format is actively supported, i.e., files with extension .imrmc. Other formats (.csv, .txt, .lrc) are deprecated. Such files can still be read by this function and then saved to a JAFROC format file for further analysis within this package. For non-JAFROC data file formats, the readerID and modalityID fields must be unique integers.
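Since the file extension must agree with the format argument, a small wrapper can pick the format string from the extension. The helper guessFormat below is hypothetical and not part of the package; it only encodes the extension-to-format correspondence stated in this section.

```r
# Hypothetical helper: map a file extension to the format string expected by
# DfReadDataFile(), following the extensions listed in this section
guessFormat <- function(fileName) {
  ext <- tolower(tools::file_ext(fileName))
  switch(ext,
         xlsx  = "JAFROC",
         csv   = ,            # falls through to "MRMC"
         txt   = ,            # falls through to "MRMC"
         lrc   = "MRMC",      # deprecated formats, ROC data only
         imrmc = "iMRMC",
         stop("Unsupported file extension: ", ext))
}

guessFormat("RocData.imrmc")
```

A call such as DfReadDataFile(fileName, format = guessFormat(fileName)) would then keep the two arguments consistent.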

Examples

fileName <- system.file("extdata", "toyFiles/ROC/rocCr.xlsx", 
package = "RJafroc", mustWork = TRUE)
rdrArr1D <- DfReadDataFile(fileName, newExcelFileFormat = TRUE)



fileName <- system.file("extdata", "Roc.xlsx", 
package = "RJafroc", mustWork = TRUE)
RocDataXlsx <- DfReadDataFile(fileName)

fileName <- system.file("extdata", "RocData.csv", 
package = "RJafroc", mustWork = TRUE)
RocDataCsv <- DfReadDataFile(fileName, format = "MRMC")

fileName <- system.file("extdata", "RocData.imrmc", 
package = "RJafroc", mustWork = TRUE)
RocDataImrmc <- DfReadDataFile(fileName, format = "iMRMC")

fileName <- system.file("extdata", "Froc.xlsx", 
package = "RJafroc", mustWork = TRUE)
FrocDataXlsx <- DfReadDataFile(fileName, sequentialNames = TRUE)

Save ROC dataset in different formats

Description

Save ROC dataset in other formats so it can be analyzed with alternate software

Usage

DfSaveDataFile(
  dataset,
  fileName,
  format = "MRMC",
  dataDescription = "RJafroc dataset converted to imrmc format"
)

Arguments

dataset

The dataset to be saved.

fileName

The file name of the output data file. The extension of the data file must match the corresponding format, see RJafroc-package

format

The format of the data file, which can be "MRMC" or "iMRMC", see RJafroc-package.

dataDescription

An optional string variable describing the data file; the default value is "RJafroc dataset converted to imrmc format". The description appears on the first line of a *.lrc or *.imrmc data file. This parameter is not used when saving the dataset in other formats.

Examples


## DfSaveDataFile(dataset = dataset02, 
##    fileName = "rocData2.csv", format = "MRMC")
## DfSaveDataFile(dataset = dataset02, 
##    fileName = "rocData2.lrc", format = "MRMC", 
##     dataDescription = "ExampleROCdata1")
## DfSaveDataFile(dataset = dataset02, 
##    fileName = "rocData2.txt", format = "MRMC", 
##     dataDescription = "ExampleROCdata2")
##  DfSaveDataFile(dataset = dataset02, 
##    fileName = "dataset05.imrmc", format = "iMRMC", 
##    dataDescription = "ExampleROCdata3")

Save dataset object as a JAFROC format Excel file

Description

Save a dataset object as a JAFROC format Excel file

Usage

DfWriteExcelDataFile(dataset, fileName)

Arguments

dataset

The dataset object, see RJafroc-package.

fileName

The file name to save to; the extension of the data file must be .xlsx

Examples


##DfWriteExcelDataFile(dataset = dataset05, fileName = "rocData2.xlsx")

Fit the binormal model to selected treatment and reader in an ROC dataset

Description

Fit the binormal model-predicted ROC curve for a dataset. This is the R equivalent of ROCFIT or RSCORE

Usage

FitBinormalRoc(dataset, trt = 1, rdr = 1)

Arguments

dataset

The ROC dataset

trt

The desired treatment, default is 1

rdr

The desired reader, default is 1

Details

In the binormal model, ratings (more accurately the latent decision variables) from diseased cases are sampled from N(a,1) while ratings for non-diseased cases are sampled from N(0,b^2). To avoid clutter, error bars are only shown for the lowest and uppermost operating points. An FROC dataset is internally converted to a highest rating inferred ROC dataset. Too many bins containing zero counts will cause the algorithm to fail, so be sure to bin the data appropriately to fewer bins, where each bin has at least one count.
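Under the sampling model stated above, the operating points and the area under the binormal ROC curve follow directly from the normal distribution function. The base-R sketch below uses made-up parameter values and checks the standard closed-form AUC, pnorm(a / sqrt(1 + b^2)), against a numerical integration; it is an illustration, not the fitting code.

```r
# Illustrative (made-up) binormal parameters
a <- 1.5   # mean of the diseased distribution, N(a, 1)
b <- 0.8   # standard deviation of the non-diseased distribution, N(0, b^2)

# Operating point at cutoff zeta: FPF = P(N(0, b^2) > zeta), TPF = P(N(a, 1) > zeta)
zetas <- c(-0.5, 0.3, 1.0, 1.8)
FPF <- pnorm(zetas, mean = 0, sd = b, lower.tail = FALSE)
TPF <- pnorm(zetas, mean = a, sd = 1, lower.tail = FALSE)

# Eliminating zeta gives the ROC curve TPF = pnorm(a + b * qnorm(FPF))
rocCurve <- function(fpf) pnorm(a + b * qnorm(fpf))

# Closed-form AUC
aucClosed <- pnorm(a / sqrt(1 + b^2))

# Numerical check: trapezoidal rule along the curve on a fine grid of cutoffs
grid <- seq(-8, 10, by = 0.001)
fpfGrid <- pnorm(grid, mean = 0, sd = b, lower.tail = FALSE)
tpfGrid <- pnorm(grid, mean = a, sd = 1, lower.tail = FALSE)
aucNumeric <- sum(-diff(fpfGrid) * (head(tpfGrid, -1) + tail(tpfGrid, -1)) / 2)
```

The closed form follows because the difference between a diseased and a non-diseased rating is distributed as N(a, 1 + b^2).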

Value

The returned value is a list with the following elements:

a

The mean of the diseased distribution; the non-diseased distribution is assumed to have zero mean

b

The standard deviation of the non-diseased distribution. The diseased distribution is assumed to have unit standard deviation

zetas

The binormal model cutoffs, zetas or thresholds

AUC

The binormal model fitted ROC-AUC

StdAUC

The standard deviation of AUC

NLLIni

The initial value of the negative log-likelihood

NLLFin

The final value of the negative log-likelihood

ChisqrFitStats

The chisquare goodness of fit results

covMat

The covariance matrix of the parameters

fittedPlot

A ggplot2 object containing the fitted operating characteristic along with the empirical operating points. Use print() to display the object

References

Dorfman DD, Alf E (1969) Maximum-Likelihood Estimation of Parameters of Signal-Detection Theory and Determination of Confidence Intervals - Rating-Method Data, Journal of Mathematical Psychology 6, 487-496.

Grey D, Morgan B (1972) Some aspects of ROC curve-fitting: normal and logistic models. Journal of Mathematical Psychology 9, 128-139.

Examples


## Test with an included ROC dataset
retFit <- FitBinormalRoc(dataset02);## print(retFit$fittedPlot)


## Test with an included FROC dataset; it needs to be binned
## as there are more than 5 discrete ratings levels
binned <- DfBinDataset(dataset05, desiredNumBins = 5, opChType = "ROC")
retFit <- FitBinormalRoc(binned);## print(retFit$fittedPlot)


## Test with single interior point data
fp <- c(rep(1,7), rep(2, 3))
tp <- c(rep(1,5), rep(2, 5))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitBinormalRoc(dataset);## print(retFit$fittedPlot)

## Test with two interior data points
fp <- c(rep(1,7), rep(2, 5), rep(3, 3))
tp <- c(rep(1,3), rep(2, 5), rep(3, 7))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitBinormalRoc(dataset);## print(retFit$fittedPlot)

## Test with TONY data for which chisqr can be calculated
ds <- DfFroc2Roc(dataset01)
retFit <- FitBinormalRoc(ds, 2, 3);## print(retFit$fittedPlot)
retFit$ChisqrFitStats
 
## Test with included degenerate ROC data
retFit <- FitBinormalRoc(datasetDegenerate);## print(retFit$fittedPlot)

Fit the contaminated binormal model (CBM) to selected treatment and reader in an ROC dataset

Description

Fit the CBM-predicted ROC curve for specified treatment and reader

Usage

FitCbmRoc(dataset, trt = 1, rdr = 1)

Arguments

dataset

The dataset containing the data

trt

The desired treatment, default is 1

rdr

The desired reader, default is 1

Details

In CBM, ratings from diseased cases are sampled from a mixture distribution with two components: (1) a normal distribution with mean mu and unit variance, with integrated area alpha, and (2) a unit-normal distribution with integrated area 1-alpha. Ratings for non-diseased cases are sampled from a unit-normal distribution. The ChisqrFitStats consists of a list containing the chi-square value, the p-value and the degrees of freedom.
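The mixture model described above leads to simple expressions for the CBM operating points and AUC. The base-R sketch below uses made-up parameter values and checks the closed-form AUC, alpha * pnorm(mu / sqrt(2)) + (1 - alpha) / 2, numerically; it is an illustration, not the fitting code.

```r
# Illustrative (made-up) CBM parameters
mu <- 2.0      # mean of the visible-disease component
alpha <- 0.7   # proportion of diseased cases where the disease is visible

# Operating point at cutoff zeta
cbmFPF <- function(zeta) pnorm(-zeta)
cbmTPF <- function(zeta) alpha * pnorm(mu - zeta) + (1 - alpha) * pnorm(-zeta)

# Closed-form AUC: cases with invisible disease contribute chance-level performance
aucClosed <- alpha * pnorm(mu / sqrt(2)) + (1 - alpha) * 0.5

# Numerical check: trapezoidal rule along the curve on a fine grid of cutoffs
grid <- seq(-8, 10, by = 0.001)
fpf <- cbmFPF(grid)
tpf <- cbmTPF(grid)
aucNumeric <- sum(-diff(fpf) * (head(tpf, -1) + tail(tpf, -1)) / 2)
```

Since TPF is never smaller than FPF at any cutoff when mu is non-negative, the curve in this sketch does not cross below the chance diagonal.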

Value

The return value is a list with the following elements:

mu

The mean of the visible diseased distribution (the non-diseased distribution has zero mean)

alpha

The proportion of diseased cases where the disease is visible

zetas

The cutoffs, zetas or thresholds

AUC

The AUC of the fitted ROC curve

StdAUC

The standard deviation of AUC

NLLIni

The initial value of the negative log-likelihood

NLLFin

The final value of the negative log-likelihood

ChisqrFitStats

The chisquare goodness of fit results

covMat

The covariance matrix of the parameters

fittedPlot

A ggplot2 object containing the fitted operating characteristic along with the empirical operating points. Use print() to display the object

Note

This algorithm is very robust, much more so than the binormal model.

References

Dorfman DD, Berbaum KS (2000) A contaminated binormal model for ROC data: Part II. A formal model, Acad Radiol, 7:6, 427–437.

Examples



## CPU time 8.7 sec on Ubuntu (#13)
## Test with included ROC data
retFit <- FitCbmRoc(dataset02);## print(retFit$fittedPlot)

## Test with included degenerate ROC data (yes! CBM can fit such data)
retFit <- FitCbmRoc(datasetDegenerate);## print(retFit$fittedPlot)

## Test with single interior point data
fp <- c(rep(1,7), rep(2, 3))
tp <- c(rep(1,5), rep(2, 5))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitCbmRoc(dataset);## print(retFit$fittedPlot)

## Test with two interior data points
fp <- c(rep(1,7), rep(2, 5), rep(3, 3))
tp <- c(rep(1,3), rep(2, 5), rep(3, 7))
dataset <- Df2RJafrocDataset(fp, tp)
retFit <- FitCbmRoc(dataset);
## print(retFit$fittedPlot)

## Test with included ROC data (some bins have zero counts)
retFit <- FitCbmRoc(dataset02, 2, 1);## print(retFit$fittedPlot)

## Test with TONY data for which chisqr can be calculated
ds <- DfFroc2Roc(dataset01)
retFit <- FitCbmRoc(ds, 2, 3);## print(retFit$fittedPlot)
retFit$ChisqrFitStats

Fit CORCBM to a paired ROC dataset

Description

Fit the Correlated Contaminated Binormal Model (CORCBM) to a paired ROC dataset. The ROC dataset has to be formatted as a single treatment, two-reader dataset, even though the actual pairing may be different, see details.

Usage

FitCorCbm(dataset)

Arguments

dataset

A paired ROC dataset

Details

The conditions (X, Y) can be two readers interpreting images in the same treatment, the same reader interpreting images in different treatments, or different readers interpreting images in two different treatments. Function DfExtractCorCbmDataset can be used to construct a dataset suitable for FitCorCbm. With reference to the returned values, and assuming R bins in condition X and L bins in condition Y: FPCounts is the R x L matrix containing the counts for non-diseased cases; TPCounts is the R x L matrix containing the counts for diseased cases; muX, muY, alphaX, alphaY, rhoNor, rhoAbn2 are the CORCBM parameters; aucX, aucY are the AUCs in the two conditions; stdAucX, stdAucY are the corresponding standard errors; stdErr contains the standard errors of the parameters of the model; areaStat, areaPval, covMat are the area-statistic, the p-value and the covariance matrix of the parameters. If a parameter approaches a limit, e.g., rhoNor = 0.9999, it is held constant near the limiting value and the covariance matrix has one less dimension (along each edge) for each parameter that is held constant. The indices of the parameters held fixed are in fitCorCbmRet$fixParam.

Value

The return value is a list containing three objects:

fitCorCbmRet

list(FPCounts,TPCounts, muX,muY,alphaX,alphaY,rhoNor, rhoAbn2,zetaX,zetaY,covMat,fixParam)

stats

list(aucX,aucY,stdAucX, stdAucY,stdErr,areaStat,areaPval)

fittedPlot

The fitted plot with operating points, error bars, for both conditions

References

Fit the radiological search model (RSM) to an ROC dataset

Description

Fit an RSM-predicted ROC curve to a binned single-treatment single-reader ROC dataset

Usage

FitRsmRoc(binnedRocData, lesDistr, trt = 1, rdr = 1)

Arguments

binnedRocData

The binned ROC dataset containing the data

lesDistr

The lesion distribution 1D array.

trt

The selected treatment, default is 1

rdr

The selected reader, default is 1

Details

If the dataset is FROC, first convert it to ROC using DfFroc2Roc. MLE ROC algorithms require binned datasets; use DfBinDataset to perform the binning prior to calling this function. In the RSM: (1) The (random) number of latent NLs per case is Poisson distributed with mean parameter lambda, and the corresponding ratings are sampled from N(0,1). (2) The (random) number of latent LLs per diseased case is binomial distributed with success probability nu and trial size equal to the number of lesions in the case, and the corresponding ratings are sampled from N(mu,1). (3) A latent NL or LL is actually marked if its rating exceeds the lowest threshold zeta1. To avoid clutter, error bars are only shown for the lowest and uppermost operating points. Because of the extra parameter, and the requirement to have five counts, the chi-square statistic often cannot be calculated.
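The three modeling steps above can be sketched in base R. This is a minimal illustration of the description in this section, not the package's fitting or simulation code; the helper name simRsmCaseRating, the parameter values, and the use of -Inf for a case with no marks are assumptions made for the example.

```r
# Hypothetical helper: highest rating on a single case under the RSM steps above.
# nLesions = 0 denotes a non-diseased case.
simRsmCaseRating <- function(mu, lambda, nu, zeta1, nLesions = 0) {
  nNL <- rpois(1, lambda)                       # (1) number of latent NLs
  zNL <- rnorm(nNL, mean = 0, sd = 1)           #     NL ratings ~ N(0,1)
  nLL <- rbinom(1, size = nLesions, prob = nu)  # (2) number of latent LLs
  zLL <- rnorm(nLL, mean = mu, sd = 1)          #     LL ratings ~ N(mu,1)
  z <- c(zNL, zLL)
  marked <- z[z > zeta1]                        # (3) only ratings above zeta1 are marked
  if (length(marked) == 0) -Inf else max(marked)  # assumption: -Inf codes "no marks"
}

set.seed(1)
fp <- replicate(2000, simRsmCaseRating(mu = 2, lambda = 1, nu = 0.8, zeta1 = -1))
tp <- replicate(2000, simRsmCaseRating(mu = 2, lambda = 1, nu = 0.8, zeta1 = -1,
                                       nLesions = 1))
# Fractions of cases with no marks at all
c(nonDiseased = mean(fp == -Inf), diseased = mean(tp == -Inf))
```

Under these assumptions a sizeable fraction of non-diseased cases produces no marks, which is why the resulting ROC data can have sparsely populated bins.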

Value

The return value is a list with the following elements:

mu

The mean of the diseased distribution relative to the non-diseased one

lambda

The Poisson parameter describing the distribution of latent NLs per case

nu

The binomial success probability describing the distribution of latent LLs per diseased case

zetas

The RSM cutoffs, zetas or thresholds

AUC

The RSM fitted ROC-AUC

StdAUC

The standard deviation of AUC

NLLIni

The initial value of the negative log-likelihood

NLLFin

The final value of the negative log-likelihood

ChisqrFitStats

The chi-square goodness-of-fit results

covMat

The covariance matrix of the parameters

fittedPlot

A ggplot2 object containing the fitted operating characteristic along with the empirical operating points. Use print to display the object

References

Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm. Phys Med Biol 51, 3449-3462.

Chakraborty DP (2006) ROC Curves predicted by a model of visual search. Phys Med Biol 51, 3463–3482.

Examples


## Test with included ROC data (some bins have zero counts)
lesDistr <- UtilLesionDistrVector(dataset02)
retFit <- FitRsmRoc(dataset02, lesDistr)
## print(retFit$fittedPlot)

## Test with included degenerate ROC data
lesDistr <- UtilLesionDistrVector(datasetDegenerate)
retFit <- FitRsmRoc(datasetDegenerate, lesDistr)

## Test with single interior point data
fp <- c(rep(1,7), rep(2, 3))
tp <- c(rep(1,5), rep(2, 5))
binnedRocData <- Df2RJafrocDataset(fp, tp)
lesDistr <- UtilLesionDistrVector(binnedRocData)
retFit <- FitRsmRoc(binnedRocData, lesDistr)

## Test with two interior data points
fp <- c(rep(1,7), rep(2, 5), rep(3, 3))
tp <- c(rep(1,3), rep(2, 5), rep(3, 7))
binnedRocData <- Df2RJafrocDataset(fp, tp)
lesDistr <- UtilLesionDistrVector(binnedRocData)
retFit <- FitRsmRoc(binnedRocData, lesDistr)


## Test with three interior data points
fp <- c(rep(1,12), rep(2, 5), rep(3, 3), rep(4, 5)) #25
tp <- c(rep(1,3), rep(2, 5), rep(3, 7), rep(4, 10)) #25
binnedRocData <- Df2RJafrocDataset(fp, tp)
lesDistr <- UtilLesionDistrVector(binnedRocData)
retFit <- FitRsmRoc(binnedRocData, lesDistr)

## Test for TONY data, i = 2 and j = 3
## only case permitting chi-square calculation
lesDistr <- UtilLesionDistrVector(dataset01)
rocData <- DfFroc2Roc(dataset01)
retFit <- FitRsmRoc(rocData, lesDistr, trt = 2, rdr = 3)
## print(retFit$fittedPlot)
retFit$ChisqrFitStats

Plot binormal fit

Description

Plot the binormal-predicted ROC curve with provided parameters

Usage

PlotBinormalFit(a, b)

Arguments

a

vector: the mean(s) of the diseased distribution(s).

b

vector: the standard deviation(s) of the diseased distribution(s).

Details

a and b must have the same length. The predicted ROC curve for each a and b pair will be plotted.
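For orientation, the curve drawn for each pair can be sketched in base R. The sketch assumes a and b are the conventional binormal parameters, so that TPF = pnorm(a + b * qnorm(FPF)); the helper names binormalTpf and binormalAuc are hypothetical, and the code is not the package's plotting routine.

```r
# Textbook binormal ROC ordinate as a function of FPF (assumed parameterization)
binormalTpf <- function(fpf, a, b) pnorm(a + b * qnorm(fpf))
# Closed-form area under the binormal ROC curve
binormalAuc <- function(a, b) pnorm(a / sqrt(1 + b^2))

fpf <- seq(0, 1, length.out = 101)
a <- c(1, 2)
b <- c(0.5, 0.5)
# One column of TPF values per (a, b) pair, mirroring the equal-length requirement
curves <- mapply(function(ai, bi) binormalTpf(fpf, ai, bi), a, b)
aucs <- mapply(binormalAuc, a, b)
aucs
```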

Value

A ggplot2 object of the plotted ROC curve(s) is returned. Use the print function to display the saved object.

Examples

binormalPlot <- PlotBinormalFit(c(1, 2), c(0.5, 0.5))
## print(binormalPlot)

Plot CBM fitted curve

Description

Plot the CBM-predicted ROC curve with provided CBM parameters

Usage

PlotCbmFit(mu, alpha)

Arguments

mu

vector: the mean(s) of the z-samples of the diseased distribution(s) where the disease is visible

alpha

vector: the proportion(s) of the diseased distribution(s) where the disease is visible

Details

mu and alpha must have equal length. The predicted ROC curve for each mu and alpha pair will be plotted.
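The CBM curve for one (mu, alpha) pair can be written down directly from the model in the Dorfman and Berbaum reference: non-diseased z-samples are N(0,1), and diseased z-samples are N(mu,1) with probability alpha and N(0,1) otherwise. The base-R sketch below uses hypothetical helper names and is not the package's implementation.

```r
# CBM operating point at threshold zeta
cbmFpf <- function(zeta) pnorm(-zeta)
cbmTpf <- function(zeta, mu, alpha) {
  (1 - alpha) * pnorm(-zeta) + alpha * pnorm(mu - zeta)
}
# Closed-form CBM area under the curve
cbmAuc <- function(mu, alpha) 0.5 * (1 - alpha) + alpha * pnorm(mu / sqrt(2))

zeta <- seq(8, -8, length.out = 2001)   # descending zeta gives ascending FPF
x <- cbmFpf(zeta)
y <- cbmTpf(zeta, mu = 2, alpha = 0.5)
# Trapezoidal area, to compare against the closed form
trapz <- sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
c(trapezoid = trapz, closedForm = cbmAuc(2, 0.5))
```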

Value

A ggplot2 object of the plotted ROC curve(s)

References

Dorfman DD, Berbaum KS (2000) A contaminated binormal model for ROC data: Part II. A formal model, Acad Radiol 7, 427–437.

Examples

cbmPlot <- PlotCbmFit(c(1, 2), c(0.5, 0.5))
## print(cbmPlot)

Plot empirical operating characteristics, ROC, FROC or LROC

Description

Plot empirical operating characteristics (operating points connected by straight lines) for specified modalities and readers, or, if desired, plots (no operating points) averaged over specified modalities and / or readers.

Usage

PlotEmpiricalOperatingCharacteristics(
  dataset,
  trts = 1,
  rdrs = 1,
  opChType,
  legend.position = c(0.8, 0.3),
  maxDiscrete = 10
)

Arguments

dataset

Dataset object.

trts

List or vector: integer indices of modalities to be plotted. Default is 1.

rdrs

List or vector: integer indices of readers to be plotted. Default is 1.

opChType

Type of operating characteristic to be plotted: "ROC", "FROC", "AFROC", "wAFROC", "AFROC1", "wAFROC1", or "LROC".

legend.position

Where to position the legend. The default is c(0.8, 0.3), i.e., 0.8 rightward and 0.3 upward (the plot is a unit square).

maxDiscrete

Maximum number of operating points for the data to be considered discrete and displayed with symbols and connecting lines; with more points the data are regarded as continuous and are only connected by lines. Default is 10.

Details

The trts and rdrs are vectors or lists of integer indices, not the corresponding string IDs. For example, if the string ID of the first reader is "0", the value in rdrs should be 1 not 0. The legend will display the string IDs.

If both of trts and rdrs are vectors, all combinations of modalities and readers are plotted. See Example 1.

If both trts and rdrs are lists, they must have the same length. Only the combinations of modality and reader at the same position in their respective lists are plotted. If some elements of the modalities and / or readers lists are vectors, the operating characteristic averaged over the implied modalities and / or readers is plotted. See Example 2.

For LROC datasets, opChType can be "ROC" or "LROC".
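An empirical ROC plot is built from cumulative rating counts. The base-R sketch below shows the idea for a single modality-reader pair with integer ratings; the helper empiricalRocPoints is hypothetical and is not the package's implementation.

```r
# Hypothetical helper: empirical ROC operating points from integer ratings.
# fp: ratings of non-diseased cases; tp: ratings of diseased cases.
empiricalRocPoints <- function(fp, tp) {
  cutoffs <- sort(unique(c(fp, tp)), decreasing = TRUE)
  fpf <- sapply(cutoffs, function(cut) mean(fp >= cut))
  tpf <- sapply(cutoffs, function(cut) mean(tp >= cut))
  data.frame(FPF = c(0, fpf), TPF = c(0, tpf))  # prepend the trivial (0, 0) point
}

fp <- c(rep(1, 7), rep(2, 5), rep(3, 3))
tp <- c(rep(1, 3), rep(2, 5), rep(3, 7))
pts <- empiricalRocPoints(fp, tp)
pts
```

Connecting the rows of pts with straight lines gives the empirical curve; the last row is always (1, 1) because every case has a rating at or above the lowest cutoff.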

Value

A ggplot2 object containing the operating characteristic plot(s) and a data frame containing the points defining the operating characteristics.

Plot

ggplot2 object. For continuous or averaged data, operating characteristic curves are plotted without showing operating points. For binned (individual) data, both operating points and connecting lines are shown. To avoid clutter, operating points are not shown if there are more than maxDiscrete of them.

Points

Data frame with four columns: abscissa, ordinate, class (which codes modality and reader names) and type, which can be "D" for discrete ratings, "C" for continuous ratings, i.e., more than maxDiscrete operating points, or "A" for reader averaged.

Examples

## Example 1
## Plot individual empirical ROC plots for all combinations of modalities
## 1 and 2 and readers 1, 2 and 3. Six operating characteristics are plotted.

ret <- PlotEmpiricalOperatingCharacteristics(dataset = 
dataset02, trts = c(1:2), rdrs = c(1:3), opChType = "ROC")
## print(ret$Plot)

## Example 2
## Empirical wAFROC plots, consisting of
## three sub-plots:
## (1) sub-plot, red, with operating points, for the 1st modality (string ID "1") and the 2nd 
## reader (string ID "3"), labeled "M:1 R:3" 
## (2) sub-plot, green, no operating points, for the 2nd modality (string ID "2") AVERAGED 
## over the 2nd and 3rd readers (string IDs "3" and "4"), labeled "M:2  R: 3 4" 
## (3) sub-plot, blue, no operating points, AVERAGED over the first two modalities 
## (string IDs "1" and "2") AND over the 1st, 2nd and 3rd readers 
## (string IDs "1", "3" and "4"), labeled "M: 1 2  R: 1  3 4" 

plotT <- list(1, 2, c(1:2))
plotR <- list(2, c(2:3), c(1:3))

ret <- PlotEmpiricalOperatingCharacteristics(dataset = dataset04, trts = plotT, 
   rdrs = plotR, opChType = "wAFROC")                  
## print(ret$Plot)

## Example 3
## Correspondences between indices and string identifiers for modalities and 
## readers in this dataset (apparently reader "2" did not complete the study).

## names(dataset04$descriptions$readerID)
## [1] "1" "3" "4" "5"
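Because trts and rdrs expect integer indices, string IDs have to be mapped to positions first. A minimal base-R sketch, assuming the string IDs are available as a character vector like the one printed in Example 3 (the variable names are illustrative only):

```r
readerIDs <- c("1", "3", "4", "5")      # string IDs, as printed in Example 3
rdrs <- match(c("3", "4"), readerIDs)   # positions of readers "3" and "4"
rdrs
# [1] 2 3
```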

RSM predicted operating characteristics, ROC pdfs and AUCs

Description

Visualize RSM predicted ROC, AFROC, wAFROC and FROC curves, and ROC pdfs, given equal-length arrays of search model parameters: mu, lambda, nu and zeta1.

Usage

PlotRsmOperatingCharacteristics(
  mu,
  lambda,
  nu,
  zeta1,
  lesDistr = 1,
  relWeights = 0,
  OpChType = "ALL",
  legendPosition = c(1, 0),
  legendDirection = "horizontal",
  legendJustification = c(0, 1),
  nlfRange = NULL,
  llfRange = NULL,
  nlfAlpha = NULL
)

Arguments

mu

Array: the RSM mu parameter.

lambda

Array: the RSM lambda parameter.

nu

Array: the RSM nu parameter.

zeta1

Array: the lowest reporting threshold; if missing, the default is an array of -3s.

lesDistr

Array: the probability mass function of the lesion distribution for diseased cases. The default is 1. See UtilLesionDistrVector.

relWeights

The relative weights of the lesions; a vector of length equal to length(maxLL). The default is zero, in which case equal weights are assumed.

OpChType

The type of operating characteristic desired: can be "ROC", "AFROC", "wAFROC", "FROC" or "pdfs" or "ALL". The default is "ALL".

legendPosition

The positioning of the legend: "right", "left", "top" or "bottom". Use "none" to suppress the legend.

legendDirection

Allows control on the direction of the legend; "horizontal", the default, or "vertical"

legendJustification

Where to position the legend, default is bottom right corner c(0,1)

nlfRange

This applies to FROC plot only. The x-axis range, e.g., c(0,2), for FROC plot. Default is "NULL", which means the maximum NLF range, as determined by the data.

llfRange

This applies to FROC plot only. The y-axis range, e.g., c(0,1), for FROC plot. Default is "NULL", which means the maximum LLF range, as determined by the data.

nlfAlpha

Upper limit of the integrated area under the FROC plot. Default is "NULL", which means the maximum NLF range is used (i.e., lambda). Attempting to integrate beyond the maximum NLF generates an error.

Details

RSM is the Radiological Search Model described in the book. This function is vectorized with respect to the first 4 arguments. The elements of lesDistr must sum to one. To indicate that all diseased cases contain 4 lesions, set lesDistr = c(0,0,0,1).
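A lesDistr vector of this form can be derived from the number of lesions in each diseased case. The base-R sketch below uses a made-up perCase vector; within the package, UtilLesionDistrVector serves this purpose for a dataset object.

```r
# Hypothetical numbers of lesions in 10 diseased cases
perCase <- c(1, 2, 2, 1, 4, 2, 4, 3, 2, 4)
# Element i is the fraction of diseased cases containing exactly i lesions
lesDistr <- tabulate(perCase, nbins = max(perCase)) / length(perCase)
lesDistr
# [1] 0.2 0.4 0.1 0.3
stopifnot(isTRUE(all.equal(sum(lesDistr), 1)))
```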

Value

A list containing five ggplot2 objects (ROCPlot, AFROCPlot, wAFROCPlot, FROCPlot and PDFPlot) and four area measures (each of which can have up to two elements): the areas under the search model predicted ROC, AFROC, wAFROC and FROC curves in up to two treatments.

ROCPlot

The predicted ROC plots

AFROCPlot

The predicted AFROC plots

wAFROCPlot

The predicted wAFROC plots

FROCPlot

The predicted FROC plots

PDFPlot

The predicted ROC pdf plots, highest rating generated

aucROC

The predicted ROC AUCs, highest rating generated

aucAFROC

The predicted AFROC AUCs

aucwAFROC

The predicted wAFROC AUCs

aucFROC

The predicted FROC AUCs

References

Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449-3462.

Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.

Chakraborty, DP, Yoon, HJ (2008) Operating characteristics predicted by models for diagnostic tasks involving lesion localization, Med Phys, 35:2, 435.

Chakraborty DP (2017) Observer Performance Methods for Diagnostic Imaging - Foundations, Modeling, and Applications with R-Based Examples (CRC Press, Boca Raton, FL). https://www.routledge.com/Observer-Performance-Methods-for-Diagnostic-Imaging-Foundations-Modeling/Chakraborty/p/book/9781482214840

Examples

## Following example is for mu = 2, lambda = 1, nu = 0.6, in one treatment and   
## mu = 3, lambda = 1.5, nu = 0.8, in the other treatment. 20% of the diseased 
## cases have a single lesion, 40% have two lesions, 10% have 3 lesions, 
## and 30% have 4 lesions.  

PlotRsmOperatingCharacteristics(mu = c(2, 3), lambda = c(1, 1.5), nu = c(0.6, 0.8),
   lesDistr = c(0.2, 0.4, 0.1, 0.3), legendPosition = "bottom")

RSM predicted FROC ordinate

Description

RSM predicted FROC ordinate

Usage

RSM_LLF(z, mu, nu)

Arguments

z

The z-sample value at which to evaluate the FROC ordinate.

mu

The RSM mu parameter.

nu

The RSM nu prime parameter.

Value

yFROC

Examples

RSM_LLF(1,1,0.5)

RSM predicted FROC abscissa

Description

RSM predicted FROC abscissa

Usage

RSM_NLF(z, lambda)

Arguments

z

The z-sample value at which to evaluate the FROC abscissa.

lambda

The RSM lambda parameter.

Value

xFROC

Examples

RSM_NLF(1,1)
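For intuition, the FROC coordinates follow from the RSM assumptions: latent NLs are Poisson with mean lambda and N(0,1) ratings, and each lesion yields a latent LL with probability nu and an N(mu,1) rating. The base-R sketch below implements these textbook expressions with hypothetical helper names. It assumes lambda and nu are exactly those Poisson and binomial parameters; RSM_NLF and RSM_LLF may use a different parameterization, so their outputs need not match.

```r
# Expected number of NLs per case rated above z (FROC abscissa)
frocNlf <- function(z, lambda) lambda * pnorm(-z)
# Expected fraction of lesions localized with rating above z (FROC ordinate)
frocLlf <- function(z, mu, nu) nu * pnorm(mu - z)

z <- c(Inf, 2, 1, 0, -1, -Inf)   # sweep the threshold from strict to lax
froc <- cbind(NLF = frocNlf(z, lambda = 1), LLF = frocLlf(z, mu = 1, nu = 0.5))
froc
```

The curve starts at (0, 0) and ends at (lambda, nu), which is why the maximum NLF range mentioned for nlfRange and nlfAlpha is finite.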

RSM predicted ROC-rating pdf for diseased cases

Description

RSM predicted ROC-rating pdf for diseased cases

Usage

RSM_pdfD(z, mu, lambda, nu, lesDistr)

Arguments

z

The z-sample value at which to evaluate the pdf.

mu

The mu parameter of the RSM.

lambda

The RSM lambda parameter.

nu

The RSM nu parameter.

lesDistr

The lesion distribution 1D vector.

Value

pdf

Examples

lesDistr <- c(0.5, 0.5)
RSM_pdfD(1,1,1,0.9, lesDistr)
lesDistr <- c(0.2, 0.3, 0.5)
RSM_pdfD(1,1,1,0.5, lesDistr)

RSM predicted ROC-rating pdf for non-diseased cases

Description

RSM predicted ROC-rating pdf for non-diseased cases

Usage

RSM_pdfN(z, lambda)

Arguments

z

The z-sample value at which to evaluate the pdf.

lambda

The (physical) lambda parameter of the RSM.

Value

pdf

Examples

RSM_pdfN(1,1)

RSM predicted wAFROC ordinate

Description

RSM predicted wAFROC ordinate

Usage

RSM_wLLF(zeta, mu, nu, lesDistr, relWeights)

Arguments

zeta

The zeta value at which to evaluate the wAFROC ordinate.

mu

The RSM mu parameter.

nu

The RSM nu prime parameter.

lesDistr

Lesion distribution vector.

relWeights

Relative lesion weights vector.

Value

wLLF

Examples

RSM_wLLF(1,1,0.5, lesDistr = c(0.5, 0.4, 0.1), relWeights = c(0.7, 0.2, 0.1))

RSM predicted ROC-abscissa as function of z

Description

RSM predicted ROC-abscissa as function of z

Usage

RSM_xROC(z, lambda)

Arguments

z

The z-sample at which to evaluate the ROC-abscissa.

lambda

The (physical) lambda parameter of the RSM.

Value

xROC, the abscissa of the ROC

Examples

RSM_xROC(c(-Inf,0.1,0.2,0.3),1)

RSM predicted ROC-ordinate as function of z

Description

RSM predicted ROC-ordinate as function of z

Usage

RSM_yROC(z, mu, lambda, nu, lesDistr)

Arguments

z

The z-sample value at which to evaluate the ROC-ordinate.

mu

The mu parameter (of the RSM).

lambda

The RSM lambda parameter.

nu

The RSM nu parameter.

lesDistr

The 1D lesion distribution vector.

Value

yROC, the ordinate of the ROC

Examples

lesDistr <- c(0.1,0.3,0.6)
RSM_yROC(c(-Inf,0.1,0.2,0.3), 1, 1, 0.9, lesDistr)
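The ROC coordinates can be derived from the RSM assumptions when the ROC rating of a case is taken to be its highest mark rating. The base-R sketch below uses hypothetical helper names and assumes lambda is the Poisson mean of latent NLs and nu the per-lesion probability of a latent LL; RSM_xROC and RSM_yROC may use a different parameterization, so their outputs need not match.

```r
# P(at least one NL rated above z) on any case
rocFpf <- function(z, lambda) 1 - exp(-lambda * pnorm(-z))

# P(at least one NL or LL rated above z) on a diseased case,
# averaged over the lesion distribution lesDistr
rocTpf <- function(z, mu, lambda, nu, lesDistr) {
  L <- seq_along(lesDistr)                     # possible numbers of lesions per case
  sapply(z, function(zi) {
    noLL <- sum(lesDistr * (1 - nu * pnorm(mu - zi))^L)  # no LL rated above zi
    1 - exp(-lambda * pnorm(-zi)) * noLL
  })
}

lesDistr <- c(0.1, 0.3, 0.6)
z <- c(-Inf, 0.1, 0.2, 0.3)
fpf <- rocFpf(z, lambda = 1)
tpf <- rocTpf(z, mu = 1, lambda = 1, nu = 0.9, lesDistr = lesDistr)
cbind(z, fpf, tpf)
```

At z = -Inf the abscissa is 1 - exp(-lambda), not 1: cases with no marks keep the predicted curve from reaching (1, 1) continuously.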

Simulate paired binned data for testing FitCorCbm

Description

Simulates a single-treatment, 2-reader binned ROC dataset according to the CORCBM model, for the purpose of testing the fitting program FitCorCbm.

Usage

SimulateCorCbmDataset(
  seed = 123,
  K1 = 50,
  K2 = 50,
  desiredNumBins = 5,
  muX = 1.5,
  muY = 3,
  alphaX = 0.4,
  alphaY = 0.7,
  rhoNor = 0.3,
  rhoAbn2 = 0.8
)

Arguments

seed

The seed variable, default is 123; set to NULL for truly random seed

K1

The number of non-diseased cases, default is 50

K2

The number of diseased cases, default is 50

desiredNumBins

The desired number of bins; default is 5

muX

The CBM mu parameter in condition X

muY

The CBM mu parameter in condition Y

alphaX

The CBM alpha parameter in condition X

alphaY

The CBM alpha parameter in condition Y

rhoNor

The correlation of non-diseased case z-samples

rhoAbn2

The correlation of diseased case z-samples, when disease is visible in both conditions

Details

X and Y refer to the two arms of the pairing. muX and alphaX refer to the univariate CBM parameters in condition X, rhoNor is the correlation of ratings of non-diseased cases and rhoAbn2 is the correlation of ratings of diseased cases when disease is visible in both conditions. The ROC data is binned into desiredNumBins bins (5 by default) in each condition. See referenced publication.
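The role of rhoNor and of the binning can be illustrated for the non-diseased cases alone. The base-R sketch below draws correlated N(0,1) z-sample pairs and cross-tabulates them into a bin-count matrix analogous to FPCounts. The equal-probability cut points are an assumption made for the example, and the diseased-case mixture of the CORCBM is not shown.

```r
set.seed(123)
K1 <- 5000
rhoNor <- 0.3
desiredNumBins <- 5

zX <- rnorm(K1)
zY <- rhoNor * zX + sqrt(1 - rhoNor^2) * rnorm(K1)  # correlation rhoNor in expectation

# Assumed cut points: equal-probability bins under N(0,1)
breaks <- c(-Inf, qnorm(seq_len(desiredNumBins - 1) / desiredNumBins), Inf)
binX <- cut(zX, breaks, labels = FALSE)
binY <- cut(zY, breaks, labels = FALSE)

# Rows: bins in condition X; columns: bins in condition Y
FPCounts <- table(factor(binX, levels = 1:desiredNumBins),
                  factor(binY, levels = 1:desiredNumBins))
FPCounts
```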

Value

The return value is the desired dataset, suitable for testing FitCorCbm

References

Examples


dataset <- SimulateCorCbmDataset()


## this takes very long
## dataset <- SimulateCorCbmDataset(K1 = 5000, K2 = 5000)

Simulates an MRMC uncorrelated FROC dataset using the RSM

Description

Simulates an uncorrelated MRMC FROC dataset for specified numbers of readers and treatments

Usage

SimulateFrocDataset(mu, lambda, nu, zeta1, I, J, K1, K2, perCase, seed = NULL)

Arguments

mu

The mu parameter of the RSM

lambda

The RSM lambda parameter

nu

The RSM nu parameter

zeta1

The lowest reporting threshold

I

The number of treatments

J

The number of readers

K1

The number of non-diseased cases

K2

The number of diseased cases

perCase

A K2 length array containing the numbers of lesions per diseased case

seed

The initial seed for the random number generator, the default is NULL, as if no seed has been specified.

Details

See book chapters on the Radiological Search Model (RSM) for details. In this code correlations between ratings on the same case are assumed to be zero.

Value

The return value is an FROC dataset.

References

Examples

set.seed(1) 
K1 <- 5;K2 <- 7;
maxLL <- 2;perCase <- floor(runif(K2, 1, maxLL + 1))
mu <- 1;lambda <- 1;nu <- 0.99 ;zeta1 <- -1
I <- 2; J <- 5

frocDataRaw <- SimulateFrocDataset(
  mu = mu, lambda = lambda, nu = nu, zeta1 = zeta1,
  I = I, J = J, K1 = K1, K2 = K2, perCase = perCase )
  
## plot the data
ret <- PlotEmpiricalOperatingCharacteristics(frocDataRaw, opChType = "FROC")
## print(ret$Plot)

Simulates an "AUC-equivalent" FROC dataset from an LROC dataset

Description

Simulates a multiple-treatment multiple-reader "AUC-equivalent" FROC dataset from a supplied LROC dataset, e.g., datasetCadLroc.

Usage

SimulateFrocFromLrocDataset(dataset)

Arguments

dataset

The LROC dataset to be converted to FROC.

Details

Value

The equivalent FROC dataset

Examples


frocDataset <- SimulateFrocFromLrocDataset(datasetCadLroc)
lrocAuc <- UtilFigureOfMerit(datasetCadLroc, FOM = "Wilcoxon")
frocHrAuc <- UtilFigureOfMerit(frocDataset, FOM = "HrAuc")   
testthat::expect_equal(lrocAuc, frocHrAuc)

Simulates an uncorrelated LROC dataset using the RSM

Description

Simulates an uncorrelated LROC dataset for specified numbers of readers and treatments

Usage

SimulateLrocDataset(mu, lambda, nu, zeta1, I, J, K1, K2, lesionVector)

Arguments

mu

The mu parameter of the RSM

lambda

The RSM lambda parameter

nu

The RSM nu parameter

zeta1

The lowest reporting threshold

I

The number of treatments

J

The number of readers

K1

The number of non-diseased cases

K2

The number of diseased cases

lesionVector

A K2 length array containing the numbers of lesions per diseased case

Details

See book chapters on the Radiological Search Model (RSM) for details. The approach is to first simulate an FROC dataset and then convert it to an LROC dataset. The correlations between FROC ratings on the same case are assumed to be zero.

Value

The return value is an LROC dataset.

References

Examples

  set.seed(1)
  K1 <- 5
  K2 <- 5
  mu <- 2
  lambda <- 1
  lesionVector <- rep(1, 5)
  nu <- 0.8
  zeta1 <- -3
  frocData <- SimulateFrocDataset(mu, lambda, nu, zeta1, I = 2, J = 5, K1, K2, lesionVector)
  lrocData <- DfFroc2Lroc(frocData)

Simulates a binormal model ROC dataset

Description

Simulates an uncorrelated binormal model ROC factorial dataset

Usage

SimulateRocDataset(I = 1, J = 1, K1, K2, a, b, seed = NULL)

Arguments

I

The number of modalities, default is 1

J

The number of readers, default is 1

K1

The number of non-diseased cases

K2

The number of diseased cases

a

The a parameter of the binormal model

b

The b parameter of the binormal model

seed

The initial seed, default is NULL, which results in a random seed

Details

See book Chapter 6 for details
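As a rough illustration of what binormal simulation involves, the base-R sketch below draws ratings for one modality and one reader. It assumes the conventional parameterization in which non-diseased z-samples are N(0,1) and diseased z-samples are N(a/b, (1/b)^2); this is an assumption for the example and is not taken from the package source.

```r
set.seed(1)
K1 <- 500; K2 <- 700
a <- 1.5; b <- 0.5

fp <- rnorm(K1)                              # non-diseased ratings
tp <- rnorm(K2, mean = a / b, sd = 1 / b)    # diseased ratings (assumed parameterization)

# Empirical (Wilcoxon) AUC compared with the binormal closed form
wilcoxonAuc <- mean(outer(tp, fp, ">") + 0.5 * outer(tp, fp, "=="))
c(empirical = wilcoxonAuc, binormal = pnorm(a / sqrt(1 + b^2)))
```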

Value

An ROC dataset

References

Examples

K1 <- 5;K2 <- 7;a <- 1.5;b <- 0.5
rocDataRaw <- SimulateRocDataset(K1 = K1, K2 = K2, a = a, b = b)

RSM fitted model for FROC sample size

Description

RSM fitted model for FROC sample size

Usage

SsFrocNhRsmModel(dataset, lesDistr)

Arguments

dataset

The pilot dataset object representing a NH (ROC or FROC) dataset.

lesDistr

A 1D array containing the probability mass function of number of lesions per diseased case in the pivotal FROC study.

Details

If dataset is FROC, it is converted to an ROC dataset. The dataset is automatically binned. The search model is used to fit each treatment-reader combination. The median value for each parameter is computed and returned by the function (3 values). These are used to compute predicted wAFROC and ROC FOMs over a range of values of deltaMu, which are fitted by a straight line constrained to pass through the origin. The scale factor and R2 are returned. The scaling factor is the value by which the ROC effect size must be multiplied to get the wAFROC effect size. See https://dpc10ster.github.io/RJafrocQuickStart/froc-sample-size.html for vignettes explaining the FROC sample size estimation procedure.
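The final step, a straight-line fit constrained to pass through the origin, can be sketched with lm. The effect sizes below are synthetic numbers invented for the example (not RSM predictions), and the variable names are illustrative.

```r
set.seed(1)
deltaMu <- seq(0.01, 0.2, by = 0.01)
esRoc <- 0.08 * deltaMu                                          # hypothetical ROC effect sizes
eswAfroc <- 0.17 * deltaMu + rnorm(length(deltaMu), sd = 1e-4)   # hypothetical wAFROC effect sizes

fit <- lm(eswAfroc ~ 0 + esRoc)       # "0 +" removes the intercept
scaleFactor <- unname(coef(fit))      # slope: multiplies the ROC effect size
R2 <- summary(fit)$r.squared
c(scaleFactor = scaleFactor, R2 = R2)
```

Note that for a no-intercept model, summary.lm computes R2 relative to zero rather than to the mean of the response.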

Value

A list containing:

mu, the mu parameter of the NH model.
lambda, the lambda parameter of the NH model.
nu, the nu parameter of the NH model.
scaleFactor, the scaling factor that multiplies the ROC effect size to get wAFROC effect size.
R2, the R2 of the fit.

Examples



## Examples with CPU or elapsed time > 5s
## user system elapsed
## SsFrocNhRsmModel 8.102  0.023   8.135

## SsFrocNhRsmModel(DfExtractDataset(dataset04, trts = c(1,2)), c(0.69, 0.2, 0.11))

Statistical power for specified numbers of readers and cases

Description

Calculate the statistical power for specified numbers of readers J, cases K, analysis method and DBM or OR variance components

Usage

SsPowerGivenJK(
  dataset,
  ...,
  FOM,
  J,
  K,
  effectSize = NULL,
  method = "OR",
  analysisOption = "RRRC",
  LegacyCode = FALSE,
  alpha = 0.05
)

Arguments

dataset

The pilot dataset. If set to NULL then variance components must be supplied.

...

Optional variance components. These are needed if dataset is not supplied.

FOM

The figure of merit

J

The number of readers in the pivotal study.

K

The number of cases in the pivotal study.

effectSize

The effect size to be used in the pivotal study. Default is NULL, which uses the observed effect size in the pilot dataset. Must be supplied if dataset is set to NULL and variance components are supplied.

method

"OR" (the default) or "DBM" (but see LegacyCode option below).

analysisOption

Desired generalization, "RRRC" (the default), "FRRC", "RRFC" or "ALL". RRFC = random reader fixed case, etc.

LegacyCode

Logical, defaults to FALSE, which results in OR sample size method being used, even if DBM method is specified, as in Hillis 2011 & 2018 papers. If TRUE the method based on Hillis-Berbaum 2004 sample size paper is used.

alpha

The significance level, default is 0.05.

Details

The default effectSize uses the observed effect size in the pilot study. A numeric value overrides the default value. This argument must be supplied if dataset = NULL and variance components (the ... arguments) are supplied.
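The cited Hillis papers compute power from a noncentral F distribution. The base-R sketch below shows only that last step, given a noncentrality parameter and denominator degrees of freedom. How those two quantities are obtained from J, K, the effect size and the variance components is not shown here, and the helper name and numeric inputs are hypothetical.

```r
# Hypothetical helper: power of an F-test with ndf numerator degrees of freedom
powerFromNcp <- function(ncp, ddf, alpha = 0.05, ndf = 1) {
  fCrit <- qf(1 - alpha, ndf, ddf)                       # critical value under H0
  pf(fCrit, ndf, ddf, ncp = ncp, lower.tail = FALSE)     # P(reject) under H1
}

powerFromNcp(ncp = 0, ddf = 20)   # no effect: power equals alpha
powerFromNcp(ncp = 8, ddf = 20)   # larger noncentrality: higher power
```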

Value

The expected statistical power.

Note

The procedure is valid for ROC studies only; for FROC studies see Vignettes 19.

References

Hillis SL, Berbaum KS (2004). Power Estimation for the Dorfman-Berbaum-Metz Method. Acad Radiol, 11, 1260–1273.

Hillis SL, Obuchowski NA, Berbaum KS (2011). Power Estimation for Multireader ROC Methods: An Updated and Unified Approach. Acad Radiol, 18, 129–142.

Hillis SL, Schartz KM (2018). Multireader sample size program for diagnostic studies: demonstration and methodology. Journal of Medical Imaging, 5(04).

Examples

## EXAMPLE 1: RRRC power 
## specify 2-treatment ROC dataset and force DBM alg.
SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05, 
J = 6, K = 251, method = "DBM", LegacyCode = TRUE) # RRRC is default  

## EXAMPLE 1A: FRRC power 
SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05, 
J = 6, K = 251, method = "DBM", LegacyCode = TRUE, analysisOption = "FRRC") 

## EXAMPLE 1B: RRFC power 
SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05, 
J = 6, K = 251, method = "DBM", LegacyCode = TRUE, analysisOption = "RRFC") 

## EXAMPLE 2: specify NULL dataset & DBM var. comp. & force DBM-based alg.
vcDBM <- UtilVarComponentsDBM(dataset02, FOM = "Wilcoxon")$VarCom
SsPowerGivenJK(dataset = NULL, FOM = "Wilcoxon", J = 6, K = 251, 
effectSize = 0.05, method = "DBM", LegacyCode = TRUE, 
list( 
VarTR = vcDBM["VarTR","Estimates"], # replace rhs with actual values as in 4A
VarTC = vcDBM["VarTC","Estimates"], # do:
VarErr = vcDBM["VarErr","Estimates"])) # do:
                     
## EXAMPLE 3: specify 2-treatment ROC dataset and use OR-based alg.
SsPowerGivenJK(dataset = dataset02, FOM = "Wilcoxon", effectSize = 0.05, 
J = 6, K = 251)

## EXAMPLE 4: specify NULL dataset & OR var. comp. & use OR-based alg.
JStar <- length(dataset02$ratings$NL[1,,1,1])
KStar <- length(dataset02$ratings$NL[1,1,,1])
vcOR <- UtilORVarComponentsFactorial(dataset02, FOM = "Wilcoxon")$VarCom
SsPowerGivenJK(dataset = NULL, FOM = "Wilcoxon", effectSize = 0.05, J = 6, 
K = 251, list(JStar = JStar, KStar = KStar, 
   VarTR = vcOR["VarTR","Estimates"], # replace rhs with actual values as in 4A
   Cov1 = vcOR["Cov1","Estimates"],   # do:
   Cov2 = vcOR["Cov2","Estimates"],   # do:
   Cov3 = vcOR["Cov3","Estimates"],   # do:
   Var = vcOR["Var","Estimates"]))
   
## EXAMPLE 4A: specify NULL dataset & OR var. comp. & use OR-based alg.
SsPowerGivenJK(dataset = NULL, FOM = "Wilcoxon", effectSize = 0.05, J = 6, 
K = 251, list(JStar = 5, KStar = 114, 
   VarTR = 0.00020040252,
   Cov1 = 0.00034661371,
   Cov2 = 0.00034407483,
   Cov3 = 0.00023902837,
   Var = 0.00080228827))
   
## EXAMPLE 5: specify NULL dataset & DBM var. comp. & use OR-based alg.
## The DBM var. comp. are converted internally to OR var. comp.
vcDBM <- UtilVarComponentsDBM(dataset02, FOM = "Wilcoxon")$VarCom
KStar <- length(dataset02$ratings$NL[1,1,,1])
SsPowerGivenJK(dataset = NULL, J = 6, K = 251, effectSize = 0.05, 
method = "DBM", FOM = "Wilcoxon",
list(KStar = KStar,                # replace rhs with actual values as in 5A 
VarR = vcDBM["VarR","Estimates"], # do:
VarC = vcDBM["VarC","Estimates"], # do:
VarTR = vcDBM["VarTR","Estimates"], # do:
VarTC = vcDBM["VarTC","Estimates"], # do:
VarRC = vcDBM["VarRC","Estimates"], # do:
VarErr = vcDBM["VarErr","Estimates"]))

## EXAMPLE 5A: specify NULL dataset & DBM var. comp. & use OR-based alg.
SsPowerGivenJK(dataset = NULL, J = 6, K = 251, effectSize = 0.05, 
method = "DBM", FOM = "Wilcoxon",
list(KStar = 114,
VarR = 0.00153499935,
VarC = 0.02724923428,
VarTR = 0.00020040252,
VarTC = 0.01197529621,
VarRC = 0.01226472859,
VarErr = 0.03997160319))

Power given J, K and Dorfman-Berbaum-Metz variance components

Description

Power given J, K and Dorfman-Berbaum-Metz variance components

Usage

SsPowerGivenJKDbmVarCom(
  J,
  K,
  effectSize,
  VarTR,
  VarTC,
  VarErr,
  alpha = 0.05,
  analysisOption = "RRRC"
)

Arguments

J

The number of readers

K

The number of cases

effectSize

The effect size

VarTR

The treatment-reader DBM variance component

VarTC

The treatment-case DBM variance component

VarErr

The error-term DBM variance component

alpha

The size of the test (default = 0.05)

analysisOption

The desired generalization ("RRRC", "FRRC", "RRFC", "ALL")

Details

The variance components are obtained using StSignificanceTesting with method = "DBM".

Value

A list object containing the estimated power and associated statistics for each desired generalization.

Examples

VarCom <- StSignificanceTesting(dataset02, FOM = "Wilcoxon", method = "DBM", 
   analysisOption = "RRRC")$ANOVA$VarCom
VarTR <- VarCom["VarTR",1]
VarTC <- VarCom["VarTC",1]
VarErr <- VarCom["VarErr",1]
ret <- SsPowerGivenJKDbmVarCom (J = 5, K = 100, effectSize = 0.05, VarTR, 
   VarTC, VarErr, analysisOption = "RRRC")
cat("RRRC power = ", ret$powerRRRC)

Power given J, K and Obuchowski-Rockette variance components

Description

Power given J, K and Obuchowski-Rockette variance components

Usage

SsPowerGivenJKOrVarCom(
  J,
  K,
  KStar,
  effectSize,
  VarTR,
  Cov1,
  Cov2,
  Cov3,
  Var,
  alpha = 0.05,
  analysisOption = "RRRC"
)

Arguments

J

The number of readers in the pivotal study

K

The number of cases in the pivotal study

KStar

The number of cases in the pilot study

effectSize

The effect size

VarTR

The treatment-reader OR variance component

Cov1

The OR Cov1 covariance

Cov2

The OR Cov2 covariance

Cov3

The OR Cov3 covariance

Var

The OR pure variance term

alpha

The size of the test (default = 0.05)

analysisOption

The desired generalization ("RRRC", "FRRC", "RRFC", "ALL")

Details

The variance components are obtained using StSignificanceTesting with method = "OR".

Value

A list object containing the estimated power and associated statistics for each desired generalization.

Examples

dataset <- dataset02 ## the pilot study
KStar <- length(dataset$ratings$NL[1,1,,1])
VarCom <- StSignificanceTesting(dataset, FOM = "Wilcoxon", 
method = "OR", analysisOption = "RRRC")$ANOVA$VarCom
VarTR <- VarCom["VarTR",1]
Cov1 <- VarCom["Cov1",1]
Cov2 <- VarCom["Cov2",1]
Cov3 <- VarCom["Cov3",1]
Var <- VarCom["Var",1]
ret <- SsPowerGivenJKOrVarCom (J = 5, K = 100, KStar = KStar,  
   effectSize = 0.05, VarTR, Cov1, Cov2, Cov3, Var, analysisOption = "RRRC")
    
cat("RRRC power = ", ret$powerRRRC)

Generate a power table using the OR method

Description

Generate combinations of numbers of readers J and numbers of cases K for desired power and specified generalization(s)

Usage

SsPowerTable(
  dataset,
  FOM,
  effectSize = NULL,
  alpha = 0.05,
  desiredPower = 0.8,
  analysisOption = "RRRC"
)

Arguments

dataset

The pilot ROC dataset to be used to extrapolate to the pivotal study.

FOM

The figure of merit.

effectSize

The effect size to be used in the pivotal study, default value is NULL. See Details.

alpha

The size of the test, default is 0.05.

desiredPower

The desired statistical power, default is 0.8.

analysisOption

Desired generalization, "RRRC" (the default), "FRRC", "RRFC" or "ALL".

Details

The default effectSize uses the observed effect size in the pilot study. A supplied numeric value over-rides the default value.

Value

A list containing up to 3 (depending on analysisOption) dataframes. Each dataframe contains 3 arrays:

numReaders

The numbers of readers in the pivotal study.

numCases

The numbers of cases in the pivotal study.

power

The estimated statistical powers.

Note

The procedure is valid for ROC studies only; for FROC studies see Vignettes 19.

Examples


## Examples with CPU or elapsed time > 5s
##              user    system elapsed
## SsPowerTable 20.033  0.037  20.077    

## Example of sample size calculation with OR method
## SsPowerTable(dataset02, FOM = "Wilcoxon")

Number of cases, for specified number of readers, to achieve desired power

Description

Number of cases to achieve the desired power, for specified number of readers J, and specified DBM or ORH analysis method

Usage

SsSampleSizeKGivenJ(
  dataset,
  ...,
  J,
  FOM,
  effectSize = NULL,
  method = "OR",
  alpha = 0.05,
  desiredPower = 0.8,
  analysisOption = "RRRC",
  LegacyCode = FALSE
)

Arguments

dataset

The pilot dataset. If set to NULL then variance components must be supplied.

...

Optional variance components, needed if dataset is not supplied: VarTR, VarTC and VarErr for method = "DBM"; KStar, VarTR, Cov1, Cov2, Cov3 and Var for method = "OR" (see Examples).

J

The number of readers in the pivotal study.

FOM

The figure of merit. Not needed if variance components are supplied.

effectSize

The effect size to be used in the pivotal study. Default is NULL. Must be supplied if dataset is set to NULL and variance components are supplied.

method

"OR" (default) or "DBM".

alpha

The significance level of the study, default is 0.05.

desiredPower

The desired statistical power, default is 0.8.

analysisOption

Desired generalization, "RRRC" (the default), "FRRC", "RRFC" or "ALL".

LegacyCode

Logical, default is FALSE, if TRUE the DBM method is used. Otherwise the OR method is used.

Details

effectSize = NULL uses the observed effect size in the pilot study. A numeric value overrides the default value. This argument must be supplied if dataset = NULL and variance components (the optional ... arguments) are supplied.

Value

A list of two elements:

K

The minimum number of cases K in the pivotal study to just achieve the desired statistical power, calculated for each value of analysisOption.

power

The predicted statistical power.
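
The search for the minimum K that "just achieves" the desired power can be pictured with a short base R sketch. The power function below is a stand-in assumption (an F-test with one numerator degree of freedom whose non-centrality grows linearly with K); it is not the formula used by SsSampleSizeKGivenJ, and the names power_at_K, min_cases and ncp_per_case are hypothetical.

```r
# Stand-in power function (an assumption, not the package's formula):
# power of an F-test whose non-centrality parameter grows with K.
power_at_K <- function(K, alpha = 0.05, ddf = 50, ncp_per_case = 0.08) {
  f_crit <- qf(1 - alpha, df1 = 1, df2 = ddf)
  pf(f_crit, df1 = 1, df2 = ddf, ncp = K * ncp_per_case, lower.tail = FALSE)
}

# Smallest K whose predicted power reaches the target.
min_cases <- function(desiredPower = 0.8, K_max = 2000) {
  for (K in seq_len(K_max)) {
    p <- power_at_K(K)
    if (p >= desiredPower) return(list(K = K, power = p))
  }
  stop("desired power not reached for K <= K_max")
}

res <- min_cases(0.8)
res$K      # minimum number of cases under the stand-in model
res$power  # predicted power at that K
```

Because the stand-in power increases monotonically with K, the first K that reaches the target is the minimum; one fewer case falls short of it.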

Note

The procedure is valid for ROC studies only; for FROC studies see Vignettes 19.

Examples

## the following two should give identical results
SsSampleSizeKGivenJ(dataset02, FOM = "Wilcoxon", effectSize = 0.05, J = 6, method = "DBM")

a <- UtilVarComponentsDBM(dataset02, FOM = "Wilcoxon")$VarCom
SsSampleSizeKGivenJ(dataset = NULL, J = 6, effectSize = 0.05, method = "DBM", LegacyCode = TRUE,
   list(VarTR = a["VarTR",1], 
   VarTC = a["VarTC",1], 
   VarErr = a["VarErr",1]))

## the following two should give identical results
SsSampleSizeKGivenJ(dataset02, FOM = "Wilcoxon", effectSize = 0.05, J = 6, method = "OR")

a <- UtilORVarComponentsFactorial(dataset02, FOM = "Wilcoxon")$VarCom
KStar <- length(dataset02$ratings$NL[1,1,,1])
SsSampleSizeKGivenJ(dataset = NULL, J = 6, effectSize = 0.05, method = "OR", 
   list(KStar = KStar, 
   VarTR = a["VarTR",1], 
   Cov1 = a["Cov1",1], 
   Cov2 = a["Cov2",1], 
   Cov3 = a["Cov3",1], 
   Var = a["Var",1]))

 
for (J in 6:10) {
 ret <- SsSampleSizeKGivenJ(dataset02, FOM = "Wilcoxon", J = J, analysisOption = "RRRC") 
 message("# of readers = ", J, " estimated # of cases = ", ret$K, 
 ", predicted power = ", signif(ret$powerRRRC,3), "\n")
}

Performs DBM or OR significance testing for factorial or split-plot A,C datasets

Description

Performs Dorfman-Berbaum-Metz (DBM) or Obuchowski-Rockette (OR) significance testing for the specified dataset; significance testing refers to analysis designed to assign a P-value, and other statistics, for rejecting the null hypothesis (NH) that the reader-averaged figure of merit (FOM) difference between treatments is zero. The results of the analysis are best visualized in the text or Excel-formatted files produced by UtilOutputReport.

Usage

StSignificanceTesting(
  dataset,
  FOM,
  FPFValue = 0.2,
  alpha = 0.05,
  method = "DBM",
  covEstMethod = "jackknife",
  nBoots = 200,
  analysisOption = "ALL",
  tempOrgCode = FALSE
)

Arguments

dataset

The dataset to be analyzed, see RJafroc-package. Must have two or more treatments and two or more readers. The dataset design can be "FCTRL", "SPLIT-PLOT-A" or "SPLIT-PLOT-C".

FOM

The figure of merit, see UtilFigureOfMerit

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

alpha

The significance level of the test of the null hypothesis that all treatment effects are zero; the default is 0.05

method

The significance testing method to be used: "DBM" (the default), representing the Dorfman-Berbaum-Metz method, or "OR", representing the Obuchowski-Rockette method.

covEstMethod

The covariance matrix estimation method in ORH analysis (for method = "DBM" the jackknife is always used).

"jackknife", the default,
"bootstrap", in which case nBoots (below) is relevant,
"DeLong"; requires FOM = "Wilcoxon" or "ROI" or "HrAuc", otherwise an error results.

nBoots

The number of bootstraps (defaults to 200), relevant only if covEstMethod = "bootstrap" and method = "OR"

analysisOption

Determines which factors are regarded as random vs. fixed:

"RRRC" = random-reader random case,
"FRRC" = fixed-reader random case,
"RRFC" = random-reader fixed case,
"ALL" = outputs results of "RRRC", "FRRC" and "RRFC" analyses - this is the default.

tempOrgCode

default FALSE; if TRUE, then code from version 0.0.1 of RJafroc is used (see RJafroc_0.0.1.tar). This is intended to check against errors that crept in subsequent to the original version as I attempted to improve the organization of the code and the output. As implicit in the name of this temporary flag, it will eventually be removed.

Value

For method = "DBM" the returned list contains the following members:

FOMs

Contains foms, trtMeans and trtMeanDiffs: see return of UtilFigureOfMerit

ANOVA

Contains TRCAnova, VarCom, IndividualTrt and IndividualRdr ANOVA tables of pseudovalues

RRRC

Contains results of "RRRC" analyses: FTests, ciDiffTrt, ciAvgRdrEachTrt

FRRC

Contains results of "FRRC" analyses: FTests, ciDiffTrt, ciAvgRdrEachTrt, ciDiffTrtEachRdr

RRFC

Contains results of "RRFC" analyses: FTests, ciDiffTrt, ciAvgRdrEachTrt

For method = "OR" the returned list contains the following members:

FOMs

Contains foms, trtMeans and trtMeanDiffs: see return of UtilFigureOfMerit

ANOVA

Contains TRAnova, VarCom, IndividualTrt and IndividualRdr ANOVA tables of FOM values

RRRC

Contains results of "RRRC" analyses - same organization as DBM, see above

FRRC

Contains results of "FRRC" analyses - ditto

RRFC

Contains results of "RRFC" analyses - ditto
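
The DBM ANOVA tables are computed on jackknife pseudovalues. A minimal base R sketch for a single treatment-reader combination follows, assuming the empirical Wilcoxon FOM and the usual pseudovalue definition K * theta - (K - 1) * theta_(k); the ratings are simulated and the function names are hypothetical, not part of RJafroc.

```r
# Empirical (Wilcoxon) AUC: fraction of diseased/non-diseased pairs ranked
# correctly, with ties counted as one half.
wilcoxon <- function(nonDis, dis) {
  mean(outer(dis, nonDis, ">") + 0.5 * outer(dis, nonDis, "=="))
}

# Jackknife pseudovalues, one per case (non-diseased cases first).
pseudovalues <- function(nonDis, dis) {
  K1 <- length(nonDis); K2 <- length(dis); K <- K1 + K2
  theta <- wilcoxon(nonDis, dis)
  theta_del <- c(
    vapply(seq_len(K1), function(k) wilcoxon(nonDis[-k], dis), numeric(1)),
    vapply(seq_len(K2), function(k) wilcoxon(nonDis, dis[-k]), numeric(1))
  )
  K * theta - (K - 1) * theta_del
}

set.seed(1)
nonDis <- sample(1:5, 20, replace = TRUE)  # hypothetical ratings
dis    <- sample(2:5, 15, replace = TRUE)
Y <- pseudovalues(nonDis, dis)
c(fom = wilcoxon(nonDis, dis), meanPseudovalue = mean(Y))
```

For the Wilcoxon FOM the mean of the pseudovalues reproduces the FOM itself, which is a convenient sanity check on any pseudovalue calculation.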

References

Dorfman DD, Berbaum KS, Metz CE (1992) Receiver operating characteristic rating analysis: Generalization to the Population of Readers and Patients with the Jackknife method, Invest. Radiol. 27, 723-731.

Obuchowski NA, Rockette HE (1995) Hypothesis Testing of the Diagnostic Accuracy for Multiple Diagnostic Tests: An ANOVA Approach with Dependent Observations, Communications in Statistics: Simulation and Computation 24, 285-308.

Hillis SL (2014) A marginal-mean ANOVA approach for analyzing multireader multicase radiological imaging data, Statistics in medicine 33, 330-360.

Examples

StSignificanceTesting(dataset02,FOM = "Wilcoxon", method = "DBM") 
StSignificanceTesting(dataset02,FOM = "Wilcoxon", method = "OR")
## following is split-plot-c analysis using a simulated split-plot-c dataset
StSignificanceTesting(datasetFROCSpC, FOM = "wAFROC", method = "OR")


StSignificanceTesting(dataset05, FOM = "wAFROC")
StSignificanceTesting(dataset05, FOM = "HrAuc", method = "DBM") 
StSignificanceTesting(dataset05, FOM = "SongA1", method = "DBM") 
StSignificanceTesting(dataset05, FOM = "SongA2", method = "DBM") 
StSignificanceTesting(dataset05, FOM = "wAFROC1", method = "DBM")
StSignificanceTesting(dataset05, FOM = "AFROC1", method = "DBM")
StSignificanceTesting(dataset05, FOM = "AFROC", method = "DBM")

Significance testing: standalone CAD vs. radiologists

Description

Comparing standalone CAD vs. at least two radiologists interpreting the same cases; standalone CAD means that all the designer-level mark-rating pairs generated by the CAD algorithm are available to the analyst, not just the one or two marks per case displayed to the radiologist (the latter are marks whose ratings exceed a pre-selected threshold). At the very minimum, location-level information, such as in the LROC paradigm, should be used. Ideally, the FROC paradigm should be used. A severe statistical power penalty is paid if one uses the ROC paradigm. See Standalone CAD vs Radiologists chapter, available via download link at site https://github.com/dpc10ster/RJafrocBook/blob/gh-pages/RJafrocBook.pdf

Usage

StSignificanceTestingCadVsRad(
  dataset,
  FOM,
  FPFValue = 0.2,
  method = "1T-RRRC",
  alpha = 0.05,
  plots = FALSE
)

Arguments

dataset

The dataset to be analyzed; must be single-treatment with at least three readers, where the first reader is CAD.

FOM

The desired FOM; for ROC data it must be "Wilcoxon", for FROC data it can be any valid FOM, e.g., "HrAuc", "wAFROC", etc; for LROC data it must be "Wilcoxon", or "PCL" or "ALROC".

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

method

The desired analysis: "1T-RRFC","1T-RRRC" (the default) or "2T-RRRC", see manuscript for details.

alpha

Significance level of the test, defaults to 0.05.

plots

Flag, default is FALSE, i.e., a plot is not displayed. If TRUE, it displays the appropriate operating characteristic for all readers and CAD.

Details

PCL is the probability of a correct localization.
The LROC is the plot of PCL (ordinate) vs. FPF.
For LROC data, FOM = "PCL" means the interpolated PCL value at the specified FPFValue.
For FOM = "ALROC" the trapezoidal area under the LROC from FPF = 0 to FPF = FPFValue is used.
If method = "1T-RRRC" the first reader is assumed to be CAD.
If method = "2T-RRRC" the first treatment is assumed to be CAD.
The NH is that the FOM of CAD equals the average of the readers.
The method = "1T-RRRC" analysis uses an adaptation of the single-treatment multiple-reader Obuchowski-Rockette (OR) model described in a paper by Hillis (2007), section 5.3. It is characterized by 3 parameters VarR, Var and Cov2, where the latter two are estimated using the jackknife.
For method = "2T-RRRC" the analysis replicates the CAD data as many times as necessary so as to form one "treatment" of an MRMC pairing, the other "treatment" being the radiologists. Then standard ORH analysis is applied. The method is described in Kooi et al. It gives exactly the same final results (F-statistic, ddf and p-value) as "1T-RRRC" but the intermediate quantities are meaningless.
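
The PCL and ALROC definitions given above can be illustrated in base R. The sketch assumes a small set of hypothetical empirical LROC operating points and uses linear interpolation and the trapezoidal rule; pcl_at and alroc_to are illustrative names, not RJafroc functions.

```r
# Hypothetical empirical LROC operating points (FPF, PCL), sorted by FPF.
fpf <- c(0, 0.1, 0.3, 1.0)
pcl <- c(0, 0.4, 0.6, 0.8)

# PCL at a given FPF by linear interpolation between operating points.
pcl_at <- function(fpfValue) approx(fpf, pcl, xout = fpfValue)$y

# Trapezoidal area under the LROC from FPF = 0 to FPF = fpfValue.
alroc_to <- function(fpfValue) {
  x <- c(fpf[fpf < fpfValue], fpfValue)
  y <- c(pcl[fpf < fpfValue], pcl_at(fpfValue))
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

pcl_at(0.2)    # interpolated PCL at FPF = 0.2, here 0.5
alroc_to(0.2)  # partial area up to FPF = 0.2, here 0.065
```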

Value

If method = "1T-RRRC" the return value is a list with the following elements:

fomCAD

The observed FOM for CAD.

fomRAD

The observed FOM array for the readers.

avgRadFom

The average FOM of the readers.

avgDiffFom

The mean of the difference FOM, RAD - CAD.

ciAvgDiffFom

The 95-percent CI of the average difference, RAD - CAD.

varR

The variance of the radiologists.

varError

The variance of the error term in the single-treatment multiple-reader OR model.

cov2

The covariance of the error term.

tstat

The observed value of the t-statistic; its square is equivalent to an F-statistic.

df

The degrees of freedom of the t-statistic.

pval

The p-value for rejecting the NH.

Plots

If argument plots = TRUE, a ggplot object containing empirical operating characteristics corresponding to specified FOM. For example, if FOM = "Wilcoxon" an ROC plot object is produced where reader 1 is CAD. If an LROC FOM is selected, an LROC plot is displayed.

If method = "2T-RRRC" the return value is a list with the following elements:

fomCAD

The observed FOM for CAD.

fomRAD

The observed FOM array for the readers.

avgRadFom

The average FOM of the readers.

avgDiffFom

The mean of the difference FOM, RAD - CAD.

ciDiffFom

A data frame containing the statistics associated with the average difference, RAD - CAD.

ciAvgRdrEachTrt

A data frame containing the statistics associated with the average FOM in each "treatment".

varR

The variance of the pure reader term in the OR model.

varTR

The variance of the treatment-reader interaction term in the OR model.

cov1

The covariance1 of the error term - same reader, different treatments.

cov2

The covariance2 of the error term - different readers, same treatment.

cov3

The covariance3 of the error term - different readers, different treatments.

varError

The variance of the pure error term in the OR model.

FStat

The observed value of the F-statistic.

ndf

The numerator degrees of freedom of the F-statistic.

df

The denominator degrees of freedom of the F-statistic.

pval

The p-value for rejecting the NH.

Plots

see above.

References

Hillis SL (2007) A comparison of denominator degrees of freedom methods for multiple observer ROC studies, Statistics in Medicine. 26:596-619.

Hupse R, Samulski M, Lobbes M, et al (2013) Standalone computer-aided detection compared to radiologists performance for the detection of mammographic masses, Eur Radiol. 23(1):93-100.

Kooi T, Gubern-Merida A, et al. (2016) A comparison between a deep convolutional neural network and radiologists for classifying regions of interest in mammography. Paper presented at: International Workshop on Digital Mammography, Malmo, Sweden.

Examples

ret1M <- StSignificanceTestingCadVsRad (dataset09, 
FOM = "Wilcoxon", method = "1T-RRRC")

StSignificanceTestingCadVsRad(datasetCadLroc, 
FOM = "Wilcoxon", method = "1T-RRFC")

retLroc1M <- StSignificanceTestingCadVsRad (datasetCadLroc, 
FOM = "PCL", method = "1T-RRRC", FPFValue = 0.05)

## test with fewer readers
dataset09a <- DfExtractDataset(dataset09, rdrs = seq(1:7))
ret1M7 <- StSignificanceTestingCadVsRad (dataset09a, 
FOM = "Wilcoxon", method = "1T-RRRC")

datasetCadLroc7 <- DfExtractDataset(datasetCadLroc, rdrs = seq(1:7))
ret1MLroc7 <- StSignificanceTestingCadVsRad (datasetCadLroc7, 
FOM = "PCL", method = "1T-RRRC", FPFValue = 0.05)


## takes longer than 5 sec on OSX
## retLroc2M <- StSignificanceTestingCadVsRad (datasetCadLroc, 
## FOM = "PCL", method = "2T-RRRC", FPFValue = 0.05)

## ret2MLroc7 <- StSignificanceTestingCadVsRad (datasetCadLroc7, 
## FOM = "PCL", method = "2T-RRRC", FPFValue = 0.05)

Perform significance testing using crossed treatments analysis

Description

Performs ORH analysis for specified crossed treatments dataset averaged over specified treatment factor

Usage

StSignificanceTestingCrossedModalities(
  ds,
  avgIndx,
  FOM = "wAFROC",
  alpha = 0.05,
  analysisOption = "ALL"
)

Arguments

ds

The crossed treatments dataset

avgIndx

The index of the treatment to be averaged over

FOM

See StSignificanceTesting

alpha

See StSignificanceTesting

analysisOption

See StSignificanceTesting

Value

The returned list contains the same items as that of StSignificanceTesting.

Examples

 
## read the built in dataset
retCrossed2 <- StSignificanceTestingCrossedModalities(datasetCrossedModality, 1)

RSM ROC/AFROC/wAFROC AUC calculator

Description

Returns the ROC, AFROC and wAFROC AUCs corresponding to specified RSM parameters. See also UtilAucPROPROC, UtilAucBinormal and UtilAucCBM

Usage

UtilAnalyticalAucsRSM(mu, lambda, nu, zeta1 = -Inf, lesDistr, relWeights = 0)

Arguments

mu

The mean of the Gaussian distribution for the ratings of latent LLs (continuous ratings of lesions that are found by the search mechanism). The NLs are assumed to be distributed as N(0,1).

lambda

The RSM lambda parameter.

nu

The RSM nu parameter.

zeta1

The lowest reporting threshold, the default is -Inf.

lesDistr

The lesion distribution 1D array, i.e., the probability mass function (pmf) of the numbers of lesions for diseased cases.

relWeights

The relative weights of the lesions; a vector of length maxLL; if zero, the default, equal weights are assumed.

Value

A list containing the ROC, AFROC and wAFROC AUCs corresponding to the specified parameters

References

Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449-3462.

Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.

Examples

mu <- 1;lambda <- 1;nu <- 0.9
lesDistr <- c(0.9, 0.1) 
## i.e., 90% of dis. cases have one lesion, and 10% have two lesions
relWeights <- c(0.05, 0.95)
## i.e., lesion 1 has weight 5 percent while lesion two has weight 95 percent

UtilAnalyticalAucsRSM(mu, lambda, nu, zeta1 = -Inf, lesDistr)
UtilAnalyticalAucsRSM(mu, lambda, nu, zeta1 = -Inf, lesDistr, relWeights)

Binormal model AUC function

Description

Returns the Binormal model ROC-AUC corresponding to specified parameters. See also UtilAnalyticalAucsRSM, UtilAucPROPROC and UtilAucCBM

Usage

UtilAucBinormal(a, b)

Arguments

a

The a parameter of the binormal model (separation of non-diseased and diseased pdfs)

b

The b parameter of the binormal model (std. dev. of non-diseased pdf; diseased pdf has unit std. dev.)

Value

Binormal model-predicted ROC-AUC
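
For reference, the binormal AUC has the closed form Phi(a / sqrt(1 + b^2)). The base R sketch below evaluates it and cross-checks it by numerically integrating the binormal ROC curve TPF = Phi(a + b * qnorm(FPF)); auc_binormal and roc_tpf are illustrative names, and the sketch is not taken from the package source.

```r
# Closed-form AUC of the binormal model.
auc_binormal <- function(a, b) pnorm(a / sqrt(1 + b^2))

# Binormal ROC curve: TPF as a function of FPF.
roc_tpf <- function(fpf, a, b) pnorm(a + b * qnorm(fpf))

a <- 2; b <- 0.7
auc_closed  <- auc_binormal(a, b)
auc_numeric <- integrate(roc_tpf, lower = 0, upper = 1, a = a, b = b)$value
c(closed = auc_closed, numeric = auc_numeric)
```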

Examples

a <- 2;b <- 0.7
UtilAucBinormal(a,b)

CBM AUC function

Description

Returns the CBM ROC-AUC corresponding to specified parameters. See also UtilAnalyticalAucsRSM, UtilAucPROPROC and UtilAucBinormal

Usage

UtilAucCBM(mu, alpha)

Arguments

mu

The mu parameter of CBM (separation of non-diseased and diseased pdfs)

alpha

The alpha parameter of CBM, i.e., the fraction of diseased cases on which the disease is visible

Value

CBM-predicted ROC-AUC for the specified parameters
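
Under CBM, a fraction alpha of diseased cases is sampled from N(mu, 1) and the remainder from N(0, 1), the same distribution as non-diseased cases. The AUC is then the corresponding mixture of pnorm(mu / sqrt(2)) and 0.5. The base R sketch below states this and cross-checks it by numerical integration; the function names are hypothetical and the code is not the package implementation.

```r
# Closed-form CBM AUC: visible-disease cases contribute Phi(mu / sqrt(2)),
# the remaining diseased cases are indistinguishable from non-diseased (0.5).
auc_cbm <- function(mu, alpha) (1 - alpha) * 0.5 + alpha * pnorm(mu / sqrt(2))

# Cross-check: AUC = P(diseased rating > non-diseased rating), integrating
# over the non-diseased rating x ~ N(0, 1).
auc_cbm_numeric <- function(mu, alpha) {
  integrand <- function(x) {
    p_exceed <- alpha * pnorm(x, mean = mu, lower.tail = FALSE) +
      (1 - alpha) * pnorm(x, lower.tail = FALSE)
    p_exceed * dnorm(x)
  }
  integrate(integrand, -Inf, Inf)$value
}

c(closed = auc_cbm(2, 0.8), numeric = auc_cbm_numeric(2, 0.8))
```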

References

Dorfman DD, Berbaum KS (2000) A contaminated binormal model for ROC data: Part II. A formal model, Acad Radiol 7:6 427–437.

Examples

mu <- 2;alpha <- 0.8
UtilAucCBM(mu,alpha)

PROPROC AUC function

Description

Returns the PROPROC ROC-AUC corresponding to specified parameters. See also UtilAnalyticalAucsRSM, UtilAucBinormal and UtilAucCBM

Usage

UtilAucPROPROC(c1, da)

Arguments

c1

The c-parameter of the PROPROC model; it is named c1 because c is a base function in R.

da

The da-parameter of the PROPROC model.

Value

PROPROC model-predicted ROC-AUC for the specified parameters

References

Metz CE, Pan X (1999) Proper Binormal ROC Curves: Theory and Maximum-Likelihood Estimation, J Math Psychol 43(1):1-33.

Examples

c1 <- .2;da <- 1.5
UtilAucPROPROC(c1,da)

Convert from DBM to OR variance components

Description

UtilDBM2ORVarCom converts from DBM variance components to OR variance components

Usage

UtilDBM2ORVarCom(K, DBMVarCom)

Arguments

K

Total number of cases

DBMVarCom

DBM variance components, a data.frame containing VarR, VarC, VarTR, VarTC, VarRC and VarErr

Value

UtilDBM2ORVarCom returns the equivalent OR Variance components
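
The conversion can be sketched in base R using the relations between DBM and OR variance components commonly attributed to Hillis et al. That the package uses exactly these relations, without extra handling of negative estimates, is an assumption here. The input values are hypothetical and dbm_to_or is an illustrative name.

```r
# DBM -> OR variance components for K cases (assumed standard relations).
dbm_to_or <- function(K, dbm) {
  with(as.list(dbm), c(
    VarR  = VarR,
    VarTR = VarTR,
    Cov1  = (VarC + VarRC) / K,
    Cov2  = (VarC + VarTC) / K,
    Cov3  = VarC / K,
    Var   = (VarC + VarTC + VarRC + VarErr) / K
  ))
}

# Hypothetical DBM variance components (not from any real dataset).
dbm <- c(VarR = 0.0015, VarC = 0.03, VarTR = 0.0002,
         VarTC = 0.01, VarRC = 0.012, VarErr = 0.04)
orvc <- dbm_to_or(114, dbm)
orvc
```

Under these relations Var - Cov1 - Cov2 + Cov3 equals VarErr / K, which gives a quick consistency check on converted values.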

Examples

DBMVarCom <- StSignificanceTesting(dataset02, FOM = "Wilcoxon", method = "DBM")$ANOVA$VarCom
UtilDBM2ORVarCom(114, DBMVarCom)

ORVarCom <- StSignificanceTesting(dataset02, FOM = "Wilcoxon", method = "OR")$ANOVA$VarCom
UtilOR2DBMVarCom(114, ORVarCom)

Calculate empirical figures of merit (FOMs) for specified dataset

Description

Calculate the specified empirical figure of merit for each treatment-reader combination in the ROC, FROC, ROI or LROC dataset

Usage

UtilFigureOfMerit(dataset, FOM = "wAFROC", FPFValue = 0.2)

Arguments

dataset

The dataset to be analyzed, see RJafroc-package

FOM

The figure of merit; the default is "wAFROC"

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

Details

The allowed FOMs depend on the dataType field of the dataset object.

For dataset$descriptions$design = "SPLIT-PLOT-C", end-point based FOMs (e.g., "MaxLLF") are not allowed. For dataset$descriptions$type = "ROC" only FOM = "Wilcoxon" is allowed. For dataset$descriptions$type = "FROC" the following FOMs are allowed:

FOM = "AFROC1" (use only if zero normal cases)
FOM = "AFROC"
FOM = "wAFROC1" (use only if zero normal cases)
FOM = "wAFROC" (the default)
FOM = "HrAuc"
FOM = "SongA1"
FOM = "SongA2"
FOM = "HrSe" (an example of an end-point based FOM)
FOM = "HrSp" (another example)
FOM = "MaxLLF" (ditto)
FOM = "MaxNLF" (ditto)
FOM = "MaxNLFAllCases" (ditto)
FOM = "ExpTrnsfmSp"

"MaxLLF" corresponds to the ordinate, and "MaxNLF" and "MaxNLFAllCases" to the abscissa, of the highest point on the FROC operating characteristic obtained by counting all the marks. The "ExpTrnsfmSp" FOM is described in the paper by Popescu. Given the large number of FOMs possible with FROC data, a recommendation is appropriate: use the wAFROC FOM whenever possible.

For dataType = "ROI" dataset only FOM = "ROI" is allowed.

For dataType = "LROC" dataset the following FOMs are allowed:

FOM = "Wilcoxon" for ROC data inferred from LROC data
FOM = "PCL" the probability of correct localization at specified FPFValue
FOM = "ALROC" the area under the LROC from zero to specified FPFValue

FPFValue The FPF at which to evaluate PCL or ALROC; the default is 0.2; only needed for LROC data.

Value

A c(I, J) dataframe, where the row names are the modalityIDs of the treatments and the column names are the readerIDs of the readers.
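
The layout of the returned value can be illustrated in base R for the simplest case, the empirical Wilcoxon FOM on ROC ratings. The ratings arrays, their dimension order and the trt/rdr labels below are hypothetical choices for the sketch; this is not the package's implementation.

```r
# Empirical (Wilcoxon) AUC with ties counted as one half.
wilcoxon <- function(nonDis, dis) {
  mean(outer(dis, nonDis, ">") + 0.5 * outer(dis, nonDis, "=="))
}

set.seed(2)
I <- 2; J <- 3; K1 <- 10; K2 <- 8   # treatments, readers, cases
nl <- array(sample(1:5, I * J * K1, replace = TRUE), dim = c(I, J, K1))
ll <- array(sample(2:5, I * J * K2, replace = TRUE), dim = c(I, J, K2))

# One FOM per treatment-reader combination.
fom <- matrix(NA_real_, I, J,
              dimnames = list(paste0("trt", 1:I), paste0("rdr", 1:J)))
for (i in 1:I) for (j in 1:J) fom[i, j] <- wilcoxon(nl[i, j, ], ll[i, j, ])
fom <- as.data.frame(fom)
fom
```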

References

Chakraborty DP, Berbaum KS (2004) Observer studies involving detection and localization: modeling, analysis, and validation, Medical Physics, 31(8), 1–18.

Song T, Bandos AI, Rockette HE, Gur D (2008) On comparing methods for discriminating between actually negative and actually positive subjects with FROC type data, Medical Physics 35 1547–1558.

Popescu LM (2011) Nonparametric signal detectability evaluation using an exponential transformation of the FROC curve, Medical Physics, 38(10), 5690.

Obuchowski NA, Lieber ML, Powell KA (2000) Data Analysis for Detection and Localization of Multiple Abnormalities with Application to Mammography, Acad Radiol, 7:7 553–554.

Swensson RG (1996) Unified measurement of observer performance in detecting and localizing target objects on images, Med Phys 23:10, 1709–1725.

Examples

UtilFigureOfMerit(dataset02, FOM = "Wilcoxon") # ROC data
UtilFigureOfMerit(DfFroc2Roc(dataset01), FOM = "Wilcoxon") # FROC dataset, converted to ROC
UtilFigureOfMerit(dataset01) # FROC dataset, default wAFROC FOM
UtilFigureOfMerit(datasetCadLroc, FOM = "Wilcoxon") #LROC data
UtilFigureOfMerit(datasetCadLroc, FOM = "PCL") #LROC data
UtilFigureOfMerit(datasetCadLroc, FOM = "ALROC") #LROC data
UtilFigureOfMerit(datasetROI, FOM = "ROI") #ROI data
 # these are meant to illustrate conditions which will throw an error
## UtilFigureOfMerit(dataset02, FOM = "wAFROC") #error
## UtilFigureOfMerit(dataset01, FOM = "Wilcoxon") #error

Convert from intrinsic to physical RSM parameters

Description

Converts the intrinsic RSM parameters lambda_i and nu_i to the corresponding physical RSM parameters lambda and nu. The physical parameters are more meaningful but they depend on mu. The intrinsic parameters are independent of mu. See book for details.

Usage

UtilIntrinsic2RSM(mu, lambda_i, nu_i)

Arguments

mu

The mean of the Gaussian distribution for the ratings of latent LLs, i.e. continuous ratings of lesions that were found by the search mechanism ~ N(\mu,1). The corresponding distribution for the ratings of latent NLs is N(0,1).

lambda_i

The intrinsic Poisson lambda_i parameter.

nu_i

The intrinsic Binomial nu_i parameter.

Details

RSM is the Radiological Search Model described in the book. See also UtilRSM2Intrinsic.

Value

A list containing \lambda and \nu
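
A minimal base R sketch of the conversion, assuming the transformations lambda = lambda_i / mu and nu = 1 - exp(-nu_i * mu); these are taken as an assumption here and should be checked against the book. The function name is hypothetical.

```r
# Intrinsic -> physical RSM parameters (assumed transformations).
intrinsic_to_physical <- function(mu, lambda_i, nu_i) {
  list(lambda = lambda_i / mu,
       nu     = 1 - exp(-nu_i * mu))
}

phys <- intrinsic_to_physical(mu = 2, lambda_i = 20, nu_i = 1.1512925)
phys$lambda  # 10
phys$nu      # approximately 0.9
```

With these transformations any positive nu_i maps into a physical nu strictly between 0 and 1.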

References

Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449–3462.

Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.

Examples

mu <- 2;lambda_i <- 20;nu_i <- 1.1512925 
lambda <- UtilIntrinsic2RSM(mu, lambda_i, nu_i)$lambda 
nu <- UtilIntrinsic2RSM(mu, lambda_i, nu_i)$nu 
## note that the intrinsic values are only constrained to be positive, but the physical variable nu
## must obey 0 <= nu <= 1

Get the lesion distribution vector of a dataset

Description

The lesion distribution vector for a dataset.

Usage

UtilLesionDistrVector(dataset)

Arguments

dataset

A dataset object.

Details

Two characteristics of an FROC dataset, apart from ratings, affect the FOM: the distribution of lesions per case and the distribution of lesion weights. This function addresses the distribution of lesions per case. The distribution of weights is addressed in UtilLesionWeightsMatrix. lesDistr is a 1D-array containing the fraction of unique values of lesions per diseased case in the dataset. For ROC or LROC data this vector is c(1), since all diseased cases contain one lesion. For FROC data the length of the vector equals the maximum number of lesions per diseased case. The first entry is the fraction of dis. cases containing one lesion, the second entry is the fraction of dis. cases containing two lesions, etc. See PlotRsmOperatingCharacteristics for a function that depends on lesDistr.

Value

lesDistr The 1D lesion distribution array.
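
The definition can be illustrated in base R from a vector of lesion counts, one entry per diseased case. The counts below are hypothetical and lesion_distr is an illustrative name, not an RJafroc function.

```r
# Fraction of diseased cases with 1, 2, ..., maxLL lesions.
lesion_distr <- function(lesionsPerCase) {
  tabulate(lesionsPerCase, nbins = max(lesionsPerCase)) / length(lesionsPerCase)
}

# Hypothetical study: 83 diseased cases with one lesion, 6 with two.
perCase <- c(rep(1, 83), rep(2, 6))
lesDistr <- lesion_distr(perCase)
lesDistr
```

Lesion counts that never occur produce a zero entry, so the vector always has length equal to the maximum number of lesions per diseased case.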

Examples

UtilLesionDistrVector (dataset01) # FROC dataset ## [1] 0.93258427 0.06741573
UtilLesionDistrVector (dataset02) # ROC dataset  ## 1

Determine lesion weights distribution 2D matrix

Description

Determine the lesion weights distribution 2D matrix of a dataset or manually specify the lesion weights distribution 2D matrix.

Usage

UtilLesionWeightsMatrixDataset(dataset, relWeights = 0)

UtilLesionWeightsMatrixLesDistr(lesDistr, relWeights = 0)

Arguments

dataset

A dataset object.

relWeights

The relative weights of the lesions: a unit sum vector of length equal to the maximum number of lesions per dis. case. For example, c(0.2, 0.4, 0.1, 0.3) means that on cases with one lesion the weight of the lesion is unity, on cases with two lesions the ratio of the weight of the first lesion to that of the second lesion is 0.2:0.4, i.e., lesion 2 is twice as important as lesion 1. On cases with 4 lesions the weights are in the ratio 0.2:0.4:0.1:0.3. The default is zero, in which case equal lesion weights are assumed.

lesDistr

A unit sum vector of length equal to the maximum number of lesions per diseased case, specifying the relative frequency of lesions per dis. case in the dataset. For example, c(0.8, 0.15, 0.05) specifies a dataset in which 80 percent of the dis. cases have one lesion per dis. case, 15 percent have two lesions per dis. case and 5 percent have three lesions per dis. case. As another example, c(0.8, 0.15, 0, 0.05) specifies a dataset in which 80 percent of the dis. cases have one lesion per dis. case, 15 percent have two lesions per dis. case, there are no cases with three lesions per dis. case and 5 percent have four lesions per dis. case.

Details

Two characteristics of an FROC dataset, apart from the ratings, affect the FOM: the distribution of lesions per case and the distribution of lesion weights. This function addresses the weights. The distribution of lesions is addressed in UtilLesionDistrVector. See PlotRsmOperatingCharacteristics for a function that depends on lesWghtDistr. The underlying assumption is that lesion 1 is the same type across all diseased cases, lesion 2 is the same type across all diseased cases, ..., etc. This allows assignment of weights independent of the case index.

Value

lesWghtDistr The 2D lesion weights distribution matrix. The first column enumerates the number of lesions per case, while the remaining columns contain the weights. Missing values are filled with -Inf. Not to be confused with the lesionWeight list member in an FROC dataset, which enumerates the weights of lesions on individual cases.
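
The renormalization of relWeights described above can be sketched in base R. The sketch assumes that every lesion count from 1 to the maximum occurs in the dataset, so there is one row per count. The function name is hypothetical and this is not the package implementation.

```r
# Row n holds the number of lesions n followed by the first n relative
# weights rescaled to unit sum; unused entries are filled with -Inf.
lesion_weights <- function(relWeights) {
  maxLL <- length(relWeights)
  m <- matrix(-Inf, nrow = maxLL, ncol = maxLL + 1)
  for (n in seq_len(maxLL)) {
    m[n, 1] <- n
    m[n, 2:(n + 1)] <- relWeights[1:n] / sum(relWeights[1:n])
  }
  m
}

W <- lesion_weights(c(0.2, 0.4, 0.1, 0.3))
W
```

For the relative weights c(0.2, 0.4, 0.1, 0.3) the second row is 2, 1/3, 2/3, -Inf, -Inf, i.e., on two-lesion cases lesion 2 carries twice the weight of lesion 1.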

Examples

UtilLesionWeightsMatrixDataset (dataset01) # FROC data

##      [,1] [,2] [,3]
##[1,]    1  1.0 -Inf
##[2,]    2  0.5  0.5

UtilLesionWeightsMatrixDataset (dataset02) # ROC data

##      [,1] [,2]
##[1,]    1  1

## Example 1: dataset with 1 to 4 lesions per case, with frequency as per first argument
UtilLesionWeightsMatrixLesDistr (c(0.6, 0.2, 0.1, 0.1), c(0.2, 0.4, 0.1, 0.3))

##     [,1]      [,2]      [,3]      [,4] [,5]
##[1,]    1 1.0000000      -Inf      -Inf -Inf
##[2,]    2 0.3333333 0.6666667      -Inf -Inf
##[3,]    3 0.2857143 0.5714286 0.1428571 -Inf
##[4,]    4 0.2000000 0.4000000 0.1000000  0.3

## Explanation 
##> c(0.2)/sum(c(0.2))
##[1] 1 ## (weights for cases with 1 lesion)
##> c(0.2, 0.4)/sum(c(0.2, 0.4))
##[1] 0.3333333 0.6666667 ## (weights for cases with 2 lesions)
##> c(0.2, 0.4, 0.1)/sum(c(0.2, 0.4, 0.1))
##[1] 0.2857143 0.5714286 0.1428571 ## (weights for cases with 3 lesions)
##> c(0.2, 0.4, 0.1, 0.3)/sum(c(0.2, 0.4, 0.1, 0.3))
##[1] 0.2000000 0.4000000 0.1000000  0.3 ## (weights for cases with 4 lesions)


## Example 2: dataset with *no* cases with 3 lesions per case
UtilLesionWeightsMatrixLesDistr (c(0.1, 0.7, 0.0, 0.2), c(0.4, 0.3, 0.2, 0.1))

##       [,1]  [,2]      [,3]    [,4]
##[1,]    1 1.0000000      -Inf  -Inf
##[2,]    2 0.5714286 0.4285714  -Inf
##[3,]    4 0.5000000 0.3750000 0.125

## Explanation: note that row with 3 lesions per case does not occur 
##> c(0.4)/sum(c(0.4))
##[1] 1 ## (weights for cases with 1 lesion)
##> c(0.4, 0.3)/sum(c(0.4, 0.3))
##[1] 0.5714286 0.4285714 ## (weights for cases with 2 lesions)
##> c(0.4, 0.3, 0.1)/sum(c(0.4, 0.3, 0.1))
##[1] 0.500 0.375 0.125 ## (weights for cases with 4 lesions)

Calculate mean squares for factorial dataset

Description

Calculates the mean squares used in the DBM and ORH methods for factorial dataset

Usage

UtilMeanSquares(dataset, FOM = "Wilcoxon", FPFValue = 0.2, method = "DBM")

Arguments

dataset

The dataset to be analyzed, see RJafroc-package.

FOM

The figure of merit to be used in the calculation. The default is "Wilcoxon". See UtilFigureOfMerit.

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

method

The method, in which the mean squares are calculated. The two valid choices are "DBM" (default) and "OR".

Details

For DBM method, msT, msTR, msTC, msTRC will not be available if the dataset contains only one treatment. Similarly, msR, msTR, msRC, msTRC will not be returned for single reader dataset. For ORH method, msT, msR, msTR will be returned for multiple reader multiple treatment dataset. msT is not available for single treatment dataset, and msR is not available for single reader dataset.

Value

A list containing all possible mean squares
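
For the OR method the mean squares are computed from the treatment-by-reader matrix of FOM values. A base R sketch follows, assuming the usual two-way ANOVA (no replication) definitions; the FOM values are hypothetical and or_mean_squares is an illustrative name.

```r
# Mean squares of an I x J (treatment x reader) matrix of FOM values.
or_mean_squares <- function(fom) {
  I <- nrow(fom); J <- ncol(fom)
  grand <- mean(fom)
  trt <- rowMeans(fom)   # treatment means, averaged over readers
  rdr <- colMeans(fom)   # reader means, averaged over treatments
  resid <- fom - outer(trt, rdr, "+") + grand
  list(msT  = J * sum((trt - grand)^2) / (I - 1),
       msR  = I * sum((rdr - grand)^2) / (J - 1),
       msTR = sum(resid^2) / ((I - 1) * (J - 1)))
}

# Hypothetical FOM values for 2 treatments and 4 readers.
fom <- matrix(c(0.92, 0.86, 0.82, 0.81,
                0.95, 0.86, 0.84, 0.85), nrow = 2, byrow = TRUE)
ms <- or_mean_squares(fom)
ms
```

With a single treatment (I = 1) or a single reader (J = 1) the corresponding denominators vanish, which is why those mean squares are not available in such cases.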

Examples

UtilMeanSquares(dataset02, FOM = "Wilcoxon")
UtilMeanSquares(dataset05, FOM = "wAFROC", method = "OR")

Convert from OR to DBM variance components

Description

UtilOR2DBMVarCom converts from OR to DBM variance components.

Usage

UtilOR2DBMVarCom(K, ORVarCom)

Arguments

K

Total number of cases

ORVarCom

OR variance components, a data.frame containing VarR, VarTR, Cov1, Cov2, Cov3 and Var

Value

UtilOR2DBMVarCom returns the equivalent DBM variance components
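
The reverse conversion can be sketched in base R by inverting the commonly used relations between the two sets of components, e.g., Cov3 = VarC / K and Var = (VarC + VarTC + VarRC + VarErr) / K. That the package applies exactly these relations is an assumption. The input values are hypothetical and or_to_dbm is an illustrative name.

```r
# OR -> DBM variance components for K cases (assumed standard relations).
or_to_dbm <- function(K, orvc) {
  with(as.list(orvc), c(
    VarR   = VarR,
    VarC   = K * Cov3,
    VarTR  = VarTR,
    VarTC  = K * (Cov2 - Cov3),
    VarRC  = K * (Cov1 - Cov3),
    VarErr = K * (Var - Cov1 - Cov2 + Cov3)
  ))
}

# Hypothetical OR variance components (not from any real dataset).
orvc <- c(VarR = 0.0015, VarTR = 0.0002, Cov1 = 0.00035,
          Cov2 = 0.00034, Cov3 = 0.00024, Var = 0.0008)
dbm <- or_to_dbm(114, orvc)
dbm
```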

Examples

DBMVarCom <- StSignificanceTesting(dataset02, FOM = "Wilcoxon", method = "DBM")$ANOVA$VarCom
UtilDBM2ORVarCom(114, DBMVarCom)

ORVarCom <- StSignificanceTesting(dataset02, FOM = "Wilcoxon", method = "OR")$ANOVA$VarCom
UtilOR2DBMVarCom(114, ORVarCom)

Utility for estimating Obuchowski-Rockette variance components for factorial datasets

Description

Utility for estimating Obuchowski-Rockette variance components for factorial datasets

Usage

UtilORVarComponentsFactorial(
  dataset,
  FOM,
  FPFValue = 0.2,
  covEstMethod = "jackknife",
  nBoots = 200,
  seed = NULL
)

Arguments

dataset

The factorial dataset object

FOM

The figure of merit

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

covEstMethod

The covariance estimation method, "jackknife" (the default) or "bootstrap" or "DeLong" (DeLong is applicable only for FOM = "Wilcoxon").

nBoots

Only needed for bootstrap covariance estimation method. The number of bootstraps, defaults to 200.

seed

Only needed for the bootstrap covariance estimation method. The initial seed for the random number generator, the default is NULL, as if no seed has been specified.

Details

The variance components are obtained using StSignificanceTesting with method = "OR".

Value

A list object containing the following data.frames:

foms: the figures of merit for different treatment-reader combinations
TRanova: the OR treatment-reader ANOVA table
VarCom: the OR variance-components Cov1, Cov2, Cov3, Var and correlations rho1, rho2 and rho3
IndividualTrt: the individual treatment mean-squares, Var and Cov2 values
IndividualRdr: the individual reader mean-squares, Var and Cov1 values

Examples

## use the default jackknife for covEstMethod
vc <- UtilORVarComponentsFactorial(dataset02, FOM = "Wilcoxon")
str(vc) 

UtilORVarComponentsFactorial(dataset02, FOM = "Wilcoxon", 
   covEstMethod = "bootstrap", nBoots = 2000, seed = 100)$VarCom 

UtilORVarComponentsFactorial(dataset02, FOM = "Wilcoxon", covEstMethod = "DeLong")$VarCom

Generate a text formatted report file or an Excel file

Description

Generates a formatted report of the analysis and saves it to a text or an Excel file

Usage

UtilOutputReport(
  dataset,
  ReportFileBaseName = NULL,
  ReportFileExt = "txt",
  method = "DBM",
  FOM,
  alpha = 0.05,
  covEstMethod = "jackknife",
  nBoots = 200,
  sequentialNames = FALSE,
  overWrite = FALSE,
  analysisOption = "ALL"
)

Arguments

dataset

The dataset object to be analyzed (not the file name), see Dataset in RJafroc-package.

ReportFileBaseName

The report file (with extension "txt" or "xlsx", as specified by option ReportFileExt) is created in the user's directory. This argument specifies the report file base name (i.e., without the extension) for the desired report; the default is NULL, in which case the system generates a temporary text file, whose very long name is displayed. However, the file is very hard to locate. This is so that the software passes CRAN checks, as writing to the project directory, or any of its sub-directories, is frowned upon.

ReportFileExt

The report file extension determines the type of output. "txt", the default, for a text file, "xlsx" for an Excel file.

method

The significance testing method, "OR" or (the default) "DBM".

FOM

The figure of merit; see StSignificanceTesting.

alpha

See StSignificanceTesting; the default is 0.05.

covEstMethod

See StSignificanceTesting; only needed for method = "OR"; the default is "jackknife".

nBoots

See StSignificanceTesting; only needed for "OR" analysis; the default is 200.

sequentialNames

A logical variable: if TRUE, consecutive integers (starting from 1) will be used as the treatment and reader IDs in the output report. Otherwise, treatment and reader IDs in the original dataset are used. This option is needed for aesthetics, as long names can mess up the output. The default is FALSE.

overWrite

A logical variable: if FALSE, a warning will be issued if the report file already exists and the program will wait until the user inputs "y" or "n" to determine whether to overwrite the existing file. If TRUE, the existing file will be silently overwritten. The default is FALSE.

analysisOption

"RRRC", "FRRC", "RRFC or "ALL"; see StSignificanceTesting.

Details

A formatted report of the data analysis is written to the output file in either text or Excel format.

Value

StResult The object returned by StSignificanceTesting.

Examples



# text output is created in a temporary file
UtilOutputReport(dataset03, FOM = "Wilcoxon")
# Excel output is created in a temporary file
UtilOutputReport(dataset03, FOM = "Wilcoxon", ReportFileExt = "xlsx")

Pseudovalues for given dataset and FOM

Description

Returns centered jackknife pseudovalues AND jackknife FOM values, for factorial OR split-plot-a OR split-plot-c study designs

Usage

UtilPseudoValues(dataset, FOM, FPFValue = 0.2)

Arguments

dataset

The dataset to be analyzed, see RJafroc-package; must be factorial, or split-plot-a or split-plot-c.

FOM

The figure of merit to be used in the calculation, for example "wAFROC"; the usage above sets no default. See UtilFigureOfMerit.

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

Value

A list containing two arrays containing the pseudovalues and the jackknife FOM values of the datasets (a third returned value is for internal use).

Note

Each returned array has dimension c(I,J,K), where K depends on the FOM: K1 for FOMs that are based on normal cases only, K2 for FOMs that are based on abnormal cases only, and K for FOMs that are based on normal and abnormal cases.

Examples

UtilPseudoValues(dataset05, FOM = "wAFROC")$jkFomValues[1,1,1:10]
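
As a rough illustration of what jackknife FOM values and pseudovalues are, the base-R sketch below handles a single treatment-reader combination with the Wilcoxon figure of merit. The helper names wilcoxon_fom and jackknife_sketch are hypothetical, the ratings are simulated, and "centered" is assumed here to mean that the pseudovalues are shifted so that their mean equals the FOM; the package's internal computation may differ.

```r
# Hypothetical helpers, not RJafroc functions.
# Wilcoxon statistic (empirical AUC) from non-diseased (zk1) and diseased (zk2) ratings
wilcoxon_fom <- function(zk1, zk2) {
  mean(outer(zk1, zk2, function(x, y) (x < y) + 0.5 * (x == y)))
}

jackknife_sketch <- function(zk1, zk2) {
  K1 <- length(zk1)
  K2 <- length(zk2)
  K  <- K1 + K2
  fom <- wilcoxon_fom(zk1, zk2)
  jkFom <- numeric(K)
  for (k in seq_len(K)) {
    if (k <= K1) {
      jkFom[k] <- wilcoxon_fom(zk1[-k], zk2)          # drop one non-diseased case
    } else {
      jkFom[k] <- wilcoxon_fom(zk1, zk2[-(k - K1)])   # drop one diseased case
    }
  }
  pseudo <- K * fom - (K - 1) * jkFom                 # jackknife pseudovalues
  pseudo <- pseudo + (fom - mean(pseudo))             # assumed centering step
  list(fom = fom, jkFomValues = jkFom, jkPseudoValues = pseudo)
}

set.seed(1)
zk1 <- rnorm(20)            # simulated non-diseased ratings
zk2 <- rnorm(15, mean = 1)  # simulated diseased ratings
res <- jackknife_sketch(zk1, zk2)
length(res$jkFomValues)     # K = K1 + K2 for a FOM that uses all cases
```

Because the Wilcoxon statistic uses both normal and abnormal cases, the third dimension has length K, consistent with the Note above.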

Convert from physical to intrinsic RSM parameters

Description

Convert the physical RSM parameters \lambda' and \nu' to the intrinsic RSM parameters \lambda_i and \nu_i. The physical parameters are more meaningful but they depend on \mu. The intrinsic parameters are independent of \mu. See book for details.

Usage

UtilRSM2Intrinsic(mu, lambda, nu)

Arguments

mu

The RSM \mu parameter: the mean of the Gaussian distribution of the ratings of latent LLs

lambda

The physical Poisson \lambda' parameter, which describes the distribution of random numbers of latent NLs (suspicious regions that do not correspond to actual lesions) per case; the mean of these random numbers asymptotically approaches lambda

nu

The physical \nu' parameter; it is the success probability of the binomial distribution describing the random number of latent LLs (suspicious regions that correspond to actual lesions) per diseased case

Details

RSM is the Radiological Search Model described in the book. A latent mark becomes an actual mark if the corresponding rating exceeds the lowest reporting threshold zeta1. See also UtilIntrinsic2RSM.

Value

A list containing \lambda_i and \nu_i, the RSM search parameters

References

Chakraborty DP (2006) A search model and figure of merit for observer data acquired according to the free-response paradigm, Phys Med Biol 51, 3449-3462.

Chakraborty DP (2006) ROC Curves predicted by a model of visual search, Phys Med Biol 51, 3463–3482.

Examples

mu <- 2;lambda <- 10;nu <- 0.9
lambda_i <- UtilRSM2Intrinsic(mu, lambda, nu)$lambda_i 
nu_i <- UtilRSM2Intrinsic(mu, lambda, nu)$nu_i 
## note that the intrinsic values are only constrained to be positive, e.g., nu_i is not constrained
## to be between 0 and one.
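
The conversion can be sketched in base R as follows. This assumes the relations \lambda_i = \lambda' \mu and \nu_i = -log(1 - \nu') / \mu, which are taken from the author's recollection of the book rather than from the package source; the helper names rsm2intrinsic_sketch and intrinsic2rsm_sketch are hypothetical, and the package functions should be preferred in practice.

```r
# Hypothetical helpers, not RJafroc functions; the formulas are an assumption
# stated in the lead-in.
rsm2intrinsic_sketch <- function(mu, lambda, nu) {
  list(lambda_i = lambda * mu,
       nu_i     = -log(1 - nu) / mu)
}

# Assumed inverse relations: intrinsic back to physical
intrinsic2rsm_sketch <- function(mu, lambda_i, nu_i) {
  list(lambda = lambda_i / mu,
       nu     = 1 - exp(-nu_i * mu))
}

mu <- 2
lambda <- 10
nu <- 0.9
intr <- rsm2intrinsic_sketch(mu, lambda, nu)
phys <- intrinsic2rsm_sketch(mu, intr$lambda_i, intr$nu_i)

intr$nu_i > 1                              # the intrinsic nu may exceed one
all.equal(phys, list(lambda = lambda, nu = nu))
```

Under these assumed relations the physical nu always lies between 0 and 1, whereas the intrinsic nu_i is only constrained to be positive.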

Utility for Dorfman-Berbaum-Metz variance components

Description

Utility for Dorfman-Berbaum-Metz variance components

Usage

UtilVarComponentsDBM(dataset, FOM, FPFValue = 0.2)

Arguments

dataset

The dataset object

FOM

The figure of merit

FPFValue

Only needed for LROC data and FOM = "PCL" or "ALROC"; where to evaluate a partial curve based figure of merit. The default is 0.2.

Value

A list object containing the variance components.

Examples

UtilVarComponentsDBM(dataset02, FOM = "Wilcoxon")

TONY FROC dataset

Description

This is referred to in the book as the "TONY" dataset. It consists of 185 cases, 89 of which are diseased, interpreted in two treatments ("BT" = breast tomosynthesis and "DM" = digital mammography) by five radiologists using the FROC paradigm.

Usage

dataset01

Format

A list with 3 elements: $ratings, $lesions and $descriptions. $ratings contains 3 elements, $NL, $LL and $LL_IL, as sub-lists; $lesions contains 3 elements, $perCase, $IDs and $weights, as sub-lists; $descriptions contains 7 elements, $fileName, $type, $name, $truthTableStr, $design, $modalityID and $readerID, as sub-lists.

ratings$NL, num [1:2, 1:5, 1:185, 1:3], ratings of non-lesion localizations, NLs
ratings$LL, num [1:2, 1:5, 1:89, 1:2], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:89], number of lesions per diseased case
lesions$IDs, num [1:89, 1:2], numeric labels of lesions on diseased cases
lesions$weights, num [1:89, 1:2], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset01", base name of dataset in 'data' folder
descriptions$type, chr "FROC", the data type
descriptions$name, chr "TONY", the name of the dataset
descriptions$truthTableStr, num [1:2, 1:5, 1:185, 1:4] 1 1 1 1 ..., truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:2] "BT" "DM", treatment labels
descriptions$readerID, chr [1:5] "1" "2" "3" "4" ..., reader labels
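
The array dimensions listed above encode the study layout. The base-R sketch below uses placeholder arrays with the same dimensions as dataset01 (filled with NA, not with the actual ratings) to show how the numbers of treatments, readers and cases can be read off; the variable names are illustrative only.

```r
# Placeholder arrays with the dimensions documented for dataset01;
# they contain no real ratings.
NL <- array(NA_real_, dim = c(2, 5, 185, 3))
LL <- array(NA_real_, dim = c(2, 5, 89, 2))

I  <- dim(NL)[1]    # number of treatments
J  <- dim(NL)[2]    # number of readers
K  <- dim(NL)[3]    # total number of cases
K2 <- dim(LL)[3]    # number of diseased cases
K1 <- K - K2        # number of non-diseased cases
maxNL <- dim(NL)[4] # maximum number of NL marks per case
maxLL <- dim(LL)[4] # maximum number of lesions per diseased case

c(I = I, J = J, K = K, K1 = K1, K2 = K2, maxNL = maxNL, maxLL = maxLL)
```

The same reading applies to the Format sections of the other datasets documented below.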

References

Chakraborty DP, Svahn T (2011) Estimating the parameters of a model of visual search from ROC data: an alternate method for fitting proper ROC curves. PROC SPIE 7966.

Examples

str(dataset01)
PlotEmpiricalOperatingCharacteristics(dataset = dataset01, opChType = "wAFROC")$Plot

Van Dyke ROC dataset

Description

This is referred to in the book as the "VD" dataset. It consists of 114 cases, 45 of which are diseased, interpreted in two treatments ("0" = single spin echo MRI, "1" = cine-MRI) by five radiologists using the ROC paradigm. Each diseased case had an aortic dissection; the ROC paradigm generates one rating per case. Often referred to in the ROC literature as the Van Dyke dataset, which, along with the Franken dataset, has been widely used to illustrate advances in ROC methodology. The example below displays the ROC plot for the first treatment and first reader.

Usage

dataset02

Format

ratings$NL, num [1:2, 1:5, 1:114, 1], ratings of non-lesion localizations, NLs
ratings$LL, num [1:2, 1:5, 1:45, 1], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:45], number of lesions per diseased case
lesions$IDs, num [1:45, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:45, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset02", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "VAN-DYKE", the name of the dataset
descriptions$truthTableStr, num [1:2, 1:5, 1:114, 1:2] 1 1 1 1 ..., truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:2] "0" "1", treatment labels
descriptions$readerID, chr [1:5] "0" "1" "2" ..., reader labels

References

Van Dyke CW, et al. Cine MRI in the diagnosis of thoracic aortic dissection. 79th RSNA Meetings. 1993.

Examples

str(dataset02)
PlotEmpiricalOperatingCharacteristics(dataset = dataset02, opChType = "ROC")$Plot

Franken ROC dataset

Description

This is referred to in the book as the "FR" dataset. It consists of 100 cases, 67 of which are diseased, interpreted in two treatments, "0" = conventional film radiographs, "1" = digitized images viewed on monitors, by four radiologists using the ROC paradigm. Often referred to in the ROC literature as the Franken dataset, which, along with the Van Dyke dataset, has been widely used to illustrate advances in ROC methodology.

Usage

dataset03

Format

ratings$NL, num [1:2, 1:4, 1:100, 1], ratings of non-lesion localizations, NLs
ratings$LL, num [1:2, 1:4, 1:67, 1], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:67], number of lesions per diseased case
lesions$IDs, num [1:67, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:67, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset03", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "FRANKEN", the name of the dataset
descriptions$truthTableStr, num [1:2, 1:4, 1:100, 1:2], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:2] "TREAT1" "TREAT2", treatment labels
descriptions$readerID, chr [1:4] "READER_1" "READER_2" "READER_3" "READER_4", reader labels

References

Franken EA, et al. Evaluation of a Digital Workstation for Interpreting Neonatal Examinations: A Receiver Operating Characteristic Study. Investigative Radiology. 1992;27(9):732-737.

Examples

str(dataset03)
PlotEmpiricalOperatingCharacteristics(dataset = dataset03, opChType = "ROC")$Plot

Federica Zanca FROC dataset

Description

This is referred to in the book as the "FED" dataset. It consists of 200 mammograms, 100 of which contained one to three simulated microcalcifications, interpreted in five treatments (basically different image processing algorithms) by four radiologists using the FROC paradigm and a 5-point rating scale. The maximum number of NLs per case, over the entire dataset, was 7 and the dataset contained at least one diseased mammogram with 3 lesions. The Excel file containing this dataset is /inst/extdata/datasets/FZ_ALL.xlsx. The normal cases are labeled 100:199 while the abnormal cases are labeled 0:99.

Usage

dataset04

Format

ratings$NL, num [1:5, 1:4, 1:200, 1:7], ratings of non-lesion localizations, NLs
ratings$LL, num [1:5, 1:4, 1:100, 1:3], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:100], number of lesions per diseased case
lesions$IDs, num [1:100, 1:3], numeric labels of lesions on diseased cases
lesions$weights, num [1:100, 1:3], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset04", base name of dataset in 'data' folder
descriptions$type, chr "FROC", the data type
descriptions$name, chr "FEDERICA", the name of the dataset
descriptions$truthTableStr, num [1:5, 1:4, 1:200, 1:4], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:5] "1" "2" "3" "4" "5", treatment labels
descriptions$readerID, chr [1:4] "1" "3" "4" "5", reader labels

References

Zanca F et al. Evaluation of clinical image processing algorithms used in digital mammography. Medical Physics. 2009;36(3):765-775.

Examples

str(dataset04)
PlotEmpiricalOperatingCharacteristics(dataset = dataset04, opChType = "wAFROC")$Plot

John Thompson FROC dataset

Description

This is referred to in the book as the "JT" dataset. It consists of 92 cases, 47 of which are diseased, interpreted in two treatments ("1" = CT images acquired for attenuation correction, "2" = diagnostic CT images), by nine radiographers using the FROC paradigm. Each case was a slice of an anthropomorphic phantom; 47 slices had inserted nodular lesions (max 3 per slice). The maximum number of NLs per case, over the entire dataset, was 7.

Usage

dataset05

Format

ratings$NL, num [1:2, 1:9, 1:92, 1:7], ratings of non-lesion localizations, NLs
ratings$LL, num [1:2, 1:9, 1:47, 1:3], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:47], number of lesions per diseased case
lesions$IDs, num [1:47, 1:3], numeric labels of lesions on diseased cases
lesions$weights, num [1:47, 1:3], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset05", base name of dataset in 'data' folder
descriptions$type, chr "FROC", the data type
descriptions$name, chr "THOMPSON", the name of the dataset
descriptions$truthTableStr, num [1:2, 1:9, 1:92, 1:4], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:2] "1" "2", treatment labels
descriptions$readerID, chr [1:9] "1" "2" "3" "4" ..., reader labels

References

Thompson JD, Hogg P, et al. (2014) A Free-Response Evaluation Determining Value in the Computed Tomography Attenuation Correction Image for Revealing Pulmonary Incidental Findings: A Phantom Study. Academic Radiology, 21 (4): 538-545.

Examples

str(dataset05)
PlotEmpiricalOperatingCharacteristics(dataset = dataset05, opChType = "wAFROC")$Plot

Magnus FROC dataset

Description

This is referred to in the book as the "MAG" dataset (after Magnus Bath, who conducted the JAFROC analysis). It consists of 89 cases, 42 of which are diseased, interpreted in two treatments ("1" = conventional chest, "2" = chest tomosynthesis) by four radiologists using the FROC paradigm.

Usage

dataset06

Format

ratings$NL, num [1:2, 1:4, 1:89, 1:17], ratings of non-lesion localizations, NLs
ratings$LL, num [1:2, 1:4, 1:42, 1:15], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:42], number of lesions per diseased case
lesions$IDs, num [1:42, 1:15], numeric labels of lesions on diseased cases
lesions$weights, num [1:42, 1:15], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset06", base name of dataset in 'data' folder
descriptions$type, chr "FROC", the data type
descriptions$name, chr "MAGNUS", the name of the dataset
descriptions$truthTableStr, num [1:2, 1:4, 1:89, 1:16], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:2] "1" "2", treatment labels
descriptions$readerID, chr [1:4] "1" "2" "3" "4", reader labels

References

Vikgren J et al. Comparison of Chest Tomosynthesis and Chest Radiography for Detection of Pulmonary Nodules: Human Observer Study of Clinical Cases. Radiology. 2008;249(3):1034-1041.

Examples

str(dataset06)
PlotEmpiricalOperatingCharacteristics(dataset = dataset06, opChType = "wAFROC")$Plot

Lucy Warren FROC dataset

Description

This is referred to in the book as the "OPT" dataset (for OptiMam). It consists of 162 cases, 81 of which are diseased, interpreted in five treatments (see reference, basically different ways of acquiring the images) by seven radiologists using the FROC paradigm.

Usage

dataset07

Format

ratings$NL, num [1:5, 1:7, 1:162, 1:4], ratings of non-lesion localizations, NLs
ratings$LL, num [1:5, 1:7, 1:81, 1:3], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:81], number of lesions per diseased case
lesions$IDs, num [1:81, 1:3], numeric labels of lesions on diseased cases
lesions$weights, num [1:81, 1:3], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset07", base name of dataset in 'data' folder
descriptions$type, chr "FROC", the data type
descriptions$name, chr "LUCY-WARREN", the name of the dataset
descriptions$truthTableStr, num [1:5, 1:7, 1:162, 1:4], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:5] "1" "2" "3" "4" ..., treatment labels
descriptions$readerID, chr [1:7] "1" "2" "3" "4" ..., reader labels

References

Warren LM, Mackenzie A, Cooke J, et al. Effect of image quality on calcification detection in digital mammography. Medical Physics. 2012;39(6):3202-3213.

Examples

str(dataset07)
PlotEmpiricalOperatingCharacteristics(dataset = dataset07, opChType = "wAFROC")$Plot

Monica Penedo ROC dataset

Description

This is referred to in the book as the "PEN" dataset. It consists of 112 cases, 64 of which are diseased, interpreted in five treatments (basically different image compression algorithms) by five radiologists using the FROC paradigm (the inferred ROC dataset is included; the original FROC data is lost).

Usage

dataset08

Format

ratings$NL, num [1:5, 1:5, 1:112, 1], ratings of non-lesion localizations, NLs
ratings$LL, num [1:5, 1:5, 1:64, 1], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:64], number of lesions per diseased case
lesions$IDs, num [1:64, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:64, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset08", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "PENEDO", the name of the dataset
descriptions$truthTableStr, num [1:5, 1:5, 1:112, 1:2], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:5] "0" "1" "2" "3" ..., treatment labels
descriptions$readerID, chr [1:5] "0" "1" "2" "3" ..., reader labels

References

Penedo et al. Free-Response Receiver Operating Characteristic Evaluation of Lossy JPEG2000 and Object-based Set Partitioning in Hierarchical Trees Compression of Digitized Mammograms. Radiology. 2005;237(2):450-457.

Examples

str(dataset08)
PlotEmpiricalOperatingCharacteristics(dataset = dataset08, opChType = "ROC")$Plot

Nico Karssemeijer ROC dataset (CAD vs. radiologists)

Description

This is referred to in the book as the "NICO" dataset. It consists of 200 mammograms, 80 of which contain one malignant mass, interpreted by a CAD system and nine radiologists using the LROC paradigm. The first reader is CAD. The highest rating was used to convert this to an ROC dataset. The original LROC data is datasetCadLroc. Analyzing this data requires methods described in the book, implemented in the function StSignificanceTestingCadVsRad.

Usage

dataset09

Format

ratings$NL, num [1, 1:10, 1:200, 1], ratings of non-lesion localizations, NLs
ratings$LL, num [1, 1:10, 1:80, 1], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:80], number of lesions per diseased case
lesions$IDs, num [1:80, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:80, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset09", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "NICO-CAD-ROC", the name of the dataset
descriptions$truthTableStr, num [1, 1:10, 1:200, 1:2], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr "1", treatment label(s)
descriptions$readerID, chr [1:10] "1" "2" "3" "4" ..., reader labels

References

Hupse R et al. Standalone computer-aided detection compared to radiologists' performance for the detection of mammographic masses. Eur Radiol. 2013;23(1):93-100.

Examples

str(dataset09)
PlotEmpiricalOperatingCharacteristics(dataset = dataset09, rdrs = 1:10, opChType = "ROC")$Plot

Mark Ruschin ROC dataset

Description

This is referred to in the book as the "RUS" dataset. It consists of 90 cases, 40 of which are diseased. The images were acquired at three dose levels, which can be regarded as treatments. Eight radiologists interpreted the cases using the FROC paradigm. These have been reduced to ROC data by using the highest ratings (the original FROC data is lost).

Usage

dataset10

Format

ratings$NL, num [1:3, 1:8, 1:90, 1], ratings of non-lesion localizations, NLs
ratings$LL, num [1:3, 1:8, 1:40, 1], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:40], number of lesions per diseased case
lesions$IDs, num [1:40, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:40, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset10", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "RUSCHIN", the name of the dataset
descriptions$truthTableStr, num [1:3, 1:8, 1:90, 1:2], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:3] "1" "2" "3", treatment label(s)
descriptions$readerID, chr [1:8] "1" "2" "3" "4" ..., reader labels

References

Ruschin M, et al. Dose dependence of mass and microcalcification detection in digital mammography: free response human observer studies. Med Phys. 2007;34:400 - 407.

Examples

str(dataset10)
PlotEmpiricalOperatingCharacteristics(dataset = dataset10, opChType = "ROC")$Plot
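
Several datasets in this manual are ROC data inferred from location-specific data by taking the highest rating on each case. The base-R sketch below illustrates that reduction for one treatment and one reader on a toy example. It assumes that unused mark slots are coded as -Inf and that non-diseased cases come first in the NL array; both are assumptions of this sketch, and the toy ratings are invented.

```r
# Toy ratings for one treatment and one reader, laid out as [case, mark].
# -Inf marks an unused slot (assumed convention for this sketch).
NL <- rbind(c(1, -Inf),     # non-diseased case 1: one NL mark rated 1
            c(-Inf, -Inf),  # non-diseased case 2: no marks
            c(2, 3),        # diseased case 1: two NL marks
            c(-Inf, -Inf))  # diseased case 2: no NL marks
LL <- rbind(c(4, -Inf),     # diseased case 1: one LL mark rated 4
            c(-Inf, -Inf))  # diseased case 2: no LL marks

K2 <- nrow(LL)              # number of diseased cases
K1 <- nrow(NL) - K2         # number of non-diseased cases

highest <- function(x) apply(x, 1, max)

# Inferred ROC ratings: highest rating on each case, whatever the mark type
fp <- highest(NL[seq_len(K1), , drop = FALSE])
tp <- pmax(highest(NL[K1 + seq_len(K2), , drop = FALSE]), highest(LL))

fp
tp
```

In this sketch a case with no marks keeps the rating -Inf, i.e., it is ranked below every marked case.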

Dobbins 1 FROC dataset

Description

This is referred to in the book as the "DOB1" dataset. Dobbins et al conducted a multi-institutional, MRMC study to compare the performance of digital tomosynthesis (GE's VolumeRad device), dual-energy (DE) imaging, and conventional chest radiography for pulmonary nodule detection and management. All study images were obtained with a flat-panel detector developed by GE. The case set consisted of 158 subjects, of which 43 were non-diseased and the rest had 1 - 20 pulmonary nodules independently verified, using CT images, by 3 experts who did not participate in the observer study. The study used FROC paradigm data collection. There are 4 treatments labeled 1 - 4: conventional chest x-ray (CXR), CXR augmented with dual-energy (CXR+DE), VolumeRad digital tomosynthesis images, and VolumeRad augmented with DE (VolumeRad+DE).

Usage

dataset11

Format

ratings$NL, num [1:4, 1:5, 1:158, 1:4], ratings of non-lesion localizations, NLs
ratings$LL, num [1:4, 1:5, 1:115, 1:20], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:115], number of lesions per diseased case
lesions$IDs, num [1:115, 1:20], numeric labels of lesions on diseased cases
lesions$weights, num [1:115, 1:20], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset11", base name of dataset in 'data' folder
descriptions$type, chr "FROC", the data type
descriptions$name, chr "DOBBINS-1", the name of the dataset
descriptions$truthTableStr, num [1:4, 1:5, 1:158, 1:21], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:4] "1" "2" "3" "4", treatment label(s)
descriptions$readerID, chr [1:5] "1" "2" "3" "4" ..., reader labels

References

Dobbins III JT et al. Multi-Institutional Evaluation of Digital Tomosynthesis, Dual-Energy Radiography, and Conventional Chest Radiography for the Detection and Management of Pulmonary Nodules. Radiology. 2016;282(1):236-250.

Examples

str(dataset11)

Dobbins 2 ROC dataset

Description

This is referred to in the code as the "DOB2" dataset. It contains actionability ratings, i.e., do you recommend further follow-up on the patient, on a scale from 1 (definitely not) to 5 (definitely yes), effectively an ROC dataset using a 5-point rating scale.

Usage

dataset12

Format

ratings$NL, num [1:4, 1:5, 1:152, 1], ratings of non-lesion localizations, NLs
ratings$LL, num [1:4, 1:5, 1:88, 1], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:88], number of lesions per diseased case
lesions$IDs, num [1:88, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:88, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset12", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "DOBBINS-2", the name of the dataset
descriptions$truthTableStr, num [1:4, 1:5, 1:152, 1:2], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:4] "1" "2" "3" "4", treatment label(s)
descriptions$readerID, chr [1:5] "1" "2" "3" "4" ..., reader labels

Examples

str(dataset12)

Dobbins 3 FROC dataset

Description

This is referred to in the code as the "DOB3" dataset. This is a subset of DOB1 which includes data for lesions not visible on CXR, but visible to the truth panel on all treatments.

Usage

dataset13

Format

ratings$NL, num [1:4, 1:5, 1:158, 1:4], ratings of non-lesion localizations, NLs
ratings$LL, num [1:4, 1:5, 1:106, 1:15], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:106], number of lesions per diseased case
lesions$IDs, num [1:106, 1:15], numeric labels of lesions on diseased cases
lesions$weights, num [1:106, 1:15], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset13", base name of dataset in 'data' folder
descriptions$type, chr "FROC", the data type
descriptions$name, chr "DOBBINS-3", the name of the dataset
descriptions$truthTableStr, num [1:4, 1:5, 1:158, 1:16], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:4] "1" "2" "3" "4", treatment label(s)
descriptions$readerID, chr [1:5] "1" "2" "3" "4" ..., reader labels

Examples

str(dataset13)

Federica Zanca real (as opposed to inferred) ROC dataset

Description

This is referred to in the book as the "FZR" dataset. It is a real ROC study, conducted on the same images and using the same radiologists, on treatments "4" and "5" of dataset04. This was compared to highest rating inferred ROC data from dataset04 to conclude, erroneously, that the highest rating assumption is invalid. See book Section 13.6.2.

Usage

dataset14

Format

ratings$NL, num [1:2, 1:4, 1:200, 1], ratings of non-lesion localizations, NLs
ratings$LL, num [1:2, 1:4, 1:100, 1], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:100], number of lesions per diseased case
lesions$IDs, num [1:100, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:100, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "dataset14", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "FEDERICA-REAL-ROC", the name of the dataset
descriptions$truthTableStr, num [1:2, 1:4, 1:200, 1:2], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr [1:2] "4" "5", treatment label(s)
descriptions$readerID, chr [1:4] "1" "2" "3" "4", reader labels

References

Zanca F, Hillis SL, Claus F, et al (2012) Correlation of free-response and receiver-operating-characteristic area-under-the-curve estimates: Results from independently conducted FROC/ROC studies in mammography. Med Phys. 39(10):5917-5929.

Examples

str(dataset14)

Binned dataset suitable for checking `FitCorCbm`; seed = 123

Description

A binned dataset suitable for analysis by FitCorCbm. It was generated by DfCreateCorCbmDataset by setting the seed variable to 123. Note the formatting of the data as a single treatment two reader dataset, even though the actual pairing might be different, see FitCorCbm. The dataset is intentionally large so as to demonstrate the asymptotic convergence of ML estimates, produced by FitCorCbm, to the population values. The data was generated by the following argument values to DfCreateCorCbmDataset: seed = 123, K1 = 5000, K2 = 5000, desiredNumBins = 5, muX = 1.5, muY = 3, alphaX = 0.4, alphaY = 0.7, rhoNor = 0.3, rhoAbn2 = 0.8.

Usage

datasetBinned123

Format

ratings$NL, num [1, 1:2, 1:10000, 1], ratings of non-lesion localizations, NLs
ratings$LL, num [1, 1:2, 1:5000, 1], ratings of lesion localizations, LLs
ratings$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:5000], number of lesions per diseased case
lesions$IDs, num [1:5000, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:5000, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "datasetBinned123", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "SIM-CORCBM-SEED-123", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr "1", treatment label(s)
descriptions$readerID, chr [1:2] "1" "2", reader labels

References

Zhai X, Chakraborty DP (2017). A bivariate contaminated binormal model for robust fitting of proper ROC curves to a pair of correlated, possibly degenerate, ROC datasets. Medical Physics. 44(6):2207–2222.

Examples

str(datasetBinned123)

Binned dataset suitable for checking `FitCorCbm`; seed = 124

Description

A binned dataset suitable for analysis by FitCorCbm. It was generated by DfCreateCorCbmDataset by setting the seed variable to 124. Otherwise similar to datasetBinned123.

Usage

datasetBinned124

Format

rating$NL, num [1, 1:2, 1:10000, 1], ratings of non-lesion localizations, NLs
rating$LL, num [1, 1:2, 1:5000, 1], ratings of lesion localizations, LLs
rating$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:5000], number of lesions per diseased case
lesions$IDs, num [1:5000, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:5000, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "datasetBinned124", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "SIM-CORCBM-SEED-124", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr "1", treatment label(s)
descriptions$readerID, chr [1:2] "1" "2", reader labels

Examples

str(datasetBinned124)

Binned dataset suitable for checking `FitCorCbm`; seed = 125

Description

A binned dataset suitable for analysis by FitCorCbm. It was generated by DfCreateCorCbmDataset by setting the seed variable to 125. Otherwise similar to datasetBinned123.

Usage

datasetBinned125

Format

rating$NL, num [1, 1:2, 1:10000, 1], ratings of non-lesion localizations, NLs
rating$LL, num [1, 1:2, 1:5000, 1], ratings of lesion localizations, LLs
rating$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:5000], number of lesions per diseased case
lesions$IDs, num [1:5000, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:5000, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "datasetBinned125", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "SIM-CORCBM-SEED-125", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr "1", treatment label(s)
descriptions$readerID, chr [1:2] "1" "2", reader labels

Examples

str(datasetBinned125)

Nico Karssemeijer LROC dataset (CAD vs. radiologists)

Description

This is the actual LROC data corresponding to dataset09, which was the inferred ROC data. Note that the LL field is split into two, LL, representing true positives where the lesions were correctly localized, and LL_IL, representing true positives where the lesions were incorrectly localized. The first reader is CAD and the remaining readers are radiologists.

Usage

datasetCadLroc

Format

rating$NL, num [1, 1:10, 1:200, 1], ratings of localizations on normal cases
rating$LL, num [1, 1:10, 1:80, 1], ratings of correct localizations on abnormal cases
rating$LL_IL, num [1, 1:10, 1:80, 1], ratings of incorrect localizations on abnormal cases
lesions$perCase, int [1:80], number of lesions per diseased case
lesions$IDs, num [1:80, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:80, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "datasetCadLroc", base name of dataset in 'data' folder
descriptions$type, chr "LROC", the data type
descriptions$name, chr "NICO-CAD-LROC", the name of the dataset
descriptions$truthTableStr, num [1:2, 1:4, 1:200, 1:2], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr "1", treatment label(s)
descriptions$readerID, chr [1:10] "1" "2" "3" "4" ..., reader labels

References

Hupse R et al. Standalone computer-aided detection compared to radiologists' performance for the detection of mammographic masses. Eur Radiol. 2013;23(1):93-100.

Examples

str(datasetCadLroc)

Simulated FROC CAD vs. RAD dataset

Description

Simulated FROC CAD vs. RAD dataset suitable for checking code. It was generated from datasetCadLroc using SimulateFrocFromLrocData.R. The LROC paradigm always yields a single mark per case; therefore the equivalent FROC dataset also has only one mark per case. The NL arrays of the two datasets are identical. The LL array is created by copying the LL (correct localization) array of the LROC dataset to the LL array of the FROC dataset, from diseased case index k2 = 1 to k2 = K2. Additionally, the LL_IL array of the LROC dataset is copied to the NL array of the FROC dataset, from case index k1 = K1+1 to k1 = K1+K2. Any zero ratings are replaced by -Inf. The equivalent FROC dataset has the same HrAuc as the original LROC dataset. See example. The main use of this dataset and function is to test the CAD significance testing functions using CAD FROC datasets, which I currently don't have.
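
The copying steps described above can be sketched in base R on a toy example. This is only an illustration under stated assumptions, not the code in SimulateFrocFromLrocData.R: the helper name lroc_to_froc_sketch, the list layout, the toy ratings, and the use of 0 to code an absent LL or LL_IL rating are all hypothetical.

```r
# Toy LROC-like arrays: 1 treatment, J = 2 readers, K1 = 3 non-diseased
# and K2 = 2 diseased cases. All values are made up for illustration.
K1 <- 3; K2 <- 2; J <- 2
lroc <- list(
  NL    = array(-Inf, dim = c(1, J, K1 + K2, 1)),
  LL    = array(0,    dim = c(1, J, K2, 1)),  # assumed: 0 codes "no rating"
  LL_IL = array(0,    dim = c(1, J, K2, 1))
)
lroc$NL[1, , 1:K1, 1] <- matrix(c(1, 2, 3,  2, 2, 4), nrow = J, byrow = TRUE)
lroc$LL[1, , , 1]     <- matrix(c(5, 0,  0, 6), nrow = J, byrow = TRUE)
lroc$LL_IL[1, , , 1]  <- matrix(c(0, 3,  4, 0), nrow = J, byrow = TRUE)

# Hypothetical helper that mirrors the description:
# copy NL and LL, move LL_IL into the NL slots of the diseased cases,
# then replace zero ratings by -Inf.
lroc_to_froc_sketch <- function(lroc, K1, K2) {
  froc <- list(NL = lroc$NL, LL = lroc$LL)
  froc$NL[, , (K1 + 1):(K1 + K2), 1] <- lroc$LL_IL[, , , 1]
  froc$NL[froc$NL == 0] <- -Inf
  froc$LL[froc$LL == 0] <- -Inf
  froc
}

froc <- lroc_to_froc_sketch(lroc, K1, K2)

# Each reader has exactly one mark on every diseased case: either an LL
# or an NL stored at case index K1 + k2.
marks_per_diseased_case <-
  is.finite(froc$NL[1, , (K1 + 1):(K1 + K2), 1]) + is.finite(froc$LL[1, , , 1])
```

In this toy example the ratings on the first K1 cases are untouched, and each incorrect localization reappears as an NL on the corresponding diseased case.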

Usage

datasetCadSimuFroc

Format

rating$NL, num [1, 1:10, 1:200, 1], ratings of non-lesion localizations, NLs
rating$LL, num [1, 1:10, 1:80, 1], ratings of lesion localizations, LLs
rating$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:80], number of lesions per diseased case
lesions$IDs, num [1:80, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:80, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "datasetCadSimuFroc", base name of dataset in 'data' folder
descriptions$type, chr "LROC", the data type
descriptions$name, chr "NICO-CAD-LROC", the name of the dataset
descriptions$truthTableStr, num [1:2, 1:4, 1:200, 1:2], truth table structure
descriptions$design, chr "FCTRL", study design, factorial dataset
descriptions$modalityID, chr "1", treatment label(s)
descriptions$readerID, chr [1:10] "1" "2" "3" "4" ..., reader labels

John Thompson crossed treatment FROC dataset

Description

This is a crossed treatment dataset, see book Section 18.5. There are two treatment factors. The first treatment factor modalityID1 can be "F" or "I", which represent two CT reconstruction algorithms. The second treatment factor modalityID2 can be "20" "40" "60" "80", which represent the mAs values of the image acquisition. The factors are fully crossed. The function StSignificanceTestingCrossedModalities analyzes such datasets.

Usage

datasetCrossedModality

Format

rating$NL, num [1:2, 1:4, 1:11, 1:68, 1:5], ratings of non-lesion localizations, NLs
rating$LL, num [1:2, 1:4, 1:11, 1:34, 1:3], ratings of lesion localizations, LLs
rating$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:34], number of lesions per diseased case
lesions$IDs, num [1:34, 1:3], numeric labels of lesions on diseased cases
lesions$weights, num [1:34, 1:3], weights (or clinical importances) of lesions
descriptions$fileName, chr, "datasetCrossedModality", base name of dataset in 'data' folder
descriptions$type, chr "FROC", the data type
descriptions$name, chr "THOMPSON-X-MOD", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr [1:2] "F" "I", treatment label(s)
descriptions$readerID, chr [1:4] "20" "40" "60" "80", reader labels
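
The five-dimensional NL layout listed above (first treatment factor, second treatment factor, reader, case, mark) can be illustrated with a plain base R array. This is a sketch only: the dimnames are added here for readability and are an assumption, not something the packaged dataset is documented to carry, and the single rating is made up.

```r
# Labels of the two crossed treatment factors, as given in the description
modalityID1 <- c("F", "I")                 # CT reconstruction algorithms
modalityID2 <- c("20", "40", "60", "80")   # mAs values

J <- 11; K <- 68; maxNL <- 5               # sizes taken from the Format entry

# Empty NL array; -Inf marks "no rating" (assumed convention)
NL <- array(-Inf, dim = c(2, 4, J, K, maxNL),
            dimnames = list(modalityID1, modalityID2, NULL, NULL, NULL))

# Hypothetical NL mark: algorithm "I", 60 mAs, reader 3, case 10, first mark
NL["I", "60", 3, 10, 1] <- 2.5

# All ratings for one fully crossed treatment combination: reader x case x mark
slice <- NL["I", "60", , , ]
dim(slice)
```

Fixing both treatment indices leaves an array with the usual reader, case and mark dimensions.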

References

Thompson JD, Chakraborty DP, et al. (2016) Effect of reconstruction methods and x-ray tube current-time product on nodule detection in an anthropomorphic thorax phantom: a crossed-treatment JAFROC observer study. Medical Physics. 43(3):1265-1274.

Examples

str(datasetCrossedModality)

Simulated degenerate ROC dataset (for testing purposes)

Description

A simulated degenerate dataset. A degenerate dataset is defined as one with no interior operating points on the ROC plot. Such data tend to be observed with expert-level radiologists. This dataset is used to illustrate the robustness of two fitting models, namely CBM and RSM. The widely used binormal model and PROPROC fail on such datasets.
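
The definition above can be checked numerically. The following base R sketch computes empirical operating points from two rating vectors and tests whether any of them lies strictly inside the unit square. The helper names operating_points and is_degenerate and the toy ratings are hypothetical; they are not part of the package and do not reproduce datasetDegenerate itself.

```r
# Empirical operating points: one (FPF, TPF) pair per rating threshold.
# The lowest rating is dropped because it always yields the trivial (1, 1) point.
operating_points <- function(nl, ll) {
  cuts <- sort(unique(c(nl, ll)))[-1]
  data.frame(
    FPF = vapply(cuts, function(t) mean(nl >= t), numeric(1)),
    TPF = vapply(cuts, function(t) mean(ll >= t), numeric(1))
  )
}

# Degenerate: no operating point strictly inside the unit square
is_degenerate <- function(nl, ll) {
  op <- operating_points(nl, ll)
  interior <- op$FPF > 0 & op$FPF < 1 & op$TPF > 0 & op$TPF < 1
  !any(interior)
}

# Toy "expert" reader: every non-diseased case gets the lowest rating
nl_deg <- c(1, 1, 1, 1, 1)
ll_deg <- c(1, 1, 1, 1, 5, 5, 5, 5, 5, 5)

# Toy reader who spreads ratings over several bins
nl_ok <- c(1, 1, 2, 3, 1)
ll_ok <- c(2, 3, 3, 5, 5, 4, 1, 5, 5, 5)

is_degenerate(nl_deg, ll_deg)
is_degenerate(nl_ok, ll_ok)
```

For the first reader the only non-trivial operating point lies on the left edge of the ROC plot (FPF = 0), so the data are degenerate in the sense defined above.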

Usage

datasetDegenerate

Format

rating$NL, num [1, 1, 1:15, 1], ratings of non-lesion localizations, NLs
rating$LL, num [1, 1, 1:10, 1], ratings of lesion localizations, LLs
rating$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:10], number of lesions per diseased case
lesions$IDs, num [1:10, 1], numeric labels of lesions on diseased cases
lesions$weights, num [1:10, 1], weights (or clinical importances) of lesions
descriptions$fileName, chr, "datasetDegenerate", base name of dataset in 'data' folder
descriptions$type, chr "ROC", the data type
descriptions$name, chr "SIM-DEGENERATE", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr "1", treatment label(s)
descriptions$readerID, chr "1", reader labels

Examples

str(datasetDegenerate)

Simulated FROC SPLIT-PLOT-C dataset

Description

A simulated split-plot dataset created from the FED Excel dataset by successively ignoring readers 3:4, c(1,3:4), c(1:2,4), etc. The resulting split-plot Excel dataset was confirmed to be read without error.

Usage

datasetFROCSpC

Format

rating$NL, num [1:2, 1:4, 1:200, 1:7], ratings of non-lesion localizations, NLs
rating$LL, num [1:2, 1:4, 1:100, 1:3], ratings of lesion localizations, LLs
rating$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:100], number of lesions per diseased case
lesions$IDs, num [1:100, 1:3], numeric labels of lesions on diseased cases
lesions$weights, num [1:100, 1:3], weights (or clinical importances) of lesions
descriptions$fileName, chr, "datasetFROCSpC", base name of dataset in 'data' folder
descriptions$type, chr "FROC", the data type
descriptions$name, chr "SIM-FROC-SPLIT-PLOT-C", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr [1:2] "4" "5", treatment label(s)
descriptions$readerID, chr [1:4] "1" "3" "4" "5", reader labels

Examples

str(datasetFROCSpC)

Simulated ROI dataset

Description

TBA. Simulated ROI dataset, assuming 4 ROIs per case, 5 readers, 50 non-diseased and 40 diseased cases.

Usage

datasetROI

Format

rating$NL, num [1:2, 1:5, 1:90, 1:4], ratings of non-lesion localizations, NLs
rating$LL, num [1:2, 1:5, 1:40, 1:4], ratings of lesion localizations, LLs
rating$LL_IL, NA, this placeholder is used only for LROC data
lesions$perCase, int [1:40], number of lesions per diseased case
lesions$IDs, num [1:40, 1:4], numeric labels of lesions on diseased cases
lesions$weights, num [1:40, 1:4], weights (or clinical importances) of lesions
descriptions$fileName, chr, "datasetROI", base name of dataset in 'data' folder
descriptions$type, chr "ROI", the data type
descriptions$name, chr "SIM-ROI", the name of the dataset
descriptions$truthTableStr, NA, truth table structure
descriptions$design, chr "FCTRL-X-MOD", study design, factorial dataset
descriptions$modalityID, chr [1:2] "1" "2", treatment label(s)
descriptions$readerID, chr [1:5] "1" "2" "3" "4" ..., reader labels

Examples

str(datasetROI)

Determine if a dataset is binned

Description

Determine if a dataset is binned

Usage

isBinnedDataset(dataset, maxUniqeRatings = 6)

Arguments

dataset

The dataset

maxUniqeRatings

For each treatment-reader combination, the maximum number of unique ratings allowed for the combination to be classified as binned; the default value is 6. If there are more unique ratings, the treatment-reader combination is classified as not binned.

Value

a logical [I x J] array, TRUE if the corresponding treatment-reader combination is binned, i.e., has at most maxUniqeRatings unique ratings, FALSE otherwise.
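
The rule described above can be sketched in base R on plain rating arrays. The helper is_binned_sketch is hypothetical and is not the package's implementation; in particular, pooling the NL and LL ratings and ignoring -Inf entries are assumptions made for this illustration.

```r
# Hypothetical helper: count unique finite ratings per treatment-reader
# combination and compare with the threshold.
is_binned_sketch <- function(NL, LL, max_unique = 6) {
  I <- dim(NL)[1]
  J <- dim(NL)[2]
  binned <- matrix(FALSE, nrow = I, ncol = J)
  for (i in seq_len(I)) {
    for (j in seq_len(J)) {
      r <- c(NL[i, j, , ], LL[i, j, , ])
      r <- r[is.finite(r)]            # assumed: -Inf marks "no rating"
      binned[i, j] <- length(unique(r)) <= max_unique
    }
  }
  binned
}

# Toy data: 1 treatment, 2 readers, 10 non-diseased and 10 diseased cases
set.seed(1)
NL <- array(-Inf, dim = c(1, 2, 20, 1))
LL <- array(-Inf, dim = c(1, 2, 10, 1))
NL[1, 1, 1:10, 1] <- sample(1:5, 10, replace = TRUE)  # reader 1: 5-point scale
LL[1, 1, , 1]     <- sample(1:5, 10, replace = TRUE)
NL[1, 2, 1:10, 1] <- rnorm(10)                        # reader 2: continuous
LL[1, 2, , 1]     <- rnorm(10, mean = 1.5)

res <- is_binned_sketch(NL, LL)
res
```

Reader 1 uses at most five distinct ratings and is classified as binned, while reader 2 produces continuous ratings and is not.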

Examples

isBinnedDataset(dataset01)

Check the validity of a dataset

Description

Checks the validity of the dataset.

Usage

isValidDataset(dataset)

Arguments

dataset

The dataset object to be checked.

Value

TRUE if dataset is valid, FALSE otherwise.

Artificial Intelligence Systems and Observer Performance

Description

Details

Definitions and abbreviations

Dataset

General data structure, e.g., dataset02, an ROC dataset, and dataset05, an FROC dataset.

ROI data structure, example datasetROI

Crossed modality data structure, example datasetCrossedModality

Df: Datafile Related Functions

Fitting Functions

Plotting Functions

Simulation Functions

Sample size Functions

Significance Testing Functions

Miscellaneous and Utility Functions

Author(s)

References

Compute the chisquare goodness of fit statistic for ROC fitting model

Description

Usage

Arguments

Details

Value

Examples

Convert ratings arrays to an RJafroc dataset

Description

Usage

Arguments

Details

Value

Examples

Returns a binned dataset

Description

Usage

Arguments

Details

Value

References

Examples

Create paired dataset for testing FitCorCbm

Description

Usage

Arguments

Details

Value

References

Examples

Extract two arms of a pairing from an MRMC ROC dataset

Description

Usage

Arguments

Details

Value

Examples

Extract a subset of treatments and readers from a dataset

Description

Usage

Arguments

Details

Value

Examples

Simulates an "AUC-equivalent" LROC dataset from an FROC dataset

Description

Usage

Arguments

Details

Value

Examples

Convert an FROC dataset to an ROC dataset

Description

Usage

Arguments

Details

Value

Examples

Simulates an "AUC-equivalent" FROC dataset from an LROC dataset

Description

Usage

Arguments

Details
