This release contains minor changes needed to support CRAN submission.
This release contains updates to the GitHub Actions workflow files to resolve some issues with workflow errors.
This release contains minor changes needed to support CRAN submission.
This release adds missing plausible units to the
plausibleUnitConceptIds check.
This release contains an update to the GitHub Actions workflow file to resolve an issue pushing the package to drat.
This release contains some minor bug fixes:
- plausibleUnitConceptIds check
It also contains some changes to enable CRAN submission.
This release includes a bugfix in the
isStandardValidConcept check. Previously, this check was
not flagging records with a valid, non-standard concept in the concept
ID field. It was only flagging classification concepts and invalid
concepts.
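As a rough illustration of the corrected behavior, the Python sketch below flags a record unless its concept is both valid and standard. It assumes the usual OMOP vocabulary conventions (standard_concept is 'S' for standard concepts, 'C' for classification concepts, and empty for non-standard ones; invalid_reason is empty for valid concepts). The is_flagged helper and the sample concepts are hypothetical, and this is not the SQL that DQD actually runs.

```python
from typing import Optional


def is_flagged(standard_concept: Optional[str], invalid_reason: Optional[str]) -> bool:
    """Return True if a concept should be flagged (hypothetical helper).

    Assumes OMOP vocabulary conventions: standard_concept is 'S' for standard
    concepts, 'C' for classification concepts, and None for non-standard ones;
    invalid_reason is None for valid concepts.
    """
    is_standard = standard_concept == "S"
    is_valid = invalid_reason is None
    return not (is_standard and is_valid)


# Made-up sample concepts: (label, standard_concept, invalid_reason)
examples = [
    ("valid standard", "S", None),
    ("valid non-standard", None, None),
    ("classification", "C", None),
    ("invalid", None, "D"),
]

# Under the corrected logic, everything except the valid standard concept is flagged.
flagged = [label for label, sc, ir in examples if is_flagged(sc, ir)]
print(flagged)
```

Before the fix, only the last two sample rows would have been flagged; the valid non-standard row is the case the bugfix adds.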
This release includes:
Note that these updates may result in significant changes to your DQD results. However, we hope that the results will be more accurate and actionable! Please reach out with any questions or unexpected findings.
- condition_concept_ids will not fail the check (it is not required or recommended to create a condition era for unmapped condition occurrences)
- isRequired
- unit_concept_id fields if value_as_number was non-NULL (a missing numeric value does not necessarily mean that a value of 0 is acceptable for the unit concept)
- isStandardValidConcept, adding non-NULL requirements for numerator and denominator in order to remove the overlap between this check and isRequired
- plausibleUnitConceptIds logic:
  - value_as_number - a wrong unit is a wrong unit, regardless of whether a value is available
- measureStandardConceptCompleteness
- PAYER_PLAN_PERIOD.family_source_value from sourceValueCompleteness check in v5.3 default threshold file (this field has no corresponding concept ID field and had this check enabled in error)
- plausibleValueHigh check for DRUG_EXPOSURE.quantity to prevent false positive failures (the default value was in fact plausible for some liquid formulations; in order to accurately measure drug quantity plausibility this check would need to be customized at the concept level)
- birth_datetime to date in plausibleAfterBirth to prevent false positive failures for events occurring on the date of birth
- executeDqChecks return value invisible
- In the writeJsonResultsToTable function, an option is now provided to write all results to a single table (the approach used in executeDqChecks when writeToTable = TRUE). Ultimately, the approach which writes results to 3 separate tables will be deprecated; for now, a warning is added to prepare users for this change
This release includes a patch bugfix for the
standardConceptFieldName update described below. The added
field names had previously been added in the wrong column of the
threshold file; this has now been fixed.
This release includes:
There is now a parameter, checkSeverity, which can be
used to limit the execution of DQD to fatal,
convention, and/or characterization checks.
Fatal checks are checks that should never fail, under any circumstance,
as they relate to the relational integrity of the CDM. Convention checks
are checks on critical OMOP CDM conventions for which failures should be
resolved whenever possible; however, some level of failure is
unavoidable (e.g., standard concept mapping for source concepts with no
suitable standard concept). Characterization checks provide users with
an understanding of the quality of the underlying data and generally
will need their thresholds modified to match expectations of the
source.
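To make the effect of checkSeverity concrete, here is a small Python sketch of the filtering idea: only checks whose severity is among the requested levels are selected for execution. The three severity levels come from the description above; the check names, their severity assignments, and the select_checks helper are all made up for illustration and are not part of DQD.

```python
SEVERITIES = {"fatal", "convention", "characterization"}


def select_checks(checks, check_severity):
    """Keep only the checks whose severity was requested (hypothetical helper)."""
    requested = set(check_severity)
    unknown = requested - SEVERITIES
    if unknown:
        raise ValueError(f"Unknown severity level(s): {sorted(unknown)}")
    return [name for name, severity in checks if severity in requested]


# Made-up check names and severity assignments, for illustration only.
checks = [
    ("exampleKeyCheck", "fatal"),
    ("exampleMappingCheck", "convention"),
    ("exampleValueCheck", "characterization"),
]

# Limit execution to fatal and convention checks.
print(select_checks(checks, ["fatal", "convention"]))
```

Rejecting unknown levels up front is a design choice of this sketch, so that a misspelled severity does not silently select nothing.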
This release includes:
- plausibleStartBeforeEnd was failing if SOURCE_RELEASE_DATE was before CDM_RELEASE_DATE in the CDM_SOURCE table. This is the opposite of the correct logic! The check is now updated to fail if the CDM_RELEASE_DATE is before the SOURCE_RELEASE_DATE
- plausibleTemporalAfter was throwing a syntax error in BigQuery due to the format of a hardcoded date in the SQL query. This query has now been updated to be compliant with SqlRender and the issue has been resolved
- An issue was causing viewDqDashboard to error out in newer versions of R. This has now been resolved
- SqlOnly mode was failing due to the format of the new check plausibleGenderUseDescendants, which takes multiple concepts as an input. This has now been fixed
- executionTimeSeconds. This field stores the execution time in seconds of each check in numeric format. (The existing executionTime field stores execution time as a string, making it difficult to use in analysis.)
The default thresholds for 2 checks were discovered to be inconsistently populated and occasionally set to illogical levels. These have now been fixed as detailed below.
- The default thresholds for sourceValueCompleteness have been updated as follows:
  - _source_value columns in condition_occurrence, measurement, procedure_occurrence, drug_exposure, and visit_occurrence tables
  - _source_value columns
- The default thresholds for sourceConceptRecordCompleteness have been updated as follows:
  - _source_concept_id columns in condition_occurrence, drug_exposure, measurement, procedure_occurrence, device_exposure, and observation tables
  - _source_concept_id columns
We have continued (and nearly completed) our initiative to add more comprehensive user documentation at the data quality check level. A dedicated documentation page is being created for each check type. Each check’s page includes detailed information about how its result is generated and what to do if it fails. Guidance is provided for both ETL developers and data users.
Check out the newly added pages here and please reach out with feedback as we continue improving our documentation!
This release includes:
4 new data quality check types have been added in this release:
- plausibleStartBeforeEnd: The number and percent of records with a value in the cdmFieldName field of the cdmTableName that occurs after the date in the plausibleStartBeforeEndFieldName.
- plausibleAfterBirth: The number and percent of records with a date value in the cdmFieldName field of the cdmTableName table that occurs prior to birth.
- plausibleBeforeDeath: The number and percent of records with a date value in the cdmFieldName field of the cdmTableName table that occurs after death.
- plausibleGenderUseDescendants: For descendants of CONCEPT_ID conceptId (conceptName), the number and percent of records associated with patients with an implausible gender (correct gender = plausibleGenderUseDescendants).
The 3 temporal plausibility checks are intended to
replace plausibleTemporalAfter and
plausibleDuringLife, for a more comprehensive and clear
approach to various temporality scenarios.
plausibleGenderUseDescendants is intended to
replace plausibleGender, to enhance
readability of the DQD results and improve performance. The replaced
checks are still available and enabled by default in DQD; however, in a
future major release, these checks will be deprecated. Please plan
accordingly.
For more information on the new checks, please check the Check Type Definitions documentation page. If you’d like to disable the deprecated checks, please see the suggested check exclusion workflow in our Getting Started code here.
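The plausibleStartBeforeEnd definition above can be sketched in a few lines of Python: count the records whose start date falls after the end date, and report that count as a percentage of all records. The record layout, the sample dates, and the count_start_after_end helper are assumptions made for this illustration, and every sample record is assumed to have both dates populated. DQD itself computes the result in SQL against the CDM.

```python
from datetime import date


def count_start_after_end(records, start_field, end_field):
    """Return (number, percent) of records whose start date is after the end date."""
    violations = sum(1 for r in records if r[start_field] > r[end_field])
    percent = 100.0 * violations / len(records) if records else 0.0
    return violations, percent


# Made-up sample records with a start and an end date each.
records = [
    {"start": date(2020, 1, 1), "end": date(2020, 1, 5)},
    {"start": date(2020, 3, 10), "end": date(2020, 3, 1)},  # start after end
    {"start": date(2021, 6, 1), "end": date(2021, 6, 1)},   # same day is not a violation
    {"start": date(2022, 2, 2), "end": date(2022, 2, 1)},   # start after end
]

violations, percent = count_start_after_end(records, "start", "end")
print(violations, percent)
```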
- The number of measurements checked in plausibleUnitConceptIds has been reduced, and the lists of plausible units for those measurements have been re-reviewed and updated for accuracy. This change is intended to improve performance and reliability of this check. Please file an issue if you would like to contribute additional measurements + plausible units to be checked in the future
- plausibleValueLow thresholds have been corrected to prevent false positive failures from occurring
We have begun an initiative to add more comprehensive user documentation at the data quality check level. A dedicated documentation page is being created for each check type. Each check’s page will include detailed information about how its result is generated and what to do if it fails. Guidance is provided for both ETL developers and data users.
9 pages have been added so far, and the rest will come in a future release. Check them out here and please reach out with feedback as we continue improving our documentation!
This release includes:
A new function writeDBResultsToJson which can be used to
write DQD results previously written to a database table (by setting
writeToTable = TRUE in executeDqChecks or by
using the writeJsonResultsToTable function) into a JSON
file in the standard DQD JSON format.
- vocabDatabaseSchema where appropriate
This release includes:
- vroom
This release includes:
The following changes involve updates to the default data quality check threshold files. If you are currently using an older version of DQD and update to v2.4.0, you may see changes in your DQD results. The failure threshold changes are fixes to incorrect thresholds in the v5.4 files and thus should result in more accurate, easier to interpret results. The unit concept ID changes ensure that long-invalid concepts will no longer be accepted as plausible measurement units.
- measurePersonCompleteness and measureValueCompleteness were fixed in the v5.4 table & field level threshold files. This issue has existed since v5.4 support was initially added in March 2022
  - measurePersonCompleteness checks had a threshold of 0 when it should have been 95 or 100
  - measureValueCompleteness checks had a threshold of 100 when it should have been 0, and many had no threshold (defaulting to 0) when it should have been 100
- measurePersonCompleteness for the DEATH table has been toggled to Yes, with a threshold of 100
- plausibleUnitConceptIds have been updated to 720870. Concept 9117 became non-standard and was replaced with concept 720870, on 28-Mar-2022
- plausibleUnitConceptIds have been removed. These concepts were deprecated on 05-May-2022
- convertJsonResultsFileCase in Shiny app was appended with DataQualityDashboard::. This prevents potential issues related to package loading and function naming conflicts
Some minor refactoring of testthat files and package build configuration and some minor documentation updates were also added in this release.
This release includes:
- Setting sqlOnly and sqlOnlyIncrementalInsert to TRUE in executeDqChecks will return (but not run) a set of SQL queries that, when executed, will calculate the results of the DQ checks and insert them into a database table. Additionally, sqlOnlyUnionCount can be used to specify a number of SQL queries to union for each check type, allowing for parallel execution of these queries and potentially large performance gains. See the SqlOnly vignette for details
- convertJsonResultsFileCase can be used to convert the keys in a DQD results JSON file between snakecase and camelcase. This allows reading of v2.1.0+ JSON files in older DQD versions, and other conversions which may be necessary for secondary use of the DQD results file. See function documentation for details
- viewDqDashboard will now automatically convert the case of pre-v2.1.0 results files to camelcase so that older results files may be viewed in v2.3.0+
This release includes:
- cohortTableName parameter added to executeDqChecks. Allows user to specify the name of the cohort table when running DQD on a cohort. Defaults to "cohort"
- YYYYMMDD to conform to SqlRender standard
- vocabDatabaseSchema and cohortDatabaseSchema where appropriate
- outputFile parameter from DQD setup vignette (variable not set in script)
And some minor documentation updates for clarity/accuracy.
This release includes:
- offset column name in v5.4 thresholds file so that this column is skipped by DQD in all cases (use of reserved word causes failures in some SQL dialects)
This release includes:
- The outputFolder parameter for the executeDqChecks function is now REQUIRED and no longer has a default value. This may be a breaking change for users who have not specified this parameter in their script to run DQD.
No material changes from v1.4; this adds a correct DESCRIPTION file with the correct DQD version.
This release provides support for CDM v5.4 and
incorporates minor bug fixes related to incorrectly assigned checks in
the control files.
This fixes a small bug and removes a duplicate record in the concept level checks that was throwing an error.
This release includes additional concept level checks to support the
OHDSI Symposium 2020 study-a-thon and bug fixes to the
writeJSONToTable function. This is the release that
study-a-thon data partners should use.
This is a bug fix release that updates how notes are viewed in the UI and adds CDM table, field, and check name to the final table.
This release of the Data Quality Dashboard incorporates the following features:
- Addition of notes fields in the threshold files
- Addition of notes to the UI
- Functionality to run the DQD on a cohort
- Fixes the writeToTable, writeJsonToTable functions
This is the first release of the OHDSI Data Quality Dashboard tool.