Effect of vocabulary mapping for conditions on phenotype cohorts.

J Am Med Inform Assoc 2018 Dec;25(12):1618-1625

Department of Biomedical Informatics, Columbia University, New York, New York, USA.

Objective: To study the effect on patient cohorts of mapping condition (diagnosis) codes from source billing vocabularies to a clinical vocabulary.

Materials and Methods: Nine International Classification of Diseases, Ninth Revision, Clinical Modification (ICD9-CM) concept sets were extracted from eMERGE network phenotypes and translated to Systematized Nomenclature of Medicine - Clinical Terms concept sets. The translated concept sets were applied to patient data that had been mapped from source ICD9-CM and ICD10-CM codes to Systematized Nomenclature of Medicine - Clinical Terms codes using the Observational Health Data Sciences and Informatics (OHDSI) Observational Medical Outcomes Partnership (OMOP) vocabulary mappings. The original ICD9-CM concept set and a concept set extended to ICD10-CM were used to create patient cohorts that served as gold standards.
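The workflow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the codes, the mapping table, the patient records, and the function names (`translate_concept_set`, `map_patient_data`, `build_cohort`) are all hypothetical placeholders, and the real OMOP vocabulary mappings are far larger and are not reproduced here.

```python
# Toy illustration only: every code and mapping below is a made-up placeholder,
# not actual ICD9-CM, ICD10-CM, SNOMED CT, or OMOP vocabulary content.
SOURCE_TO_STANDARD = {
    "ICD9:A1": "STD:100",
    "ICD9:A2": "STD:100",    # a second source code that lands on the same standard code
    "ICD9:B1": "STD:200",
    "ICD10:A1X": "STD:100",
}


def translate_concept_set(source_codes, mapping):
    """Translate a set of source codes to the standard codes they map to."""
    return {mapping[code] for code in source_codes if code in mapping}


def map_patient_data(patient_codes, mapping):
    """Map each patient's source codes to standard codes, dropping unmapped codes."""
    return {
        patient: translate_concept_set(codes, mapping)
        for patient, codes in patient_codes.items()
    }


def build_cohort(patient_codes, concept_set):
    """Return the patients who have at least one code in the concept set."""
    return {patient for patient, codes in patient_codes.items() if codes & concept_set}


# Hypothetical source concept set and patient records.
source_concept_set = {"ICD9:A1"}
patients = {"p1": {"ICD9:A1"}, "p2": {"ICD9:A2"}, "p3": {"ICD9:B1"}}

# Gold standard: query the source codes directly.
gold_cohort = build_cohort(patients, source_concept_set)

# Mapped route: translate the concept set, map the data, then query.
standard_concept_set = translate_concept_set(source_concept_set, SOURCE_TO_STANDARD)
mapped_cohort = build_cohort(
    map_patient_data(patients, SOURCE_TO_STANDARD), standard_concept_set
)

print(sorted(gold_cohort))    # ['p1']
print(sorted(mapped_cohort))  # ['p1', 'p2']
```

In this toy data, patient `p2` enters the mapped cohort only because a source code outside the concept set shares a standard code with one inside it, which is the kind of many-to-one mapping that can make the two cohorts differ.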

Results: Four phenotype concept sets could be translated to Systematized Nomenclature of Medicine - Clinical Terms without ambiguity and performed perfectly with respect to the gold standards. The other 5 lost performance when 2 or more ICD9-CM or ICD10-CM codes mapped to the same Systematized Nomenclature of Medicine - Clinical Terms code. The patient cohorts had a total error (false positives plus false negatives) of up to 0.15% compared to querying ICD9-CM source data and up to 0.26% compared to querying ICD9-CM and ICD10-CM data. Knowledge engineering was required to achieve that performance; simple automated methods to generate concept sets had errors of up to 10% (with one outlier at 250%).
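The total error reported here combines false positives and false negatives against a gold-standard cohort. The sketch below shows one way to compute such a figure. It assumes the denominator is the size of the gold-standard cohort, which the abstract does not state; that assumption is at least consistent with an error above 100%, since false positives alone can then outnumber the gold cohort. The function name `total_error` and the patient identifiers are hypothetical.

```python
def total_error(gold, candidate):
    """Return (false positives + false negatives) / size of the gold cohort.

    Assumption: the gold-standard cohort size is the denominator. The source
    abstract does not specify the denominator it used.
    """
    if not gold:
        raise ValueError("gold-standard cohort must not be empty")
    false_positives = candidate - gold   # in the candidate cohort but not in gold
    false_negatives = gold - candidate   # in gold but missed by the candidate
    return (len(false_positives) + len(false_negatives)) / len(gold)


# Hypothetical cohorts of patient identifiers.
gold = {"p1", "p2", "p3", "p4"}
candidate = {"p1", "p2", "p3", "p5"}          # one false positive, one false negative
print(f"{total_error(gold, candidate):.1%}")  # 50.0%

# With this denominator the error can exceed 100% when false positives dominate.
small_gold = {"p1"}
broad_candidate = {"p1", "p2", "p3"}
print(f"{total_error(small_gold, broad_candidate):.1%}")  # 200.0%
```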

Discussion: The translation of data from source vocabularies to Systematized Nomenclature of Medicine - Clinical Terms (SNOMED CT) resulted in very small error rates, an order of magnitude smaller than those from other error sources.

Conclusion: It appears possible to map diagnoses from disparate vocabularies to a single clinical vocabulary and carry out research using a single set of definitions, thus improving efficiency and transportability of research.

Source: https://academic.oup.com/jamia/advance-article/doi/10.1093/j
DOI: http://dx.doi.org/10.1093/jamia/ocy124
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6289550

