Developing Customizable Cancer Information Extraction Modules for Pathology Reports Using CLAMP.

Stud Health Technol Inform 2019 Aug;264:1041-1045

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, Texas.

Natural language processing (NLP) technologies have been successfully applied to cancer research by enabling automated extraction of phenotypic information from narratives in electronic health records (EHRs), such as pathology reports; however, developing customized NLP solutions requires substantial effort. To facilitate the adoption of NLP in cancer research, we have developed a set of customizable modules for extracting a comprehensive range of cancer-related information from pathology reports (e.g., tumor size, tumor stage, and biomarkers) by leveraging the existing CLAMP system, which provides user-friendly interfaces for building NLP solutions customized to individual needs. Evaluation using annotated data from Vanderbilt University Medical Center showed that CLAMP-Cancer could extract diverse types of cancer information with good F-measures (0.80-0.98). We then applied CLAMP-Cancer to an information extraction task at Mayo Clinic and showed that a customized NLP system could be built quickly, with performance comparable to that of an existing system at Mayo Clinic. CLAMP-Cancer is freely available for academic use.

Source
http://dx.doi.org/10.3233/SHTI190383
