J Wash Acad Sci 2018 ;104(4):31-78
Carnegie Mellon University, Pittsburgh, PA.
Motivated by the need for flexible, intuitive, reusable, and normalized terminology for guiding search and building ontologies, we present a general approach for generating sets of such terminologies from natural language documents. The terms that this approach generates are root- and rule-based terms, generated by a series of rules designed to be flexible, to evolve, and, perhaps most important, to protect against ambiguity and standardize semantically similar but syntactically distinct phrases to a normal form. This approach combines several linguistic and computational methods that can be automated with the help of training sets to quickly and consistently extract normalized terms. Read More