18 results match your criteria bionlp community


LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19.

Genomics Inform 2021 Sep 30;19(3):e23. Epub 2021 Sep 30.

Hubei Key Lab of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, 430070 Wuhan, China.

The coronavirus disease 2019 (COVID-19) literature has been increasing dramatically, and the increased amount of text makes it possible to perform large-scale text mining and knowledge discovery. Curation of these texts has therefore become a crucial issue for the biomedical natural language processing (BioNLP) community, so as to retrieve important information about the mechanism of COVID-19. PubAnnotation is an aligned annotation system which provides an efficient platform for biological curators to upload their annotations or merge other external annotations.

September 2021

A two-stage deep learning approach for extracting entities and relationships from medical texts.

J Biomed Inform 2019 11 20;99:103285. Epub 2019 Sep 20.

Computer Science Department, Carlos III University of Madrid, Leganés 28911, Madrid, Spain.

This work presents a two-stage deep learning system for Named Entity Recognition (NER) and Relation Extraction (RE) from medical texts. These tasks are a crucial step for many natural language understanding applications in the biomedical domain. Automatic medical coding of electronic medical records, automated summarization of patient records, automatic cohort identification for clinical studies, text simplification of health documents for patients, early detection of adverse drug reactions, and automatic identification of risk factors are only a few examples of the opportunities that text analysis can offer in the clinical domain.

November 2019

A review of drug knowledge discovery using BioNLP and tensor or matrix decomposition.

Genomics Inform 2019 Jun 27;17(2):e18. Epub 2019 Jun 27.

Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China.

Prediction of the relations among drugs and other molecular or social entities is the main knowledge discovery pattern in drug-related knowledge discovery. Computational approaches have combined information from different sources and levels, which provides a sophisticated comprehension of the relationships among drugs, targets, diseases, and targeted genes at the molecular level, or among drugs, usage, side effects, safety, and user preference at the social level. In this research, previous work from the BioNLP community and on tensor or matrix decomposition was reviewed, compared, and summarized, and the BioNLP open shared task was introduced as a promising case study representing this area.


PMC text mining subset in BioC: about three million full-text articles and growing.

Bioinformatics 2019 Sep;35(18):3533-3535

National Center for Biotechnology Information (NCBI), U.S. National Library of Medicine (NLM), National Institutes of Health (NIH), Bethesda, MD, USA.

Motivation: Interest in text mining full-text biomedical research articles is growing. To facilitate automated processing of nearly 3 million full-text articles (in PubMed Central® Open Access and Author Manuscript subsets) and to improve interoperability, we convert these articles to BioC, a community-driven simple data structure in either XML or JavaScript Object Notation format for conveniently sharing text and annotations.

Results: The resultant articles can be downloaded via both File Transfer Protocol for bulk access and a Web API for updates or a more focused collection.
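
As a rough illustration of what working with BioC in its JSON form can look like, the sketch below walks a small collection and recovers each annotated mention from its offsets. The sample data is hand-written for this example and is not taken from the PMC subset; the field names (`documents`, `passages`, `annotations`, `locations`, `infons`) follow the general BioC layout as an assumption, and `iter_annotations` is a hypothetical helper, not part of any NCBI tooling.

```python
import json

# Hypothetical, hand-written sample shaped like a BioC JSON collection.
# Annotation offsets are assumed to be document-level, as in BioC.
SAMPLE = """
{
  "source": "example",
  "documents": [
    {
      "id": "doc-1",
      "passages": [
        {
          "offset": 0,
          "infons": {"type": "title"},
          "text": "BRCA1 mutations in breast cancer",
          "annotations": [
            {"id": "T1", "infons": {"type": "Gene"}, "text": "BRCA1",
             "locations": [{"offset": 0, "length": 5}]},
            {"id": "T2", "infons": {"type": "Disease"}, "text": "breast cancer",
             "locations": [{"offset": 19, "length": 13}]}
          ]
        }
      ]
    }
  ]
}
"""


def iter_annotations(collection):
    """Yield (document id, annotation type, mention text) for every location."""
    for doc in collection.get("documents", []):
        for passage in doc.get("passages", []):
            base = passage.get("offset", 0)  # passage start within the document
            text = passage.get("text", "")
            for ann in passage.get("annotations", []):
                for loc in ann.get("locations", []):
                    start = loc["offset"] - base  # convert to passage-relative
                    end = start + loc["length"]
                    yield doc["id"], ann.get("infons", {}).get("type"), text[start:end]


collection = json.loads(SAMPLE)
mentions = list(iter_annotations(collection))
for doc_id, ann_type, mention in mentions:
    print(doc_id, ann_type, mention)
```

Slicing the passage text by offset, rather than trusting the annotation's own `text` field, is a cheap way to check that offsets and text stay aligned after any conversion step.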

September 2019

BioCreative VI Precision Medicine Track system performance is constrained by entity recognition and variations in corpus characteristics.

Database (Oxford) 2018 Jan 1;2018. Epub 2018 Jan 1.

School of Computing and Information Systems, The University of Melbourne, Parkville VIC Australia.

Precision medicine aims to provide personalized treatments based on individual patient profiles. One critical step towards precision medicine is leveraging knowledge derived from biomedical publications, a tremendous literature resource presenting the latest scientific discoveries on genes, mutations and diseases. Biomedical natural language processing (BioNLP) plays a vital role in supporting automation of this process.

January 2018

BRONCO: Biomedical entity Relation ONcology COrpus for extracting gene-variant-disease-drug relations.

Database (Oxford) 2016 Apr 13;2016. Epub 2016 Apr 13.

Department of Computer Science and Engineering, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul, 02841 Korea.

Comprehensive knowledge of genomic variants in a biological context is key for precision medicine. As next-generation sequencing technologies improve, the amount of literature containing genomic variant data, such as new functions or related phenotypes, rapidly increases. Because numerous articles are published every day, it is almost impossible to manually curate all the variant information from the literature.

January 2017

Overview of the gene regulation network and the bacteria biotope tasks in BioNLP'13 shared task.

BMC Bioinformatics 2015 Jul 13;16 Suppl 10:S1. Epub 2015 Jul 13.

Background: We present the two Bacteria Track tasks of BioNLP 2013 Shared Task (ST): Gene Regulation Network (GRN) and Bacteria Biotope (BB). These tasks were previously introduced in the 2011 BioNLP-ST Bacteria Track as Bacteria Gene Interaction (BI) and Bacteria Biotope (BB). The Bacteria Track was motivated by a need to develop specific BioNLP tools for fine-grained event extraction in bacteria biology.

February 2016

Community challenges in biomedical text mining over 10 years: success, failure and the future.

Brief Bioinform 2016 Jan 1;17(1):132-44. Epub 2015 May 1.

One effective way to improve the state of the art is through competitions. Following the success of the Critical Assessment of protein Structure Prediction (CASP) in bioinformatics research, a number of challenge evaluations have been organized by the text-mining research community to assess and advance natural language processing (NLP) research for biomedicine. In this article, we review the different community challenge evaluations held from 2002 to 2014 and their respective tasks.

January 2016

OntoMate: a text-mining tool aiding curation at the Rat Genome Database.

Database (Oxford) 2015 Jan 25;2015. Epub 2015 Jan 25.

Human and Molecular Genetics Center, Medical College of Wisconsin, Department of Quantitative Health Sciences, University of Massachusetts Medical School, Department of Physiology, Medical College of Wisconsin and Department of Surgery, Medical College of Wisconsin, 8701 Watertown Plank Rd, Milwaukee, WI 53226-3548, USA.

The Rat Genome Database (RGD) is the premier repository of rat genomic, genetic and physiologic data. Converting data from free text in the scientific literature to a structured format is one of the main tasks of all model organism databases. RGD spends considerable effort manually curating gene, Quantitative Trait Locus (QTL) and strain information.

September 2015

BC4GO: a full-text corpus for the BioCreative IV GO task.

Database (Oxford) 2014 Jul 28;2014. Epub 2014 Jul 28.

WormBase, Division of Biology, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA 91125, USA, USDA-ARS Plant Genetics Research Unit and Division of Plant Sciences, Department of Agronomy, University of Missouri, Columbia, MO 65211, USA, FlyBase, Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK, Rat Genome Database, Human and Molecular Genetics Center, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, WI 53226, USA, TAIR, Department of Plant Biology, Carnegie Institution for Science, 260 Panama Street, Stanford, CA 94305, USA, Center for Bioinformatics and Computational Biology, University of Delaware, 15 Innovation Way, Newark, DE 19711, USA, Howard Hughes Medical Institute, California Institute of Technology, 1200 E. California Blvd., Pasadena, CA 91125, USA, National Center for Biotechnology Information (NCBI), 8600 Rockville Pike, Bethesda, MD 20894, USA

Gene function curation via Gene Ontology (GO) annotation is a common task among Model Organism Database groups. Owing to its manual nature, this task is considered one of the bottlenecks in literature curation. There have been many previous attempts at automatic identification of GO terms and supporting information from full text.

February 2015

TrigNER: automatically optimized biomedical event trigger recognition on scientific documents.

Source Code Biol Med 2014 Jan 8;9(1). Epub 2014 Jan 8.

IEETA/DETI, University of Aveiro, 3810-193, Aveiro, Portugal.

Background: Cellular events play a central role in the understanding of biological processes and functions, providing insight on both physiological and pathogenesis mechanisms. Automatic extraction of mentions of such events from the literature represents an important contribution to the progress of the biomedical domain, allowing faster updating of existing knowledge. The identification of trigger words indicating an event is a very important step in the event extraction pipeline, since the following task(s) rely on its output.

January 2014

Approximate subgraph matching-based literature mining for biomedical events and relations.

PLoS One 2013 Apr 17;8(4):e60954. Epub 2013 Apr 17.

National Center for Biotechnology Information, Bethesda, Maryland, United States of America.

The biomedical text mining community has focused on developing techniques to automatically extract important relations between biological components and semantic events involving genes or proteins from literature. In this paper, we propose a novel approach for mining relations and events in the biomedical literature using approximate subgraph matching. Extraction of such knowledge is performed by searching for an approximate subgraph isomorphism between key contextual dependencies and input sentence graphs.

November 2013

The Genia Event and Protein Coreference tasks of the BioNLP Shared Task 2011.

BMC Bioinformatics 2012 Jun 26;13 Suppl 11:S1. Epub 2012 Jun 26.

Database Center for Life Science, Research Organization of Information and Science, 2-11-16 Yayoi, Bunkyo-ku, Tokyo, Japan.

Background: The Genia task, when it was introduced in 2009, was the first community-wide effort to address a fine-grained, structural information extraction from biomedical literature. Arranged for the second time as one of the main tasks of BioNLP Shared Task 2011, it aimed to measure the progress of the community since 2009, and to evaluate generalization of the technology to full text papers. The Protein Coreference task was arranged as one of the supporting tasks, motivated from one of the lessons of the 2009 task that the abundance of coreference structures in natural language text hinders further improvement with the Genia task.


Concept annotation in the CRAFT corpus.

BMC Bioinformatics 2012 Jul 9;13:161. Epub 2012 Jul 9.

Department of Pharmacology, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.

Background: Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text.

Results: This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology.


Deriving a probabilistic syntacto-semantic grammar for biomedicine based on domain-specific terminologies.

J Biomed Inform 2011 Oct 28;44(5):805-14. Epub 2011 Apr 28.

Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA.

Biomedical natural language processing (BioNLP) is a useful technique that unlocks valuable information stored in textual data for practice and/or research. Syntactic parsing is a critical component of BioNLP applications that rely on correctly determining the sentence and phrase structure of free text. In addition to dealing with the vast amount of domain-specific terms, a robust biomedical parser needs to model the semantic grammar to obtain viable syntactic structures.

October 2011

Empirical investigations into full-text protein interaction Article Categorization Task (ACT) in the BioCreative II.5 Challenge.

Man Lan, Jian Su

IEEE/ACM Trans Comput Biol Bioinform 2010 Jul-Sep;7(3):421-7

Institute for Infocomm Research, Connexis, Singapore.

The selection of protein interaction documents is one important application for biology research and has a direct impact on the quality of downstream BioNLP applications, such as information extraction and retrieval, summarization, and question answering (QA).

December 2010

Event extraction with complex event classification using rich features.

J Bioinform Comput Biol 2010 Feb;8(1):131-46

Department of Computer Science, University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo, Japan.

Biomedical Natural Language Processing (BioNLP) attempts to capture biomedical phenomena from texts by extracting relations between biomedical entities (e.g., proteins and genes).

February 2010

Concept recognition for extracting protein interaction relations from biomedical text.

Genome Biol 2008 Sep 1;9 Suppl 2:S9. Epub 2008 Sep 1.

Center for Computational Pharmacology, University of Colorado School of Medicine, Aurora, Colorado 80045, USA.

Background: Reliable information extraction applications have been a long-sought goal of the biomedical text mining community, a goal that if reached would provide valuable tools to benchside biologists in their increasingly difficult task of assimilating the knowledge contained in the biomedical literature. We present an integrated approach to concept recognition in biomedical text. Concept recognition provides key information that has been largely missing from previous biomedical information extraction efforts, namely direct links to well-defined knowledge resources that explicitly cement the concept's semantics.

December 2008