Deep collaborative filtering for prediction of disease genes.

Authors:
Xiangxiang Zeng
Huazhong University of Science and Technology
China
Yuying He
National Cancer Institute at the National Institute of Environmental Health Sciences
United States
Xiaoping Min
National Institute of Diagnostics and Vaccine Development in Infectious Diseases

IEEE/ACM Trans Comput Biol Bioinform 2019 Mar 26. Epub 2019 Mar 26.

Accurate prioritization of potential disease genes is a fundamental challenge in biomedical research, and various algorithms have been developed to address it. Inductive Matrix Completion (IMC) is one of the most reliable models, owing to its well-established framework and its superior performance in predicting gene-disease associations. However, IMC does not hierarchically extract deep features, which might limit the quality of recovery. To address this, our Deep Collaborative Filtering (DCF) model applies a deep learning architecture, which obtains high-level representations and handles the noise and outliers present in large-scale biological datasets, to the side information of genes. Further, because negative examples are lacking, we also apply a Positive-Unlabeled (PU) learning formulation to low-rank matrix completion. Our approach achieves substantially improved performance over other state-of-the-art methods on diseases from the Online Mendelian Inheritance in Man (OMIM) database. It is 6% more efficient than standard IMC in detecting a true association, and significantly outperforms other alternatives in terms of the precision-recall metric at the top-k predictions. Moreover, we also validate the model on diseases with no previously known gene associations and on newly reported OMIM associations. The experimental results show that DCF remains satisfactory for ranking novel disease phenotypes as well as for mining unexplored relationships. The source code and the data are available at https://github.com/xzenglab/Deep-Collaborative-Filtering.
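
For readers unfamiliar with the two ingredients named in the abstract, the following is a minimal sketch of inductive matrix completion with a PU-style weighted loss. It is not the authors' DCF implementation (see their repository for that) and it omits the deep feature extractor entirely. The function name `pu_imc`, the weight `alpha` on unlabeled entries, the plain gradient-descent solver, and the toy data are all assumptions made for illustration. Scores are modeled as X W H^T Y^T, where X and Y hold gene and disease features; known associations get weight 1 and unlabeled pairs get the smaller weight `alpha`, so that a missing label is treated as weak evidence of no association rather than as a confirmed negative.

```python
import numpy as np

def pu_imc(P, X, Y, rank=2, alpha=0.2, lam=0.01, lr=0.05, steps=2000, seed=0):
    """Illustrative PU-weighted inductive matrix completion (hypothetical helper).

    P: (genes x diseases) 0/1 matrix, 1 = known association, 0 = unlabeled.
    X: (genes x gene_features), Y: (diseases x disease_features).
    Returns the score matrix X @ W @ H.T @ Y.T.
    """
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((X.shape[1], rank))
    H = 0.1 * rng.standard_normal((Y.shape[1], rank))
    # PU weighting (assumed form): full weight on positives, alpha on unlabeled.
    weights = np.where(P > 0, 1.0, alpha)
    for _ in range(steps):
        S = X @ W @ H.T @ Y.T
        R = weights * (S - P)                    # weighted residual
        grad_W = X.T @ R @ Y @ H + lam * W       # gradient of the weighted squared loss
        grad_H = Y.T @ R.T @ X @ W + lam * H
        W -= lr * grad_W
        H -= lr * grad_H
    return X @ W @ H.T @ Y.T

# Toy data (made up): genes 0 and 1 share one feature, genes 2 and 3 another.
X = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
Y = np.eye(2)
# Gene 1's association with disease 0 is deliberately left unlabeled.
P = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 1.0]])

scores = pu_imc(P, X, Y)
print(np.round(scores, 2))
```

In this toy setting gene 1 has no labeled association, yet it shares its features with gene 0, so the model can rank disease 0 above disease 1 for it. That is the inductive behavior that side information is meant to provide.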

Source
http://dx.doi.org/10.1109/TCBB.2019.2907536
March 2019
