IEEE/ACM Trans Comput Biol Bioinform 2019 Apr 3. Epub 2019 Apr 3.
DNA methylation plays an important role in the regulation of some biological processes. With the development of machine learning, several sequence-based deep learning models have been designed to predict DNA methylation states, and they achieve better performance than traditional methods such as random forest and SVM. However, deep learning models based on convolutional networks that use one-hot encoded DNA sequence as input may discover limited information and yield unsatisfactory prediction performance, so additional data types and more diverse model structures should be considered. In this work, we proposed MHCpG, a hybrid sequence-based deep learning model that uses both MeDIP-seq data and histone information to predict the methylation states of CpG sites. We combined MeDIP-seq data and histone modification data with sequence information and applied a convolutional network to discover sequence patterns. In addition, we used statistical data derived from these three inputs and adopted a 3-layer feedforward neural network to extract higher-level features. We compared our method with a traditional random forest predictor and with previous methods such as CpGenie and DeepCpG; the results showed that MHCpG outperformed the other approaches.
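As an illustration of the one-hot encoded sequence input mentioned above, the following is a minimal sketch and not the MHCpG implementation. The function name `one_hot_encode`, the A/C/G/T column order, and the handling of ambiguous bases as all-zero rows are assumptions made for this example; the abstract does not specify them.

```python
import numpy as np

# Assumed column order for the four bases; the abstract does not specify one.
BASES = "ACGT"


def one_hot_encode(seq):
    """Encode a DNA string as a (len(seq), 4) matrix of 0s and 1s.

    Bases outside A/C/G/T (for example N) become all-zero rows,
    which is an assumption made for this sketch.
    """
    index = {base: i for i, base in enumerate(BASES)}
    encoded = np.zeros((len(seq), len(BASES)), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        col = index.get(base)
        if col is not None:
            encoded[pos, col] = 1.0
    return encoded


# Hypothetical short window around a CpG site, for illustration only.
window = one_hot_encode("ACGTN")
print(window)
```

A matrix of this form is the kind of input a convolutional layer would scan for sequence patterns; the MeDIP-seq and histone signals described in the abstract would be supplied as additional inputs alongside it.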