Intelligent computational model for classification of sub-Golgi protein using oversampling and fisher feature selection methods.

Authors:
Jamal Ahmad
J.N. Medical College
Belgaum | India
Faisal Javed
Medical College
Pakistan
Maqsood Hayat
Pakistan Institute of Engineering and Applied Sciences

Artif Intell Med 2017 05 10;78:14-22. Epub 2017 May 10.

Department of Computer Science, Abdul Wali Khan University, Mardan, Pakistan.

The Golgi apparatus is a core component of the cell, present in both plants and animals, and is involved in protein synthesis. It is responsible for receiving and processing macromolecules and for trafficking newly processed proteins to their intended destinations. Dysfunction in Golgi proteins is expected to cause many neurodegenerative and inherited diseases, which may be treated more successfully if they are detected effectively and in time. Golgi proteins are categorized into two types: cis-Golgi and trans-Golgi. Identifying Golgi proteins by direct methods is very hard because few recognized structures are available. Researchers have therefore shifted their attention from structures to sequences. Owing to technological advancement, however, a huge number of sequences has been reported in the databases, and recognizing such a large amount of unprocessed data with conventional methods is very difficult. The concept of intelligence was therefore incorporated into computational models. Intelligence-based computational models have obtained reasonable results, but room for improvement remains. In this regard, an intelligent automatic recognition model is developed to enhance the true classification rate of sub-Golgi proteins. In this approach, discrete and evolutionary feature extraction methods are applied to the benchmark Golgi protein datasets to extract salient, profound, and variant numerical descriptors. After that, an oversampling technique, the Synthetic Minority Over-sampling Technique (SMOTE), is employed to balance the data. Hybrid spaces are also generated by combining these feature spaces. Further, the Fisher feature selection method is used to remove noisy and redundant features from the feature vector. Finally, the k-nearest neighbor algorithm is used as the learning hypothesis. Three distinct cross-validation tests are used to examine the stability and efficiency of the proposed model. The predicted outcomes of the proposed model are better than those of the existing models in the literature so far. Finally, it is anticipated that the proposed model will provide a foundation for the pharmaceutical industry in drug design and for the research community to develop new ideas in computational biology and bioinformatics.
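
The balancing, feature selection, and classification stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (smote, fisher_scores, knn_predict, jackknife_accuracy), the toy Gaussian data, the parameter values, and the choice of a leave-one-out test are all assumptions made for illustration, and the discrete and evolutionary feature extraction step is skipped. The Fisher score is taken here in its common form, the between-class scatter of a feature divided by its within-class scatter.

```python
import numpy as np

def smote(X_min, n_new, k=5, rng=None):
    """Create n_new synthetic minority samples by interpolating between a
    minority sample and one of its k nearest minority-class neighbours."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    k = min(k, n - 1)
    # Pairwise Euclidean distances inside the minority class.
    d = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbour
    neighbours = np.argsort(d, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))  # interpolation factor in [0, 1)
    return X_min[base] + gap * (X_min[picked] - X_min[base])

def fisher_scores(X, y):
    """Per-feature Fisher score: between-class over within-class scatter."""
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += len(Xc) * Xc.var(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest training samples (integer labels)."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return np.array([np.bincount(v).argmax() for v in y_train[nearest]])

def jackknife_accuracy(X, y, k=3):
    """Leave-one-out accuracy: each sample is predicted from all the others."""
    hits = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        hits += knn_predict(X[mask], y[mask], X[i:i + 1], k)[0] == y[i]
    return hits / len(X)

# Hypothetical toy data: 60 majority and 15 minority samples with 20 features;
# only the first three features carry class signal.
rng = np.random.default_rng(0)
X_maj = rng.normal(0.0, 1.0, size=(60, 20))
X_min = rng.normal(0.0, 1.0, size=(15, 20))
X_min[:, :3] += 3.0

X_syn = smote(X_min, n_new=45, k=5, rng=1)       # balance the classes 60:60
X_bal = np.vstack([X_maj, X_min, X_syn])
y_bal = np.array([0] * 60 + [1] * 60)

scores = fisher_scores(X_bal, y_bal)
top = np.argsort(scores)[::-1][:3]               # keep the 3 best features
acc = jackknife_accuracy(X_bal[:, top], y_bal, k=3)
print("selected features:", sorted(top.tolist()), "accuracy:", acc)
```

The order of the stages follows the abstract: oversample, select features, then classify. Because the synthetic samples are generated before the leave-one-out test, they also appear in the held-out folds, so the accuracy of this sketch is optimistic; a stricter evaluation would oversample inside each training fold only. The abstract does not name its three cross-validation tests, so leave-one-out stands in here as one plausible example.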

Source
http://dx.doi.org/10.1016/j.artmed.2017.05.001
