Intelligent computational model for classification of sub-Golgi protein using oversampling and fisher feature selection methods.

Jamal Ahmad
J.N. Medical College
Belgaum | India
Faisal Javed
Medical College
Maqsood Hayat
Pakistan Institute of Engineering and Applied Sciences

Artif Intell Med 2017 05 10;78:14-22. Epub 2017 May 10.

Department of Computer Science, Abdul Wali Khan University, Mardan, Pakistan.

The Golgi apparatus is one of the core organelles of the cell, present in both plants and animals, and is involved in protein synthesis. It is responsible for receiving and processing macromolecules and for trafficking newly processed proteins to their intended destinations. Dysfunction of Golgi proteins is expected to cause many neurodegenerative and inherited diseases, which may be treated more successfully if they are detected accurately and in time. Golgi proteins are categorized into two types: cis-Golgi and trans-Golgi. Identifying Golgi proteins by direct methods is very hard because few recognized structures are available, so researchers have shifted their attention from structures to sequences. However, owing to technological advances, a huge number of sequences has been reported in the databases, and recognizing such a large amount of unprocessed data with conventional methods is very difficult. Therefore, the concept of intelligence was incorporated into computational models. Intelligence-based computational models have obtained reasonable results, but there is still room for improvement. In this regard, an intelligent automatic recognition model is developed to enhance the true classification rate of sub-Golgi proteins. In this approach, discrete and evolutionary feature extraction methods are applied to the benchmark Golgi protein datasets to extract salient, profound, and varied numerical descriptors. After that, an oversampling technique, the Synthetic Minority Over-sampling Technique (SMOTE), is employed to balance the data. Hybrid spaces are also generated by combining these feature spaces. Further, the Fisher feature selection method is used to remove noisy and redundant features from the feature vector. Finally, the k-nearest neighbor algorithm is used as the learning hypothesis. Three distinct cross-validation tests are used to examine the stability and efficiency of the proposed model. The predicted outcomes of the proposed model are better than those of the existing models in the literature so far. Finally, it is anticipated that the proposed model will provide a foundation for the pharmaceutical industry in drug design and for the research community to develop new ideas in computational biology and bioinformatics.
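
The stages named in the abstract (oversampling, Fisher feature selection, k-nearest neighbor classification, cross-validation) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the function names (`smote`, `fisher_scores`, `knn_predict`, `loo_accuracy`), the parameter values, and the choice of leave-one-out as the validation test are all assumptions made for illustration, and the feature extraction step is replaced by random numeric vectors.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """SMOTE-style oversampling: each synthetic point is a random
    interpolation between a minority sample and one of its k nearest
    minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


def fisher_scores(X, y):
    """Fisher score per feature: between-class scatter divided by
    within-class scatter. Higher means more discriminative."""
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean_all = sum(col) / len(col)
        between = within = 0.0
        for c in set(y):
            vals = [v for v, label in zip(col, y) if label == c]
            mean_c = sum(vals) / len(vals)
            between += len(vals) * (mean_c - mean_all) ** 2
            within += sum((v - mean_c) ** 2 for v in vals)
        scores.append(between / (within + 1e-12))
    return scores


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k training points closest to x."""
    nearest = sorted(zip(train_X, train_y),
                     key=lambda t: math.dist(t[0], x))[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)


def loo_accuracy(X, y, k=3):
    """Leave-one-out test: predict each sample from all the others."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += knn_predict(train_X, train_y, X[i], k) == y[i]
    return hits / len(X)


# Hypothetical toy data: feature 0 separates the classes, 1 and 2 are noise.
rng = random.Random(1)
majority = [[rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(30)]
minority = [[rng.gauss(4, 1), rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(8)]

# 1) Balance the classes with synthetic minority samples.
synthetic = smote(minority, n_new=len(majority) - len(minority))
X = majority + minority + synthetic
y = [0] * len(majority) + [1] * (len(minority) + len(synthetic))

# 2) Rank features by Fisher score and keep the best one.
scores = fisher_scores(X, y)
top = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:1]
X_sel = [[row[j] for j in top] for row in X]

# 3) Evaluate k-NN on the reduced feature space.
accuracy = loo_accuracy(X_sel, y)
print("Fisher scores:", [round(s, 3) for s in scores])
print("Selected feature indices:", top)
print("Leave-one-out accuracy:", round(accuracy, 3))
```

For brevity the sketch oversamples before validation, which lets synthetic points derived from a held-out sample reach the training data; a more careful evaluation would oversample inside each training fold only.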
May 2017

Similar Publications

A Novel Feature Extraction Method with Feature Selection to Identify Golgi-Resident Protein Types from Imbalanced Data.

Int J Mol Sci 2016 Feb 6;17(2):218. Epub 2016 Feb 6.

School of Control Science and Engineering, Shandong University, Jinan 250061, China.

The Golgi Apparatus (GA) is a major collection and dispatch station for numerous proteins destined for secretion, plasma membranes and lysosomes. The dysfunction of GA proteins can result in neurodegenerative diseases. Therefore, accurate identification of protein subGolgi localizations may assist in drug development and understanding the mechanisms of the GA involved in various cellular processes.

February 2016

Prediction of Protein Submitochondrial Locations by Incorporating Dipeptide Composition into Chou's General Pseudo Amino Acid Composition.

J Membr Biol 2016 06 8;249(3):293-304. Epub 2016 Jan 8.

Department of Computer Science, Abdul Wali Khan University Mardan, Mardan, Pakistan.

The mitochondrion is the key organelle of the eukaryotic cell and provides energy for cellular activities. The submitochondrial locations of proteins play a crucial role in understanding different biological processes such as energy metabolism, programmed cell death, and ionic homeostasis. Predicting submitochondrial locations through conventional methods is expensive and time-consuming because of the large number of protein sequences generated in the last few decades.

June 2016

isGPT: An optimized model to identify sub-Golgi protein types using SVM and Random Forest based feature selection.

Artif Intell Med 2018 01 26;84:90-100. Epub 2017 Nov 26.

Department of CSE, BUET, ECE Building, West Palasi, Dhaka 1205, Bangladesh.

The Golgi Apparatus (GA) is a key organelle for protein synthesis within the eukaryotic cell. The main task of GA is to modify and sort proteins for transport throughout the cell. Proteins permeate through the GA on the ER (Endoplasmic Reticulum) facing side (cis side) and depart on the other side (trans side).

January 2018

iACP-GAEnsC: Evolutionary genetic algorithm based ensemble classification of anticancer peptides by utilizing hybrid feature space.

Artif Intell Med 2017 06 17;79:62-70. Epub 2017 Jun 17.

Department of Computer Science, Abdul Wali Khan University Mardan, KP 23200, Pakistan.

Cancer is a fatal disease, responsible for one-quarter of all deaths in developed countries. Traditional anticancer therapies, such as chemotherapy and radiation, are highly expensive, susceptible to errors, and ineffective. These conventional techniques induce severe side effects on human cells.

June 2017