IEEE/ACM Trans Comput Biol Bioinform 2019 Mar 7. Epub 2019 Mar 7.
Antimicrobial peptides are short amino acid sequences that may have antibacterial, antifungal, or antiviral activity. Most machine learning methods for identifying antibacterial peptides construct feature vectors of identical length for every peptide in a dataset, even though the peptides themselves differ in their number of amino acids. Features are often chosen to represent certain periodic patterns in the peptide sequence, without any initial guidance as to whether such patterns are relevant to the classification task at hand. This can produce a large number of irrelevant features alongside the relevant ones. To help alleviate these issues, we extract a feature vector from individual amino acid feature representations using bidirectional Long Short-Term Memory (LSTM) recurrent neural networks. The LSTM network processes the amino acid sequence in both directions and ultimately extracts a fixed-length feature vector, which is then used to classify the peptide. This work demonstrates the application of LSTM recurrent neural networks to the classification of antibacterial peptides and compares it with a Random Forest classifier and a k-nearest neighbor classifier.
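As a rough illustration of how a bidirectional LSTM can turn peptides of different lengths into feature vectors of one fixed length, the following is a minimal sketch in plain Python. It is not the authors' implementation. The one-hot residue encoding, the hidden size of 8, the random untrained weights, the logistic output layer, and the two example sequences are all assumptions made for illustration, and training is omitted entirely.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(residue):
    # Assumed per-residue representation; the abstract does not specify one.
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """A single LSTM layer with small random (untrained) weights."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate: input, forget, candidate, output.
        self.W = {g: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                      for _ in range(hidden_size)] for g in "ifgo"}
        self.b = {g: [0.0] * hidden_size for g in "ifgo"}

    def _affine(self, gate, z):
        return [sum(w * v for w, v in zip(row, z)) + bias
                for row, bias in zip(self.W[gate], self.b[gate])]

    def step(self, x, h, c):
        z = x + h  # concatenate the input with the previous hidden state
        i = [sigmoid(v) for v in self._affine("i", z)]
        f = [sigmoid(v) for v in self._affine("f", z)]
        g = [math.tanh(v) for v in self._affine("g", z)]
        o = [sigmoid(v) for v in self._affine("o", z)]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def run(self, inputs):
        # Step through the whole sequence and return the final hidden state.
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in inputs:
            h, c = self.step(x, h, c)
        return h


def bilstm_features(sequence, fwd, bwd):
    # Final forward state + final backward state: length 2 * hidden_size,
    # independent of how many residues the peptide has.
    xs = [one_hot(r) for r in sequence]
    return fwd.run(xs) + bwd.run(xs[::-1])


def classify(features, weights, bias):
    # Assumed logistic output layer giving a score in (0, 1).
    return sigmoid(sum(w * f for w, f in zip(weights, features)) + bias)


rng = random.Random(0)
HIDDEN = 8  # assumed hidden size
fwd = LSTMCell(len(AMINO_ACIDS), HIDDEN, rng)
bwd = LSTMCell(len(AMINO_ACIDS), HIDDEN, rng)
out_w = [rng.uniform(-0.1, 0.1) for _ in range(2 * HIDDEN)]

# Arbitrary example strings of different lengths, not taken from any dataset.
for peptide in ["KWKLFKKI", "GLFKKLLKKAGAVLKVL"]:
    feats = bilstm_features(peptide, fwd, bwd)
    print(peptide, len(feats), classify(feats, out_w, 0.0))
```

Both sequences yield a feature vector of length 16 despite having different numbers of residues, which is the property the abstract relies on. With untrained weights the printed scores carry no meaning. In practice the LSTM and output weights would be learned jointly from labeled peptides, typically with a deep learning framework rather than hand-written loops.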