Access Restriction Subscribed

Author Liu, Bin ♦ Chen, Junjie ♦ Wang, Xiaolong
Source SpringerLink
Content type Text
Publisher Springer Berlin Heidelberg
File Format PDF
Copyright Year ©2015
Language English
Subject Domain (in DDC) Natural sciences & mathematics ♦ Life sciences; biology
Subject Keyword Protein remote homology ♦ Pseudo amino acid composition ♦ Support vector machine ♦ Principal component analysis ♦ Cell Biology ♦ Biochemistry ♦ Microbial Genetics and Genomics ♦ Plant Genetics & Genomics ♦ Animal Genetics and Genomics
Abstract Protein remote homology detection is an important task in computational proteomics, with value for both basic research and practical applications. SVM-based discriminative methods currently show superior performance. However, existing feature vectors still do not represent protein sequences adequately, and they often lack an interpretable model for analyzing characteristic features. Previous studies showed that sequence-order effects and physicochemical properties are important for representing protein sequences, but how to use this information to construct predictors remains a challenging problem. In this study, to incorporate sequence-order information and physicochemical properties into the prediction, a method called disPseAAC is proposed, in which the feature vector is constructed by combining the occurrences of amino acid pairs within Chou’s pseudo amino acid composition (PseAAC) approach. Predictive performance and computational cost are further improved by applying principal component analysis. Various experiments are conducted on a benchmark dataset. Experimental results show that disPseAAC achieves an ROC score of 0.922, outperforming some existing state-of-the-art methods. Furthermore, the learnt model can easily be analyzed in terms of discriminative features, and the computational cost of the proposed method is much lower than that of other profile-based methods.
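Illustrative sketch (not part of the original record): the abstract describes a feature vector built from occurrences of amino acid pairs. The abstract does not give the exact disPseAAC definition, so the Python below is only a minimal sketch under stated assumptions: the function name `distance_pair_features`, the `max_distance` parameter, the normalization by the number of pairs, and the layout of the vector (20 composition values followed by 400 pair frequencies per distance) are all hypothetical choices made for illustration. The principal component analysis and support vector machine steps are not shown.

```python
from itertools import product

# The 20 standard amino acids, in a fixed order so feature indices are stable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered amino acid pairs: "AA", "AC", ..., "YY".
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def distance_pair_features(seq, max_distance=2):
    """Return a hypothetical distance-pair feature vector for one sequence.

    Assumed layout (not taken from the paper):
      - 20 amino acid frequencies, then
      - for each distance d = 1..max_distance, 400 frequencies of ordered
        pairs (seq[i], seq[i + d]).
    """
    # Keep only standard residues; anything else is skipped.
    residues = [aa for aa in seq.upper() if aa in AMINO_ACIDS]
    n = len(residues)

    # Amino acid composition.
    features = [residues.count(aa) / n if n else 0.0 for aa in AMINO_ACIDS]

    # Pair occurrences at each distance, normalized by the number of pairs.
    for d in range(1, max_distance + 1):
        counts = dict.fromkeys(PAIRS, 0)
        for i in range(n - d):
            counts[residues[i] + residues[i + d]] += 1
        total = max(n - d, 0)
        features.extend(counts[p] / total if total else 0.0 for p in PAIRS)

    return features
```

Under these assumptions, `distance_pair_features("ACAC", max_distance=2)` yields a vector of length 20 + 2 × 400 = 820. Such vectors could then be reduced with principal component analysis and passed to a support vector machine, as the abstract outlines.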
ISSN 1617-4615
Age Range 18 to 22 years ♦ above 22 years
Educational Use Research
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2015-04-21
Publisher Place Berlin/Heidelberg
e-ISSN 1617-4623
Journal Molecular Genetics and Genomics
Volume Number 290
Issue Number 5
Page Count 13
Starting Page 1919
Ending Page 1931

