Abstract:
How a protein's amino acid sequence determines its structure is a core problem in biology. The fold type of a protein reflects the topological pattern of its structural core, and fold recognition is an important method in protein sequence-structure research. This article focuses on 36 fold types that are not covered by a unified hidden Markov model (HMM) but that account for 41.8% of the alpha, beta, and alpha/beta proteins in the Astral 1.65 sequence database. The training set contains samples with less than 25% sequence identity to one another. We applied hierarchical clustering based on root mean square deviation (RMSD) to generate fold subgroups, and then built a profile-HMM for each subgroup from a structural alignment produced by the multiple structural alignment algorithm (MUSTANG). In a test on 9505 proteins with less than 95% sequence identity from the Astral 1.65 database, the average sensitivity, specificity, and Matthews correlation coefficient (MCC) over the 36 fold types were 90%, 99%, and 0.95, respectively. These results show that RMSD-based classification modeling can achieve precise fold recognition for folds where a unified HMM cannot be built because the training set contains too many elements. The method offers a new approach to profile-HMM protein fold recognition and lays a foundation for further research.
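For readers unfamiliar with the three reported measures, the following is a minimal sketch of how sensitivity, specificity, and MCC are conventionally computed for a single fold type treated as a binary classifier. The function name `fold_metrics` and the example counts are hypothetical illustrations, not values or code from the article, and the article's own averaging procedure over the 36 fold types is not reproduced here.

```python
import math


def fold_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    tp/fn: proteins of the fold that were recognized / missed.
    tn/fp: proteins of other folds that were rejected / wrongly accepted.
    """
    # Sensitivity: fraction of true members of the fold that are recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-members that are correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Hypothetical counts for one fold type (illustration only).
sens, spec, mcc = fold_metrics(tp=90, fp=10, tn=890, fn=10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

Because MCC uses all four cells of the confusion matrix, it stays informative when a fold type has far fewer members than non-members, which is the usual situation in fold recognition.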
Source:
ACTA PHYSICO-CHIMICA SINICA
ISSN: 1000-6818
Year: 2009
Issue: 12
Volume: 25
Page: 2558-2564
10.900
JCR@2022
ESI Discipline: CHEMISTRY
JCR Journal Grade: 4
CAS Journal Grade: 1
Cited Count:
WoS CC Cited Count: 3
ESI Highly Cited Papers on the List: 0