ABSTRACT
A method is presented that uses β-strand interactions at both the sequence level and the atomic level to predict β-structural motifs in protein sequences. A program called Wrap-and-Pack implements this method and is shown to recognize β-trefoils, an important class of globular β-structures, in the Protein Data Bank with 92% specificity and 92.3% sensitivity in cross-validation. It is demonstrated that Wrap-and-Pack learns each of the ten known SCOP β-trefoil families when trained primarily on β-structures that are not β-trefoils, together with 3D structures of known β-trefoils from outside the family. Wrap-and-Pack also predicts many proteins of unknown structure to be β-trefoils. The computational method used here may generalize to other β-structures for which strand topology and profiles of residue accessibility are well conserved.