ABSTRACT
The accurate quantification of proteins is important in several areas of cell biology, biotechnology and medicine. Both relative and absolute protein quantities are often determined by mass spectrometric analysis of one or more of a protein's constituent peptides. For quantification to succeed, however, the experimenter must know which peptides are readily detectable under the mass spectrometric conditions used for analysis. In this paper, genetic programming is used to develop a function that predicts the detectability of peptides from their calculated physico-chemical properties. Classification is carried out in two stages: the selection of a good classifier using the AUROC objective function, followed by the setting of an appropriate threshold. This allows the user to select the balance point between conflicting priorities in an intuitive way. The success of this method is found to be highly dependent on the initial selection of input parameters. The use of brood recombination and of a modified version of the multi-objective FOCUS method is also investigated. While neither has a significant effect on predictive accuracy, the FOCUS method leads to considerably more compact solutions.
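The two-stage classification described above can be illustrated with a minimal sketch. It is not the paper's genetic programming code: the scores stand in for the output of any evolved scoring function, the function names `auroc` and `choose_threshold` are hypothetical, and picking the threshold by a minimum sensitivity is only one assumed way of expressing the user's balance point.

```python
from typing import Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Stage 1: threshold-free quality of a scoring function.

    Computes the area under the ROC curve as the fraction of
    (detectable, undetectable) pairs ranked correctly; ties count half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def choose_threshold(
    scores: Sequence[float], labels: Sequence[int], min_sensitivity: float
) -> Tuple[float, float, float]:
    """Stage 2: fix an operating point for the chosen scoring function.

    Returns (threshold, sensitivity, specificity) for the highest threshold
    whose sensitivity reaches min_sensitivity. A peptide is called
    detectable when its score is >= threshold.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one example of each class")
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        if sensitivity >= min_sensitivity:
            return t, sensitivity, specificity
    raise ValueError("min_sensitivity must be at most 1.0")


# Toy data (invented for illustration): 1 = peptide detected, 0 = not detected.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_labels = [1, 1, 0, 1, 0, 0]
quality = auroc(example_scores, example_labels)
operating_point = choose_threshold(example_scores, example_labels, 0.6)
```

Separating the two stages means the scoring function is judged independently of any single cut-off, and the cut-off can later be moved toward higher sensitivity or higher specificity without re-evolving the function.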