ABSTRACT
One challenge in microarray experiments is assessing whether the results are biologically significant. This assessment can be aided by detailed annotation of the probe set target sequences, including gene function or category, protein product, and pathway information. NetAffx compiles public and in-house annotations for all Affymetrix chip sets. Public annotations are collected from UniGene, LocusLink, and Swiss-Prot. In-house annotations are produced by Generalized Rapid Automated Protein Analysis (GRAPA), a high-accuracy hidden Markov model (HMM) method for protein annotation. GRAPA has been used to generate novel annotations under three classification schemes: Structural Classification of Proteins (SCOP), Enzyme Commission (EC), and G protein-coupled receptors (GPCR). In addition, annotations are generated by searching the Pfam and BLOCKS databases. These annotation schemes have been applied to diverse genomes, including human, mouse, rat, Drosophila, and yeast, and then mapped onto Affymetrix microarray probe sets. The combination of protein-level annotations with public source annotations creates a powerful description of genes at both the genomic and protein levels. Users can collect information on a target sequence, or cluster microarray probe sets according to a given domain or functional category. NetAffx is available on the web at http://www.NetAffx.com/.