Managing and analyzing carbohydrate data
Source: ACM SIGMOD Record
Volume 33, Issue 2 (June 2004)
SPECIAL ISSUE: Data engineering for life sciences
Pages: 33-38
Year of Publication: 2004
ISSN:0163-5808
Authors
Kiyoko F. Aoki  Kyoto University, Kyoto, Japan
Nobuhisa Ueda  Kyoto University, Kyoto, Japan
Atsuko Yamaguchi  Kyoto University, Kyoto, Japan
Tatsuya Akutsu  Kyoto University, Kyoto, Japan
Minoru Kanehisa  Kyoto University, Kyoto, Japan
Hiroshi Mamitsuka  Kyoto University, Kyoto, Japan
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1024694.1024700

ABSTRACT

Carbohydrates are among the most vital molecules in multicellular organisms because they are structurally important in the construction of such organisms. In fact, all cells in nature carry carbohydrate sugar chains, or glycans, that help modulate various cell-cell events during the development of the organism. Unfortunately, informatics research on glycans has been slow in comparison to that on DNA and proteins, largely due to difficulties in the biological analysis of glycan structures. Our work applies data engineering approaches to glean some understanding of the glycan data that is currently publicly available. In particular, by modeling glycans as labeled unordered trees, we have implemented a tree-matching algorithm for measuring tree similarity. Our algorithm builds on proven, efficient methodologies from computer science that have been extended and developed for glycan data. Moreover, glycans are recognized by various agents in multicellular organisms, so to capture the patterns that might be recognized, we needed to capture dependencies that seem to range beyond the directly connected nodes in a tree. Therefore, by defining glycans as labeled ordered trees, we were able to develop a new probabilistic tree model with which sibling patterns across a tree can be mined. We provide promising results from our methodologies that could prove useful for the future of glycome informatics.
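The idea of scoring glycans as labeled unordered trees can be illustrated with a small sketch. It is not the authors' algorithm: it is a brute-force illustration that assumes node labels are monosaccharide names, ignores glycosidic linkage labels on edges, and tries every assignment of children, which is only practical because glycans have few branches per node. The names `Node`, `rooted_match`, and `similarity` are hypothetical.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Node:
    """A monosaccharide with its child residues (order is irrelevant)."""
    label: str
    children: tuple = ()


def rooted_match(a, b):
    """Size of the largest common subtree that contains both a and b as its root.

    Children are treated as unordered: every injective assignment of the
    shorter child list into the longer one is tried (brute force).
    """
    if a.label != b.label:
        return 0
    small, large = sorted((a.children, b.children), key=len)
    best = 0
    for assignment in permutations(large, len(small)):
        total = sum(rooted_match(x, y) for x, y in zip(small, assignment))
        best = max(best, total)
    return 1 + best


def nodes(tree):
    """Yield every node of the tree, root first."""
    yield tree
    for child in tree.children:
        yield from nodes(child)


def similarity(a, b):
    """Largest common subtree over all node pairs, normalized to [0, 1]."""
    best = max(rooted_match(x, y) for x in nodes(a) for y in nodes(b))
    size_a = sum(1 for _ in nodes(a))
    size_b = sum(1 for _ in nodes(b))
    return best / max(size_a, size_b)


# Illustrative structures only; they are not taken from any database.
t1 = Node("GlcNAc", (Node("GlcNAc", (Node("Man", (
    Node("Man"), Node("Man", (Node("GlcNAc"),)))),)),))
# Same glycan as t1 with the two mannose branches listed in the other order.
t2 = Node("GlcNAc", (Node("GlcNAc", (Node("Man", (
    Node("Man", (Node("GlcNAc"),)), Node("Man"))),)),))
t3 = Node("Man", (Node("Man"), Node("Man", (Node("Gal"),))))

print(similarity(t1, t2))  # 1.0: identical as unordered trees
print(similarity(t1, t3))  # 0.5: three shared mannose residues out of six
```

Replacing the permutation loop with a maximum-weight bipartite matching would be the usual way to make this scale to wider trees; the exhaustive version is kept here only because it is short and easy to check by hand.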


