ACM Home Page
Please provide us with feedback. Feedback
GADDI: distance index based subgraph matching in biological networks
Full text PdfPdf (1.03 MB)
Source Extending Database Technology; Vol. 360 archive
Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology table of contents
Saint Petersburg, Russia
SESSION: Research sessions: Graph techniques table of contents
Pages 192-203  
Year of Publication: 2009
ISBN:978-1-60558-422-5
Authors
Shijie Zhang  Case Western Reserve University, Cleveland, OH
Shirong Li  Case Western Reserve University, Cleveland, OH
Jiong Yang  Case Western Reserve University, Cleveland, OH
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 17,   Downloads (12 Months): 86,   Citation Count: 0
Additional Information:

abstract   references   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1516360.1516384
What is a DOI?

ABSTRACT

Currently, a huge amount of biological data can be naturally represented by graphs, e.g., protein interaction networks, gene regulatory networks, etc. The need for indexing large graphs is an urgent research problem of great practical importance. The main challenge is size. Each graph may contain thousands (or more) vertices. Most of the previous work focuses on indexing a set of small or medium sized database graphs (with only tens of vertices) and finding whether a query graph occurs in any of these. In this paper, we are interested in finding all the matches of a query graph in a given large graph of thousands of vertices, which is a very important task in many biological applications. This increases the complexity significantly. We propose a novel distance measurement which reintroduces the idea of frequent substructures in a single large graph. We devise the novel structure distance based approach (GADDI) to efficiently find matches of the query graph. GADDI is further optimized by the use of a dynamic matching scheme to minimize redundant calculations. Last but not least, a number of real and synthetic data sets are used to evaluate the efficiency and scalability of our proposed method.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
 
2
 
3
J. Berg, and M. Lassig, Local graph alignment and motif search in biological networks, PNAS, 2004.
 
4
B. Bringmann, and S. Nijssen: What Is Frequent in a Single Graph? PAKDD 2008.
5
 
6
 
7
 
8
T. Dandekar, S. Schuster, B. Snel, M. Huynen, and P. Bork, Pathway alignment: application to the comparative analysis of glycolytic enzymes, Biochem, 1999.
 
9
L. Dehaspe, H. Toivonen, and R. King. Finding frequent substructures in chemical compounds. Proc. of KDD, 1998.
 
10
 
11
B. Dost, T. Shlomi, N. Gupta, E. Ruppin, V. Bafna, and R. Sharan. QNet: A Tool for Querying Protein Interaction Networks, Proc. of RECOMB, 2007.
 
12
 
13
 
14
 
15
J. Gonzalez, L. Holder, and D. Cook. Application of graph-based concept learning to the predictive toxicology domain. Proc. of the Predictive Toxicology Challenge Workshop, 2001.
 
16
R. Giugno, D. Shasha, GraphGrep:, A Fast and Universal Method for Querying Graphs. Proc. of ICPR, 2002.
 
17
18
19
 
20
 
21
 
22
D. Jensen, and H. Goldberg. Artificial Intelligence and Link Analysis. AAAI Press, 1998.
 
23
H. Jiang, H. Wang, P. Yu, and S. Zhou. Gstring: A novel approach for efficient search in graph databases. Proc. of ICDE, 2007.
24
 
25
J. Kleinberg, R. Kumar, P. Raghavan, S. Rajagopalan, and A. Tomkins. The Web as a graph: Measurements, models and methods. Lecture Notes in Computer Science, 1999.
 
26
 
27
 
28
M. Koyuturk, A. Grama, and W. Szpankowski. Pairwise Local Alignment of Protein Interaction Networks Guided by Models of Evolution. RECOMB 2005.
 
29
M. Koyuturk, Y. Kim, U. Topkara, S. Subramaniam, W. Szpankowski, and A. Grama. Pairwise Alignment of Protein Interaction Networks, RECOMB 2005.
30
 
31
M. Kuramochi, and G. Karypis. Frequent subgraph discovery. Proc. of ICDE, 2001.
 
32
33
 
34
R. Mooney, P. Melville, L. Tang, J. Shavlik, I. Castro Dutra, and D. Page. Relational data mining with inductive logic programming for link discovery. AAAI Press/The MIT Press, 2004.
35
36
 
37
38
 
39
R. Singh, J. Xu, and B. Berger. Pairwise Global Alignment of Protein Interaction Networks by Matching Neighborhood Topology, RECOMB 07'.
 
40
Y. Tian and J. Patel. TALE: A Tool for Approximate Large Graph Matching, Proc. of ICDE, 2008.
41
 
42
 
43
S. Wasserman, K. Faust, and D. Iacobucci. Social network analysis: Methods and applications. Cambridge University Press, 1994.
 
44
D. Williams, J. Huan, and W. Wang. Graph database indexing using structured graph decomposition. Proc. of ICDE, 2007.
 
45
46
47
 
48
 
49
 
50
K. Yoshida, H. Motoda, and N. Indurkhya, Graph-based induction as a unified learning framework. Journal of Applied Intelligence, 1994.
 
51
S. Zhang, M. Hu, and J. Yang. Treepi: a new graph indexing method. Proc. of ICDE, 2007.
 
52
53
 
54
Gene Ontology. http://www.geneontology.org/.
 
55
Social Network. http://www-personal.umich.edu/mejn/netdata/.
Collaborative Colleagues:
Shijie Zhang: colleagues
Shirong Li: colleagues
Jiong Yang: colleagues