Learning relational probability trees
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Washington, D.C.
POSTER SESSION: Research track
Pages: 625 - 630
Year of Publication: 2003
ISBN: 1-58113-737-0
ABSTRACT
Classification trees are widely used in the machine learning and data mining communities for modeling propositional data. Recent work has extended this basic paradigm to probability estimation trees. Traditional tree learning algorithms assume that instances in the training data are homogeneous and independently distributed. Relational probability trees (RPTs) extend standard probability estimation trees to a relational setting in which data instances are heterogeneous and interdependent. Our algorithm for learning the structure and parameters of an RPT searches over a space of relational features that use aggregation functions (e.g., AVERAGE, MODE, COUNT) to dynamically propositionalize relational data and create binary splits within the RPT. Previous work has identified a number of statistical biases due to characteristics of relational data such as autocorrelation and degree disparity. The RPT algorithm uses a novel form of randomization test to adjust for these biases. On a variety of relational learning tasks, RPTs built using randomization tests are significantly smaller than other models and achieve equivalent, or better, performance.
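The idea of aggregation-based relational features can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy movie/actor data, the helper names (`relational_feature`, `split_score`, `permutation_p_value`), and the scoring function are all hypothetical. The significance check is an ordinary label-permutation test, used here only as a stand-in, since the abstract does not specify the details of the novel randomization test used by the RPT algorithm.

```python
import random
from collections import Counter
from statistics import mean

# Hypothetical toy data: each movie (the target object) is linked to a
# variable-sized bag of actors, so instances are heterogeneous.
movies = [
    {"hit": 1, "actors": [{"age": 30, "gender": "F"}, {"age": 45, "gender": "F"}, {"age": 50, "gender": "M"}]},
    {"hit": 1, "actors": [{"age": 55, "gender": "M"}, {"age": 60, "gender": "M"}]},
    {"hit": 1, "actors": [{"age": 42, "gender": "F"}]},
    {"hit": 1, "actors": [{"age": 48, "gender": "M"}, {"age": 52, "gender": "F"}]},
    {"hit": 0, "actors": [{"age": 22, "gender": "M"}, {"age": 25, "gender": "M"}]},
    {"hit": 0, "actors": [{"age": 35, "gender": "F"}]},
    {"hit": 0, "actors": [{"age": 28, "gender": "M"}, {"age": 30, "gender": "M"}, {"age": 32, "gender": "F"}]},
    {"hit": 0, "actors": [{"age": 20, "gender": "F"}, {"age": 38, "gender": "M"}]},
]

# Aggregation functions named in the abstract.
AGGREGATORS = {
    "COUNT": lambda values: len(values),
    "AVERAGE": lambda values: mean(values) if values else 0.0,
    "MODE": lambda values: Counter(values).most_common(1)[0][0] if values else None,
}

def relational_feature(instance, relation, attribute, aggregator, test):
    """Aggregate one attribute over the related objects, then apply a
    boolean test, yielding a single binary (propositional) feature."""
    values = [obj[attribute] for obj in instance[relation]]
    return test(AGGREGATORS[aggregator](values))

def split_score(labels, branch):
    """Hypothetical score: absolute difference in positive-class rate
    between the two branches of a binary split."""
    left = [y for y, b in zip(labels, branch) if b]
    right = [y for y, b in zip(labels, branch) if not b]
    if not left or not right:
        return 0.0
    return abs(mean(left) - mean(right))

def permutation_p_value(labels, branch, trials=1000, seed=0):
    """Plain label-permutation test (an assumption, not the paper's test):
    how often does a shuffled labeling score at least as well?"""
    rng = random.Random(seed)
    observed = split_score(labels, branch)
    shuffled = list(labels)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if split_score(shuffled, branch) >= observed:
            hits += 1
    return (hits + 1) / (trials + 1)

labels = [m["hit"] for m in movies]
# Candidate split: AVERAGE(actor.age) > 40
branch = [relational_feature(m, "actors", "age", "AVERAGE", lambda v: v > 40) for m in movies]
score = split_score(labels, branch)
p = permutation_p_value(labels, branch)
print(branch, score, p)
```

A tree learner in this style would evaluate many such (relation, attribute, aggregator, threshold) combinations at each node and keep a split only if its score remains significant under the randomization test.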
CITED BY 30
Amy McGovern, Lisa Friedland, Michael Hay, Brian Gallagher, Andrew Fast, Jennifer Neville, David Jensen, Exploiting relational structure to understand publication patterns in high-energy physics, ACM SIGKDD Explorations Newsletter, v.5 n.2, December 2003
Jennifer Neville, Özgür Şimşek, David Jensen, John Komoroske, Kelly Palmer, Henry Goldberg, Using relational knowledge discovery to prevent securities fraud, Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, August 21-24, 2005, Chicago, Illinois, USA
Andrew Fast, Lisa Friedland, Marc Maier, Brian Taylor, David Jensen, Henry G. Goldberg, John Komoroske, Relational data pre-processing techniques for improved securities fraud detection, Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, August 12-15, 2007, San Jose, California, USA
Brian Gallagher, Hanghang Tong, Tina Eliassi-Rad, Christos Faloutsos, Using ghost edges for classification in sparsely labeled networks, Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
Tom Croonenborghs, Jan Ramon, Hendrik Blockeel, Maurice Bruynooghe, Online learning and exploiting relational models in reinforcement learning, Proceedings of the 20th international joint conference on Artificial intelligence, p.726-731, January 06-12, 2007, Hyderabad, India
Jesse Davis, Elizabeth Burnside, Inês Dutra, David Page, Raghu Ramakrishnan, Vitor Santos Costa, Jude Shavlik, View learning for statistical relational learning: with an application to mammography, Proceedings of the 19th international joint conference on Artificial intelligence, p.677-683, July 30-August 05, 2005, Edinburgh, Scotland