ABSTRACT
Model induction from relational data requires aggregation of the values of attributes of related entities. This paper makes three contributions to the study of relational learning. (1) It presents a hierarchy of relational concepts of increasing complexity, using relational schema characteristics such as cardinality, and derives classes of aggregation operators that are needed to learn these concepts. (2) Expanding one level of the hierarchy, it introduces new aggregation operators that model the distributions of the values to be aggregated and (for classification problems) the differences in these distributions by class. (3) It demonstrates empirically on a noisy business domain that more-complex aggregation methods can increase generalization performance. Constructing features using target-dependent aggregations can transform relational prediction tasks so that well-understood feature-vector-based modeling algorithms can be applied successfully.
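To make the idea of a target-dependent aggregation concrete, the following is a minimal sketch and not the operators defined in the paper. It assumes a one-to-many relationship in which each target entity owns a bag of categorical values. All names (`value_distribution`, `class_reference_distributions`, `aggregate_features`, the toy customer data) are hypothetical, and cosine similarity is used only as one plausible way to compare a bag's value distribution with a per-class reference distribution.

```python
from collections import Counter
from math import sqrt


def value_distribution(values):
    """Return the normalized frequency of each categorical value in a bag."""
    counts = Counter(values)
    total = sum(counts.values())
    return {v: c / total for v, c in counts.items()} if total else {}


def cosine(p, q):
    """Cosine similarity between two sparse distributions stored as dicts."""
    dot = sum(p[k] * q.get(k, 0.0) for k in p)
    norm_p = sqrt(sum(x * x for x in p.values()))
    norm_q = sqrt(sum(x * x for x in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def class_reference_distributions(bags, labels):
    """Pool the related values of all entities in each class (assumed setup)."""
    pooled = {}
    for key, values in bags.items():
        pooled.setdefault(labels[key], []).extend(values)
    return {label: value_distribution(vals) for label, vals in pooled.items()}


def aggregate_features(values, references):
    """Turn one bag of related values into a fixed-length feature dict."""
    dist = value_distribution(values)
    features = {"count": len(values)}  # simple, target-independent aggregate
    for label, ref in sorted(references.items()):
        # Target-dependent aggregate: similarity to each class's distribution.
        features[f"sim_class_{label}"] = cosine(dist, ref)
    return features


# Hypothetical toy data: product categories bought by each customer.
bags = {
    "c1": ["book", "book", "music"],
    "c2": ["book"],
    "c3": ["game", "game"],
    "c4": ["game", "music"],
}
labels = {"c1": 1, "c2": 1, "c3": 0, "c4": 0}

references = class_reference_distributions(bags, labels)
features = aggregate_features(bags["c2"], references)
print(features)
```

The resulting feature dictionaries have the same keys for every entity, so they can be passed to an ordinary feature-vector learner. In a real experiment the reference distributions would have to be estimated from training entities only, since they depend on the class labels.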