ABSTRACT
Dealing with very large databases is one of the defining challenges in data mining research and development. When a database is not a static repository of data, or when the data come from different sources and pooling them might create a huge database for centralized processing, knowledge discovery in such data environments cannot be a one-time process. Existing techniques include data sampling, windowing, bagging, boosting, batch learning, hierarchical meta-learning, and parallel and distributed data mining. This talk will review these techniques and present our own recent research on multi-layer induction and on synthesizing association rules from different data sources.