Discovering unexpected information from your competitors' web sites
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
San Francisco, California
Pages: 144-153
Year of Publication: 2001
ISBN: 1-58113-391-X
Authors
Bing Liu, School of Computing, National University of Singapore, Singapore 117543
Yiming Ma, School of Computing, National University of Singapore, Singapore 117543
Philip S. Yu, IBM T. J. Watson Research Center, Yorktown Heights, NY
Bibliometrics
Downloads (6 Weeks): 17, Downloads (12 Months): 72, Citation Count: 17
ABSTRACT
Ever since the beginning of the Web, finding useful information from the Web has been an important problem. Existing approaches include keyword-based search, wrapper-based information extraction, Web query and user preferences. These approaches essentially find information that matches the user's explicit specifications. This paper argues that this is insufficient. There is another type of information that is also of great interest, i.e., unexpected information, which is unanticipated by the user. Finding unexpected information is useful in many applications. For example, it is useful for a company to find unexpected information about its competitors, e.g., unexpected services and products that its competitors offer. With this information, the company can learn from its competitors and/or design countermeasures to improve its competitiveness. Since the number of pages of a typical commercial site is very large and there are also many relevant sites (competitors), it is very difficult for a human user to view each page to discover the unexpected information. Automated assistance is needed. In this paper, we propose a number of methods to help the user find various types of unexpected information from his/her competitors' Web sites. Experimental results show that these techniques are very useful in practice and also efficient.
CITED BY 17
Cheng-Ru Lin, Chang-Hung Lee, Ming-Syan Chen, Philip S. Yu, Distributed data mining in a chain store database of short transactions, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 23-26, 2002, Edmonton, Alberta, Canada
Jian-Tao Sun, Xuanhui Wang, Dou Shen, Hua-Jun Zeng, Zheng Chen, CWS: a comparative web search system, Proceedings of the 15th international conference on World Wide Web, May 23-26, 2006, Edinburgh, Scotland