ABSTRACT
With the phenomenal growth of the Web, there is an ever-increasing volume of data and information published in numerous Web pages. Research in Web mining aims to develop new techniques to effectively extract and mine useful knowledge or information from these Web pages [8]. Due to the heterogeneity and lack of structure of Web data, automated discovery of targeted or unexpected knowledge/information is a challenging task. It calls for novel methods that draw from a wide range of fields spanning data mining, machine learning, natural language processing, statistics, databases, and information retrieval. In the past few years, there has been a rapid expansion of activities in the Web mining field, which consists of Web usage mining, Web structure mining, and Web content mining. Web usage mining refers to the discovery of user access patterns from Web usage logs. Web structure mining tries to discover useful knowledge from the structure of hyperlinks. Web content mining aims to extract/mine useful information or knowledge from Web page contents. For this special issue, we focus on Web content mining.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
[5] Bergman, M. K. The Deep Web: Surfacing Hidden Value. Technical report, BrightPlanet LLC, Dec. 2000.
[6] Bunescu, R., Mooney, R. Collective Information Extraction with Relational Markov Networks. ACL-2004, 2004.
[17] AnHai Doan, Jayant Madhavan, Pedro Domingos, Alon Halevy, Learning to map between ontologies on the semantic web, Proceedings of the 11th international conference on World Wide Web, May 07-11, 2002, Honolulu, Hawaii, USA. doi:10.1145/511446.511532
[19] D. W. Embley, Y. Jiang, Y.-K. Ng, Record-boundary discovery in Web documents, Proceedings of the 1999 ACM SIGMOD international conference on Management of data, p.467-478, May 31-June 03, 1999, Philadelphia, Pennsylvania, United States.
[20] Oren Etzioni, Michael Cafarella, Doug Downey, Stanley Kok, Ana-Maria Popescu, Tal Shaked, Stephen Soderland, Daniel S. Weld, Alexander Yates, Web-scale information extraction in KnowItAll: (preliminary results), Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA. doi:10.1145/988672.988687
[23] He, H., Meng, W., Yu, C., Wu, Z. WISE-Integrator: An Automatic Integrator of Web Search Interfaces for E-Commerce. VLDB-03, 2003.
[25] Hung-Yu Kao, Ming-Syan Chen, Shian-Hua Lin, Jan-Ming Ho, Entropy-based link analysis for mining web informative structures, Proceedings of the eleventh international conference on Information and knowledge management, November 04-09, 2002, McLean, Virginia, USA. doi:10.1145/584792.584886
[26] Krishna Kummamuru, Rohit Lotlikar, Shourya Roy, Karan Singal, Raghu Krishnapuram, A hierarchical monothetic document clustering algorithm for summarization and browsing search results, Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA. doi:10.1145/988672.988762
[28] Kwok, C., Etzioni, O., Weld, D. Scaling Question Answering to the Web. WWW-00, 2000.
[30] Lawrie, D. J. and Croft, W. B. Generating Hierarchical Summaries for Web Searches. WWW'03, 2003.
[38] Satoshi Morinaga, Kenji Yamanishi, Kenji Tateishi, Toshikazu Fukushima, Mining product reputations on the Web, Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, July 23-26, 2002, Edmonton, Alberta, Canada. doi:10.1145/775047.775098
[40] Nigam, K. and Hurst, M. Towards a Robust Metric of Opinion. AAAI Spring Symposium on Exploring Attitude and Affect in Text. 2004.
[43] Dragomir Radev, Weiguo Fan, Hong Qi, Harris Wu, Amardeep Grewal, Probabilistic question answering on the web, Proceedings of the 11th international conference on World Wide Web, May 07-11, 2002, Honolulu, Hawaii, USA. doi:10.1145/511446.511500
[44] Lakshmish Ramaswamy, Arun Iyengar, Ling Liu, Fred Douglis, Automatic detection of fragments in dynamically generated web pages, Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA. doi:10.1145/988672.988732
[45] D. C. Reis, P. B. Golgher, A. S. Silva, A. F. Laender, Automatic web news extraction using tree edit distance, Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA. doi:10.1145/988672.988740
[46] Sarawagi, S., Cohen, W. Semi-Markov Conditional Random Fields for Information Extraction. NIPS-04, 2004.
[47] Ruihua Song, Haifeng Liu, Ji-Rong Wen, Wei-Ying Ma, Learning block importance models for web pages, Proceedings of the 13th international conference on World Wide Web, May 17-20, 2004, New York, NY, USA. doi:10.1145/988672.988700
[49] Wilson, T., Wiebe, J., and Hwa, R. Just How Mad are You? Finding Strong and Weak Opinion Clauses. AAAI-04, 2004.
[52] Yi, L., and Liu, B. Web Page Cleaning for Web Mining through Feature Weighting. IJCAI-03, 2003.
[56] Hua-Jun Zeng, Qi-Cai He, Zheng Chen, Wei-Ying Ma, Jinwen Ma, Learning to cluster web search results, Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, July 25-29, 2004, Sheffield, United Kingdom. doi:10.1145/1008992.1009030
CITED BY 3
Wolfgang Gatterbauer, Paul Bohunsky, Marcus Herzog, Bernhard Krüpl, Bernhard Pollak, Towards domain-independent information extraction from web tables, Proceedings of the 16th international conference on World Wide Web, May 08-12, 2007, Banff, Alberta, Canada.