ABSTRACT
Traditionally, information extraction from web tables has focused on small, more or less homogeneous corpora, often based on assumptions about the use of