| From HTML documents to web tables and rules |
| Source
|
ACM International Conference Proceeding Series; Vol. 156
Proceedings of the 8th international conference on Electronic commerce: The new e-commerce: innovations for conquering current barriers, obstacles and limitations to conducting successful business on the internet
Fredericton, New Brunswick, Canada
SESSION: Semantic web ontologies, rules, and services track
Pages: 125 - 131
Year of Publication: 2006
ISBN: 1-59593-392-1
|
|
Authors
|
|
Kai Simon
|
Universität Freiburg, Freiburg i.Br., Germany
|
|
Georg Lausen
|
Universität Freiburg, Freiburg i.Br., Germany
|
|
Harold Boley
|
Institute for Information Technology -- e-Business, Fredericton, NB, Canada
|
|
|
|
|
ABSTRACT
We present a browser-extending Semantic Web extraction system that maps HTML documents to tables and, where possible, to rules. First, the basic data extractor ViPER distills and reorganizes semi-structured information into a tabular data structure, which can again be browsed and/or submitted to further machine processing. Second, exemplifying the latter, the extended knowledge extractor Rex ViPER mines the resulting tables for structural properties and functional dependencies. Rules are generated to obtain a more compact and manageable, often also enriched, knowledge representation. The resulting fully structured information, RuleML-serialized facts and rules, can be stored along with the original documents, queried by rule engines such as OO jDREW and FLORID, and interchanged between Web Services. Thus Rex ViPER contributes to automating the construction of a machine-processable Semantic Web.
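To illustrate what mining a table for functional dependencies can mean, here is a minimal sketch in Python. It is not the Rex ViPER algorithm, which the abstract does not describe; it only shows a brute-force check of single-column dependencies. The function names (`holds`, `single_column_fds`) and the product table are hypothetical and chosen purely for illustration.

```python
from itertools import permutations


def holds(rows, lhs, rhs):
    """Return True if column `lhs` functionally determines column `rhs`,
    i.e. no two rows agree on `lhs` but differ on `rhs`."""
    seen = {}
    for row in rows:
        key, value = row[lhs], row[rhs]
        # setdefault stores the first value seen for this key and returns it;
        # a later, different value for the same key violates the dependency.
        if seen.setdefault(key, value) != value:
            return False
    return True


def single_column_fds(rows, columns):
    """Brute-force search for all dependencies A -> B between single columns."""
    return [(a, b) for a, b in permutations(columns, 2) if holds(rows, a, b)]


# Hypothetical extracted table (assumption: not data from the paper).
rows = [
    {"model": "A100", "brand": "Acme", "price": 199},
    {"model": "A200", "brand": "Acme", "price": 299},
    {"model": "B100", "brand": "Bolt", "price": 199},
]

fds = single_column_fds(rows, ["model", "brand", "price"])
for lhs, rhs in fds:
    print(f"{lhs} -> {rhs}")
```

In this toy table only `model` determines the other columns. A dependency found this way could then, in principle, be written as a rule rather than repeated in every fact, which is the kind of compaction the abstract alludes to.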