Relating web pages to enable information-gathering tasks
Source
Conference on Hypertext and Hypermedia archive
Proceedings of the 20th ACM conference on Hypertext and hypermedia
Torino, Italy
SESSION: Link analysis
Pages 109-118  
Year of Publication: 2009
ISBN:978-1-60558-486-7
Authors
Amitabha Bagchi  Indian Institute of Technology, New Delhi, India
Garima Lahoti  Cazoodle Inc., Champaign, IL, USA
Sponsors
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1557914.1557935

ABSTRACT

We argue that relationships between Web pages are functions of the user's intent. We identify a class of Web tasks, information-gathering tasks, that can be facilitated by providing links to pages related to the page the user is currently viewing. We define three kinds of intentional relationships that correspond to whether the user is a) seeking sources of information, b) reading pages which provide information, or c) surfing through pages as part of an extended information-gathering process. We show that these three relationships can be mined using a combination of textual and link information and provide three scoring mechanisms that correspond to them: SeekRel, FactRel and SurfRel. To compute the scores, we build a set of capacitated subnetworks, each corresponding to a particular keyword, and compute flows on these subnetworks. The capacities of the links are derived from the hub and authority values of the nodes they connect, following the work of Kleinberg (1998) on assigning authority to pages in hyperlinked environments. We evaluated our scoring mechanisms by running experiments on four data sets taken from the Web. We present user evaluations of the relevance of the top results returned by our scoring mechanisms and compare those to the top results returned by Google's Similar Pages feature and the Companion algorithm (Dean and Henzinger, 1999).
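
The flow-based scoring idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the capacity formula, the construction of the keyword subnetworks, or the definitions of SeekRel, FactRel and SurfRel, so everything below is an assumption made for illustration: the link capacity is taken to be hub(u) * authority(v), hub and authority values come from a plain power iteration in the style of Kleinberg's method, the flow is computed with the standard Edmonds-Karp algorithm, and the function names `hits`, `max_flow` and `flow_score` are hypothetical.

```python
from collections import deque


def hits(edges, iterations=50):
    """Hub and authority scores by power iteration (Kleinberg-style)."""
    nodes = {n for edge in edges for n in edge}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of v: sum of the hub scores of pages linking to v.
        auth = {n: 0.0 for n in nodes}
        for u, v in edges:
            auth[v] += hub[u]
        # Hub of u: sum of the authority scores of pages u links to.
        new_hub = {n: 0.0 for n in nodes}
        for u, v in edges:
            new_hub[u] += auth[v]
        # Normalise both vectors so the scores stay bounded.
        a_norm = sum(x * x for x in auth.values()) ** 0.5 or 1.0
        h_norm = sum(x * x for x in new_hub.values()) ** 0.5 or 1.0
        auth = {n: x / a_norm for n, x in auth.items()}
        hub = {n: x / h_norm for n, x in new_hub.items()}
    return hub, auth


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow; capacity maps (u, v) to a number."""
    if source == sink:
        return 0.0
    residual, adj = {}, {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0.0) + c
        residual.setdefault((v, u), 0.0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        total += bottleneck


def flow_score(edges, page, candidate):
    """Relatedness of candidate to page as a max flow (illustrative only)."""
    hub, auth = hits(edges)
    # ASSUMPTION: capacity = hub(u) * authority(v); the abstract only says
    # capacities are "derived from" the hub and authority values.
    capacity = {(u, v): hub[u] * auth[v] for u, v in edges}
    return max_flow(capacity, page, candidate)


# Toy subnetwork: page "a" reaches "d" through "b" and through "c".
toy_links = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
print(flow_score(toy_links, "a", "d"))
```

In this toy subnetwork the score from "a" to "d" is positive because two link paths connect them, while the score from "d" to "a" is zero because no directed path exists. A per-keyword variant would presumably restrict `toy_links` to pages matching that keyword, but how the paper does this is not stated in the abstract.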


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1. Altavista. http://www.altavista.com/.
3. N. J. Belkin, R. N. Oddy, and H. M. Brooks. ASK for information retrieval: Part I. Background and theory. J. Doc., 38(2):61-71, 1982.
4. N. J. Belkin, R. N. Oddy, and H. M. Brooks. ASK for information retrieval: Part II. Results of a design study. J. Doc., 38(3):145-164, 1982.
9. M. de Kunder. The size of the World Wide Web. http://www.worldwidewebsize.com/. Retrieved on 29th February 2008.
11. C. Fellbaum, editor. Wordnet: An electronic lexical database. Bradford Books, 1998.
12. A. J. Ferrari, D. Gourley, K. Johnson, F. C. Knabe, D. Tunkelang, and J. S. Walter. Hierarchical data-driven navigation system and method for information retrieval. U.S. Patent number 7,035,864, April 2006.
19. S. Lawrence and C. L. Giles. Accessibility of information on the Web. Nature, 400:107-109, 1999.
25. Nutch. http://lucene.apache.org/nutch/.
30. A. Tombros and Z. Ali. Factors affecting Web page similarity. In 27th European Conference on Information Retrieval (ECIR), 2005.
32. Yahoo! Content analysis Web services: Term extraction. http://developer.yahoo.com/search/content/V1/termExtraction.html.
