Key blog distillation: ranking aggregates
Source
Conference on Information and Knowledge Management
Proceedings of the 17th ACM conference on Information and knowledge management
Napa Valley, California, USA
SESSION: IR: blog
Pages 1043-1052  
Year of Publication: 2008
ISBN:978-1-59593-991-3
Authors
Craig Macdonald  University of Glasgow, Glasgow, United Kingdom
Iadh Ounis  University of Glasgow, Glasgow, United Kingdom
Sponsors
ACM: Association for Computing Machinery
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1458082.1458221

ABSTRACT

Searchers on the blogosphere often need to identify key bloggers whose interests are similar to their own. The main difference between this blog distillation task and normal ad hoc or Web document retrieval is that each blog can be seen as an aggregate of its constituent posts. We show, however, that the task is similar to the expert search task, where a person's expertise is derived from the aggregate of their publications or emails. In this paper, we investigate several aspects of blog retrieval. Firstly, we test whether a blog should be represented as a whole unit or by treating each of its posts as an indicator of its relevance, showing that expert search techniques can be adapted for blog search. Secondly, we examine whether indexing only the XML feed provided by each blog (which is often incomplete) is sufficient, or whether the full text of each blog post should be downloaded. Lastly, we use approaches that detect the central or recurring interests of each blog to increase the retrieval effectiveness of the system. Results on the TREC 2007 Blog dataset show that our proposed expert search paradigm is indeed useful in identifying key bloggers, achieving high retrieval effectiveness.
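
The idea of ranking a blog through its posts can be illustrated with a small sketch. The code below is not the authors' implementation: the abstract does not say which aggregation functions are used, so the sketch assumes three simple ones borrowed from the data fusion literature (a vote count, CombSUM and CombMNZ). The function name rank_blogs, the post and blog identifiers, and the scores are all hypothetical. Each retrieved post is treated as a vote for the blog it belongs to, in the same way that a retrieved document counts as a vote for its author in expert search.

```python
from collections import defaultdict


def rank_blogs(post_scores, post_to_blog, method="combsum"):
    """Aggregate per-post retrieval scores into a ranking of blogs.

    post_scores:  dict mapping post id -> retrieval score for the query
    post_to_blog: dict mapping post id -> id of the blog that published it
    method:       "votes", "combsum" or "combmnz" (illustrative choices only)
    """
    totals = defaultdict(float)  # sum of post scores per blog
    votes = defaultdict(int)     # number of retrieved posts per blog

    for post_id, score in post_scores.items():
        blog_id = post_to_blog[post_id]
        totals[blog_id] += score
        votes[blog_id] += 1

    if method == "votes":
        # Each retrieved post counts as one vote for its blog.
        scores = {blog: float(count) for blog, count in votes.items()}
    elif method == "combsum":
        # Sum the scores of the retrieved posts of each blog.
        scores = dict(totals)
    elif method == "combmnz":
        # Score sum multiplied by the number of retrieved posts.
        scores = {blog: votes[blog] * totals[blog] for blog in votes}
    else:
        raise ValueError(f"unknown aggregation method: {method}")

    # Highest score first; ties are broken by blog id for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical post ranking for a single query.
post_scores = {"p1": 2.0, "p2": 1.5, "p3": 3.0, "p4": 0.5}
post_to_blog = {"p1": "blogA", "p2": "blogA", "p3": "blogB", "p4": "blogC"}

for method in ("votes", "combsum", "combmnz"):
    print(method, rank_blogs(post_scores, post_to_blog, method))
```

In this toy data, blogB owns the single best-scoring post, but blogA is ranked first under all three functions because two of its posts were retrieved. That is the sense in which a blog is scored as an aggregate rather than as one document.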

