Mining broad latent query aspects from search sessions
Source
International Conference on Knowledge Discovery and Data Mining
Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Paris, France
SESSION: Research track papers
Pages 867-876  
Year of Publication: 2009
ISBN: 978-1-60558-495-9
Authors
Xuanhui Wang  University of Illinois at Urbana-Champaign, Urbana, IL, USA
Deepayan Chakrabarti  Yahoo! Research, Sunnyvale, CA, USA
Kunal Punera  Yahoo! Research, Sunnyvale, CA, USA
Sponsors
ACM: Association for Computing Machinery
SIGKDD: ACM Special Interest Group on Knowledge Discovery in Data
SIGMOD: ACM Special Interest Group on Management of Data
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 28,   Downloads (12 Months): 113,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1557019.1557114

ABSTRACT

Search queries are typically very short, which means they are often underspecified or have senses that the user did not think of. A broad latent query aspect is a set of keywords that succinctly represents one particular sense, or one particular information need, that can aid users in reformulating such queries. We extract such broad latent aspects from query reformulations found in historical search session logs. We propose a framework under which the problem of extracting such broad latent aspects reduces to that of optimizing a formal objective function under constraints on the total number of aspects the system can store, and the number of aspects that can be shown in response to any given query. We present algorithms to find a good set of aspects, and also to pick the best k aspects matching any query. Empirical results on real-world search engine logs show significant gains over a strong baseline that uses single-keyword reformulations: a gain of 14% and 23% in terms of human-judged accuracy and click-through data respectively, and around 20% in terms of consistency among aspects predicted for "similar" queries. This demonstrates both the importance of broad query aspects, and the efficacy of our algorithms for extracting them.

