ABSTRACT
Connections is a file system search tool that combines traditional content-based search with context information gathered from user activity. By tracing file system calls, Connections identifies temporal relationships between files and uses them to expand and reorder traditional content search results. Doing so improves both recall (reducing false negatives) and precision (reducing false positives). For example, on the first ten results, Connections improves average recall from 13% to 22% and average precision from 23% to 29%. When averaged across all recall levels, Connections improves precision from 17% to 28%. Connections provides these benefits with only modest increases in average query time (2 seconds), indexing time (23 seconds daily), and index size (under 1% of the user's data set).
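As a point of reference, precision and recall over the first k results can be computed as in the sketch below. The function name, file names, and relevance set are hypothetical illustrations and are not taken from Connections itself. Precision is the fraction of returned results that are relevant, so it drops with each false positive; recall is the fraction of relevant files that were returned, so it drops with each false negative.

```python
def precision_recall_at_k(ranked, relevant, k=10):
    """Return (precision, recall) over the first k ranked results.

    ranked   -- list of result identifiers, best first (hypothetical input)
    relevant -- set of identifiers judged relevant to the query
    """
    top_k = ranked[:k]
    # A hit is a returned result that is actually relevant.
    hits = sum(1 for doc in top_k if doc in relevant)
    # Precision: share of returned results that are relevant.
    precision = hits / len(top_k) if top_k else 0.0
    # Recall: share of relevant files that were returned.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four results returned, three files relevant overall.
ranked = ["a.txt", "b.txt", "c.txt", "d.txt"]
relevant = {"a.txt", "c.txt", "z.txt"}
p, r = precision_recall_at_k(ranked, relevant, k=4)
print(f"precision={p:.2f} recall={r:.2f}")
```

In this toy case two of the four returned files are relevant, and two of the three relevant files were returned; expanding the result set with related files can raise recall, while reordering affects which files fall within the first k.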