ABSTRACT
Information dissemination is a powerful mechanism for finding information in wide-area environments. An information dissemination server accepts long-term user queries, collects new documents from information sources, matches the documents against the queries, and continuously updates the users with relevant information. This paper is a retrospective of the Stanford Information Filtering Service (SIFT), a system that as of April 1996 was processing over 40,000 worldwide subscriptions and over 80,000 documents daily. The paper describes some of the indexing mechanisms that were developed for SIFT, as well as the evaluations that were conducted to select a scheme to implement. It also describes the implementation of SIFT and presents experimental results for the actual system. Finally, it discusses and experimentally evaluates techniques for distributing a service such as SIFT for added performance and availability.
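The matching step described above (standing queries tested against each incoming document) can be illustrated with a small sketch. It assumes conjunctive keyword subscriptions and a simple counting scheme over an inverted index of subscriptions; the class name `SubscriptionIndex` and its methods are hypothetical, and this is not necessarily the scheme that was selected for SIFT.

```python
from collections import defaultdict


class SubscriptionIndex:
    """Toy matcher for conjunctive keyword subscriptions (hypothetical design)."""

    def __init__(self):
        # term -> ids of the subscriptions that contain that term
        self.postings = defaultdict(set)
        # subscription id -> number of distinct terms it requires
        self.sizes = {}

    def subscribe(self, sub_id, terms):
        """Register a long-term query given as a collection of required terms."""
        words = {t.lower() for t in terms}
        if not words:
            raise ValueError("a subscription needs at least one term")
        self.sizes[sub_id] = len(words)
        for w in words:
            self.postings[w].add(sub_id)

    def match(self, document):
        """Return the ids of subscriptions whose terms all occur in the document."""
        words = set(document.lower().split())
        hits = defaultdict(int)
        for w in words:
            for sub_id in self.postings.get(w, ()):
                hits[sub_id] += 1
        # A subscription matches when every one of its terms was seen.
        return sorted(s for s, n in hits.items() if n == self.sizes[s])


index = SubscriptionIndex()
index.subscribe("alice", ["information", "filtering"])
index.subscribe("bob", ["satellite"])
print(index.match("Stanford information filtering service"))
```

Indexing the subscriptions rather than the documents means each new document touches only the subscriptions that share a term with it, which is what matters when subscriptions are long-lived and documents arrive continuously.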