ABSTRACT
Often, publishers are reluctant to offer valuable digital documents
on the Internet for fear that they will be re-transmitted or copied
widely. A copy detection mechanism can help identify such copying.
For example, publishers may register their documents with a copy
detection server, and the server can then automatically check
public sources such as UseNet articles and Web sites for potential
illegal copies. The server can search for exact copies, and also
for cases where significant portions of documents have been copied.
In this paper we study, for the first time, the performance of
various copy detection mechanisms, including their disk storage
requirements, main memory requirements, and response times for
registration and querying. We also contrast performance with the
accuracy of the mechanisms (how well they detect partial copies).
The results are obtained using SCAM, an experimental server we have
implemented, and a collection of 50,000 netnews articles.