Scalable near identical image and shot detection
Source: Conference on Image and Video Retrieval
Proceedings of the 6th ACM International Conference on Image and Video Retrieval
Amsterdam, The Netherlands
Pages: 549 - 556  
Year of Publication: 2007
ISBN:978-1-59593-733-9
Authors
Ondřej Chum  University of Oxford
James Philbin  University of Oxford
Michael Isard  Silicon Valley
Andrew Zisserman  University of Oxford
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1282280.1282359

ABSTRACT

This paper proposes and compares two novel schemes for near-duplicate image and video-shot detection. The first approach is based on global hierarchical colour histograms, using Locality Sensitive Hashing for fast retrieval. The second approach uses local feature descriptors (SIFT) and, for retrieval, exploits techniques from the information retrieval community that compute approximate set intersections between documents using a min-Hash algorithm.
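As a rough illustration of the min-Hash idea mentioned above, the following Python sketch estimates the overlap between two sets from short fixed-length sketches instead of the full sets. It is not the authors' implementation. It assumes each image has already been reduced to a set of integer "visual word" ids (for example, by quantising local descriptors), and the function names, the choice of BLAKE2b as the hash, and the sketch length of 256 are all illustrative assumptions.

```python
import hashlib


def hash_word(word_id, func_index):
    """Pseudo-random 64-bit hash of a word id.

    func_index selects one of many independent-looking hash functions
    by using it as the BLAKE2b salt (an illustrative choice).
    """
    digest = hashlib.blake2b(
        word_id.to_bytes(8, "little"),
        digest_size=8,
        salt=func_index.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_sketch(word_ids, num_hashes=256):
    """Keep, for each hash function, the smallest hash value over the set."""
    if not word_ids:
        raise ValueError("cannot sketch an empty set")
    return [min(hash_word(w, i) for w in word_ids) for i in range(num_hashes)]


def estimated_overlap(sketch_a, sketch_b):
    """Fraction of agreeing positions; estimates |A & B| / |A | B|."""
    matches = sum(x == y for x, y in zip(sketch_a, sketch_b))
    return matches / len(sketch_a)


def exact_overlap(a, b):
    """Exact set overlap |A & B| / |A | B|, for comparison."""
    return len(a & b) / len(a | b)


# Two hypothetical images sharing 200 of 400 distinct visual words.
image_a = set(range(0, 300))
image_b = set(range(100, 400))
estimate = estimated_overlap(minhash_sketch(image_a), minhash_sketch(image_b))
print(exact_overlap(image_a, image_b), estimate)
```

Only the sketch needs to be stored per image, and the estimate tightens as the number of hash functions grows, which is the trade-off between storage and accuracy that makes this kind of scheme attractive for large collections.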

The requirements for near-duplicate images vary according to the application, and we address two definitions of near duplicate: (i) being perceptually identical (e.g. up to noise, discretization effects, small photometric distortions, etc.); and (ii) being images of the same 3D scene (so allowing for viewpoint changes and partial occlusion). We define two shots to be near-duplicates if they share a large percentage of near-duplicate frames.
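The shot-level definition above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's procedure: the abstract does not say how the percentage is computed or what counts as "large", so the symmetric fraction, the 0.5 threshold, and the `frames_match` predicate (standing in for either frame-level definition) are all assumptions.

```python
def shared_frame_fraction(shot_a, shot_b, frames_match):
    """Fraction of frames, over both shots, that have a near-duplicate
    frame in the other shot. frames_match(f, g) is a caller-supplied
    frame-level near-duplicate test (hypothetical)."""
    if not shot_a or not shot_b:
        return 0.0
    matched_a = sum(any(frames_match(f, g) for g in shot_b) for f in shot_a)
    matched_b = sum(any(frames_match(g, f) for f in shot_a) for g in shot_b)
    return (matched_a + matched_b) / (len(shot_a) + len(shot_b))


def shots_are_near_duplicates(shot_a, shot_b, frames_match, threshold=0.5):
    """Declare two shots near-duplicates when the shared fraction reaches
    the threshold. The value 0.5 is an arbitrary placeholder."""
    return shared_frame_fraction(shot_a, shot_b, frames_match) >= threshold


# Toy example: frames are integers and "match" means equality.
same = lambda f, g: f == g
print(shared_frame_fraction([1, 2, 3, 4], [3, 4, 5, 6], same))
```

The brute-force comparison of every frame pair is only for clarity; at the scale the paper targets, the frame-level test would be answered by a hashed index rather than an exhaustive scan.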

We focus primarily on scalability to very large image and video databases, where fast query processing is necessary. Both methods are designed so that only a small amount of data need be stored for each image. In the case of near-duplicate shot detection, it is shown that a weak approximation to histogram matching, consuming substantially less storage, is sufficient for good results. We demonstrate our methods on the TRECVID 2006 data set, which contains approximately 165 hours of video (about 17.8M frames with 146K key frames), and also on feature films and pop videos.


