Effective retrieval of polyphonic audio with polyphonic symbolic queries
Source
International Multimedia Conference archive
Proceedings of the international workshop on multimedia information retrieval
Augsburg, Bavaria, Germany
POSTER SESSION: Multimedia retrieval and modeling
Pages: 105 - 114
Year of Publication: 2007
ISBN: 978-1-59593-778-0
Authors
Iman S. H. Suyoto, RMIT, Melbourne, Australia
Alexandra L. Uitdenbogerd, RMIT, Melbourne, Australia
Falk Scholer, RMIT, Melbourne, Australia
Sponsors
SIGMULTIMEDIA: ACM Special Interest Group on Multimedia
ACM: Association for Computing Machinery
SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1290082.1290100

ABSTRACT

Accurately finding audio recordings in response to symbolic queries is one of the key challenges in the field of music information retrieval. Pitch is one of the main features of music; in this paper we propose and evaluate approaches for using pitch information in polyphonic symbolic queries to retrieve full tracks of audio recordings. The audio data is first converted into symbolic data using an automated transcription process. This is a noisy process, adding up to three times as many notes to the transcription as are actually present. Nevertheless, recordings can be accurately retrieved by manually constructed queries (either in full or truncated) using the longest common subsequence algorithm (with a sliding window if the queries are truncated). Precision at 1 of about 80% was achieved, and around 85% of queries return correct answers in the top 10 from a collection of 1808 recordings. Truncated queries are as effective as untruncated queries for retrieving correct answers in the first rank position. Thus, the burden on users is reduced, as they only need to produce a small fraction of a song as a query.
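
To illustrate the matching step named in the abstract, the following is a minimal sketch of longest common subsequence (LCS) scoring with a sliding window for truncated queries. It is not the authors' implementation: the abstract does not specify how polyphonic notes are represented, how large the window is, or how scores are normalized. Here pitches are assumed to be flattened into a single sequence of MIDI note numbers, the default window is assumed to be twice the query length, and the score is assumed to be the LCS length divided by the query length. The names lcs_length, windowed_score, and rank are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def lcs_length(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the longest common subsequence of two pitch sequences.

    Standard dynamic programme; keeps only one previous row in memory.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def windowed_score(
    query: Sequence[int],
    track: Sequence[int],
    window: Optional[int] = None,
    hop: int = 1,
) -> float:
    """Best LCS score of the query against any window of the track.

    Assumptions (not from the paper): default window is twice the query
    length, and the score is normalised by the query length.
    """
    if not query:
        return 0.0
    if window is None:
        window = 2 * len(query)
    if len(track) <= window:
        # Track is short enough to compare in one piece.
        return lcs_length(query, track) / len(query)
    best = 0
    for start in range(0, len(track) - window + 1, hop):
        best = max(best, lcs_length(query, track[start:start + window]))
    return best / len(query)


def rank(query: Sequence[int], collection: Dict[str, Sequence[int]]) -> List[str]:
    """Return track names ordered from best to worst windowed LCS score."""
    return sorted(
        collection,
        key=lambda name: windowed_score(query, collection[name]),
        reverse=True,
    )


if __name__ == "__main__":
    # Toy data: the noisy "transcription" contains extra notes (61, 63).
    collection = {
        "unrelated": [1, 2, 3],
        "target": [60, 61, 62, 63, 64],
    }
    print(rank([60, 62, 64], collection))
```

Because LCS skips over elements that have no counterpart in the other sequence, spurious notes inserted by a noisy transcription lower the score less than they would under exact substring matching, which is consistent with the robustness the abstract reports.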


