ABSTRACT
Accurately finding audio recordings in response to symbolic queries is one of the key challenges in music information retrieval. Pitch is one of the main features of music; in this paper we propose and evaluate approaches for using the pitch information in polyphonic symbolic queries to retrieve full tracks of audio recordings. The audio data is first converted into symbolic data by an automated transcription process. This process is noisy: it can add up to three times as many notes to the transcription as are actually present. Nevertheless, recordings can be accurately retrieved with manually constructed queries, either in full or truncated, using the longest common subsequence algorithm (combined with a sliding window when the queries are truncated). We achieved a precision at 1 of about 80%, and around 85% of queries returned a correct answer in the top 10, on a collection of 1808 recordings. Truncated queries are as effective as untruncated queries at retrieving the correct answer in the first rank position. The burden on users is therefore reduced, as they need to produce only a small fraction of a song as a query.
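The matching step named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the representation of notes as MIDI pitch numbers, the window size, the `hop` parameter, and the function names `lcs_length`, `windowed_lcs_score`, and `rank_tracks` are all assumptions made for illustration, and the sketch ignores transposition and polyphonic note grouping.

```python
from typing import Dict, List, Sequence


def lcs_length(a: Sequence[int], b: Sequence[int]) -> int:
    """Length of the longest common subsequence of a and b.

    Standard dynamic programme, O(len(a) * len(b)) time, O(len(b)) space.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def windowed_lcs_score(query: Sequence[int], track: Sequence[int],
                       window: int, hop: int = 1) -> int:
    """Best LCS length between the query and any window of the track.

    `window` and `hop` are hypothetical tuning parameters; a window a few
    times longer than the query leaves room for spurious transcribed notes.
    """
    if len(track) <= window:
        return lcs_length(query, track)
    last = len(track) - window
    starts = list(range(0, last + 1, hop))
    if starts[-1] != last:
        starts.append(last)  # make sure the tail of the track is covered
    return max(lcs_length(query, track[s:s + window]) for s in starts)


def rank_tracks(query: Sequence[int], collection: Dict[str, List[int]],
                window: int) -> List[str]:
    """Track names sorted by descending score (ties broken by name)."""
    scores = {name: windowed_lcs_score(query, notes, window)
              for name, notes in collection.items()}
    return sorted(scores, key=lambda name: (-scores[name], name))


if __name__ == "__main__":
    query = [60, 62, 64, 65]  # made-up pitch sequence (MIDI numbers)
    collection = {
        # the query pitches interleaved with spurious notes
        "track_a": [60, 72, 62, 48, 64, 67, 65, 55],
        # an unrelated sequence
        "track_b": [61, 63, 66, 68, 70, 61, 63, 66],
    }
    print(rank_tracks(query, collection, window=3 * len(query)))
```

Because a subsequence match may skip over intervening elements, extra notes inserted by a noisy transcription do not lower the score on their own; the window only limits how far apart the matched notes may lie when the query covers a fragment of the track.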
INDEX TERMS
Primary Classification:
H. Information Systems > H.3 INFORMATION STORAGE AND RETRIEVAL > H.3.3 Information Search and Retrieval
Subjects: Query formulation
Additional Classification:
H. Information Systems > H.3 INFORMATION STORAGE AND RETRIEVAL > H.3.3 Information Search and Retrieval
Subjects: Search process; Retrieval models
H. Information Systems > H.5 INFORMATION INTERFACES AND PRESENTATION (I.7) > H.5.5 Sound and Music Computing
Subjects: Methodologies and techniques; Signal analysis, synthesis, and processing
General Terms: Algorithms, Experimentation, Measurement, Performance
Keywords: audio, dynamic programming, longest common subsequence, music information retrieval, polyphony, symbolic, transcription