A novel framework for efficient automated singer identification in large music databases
Source
ACM Transactions on Information Systems (TOIS)
Volume 27, Issue 3 (May 2009)
Article No. 18
Year of Publication: 2009
ISSN:1046-8188
Authors
Jialie Shen  Singapore Management University, Singapore
John Shepherd  The University of New South Wales, Sydney, Australia
Bin Cui  Peking University, Beijing, China
Kian-Lee Tan  National University of Singapore, Kent Ridge, Singapore
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 66,   Downloads (12 Months): 241,   Citation Count: 0
DOI: http://doi.acm.org/10.1145/1508850.1508856

ABSTRACT

Over the past decade, there has been explosive growth in the availability of multimedia data, particularly image, video, and music. Because of this, content-based music retrieval has attracted attention from the multimedia database and information retrieval communities. Content-based music retrieval requires us to be able to automatically identify particular characteristics of music data. One such characteristic, useful in a range of applications, is the identification of the singer in a musical piece. Unfortunately, existing approaches to this problem suffer from either low accuracy or poor scalability. In this article, we propose a novel scheme, called Hybrid Singer Identifier (HSI), for efficient automated singer recognition. HSI uses multiple low-level features extracted from both vocal and nonvocal music segments to enhance the identification process; it achieves this via a hybrid architecture that builds profiles of individual singer characteristics based on statistical mixture models. An extensive experimental study on a large music database demonstrates the superiority of our method over state-of-the-art approaches in terms of effectiveness, efficiency, scalability, and robustness.


