ABSTRACT
Perceptual features are motivated by human perception of sound. This paper studies several perceptually motivated features, namely harmonic content, vibrato, and timbre, for detecting singing-voice segments in a song. The singing formant and the attack-decay envelope of the sound are also studied for acoustic feature formulation. Cepstral coefficients, which reflect timbre characteristics, are formulated by combining information from the harmonic content, vibrato, singing formant, and attack-decay envelope. Bandpass filters spaced according to the octave frequency scale are used to extract the vibrato and harmonic information. Experiments are conducted on a database of 84 popular songs taken from commercially available CD recordings. The experiments show that the proposed feature formulation methods are effective.
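To illustrate what spacing bandpass filters on an octave frequency scale can look like, the following is a minimal sketch that computes band edges in which each band spans one octave (its upper edge is twice its lower edge). The function name `octave_band_edges` and the 110 Hz to 7040 Hz range are assumptions chosen for illustration; the abstract does not state the actual filter design or frequency range used in the paper.

```python
def octave_band_edges(f_low, f_high):
    """Return (lower, upper) edges in Hz for bands that each span one octave.

    Hypothetical helper: the frequency range and the one-band-per-octave
    layout are illustrative assumptions, not the paper's filter bank.
    """
    if f_low <= 0 or f_high <= f_low:
        raise ValueError("need 0 < f_low < f_high")
    bands = []
    lower = float(f_low)
    while lower < f_high:
        # Doubling the lower edge moves up exactly one octave;
        # the last band is clipped at f_high.
        upper = min(lower * 2.0, float(f_high))
        bands.append((lower, upper))
        lower = upper
    return bands


# Assumed range: 110 Hz (A2) up to 7040 Hz (A8), six octaves.
bands = octave_band_edges(110.0, 7040.0)
for lower, upper in bands:
    print(f"{lower:7.1f} Hz to {upper:7.1f} Hz")
```

Because the edges double at each step, low-frequency bands are narrow and high-frequency bands are wide, which mirrors how musical pitch is organized. A practical filter bank would likely subdivide each octave further; that choice is left open here.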