|
|