Fuzzy integral based information fusion for classification of highly confusable non-speech sounds. (English) Zbl 1140.68482

Summary: Acoustic event classification may help to describe acoustic scenes and contribute to improving the robustness of speech technologies. In this work, fusion of different information sources with the Fuzzy Integral (FI), and the associated Fuzzy Measure (FM), is applied to the problem of classifying a small set of highly confusable human non-speech sounds. Because the FI is a meaningful formalism for combining classifier outputs that can capture interactions among the various sources of information, it performs significantly better in our experiments than any single classifier entering the FI fusion module. Moreover, the FI decision-level fusion approach shows results comparable to the high-performing SVM feature-level fusion, and thus seems to be a good choice when feature-level fusion is not an option. We have also observed that the importance and the degree of interaction among the various feature types given by the FM can be used for feature selection and give valuable insight into the problem.
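
As an illustration of the decision-level fusion described in the summary, the following is a minimal sketch of a discrete Choquet integral, one common form of fuzzy integral. The summary does not state which form of FI the paper uses, so the choice of the Choquet integral here is an assumption, as are the function name `choquet_integral`, the source labels `"a"` and `"b"`, and all numeric values, which are purely illustrative and not taken from the paper.

```python
def choquet_integral(scores, measure):
    """Fuse per-source scores with a discrete Choquet integral.

    scores:  dict mapping source name -> classifier score in [0, 1]
    measure: dict mapping frozenset of source names -> fuzzy measure value,
             with measure[all sources] == 1.0 (hypothetical input format)
    """
    # Sort the sources by ascending score.
    order = sorted(scores, key=scores.get)
    total = 0.0
    previous = 0.0
    for i, source in enumerate(order):
        # Coalition of all sources whose score is at least the current one.
        coalition = frozenset(order[i:])
        total += (scores[source] - previous) * measure[coalition]
        previous = scores[source]
    return total


# Illustrative, made-up values for two information sources.
scores = {"a": 0.8, "b": 0.4}

# A non-additive measure: mu({a}) + mu({b}) < mu({a, b}),
# which models a positive interaction between the two sources.
measure = {
    frozenset({"a"}): 0.3,
    frozenset({"b"}): 0.5,
    frozenset({"a", "b"}): 1.0,
}
fused = choquet_integral(scores, measure)

# With an additive measure the Choquet integral reduces to a weighted mean.
additive = {
    frozenset({"a"}): 0.3,
    frozenset({"b"}): 0.7,
    frozenset({"a", "b"}): 1.0,
}
fused_additive = choquet_integral(scores, additive)
```

In a classification setting, one such fused score would typically be computed per class and the class with the highest fused score selected; whether the paper follows exactly this scheme is not stated in the summary.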

MSC:

68T10 Pattern recognition, speech recognition

References:

[1] Kuncheva, L., ‘Fuzzy’ vs ‘Non-fuzzy’ in combining classifiers designed by boosting, IEEE Trans. Fuzzy Systems, 11, 6, 729-741 (2003)
[2] M. Sugeno, Theory of fuzzy integrals and its applications, Ph.D. Thesis, Tokyo Institute of Technology, 1974.
[3] Grabisch, M., Fuzzy integral in multi-criteria decision-making, Fuzzy Sets Systems, 69, 279-298 (1995) · Zbl 0845.90001
[4] S. Chang, S. Greenberg, Syllable-proximity evaluation in automatic speech recognition using fuzzy measures and a fuzzy integral, in: Proceedings of the 12th IEEE Fuzzy Systems Conference, 2003, pp. 828-833.
[5] M. Grabisch, A new algorithm for identifying fuzzy measures and its application to pattern recognition, in: Proceedings of 4th IEEE International Conference on Fuzzy Systems, Yokohama, Japan, 1995, pp. 145-150.
[6] Y. Wu, E. Chang, K. Chang, J. Smith, Optimal multimodal fusion for multimedia data analysis, in: Proceedings of ACM International Conference on Multimedia, New York, 2004, pp. 572-579.
[7] Temko, A.; Nadeu, C., Classification of acoustic events using SVM-based clustering schemes, Pattern Recogn., 39, 4, 682-694 (2006) · Zbl 1122.68507
[8] Schölkopf, B.; Smola, A., Learning with Kernels (2002), MIT Press: Cambridge, MA
[9] J. Weston, J. Mukherjee, O. Chapelle, M. Pontil, T. Poggio, V. Vapnik, Feature selection for SVMs, in: Proceedings of NIPS, 2000.
[10] Kuncheva, L., Combining Pattern Classifiers (2004), John Wiley & Sons, Inc · Zbl 1066.68114
[11] Marichal, J.-L., Behavioral analysis of aggregation in multicriteria decision aid, preferences and decisions under incomplete knowledge, (Studies in Fuzziness and Soft Computing, vol. 51 (2000), Physica Verlag: Heidelberg), 153-178 · Zbl 1002.91013
[12] M. Grabisch, The Choquet integral as a linear interpolator, in: Tenth International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2004), Perugia (Italy), 2004, pp. 373-378.
[13] Marichal, J.-L., Entropy of discrete Choquet capacities, Eur. J. Oper. Res., 137, 3, 612-624 (2002) · Zbl 1035.94006
[14] I. Kojadinovic, J.-L. Marichal, M. Roubens, An axiomatic approach to the definition of the entropy of a discrete Choquet capacity, in: Ninth International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2002), Annecy (France), 2002, pp. 763-768.
[15] Noll, A., Cepstrum pitch determination, J. Acoust. Soc. Am., 41, 2, 293-309 (1967)
[16] Evaluation Packages for the First CHIL Evaluation Campaign, CHIL project Deliverable D7.4, downloadable from <http://chil.server.de/servlet/is/2712/>
[17] A. Temko, C. Nadeu, J.-I. Biel, Acoustic event detection: SVM-based system and evaluation setup in CLEAR’07, CLEAR’07 Evaluation Campaign and Workshop, Baltimore, MD, USA (to appear in Multimodal Technologies for Perception of Humans, LNCS, Springer).
[18] Hsu, C.; Lin, C., A comparison of methods for multi-class support vector machines, IEEE Trans. Neural Networks, 13, 415-425 (2002)
[19] Mikenina, L.; Zimmermann, H., Improved feature selection and classification by the 2-additive fuzzy measure, Fuzzy Sets Systems, 107, 2, 197-218 (1999) · Zbl 0958.68147
[20] Nadeu, C.; Macho, D.; Hernando, J., Frequency and time filtering of filter-bank energies for robust HMM speech recognition, Speech Commun., 34, 93-114 (2001) · Zbl 1005.68772