Speech recognition using fractals. (English) Zbl 0993.68096

Summary: The use of fractal theory for speech recognition is investigated. First, the possibility of using iterated function systems for speech recognition is discussed. Next, the use of fractal dimension for phoneme recognition and word segmentation is presented. A phoneme recognition method is presented based on fractal theory. Fractal dimension and Iterated Function System (IFS) parameters are investigated for word segmentation. The IFS matrices and the eigenvalues of the covariance matrix are proposed for phoneme recognition.
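To make the notion of a waveform's fractal dimension concrete, here is a minimal sketch of one way to estimate it for a sampled signal. This record does not describe the estimator used in the paper (reference [1] below names the authors' own amplitude scale method), so the sketch substitutes Higuchi's curve-length method as a stand-in. The function name `higuchi_fd` and the parameter `k_max` are illustrative assumptions, not taken from the paper.

```python
import math
import random


def higuchi_fd(signal, k_max=8):
    """Estimate the fractal dimension of a 1-D signal with Higuchi's method.

    Assumption: this is a generic estimator used for illustration, not the
    amplitude scale method cited in the reference list.
    """
    n = len(signal)
    if n < k_max + 2:
        raise ValueError("signal is too short for the chosen k_max")

    log_inv_k = []
    log_length = []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):
            # Number of steps of size k that fit when starting at offset m.
            steps = (n - 1 - m) // k
            if steps < 1:
                continue
            dist = sum(
                abs(signal[m + i * k] - signal[m + (i - 1) * k])
                for i in range(1, steps + 1)
            )
            # Normalised curve length for this offset and step size.
            lengths.append(dist * (n - 1) / (steps * k) / k)
        mean_length = sum(lengths) / len(lengths)
        if mean_length == 0:
            raise ValueError("constant signal: curve length is zero")
        log_inv_k.append(math.log(1.0 / k))
        log_length.append(math.log(mean_length))

    # Least-squares slope of log L(k) against log(1/k) is the estimate.
    mean_x = sum(log_inv_k) / len(log_inv_k)
    mean_y = sum(log_length) / len(log_length)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_inv_k, log_length))
    den = sum((x - mean_x) ** 2 for x in log_inv_k)
    return num / den


# A smooth ramp should give a value close to 1; white noise is much rougher.
ramp = [float(i) for i in range(200)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(2000)]
fd_ramp = higuchi_fd(ramp)
fd_noise = higuchi_fd(noise)
```

An estimate like this could be computed over short sliding frames of a speech signal, so that changes in its value mark candidate segment boundaries. Whether the paper proceeds this way is not stated in this record.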

MSC:

68T10 Pattern recognition, speech recognition

Software:

Mathematica

References:

[1] Senevirathne, T. R.; Bohez, E. L.J.; Van Winden, J. A., Amplitude scale method: new and efficient approach to measure the fractal dimension of speech wave forms, IEE Electron. Lett., 28, 4, 420-422 (1992)
[2] Bohez, E. L.J.; Senevirathne, T. R.; Van Winden, J. A., Fractal dimension and iterated function systems for speech recognition, IEE Electron. Lett., 28, 15, 1335-1382 (1992)
[3] Mandelbrot, B., The fractal geometry of nature (1982), Freeman: Freeman San Francisco, CA · Zbl 0504.28001
[4] Barnsley, M., Fractals Everywhere (1988), Academic Press: Academic Press New York · Zbl 0691.58001
[5] Hunt, F.; Sullivan, F., Efficient Algorithm for Computing Fractal Dimension, (Mayer-Kress, G., Dimension and Entropies in Chaotic Systems (1986), Springer: Springer Berlin), 74-81
[6] Wong, P.; Lin, J., Studying fractal geometry on submicron length scales by small angle scattering, Math. Geol., 20, 6, 655-665 (1988)
[7] Pentland, A. P., Fractal based description of natural scenes, IEEE Trans. Pattern Anal. Mach. Intell. PAMI, 6, 661-670 (1984)
[8] R. Creutzburg, E. Ivanov, Fast Algorithm for Computing Fractal Dimension of Image Segments, Lecture Notes in Computer Science, 1989, 43-51.
[9] Lamel, L. F.; Rabiner, L. R.; Rosenberg, A. E.; Wilpon, J. G., An improved end point detector for isolated word recognition, IEEE Trans. Acoustics, Speech Signal Process. ASSP, 29, 4, 777-785 (1981)
[10] O’Shaughnessy, D., Speech Communication (1987), Addison-Wesley: Addison-Wesley Reading, MA
[11] Mermelstein, P., Automatic segmentation of speech into syllabic units, J. Acoust. Soc. Amer., 58, 880-883 (1975)
[12] Rabiner, L. R.; Rosenberg, A. E.; Levinson, S. E., Considerations in dynamic time warping algorithm for discrete word recognition, IEEE Trans. Acoustics, Speech Signal Process. ASSP, 26, 6, 575-582 (1978) · Zbl 0413.68092
[13] Myers, C. S.; Rabiner, L. R., Connected digit recognition using a level building DTW algorithm, IEEE Trans. Acoustics, Speech Signal Process. ASSP, 29, 2, 351-363 (1981) · Zbl 0521.68093
[14] D. Vanvinckenroye, S. Willems, De Fractale Dimensie Van Signalen een Middel voor het Onderscheiden van Spraak, Muziek en Ruis?, Master Thesis, Katholieke Universiteit Leuven, Belgium, 1990.
[15] Rabiner, L. R.; Levinson, S. E.; Rosenberg, A. E.; Wilpon, J. G., Speaker independent recognition of isolated words using clustering techniques, IEEE Trans. Acoustics, Speech Signal Process. ASSP, 27, 4, 336-349 (1979) · Zbl 0413.68090
[16] Rabiner, L. R.; Juang, B. H., Fundamentals of Speech Recognition (1993), Prentice Hall: Prentice Hall NJ · Zbl 0762.62036
[17] Stephen Wolfram, Mathematica, A System for Doing Mathematics by Computer, 2nd Edition, Addison-Wesley, Reading, MA, 1991.
[18] Kaufman, L.; Rousseeuw, P. J., Finding Groups in Data (1990), Wiley: Wiley New York · Zbl 1345.62009
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.