Wave-shape function analysis. When cepstrum meets time-frequency analysis. (English) Zbl 1394.42029

A signal with time-varying, approximately periodic behavior is modeled as \[ f(t) = A(t) s(\phi(t)). \] Here \(A(t)\) is the amplitude function, \(s(t)\) is the wave-shape function (periodic, but not necessarily sinusoidal), and \(\phi(t)\) is the phase function, whose derivative \(\phi'(t)\) represents the instantaneous frequency. Examples of such signals include human respiration and ECG signals.
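As a small illustration of this model, the following Python sketch synthesizes a signal of the form \(f(t) = A(t) s(\phi(t))\). The particular amplitude, phase and wave shape, the convention that \(s\) is 1-periodic, and the names `wave_shape` and `synthesize` are assumptions chosen for the example; they are not taken from the paper.

```python
import numpy as np

def wave_shape(theta):
    """Hypothetical 1-periodic, non-sinusoidal wave shape s(theta)."""
    return (np.cos(2 * np.pi * theta)
            + 0.5 * np.cos(4 * np.pi * theta + 0.3)
            + 0.25 * np.cos(6 * np.pi * theta))

def synthesize(duration=10.0, fs=100.0):
    """Return t, A(t), phi(t), phi'(t) and f(t) = A(t) * s(phi(t))."""
    t = np.arange(int(duration * fs)) / fs
    # Assumed slowly varying, positive amplitude A(t).
    amplitude = 1.0 + 0.2 * np.cos(2 * np.pi * 0.05 * t)
    # Assumed phase phi(t); its derivative is the instantaneous frequency.
    phase = 1.0 * t + 0.02 * t ** 2
    inst_freq = 1.0 + 0.04 * t
    signal = amplitude * wave_shape(phase)
    return t, amplitude, phase, inst_freq, signal

t, amplitude, phase, inst_freq, signal = synthesize()
```

Here the instantaneous frequency drifts from 1 Hz to about 1.4 Hz, while the oscillation keeps the same non-sinusoidal shape throughout.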
The algorithm proposed in this paper aims at a time-frequency analysis that estimates the amplitude and phase locally, on a small interval, independently of the wave shape.
The \(\gamma\)-generalized cepstrum or root cepstrum of a signal \(f(t)\) is defined as the inverse Fourier transform of the \(\gamma\)-th power of the modulus of the Fourier transform of \(f\): \[ \tilde f_\gamma(q) = \int \left| \hat f(\xi) \right|^\gamma e^{2\pi i q \xi} \,d\xi. \] Replacing the Fourier transform with the short-time Fourier transform yields a short-time cepstral transform. This forms the basis of the proposed algorithm DSST (de-shape synchrosqueezing transform).
The algorithm is analyzed and illustrated with several examples, using both synthetic data and measured signals.
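The definition of the \(\gamma\)-generalized cepstrum above can be made concrete with a discrete sketch. The Python code below is only an FFT-based analogue for real-valued, sampled signals, together with a simple windowed version. The function names, the window and the hop size are assumptions made for illustration, and the code is not the DSST algorithm of the paper.

```python
import numpy as np

def generalized_cepstrum(frame, gamma):
    """Discrete analogue of the gamma-generalized cepstrum of a real frame:
    inverse FFT of |FFT(frame)| ** gamma."""
    spectrum = np.fft.fft(frame)
    # For real input, |spectrum| is even, so the inverse FFT is real
    # up to rounding error; the tiny imaginary part is dropped.
    return np.fft.ifft(np.abs(spectrum) ** gamma).real

def short_time_cepstrum(signal, window, hop, gamma):
    """Apply generalized_cepstrum to windowed frames (one row per frame)."""
    n = len(window)
    starts = range(0, len(signal) - n + 1, hop)
    return np.array([generalized_cepstrum(signal[i:i + n] * window, gamma)
                     for i in starts])

# Example: a non-sinusoidal signal with a period of 50 samples.
k = np.arange(1000)
x = np.cos(2 * np.pi * k / 50) + 0.5 * np.cos(4 * np.pi * k / 50)
cepstrogram = short_time_cepstrum(x, np.hanning(200), hop=100, gamma=0.5)
```

For \(\gamma = 2\) the discrete version reduces to the circular autocorrelation of the frame. For a frame containing a whole number of periods, the cepstrum repeats at quefrencies that are multiples of the period, whatever the wave shape.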

MSC:

42C20 Other transformations of harmonic type
62-07 Data analysis (statistics) (MSC2010)
