Investigating self-supervised, weakly supervised and fully supervised training approaches for multi-domain automatic speech recognition: a study on Bangladeshi Bangla
Authors:
Ahnaf Mozib Samin,
M. Humayon Kobir,
Md. Mushtaq Shahriyar Rafee,
M. Firoz Ahmed,
Mehedi Hasan,
Partha Ghosh,
Shafkat Kibria,
M. Shahidur Rahman
Abstract:
Despite major improvements in neural automatic speech recognition (ASR), ASR systems still lack robustness and generalize poorly under domain shift. This is mainly because principal corpus design criteria are often not adequately identified and examined when ASR datasets are compiled. In this study, we investigate the robustness of state-of-the-art transfer learning approaches such as self-supervised wav2vec 2.0 and weakly supervised Whisper, as well as fully supervised convolutional neural networks (CNNs), for multi-domain ASR. We also demonstrate the significance of domain selection when building a corpus by assessing these models on a novel multi-domain Bangladeshi Bangla ASR evaluation benchmark, BanSpeech, which contains approximately 6.52 hours of human-annotated speech and 8085 utterances from 13 distinct domains. SUBAK.KO, a mostly read-speech corpus for the morphologically rich language Bangla, is used to train the ASR systems. Experimental evaluation reveals that self-supervised cross-lingual pre-training is a better strategy than weak supervision or full supervision for the multi-domain ASR task. Moreover, the ASR models trained on SUBAK.KO have difficulty recognizing speech from domains with mostly spontaneous speech. BanSpeech will be made publicly available to meet the need for a challenging evaluation benchmark for Bangla ASR.
Submitted 10 May, 2023; v1 submitted 23 October, 2022;
originally announced October 2022.
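The per-domain assessment described in the abstract above can be illustrated with a small sketch. The abstract does not name its evaluation metric; word error rate (WER), a common choice for ASR, is assumed here. The function names, domain labels, and sample transcripts are hypothetical and are not taken from BanSpeech.

```python
from collections import defaultdict


def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_domain(samples):
    """Aggregate WER per domain (assumed metric, not confirmed by the abstract).

    samples: iterable of (domain, reference, hypothesis) strings.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for domain, ref, hyp in samples:
        ref_tokens, hyp_tokens = ref.split(), hyp.split()
        errors[domain] += edit_distance(ref_tokens, hyp_tokens)
        words[domain] += len(ref_tokens)
    # Skip domains with no reference words to avoid division by zero.
    return {d: errors[d] / words[d] for d in words if words[d] > 0}


# Hypothetical toy transcripts, for illustration only.
samples = [
    ("news", "a b c d", "a b c d"),
    ("talk", "a b c d", "a x c"),
]
scores = wer_by_domain(samples)
```

Reporting one score per domain, rather than a single pooled score, is what makes a gap between read and spontaneous speech visible.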
Global city densities: re-examining urban scaling theory
Authors:
Joseph R. Burger,
Jordan G. Okie,
Ian Hatton,
Vanessa P. Weinberger,
Munik Shrestha,
Kyra J. Liedtke,
Tam Be,
Austin R. Cruz,
Xiao Feng,
Cesar Hinojo-Hinojo,
Abu S. M. G. Kibria,
Kacey C. Ernst,
Brian J. Enquist
Abstract:
Understanding scaling relations of social and environmental attributes of urban systems is necessary for effectively managing cities. Urban scaling theory (UST) has assumed that population density scales positively with city size. We present a new global analysis using a publicly available database of 933 cities from 38 countries. Our results showed that 18 of 38 countries (47%) supported increasing density scaling (pop ~ area) with exponents ~5/6, as UST predicts. In contrast, 17 of 38 countries (~45%) exhibited density scalings statistically indistinguishable from constant population densities across cities of varying sizes. These results were generally consistent across four decades, from 1975 to 2015. Importantly, density varies by an order of magnitude between regions and countries and decreases in more developed economies. Our results (i) point to how economic and regional differences may affect the scaling of density with city size and (ii) show how understanding country- and region-specific strategies could inform effective management of urban systems for biodiversity, public health, conservation and resiliency from local to global scales.
Submitted 14 October, 2022;
originally announced October 2022.
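The density scaling described in the abstract above can be illustrated with a log-log fit. This is a minimal sketch, not the authors' procedure. It assumes the exponent relates area to population as area ~ pop^beta, the reading under which an exponent of about 5/6 implies density increasing with city size; the abstract's notation does not make the direction explicit. The data are synthetic and the function name is hypothetical.

```python
import math


def fit_scaling_exponent(x, y):
    """Ordinary least-squares fit of log(y) = intercept + slope * log(x)."""
    lx = [math.log(v) for v in x]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic cities (illustration only): area = 0.01 * population^(5/6).
populations = [1e4, 1e5, 1e6, 1e7]
areas = [0.01 * p ** (5 / 6) for p in populations]

beta, _ = fit_scaling_exponent(populations, areas)

# Density = population / area scales as population^(1 - beta).
# beta < 1 means density rises with city size; beta = 1 means constant density.
density_exponent = 1 - beta
```

Under this assumed reading, a fitted beta whose confidence interval includes 1 would correspond to the constant-density case; a real analysis would also need standard errors, which this sketch omits.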