ISCA Archive - Language identification using language-dependent phonemes and language-independent speech units

This paper reports results from ongoing research on language identification (LID) performed on three languages: American English, German and Spanish. The speech material used is from the Oregon Graduate Institute Spontaneous Telephone Speech Corpus, OGI_TS.

The baseline LID-system consists of three parallel phoneme recognisers, each of which is followed by three language-modelling modules, each characterising the bigram probabilities. The phoneme models used are derived on the basis of the combined speech corpus comprising the three languages. The phonemes are handled differently in the two experiments performed. In the first experiment they are trained and tested language-specifically. In the second, they are separated into a number of groups, one of which contains those language-independent speech units which are similar enough to be equated across the training languages, while the remaining groups contain the non-combinable language-dependent phonemes for each of the languages. A data-driven technique has been devised to separate the speech sounds contained within the training corpus into these groups. In order to prepare for an optimal separation between the input classes, a linear discriminant analysis is performed on the training speech material.
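As an illustration of the bigram language-modelling step only, the following minimal sketch scores a decoded phoneme sequence against one bigram model per language and picks the best-scoring language. It is not the system described in the paper: the phoneme symbols, the toy training sequences, the add-alpha smoothing and the function names (`train_bigram`, `score`, `identify`) are all assumptions made for the example, and the sketch uses a single phoneme sequence rather than the outputs of three parallel recognisers.

```python
import math
from collections import Counter


def train_bigram(sequences, vocab, alpha=1.0):
    """Return a function giving add-alpha smoothed bigram log-probabilities."""
    pair_counts = Counter()
    context_counts = Counter()
    for seq in sequences:
        padded = ["<s>"] + list(seq)  # "<s>" marks the sequence start
        for prev, cur in zip(padded, padded[1:]):
            pair_counts[(prev, cur)] += 1
            context_counts[prev] += 1
    v = len(vocab)

    def logprob(prev, cur):
        num = pair_counts[(prev, cur)] + alpha
        den = context_counts[prev] + alpha * v
        return math.log(num / den)

    return logprob


def score(seq, logprob):
    """Sum of bigram log-probabilities over one phoneme sequence."""
    padded = ["<s>"] + list(seq)
    return sum(logprob(p, c) for p, c in zip(padded, padded[1:]))


def identify(seq, models):
    """Pick the language whose bigram model scores the sequence highest."""
    return max(models, key=lambda lang: score(seq, models[lang]))


# Hypothetical toy data: symbols and sequences are invented for illustration.
training = {
    "english": [["dh", "ax", "k", "ae", "t"], ["dh", "ax", "d", "ao", "g"]],
    "german": [["d", "e", "r", "h", "u", "n", "t"], ["d", "i", "k", "a", "ts", "e"]],
    "spanish": [["e", "l", "p", "e", "rr", "o"], ["e", "l", "g", "a", "t", "o"]],
}
vocab = {p for seqs in training.values() for s in seqs for p in s}
models = {lang: train_bigram(seqs, vocab) for lang, seqs in training.items()}

guess = identify(["dh", "ax", "k"], models)
```

In this sketch, a sequence whose phoneme transitions were seen in only one language's training data receives a higher log-probability under that language's model, which is the basic idea behind using bigram statistics as a language cue.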

Results from a number of experiments show that average language-identification scores of close to 90% can be retained by the LID-system presented here, even for a high number of language-independent speech units.

Language identification using language-dependent phonemes and language-independent speech units

Paul Dalsgaard, Ove Andersen, Hanne Hesselager, Bojan Petek