The primary aim of this research is to build a dialect-to-dialect translation model to construct contextual conversions between SAE and AAVE. Some parts of the...
Aug 28, 2024. Both professional translators and a large language model, ChatGPT, will be used to create parallel corpora containing AAVE and SAE. This short...
Aug 31, 2024. This paper describes the Arabic broadcast transcription system fielded by IBM in the GALE Phase 2.5 machine translation evaluation.
We discuss challenges in working with low-resource languages and propose strategies to cope with data scarcity in low-resource machine translation (MT).
AAVE Corpus Generation and Low-Resource Dialect Machine Translation. Conference Paper. Aug 2024. Eric Graves · Shreyas Aswar · Rujuta Desai · Ted Hall.
We survey past research in NLP for dialects in terms of datasets, and approaches. We describe a wide range of NLP tasks in terms of two categories.
Pronouns are critical for tasks like machine translation and summarization, which depend on coreference resolution (Sukthanker et al., 2020). Our pronoun...
Jul 7, 2024. We create DialectBench, a large-scale benchmark covering 40 language clusters with 281 varieties, spanning 10 NLP tasks.
Useful for through two applications - automatic readability assessment and automatic text simplification. The corpus consists of 189 texts, each in three...
This survey delves into an important attribute of these datasets: the dialect of a language, and describes a wide range of NLP tasks in terms of two categories.