Showing 1–8 of 8 results for author: Titeux, H

  1. arXiv:2306.01506  [pdf, other]

    cs.CL eess.AS stat.ML

    BabySLM: language-acquisition-friendly benchmark of self-supervised spoken language models

    Authors: Marvin Lavechin, Yaya Sy, Hadrien Titeux, María Andrea Cruz Blandón, Okko Räsänen, Hervé Bredin, Emmanuel Dupoux, Alejandrina Cristia

    Abstract: Self-supervised techniques for learning speech representations have been shown to develop linguistic competence from exposure to speech without the need for human labels. In order to fully realize the potential of these approaches and further our understanding of how infants learn language, simulations must closely emulate real-life situations by training on developmentally plausible corpora and b…

    Submitted 8 June, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: Proceedings of Interspeech 2023

  2. arXiv:2302.12057  [pdf, other]

    cs.CL cs.SD eess.AS

    ProsAudit, a prosodic benchmark for self-supervised speech models

    Authors: Maureen de Seyssel, Marvin Lavechin, Hadrien Titeux, Arthur Thomas, Gwendal Virlet, Andrea Santos Revilla, Guillaume Wisniewski, Bogdan Ludusan, Emmanuel Dupoux

    Abstract: We present ProsAudit, a benchmark in English to assess structural prosodic knowledge in self-supervised learning (SSL) speech models. It consists of two subtasks, their corresponding metrics, and an evaluation dataset. In the protosyntax task, the model must correctly identify strong versus weak prosodic boundaries. In the lexical task, the model needs to correctly distinguish between pauses inser…

    Submitted 1 June, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: Accepted at Interspeech 2023. 4 pages + references, 1 figure

  3. arXiv:2210.13248  [pdf, other]

    eess.AS cs.SD

    Brouhaha: multi-task training for voice activity detection, speech-to-noise ratio, and C50 room acoustics estimation

    Authors: Marvin Lavechin, Marianne Métais, Hadrien Titeux, Alodie Boissonnet, Jade Copet, Morgane Rivière, Elika Bergelson, Alejandrina Cristia, Emmanuel Dupoux, Hervé Bredin

    Abstract: Most automatic speech processing systems register degraded performance when applied to noisy or reverberant speech. But how can one tell whether speech is noisy or reverberant? We propose Brouhaha, a neural network jointly trained to extract speech/non-speech segments, speech-to-noise ratios, and C50 room acoustics from single-channel recordings. Brouhaha is trained using a data-driven approach in…

    Submitted 25 May, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

  4. arXiv:2010.16131  [pdf, other]

    eess.AS cs.CL

    Comparison of Speaker Role Recognition and Speaker Enrollment Protocol for conversational Clinical Interviews

    Authors: Rachid Riad, Hadrien Titeux, Laurie Lemoine, Justine Montillot, Agnes Sliwinski, Jennifer Hamet Bagnou, Xuan Nga Cao, Anne-Catherine Bachoud-Lévi, Emmanuel Dupoux

    Abstract: Conversations between a clinician and a patient, in natural conditions, are valuable sources of information for medical follow-up. The automatic analysis of these dialogues could help extract new language markers and speed up the clinicians' reports. Yet, it is not clear which speech processing pipeline performs best at detecting and identifying the speaker turns, especially for individuals wit…

    Submitted 5 November, 2020; v1 submitted 30 October, 2020; originally announced October 2020.

    Comments: Submitted to ICASSP 2021. 1 page of supplementary material appears only in the arXiv version

  5. arXiv:2006.05365  [pdf, other]

    eess.AS cs.CL cs.SD

    Vocal markers from sustained phonation in Huntington's Disease

    Authors: Rachid Riad, Hadrien Titeux, Laurie Lemoine, Justine Montillot, Jennifer Hamet Bagnou, Xuan Nga Cao, Emmanuel Dupoux, Anne-Catherine Bachoud-Lévi

    Abstract: Disease-modifying treatments are currently assessed in neurodegenerative diseases. Huntington's Disease represents a unique opportunity to design automatic sub-clinical markers, even in premanifest gene carriers. We investigated phonatory impairments as potential clinical markers and propose them for both diagnosis and gene carriers follow-up. We used two sets of features: Phonatory features and M…

    Submitted 31 July, 2020; v1 submitted 9 June, 2020; originally announced June 2020.

    Comments: To appear at INTERSPEECH 2020. 1 page of supplementary material appears only in the arXiv version. Code to replicate: https://github.com/bootphon/sustained-phonation-features

  6. arXiv:2003.01472  [pdf, other]

    cs.CL

    Seshat: A tool for managing and verifying annotation campaigns of audio data

    Authors: Hadrien Titeux, Rachid Riad, Xuan-Nga Cao, Nicolas Hamilakis, Kris Madden, Alejandrina Cristia, Anne-Catherine Bachoud-Lévi, Emmanuel Dupoux

    Abstract: We introduce Seshat, a new, simple and open-source software to efficiently manage annotations of speech corpora. The Seshat software allows users to easily customise and manage annotations of large audio corpora while ensuring compliance with the formatting and naming conventions of the annotated output files. In addition, it includes procedures for checking the content of annotations following sp…

    Submitted 17 February, 2021; v1 submitted 3 March, 2020; originally announced March 2020.

    Journal ref: LREC 2020 - 12th Language Resources and Evaluation Conference, May 2020, Marseille, France. pp.6976-6982

  7. arXiv:1912.00938  [pdf]

    eess.AS cs.SD

    Speaker detection in the wild: Lessons learned from JSALT 2019

    Authors: Paola Garcia, Jesus Villalba, Herve Bredin, Jun Du, Diego Castan, Alejandrina Cristia, Latane Bullock, Ling Guo, Koji Okabe, Phani Sankar Nidadavolu, Saurabh Kataria, Sizhu Chen, Leo Galmant, Marvin Lavechin, Lei Sun, Marie-Philippe Gill, Bar Ben-Yair, Sajjad Abdoli, Xin Wang, Wassim Bouaziz, Hadrien Titeux, Emmanuel Dupoux, Kong Aik Lee, Najim Dehak

    Abstract: This paper presents the problems and solutions addressed at the JSALT workshop when using a single microphone for speaker detection in adverse scenarios. The main focus was to tackle a wide range of conditions that go from meetings to wild speech. We describe the research threads we explored and a set of modules that was successful for these scenarios. The ultimate goal was to explore speaker dete…

    Submitted 2 December, 2019; originally announced December 2019.

    Comments: Submitted to ICASSP 2020

  8. arXiv:1911.01255  [pdf, other]

    eess.AS cs.SD

    pyannote.audio: neural building blocks for speaker diarization

    Authors: Hervé Bredin, Ruiqing Yin, Juan Manuel Coria, Gregory Gelly, Pavel Korshunov, Marvin Lavechin, Diego Fustes, Hadrien Titeux, Wassim Bouaziz, Marie-Philippe Gill

    Abstract: We introduce pyannote.audio, an open-source toolkit written in Python for speaker diarization. Based on PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines. pyannote.audio also comes with pre-trained models covering a wide range of domains for voice activity detection,…

    Submitted 4 November, 2019; originally announced November 2019.

    Comments: Submitted to ICASSP 2020