Showing 1–50 of 211 results for author: Yoshioka, T

  1. arXiv:2410.11367  [pdf, ps, other]

    physics.acc-ph hep-ex

    Acceleration of positive muons by a radio-frequency cavity

    Authors: S. Aritome, K. Futatsukawa, H. Hara, K. Hayasaka, Y. Ibaraki, T. Ichikawa, T. Iijima, H. Iinuma, Y. Ikedo, Y. Imai, K. Inami, K. Ishida, S. Kamal, S. Kamioka, N. Kawamura, M. Kimura, A. Koda, S. Koji, K. Kojima, A. Kondo, Y. Kondo, M. Kuzuba, R. Matsushita, T. Mibe, Y. Miyamoto, et al. (29 additional authors not shown)

    Abstract: Acceleration of positive muons from thermal energy to $100~$keV has been demonstrated. Thermal muons were generated by resonant multi-photon ionization of muonium atoms emitted from a sheet of laser-ablated aerogel. The thermal muons were first electrostatically accelerated to $5.7~$keV, followed by further acceleration to 100 keV using a radio-frequency quadrupole. The transverse normalized emitt…

    Submitted 15 October, 2024; originally announced October 2024.

  2. arXiv:2410.08707  [pdf, other]

    astro-ph.GA

    The nature of low-luminosity AGNs discovered by JWST at $5<z<6$ based on clustering analysis: ancestors of quasars at $z\lesssim3$?

    Authors: Junya Arita, Nobunari Kashikawa, Masafusa Onoue, Takehiro Yoshioka, Yoshihiro Takeda, Hiroki Hoshi, Shunta Shimizu

    Abstract: James Webb Space Telescope (JWST) has discovered many faint AGNs at high-$z$ by detecting their broad Balmer lines. However, their high number density, lack of X-ray emission, and overly high black hole masses with respect to their host stellar masses suggest that they are a distinct population from general type-1 quasars. Here, we present clustering analysis of 28 low-luminosity broad-line AGNs f…

    Submitted 11 October, 2024; originally announced October 2024.

    Comments: 12 pages, 7 figures, submitted to MNRAS

  3. arXiv:2407.14219  [pdf, other]

    quant-ph

    Probing instantaneous quantum circuit refrigeration in the quantum regime

    Authors: Shuji Nakamura, Teruaki Yoshioka, Sergei Lemziakov, Dmitrii Lvov, Hiroto Mukai, Akiyoshi Tomonaga, Shintaro Takada, Yuma Okazaki, Nobu-Hisa Kaneko, Jukka Pekola, Jaw-Shen Tsai

    Abstract: Recent advancements in circuit quantum electrodynamics have enabled precise manipulation and detection of the single energy quantum in quantum systems. A quantum circuit refrigerator (QCR) is capable of electrically cooling the excited population of quantum systems, such as superconducting resonators and qubits, through photon-assisted tunneling of quasi-particles within a superconductor-insulator…

    Submitted 13 August, 2024; v1 submitted 19 July, 2024; originally announced July 2024.

    Comments: 15 pages, 9 figures, and 1 table

  4. Target conversation extraction: Source separation using turn-taking dynamics

    Authors: Tuochao Chen, Qirui Wang, Bohan Wu, Malek Itani, Sefik Emre Eskimez, Takuya Yoshioka, Shyamnath Gollakota

    Abstract: Extracting the speech of participants in a conversation amidst interfering speakers and noise presents a challenging problem. In this paper, we introduce the novel task of target conversation extraction, where the goal is to extract the audio of a target conversation based on the speaker embedding of one of its participants. To accomplish this, we propose leveraging temporal patterns inherent in h…

    Submitted 29 July, 2024; v1 submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by Interspeech 2024

  5. arXiv:2407.11055  [pdf, other]

    cs.LG cs.SD eess.AS

    Knowledge boosting during low-latency inference

    Authors: Vidya Srinivas, Malek Itani, Tuochao Chen, Sefik Emre Eskimez, Takuya Yoshioka, Shyamnath Gollakota

    Abstract: Models for low-latency, streaming applications could benefit from the knowledge capacity of larger models, but edge devices cannot run these models due to resource constraints. A possible solution is to transfer hints during inference from a large model running remotely to a small model running on-device. However, this incurs a communication delay that breaks real-time requirements and does not gu…

    Submitted 25 July, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

    Comments: Accepted by Interspeech 2024

  6. arXiv:2406.15401  [pdf, ps, other]

    physics.ins-det nucl-ex

    Circular polarization measurement for individual gamma rays in capture reactions with intense pulsed neutrons

    Authors: S. Endo, R. Abe, H. Fujioka, T. Ino, O. Iwamoto, N. Iwamoto, S. Kawamura, A. Kimura, M. Kitaguchi, R. Kobayashi, S. Nakamura, T. Oku, T. Okudaira, M. Okuizumi, M. Omer, G. Rovira, T. Shima, H. M. Shimizu, T. Shizuma, Y. Taira, S. Takada, S. Takahashi, H. Yoshikawa, T. Yoshioka, H. Zen

    Abstract: Measurements of circular polarization of $γ$-ray emitted from neutron capture reactions provide valuable information for nuclear physics studies. The spin and parity of excited states can be determined by measuring the circular polarization from polarized neutron capture reactions. Furthermore, the $γ$-ray circular polarization in a neutron capture resonance is crucial for studying the enhancement…

    Submitted 7 May, 2024; originally announced June 2024.

    Comments: 10 pages, 13 figures

  7. arXiv:2405.06289  [pdf, other]

    cs.SD cs.AI eess.AS

    Look Once to Hear: Target Speech Hearing with Noisy Examples

    Authors: Bandhav Veluri, Malek Itani, Tuochao Chen, Takuya Yoshioka, Shyamnath Gollakota

    Abstract: In crowded settings, the human brain can focus on speech from a target speaker, given prior knowledge of how they sound. We introduce a novel intelligent hearable system that achieves this capability, enabling target speech hearing to ignore all interfering speech and noise, but the target speaker. A naive approach is to require a clean speech example to enroll the target speaker. This is however…

    Submitted 29 May, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

    Comments: Best paper honorable mention at CHI 2024

  8. arXiv:2404.18387  [pdf, ps, other]

    cond-mat.str-el cond-mat.mes-hall quant-ph

    Quantum entanglement in a pure state of strongly correlated quantum impurity systems

    Authors: Yunori Nishikawa, Tomoki Yoshioka

    Abstract: We consider quantum entanglement in strongly correlated quantum impurity systems for states manifesting interesting properties such as multi-level Kondo effect and dual nature between itineracy and localization etc. For this purpose, we set up a system consisting of one or two quantum impurities arbitrarily selected from the system as a subsystem, and investigate quantum entanglement with its env…

    Submitted 28 April, 2024; originally announced April 2024.

  9. arXiv:2404.16381  [pdf, ps, other]

    cs.PL

    Abstracting Effect Systems for Algebraic Effect Handlers

    Authors: Takuma Yoshioka, Taro Sekiyama, Atsushi Igarashi

    Abstract: Many effect systems for algebraic effect handlers are designed to guarantee that all invoked effects are handled adequately. However, respective researchers have developed their own effect systems that differ in how to represent the collections of effects that may happen. This situation results in blurring what is required for the representation and manipulation of effect collections in a safe eff…

    Submitted 25 April, 2024; originally announced April 2024.

    ACM Class: D.3.1; D.3.2; D.3.3; F.3.3

  10. arXiv:2404.09841  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Anatomy of Industrial Scale Multilingual ASR

    Authors: Francis McCann Ramirez, Luka Chkhetiani, Andrew Ehrenberg, Robert McHardy, Rami Botros, Yash Khare, Andrea Vanzo, Taufiquzzaman Peyash, Gabriel Oexle, Michael Liang, Ilya Sklyar, Enver Fakhan, Ahmed Etefy, Daniel McCrystal, Sam Flamini, Domenic Donato, Takuya Yoshioka

    Abstract: This paper describes AssemblyAI's industrial-scale automatic speech recognition (ASR) system, designed to meet the requirements of large-scale, multilingual ASR serving various application needs. Our system leverages a diverse training dataset comprising unsupervised (12.5M hours), supervised (188k hours), and pseudo-labeled (1.6M hours) data across four languages. We provide a detailed descriptio…

    Submitted 16 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

  11. arXiv:2403.04632  [pdf, other]

    physics.ins-det

    Software Compensation for Highly Granular Calorimeters using Machine Learning

    Authors: S. Lai, J. Utehs, A. Wilhahn, O. Bach, E. Brianne, A. Ebrahimi, K. Gadow, P. Göttlicher, O. Hartbrich, D. Heuchel, A. Irles, K. Krüger, J. Kvasnicka, S. Lu, C. Neubüser, A. Provenza, M. Reinecke, F. Sefkow, S. Schuwalow, M. De Silva, Y. Sudo, H. L. Tran, E. Buhmann, E. Garutti, S. Huck, et al. (39 additional authors not shown)

    Abstract: A neural network for software compensation was developed for the highly granular CALICE Analogue Hadronic Calorimeter (AHCAL). The neural network uses spatial and temporal event information from the AHCAL and energy information, which is expected to improve sensitivity to shower development and the neutron fraction of the hadron shower. The neural network method produced a depth-dependent energy w…

    Submitted 7 March, 2024; originally announced March 2024.

  12. arXiv:2402.18876  [pdf, ps, other]

    nucl-ex

    Transverse asymmetry of individual $γ$-rays in the $^{139}$La($\vec{n}$, $γ$)$^{140}$La reaction

    Authors: M. Okuizumi, C. J. Auton, S. Endo, H. Fujioka, K. Hirota, T. Ino, K. Ishizaki, A. Kimura, M. Kitaguchi, J. Koga, S. Makise, Y. Niinomi, T. Oku, T. Okudaira, K. Sakai, T. Shima, H. M. Shimizu, H. Tada, S. Takada, S. Takahashi, Y. Tani, T. Yamamoto, H. Yoshikawa, T. Yoshioka

    Abstract: The enhancement of the parity-violating asymmetry in the vicinity of $p$-wave compound nuclear resonances was observed for a variety of medium-heavy nuclei. The enhanced parity-violating asymmetry can be understood using the $s$-$p$ mixing model. The $s$-$p$ mixing model predicts several neutron energy-dependent angular correlations between the neutron momentum $\vec k_n$, neutron spin $\vec{σ}_n$,…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 5 pages, 5 figures, 2 tables

  13. arXiv:2401.11920  [pdf, other]

    physics.ins-det hep-ex

    The quality assurance test of the SliT ASIC for the J-PARC muon $g-2$/EDM experiment

    Authors: Takashi Yamanaka, Yoichi Fujita, Eitaro Hamada, Tetsuichi Kishishita, Tsutomu Mibe, Yutaro Sato, Yoshiaki Seino, Masayoshi Shoji, Taikain Suehara, Manobu M. Tanaka, Junji Tojo, Keisuke Umebayashi, Tamaki Yoshioka

    Abstract: The SliT ASIC is a readout chip for the silicon strip detector to be used at the J-PARC muon $g-2$/EDM experiment. The production version of SliT128D was designed and mass production was finished. A quality assurance test method for bare SliT128D chips was developed to provide a sufficient number of chips for the experiment. The quality assurance test of the SliT128D chips was performed and 5735 c…

    Submitted 22 January, 2024; originally announced January 2024.

    Comments: 5 pages, 8 figures

  14. arXiv:2312.12959  [pdf, other]

    physics.ins-det nucl-ex

    Performance of the Fully-equipped Spin Flip Chopper For Neutron Lifetime Experiment at J-PARC

    Authors: K. Mishima, G. Ichikawa, Y. Fuwa, T. Hasegawa, M. Hino, R. Hosokawa, T. Ino, Y. Iwashita, M. Kitaguchi, S. Matsuzaki, T. Mogi, H. Okabe, T. Oku, T. Okudaira, Y. Seki, H. E. Shimizu, H. M. Shimizu, S. Takahashi, M. Tanida, S. Yamashita, M. Yokohashi, T. Yoshioka

    Abstract: To solve the "neutron lifetime puzzle," where measured neutron lifetimes differ depending on the measurement methods, an experiment with pulsed neutron beam at J-PARC is in progress. In this experiment, neutrons are bunched into 40-cm lengths using a spin flip chopper (SFC), where the statistical sensitivity was limited by the aperture size of the SFC. The SFC comprises three sets of magnetic su…

    Submitted 31 July, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

    Comments: 33 pages, 22 figures

  15. arXiv:2312.06115  [pdf, ps, other]

    nucl-ex hep-ex

    High sensitivity of a future search for P-odd/T-odd interactions on the 0.75 eV $p$-wave resonance in $\vec{n}+^{139}\vec{\rm La}$ forward transmission determined using pulsed neutron beam

    Authors: R. Nakabe, C. J. Auton, S. Endo, H. Fujioka, V. Gudkov, K. Hirota, I. Ide, T. Ino, M. Ishikado, W. Kambara, S. Kawamura, A. Kimura, M. Kitaguchi, R. Kobayashi, T. Okamura, T. Oku, T. Okudaira, M. Okuizumi, J. G. Otero Munoz, J. D. Parker, K. Sakai, T. Shima, H. M. Shimizu, T. Shinohara, W. M. Snow, et al. (5 additional authors not shown)

    Abstract: Neutron transmission experiments can offer a new type of highly sensitive search for time-reversal invariance violating (TRIV) effects in nucleon-nucleon interactions via the same enhancement mechanism observed for large parity violating (PV) effects in neutron-induced compound nuclear processes. In these compound processes, the TRIV cross-section is given as the product of the PV cross-section, a…

    Submitted 10 December, 2023; originally announced December 2023.

  16. Experimental Demonstration of Fermionic QAOA with One-Dimensional Cyclic Driver Hamiltonian

    Authors: Takuya Yoshioka, Keita Sasada, Yuichiro Nakano, Keisuke Fujii

    Abstract: Quantum approximate optimization algorithm (QAOA) has attracted much attention as an algorithm that has the potential to efficiently solve combinatorial optimization problems. Among them, a fermionic QAOA (FQAOA) for solving constrained optimization problems has been developed [Yoshioka, Sasada, Nakano, and Fujii, Phys. Rev. Research vol. 5, 023071, 2023]. In this algorithm, the constraints are es…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: published in 2023 IEEE International Conference on Quantum Computing and Engineering (QCE)

    Journal ref: 2023 IEEE International Conference on Quantum Computing and Engineering (QCE), Bellevue, WA, USA, 2023, pp. 300-306

  17. arXiv:2311.00320  [pdf, other]

    cs.SD cs.LG eess.AS

    Semantic Hearing: Programming Acoustic Scenes with Binaural Hearables

    Authors: Bandhav Veluri, Malek Itani, Justin Chan, Takuya Yoshioka, Shyamnath Gollakota

    Abstract: Imagine being able to listen to the birds chirping in a park without hearing the chatter from other hikers, or being able to block out traffic noise on a busy street while still being able to hear emergency sirens and car honks. We introduce semantic hearing, a novel capability for hearable devices that enables them to, in real-time, focus on, or ignore, specific sounds from real-world environment…

    Submitted 1 November, 2023; originally announced November 2023.

  18. arXiv:2309.12521  [pdf, other]

    cs.SD eess.AS

    Profile-Error-Tolerant Target-Speaker Voice Activity Detection

    Authors: Dongmei Wang, Xiong Xiao, Naoyuki Kanda, Midia Yousefi, Takuya Yoshioka, Jian Wu

    Abstract: Target-Speaker Voice Activity Detection (TS-VAD) utilizes a set of speaker profiles alongside an input audio signal to perform speaker diarization. While its superiority over conventional methods has been demonstrated, the method can suffer from errors in speaker profiles, as those profiles are typically obtained by running a traditional clustering-based diarization method over the input signal. T…

    Submitted 3 April, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

    Comments: Submission for ICASSP 2024

  19. arXiv:2309.08905  [pdf, ps, other]

    nucl-ex

    Spin dependence in the $p$-wave resonance of ${^{139}\vec{\rm{La}}+\vec{n}}$

    Authors: T. Okudaira, R. Nakabe, S. Endo, H. Fujioka, V. Gudkov, I. Ide, T. Ino, M. Ishikado, W. Kambara, S. Kawamura, R. Kobayashi, M. Kitaguchi, T. Okamura, T. Oku, J. G. Otero Munoz, J. D. Parker, K. Sakai, T. Shima, H. M. Shimizu, T. Shinohara, W. M. Snow, S. Takada, Y. Tsuchikawa, R. Takahashi, S. Takahashi, et al. (2 additional authors not shown)

    Abstract: We measured the spin dependence in a neutron-induced $p$-wave resonance by using a polarized epithermal neutron beam and a polarized nuclear target. Our study focuses on the 0.75~eV $p$-wave resonance state of $^{139}$La+$n$, where largely enhanced parity violation has been observed. We determined the partial neutron width of the $p$-wave resonance by measuring the spin dependence of the neutron a…

    Submitted 16 September, 2023; originally announced September 2023.

  20. arXiv:2309.08131  [pdf, other]

    eess.AS cs.SD

    t-SOT FNT: Streaming Multi-talker ASR with Text-only Domain Adaptation Capability

    Authors: Jian Wu, Naoyuki Kanda, Takuya Yoshioka, Rui Zhao, Zhuo Chen, Jinyu Li

    Abstract: Token-level serialized output training (t-SOT) was recently proposed to address the challenge of streaming multi-talker automatic speech recognition (ASR). T-SOT effectively handles overlapped speech by representing multi-talker transcriptions as a single token stream with $\langle \text{cc}\rangle$ symbols interspersed. However, the use of a naive neural transducer architecture significantly cons…

    Submitted 14 September, 2023; originally announced September 2023.

    Comments: 5 pages, 2 figures, submitted to ICASSP2024

  21. arXiv:2309.08007  [pdf, ps, other]

    eess.AS cs.CL cs.SD

    DiariST: Streaming Speech Translation with Speaker Diarization

    Authors: Mu Yang, Naoyuki Kanda, Xiaofei Wang, Junkun Chen, Peidong Wang, Jian Xue, Jinyu Li, Takuya Yoshioka

    Abstract: End-to-end speech translation (ST) for conversation recordings involves several under-explored challenges such as speaker diarization (SD) without accurate word time stamps and handling of overlapping speech in a streaming fashion. In this work, we propose DiariST, the first streaming ST and SD solution. It is built upon a neural transducer-based streaming ST system and integrates token-level seri…

    Submitted 22 January, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024

  22. arXiv:2308.13666  [pdf, other]

    astro-ph.HE

    A Joint Fermi-GBM and Swift-BAT Analysis of Gravitational-Wave Candidates from the Third Gravitational-wave Observing Run

    Authors: C. Fletcher, J. Wood, R. Hamburg, P. Veres, C. M. Hui, E. Bissaldi, M. S. Briggs, E. Burns, W. H. Cleveland, M. M. Giles, A. Goldstein, B. A. Hristov, D. Kocevski, S. Lesage, B. Mailyan, C. Malacaria, S. Poolakkil, A. von Kienlin, C. A. Wilson-Hodge, The Fermi Gamma-ray Burst Monitor Team, M. Crnogorčević, J. DeLaunay, A. Tohuvavohu, R. Caputo, S. B. Cenko, et al. (1674 additional authors not shown)

    Abstract: We present Fermi Gamma-ray Burst Monitor (Fermi-GBM) and Swift Burst Alert Telescope (Swift-BAT) searches for gamma-ray/X-ray counterparts to gravitational wave (GW) candidate events identified during the third observing run of the Advanced LIGO and Advanced Virgo detectors. Using Fermi-GBM on-board triggers and sub-threshold gamma-ray burst (GRB) candidates found in the Fermi-GBM ground analyses,…

    Submitted 25 August, 2023; originally announced August 2023.

  23. arXiv:2308.06873  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    SpeechX: Neural Codec Language Model as a Versatile Speech Transformer

    Authors: Xiaofei Wang, Manthan Thakker, Zhuo Chen, Naoyuki Kanda, Sefik Emre Eskimez, Sanyuan Chen, Min Tang, Shujie Liu, Jinyu Li, Takuya Yoshioka

    Abstract: Recent advancements in generative speech models based on audio-text prompts have enabled remarkable innovations like high-quality zero-shot text-to-speech. However, existing models still face limitations in handling diverse audio-text speech generation tasks involving transforming input speech and processing audio captured in adverse acoustic conditions. This paper introduces SpeechX, a versatile…

    Submitted 25 June, 2024; v1 submitted 13 August, 2023; originally announced August 2023.

    Comments: To appear in TASLP. See https://aka.ms/speechx for demo samples

  24. arXiv:2307.02531  [pdf, other]

    astro-ph.GA

    Subaru High-$z$ Exploration of Low-Luminosity Quasars (SHELLQs). XVIII. The Dark Matter Halo Mass of Quasars at $z\sim6$

    Authors: Junya Arita, Nobunari Kashikawa, Yoshiki Matsuoka, Wanqiu He, Kei Ito, Yongming Liang, Rikako Ishimoto, Takehiro Yoshioka, Yoshihiro Takeda, Kazushi Iwasawa, Masafusa Onoue, Yoshiki Toba, Masatoshi Imanishi

    Abstract: We present, for the first time, dark matter halo (DMH) mass measurement of quasars at $z\sim6$ based on a clustering analysis of 107 quasars. Spectroscopically identified quasars are homogeneously extracted from the HSC-SSP wide layer over $891\,\mathrm{deg^2}$. We evaluate the clustering strength by three different auto-correlation functions: projected correlation function, angular correlation fu…

    Submitted 5 July, 2023; originally announced July 2023.

    Comments: 22 pages, 8 figures, accepted for publication in ApJ

  25. arXiv:2306.10212  [pdf, other]

    quant-ph

    Active Initialization Experiment of Superconducting Qubit Using Quantum-circuit Refrigerator

    Authors: Teruaki Yoshioka, Hiroto Mukai, Akiyoshi Tomonaga, Shintaro Takada, Yuma Okazaki, Nobu-Hisa Kaneko, Shuji Nakamura, Jaw-Shen Tsai

    Abstract: The initialization of superconducting qubits is one of the essential techniques for the realization of quantum computation. In previous research, initialization above 99\% fidelity has been achieved at 280 ns. Here, we demonstrate the rapid initialization of a superconducting qubit with a quantum-circuit refrigerator (QCR). Photon-assisted tunneling of quasiparticles in the QCR can temporally incr…

    Submitted 16 June, 2023; originally announced June 2023.

  26. arXiv:2305.18747  [pdf, other]

    eess.AS cs.CL

    Adapting Multi-Lingual ASR Models for Handling Multiple Talkers

    Authors: Chenda Li, Yao Qian, Zhuo Chen, Naoyuki Kanda, Dongmei Wang, Takuya Yoshioka, Yanmin Qian, Michael Zeng

    Abstract: State-of-the-art large-scale universal speech models (USMs) show a decent automatic speech recognition (ASR) performance across multiple domains and languages. However, it remains a challenge for these models to recognize overlapped speech, which is often seen in meeting conversations. We propose an approach to adapt USMs for multi-talker ASR. We first develop an enhanced version of serialized out…

    Submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted by Interspeech 2023

  27. arXiv:2305.13738  [pdf, other]

    cs.CL cs.AI cs.CV

    i-Code Studio: A Configurable and Composable Framework for Integrative AI

    Authors: Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, Ziyi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang

    Abstract: Artificial General Intelligence (AGI) requires comprehensive understanding and generation capabilities for a variety of tasks spanning different modalities and functionalities. Integrative AI is one important direction to approach AGI, through combining multiple models to tackle complex multimodal tasks. However, there is a lack of a flexible and composable platform to facilitate efficient and eff…

    Submitted 23 May, 2023; originally announced May 2023.

  28. arXiv:2305.12311  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG eess.AS

    i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data

    Authors: Ziyi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang

    Abstract: The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence, however the current Vision-Language-Speech landscape is dominated by encoder-only models which lack generative abilities. We propose closing this gap with i-Code V2, the first model capable of generating natural language from any combination of Vision, Language, and Speech data. i-Code V2 is a…

    Submitted 20 May, 2023; originally announced May 2023.

  29. arXiv:2304.08393  [pdf, other]

    gr-qc astro-ph.CO astro-ph.HE

    Search for gravitational-lensing signatures in the full third observing run of the LIGO-Virgo network

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, C. Alléné, A. Allocca, P. A. Altin, et al. (1670 additional authors not shown)

    Abstract: Gravitational lensing by massive objects along the line of sight to the source causes distortions of gravitational wave-signals; such distortions may reveal information about fundamental physics, cosmology and astrophysics. In this work, we have extended the search for lensing signatures to all binary black hole events from the third observing run of the LIGO--Virgo network. We search for repeated…

    Submitted 17 April, 2023; originally announced April 2023.

    Comments: 28 pages, 11 figures

    Report number: LIGO-P2200031

  30. arXiv:2303.08372  [pdf, other]

    eess.AS cs.SD

    Target Sound Extraction with Variable Cross-modality Clues

    Authors: Chenda Li, Yao Qian, Zhuo Chen, Dongmei Wang, Takuya Yoshioka, Shujie Liu, Yanmin Qian, Michael Zeng

    Abstract: Automatic target sound extraction (TSE) is a machine learning approach to mimic the human auditory perception capability of attending to a sound source of interest from a mixture of sources. It often uses a model conditioned on a fixed form of target sound clues, such as a sound class label, which limits the ways in which users can interact with the model to specify the target sounds. To leverage…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: Accepted by ICASSP 2023

  31. arXiv:2302.12369  [pdf, other]

    eess.AS cs.CL cs.SD

    Factual Consistency Oriented Speech Recognition

    Authors: Naoyuki Kanda, Takuya Yoshioka, Yang Liu

    Abstract: This paper presents a novel optimization framework for automatic speech recognition (ASR) with the aim of reducing hallucinations produced by an ASR model. The proposed framework optimizes the ASR model to maximize an expected factual consistency score between ASR hypotheses and ground-truth transcriptions, where the factual consistency score is computed by a separately trained estimator. Experime…

    Submitted 23 February, 2023; originally announced February 2023.

    Comments: 5 pages, 1 figure, 3 tables

  32. Fermionic Quantum Approximate Optimization Algorithm

    Authors: Takuya Yoshioka, Keita Sasada, Yuichiro Nakano, Keisuke Fujii

    Abstract: Quantum computers are expected to accelerate solving combinatorial optimization problems, including algorithms such as Grover adaptive search and quantum approximate optimization algorithm (QAOA). However, many combinatorial optimization problems involve constraints which, when imposed as soft constraints in the cost function, can negatively impact the performance of the optimization algorithm. In…

    Submitted 30 April, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: Accepted for publication in Physical Review Research on March 29, 2023. 16 pages, 8 figures

    Journal ref: Physical Review Research 5, 023071 (2023)

  33. Angular distribution of $γ$-rays from a neutron-induced $p$-wave resonance of $^{132}$Xe

    Authors: T. Okudaira, Y. Tani, S. Endo, J. Doskow, H. Fujioka, K. Hirota, K. Kameda, A. Kimura, M. Kitaguchi, M. Luxnat, K. Sakai, D. Schaper, T. Shima, H. M. Shimizu, W. M. Snow, S. Takada, T. Yamamoto, H. Yoshikawa, T. Yoshioka

    Abstract: A neutron-energy dependent angular distribution was measured for individual $γ$-rays from the 3.2 eV $p$-wave resonance of $^{131}$Xe+$n$, that shows enhanced parity violation owing to a mixing between $s$- and $p$-wave amplitudes. The $γ$-ray transitions from the $p$-wave resonance were identified, and the angular distribution with respect to the neutron momentum was evaluated as a function of th…

    Submitted 21 December, 2022; originally announced December 2022.

  34. arXiv:2212.01477  [pdf, other]

    astro-ph.HE astro-ph.CO

    Search for subsolar-mass black hole binaries in the second part of Advanced LIGO's and Advanced Virgo's third observing run

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, C. Alléné, A. Allocca, P. A. Altin, et al. (1680 additional authors not shown)

    Abstract: We describe a search for gravitational waves from compact binaries with at least one component with mass 0.2 $M_\odot$ -- $1.0 M_\odot$ and mass ratio $q \geq 0.1$ in Advanced LIGO and Advanced Virgo data collected between 1 November 2019, 15:00 UTC and 27 March 2020, 17:00 UTC. No signals were detected. The most significant candidate has a false alarm rate of 0.2 $\mathrm{yr}^{-1}$. We estimate t…

    Submitted 26 January, 2024; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: https://dcc.ligo.org/P2200139

  35. arXiv:2211.09988  [pdf, ps, other]

    eess.AS cs.SD

    Exploring WavLM on Speech Enhancement

    Authors: Hyungchan Song, Sanyuan Chen, Zhuo Chen, Yu Wu, Takuya Yoshioka, Min Tang, Jong Won Shin, Shujie Liu

    Abstract: There is a surge in interest in self-supervised learning approaches for end-to-end speech encoding in recent years as they have achieved great success. Especially, WavLM showed state-of-the-art performance on various speech processing tasks. To better understand the efficacy of self-supervised learning models for speech enhancement, in this work, we design and conduct a series of experiments with…

    Submitted 17 November, 2022; originally announced November 2022.

    Comments: Accepted by IEEE SLT 2022

  36. arXiv:2211.06493  [pdf, other]

    eess.AS cs.SD eess.SP

    Handling Trade-Offs in Speech Separation with Sparsely-Gated Mixture of Experts

    Authors: Xiaofei Wang, Zhuo Chen, Yu Shi, Jian Wu, Naoyuki Kanda, Takuya Yoshioka

    Abstract: Employing a monaural speech separation (SS) model as a front-end for automatic speech recognition (ASR) involves balancing two kinds of trade-offs. First, while a larger model improves the SS performance, it also requires a higher computational cost. Second, an SS model that is more optimized for handling overlapped speech is likely to introduce more processing artifacts in non-overlapped-speech r…

    Submitted 30 May, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

  37. arXiv:2211.05564  [pdf, other]

    eess.AS cs.SD

    Self-supervised learning with bi-label masked speech prediction for streaming multi-talker speech recognition

    Authors: Zili Huang, Zhuo Chen, Naoyuki Kanda, Jian Wu, Yiming Wang, Jinyu Li, Takuya Yoshioka, Xiaofei Wang, Peidong Wang

    Abstract: Self-supervised learning (SSL), which utilizes the input data itself for representation learning, has achieved state-of-the-art results for various downstream speech tasks. However, most of the previous studies focused on offline single-talker applications, with limited investigations in multi-talker cases, especially for streaming scenarios. In this paper, we investigate SSL for streaming multi-t…

    Submitted 10 November, 2022; originally announced November 2022.

    Comments: submitted to ICASSP 2023

  38. arXiv:2211.05172  [pdf, other]

    eess.AS cs.CL cs.SD

    Speech separation with large-scale self-supervised learning

    Authors: Zhuo Chen, Naoyuki Kanda, Jian Wu, Yu Wu, Xiaofei Wang, Takuya Yoshioka, Jinyu Li, Sunit Sivasankaran, Sefik Emre Eskimez

    Abstract: Self-supervised learning (SSL) methods such as WavLM have shown promising speech separation (SS) results in small-scale simulation-based experiments. In this work, we extend the exploration of the SSL-based SS by massively scaling up both the pre-training data (more than 300K hours) and fine-tuning data (10K hours). We also investigate various techniques to efficiently integrate the pre-trained mo…

    Submitted 25 November, 2022; v1 submitted 9 November, 2022; originally announced November 2022.

  39. arXiv:2211.02944  [pdf, other]

    eess.AS cs.SD

    Breaking the trade-off in personalized speech enhancement with cross-task knowledge distillation

    Authors: Hassan Taherian, Sefik Emre Eskimez, Takuya Yoshioka

    Abstract: Personalized speech enhancement (PSE) models achieve promising results compared with unconditional speech enhancement models due to their ability to remove interfering speech in addition to background noise. Unlike unconditional speech enhancement, causal PSE models may occasionally remove the target speech by mistake. The PSE models also tend to leak interfering speech when the target speaker is…

    Submitted 5 November, 2022; originally announced November 2022.

    Comments: Submitted to ICASSP 2023

  40. arXiv:2211.02773  [pdf, other]

    eess.AS cs.SD

    Real-Time Joint Personalized Speech Enhancement and Acoustic Echo Cancellation

    Authors: Sefik Emre Eskimez, Takuya Yoshioka, Alex Ju, Min Tang, Tanel Parnamaa, Huaming Wang

    Abstract: Personalized speech enhancement (PSE) is a real-time SE approach utilizing a speaker embedding of a target person to remove background noise, reverberation, and interfering voices. To deploy a PSE model for full duplex communications, the model must be combined with acoustic echo cancellation (AEC), although such a combination has been less explored. This paper proposes a series of methods that ar…

    Submitted 25 May, 2023; v1 submitted 4 November, 2022; originally announced November 2022.

    Comments: Accepted to Interspeech 2023

  41. arXiv:2211.02250  [pdf, other]

    cs.SD cs.LG eess.AS

    Real-Time Target Sound Extraction

    Authors: Bandhav Veluri, Justin Chan, Malek Itani, Tuochao Chen, Takuya Yoshioka, Shyamnath Gollakota

    Abstract: We present the first neural network model to achieve real-time and streaming target sound extraction. To accomplish this, we propose Waveformer, an encoder-decoder architecture with a stack of dilated causal convolution layers as the encoder, and a transformer decoder layer as the decoder. This hybrid architecture uses dilated causal convolutions for processing large receptive fields in a computat…

    Submitted 19 April, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: ICASSP 2023 camera-ready

  42. Measurement of the transverse asymmetry of $γ$-rays in the $^{117}$Sn(n,$γ$)$^{118}$Sn reaction

    Authors: S. Endo, T. Okudaira, R. Abe, H. Fujioka, K. Hirota, A. Kimura, M. Kitaguchi, T. Oku, K. Sakai, T. Shima, H. M. Shimizu, S. Takada, S. Takahashi, T. Yamamoto, H. Yoshikawa, T. Yoshioka

    Abstract: Largely enhanced parity-violating effects observed in compound resonances induced by epithermal neutrons are currently attributed to the mixing of parity-unfavored partial amplitudes in the entrance channel of the compound states. Furthermore, it is proposed that the same mechanism that enhances the parity-violation also enhances the breaking of time-reversal-invariance in the compound nucleus. Th…

    Submitted 27 October, 2022; originally announced October 2022.

    Comments: 7 pages, 10 figures

  43. arXiv:2210.15715  [pdf, ps, other]

    eess.AS cs.CL cs.SD

    Simulating realistic speech overlaps improves multi-talker ASR

    Authors: Muqiao Yang, Naoyuki Kanda, Xiaofei Wang, Jian Wu, Sunit Sivasankaran, Zhuo Chen, Jinyu Li, Takuya Yoshioka

    Abstract: Multi-talker automatic speech recognition (ASR) has been studied to generate transcriptions of natural conversation including overlapping speech of multiple speakers. Due to the difficulty in acquiring real conversation data with high-quality human transcriptions, a naïve simulation of multi-talker speech by randomly mixing multiple utterances was conventionally used for model training. In this wo…

    Submitted 17 November, 2022; v1 submitted 27 October, 2022; originally announced October 2022.

    Comments: v2: fix minor typo

  44. arXiv:2210.10931  [pdf, other]

    astro-ph.HE

    Search for gravitational-wave transients associated with magnetar bursts in Advanced LIGO and Advanced Virgo data from the third observing run

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, K. Agatsuma, N. Aggarwal, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, A. Allocca, P. A. Altin , et al. (1645 additional authors not shown)

    Abstract: Gravitational waves are expected to be produced from neutron star oscillations associated with magnetar giant flares and short bursts. We present the results of a search for short-duration (milliseconds to seconds) and long-duration ($\sim$ 100 s) transient gravitational waves from 13 magnetar short bursts observed during Advanced LIGO, Advanced Virgo and KAGRA's third observation run. These 13 bu…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: 30 pages with appendices, 5 figures, 10 tables

    Report number: LIGO-P2100387

  45. arXiv:2210.05934  [pdf, other]

    gr-qc physics.ins-det

    Input optics systems of the KAGRA detector during O3GK

    Authors: T. Akutsu, M. Ando, K. Arai, Y. Arai, S. Araki, A. Araya, N. Aritomi, H. Asada, Y. Aso, S. Bae, Y. Bae, L. Baiotti, R. Bajpai, M. A. Barton, K. Cannon, Z. Cao, E. Capocasa, M. Chan, C. Chen, K. Chen, Y. Chen, C-I. Chiang, H. Chu, Y-K. Chu, S. Eguchi , et al. (228 additional authors not shown)

    Abstract: KAGRA, the underground and cryogenic gravitational-wave detector, was operated for its solo observation from February 25th to March 10th, 2020, and its first joint observation with the GEO 600 detector from April 7th -- 21st, 2020 (O3GK). This study presents an overview of the input optics systems of the KAGRA detector, which consist of various optical systems, such as a laser source, its intensit…

    Submitted 12 October, 2022; originally announced October 2022.

  46. arXiv:2209.04974  [pdf, other]

    eess.AS cs.CL cs.SD

    VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition

    Authors: Naoyuki Kanda, Jian Wu, Xiaofei Wang, Zhuo Chen, Jinyu Li, Takuya Yoshioka

    Abstract: This paper presents a novel streaming automatic speech recognition (ASR) framework for multi-talker overlapping speech captured by a distant microphone array with an arbitrary geometry. Our framework, named t-SOT-VA, capitalizes on independently developed two recent technologies; array-geometry-agnostic continuous speech separation, or VarArray, and streaming multi-talker ASR based on token-level…

    Submitted 3 October, 2022; v1 submitted 11 September, 2022; originally announced September 2022.

    Comments: 6 pages, 2 figure, 3 tables, v2: Appendix A has been added

  47. arXiv:2209.02863  [pdf]

    astro-ph.HE gr-qc

    Model-based cross-correlation search for gravitational waves from the low-mass X-ray binary Scorpius X-1 in LIGO O3 data

    Authors: The LIGO Scientific Collaboration, the Virgo Collaboration, the KAGRA Collaboration, R. Abbott, H. Abe, F. Acernese, K. Ackley, S. Adhicary, N. Adhikari, R. X. Adhikari, V. K. Adkins, V. B. Adya, C. Affeldt, D. Agarwal, M. Agathos, O. D. Aguiar, L. Aiello, A. Ain, P. Ajith, T. Akutsu, S. Albanesi, R. A. Alfaidi, C. Alléné, A. Allocca, P. A. Altin , et al. (1670 additional authors not shown)

    Abstract: We present the results of a model-based search for continuous gravitational waves from the low-mass X-ray binary Scorpius X-1 using LIGO detector data from the third observing run of Advanced LIGO, Advanced Virgo and KAGRA. This is a semicoherent search which uses details of the signal model to coherently combine data separated by less than a specified coherence time, which can be adjusted to bala…

    Submitted 2 January, 2023; v1 submitted 6 September, 2022; originally announced September 2022.

    Comments: 19 pages, Open Access Journal PDF

    Report number: LIGO-P2100110-v13

    Journal ref: The Astrophysical Journal Letters, 941, L30 (2022)

  48. arXiv:2208.13085  [pdf, other]

    eess.AS cs.CL cs.SD

    Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization

    Authors: Dongmei Wang, Xiong Xiao, Naoyuki Kanda, Takuya Yoshioka, Jian Wu

    Abstract: This paper describes a speaker diarization model based on target speaker voice activity detection (TS-VAD) using transformers. To overcome the original TS-VAD model's drawback of being unable to handle an arbitrary number of speakers, we investigate model architectures that use input tensors with variable-length time and speaker dimensions. Transformer layers are applied to the speaker axis to mak…

    Submitted 25 September, 2022; v1 submitted 27 August, 2022; originally announced August 2022.

  49. Description and stability of a RPC-based calorimeter in electromagnetic and hadronic shower environments

    Authors: D. Boumediene, V. Francais, J. Apostolakis, G. Folger, A. Ribon, E. Sicking, K. Goto, K. Kawagoe, M. Kuhara, T. Suehara, T. Yoshioka, A. Pingault, M. Tytgat, G. Garillot, G. Grenier, T. Kurca, I. Laktineh, B. Liu, B. Li, L. Mirabito, E. Calvo Alamillo, C. Carrillo, M. C. Fouz, H. Garcia Cabrera, J. Marin , et al. (14 additional authors not shown)

    Abstract: The CALICE Semi-Digital Hadron Calorimeter technological prototype completed in 2011 is a sampling calorimeter using Glass Resistive Plate Chamber (GRPC) detectors as the active medium. This technology is one of the two options proposed for the hadron calorimeter of the International Large Detector for the International Linear Collider. The prototype was exposed in 2015 to beams of muons, electron…

    Submitted 21 March, 2023; v1 submitted 13 July, 2022; originally announced July 2022.

    Comments: Version published in JINST

    Report number: CALICE-PUB-2022-02

    Journal ref: 2023 JINST 18 P03035

  50. arXiv:2207.05098  [pdf, other]

    astro-ph.GA astro-ph.CO

    The physical origin for spatially large scatter of IGM opacity at the end of reionization: the IGM Ly$α$ opacity-galaxy density relation

    Authors: Rikako Ishimoto, Nobunari Kashikawa, Daichi Kashino, Kei Ito, Yongming Liang, Zheng Cai, Takehiro Yoshioka, Katsuya Okoshi, Toru Misawa, Masafusa Onoue, Yoshihiro Takeda, Hisakazu Uchiyama

    Abstract: The large opacity fluctuations in the $z > 5.5$ Ly$α$ forest may indicate inhomogeneous progress of reionization. To explain the observed large scatter of the effective Ly$α$ optical depth ($τ_{\rm eff}$) of the intergalactic medium (IGM), fluctuation of UV background ($Γ$ model) or the IGM gas temperature ($T$ model) have been proposed, which predict opposite correlations between $τ_{\rm eff}$ an…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: 13 pages, 14 figures, accepted for publication in MNRAS