Showing 1–25 of 25 results for author: Haseyama, M

  1. arXiv:2409.01534

    cs.CV cs.AI cs.MM

    Think Twice Before Recognizing: Large Multimodal Models for General Fine-grained Traffic Sign Recognition

    Authors: Yaozong Gan, Guang Li, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama

    Abstract: We propose a new strategy called think twice before recognizing to improve fine-grained traffic sign recognition (TSR). Fine-grained TSR in the wild is difficult due to the complex road conditions, and existing approaches particularly struggle with cross-country TSR when data is lacking. Our strategy achieves effective fine-grained TSR by stimulating the multiple-thinking capability of large multi…

    Submitted 2 September, 2024; originally announced September 2024.

  2. arXiv:2409.00919

    cs.SD cs.AI eess.AS

    MMT-BERT: Chord-aware Symbolic Music Generation Based on Multitrack Music Transformer and MusicBERT

    Authors: Jinlong Zhu, Keigo Sakurai, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: We propose a novel symbolic music representation and Generative Adversarial Network (GAN) framework specially designed for symbolic multitrack music generation. The main theme of symbolic music generation primarily encompasses the preprocessing of music data and the implementation of a deep learning framework. Current techniques dedicated to symbolic music generation generally encounter two signif…

    Submitted 1 September, 2024; originally announced September 2024.

    Comments: Accepted to the 25th International Society for Music Information Retrieval Conference (ISMIR 2024)

  3. arXiv:2408.08610

    cs.CV cs.AI cs.LG

    Generative Dataset Distillation Based on Diffusion Model

    Authors: Duo Su, Junjie Hou, Guang Li, Ren Togo, Rui Song, Takahiro Ogawa, Miki Haseyama

    Abstract: This paper presents our method for the generative track of The First Dataset Distillation Challenge at ECCV 2024. Since the diffusion model has become the mainstay of generative models because of its high-quality generative effects, we focus on distillation methods based on the diffusion model. Considering that the track can only generate a fixed number of images in 10 minutes using a generative m…

    Submitted 16 August, 2024; originally announced August 2024.

    Comments: The Third Place Winner in Generative Track of the ECCV 2024 DD Challenge

  4. arXiv:2407.05814

    cs.CV cs.AI cs.MM

    Cross-domain Few-shot In-context Learning for Enhancing Traffic Sign Recognition

    Authors: Yaozong Gan, Guang Li, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama

    Abstract: Recent multimodal large language models (MLLM) such as GPT-4o and GPT-4v have shown great potential in autonomous driving. In this paper, we propose a cross-domain few-shot in-context learning method based on the MLLM for enhancing traffic sign recognition (TSR). We first construct a traffic sign detection network based on Vision Transformer Adapter and an extraction module to extract traffic sign…

    Submitted 8 July, 2024; originally announced July 2024.

  5. arXiv:2406.18836

    cs.CV cs.IR

    Zero-shot Composed Image Retrieval Considering Query-target Relationship Leveraging Masked Image-text Pairs

    Authors: Huaying Zhang, Rintaro Yanagi, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: This paper proposes a novel zero-shot composed image retrieval (CIR) method considering the query-target relationship by masked image-text pairs. The objective of CIR is to retrieve the target image using a query image and a query text. Existing methods use a textual inversion network to convert the query image into a pseudo word to compose the image and text and use a pre-trained visual-language…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted as a conference paper in IEEE ICIP 2024

  6. arXiv:2406.13316

    cs.CV cs.MM

    Reinforcing Pre-trained Models Using Counterfactual Images

    Authors: Xiang Li, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama

    Abstract: This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images. Deep learning classification models are often trained using datasets that mirror real-world scenarios. In this training process, because learning is based solely on correlations with labels, there is a risk that models may learn spurious relationships, such as an overreli…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 6 pages, 4 figures

  7. arXiv:2404.17732

    cs.CV cs.AI cs.LG

    Generative Dataset Distillation: Balancing Global Structure and Local Details

    Authors: Longzhen Li, Guang Li, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama

    Abstract: In this paper, we propose a new dataset distillation method that considers balancing global structure and local details when distilling the information from a large dataset into a generative model. Dataset distillation has been proposed to reduce the size of the required dataset when training models. The conventional dataset distillation methods face the problem of long redeployment time and poor…

    Submitted 26 April, 2024; originally announced April 2024.

    Comments: Accepted by the 1st CVPR Workshop on Dataset Distillation

  8. arXiv:2403.18258

    cs.CV cs.AI

    Enhancing Generative Class Incremental Learning Performance with Model Forgetting Approach

    Authors: Taro Togo, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama

    Abstract: This study presents a novel approach to Generative Class Incremental Learning (GCIL) by introducing the forgetting mechanism, aimed at dynamically managing class information for better adaptation to streaming data. GCIL is one of the hot topics in the field of computer vision, and this is considered one of the crucial tasks in society, specifically the continual learning of generative models. The…

    Submitted 27 March, 2024; originally announced March 2024.

  9. arXiv:2402.09677

    cs.CV

    Prompt-based Personalized Federated Learning for Medical Visual Question Answering

    Authors: He Zhu, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: We present a novel prompt-based personalized federated learning (pFL) method to address data heterogeneity and privacy concerns in traditional medical visual question answering (VQA) methods. Specifically, we regard medical datasets from different organs as clients and use pFL to train personalized transformer-based VQA models for each client. To address the high computational complexity of client…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: Accepted by ICASSP 2024

  10. arXiv:2401.15863

    cs.CV cs.AI cs.LG

    Importance-Aware Adaptive Dataset Distillation

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: Herein, we propose a novel dataset distillation method for constructing small informative datasets that preserve the information of the large original datasets. The development of deep learning models is enabled by the availability of large-scale datasets. Despite unprecedented success, large-scale datasets considerably increase the storage and transmission costs, resulting in a cumbersome model t…

    Submitted 28 January, 2024; originally announced January 2024.

    Comments: Published as a journal paper in Elsevier Neural Networks

  11. arXiv:2307.02799

    eess.IV cs.LG

    Few-shot Personalized Saliency Prediction Based on Inter-personnel Gaze Patterns

    Authors: Yuya Moroto, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama

    Abstract: This paper presents few-shot personalized saliency prediction based on inter-personnel gaze patterns. In contrast to general saliency maps, personalized saliency maps (PSMs) have great potential since PSMs indicate the person-specific visual attention useful for obtaining individual visual preferences. The PSM prediction is needed for acquiring the PSMs for unseen images, but its prediction i…

    Submitted 3 March, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

    Comments: 5 pages, 3 figures

  12. arXiv:2303.04388

    cs.CV

    Interpretable Visual Question Answering Referring to Outside Knowledge

    Authors: He Zhu, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: We present a novel multimodal interpretable VQA model that can answer the question more accurately and generate diverse explanations. Although researchers have proposed several methods that can generate human-readable and fine-grained natural language sentences to explain a model's decision, these methods have focused solely on the information in the image. Ideally, the model should refer to vario…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: Under review

  13. arXiv:2212.09281

    eess.IV cs.CV

    Boosting Automatic COVID-19 Detection Performance with Self-Supervised Learning and Batch Knowledge Ensembling

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: Problem: Detecting COVID-19 from chest X-Ray (CXR) images has become one of the fastest and easiest methods for detecting COVID-19. However, the existing methods usually use supervised transfer learning from natural images as a pretraining process. These methods do not consider the unique features of COVID-19 and the similar features between COVID-19 and other pneumonia. Aim: In this paper, we wan…

    Submitted 30 March, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: Published as a journal paper at Elsevier CIBM

  14. arXiv:2212.09276

    eess.IV cs.CV cs.LG

    COVID-19 Detection Based on Self-Supervised Transfer Learning Using Chest X-Ray Images

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: Purpose: Considering several patients screened due to COVID-19 pandemic, computer-aided detection has strong potential in assisting clinical workflow efficiency and reducing the incidence of infections among radiologists and healthcare providers. Since many confirmed COVID-19 cases present radiological findings of pneumonia, radiologic examinations can be useful for fast detection. Therefore, ches…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: Published as a journal paper at Springer IJCARS

  15. Union-set Multi-source Model Adaptation for Semantic Segmentation

    Authors: Zongyao Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: This paper solves a generalized version of the problem of multi-source model adaptation for semantic segmentation. Model adaptation is proposed as a new domain adaptation problem which requires access to a pre-trained model instead of data for the source domain. A general multi-source setting of model adaptation assumes strictly that each source domain shares a common label space with the target d…

    Submitted 6 December, 2022; originally announced December 2022.

    Comments: Accepted by ECCV 2022

  16. arXiv:2211.00313

    cs.CV cs.LG eess.IV

    RGMIM: Region-Guided Masked Image Modeling for Learning Meaningful Representations from X-Ray Images

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: In this study, we propose a novel method called region-guided masked image modeling (RGMIM) for learning meaningful representations from X-ray images. Our method adopts a new masking strategy that utilizes organ mask information to identify valid regions for learning more meaningful representations. We conduct quantitative evaluations on an open lung X-ray image dataset as well as masking ratio hy…

    Submitted 17 August, 2024; v1 submitted 1 November, 2022; originally announced November 2022.

    Comments: Accepted by ECCV 2024 Workshop on Human-inspired Computer Vision

  17. Dataset Complexity Assessment Based on Cumulative Maximum Scaled Area Under Laplacian Spectrum

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: Dataset complexity assessment aims to predict classification performance on a dataset with complexity calculation before training a classifier, which can also be used for classifier selection and dataset reduction. The training process of deep convolutional neural networks (DCNNs) is iterative and time-consuming because of hyperparameter uncertainty and the domain shift introduced by different dat…

    Submitted 29 September, 2022; originally announced September 2022.

    Comments: Published as a journal paper at Springer MTAP

  18. Compressed Gastric Image Generation Based on Soft-Label Dataset Distillation for Medical Data Sharing

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: Background and objective: Sharing of medical data is required to enable the cross-agency flow of healthcare information and construct high-accuracy computer-aided diagnosis systems. However, the large sizes of medical datasets, the massive amount of memory of saved deep convolutional neural network (DCNN) models, and patients' privacy protection are problems that can lead to inefficient medical da…

    Submitted 1 November, 2022; v1 submitted 29 September, 2022; originally announced September 2022.

    Comments: Published as a journal paper at Elsevier CMPB

  19. arXiv:2209.14609

    cs.CV cs.AI cs.LG

    Dataset Distillation Using Parameter Pruning

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: In this study, we propose a novel dataset distillation method based on parameter pruning. The proposed method can synthesize more robust distilled datasets and improve distillation performance by pruning difficult-to-match parameters during the distillation process. Experimental results on two benchmark datasets show the superiority of the proposed method.

    Submitted 20 August, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

    Comments: Published as a journal paper at IEICE Trans. Fund

  20. arXiv:2209.14603

    cs.CR cs.CV cs.LG eess.IV

    Dataset Distillation for Medical Dataset Sharing

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: Sharing medical datasets between hospitals is challenging because of the privacy-protection problem and the massive cost of transmitting and storing many high-resolution medical images. However, dataset distillation can synthesize a small dataset such that models trained on it achieve comparable performance with the original large dataset, which shows potential for solving the existing medical sha…

    Submitted 23 December, 2022; v1 submitted 29 September, 2022; originally announced September 2022.

    Comments: Accepted by AAAI-23 Workshop on Representation Learning for Responsible Human-Centric AI

  21. arXiv:2209.07007

    cs.LG cs.CV

    Gromov-Wasserstein Autoencoders

    Authors: Nao Nakagawa, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: Variational Autoencoder (VAE)-based generative models offer flexible representation learning by incorporating meta-priors, general premises considered beneficial for downstream tasks. However, the incorporated meta-priors often involve ad-hoc model deviations from the original likelihood architecture, causing undesirable changes in their training. In this paper, we propose a novel representation l…

    Submitted 24 February, 2023; v1 submitted 14 September, 2022; originally announced September 2022.

    Comments: 38 pages, 9 tables, 13 figures; accepted at ICLR2023

  22. TriBYOL: Triplet BYOL for Self-Supervised Representation Learning

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: This paper proposes a novel self-supervised learning method for learning better representations with small batch sizes. Many self-supervised learning methods based on certain forms of the siamese network have emerged and received significant attention. However, these methods need to use large batch sizes to learn good representations and require heavy computational resources. We present a new trip…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: Published as a conference paper at ICASSP 2022

  23. Self-Knowledge Distillation based Self-Supervised Learning for Covid-19 Detection from Chest X-Ray Images

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: The global outbreak of the Coronavirus 2019 (COVID-19) has overloaded worldwide healthcare systems. Computer-aided diagnosis for COVID-19 fast detection and patient triage is becoming critical. This paper proposes a novel self-knowledge distillation based self-supervised learning method for COVID-19 detection from chest X-ray images. Our method can use self-knowledge of images based on similaritie…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: Published as a conference paper at ICASSP 2022

  24. arXiv:2104.02864

    cs.CV

    Self-Supervised Learning for Gastritis Detection with Gastric X-ray Images

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: Purpose: Manual annotation of gastric X-ray images by doctors for gastritis detection is time-consuming and expensive. To solve this, a self-supervised learning method is developed in this study. The effectiveness of the proposed self-supervised learning method in gastritis detection is verified using a few annotated gastric X-ray images. Methods: In this study, we develop a novel method that can…

    Submitted 27 March, 2023; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: Published as a journal paper at Springer IJCARS

  25. Soft-Label Anonymous Gastric X-ray Image Distillation

    Authors: Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama

    Abstract: This paper presents a soft-label anonymous gastric X-ray image distillation method based on a gradient descent approach. The sharing of medical data is demanded to construct high-accuracy computer-aided diagnosis (CAD) systems. However, the large size of the medical dataset and privacy protection are remaining problems in medical data sharing, which hindered the research of CAD systems. The idea o…

    Submitted 20 March, 2024; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: The first paper to explore real-world dataset distillation; Work was done in 2019 and published as a conference paper at ICIP 2020