Multimodal Learning and Cognitive Processes in Radiology: MedGaze for Chest X-ray Scanpath Prediction

Authors: Akash Awasthi, Ngan Le, Zhigang Deng, Rishi Agrawal, Carol C. Wu, Hien Van Nguyen

Abstract: Predicting human gaze behavior within computer vision is integral for developing interactive systems that can anticipate user attention, address fundamental questions in cognitive science, and hold implications for fields like human-computer interaction (HCI) and augmented/virtual reality (AR/VR) systems. Despite methodologies introduced for modeling human eye gaze behavior, applying these models to medical imaging for scanpath prediction remains unexplored. Our proposed system aims to predict eye gaze sequences from radiology reports and CXR images, potentially streamlining data collection and enhancing AI systems using larger datasets. However, predicting human scanpaths on medical images presents unique challenges due to the diverse nature of abnormal regions. Our model predicts fixation coordinates and durations critical for medical scanpath prediction, outperforming existing models in the computer vision community. Utilizing a two-stage training process and large publicly available datasets, our approach generates static heatmaps and eye gaze videos aligned with radiology reports, facilitating comprehensive analysis. We validate our approach by comparing its performance with state-of-the-art methods and assessing its generalizability among different radiologists, introducing novel strategies to model radiologists' search patterns during CXR image diagnosis. Based on the radiologist's evaluation, MedGaze can generate human-like gaze sequences with a high focus on relevant regions over the CXR images. It sometimes also outperforms humans in terms of redundancy and randomness in the scanpaths.

Submitted 28 June, 2024; originally announced July 2024.

Comments: Submitted to the Journal

arXiv:2406.19686 [pdf]

Enhancing Radiological Diagnosis: A Collaborative Approach Integrating AI and Human Expertise for Visual Miss Correction

Authors: Akash Awasthi, Ngan Le, Zhigang Deng, Carol C. Wu, Hien Van Nguyen

Abstract: Human-AI collaboration to identify and correct perceptual errors in chest radiographs has not been previously explored. This study aimed to develop a collaborative AI system, CoRaX, which integrates eye gaze data and radiology reports to enhance diagnostic accuracy in chest radiology by pinpointing perceptual errors and refining the decision-making process. Using public datasets REFLACX and EGD-CXR, the study retrospectively developed CoRaX, employing a large multimodal model to analyze image embeddings, eye gaze data, and radiology reports. The system's effectiveness was evaluated based on its referral-making process, the quality of referrals, and performance in collaborative diagnostic settings. CoRaX was tested on a simulated error dataset of 271 samples with 28% (93 of 332) missed abnormalities. The system corrected 21% (71 of 332) of these errors, leaving 7% (22 of 312) unresolved. The Referral-Usefulness score, indicating the accuracy of predicted regions for all true referrals, was 0.63 (95% CI 0.59, 0.68). The Total-Usefulness score, reflecting the diagnostic accuracy of CoRaX's interactions with radiologists, showed that 84% (237 of 280) of these interactions had a score above 0.40. In conclusion, CoRaX efficiently collaborates with radiologists to address perceptual errors across various abnormalities, with potential applications in the education and training of novice radiologists.

Submitted 28 June, 2024; originally announced June 2024.

Comments: Under Review in Journal

arXiv:2309.12325 [pdf]

FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare

Authors: Karim Lekadir, Aasa Feragen, Abdul Joseph Fofanah, Alejandro F Frangi, Alena Buyx, Anais Emelie, Andrea Lara, Antonio R Porras, An-Wen Chan, Arcadi Navarro, Ben Glocker, Benard O Botwe, Bishesh Khanal, Brigit Beger, Carol C Wu, Celia Cintas, Curtis P Langlotz, Daniel Rueckert, Deogratias Mzurikwao, Dimitrios I Fotiadis, Doszhan Zhussupov, Enzo Ferrante, Erik Meijering, Eva Weicken, Fabio A González, et al. (95 additional authors not shown)

Abstract: Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real world adoption, it is essential that medical AI tools are trusted and accepted by patients, clinicians, health organisations and authorities. This work describes the FUTURE-AI guideline as the first international consensus framework for guiding the development and deployment of trustworthy AI tools in healthcare. The FUTURE-AI consortium was founded in 2021 and currently comprises 118 inter-disciplinary experts from 51 countries representing all continents, including AI scientists, clinicians, ethicists, and social scientists. Over a two-year period, the consortium defined guiding principles and best practices for trustworthy AI through an iterative process comprising an in-depth literature review, a modified Delphi survey, and online consensus meetings. The FUTURE-AI framework was established based on 6 guiding principles for trustworthy AI in healthcare, i.e. Fairness, Universality, Traceability, Usability, Robustness and Explainability. Through consensus, a set of 28 best practices were defined, addressing technical, clinical, legal and socio-ethical dimensions. The recommendations cover the entire lifecycle of medical AI, from design, development and validation to regulation, deployment, and monitoring. FUTURE-AI is a risk-informed, assumption-free guideline which provides a structured approach for constructing medical AI tools that will be trusted, deployed and adopted in real-world practice. Researchers are encouraged to take the recommendations into account in proof-of-concept stages to facilitate future translation towards clinical practice of medical AI.

Submitted 8 July, 2024; v1 submitted 11 August, 2023; originally announced September 2023.

ACM Class: I.2.0; I.4.0; I.5.0

arXiv:2004.07407 [pdf, other]

Radiologist-Level COVID-19 Detection Using CT Scans with Detail-Oriented Capsule Networks

Authors: Aryan Mobiny, Pietro Antonio Cicalese, Samira Zare, Pengyu Yuan, Mohammadsajad Abavisani, Carol C. Wu, Jitesh Ahuja, Patricia M. de Groot, Hien Van Nguyen

Abstract: Radiographic images offer an alternative method for the rapid screening and monitoring of Coronavirus Disease 2019 (COVID-19) patients. This approach is limited by the shortage of radiology experts who can provide a timely interpretation of these images. Motivated by this challenge, our paper proposes a novel learning architecture, called Detail-Oriented Capsule Networks (DECAPS), for the automatic diagnosis of COVID-19 from Computed Tomography (CT) scans. Our network combines the strength of Capsule Networks with several architecture improvements meant to boost classification accuracies. First, DECAPS uses an Inverted Dynamic Routing mechanism which increases model stability by preventing the passage of information from non-descriptive regions. Second, DECAPS employs a Peekaboo training procedure which uses a two-stage patch crop and drop strategy to encourage the network to generate activation maps for every target concept. The network then uses the activation maps to focus on regions of interest and combines both coarse and fine-grained representations of the data. Finally, we use a data augmentation method based on conditional generative adversarial networks to deal with the issue of data scarcity. Our model achieves 84.3% precision, 91.5% recall, and 96.1% area under the ROC curve, significantly outperforming state-of-the-art methods. We compare the performance of the DECAPS model with three experienced, well-trained thoracic radiologists and show that the architecture significantly outperforms them. While further studies on larger datasets are required to confirm this finding, our results imply that architectures like DECAPS can be used to assist radiologists in the CT scan mediated diagnosis of COVID-19.

Submitted 15 April, 2020; originally announced April 2020.

arXiv:1906.04569 [pdf, other]

DropConnect Is Effective in Modeling Uncertainty of Bayesian Deep Networks

Authors: Aryan Mobiny, Hien V. Nguyen, Supratik Moulik, Naveen Garg, Carol C. Wu

Abstract: Deep neural networks (DNNs) have achieved state-of-the-art performances in many important domains, including medical diagnosis, security, and autonomous driving. In these domains where safety is highly critical, an erroneous decision can result in serious consequences. While a perfect prediction accuracy is not always achievable, recent work on Bayesian deep networks shows that it is possible to know when DNNs are more likely to make mistakes. Knowing what DNNs do not know is desirable to increase the safety of deep learning technology in sensitive applications. Bayesian neural networks attempt to address this challenge. However, traditional approaches are computationally intractable and do not scale well to large, complex neural network architectures. In this paper, we develop a theoretical framework to approximate Bayesian inference for DNNs by imposing a Bernoulli distribution on the model weights. This method, called MC-DropConnect, gives us a tool to represent the model uncertainty with little change in the overall model structure or computational cost. We extensively validate the proposed algorithm on multiple network architectures and datasets for classification and semantic segmentation tasks. We also propose new metrics to quantify the uncertainty estimates. This enables an objective comparison between MC-DropConnect and prior approaches. Our empirical results demonstrate that the proposed framework yields significant improvement in both prediction accuracy and uncertainty estimation quality compared to the state of the art.

Submitted 7 June, 2019; originally announced June 2019.

arXiv:1010.6099 [pdf]

doi 10.1021/nl1039549

Capacitive Spring Softening in Single-Walled Carbon Nanotube Nanoelectromechanical Resonators

Authors: Chung Chiang Wu, Zhaohui Zhong

Abstract: We report the capacitive spring softening effect observed in single-walled carbon nanotube (SWNT) nanoelectromechanical (NEM) resonators. The nanotube resonators adopt a dual-gate configuration with both bottom-gate and side-gate capable of tuning the resonance frequency through capacitive coupling. Interestingly, downward resonance frequency shifting is observed with increasing side-gate voltage, which can be attributed to the capacitive softening of the spring constant. Furthermore, in-plane vibrational modes exhibit a much stronger spring softening effect than out-of-plane modes. Our dual-gate design should enable the differentiation between these two types of vibrational modes, and open up new possibilities for nonlinear operation of nanotube resonators.

Submitted 28 October, 2010; originally announced October 2010.

Comments: 12 pages/ 3 figures

arXiv:cond-mat/0601218 [pdf, ps, other]

doi 10.1016/j.synthmet.2005.01.039

Apparent phonon side band modes in pi-conjugated systems: polymers, oligomers and crystals

Authors: E. Ehrenfreund, C. C. Wu, Z. V. Vardeny

Abstract: The emission spectra of many pi-conjugated polymers and oligomers contain side-band replicas with apparent frequencies that do not match the Raman active mode frequencies. Using a time dependent model we show that in such many mode systems, the increased damping of the time dependent transition dipole moment correlation function results in an effective elimination of the vibrational modes from the emission spectrum; subsequently causing the appearance of a regularly spaced progression at a new apparent frequency. We use this damping dependent vibrational reshaping to quantitatively account for the vibronic structure in the emission spectra of pi-conjugated systems in the form of films, dilute solutions and single crystals. In particular, we show that by using the experimentally measured Raman spectrum we can account in detail for the apparent progression frequencies and their relative intensities in the emission spectrum.

Submitted 11 January, 2006; originally announced January 2006.

Comments: Presented in "Optical Probes 2005", Bangalore, India

Journal ref: Synthetic Metals, Vol. 155, pp.266-269 (2005)

arXiv:cond-mat/0512067 [pdf, ps, other]

doi 10.1103/PhysRevB.71.081201

Apparent vibrational side-bands in pi-conjugated systems: the case of distyrylbenzene

Authors: C. C. Wu, E. Ehrenfreund, J. J. Gutierrez, J. P. Ferraris, Z. V. Vardeny

Abstract: The photoluminescence (PL) spectra of dilute solution and single crystals of distyrylbenzene show unique temperature dependent vibronic structures. The characteristic single frequency progression at high temperatures is modulated by a low frequency progression series at low temperatures. None of the series side band modes corresponds to any of the distyrylbenzene Raman frequencies. We explain these PL properties using a time dependent model with temperature dependent damping, in which the many-mode system is effectively transformed to two- and then to a single "apparent" mode as damping increases.

Submitted 3 December, 2005; originally announced December 2005.

Comments: 4 pages, 3 figures

Journal ref: Phys. Rev. B 71, 081201 (2005)

arXiv:cond-mat/0107234 [pdf, ps, other]

Experimental evidence of the spin selection rule in KLL Auger transition

Authors: D. J. Huang, W. P. Wu, C. F. Chang, J. Chen, S. C. Chung, L. H. Tjeng, C. T. Chen, S. G. Shyu, C. C. Wu

Abstract: With on- and off-resonant excitation photons, spin-resolved Auger electron spectra of epitaxial CrO$_{2}$ thin films show experimental evidence of the spin-selective KLL Auger decay. The on-resonance O KLL Auger electrons are found to be highly spin-polarized, while the off-resonance ones have almost zero spin polarization. These results lead us to conclude that the two-hole final state in KLL Auger decay is a spin-singlet. Applications to spin-resolved absorption spectroscopy are discussed.

Submitted 12 July, 2001; v1 submitted 11 July, 2001; originally announced July 2001.

Comments: 4 pages, 3 figures

Showing 1–9 of 9 results for author: Wu, C C