Showing 1–50 of 113 results for author: Chau, D

  1. Nested Fusion: A Method for Learning High Resolution Latent Structure of Multi-Scale Measurement Data on Mars

    Authors: Austin P. Wright, Scott Davidoff, Duen Horng Chau

    Abstract: The Mars Perseverance Rover represents a generational change in the scale of measurements that can be taken on Mars, however this increased resolution introduces new challenges for techniques in exploratory data analysis. The multiple different instruments on the rover each measures specific properties of interest to scientists, so analyzing how underlying phenomena affect multiple different instr…

    Submitted 24 August, 2024; originally announced September 2024.

    Comments: 11 pages

    Journal ref: In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '24), August 25-29, 2024, Barcelona, Spain. ACM, New York, NY, USA

  2. arXiv:2408.04619  [pdf, other]

    cs.LG cs.AI cs.CL cs.HC

    Transformer Explainer: Interactive Learning of Text-Generative Models

    Authors: Aeree Cho, Grace C. Kim, Alexander Karpekov, Alec Helbling, Zijie J. Wang, Seongmin Lee, Benjamin Hoover, Duen Horng Chau

    Abstract: Transformers have revolutionized machine learning, yet their inner workings remain opaque to many. We present Transformer Explainer, an interactive visualization tool designed for non-experts to learn about Transformers through the GPT-2 model. Our tool helps users understand complex Transformer concepts by integrating a model overview and enabling smooth transitions across abstraction levels of m…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: To be presented at IEEE VIS 2024

  3. arXiv:2408.03249  [pdf, other]

    cs.HC

    Multi-User Mobile Augmented Reality for Cardiovascular Surgical Planning

    Authors: Pratham Mehta, Rahul O Narayanan, Harsha Karanth, Haoyang Yang, Timothy C Slesnick, Fawwaz Shaw, Duen Horng Chau

    Abstract: Collaborative planning for congenital heart diseases typically involves creating physical heart models through 3D printing, which are then examined by both surgeons and cardiologists. Recent developments in mobile augmented reality (AR) technologies have presented a viable alternative, known for their ease of use and portability. However, there is still a lack of research examining the utilization…

    Submitted 7 August, 2024; v1 submitted 6 August, 2024; originally announced August 2024.

  4. arXiv:2407.01972  [pdf, other]

    cs.IR cs.AI cs.HC cs.LG

    MeMemo: On-device Retrieval Augmentation for Private and Personalized Text Generation

    Authors: Zijie J. Wang, Duen Horng Chau

    Abstract: Retrieval-augmented text generation (RAG) addresses the common limitations of large language models (LLMs), such as hallucination, by retrieving information from an updatable external knowledge base. However, existing approaches often require dedicated backend servers for data storage and retrieval, thereby limiting their applicability in use cases that require strict data privacy, such as persona…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: Accepted to SIGIR 2024. 6 pages, 2 figures. For a live demo, visit https://poloclub.github.io/mememo/. Code is open-source at https://github.com/poloclub/mememo

  5. arXiv:2405.17374  [pdf, other]

    cs.LG

    Navigating the Safety Landscape: Measuring Risks in Finetuning Large Language Models

    Authors: ShengYun Peng, Pin-Yu Chen, Matthew Hull, Duen Horng Chau

    Abstract: Safety alignment is the key to guiding the behaviors of large language models (LLMs) that are in line with human preferences and restrict harmful behaviors at inference time, but recent studies show that it can be easily compromised by finetuning with only a few adversarially designed training examples. We aim to measure the risks in finetuning LLMs through navigating the LLM safety landscape. We…

    Submitted 28 May, 2024; v1 submitted 27 May, 2024; originally announced May 2024.

  6. arXiv:2404.01361  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    LLM Attributor: Interactive Visual Attribution for LLM Generation

    Authors: Seongmin Lee, Zijie J. Wang, Aishwarya Chakravarthy, Alec Helbling, ShengYun Peng, Mansi Phute, Duen Horng Chau, Minsuk Kahng

    Abstract: While large language models (LLMs) have shown remarkable capability to generate convincing text across diverse domains, concerns around its potential risks have highlighted the importance of understanding the rationale behind text generation. We present LLM Attributor, a Python library that provides interactive visualizations for training data attribution of an LLM's text generation. Our library o…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: 8 pages, 3 figures, For a video demo, see https://youtu.be/mIG2MDQKQxM

  7. arXiv:2403.04822  [pdf, other]

    cs.CV cs.LG

    UniTable: Towards a Unified Framework for Table Recognition via Self-Supervised Pretraining

    Authors: ShengYun Peng, Aishwarya Chakravarthy, Seongmin Lee, Xiaojing Wang, Rajarajeswari Balasubramaniyan, Duen Horng Chau

    Abstract: Tables convey factual and quantitative data with implicit conventions created by humans that are often challenging for machines to parse. Prior work on table recognition (TR) has mainly centered around complex task-specific combinations of available inputs and tools. We present UniTable, a training framework that unifies both the training paradigm and training objective of TR. Its training paradig…

    Submitted 27 May, 2024; v1 submitted 7 March, 2024; originally announced March 2024.

  8. arXiv:2402.15578  [pdf, other]

    cs.CV

    Self-Supervised Pre-Training for Table Structure Recognition Transformer

    Authors: ShengYun Peng, Seongmin Lee, Xiaojing Wang, Rajarajeswari Balasubramaniyan, Duen Horng Chau

    Abstract: Table structure recognition (TSR) aims to convert tabular images into a machine-readable format. Although hybrid convolutional neural network (CNN)-transformer architecture is widely used in existing approaches, linear projection transformer has outperformed the hybrid architecture in numerous vision tasks due to its simplicity and efficiency. However, existing research has demonstrated that a dir…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: AAAI'24 Workshop on Scientific Document Understanding Oral. arXiv admin note: text overlap with arXiv:2311.05565

  9. arXiv:2402.05075  [pdf, other]

    cs.HC

    ARCollab: Towards Multi-User Interactive Cardiovascular Surgical Planning in Mobile Augmented Reality

    Authors: Pratham Mehta, Harsha Karanth, Haoyang Yang, Timothy Slesnick, Fawwaz Shaw, Duen Horng Chau

    Abstract: Surgical planning for congenital heart diseases requires a collaborative approach, traditionally involving the 3D-printing of physical heart models for inspection by surgeons and cardiologists. Recent advancements in mobile augmented reality (AR) technologies have offered a promising alternative, noted for their ease-of-use and portability. Despite this progress, there remains a gap in research ex…

    Submitted 7 February, 2024; originally announced February 2024.

  10. arXiv:2402.01877  [pdf, other]

    cs.HC cs.AI cs.LG

    Mobile Fitting Room: On-device Virtual Try-on via Diffusion Models

    Authors: Justin Blalock, David Munechika, Harsha Karanth, Alec Helbling, Pratham Mehta, Seongmin Lee, Duen Horng Chau

    Abstract: The growing digital landscape of fashion e-commerce calls for interactive and user-friendly interfaces for virtually trying on clothes. Traditional try-on methods grapple with challenges in adapting to diverse backgrounds, poses, and subjects. While newer methods, utilizing the recent advances of diffusion models, have achieved higher-quality image generation, the human-centered dimensions of mobi…

    Submitted 2 February, 2024; originally announced February 2024.

    Comments: 7 pages, 3 figures

  11. arXiv:2401.14447  [pdf, other]

    cs.HC cs.AI cs.CL cs.LG

    Wordflow: Social Prompt Engineering for Large Language Models

    Authors: Zijie J. Wang, Aishwarya Chakravarthy, David Munechika, Duen Horng Chau

    Abstract: Large language models (LLMs) require well-crafted prompts for effective use. Prompt engineering, the process of designing prompts, is challenging, particularly for non-experts who are less familiar with AI technologies. While researchers have proposed techniques and tools to assist LLM users in prompt design, these works primarily target AI application developers rather than non-experts. To addres…

    Submitted 25 January, 2024; originally announced January 2024.

    Comments: 8 pages, 7 figures. Wordflow is available at: https://poloclub.github.io/wordflow. The code is available at: https://github.com/poloclub/wordflow/. For a demo video, see: https://youtu.be/3dOcVuofGVo

  12. arXiv:2311.05565  [pdf, other]

    cs.CV cs.LG

    High-Performance Transformers for Table Structure Recognition Need Early Convolutions

    Authors: ShengYun Peng, Seongmin Lee, Xiaojing Wang, Rajarajeswari Balasubramaniyan, Duen Horng Chau

    Abstract: Table structure recognition (TSR) aims to convert tabular images into a machine-readable format, where a visual encoder extracts image features and a textual decoder generates table-representing tokens. Existing approaches use classic convolutional neural network (CNN) backbones for the visual encoder and transformers for the textual decoder. However, this hybrid CNN-Transformer architecture intro…

    Submitted 9 November, 2023; originally announced November 2023.

    Comments: Table Representation Learning Workshop at NeurIPS 2023 (Oral)

  13. arXiv:2310.12347  [pdf, other]

    cs.HC

    VisGrader: Automatic Grading of D3 Visualizations

    Authors: Matthew Hull, Vivian Pednekar, Hannah Murray, Nimisha Roy, Emmanuel Tung, Susanta Routray, Connor Guerin, Justin Chen, Zijie J. Wang, Seongmin Lee, Mahdi Roozbahani, Duen Horng Chau

    Abstract: Manually grading D3 data visualizations is a challenging endeavor, and is especially difficult for large classes with hundreds of students. Grading an interactive visualization requires a combination of interactive, quantitative, and qualitative evaluation that are conventionally done manually and are difficult to scale up as the visualization complexity, data size, and number of students increase…

    Submitted 19 October, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  14. arXiv:2310.12243  [pdf, other]

    cs.LG cs.CV

    REVAMP: Automated Simulations of Adversarial Attacks on Arbitrary Objects in Realistic Scenes

    Authors: Matthew Hull, Zijie J. Wang, Duen Horng Chau

    Abstract: Deep Learning models, such as those used in an autonomous vehicle are vulnerable to adversarial attacks where an attacker could place an adversarial object in the environment, leading to mis-classification. Generating these adversarial objects in the digital space has been extensively studied, however successfully transferring these attacks from the digital realm to the physical realm has proven c…

    Submitted 18 October, 2023; originally announced October 2023.

  15. arXiv:2310.06968  [pdf, other]

    cs.CV cs.LG

    ObjectComposer: Consistent Generation of Multiple Objects Without Fine-tuning

    Authors: Alec Helbling, Evan Montoya, Duen Horng Chau

    Abstract: Recent text-to-image generative models can generate high-fidelity images from text prompts. However, these models struggle to consistently generate the same objects in different contexts with the same appearance. Consistent object generation is important to many downstream tasks like generating comic book illustrations with consistent characters and setting. Numerous approaches attempt to solve th…

    Submitted 10 October, 2023; originally announced October 2023.

  16. arXiv:2309.16750  [pdf, other]

    cs.LG cs.AI math.DS

    Memory in Plain Sight: Surveying the Uncanny Resemblances of Associative Memories and Diffusion Models

    Authors: Benjamin Hoover, Hendrik Strobelt, Dmitry Krotov, Judy Hoffman, Zsolt Kira, Duen Horng Chau

    Abstract: The generative process of Diffusion Models (DMs) has recently set state-of-the-art on many AI generation benchmarks. Though the generative process is traditionally understood as an "iterative denoiser", there is no universally accepted language to describe it. We introduce a novel perspective to describe DMs using the mathematical language of memory retrieval from the field of energy-based Associa…

    Submitted 28 May, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

    Comments: 15 pages, 4 figures

  17. arXiv:2308.16258  [pdf, other]

    cs.CV

    Robust Principles: Architectural Design Principles for Adversarially Robust CNNs

    Authors: ShengYun Peng, Weilin Xu, Cory Cornelius, Matthew Hull, Kevin Li, Rahul Duggal, Mansi Phute, Jason Martin, Duen Horng Chau

    Abstract: Our research aims to unify existing works' diverging opinions on how architectural components affect the adversarial robustness of CNNs. To accomplish our goal, we synthesize a suite of three generalizable robust architectural design principles: (a) optimal range for depth and width configurations, (b) preferring convolutional over patchify stem stage, and (c) robust residual block design through…

    Submitted 31 August, 2023; v1 submitted 30 August, 2023; originally announced August 2023.

    Comments: Published at BMVC'23

  18. arXiv:2308.07308  [pdf, other]

    cs.CL cs.AI

    LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked

    Authors: Mansi Phute, Alec Helbling, Matthew Hull, ShengYun Peng, Sebastian Szyller, Cory Cornelius, Duen Horng Chau

    Abstract: Large language models (LLMs) are popular for high-quality text generation but can produce harmful content, even when aligned with human values through reinforcement learning. Adversarial prompts can bypass their safety measures. We propose LLM Self Defense, a simple approach to defend against these attacks by having an LLM screen the induced responses. Our method does not require any fine-tuning,…

    Submitted 2 May, 2024; v1 submitted 14 August, 2023; originally announced August 2023.

  19. arXiv:2306.17108  [pdf, other]

    cs.LG cs.HC

    ManimML: Communicating Machine Learning Architectures with Animation

    Authors: Alec Helbling, Duen Horng Chau

    Abstract: There has been an explosion in interest in machine learning (ML) in recent years due to its applications to science and engineering. However, as ML techniques have advanced, tools for explaining and visualizing novel ML algorithms have lagged behind. Animation has been shown to be a powerful tool for making engaging visualizations of systems that dynamically change over time, which makes it well s…

    Submitted 14 November, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

    Comments: Winner of the Best Poster Award at IEEE VIS 2023

  20. arXiv:2306.09328  [pdf, other]

    cs.LG cs.CL cs.CV cs.HC

    WizMap: Scalable Interactive Visualization for Exploring Large Machine Learning Embeddings

    Authors: Zijie J. Wang, Fred Hohman, Duen Horng Chau

    Abstract: Machine learning models often learn latent embedding representations that capture the domain semantics of their training data. These embedding representations are valuable for interpreting trained models, building new models, and analyzing new datasets. However, interpreting and using embeddings can be challenging due to their opaqueness, high dimensionality, and the large size of modern datasets.…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 8 pages, 8 figures, Accepted to ACL 2023. For a demo video, see https://youtu.be/8fJG87QVceQ. For a live demo, see https://poloclub.github.io/wizmap. Code is available at https://github.com/poloclub/wizmap

  21. arXiv:2305.03509  [pdf, other]

    cs.CL cs.AI cs.HC cs.LG

    Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion

    Authors: Seongmin Lee, Benjamin Hoover, Hendrik Strobelt, Zijie J. Wang, ShengYun Peng, Austin Wright, Kevin Li, Haekyu Park, Haoyang Yang, Duen Horng Chau

    Abstract: Diffusion-based generative models' impressive ability to create convincing images has garnered global attention. However, their complex structures and operations often pose challenges for non-experts to grasp. We present Diffusion Explainer, the first interactive visualization tool that explains how Stable Diffusion transforms text prompts into images. Diffusion Explainer tightly integrates a visu…

    Submitted 31 August, 2024; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: 5 pages, 7 figures

  22. SuperNOVA: Design Strategies and Opportunities for Interactive Visualization in Computational Notebooks

    Authors: Zijie J. Wang, David Munechika, Seongmin Lee, Duen Horng Chau

    Abstract: Computational notebooks, such as Jupyter Notebook, have become data scientists' de facto programming environments. Many visualization researchers and practitioners have developed interactive visualization tools that support notebooks, yet little is known about the appropriate design of these tools. To address this critical research gap, we investigate the design strategies in this space by analyzi…

    Submitted 28 March, 2024; v1 submitted 4 May, 2023; originally announced May 2023.

    Comments: Accepted at CHI 2024 (Late-Breaking Work). 17 pages, 11 figures, 1 table. SuperNOVA is available at: http://poloclub.github.io/supernova/. The code is available at: https://github.com/poloclub/supernova

  23. arXiv:2303.09624  [pdf, other]

    physics.optics cond-mat.mes-hall physics.flu-dyn

    Microfluidic Filling and Spectroscopy of Colloidal CdSe/CdS Nanoplatelets in Liquid Core Fibers

    Authors: Simon Spelthann, Dan Huy Chau, Lars F. Klepzig, Dominik A. Rudolph, Mario Chemnitz, Saher Junaid, Detlev Ristau, Markus A. Schmidt, Jannika Lauth, Michael Steinke

    Abstract: Colloidal 2D semiconductor nanoplatelets are highly efficient light emitters, which exhibit large absorption and emission cross sections, and constitute promising laser gain media. However, if dispersed in solutions, such nanoplatelets lack a suitable optical platform for scalable and application-oriented integration into optical setups such as lasers. Here, we demonstrate the first successful int…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: 4 Pages, 3 Figures

  24. arXiv:2303.09545  [pdf, other]

    cs.LG cs.AI cs.HC

    WebSHAP: Towards Explaining Any Machine Learning Models Anywhere

    Authors: Zijie J. Wang, Duen Horng Chau

    Abstract: As machine learning (ML) is increasingly integrated into our everyday Web experience, there is a call for transparent and explainable web-based ML. However, existing explainability techniques often require dedicated backend servers, which limit their usefulness as the Web community moves toward in-browser ML for lower latency and greater privacy. To address the pressing need for a client-side expl…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: 5 pages, 4 figures. Accepted at the ACM Web Conference 2023 (WWW 2023). For a live demo, visit https://poloclub.github.io/webshap/. Code is open-source at https://github.com/poloclub/webshap

  25. arXiv:2302.14165  [pdf, other]

    cs.LG cs.AI cs.HC

    GAM Coach: Towards Interactive and User-centered Algorithmic Recourse

    Authors: Zijie J. Wang, Jennifer Wortman Vaughan, Rich Caruana, Duen Horng Chau

    Abstract: Machine learning (ML) recourse techniques are increasingly used in high-stakes domains, providing end users with actions to alter ML predictions, but they assume ML developers understand what input variables can be changed. However, a recourse plan's actionability is subjective and unlikely to match developers' expectations completely. We present GAM Coach, a novel open-source system that adapts i…

    Submitted 28 February, 2023; v1 submitted 27 February, 2023; originally announced February 2023.

    Comments: Accepted to CHI 2023. 20 pages, 12 figures. For a demo video, see https://youtu.be/ubacP34H9XE. For a live demo, visit https://poloclub.github.io/gam-coach/

  26. arXiv:2302.07253  [pdf, other]

    cs.LG cond-mat.dis-nn cs.CV q-bio.NC stat.ML

    Energy Transformer

    Authors: Benjamin Hoover, Yuchen Liang, Bao Pham, Rameswar Panda, Hendrik Strobelt, Duen Horng Chau, Mohammed J. Zaki, Dmitry Krotov

    Abstract: Our work combines aspects of three promising paradigms in machine learning, namely, attention mechanism, energy-based models, and associative memory. Attention is the power-house driving modern deep learning successes, but it lacks clear theoretical foundations. Energy-based models allow a principled approach to discriminative and generative tasks, but the design of the energy functional is not st…

    Submitted 31 October, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Journal ref: 37th Conference on Neural Information Processing Systems (NeurIPS 2023)

  27. arXiv:2302.07187  [pdf, other]

    cs.HC cs.CE cs.LG

    Lessons from the Development of an Anomaly Detection Interface on the Mars Perseverance Rover using the ISHMAP Framework

    Authors: Austin P. Wright, Peter Nemere, Adrian Galvin, Duen Horng Chau, Scott Davidoff

    Abstract: While anomaly detection stands among the most important and valuable problems across many scientific domains, anomaly detection research often focuses on AI methods that can lack the nuance and interpretability so critical to conducting scientific inquiry. In this application paper we present the results of utilizing an alternative approach that situates the mathematical framing of machine learnin…

    Submitted 14 February, 2023; originally announced February 2023.

  28. arXiv:2301.03110  [pdf, other]

    cs.CV cs.AI

    RobArch: Designing Robust Architectures against Adversarial Attacks

    Authors: ShengYun Peng, Weilin Xu, Cory Cornelius, Kevin Li, Rahul Duggal, Duen Horng Chau, Jason Martin

    Abstract: Adversarial Training is the most effective approach for improving the robustness of Deep Neural Networks (DNNs). However, compared to the large body of research in optimizing the adversarial training process, there are few investigations into how architecture components affect robustness, and they rarely constrain model capacity. Thus, it is unclear where robustness precisely comes from. In this w…

    Submitted 8 January, 2023; originally announced January 2023.

  29. arXiv:2210.14896  [pdf, other]

    cs.CV cs.AI cs.HC cs.LG

    DiffusionDB: A Large-scale Prompt Gallery Dataset for Text-to-Image Generative Models

    Authors: Zijie J. Wang, Evan Montoya, David Munechika, Haoyang Yang, Benjamin Hoover, Duen Horng Chau

    Abstract: With recent advancements in diffusion models, users can generate high-quality images by writing text prompts in natural language. However, generating images with desired details requires proper prompts, and it is often unclear how a model reacts to different prompts or what the best prompts are. To help researchers tackle these critical challenges, we introduce DiffusionDB, the first large-scale t…

    Submitted 6 July, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

    Comments: Accepted to ACL 2023 (nominated for best paper, top 1.6% of submissions, oral presentation). 17 pages, 11 figures. The dataset is available at https://huggingface.co/datasets/poloclub/diffusiondb. The code is at https://github.com/poloclub/diffusiondb. The interactive visualization demo is at https://poloclub.github.io/diffusiondb/explorer/

  30. arXiv:2210.13510  [pdf, other]

    cs.HC

    Evaluation of Argo Scholar with Observational Study

    Authors: Kevin Li, Haoyang Yang, Evan Montoya, Anish Upadhayay, Zhiyan Zhou, Jon Saad-Falcon, Duen Horng Chau

    Abstract: Discovering and making sense of relevant literature is fundamental in any scientific field. Node-link diagram-based visualization tools can aid this process; however, existing tools have been evaluated only on small scales. This paper evaluates Argo Scholar, an open-source visualization tool designed for interactive exploration of literature and easy sharing of exploration results. A large-scale u…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: VIS IEEE 22

  31. arXiv:2210.12492  [pdf, other]

    cs.HC cs.LG

    NeuroMapper: In-browser Visualizer for Neural Network Training

    Authors: Zhiyan Zhou, Kevin Li, Haekyu Park, Megan Dass, Austin Wright, Nilaksh Das, Duen Horng Chau

    Abstract: We present our ongoing work NeuroMapper, an in-browser visualization tool that helps machine learning (ML) developers interpret the evolution of a model during training, providing a new way to monitor the training process and visually discover reasons for suboptimal training. While most existing deep neural networks (DNNs) interpretation tools are designed for already-trained model, NeuroMapper sc…

    Submitted 22 October, 2022; originally announced October 2022.

    Comments: IEEE VIS 2022

  32. arXiv:2210.05598  [pdf, other]

    cs.CL cs.AI

    Enriching Biomedical Knowledge for Low-resource Language Through Large-Scale Translation

    Authors: Long Phan, Tai Dang, Hieu Tran, Trieu H. Trinh, Vy Phan, Lam D. Chau, Minh-Thang Luong

    Abstract: Biomedical data and benchmarks are highly valuable yet very limited in low-resource languages other than English such as Vietnamese. In this paper, we make use of a state-of-the-art translation model in English-Vietnamese to translate and produce both pretrained as well as supervised data in the biomedical domains. Thanks to such large-scale translation, we introduce ViPubmedT5, a pretrained Encod…

    Submitted 29 January, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

  33. arXiv:2210.00160  [pdf, other]

    cs.SI cs.CR cs.CY cs.HC

    Explaining Website Reliability by Visualizing Hyperlink Connectivity

    Authors: Seongmin Lee, Sadia Afroz, Haekyu Park, Zijie J. Wang, Omar Shaikh, Vibhor Sehgal, Ankit Peshin, Duen Horng Chau

    Abstract: As the information on the Internet continues growing exponentially, understanding and assessing the reliability of a website is becoming increasingly important. Misinformation has far-ranging repercussions, from sowing mistrust in media to undermining democratic elections. While some research investigates how to alert people to misinformation on the web, much less research has been conducted on ex…

    Submitted 30 September, 2022; originally announced October 2022.

    Comments: Accepted at IEEE VIS 2022, 5 pages, 4 figures, For a live demo, visit https://poloclub.github.io/MisVis

  34. arXiv:2210.00136  [pdf, other]

    cs.LG cs.CV

    IMB-NAS: Neural Architecture Search for Imbalanced Datasets

    Authors: Rahul Duggal, Shengyun Peng, Hao Zhou, Duen Horng Chau

    Abstract: Class imbalance is a ubiquitous phenomenon occurring in real world data distributions. To overcome its detrimental effect on training accurate classifiers, existing work follows three major directions: class re-balancing, information transfer, and representation learning. In this paper, we propose a new and complementary direction for improving performance on long tailed datasets - optimizing the…

    Submitted 30 September, 2022; originally announced October 2022.

  35. TimberTrek: Exploring and Curating Sparse Decision Trees with Interactive Visualization

    Authors: Zijie J. Wang, Chudi Zhong, Rui Xin, Takuya Takagi, Zhi Chen, Duen Horng Chau, Cynthia Rudin, Margo Seltzer

    Abstract: Given thousands of equally accurate machine learning (ML) models, how can users choose among them? A recent ML technique enables domain experts and data scientists to generate a complete Rashomon set for sparse decision trees--a huge set of almost-optimal interpretable ML models. To help ML practitioners identify models with desirable properties from this Rashomon set, we develop TimberTrek, the f…

    Submitted 19 September, 2022; originally announced September 2022.

    Comments: Accepted at IEEE VIS 2022. 5 pages, 6 figures. For a demo video, see https://youtu.be/3eGqTmsStJM. For a live demo, visit https://poloclub.github.io/timbertrek

  36. arXiv:2208.10639  [pdf, other]

    cs.HC

    Evaluating Cardiovascular Surgical Planning in Mobile Augmented Reality

    Authors: Haoyang Yang, Pratham Darrpan Mehta, Jonathan Leo, Zhiyan Zhou, Megan Dass, Anish Upadhayay, Timothy C. Slesnick, Fawwaz Shaw, Amanda Randles, Duen Horng Chau

    Abstract: Advanced surgical procedures for congenital heart diseases (CHDs) require precise planning before the surgeries. The conventional approach utilizes 3D-printing and cutting physical heart models, which is a time and resource intensive process. While rapid advances in augmented reality (AR) technologies have the potential to streamline surgical planning, there is limited research that evaluates such…

    Submitted 22 August, 2022; originally announced August 2022.

    Comments: IEEE VIS 2022. 2 pages, 1 figure

  37. arXiv:2206.15465  [pdf, other]

    cs.LG cs.AI cs.HC

    Interpretability, Then What? Editing Machine Learning Models to Reflect Human Knowledge and Values

    Authors: Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark E. Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, Rich Caruana

    Abstract: Machine learning (ML) interpretability techniques can reveal undesirable patterns in data that models exploit to make predictions--potentially causing harms once deployed. However, how to take action to address these patterns is not always clear. In a collaboration between ML and human-computer interaction researchers, physicians, and data scientists, we develop GAM Changer, the first interactive…

    Submitted 30 June, 2022; originally announced June 2022.

    Comments: Accepted at KDD 2022. 11 pages, 19 figures. For a demo video, see https://youtu.be/D6whtfInqTc. For a live demo, visit https://interpret.ml/gam-changer

  38. arXiv:2206.12540  [pdf, other]

    cs.HC cs.LG

    Visual Auditor: Interactive Visualization for Detection and Summarization of Model Biases

    Authors: David Munechika, Zijie J. Wang, Jack Reidy, Josh Rubin, Krishna Gade, Krishnaram Kenthapadi, Duen Horng Chau

    Abstract: As machine learning (ML) systems become increasingly widespread, it is necessary to audit these systems for biases prior to their deployment. Recent research has developed algorithms for effectively identifying intersectional bias in the form of interpretable, underperforming subsets (or slices) of the data. However, these solutions and their insights are limited without a tool for visually unders…

    Submitted 24 June, 2022; originally announced June 2022.

  39. arXiv:2205.03963  [pdf, other]

    cs.HC

    NOVA: A Practical Method for Creating Notebook-Ready Visual Analytics

    Authors: Zijie J. Wang, David Munechika, Seongmin Lee, Duen Horng Chau

    Abstract: How can we develop visual analytics (VA) tools that can be easily adopted? Visualization researchers have developed a large number of web-based VA tools to help data scientists in a wide range of tasks. However, adopting these standalone systems can be challenging, as they require data scientists to create new workflows to streamline the VA processes. Recent surveys suggest computational notebooks…

    Submitted 15 May, 2023; v1 submitted 8 May, 2022; originally announced May 2022.

    Comments: Accepted to IEEE VIS 2022 (poster). 2 pages, 1 figure. For a live demo, visit https://poloclub.github.io/nova. For method application examples, see https://github.com/poloclub/nova

  40. arXiv:2204.05899  [pdf, other]

    cs.CV cs.HC cs.LG

    VisCUIT: Visual Auditor for Bias in CNN Image Classifier

    Authors: Seongmin Lee, Zijie J. Wang, Judy Hoffman, Duen Horng Chau

    Abstract: CNN image classifiers are widely used, thanks to their efficiency and accuracy. However, they can suffer from biases that impede their practical applications. Most existing bias investigation techniques are either inapplicable to general image classification tasks or require significant user efforts in perusing all data subgroups to manually specify which data attributes to inspect. We present Vis…

    Submitted 13 April, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

    Comments: 9 pages, 4 figures

  41. arXiv:2204.02381  [pdf, other]

    eess.AS cs.LG

    Hear No Evil: Towards Adversarial Robustness of Automatic Speech Recognition via Multi-Task Learning

    Authors: Nilaksh Das, Duen Horng Chau

    Abstract: As automatic speech recognition (ASR) systems are now being widely deployed in the wild, the increasing threat of adversarial attacks raises serious questions about the security and reliability of using such systems. On the other hand, multi-task learning (MTL) has shown success in training models that can resist adversarial attacks in the computer vision domain. In this work, we investigate the i…

    Submitted 5 April, 2022; originally announced April 2022.

    Comments: Submitted to Interspeech 2022

  42. arXiv:2204.00734  [pdf, other]

    cs.CV cs.LG

    SkeleVision: Towards Adversarial Resiliency of Person Tracking with Multi-Task Learning

    Authors: Nilaksh Das, Sheng-Yun Peng, Duen Horng Chau

    Abstract: Person tracking using computer vision techniques has wide ranging applications such as autonomous driving, home security and sports analytics. However, the growing threat of adversarial attacks raises serious concerns regarding the security and reliability of such techniques. In this work, we study the impact of multi-task learning (MTL) on the adversarial robustness of the widely used SiamRPN tra…

    Submitted 1 April, 2022; originally announced April 2022.

  43. arXiv:2203.16475  [pdf, other]

    cs.LG cs.CV

    Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries

    Authors: Haekyu Park, Seongmin Lee, Benjamin Hoover, Austin P. Wright, Omar Shaikh, Rahul Duggal, Nilaksh Das, Kevin Li, Judy Hoffman, Duen Horng Chau

    Abstract: We present ConceptEvo, a unified interpretation framework for deep neural networks (DNNs) that reveals the inception and evolution of learned concepts during training. Our work addresses a critical gap in DNN interpretation research, as existing methods primarily focus on post-training interpretation. ConceptEvo introduces two novel technical contributions: (1) an algorithm that generates a unifie…

    Submitted 22 August, 2023; v1 submitted 30 March, 2022; originally announced March 2022.

    Comments: Accepted at CIKM'23

  44. Erasing Labor with Labor: Dark Patterns and Lockstep Behaviors on Google Play

    Authors: Ashwin Singh, Arvindh Arun, Ayushi Jain, Pooja Desur, Pulak Malhotra, Duen Horng Chau, Ponnurangam Kumaraguru

    Abstract: Google Play's policy forbids the use of incentivized installs, ratings, and reviews to manipulate the placement of apps. However, there still exist apps that incentivize installs for other apps on the platform. To understand how install-incentivizing apps affect users, we examine their ecosystem through a socio-technical lens and perform a mixed-methods analysis of their reviews and permissions. O…

    Submitted 17 May, 2022; v1 submitted 9 February, 2022; originally announced February 2022.

  45. arXiv:2112.03245  [pdf, other]

    cs.LG cs.AI cs.HC

    GAM Changer: Editing Generalized Additive Models with Interactive Visualization

    Authors: Zijie J. Wang, Alex Kale, Harsha Nori, Peter Stella, Mark Nunnally, Duen Horng Chau, Mihaela Vorvoreanu, Jennifer Wortman Vaughan, Rich Caruana

    Abstract: Recent strides in interpretable machine learning (ML) research reveal that models exploit undesirable patterns in the data to make predictions, which potentially causes harms in deployment. However, it is unclear how we can fix these models. We present our ongoing work, GAM Changer, an open-source interactive system to help data scientists and domain experts easily and responsibly edit their Gener…

    Submitted 6 December, 2021; originally announced December 2021.

    Comments: 7 pages, 15 figures, accepted to the Research2Clinics workshop at NeurIPS 2021. For a demo video, see https://youtu.be/2gVSoPoSeJ8. For a live demo, visit https://interpret.ml/gam-changer/

  46. arXiv:2110.14060  [pdf, other]

    cs.HC

    Argo Scholar: Interactive Visual Exploration of Literature in Browsers

    Authors: Kevin Li, Haoyang Yang, Anish Upadhayay, Zhiyan Zhou, Jon Saad-Falcon, Duen Horng Chau

    Abstract: Discovering and making sense of relevant research literature is fundamental to becoming knowledgeable in any scientific discipline. Visualization can aid this process; however, existing tools' adoption and impact have often been constrained, such as by their reliance on small curated paper datasets that quickly become outdated or a lack of support for personalized exploration. We introduce Argo Sc…

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: IEEE VIS 2021

  47. arXiv:2110.11227  [pdf, other]

    cs.HC

    Towards Automatic Grading of D3.js Visualizations

    Authors: Matthew Hull, Connor Guerin, Justin Chen, Susanta Routray, Duen Horng Chau

    Abstract: Manually grading D3 data visualizations is a challenging endeavor, and is especially difficult for large classes with hundreds of students. Grading an interactive visualization requires a combination of interactive, quantitative, and qualitative evaluation that are conventionally done manually and are difficult to scale up as the visualization complexity, data size, and number of students increase…

    Submitted 21 October, 2021; originally announced October 2021.

    Comments: Accepted to IEEE VIS'21. For a demo video, see https://youtu.be/hA2I36Gm0YM

    ACM Class: H.5

  48. arXiv:2108.13751  [pdf, other]

    cs.CL cs.HC cs.IR

    A Search Engine for Discovery of Scientific Challenges and Directions

    Authors: Dan Lahav, Jon Saad Falcon, Bailey Kuehl, Sophie Johnson, Sravanthi Parasa, Noam Shomron, Duen Horng Chau, Diyi Yang, Eric Horvitz, Daniel S. Weld, Tom Hope

    Abstract: Keeping track of scientific challenges, advances and emerging directions is a fundamental part of research. However, researchers face a flood of papers that hinders discovery of important knowledge. In biomedicine, this directly impacts human lives. To address this problem, we present a novel task of extraction and search of scientific challenges and directions, to facilitate rapid knowledge disco…

    Submitted 19 January, 2022; v1 submitted 31 August, 2021; originally announced August 2021.

    Comments: AAAI 2022

    Journal ref: AAAI 2022

  49. arXiv:2108.12931  [pdf, other]

    cs.CV

    NeuroCartography: Scalable Automatic Visual Summarization of Concepts in Deep Neural Networks

    Authors: Haekyu Park, Nilaksh Das, Rahul Duggal, Austin P. Wright, Omar Shaikh, Fred Hohman, Duen Horng Chau

    Abstract: Existing research on making sense of deep neural networks often focuses on neuron-level interpretation, which may not adequately capture the bigger picture of how concepts are collectively encoded by multiple neurons. We present NeuroCartography, an interactive system that scalably summarizes and visualizes concepts learned by neural networks. It automatically discovers and groups neurons that det…

    Submitted 29 August, 2021; originally announced August 2021.

    Comments: Accepted to IEEE VIS'21

  50. arXiv:2106.11846  [pdf, other]

    econ.GN cs.IR

    Quantifying the Impact of Human Capital, Job History, and Language Factors on Job Seniority with a Large-scale Analysis of Resumes

    Authors: Austin P Wright, Caleb Ziems, Haekyu Park, Jon Saad-Falcon, Duen Horng Chau, Diyi Yang, Maria Tomprou

    Abstract: As job markets worldwide have become more competitive and applicant selection criteria have become more opaque, and different (and sometimes contradictory) information and advice is available for job seekers wishing to progress in their careers, it has never been more difficult to determine which factors in a résumé most effectively help career progression. In this work we present a novel, large s…

    Submitted 15 June, 2021; originally announced June 2021.

    Comments: 9 pages, 5 figures, 8 tables