-
Show, Don't Tell: Aligning Language Models with Demonstrated Feedback
Authors:
Omar Shaikh,
Michelle Lam,
Joey Hejna,
Yijia Shao,
Michael Bernstein,
Diyi Yang
Abstract:
Language models are aligned to emulate the collective voice of many, resulting in outputs that align with no one in particular. Steering LLMs away from generic output is possible through supervised finetuning or RLHF, but requires prohibitively large datasets for new ad-hoc tasks. We argue that it is instead possible to align an LLM to a specific setting by leveraging a very small number ($<10$) of demonstrations as feedback. Our method, Demonstration ITerated Task Optimization (DITTO), directly aligns language model outputs to a user's demonstrated behaviors. Derived using ideas from online imitation learning, DITTO cheaply generates online comparison data by treating users' demonstrations as preferred over output from the LLM and its intermediate checkpoints. We evaluate DITTO's ability to learn fine-grained style and task alignment across domains such as news articles, emails, and blog posts. Additionally, we conduct a user study soliciting a range of demonstrations from participants ($N=16$). Across our benchmarks and user study, we find that win-rates for DITTO outperform few-shot prompting, supervised fine-tuning, and other self-play methods by an average of 19 percentage points. By using demonstrations as feedback directly, DITTO offers a novel method for effective customization of LLMs.
Submitted 2 June, 2024;
originally announced June 2024.
-
Social Skill Training with Large Language Models
Authors:
Diyi Yang,
Caleb Ziems,
William Held,
Omar Shaikh,
Michael S. Bernstein,
John Mitchell
Abstract:
People rely on social skills like conflict resolution to communicate effectively and to thrive in both work and personal life. However, practice environments for social skills are typically out of reach for most people. How can we make social skill training more available, accessible, and inviting? Drawing upon interdisciplinary research from communication and psychology, this perspective paper identifies social skill barriers to enter specialized fields. Then we present a solution that leverages large language models for social skill training via a generic framework. Our AI Partner, AI Mentor framework merges experiential learning with realistic practice and tailored feedback. This work ultimately calls for cross-disciplinary innovation to address the broader implications for workforce development and social equality.
Submitted 5 April, 2024;
originally announced April 2024.
-
Grounding Gaps in Language Model Generations
Authors:
Omar Shaikh,
Kristina Gligorić,
Ashna Khetan,
Matthias Gerstgrasser,
Diyi Yang,
Dan Jurafsky
Abstract:
Effective conversation requires common ground: a shared understanding between the participants. Common ground, however, does not emerge spontaneously in conversation. Speakers and listeners work together to both identify and construct a shared basis while avoiding misunderstanding. To accomplish grounding, humans rely on a range of dialogue acts, like clarification (What do you mean?) and acknowledgment (I understand.). However, it is unclear whether large language models (LLMs) generate text that reflects human grounding. To this end, we curate a set of grounding acts and propose corresponding metrics that quantify attempted grounding. We study whether LLM generations contain grounding acts, simulating turn-taking from several dialogue datasets and comparing results to humans. We find that -- compared to humans -- LLMs generate language with less conversational grounding, instead generating text that appears to simply presume common ground. To understand the roots of the identified grounding gap, we examine the role of instruction tuning and preference optimization, finding that training on contemporary preference data leads to a reduction in generated grounding acts. Altogether, we highlight the need for more research investigating conversational grounding in human-AI interaction.
Submitted 2 April, 2024; v1 submitted 15 November, 2023;
originally announced November 2023.
-
Rehearsal: Simulating Conflict to Teach Conflict Resolution
Authors:
Omar Shaikh,
Valentino Chai,
Michele J. Gelfand,
Diyi Yang,
Michael S. Bernstein
Abstract:
Interpersonal conflict is an uncomfortable but unavoidable fact of life. Navigating conflict successfully is a skill -- one that can be learned through deliberate practice -- but few have access to effective training or feedback. To expand this access, we introduce Rehearsal, a system that allows users to rehearse conflicts with a believable simulated interlocutor, explore counterfactual "what if?" scenarios to identify alternative conversational paths, and learn through feedback on how and when to apply specific conflict strategies. Users can utilize Rehearsal to practice handling a variety of predefined conflict scenarios, from office disputes to relationship issues, or they can choose to create their own setting. To enable Rehearsal, we develop IRP prompting, a method of conditioning output of a large language model on the influential Interest-Rights-Power (IRP) theory from conflict resolution. Rehearsal uses IRP to generate utterances grounded in conflict resolution theory, guiding users towards counterfactual conflict resolution strategies that help de-escalate difficult conversations. In a between-subjects evaluation, 40 participants engaged in an actual conflict with a confederate after training. Compared to a control group with lecture material covering the same IRP theory, participants with simulated training from Rehearsal significantly improved their performance in the unaided conflict: they reduced their use of escalating competitive strategies by an average of 67%, while doubling their use of cooperative strategies. Overall, Rehearsal highlights the potential effectiveness of language models as tools for learning and practicing interpersonal skills.
Submitted 29 February, 2024; v1 submitted 21 September, 2023;
originally announced September 2023.
-
Modeling Cross-Cultural Pragmatic Inference with Codenames Duet
Authors:
Omar Shaikh,
Caleb Ziems,
William Held,
Aryan J. Pariani,
Fred Morstatter,
Diyi Yang
Abstract:
Pragmatic reference enables efficient interpersonal communication. Prior work uses simple reference games to test models of pragmatic reasoning, often with unidentified speakers and listeners. In practice, however, speakers' sociocultural background shapes their pragmatic assumptions. For example, readers of this paper assume NLP refers to "Natural Language Processing," and not "Neuro-linguistic Programming." This work introduces the Cultural Codes dataset, which operationalizes sociocultural pragmatic inference in a simple word reference game.
Cultural Codes is based on the multi-turn collaborative two-player game, Codenames Duet. Our dataset consists of 794 games with 7,703 turns, distributed across 153 unique players. Alongside gameplay, we collect information about players' personalities, values, and demographics. Utilizing theories of communication and pragmatics, we predict each player's actions via joint modeling of their sociocultural priors and the game context. Our experiments show that accounting for background characteristics significantly improves model performance for tasks related to both clue giving and guessing, indicating that sociocultural priors play a vital role in gameplay decisions.
Submitted 4 June, 2023;
originally announced June 2023.
-
Can Large Language Models Transform Computational Social Science?
Authors:
Caleb Ziems,
William Held,
Omar Shaikh,
Jiaao Chen,
Zhehao Zhang,
Diyi Yang
Abstract:
Large Language Models (LLMs) are capable of successfully performing many language processing tasks zero-shot (without training data). If zero-shot LLMs can also reliably classify and explain social phenomena like persuasiveness and political ideology, then LLMs could augment the Computational Social Science (CSS) pipeline in important ways. This work provides a road map for using LLMs as CSS tools. Towards this end, we contribute a set of prompting best practices and an extensive evaluation pipeline to measure the zero-shot performance of 13 language models on 25 representative English CSS benchmarks. On taxonomic labeling tasks (classification), LLMs fail to outperform the best fine-tuned models but still achieve fair levels of agreement with humans. On free-form coding tasks (generation), LLMs produce explanations that often exceed the quality of crowdworkers' gold references. We conclude that the performance of today's LLMs can augment the CSS research pipeline in two ways: (1) serving as zero-shot data annotators on human annotation teams, and (2) bootstrapping challenging creative generation tasks (e.g., explaining the underlying attributes of a text). In summary, LLMs are poised to meaningfully participate in social science analysis in partnership with humans.
Submitted 26 February, 2024; v1 submitted 12 April, 2023;
originally announced May 2023.
-
On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning
Authors:
Omar Shaikh,
Hongxin Zhang,
William Held,
Michael Bernstein,
Diyi Yang
Abstract:
Generating a Chain of Thought (CoT) has been shown to consistently improve large language model (LLM) performance on a wide range of NLP tasks. However, prior work has mainly focused on logical reasoning tasks (e.g. arithmetic, commonsense QA); it remains unclear whether improvements hold for more diverse types of reasoning, especially in socially situated contexts. Concretely, we perform a controlled evaluation of zero-shot CoT across two socially sensitive domains: harmful questions and stereotype benchmarks. We find that zero-shot CoT reasoning in sensitive domains significantly increases a model's likelihood to produce harmful or undesirable output, with trends holding across different prompt formats and model variants. Furthermore, we show that harmful CoTs increase with model size, but decrease with improved instruction following. Our work suggests that zero-shot CoT should be used with caution on socially important tasks, especially when marginalized groups or sensitive topics are involved.
Submitted 4 June, 2023; v1 submitted 15 December, 2022;
originally announced December 2022.
-
Explaining Website Reliability by Visualizing Hyperlink Connectivity
Authors:
Seongmin Lee,
Sadia Afroz,
Haekyu Park,
Zijie J. Wang,
Omar Shaikh,
Vibhor Sehgal,
Ankit Peshin,
Duen Horng Chau
Abstract:
As the information on the Internet continues growing exponentially, understanding and assessing the reliability of a website is becoming increasingly important. Misinformation has far-ranging repercussions, from sowing mistrust in media to undermining democratic elections. While some research investigates how to alert people to misinformation on the web, much less research has been conducted on explaining how websites engage in spreading false information. To fill the research gap, we present MisVis, a web-based interactive visualization tool that helps users assess a website's reliability by understanding how it engages in spreading false information on the World Wide Web. MisVis visualizes the hyperlink connectivity of the website and summarizes key characteristics of the Twitter accounts that mention the site. A large-scale user study with 139 participants demonstrates that MisVis helps users assess and understand false information on the web, and that node-link diagrams can be used to communicate with non-experts. MisVis is available at the public demo link: https://poloclub.github.io/MisVis.
Submitted 30 September, 2022;
originally announced October 2022.
-
Six Feet Apart: Online Payments During the COVID-19 Pandemic
Authors:
Omar Shaikh,
Cassandra Ung,
Diyi Yang,
Felipe Chacon
Abstract:
Since the COVID-19 pandemic, businesses have faced unprecedented challenges when trying to remain open. Because COVID-19 spreads through aerosolized droplets, businesses were forced to distance their services; in some cases, distancing may have involved moving business services online. In this work, we explore digitization strategies used by small businesses that remained open during the pandemic, and survey/interview small business owners to understand preliminary challenges associated with moving online. Furthermore, we analyze payments from 400K businesses across Japan, Australia, United States, Great Britain, and Canada. Following initial government interventions, we observe (at minimum for each country) a 47% increase in digitizing businesses compared to pre-pandemic levels, with about 80% of surveyed businesses digitizing in under a week. From both our quantitative models and our surveys/interviews, we find that businesses rapidly digitized at the start of the pandemic in preparation for future uncertainty. We also conduct a case-study of initial digitization in the United States, examining finer relationships between specific government interventions, business sectors, political orientation, and resulting digitization shifts. Finally, we discuss the implications of rapid & widespread digitization for small businesses in the context of usability challenges and interpersonal interactions, while highlighting potential shifts in pre-existing social norms.
Submitted 29 April, 2022;
originally announced April 2022.
-
Concept Evolution in Deep Learning Training: A Unified Interpretation Framework and Discoveries
Authors:
Haekyu Park,
Seongmin Lee,
Benjamin Hoover,
Austin P. Wright,
Omar Shaikh,
Rahul Duggal,
Nilaksh Das,
Kevin Li,
Judy Hoffman,
Duen Horng Chau
Abstract:
We present ConceptEvo, a unified interpretation framework for deep neural networks (DNNs) that reveals the inception and evolution of learned concepts during training. Our work addresses a critical gap in DNN interpretation research, as existing methods primarily focus on post-training interpretation. ConceptEvo introduces two novel technical contributions: (1) an algorithm that generates a unified semantic space, enabling side-by-side comparison of different models during training, and (2) an algorithm that discovers and quantifies important concept evolutions for class predictions. Through a large-scale human evaluation and quantitative experiments, we demonstrate that ConceptEvo successfully identifies concept evolutions across different models, which are not only comprehensible to humans but also crucial for class predictions. ConceptEvo is applicable to both modern DNN architectures, such as ConvNeXt, and classic DNNs, such as VGGs and InceptionV3.
Submitted 22 August, 2023; v1 submitted 30 March, 2022;
originally announced March 2022.
-
NeuroCartography: Scalable Automatic Visual Summarization of Concepts in Deep Neural Networks
Authors:
Haekyu Park,
Nilaksh Das,
Rahul Duggal,
Austin P. Wright,
Omar Shaikh,
Fred Hohman,
Duen Horng Chau
Abstract:
Existing research on making sense of deep neural networks often focuses on neuron-level interpretation, which may not adequately capture the bigger picture of how concepts are collectively encoded by multiple neurons. We present NeuroCartography, an interactive system that scalably summarizes and visualizes concepts learned by neural networks. It automatically discovers and groups neurons that detect the same concepts, and describes how such neuron groups interact to form higher-level concepts and the subsequent predictions. NeuroCartography introduces two scalable summarization techniques: (1) neuron clustering groups neurons based on the semantic similarity of the concepts detected by neurons (e.g., neurons detecting "dog faces" of different breeds are grouped); and (2) neuron embedding encodes the associations between related concepts based on how often they co-occur (e.g., neurons detecting "dog face" and "dog tail" are placed closer in the embedding space). Key to our scalable techniques is the ability to efficiently compute all neuron pairs' relationships, in time linear to the number of neurons instead of quadratic time. NeuroCartography scales to large data, such as the ImageNet dataset with 1.2M images. The system's tightly coordinated views integrate the scalable techniques to visualize the concepts and their relationships, projecting the concept associations to a 2D space in Neuron Projection View, and summarizing neuron clusters and their relationships in Graph View. Through a large-scale human evaluation, we demonstrate that our technique discovers neuron groups that represent coherent, human-meaningful concepts. And through usage scenarios, we describe how our approaches enable interesting and surprising discoveries, such as concept cascades of related and isolated concepts. The NeuroCartography visualization runs in modern browsers and is open-sourced.
Submitted 29 August, 2021;
originally announced August 2021.
-
EnergyVis: Interactively Tracking and Exploring Energy Consumption for ML Models
Authors:
Omar Shaikh,
Jon Saad-Falcon,
Austin P Wright,
Nilaksh Das,
Scott Freitas,
Omar Isaac Asensio,
Duen Horng Chau
Abstract:
The advent of larger machine learning (ML) models has improved state-of-the-art (SOTA) performance in various modeling tasks, ranging from computer vision to natural language. As ML models continue increasing in size, so do their respective energy consumption and computational requirements. However, the methods for tracking, reporting, and comparing energy consumption remain limited. We present EnergyVis, an interactive energy consumption tracker for ML models. Consisting of multiple coordinated views, EnergyVis enables researchers to interactively track, visualize and compare model energy consumption across key energy consumption and carbon footprint metrics (kWh and CO2), helping users explore alternative deployment locations and hardware that may reduce carbon footprints. EnergyVis aims to raise awareness concerning computational sustainability by interactively highlighting excessive energy usage during model training, and by providing alternative training options to reduce energy usage.
Submitted 30 March, 2021;
originally announced March 2021.
-
RECAST: Enabling User Recourse and Interpretability of Toxicity Detection Models with Interactive Visualization
Authors:
Austin P Wright,
Omar Shaikh,
Haekyu Park,
Will Epperson,
Muhammed Ahmed,
Stephane Pinel,
Duen Horng Chau,
Diyi Yang
Abstract:
With the widespread use of toxic language online, platforms are increasingly using automated systems that leverage advances in natural language processing to automatically flag and remove toxic comments. However, most automated systems -- when detecting and moderating toxic language -- do not provide feedback to their users, let alone provide an avenue of recourse for these users to make actionable changes. We present our work, RECAST, an interactive, open-sourced web tool for visualizing these models' toxic predictions, while providing alternative suggestions for flagged toxic language. Our work also provides users with a new path of recourse when using these automated moderation tools. RECAST highlights text responsible for classifying toxicity, and allows users to interactively substitute potentially toxic phrases with neutral alternatives. We examined the effect of RECAST via two large-scale user evaluations, and found that RECAST was highly effective at helping users reduce toxicity as detected through the model. Users also gained a stronger understanding of the underlying toxicity criterion used by black-box models, enabling transparency and recourse. In addition, we found that when users focus on optimizing language for these models instead of their own judgement (which is the implied incentive and goal of deploying automated models), these models cease to be effective classifiers of toxicity compared to human annotations. This opens a discussion for how toxicity detection models work and should work, and their effect on the future of online discourse.
Submitted 10 February, 2021; v1 submitted 8 February, 2021;
originally announced February 2021.
-
Examining the Ordering of Rhetorical Strategies in Persuasive Requests
Authors:
Omar Shaikh,
Jiaao Chen,
Jon Saad-Falcon,
Duen Horng Chau,
Diyi Yang
Abstract:
Interpreting how persuasive language influences audiences has implications across many domains like advertising, argumentation, and propaganda. Persuasion relies on more than a message's content. Arranging the order of the message itself (i.e., ordering specific rhetorical strategies) also plays an important role. To examine how strategy orderings contribute to persuasiveness, we first utilize a Variational Autoencoder model to disentangle content and rhetorical strategies in textual requests from a large-scale loan request corpus. We then visualize interplay between content and strategy through an attentional LSTM that predicts the success of textual requests. We find that specific (orderings of) strategies interact uniquely with a request's content to impact success rate, and thus the persuasiveness of a request.
Submitted 11 October, 2020; v1 submitted 9 October, 2020;
originally announced October 2020.
-
Mapping Researchers with PeopleMap
Authors:
Jon Saad-Falcon,
Omar Shaikh,
Zijie J. Wang,
Austin P. Wright,
Sasha Richardson,
Duen Horng Chau
Abstract:
Discovering research expertise at universities can be a difficult task. Directories routinely become outdated, and few help in visually summarizing researchers' work or supporting the exploration of shared interests among researchers. This results in lost opportunities for both internal and external entities to discover new connections, nurture research collaboration, and explore the diversity of research. To address this problem, at Georgia Tech, we have been developing PeopleMap, an open-source interactive web-based tool that uses natural language processing (NLP) to create visual maps for researchers based on their research interests and publications. Requiring only the researchers' Google Scholar profiles as input, PeopleMap generates and visualizes embeddings for the researchers, significantly reducing the need for manual curation of publication information. To encourage and facilitate easy adoption and extension of PeopleMap, we have open-sourced it under the permissive MIT license at https://github.com/poloclub/people-map. PeopleMap has received positive feedback and enthusiasm for expanding its adoption across Georgia Tech.
Submitted 31 August, 2020;
originally announced September 2020.
-
Argo Lite: Open-Source Interactive Graph Exploration and Visualization in Browsers
Authors:
Siwei Li,
Zhiyan Zhou,
Anish Upadhayay,
Omar Shaikh,
Scott Freitas,
Haekyu Park,
Zijie J. Wang,
Susanta Routray,
Matthew Hull,
Duen Horng Chau
Abstract:
Graph data have become increasingly common. Visualizing them helps people better understand relations among entities. Unfortunately, existing graph visualization tools are primarily designed for single-person desktop use, offering limited support for interactive web-based exploration and online collaborative analysis. To address these issues, we have developed Argo Lite, a new in-browser interactive graph exploration and visualization tool. Argo Lite enables users to publish and share interactive graph visualizations as URLs and embedded web widgets. Users can explore graphs incrementally by adding more related nodes, such as highly cited papers cited by or citing a paper of interest in a citation network. Argo Lite works across devices and platforms, leveraging WebGL for high-performance rendering. Argo Lite has been used by over 1,000 students at Georgia Tech's Data and Visual Analytics class. Argo Lite may serve as a valuable open-source tool for advancing multiple CIKM research areas, from data presentation, to interfaces for information systems and more.
Submitted 26 August, 2020;
originally announced August 2020.
-
PeopleMap: Visualization Tool for Mapping Out Researchers using Natural Language Processing
Authors:
Jon Saad-Falcon,
Omar Shaikh,
Zijie J. Wang,
Austin P. Wright,
Sasha Richardson,
Duen Horng Chau
Abstract:
Discovering research expertise at institutions can be a difficult task. Manually curated university directories easily become out of date and they often lack the information necessary for understanding a researcher's interests and past work, making it harder to explore the diversity of research at an institution and identify research talents. This results in lost opportunities for both internal and external entities to discover new connections and nurture research collaboration. To solve this problem, we have developed PeopleMap, the first interactive, open-source, web-based tool that visually "maps out" researchers based on their research interests and publications by leveraging embeddings generated by natural language processing (NLP) techniques. PeopleMap provides a new engaging way for institutions to summarize their research talents and for people to discover new connections. The platform is developed with ease-of-use and sustainability in mind. Using only researchers' Google Scholar profiles as input, PeopleMap can be readily adopted by any institution using its publicly-accessible repository and detailed documentation.
Submitted 10 June, 2020;
originally announced June 2020.
-
CNN Explainer: Learning Convolutional Neural Networks with Interactive Visualization
Authors:
Zijie J. Wang,
Robert Turko,
Omar Shaikh,
Haekyu Park,
Nilaksh Das,
Fred Hohman,
Minsuk Kahng,
Duen Horng Chau
Abstract:
Deep learning's great success motivates many practitioners and students to learn about this exciting technology. However, it is often challenging for beginners to take their first step due to the complexity of understanding and applying deep learning. We present CNN Explainer, an interactive visualization tool designed for non-experts to learn and examine convolutional neural networks (CNNs), a foundational deep learning model architecture. Our tool addresses key challenges that novices face while learning about CNNs, which we identify from interviews with instructors and a survey with past students. CNN Explainer tightly integrates a model overview that summarizes a CNN's structure, and on-demand, dynamic visual explanation views that help users understand the underlying components of CNNs. Through smooth transitions across levels of abstraction, our tool enables users to inspect the interplay between low-level mathematical operations and high-level model structures. A qualitative user study shows that CNN Explainer helps users more easily understand the inner workings of CNNs, and is engaging and enjoyable to use. We also derive design lessons from our study. Developed using modern web technologies, CNN Explainer runs locally in users' web browsers without the need for installation or specialized hardware, broadening the public's education access to modern deep learning techniques.
Submitted 28 August, 2020; v1 submitted 30 April, 2020;
originally announced April 2020.
-
Real-Time Well Log Prediction From Drilling Data Using Deep Learning
Authors:
Rayan Kanfar,
Obai Shaikh,
Mehrdad Yousefzadeh,
Tapan Mukerji
Abstract:
The objective is to study the feasibility of predicting subsurface rock properties in wells from real-time drilling data. Geophysical logs, namely density, porosity, and sonic logs, are of paramount importance for subsurface resource estimation and exploitation. These wireline petrophysical measurements are selectively deployed as they are expensive to acquire; meanwhile, drilling information is recorded in every drilled well. Hence a predictive tool for wireline log prediction from drilling data can help management make decisions about data acquisition, especially for delineation and production wells. This problem is non-linear with strong interactions between drilling parameters; hence the potential for deep learning to address this problem is explored. We present a workflow for data augmentation and feature engineering using Distance-based Global Sensitivity Analysis. We propose an Inception-based Convolutional Neural Network combined with a Temporal Convolutional Network as the deep learning model. The model is designed to learn both low- and high-frequency content of the data. Twelve wells from the Equinor dataset for the Volve field in the North Sea are used for learning. The model predictions not only capture trends but are also physically consistent across density, porosity, and sonic logs. On the test data, the mean square error reaches a low value of 0.04, but the correlation coefficient plateaus around 0.6. However, the model is able to differentiate between different types of rocks such as cemented sandstone, unconsolidated sands, and shale.
Submitted 27 January, 2020;
originally announced January 2020.
-
CNN 101: Interactive Visual Learning for Convolutional Neural Networks
Authors:
Zijie J. Wang,
Robert Turko,
Omar Shaikh,
Haekyu Park,
Nilaksh Das,
Fred Hohman,
Minsuk Kahng,
Duen Horng Chau
Abstract:
The success of deep learning in solving problems previously thought to be hard has inspired many non-experts to learn and understand this exciting technology. However, it is often challenging for learners to take the first steps due to the complexity of deep learning models. We present our ongoing work, CNN 101, an interactive visualization system for explaining and teaching convolutional neural networks. Through tightly integrated interactive views, CNN 101 offers both overview and detailed descriptions of how a model works. Built using modern web technologies, CNN 101 runs locally in users' web browsers without requiring specialized hardware, broadening the public's education access to modern deep learning techniques.
Submitted 27 February, 2020; v1 submitted 7 January, 2020;
originally announced January 2020.
-
RECAST: Interactive Auditing of Automatic Toxicity Detection Models
Authors:
Austin P. Wright,
Omar Shaikh,
Haekyu Park,
Will Epperson,
Muhammed Ahmed,
Stephane Pinel,
Diyi Yang,
Duen Horng Chau
Abstract:
As toxic language becomes nearly pervasive online, there has been increasing interest in leveraging advancements in natural language processing (NLP), such as very large transformer models, to automatically detect and remove toxic comments. Despite the fairness concerns, lack of adversarial robustness, and limited prediction explainability of deep learning systems, there is currently little work on auditing these systems and understanding how they work for both developers and users. We present our ongoing work, RECAST, an interactive tool for examining toxicity detection models by visualizing explanations for predictions and providing alternative wordings for detected toxic speech.
Submitted 1 July, 2020; v1 submitted 6 January, 2020;
originally announced January 2020.