subscribe to arXiv mailings

"It's a Good Idea to Put It Into Words": Writing `Rudders' in the Initial Stages of Visualization Design

Authors: Chase Stokes, Clara Hu, Marti A. Hearst

Abstract: Written language is a useful tool for non-visual creative activities like writing essays and planning searches. This paper investigates the integration of written language in to the visualization design process. We create the idea of a 'writing rudder,' which acts as a guiding force or strategy for the design. Via an interview study of 24 working visualization designers, we first established that… ▽ More Written language is a useful tool for non-visual creative activities like writing essays and planning searches. This paper investigates the integration of written language in to the visualization design process. We create the idea of a 'writing rudder,' which acts as a guiding force or strategy for the design. Via an interview study of 24 working visualization designers, we first established that only a minority of participants systematically use writing to aid in design. A second study with 15 visualization designers examined four different variants of written rudders: asking questions, stating conclusions, composing a narrative, and writing titles. Overall, participants had a positive reaction; designers recognized the benefits of explicitly writing down components of the design and indicated that they would use this approach in future design work. More specifically, two approaches - writing questions and writing conclusions/takeaways - were seen as beneficial across the design process, while writing narratives showed promise mainly for the creation stage. Although concerns around potential bias during data exploration were raised, participants also discussed strategies to mitigate such concerns. This paper contributes to a deeper understanding of the interplay between language and visualization, and proposes a straightforward, lightweight addition to the visualization design process. △ Less

Submitted 22 July, 2024; originally announced July 2024.

Comments: 11 pages, 3 figures, accepted for IEEE VIS conference 2024

ACM Class: H.5.0

arXiv:2404.00131 [pdf, ps, other]

Give Text A Chance: Advocating for Equal Consideration for Language and Visualization

Authors: Chase Stokes, Marti A. Hearst

Abstract: Visualization research tends to de-emphasize consideration of the textual context in which its images are placed. We argue that visualization research should consider textual representations as a primary alternative to visual options when assessing designs, and when assessing designs, equal attention should be given to the construction of the language as to the visualizations. We also call for a c… ▽ More Visualization research tends to de-emphasize consideration of the textual context in which its images are placed. We argue that visualization research should consider textual representations as a primary alternative to visual options when assessing designs, and when assessing designs, equal attention should be given to the construction of the language as to the visualizations. We also call for a consideration of readability when integrating visualizations with written text. In highlighting these points, visualization research would be elevated in efficacy and demonstrate thorough accounting for viewers' needs and responses. △ Less

Submitted 29 March, 2024; originally announced April 2024.

Comments: 2 pages

Journal ref: NL VIZ: 2021 Workshop on Exploring Opportunities and Challenges for Natural Language Techniques to Support Visual Analysis

arXiv:2402.12255 [pdf, other]

Shallow Synthesis of Knowledge in GPT-Generated Texts: A Case Study in Automatic Related Work Composition

Authors: Anna Martin-Boyle, Aahan Tyagi, Marti A. Hearst, Dongyeop Kang

Abstract: Numerous AI-assisted scholarly applications have been developed to aid different stages of the research process. We present an analysis of AI-assisted scholarly writing generated with ScholaCite, a tool we built that is designed for organizing literature and composing Related Work sections for academic papers. Our evaluation method focuses on the analysis of citation graphs to assess the structura… ▽ More Numerous AI-assisted scholarly applications have been developed to aid different stages of the research process. We present an analysis of AI-assisted scholarly writing generated with ScholaCite, a tool we built that is designed for organizing literature and composing Related Work sections for academic papers. Our evaluation method focuses on the analysis of citation graphs to assess the structural complexity and inter-connectedness of citations in texts and involves a three-way comparison between (1) original human-written texts, (2) purely GPT-generated texts, and (3) human-AI collaborative texts. We find that GPT-4 can generate reasonable coarse-grained citation groupings to support human users in brainstorming, but fails to perform detailed synthesis of related works without human intervention. We suggest that future writing assistant tools should not be used to draft text independently of the human author. △ Less

Submitted 19 February, 2024; originally announced February 2024.

Comments: 15 pages, 5 figures, submitted to ACL 2024

arXiv:2401.04052 [pdf]

doi 10.1109/TVCG.2023.3338451

The Role of Text in Visualizations: How Annotations Shape Perceptions of Bias and Influence Predictions

Authors: Chase Stokes, Cindy Xiong Bearfield, Marti A. Hearst

Abstract: This paper investigates the role of text in visualizations, specifically the impact of text position, semantic content, and biased wording. Two empirical studies were conducted based on two tasks (predicting data trends and appraising bias) using two visualization types (bar and line charts). While the addition of text had a minimal effect on how people perceive data trends, there was a significan… ▽ More This paper investigates the role of text in visualizations, specifically the impact of text position, semantic content, and biased wording. Two empirical studies were conducted based on two tasks (predicting data trends and appraising bias) using two visualization types (bar and line charts). While the addition of text had a minimal effect on how people perceive data trends, there was a significant impact on how biased they perceive the authors to be. This finding revealed a relationship between the degree of bias in textual information and the perception of the authors' bias. Exploratory analyses support an interaction between a person's prediction and the degree of bias they perceived. This paper also develops a crowdsourced method for creating chart annotations that range from neutral to highly biased. This research highlights the need for designers to mitigate potential polarization of readers' opinions based on how authors' ideas are expressed. △ Less

Submitted 8 January, 2024; originally announced January 2024.

Comments: 12 pages, 7 figures, for supplemental materials: https://github.com/chasejstokes/role-text

ACM Class: H.5.0

arXiv:2309.15337 [pdf, other]

Beyond the Chat: Executable and Verifiable Text-Editing with LLMs

Authors: Philippe Laban, Jesse Vig, Marti A. Hearst, Caiming Xiong, Chien-Sheng Wu

Abstract: Conversational interfaces powered by Large Language Models (LLMs) have recently become a popular way to obtain feedback during document editing. However, standard chat-based conversational interfaces do not support transparency and verifiability of the editing changes that they suggest. To give the author more agency when editing with an LLM, we present InkSync, an editing interface that suggests… ▽ More Conversational interfaces powered by Large Language Models (LLMs) have recently become a popular way to obtain feedback during document editing. However, standard chat-based conversational interfaces do not support transparency and verifiability of the editing changes that they suggest. To give the author more agency when editing with an LLM, we present InkSync, an editing interface that suggests executable edits directly within the document being edited. Because LLMs are known to introduce factual errors, Inksync also supports a 3-stage approach to mitigate this risk: Warn authors when a suggested edit introduces new information, help authors Verify the new information's accuracy through external search, and allow an auditor to perform an a-posteriori verification by Auditing the document via a trace of all auto-generated content. Two usability studies confirm the effectiveness of InkSync's components when compared to standard LLM-based chat interfaces, leading to more accurate, more efficient editing, and improved user experience. △ Less

Submitted 26 September, 2023; originally announced September 2023.

arXiv:2305.14660 [pdf, other]

Complex Mathematical Symbol Definition Structures: A Dataset and Model for Coordination Resolution in Definition Extraction

Authors: Anna Martin-Boyle, Andrew Head, Kyle Lo, Risham Sidhu, Marti A. Hearst, Dongyeop Kang

Abstract: Mathematical symbol definition extraction is important for improving scholarly reading interfaces and scholarly information extraction (IE). However, the task poses several challenges: math symbols are difficult to process as they are not composed of natural language morphemes; and scholarly papers often contain sentences that require resolving complex coordinate structures. We present SymDef, an… ▽ More Mathematical symbol definition extraction is important for improving scholarly reading interfaces and scholarly information extraction (IE). However, the task poses several challenges: math symbols are difficult to process as they are not composed of natural language morphemes; and scholarly papers often contain sentences that require resolving complex coordinate structures. We present SymDef, an English language dataset of 5,927 sentences from full-text scientific papers where each sentence is annotated with all mathematical symbols linked with their corresponding definitions. This dataset focuses specifically on complex coordination structures such as "respectively" constructions, which often contain overlapping definition spans. We also introduce a new definition extraction method that masks mathematical symbols, creates a copy of each sentence for each symbol, specifies a target symbol, and predicts its corresponding definition spans using slot filling. Our experiments show that our definition extraction model significantly outperforms RoBERTa and other strong IE baseline systems by 10.9 points with a macro F1 score of 84.82. With our dataset and model, we can detect complex definitions in scholarly documents to make scientific writing more readable. △ Less

Submitted 23 May, 2023; originally announced May 2023.

Comments: 9 pages, 4 figures

ACM Class: I.2.7

arXiv:2304.02812 [pdf, other]

Pragmatically Appropriate Diversity for Dialogue Evaluation

Authors: Katherine Stasaski, Marti A. Hearst

Abstract: Linguistic pragmatics state that a conversation's underlying speech acts can constrain the type of response which is appropriate at each turn in the conversation. When generating dialogue responses, neural dialogue agents struggle to produce diverse responses. Currently, dialogue diversity is assessed using automatic metrics, but the underlying speech acts do not inform these metrics. To remedy… ▽ More Linguistic pragmatics state that a conversation's underlying speech acts can constrain the type of response which is appropriate at each turn in the conversation. When generating dialogue responses, neural dialogue agents struggle to produce diverse responses. Currently, dialogue diversity is assessed using automatic metrics, but the underlying speech acts do not inform these metrics. To remedy this, we propose the notion of Pragmatically Appropriate Diversity, defined as the extent to which a conversation creates and constrains the creation of multiple diverse responses. Using a human-created multi-response dataset, we find significant support for the hypothesis that speech acts provide a signal for the diversity of the set of next responses. Building on this result, we propose a new human evaluation task where creative writers predict the extent to which conversations inspire the creation of multiple diverse responses. Our studies find that writers' judgments align with the Pragmatically Appropriate Diversity of conversations. Our work suggests that expectations for diversity metric scores should vary depending on the speech act. △ Less

Submitted 5 April, 2023; originally announced April 2023.

arXiv:2303.14334 [pdf, other]

The Semantic Reader Project: Augmenting Scholarly Documents through AI-Powered Interactive Reading Interfaces

Authors: Kyle Lo, Joseph Chee Chang, Andrew Head, Jonathan Bragg, Amy X. Zhang, Cassidy Trier, Chloe Anastasiades, Tal August, Russell Authur, Danielle Bragg, Erin Bransom, Isabel Cachola, Stefan Candra, Yoganand Chandrasekhar, Yen-Sung Chen, Evie Yu-Yen Cheng, Yvonne Chou, Doug Downey, Rob Evans, Raymond Fok, Fangzhou Hu, Regan Huff, Dongyeop Kang, Tae Soo Kim, Rodney Kinney , et al. (30 additional authors not shown)

Abstract: Scholarly publications are key to the transfer of knowledge from scholars to others. However, research papers are information-dense, and as the volume of the scientific literature grows, the need for new technology to support the reading process grows. In contrast to the process of finding papers, which has been transformed by Internet technology, the experience of reading research papers has chan… ▽ More Scholarly publications are key to the transfer of knowledge from scholars to others. However, research papers are information-dense, and as the volume of the scientific literature grows, the need for new technology to support the reading process grows. In contrast to the process of finding papers, which has been transformed by Internet technology, the experience of reading research papers has changed little in decades. The PDF format for sharing research papers is widely used due to its portability, but it has significant downsides including: static content, poor accessibility for low-vision readers, and difficulty reading on mobile devices. This paper explores the question "Can recent advances in AI and HCI power intelligent, interactive, and accessible reading interfaces -- even for legacy PDFs?" We describe the Semantic Reader Project, a collaborative effort across multiple institutions to explore automatic creation of dynamic reading interfaces for research papers. Through this project, we've developed ten research prototype interfaces and conducted usability studies with more than 300 participants and real-world users showing improved reading experiences for scholars. We've also released a production reading interface for research papers that will incorporate the best features as they mature. We structure this paper around challenges scholars and the public face when reading research papers -- Discovery, Efficiency, Comprehension, Synthesis, and Accessibility -- and present an overview of our progress and remaining open challenges. △ Less

Submitted 23 April, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

arXiv:2205.04561 [pdf, other]

doi 10.1145/3581641.3584034

Scim: Intelligent Skimming Support for Scientific Papers

Authors: Raymond Fok, Hita Kambhamettu, Luca Soldaini, Jonathan Bragg, Kyle Lo, Andrew Head, Marti A. Hearst, Daniel S. Weld

Abstract: Researchers need to keep up with immense literatures, though it is time-consuming and difficult to do so. In this paper, we investigate the role that intelligent interfaces can play in helping researchers skim papers, that is, rapidly reviewing a paper to attain a cursory understanding of its contents. After conducting formative interviews and a design probe, we suggest that skimming aids should a… ▽ More Researchers need to keep up with immense literatures, though it is time-consuming and difficult to do so. In this paper, we investigate the role that intelligent interfaces can play in helping researchers skim papers, that is, rapidly reviewing a paper to attain a cursory understanding of its contents. After conducting formative interviews and a design probe, we suggest that skimming aids should aim to thread the needle of highlighting content that is simultaneously diverse, evenly-distributed, and important. We introduce Scim, a novel intelligent skimming interface that reifies this aim, designed to support the skimming process by highlighting salient paper contents to direct a skimmer's focus. Key to the design is that the highlights are faceted by content type, evenly-distributed across a paper, with a density configurable by readers at both the global and local level. We evaluate Scim with an in-lab usability study and deployment study, revealing how skimming aids can support readers throughout the skimming experience and yielding design considerations and tensions for the design of future intelligent skimming tools. △ Less

Submitted 25 September, 2023; v1 submitted 9 May, 2022; originally announced May 2022.

Comments: Updated to reflect version published in proceedings of IUI 2023

arXiv:2205.01497 [pdf, other]

Semantic Diversity in Dialogue with Natural Language Inference

Authors: Katherine Stasaski, Marti A. Hearst

Abstract: Generating diverse, interesting responses to chitchat conversations is a problem for neural conversational agents. This paper makes two substantial contributions to improving diversity in dialogue generation. First, we propose a novel metric which uses Natural Language Inference (NLI) to measure the semantic diversity of a set of model responses for a conversation. We evaluate this metric using an… ▽ More Generating diverse, interesting responses to chitchat conversations is a problem for neural conversational agents. This paper makes two substantial contributions to improving diversity in dialogue generation. First, we propose a novel metric which uses Natural Language Inference (NLI) to measure the semantic diversity of a set of model responses for a conversation. We evaluate this metric using an established framework (Tevet and Berant, 2021) and find strong evidence indicating NLI Diversity is correlated with semantic diversity. Specifically, we show that the contradiction relation is more useful than the neutral relation for measuring this diversity and that incorporating the NLI model's confidence achieves state-of-the-art results. Second, we demonstrate how to iteratively improve the semantic diversity of a sampled set of responses via a new generation procedure called Diversity Threshold Generation, which results in an average 137% increase in NLI Diversity compared to standard generation procedures. △ Less

Submitted 3 May, 2022; originally announced May 2022.

Comments: To appear at NAACL 2022

arXiv:2203.00130 [pdf, other]

Paper Plain: Making Medical Research Papers Approachable to Healthcare Consumers with Natural Language Processing

Authors: Tal August, Lucy Lu Wang, Jonathan Bragg, Marti A. Hearst, Andrew Head, Kyle Lo

Abstract: When seeking information not covered in patient-friendly documents, like medical pamphlets, healthcare consumers may turn to the research literature. Reading medical papers, however, can be a challenging experience. To improve access to medical papers, we introduce a novel interactive interface-Paper Plain-with four features powered by natural language processing: definitions of unfamiliar terms,… ▽ More When seeking information not covered in patient-friendly documents, like medical pamphlets, healthcare consumers may turn to the research literature. Reading medical papers, however, can be a challenging experience. To improve access to medical papers, we introduce a novel interactive interface-Paper Plain-with four features powered by natural language processing: definitions of unfamiliar terms, in-situ plain language section summaries, a collection of key questions that guide readers to answering passages, and plain language summaries of the answering passages. We evaluate Paper Plain, finding that participants who use Paper Plain have an easier time reading and understanding research papers without a loss in paper comprehension compared to those who use a typical PDF reader. Altogether, the study results suggest that guiding readers to relevant passages and providing plain language summaries, or "gists," alongside the original paper content can make reading medical papers easier and give readers more confidence to approach these papers. △ Less

Submitted 28 February, 2022; originally announced March 2022.

Comments: 39 pages, 10 figures

ACM Class: H.5.2

arXiv:2202.07146 [pdf, other]

doi 10.1145/3490099.3511147

NewsPod: Automatic and Interactive News Podcasts

Authors: Philippe Laban, Elicia Ye, Srujay Korlakunta, John Canny, Marti A. Hearst

Abstract: News podcasts are a popular medium to stay informed and dive deep into news topics. Today, most podcasts are handcrafted by professionals. In this work, we advance the state-of-the-art in automatically generated podcasts, making use of recent advances in natural language processing and text-to-speech technology. We present NewsPod, an automatically generated, interactive news podcast. The podcast… ▽ More News podcasts are a popular medium to stay informed and dive deep into news topics. Today, most podcasts are handcrafted by professionals. In this work, we advance the state-of-the-art in automatically generated podcasts, making use of recent advances in natural language processing and text-to-speech technology. We present NewsPod, an automatically generated, interactive news podcast. The podcast is divided into segments, each centered on a news event, with each segment structured as a Question and Answer conversation, whose goal is to engage the listener. A key aspect of the design is the use of distinct voices for each role (questioner, responder), to better simulate a conversation. Another novel aspect of NewsPod allows listeners to interact with the podcast by asking their own questions and receiving automatically generated answers. We validate the soundness of this system design through two usability studies, focused on evaluating the narrative style and interactions with the podcast, respectively. We find that NewsPod is preferred over a baseline by participants, with 80% claiming they would use the system in the future. △ Less

Submitted 14 February, 2022; originally announced February 2022.

Comments: Accepted at IUI 2022, 16 pages, 10 figures

arXiv:2111.09525 [pdf, other]

SummaC: Re-Visiting NLI-based Models for Inconsistency Detection in Summarization

Authors: Philippe Laban, Tobias Schnabel, Paul N. Bennett, Marti A. Hearst

Abstract: In the summarization domain, a key requirement for summaries is to be factually consistent with the input document. Previous work has found that natural language inference (NLI) models do not perform competitively when applied to inconsistency detection. In this work, we revisit the use of NLI for inconsistency detection, finding that past work suffered from a mismatch in input granularity between… ▽ More In the summarization domain, a key requirement for summaries is to be factually consistent with the input document. Previous work has found that natural language inference (NLI) models do not perform competitively when applied to inconsistency detection. In this work, we revisit the use of NLI for inconsistency detection, finding that past work suffered from a mismatch in input granularity between NLI datasets (sentence-level), and inconsistency detection (document level). We provide a highly effective and light-weight method called SummaCConv that enables NLI models to be successfully used for this task by segmenting documents into sentence units and aggregating scores between pairs of sentences. On our newly introduced benchmark called SummaC (Summary Consistency) consisting of six large inconsistency detection datasets, SummaCConv obtains state-of-the-art results with a balanced accuracy of 74.4%, a 5% point improvement compared to prior work. We make the models and datasets available: https://github.com/tingofurro/summac △ Less

Submitted 18 November, 2021; originally announced November 2021.

Comments: TACL pre-MIT Press publication version; 11 pages, 2 figures, 5 tables

arXiv:2107.03448 [pdf, other]

Can Transformer Models Measure Coherence In Text? Re-Thinking the Shuffle Test

Authors: Philippe Laban, Luke Dai, Lucas Bandarkar, Marti A. Hearst

Abstract: The Shuffle Test is the most common task to evaluate whether NLP models can measure coherence in text. Most recent work uses direct supervision on the task; we show that by simply finetuning a RoBERTa model, we can achieve a near perfect accuracy of 97.8%, a state-of-the-art. We argue that this outstanding performance is unlikely to lead to a good model of text coherence, and suggest that the Shuf… ▽ More The Shuffle Test is the most common task to evaluate whether NLP models can measure coherence in text. Most recent work uses direct supervision on the task; we show that by simply finetuning a RoBERTa model, we can achieve a near perfect accuracy of 97.8%, a state-of-the-art. We argue that this outstanding performance is unlikely to lead to a good model of text coherence, and suggest that the Shuffle Test should be approached in a Zero-Shot setting: models should be evaluated without being trained on the task itself. We evaluate common models in this setting, such as Generative and Bi-directional Transformers, and find that larger architectures achieve high-performance out-of-the-box. Finally, we suggest the k-Block Shuffle Test, a modification of the original by increasing the size of blocks shuffled. Even though human reader performance remains high (around 95% accuracy), model performance drops from 94% to 78% as block size increases, creating a conceptually simple challenge to benchmark NLP models. Code available: https://github.com/tingofurro/shuffle_test/ △ Less

Submitted 7 July, 2021; originally announced July 2021.

Comments: Accepted at ACL-IJCNLP 2021 (short paper), 7 pages, 4 figures

Journal ref: Association for Computational Linguistics (2021)

arXiv:2107.03444 [pdf, other]

Keep it Simple: Unsupervised Simplification of Multi-Paragraph Text

Authors: Philippe Laban, Tobias Schnabel, Paul Bennett, Marti A. Hearst

Abstract: This work presents Keep it Simple (KiS), a new approach to unsupervised text simplification which learns to balance a reward across three properties: fluency, salience and simplicity. We train the model with a novel algorithm to optimize the reward (k-SCST), in which the model proposes several candidate simplifications, computes each candidate's reward, and encourages candidates that outperform th… ▽ More This work presents Keep it Simple (KiS), a new approach to unsupervised text simplification which learns to balance a reward across three properties: fluency, salience and simplicity. We train the model with a novel algorithm to optimize the reward (k-SCST), in which the model proposes several candidate simplifications, computes each candidate's reward, and encourages candidates that outperform the mean reward. Finally, we propose a realistic text comprehension task as an evaluation method for text simplification. When tested on the English news domain, the KiS model outperforms strong supervised baselines by more than 4 SARI points, and can help people complete a comprehension task an average of 18% faster while retaining accuracy, when compared to the original text. Code available: https://github.com/tingofurro/keep_it_simple △ Less

Submitted 7 July, 2021; originally announced July 2021.

Comments: Accepted at ACL-IJCNLP 2021, 14 pages, 7 figures

Journal ref: Association for Computational Linguistics (2021)

arXiv:2105.05392 [pdf, other]

doi 10.18653/v1/2020.acl-demos.43

What's The Latest? A Question-driven News Chatbot

Authors: Philippe Laban, John Canny, Marti A. Hearst

Abstract: This work describes an automatic news chatbot that draws content from a diverse set of news articles and creates conversations with a user about the news. Key components of the system include the automatic organization of news articles into topical chatrooms, integration of automatically generated questions into the conversation, and a novel method for choosing which questions to present which avo… ▽ More This work describes an automatic news chatbot that draws content from a diverse set of news articles and creates conversations with a user about the news. Key components of the system include the automatic organization of news articles into topical chatrooms, integration of automatically generated questions into the conversation, and a novel method for choosing which questions to present which avoids repetitive suggestions. We describe the algorithmic framework and present the results of a usability study that shows that news readers using the system successfully engage in multi-turn conversations about specific news stories. △ Less

Submitted 11 May, 2021; originally announced May 2021.

Comments: ACL2020 Demo Track, 8 pages, 5 figures

Journal ref: ACL Demos (2020) 380-387

arXiv:2105.05391 [pdf, other]

News Headline Grouping as a Challenging NLU Task

Authors: Philippe Laban, Lucas Bandarkar, Marti A. Hearst

Abstract: Recent progress in Natural Language Understanding (NLU) has seen the latest models outperform human performance on many standard tasks. These impressive results have led the community to introspect on dataset limitations, and iterate on more nuanced challenges. In this paper, we introduce the task of HeadLine Grouping (HLG) and a corresponding dataset (HLGD) consisting of 20,056 pairs of news head… ▽ More Recent progress in Natural Language Understanding (NLU) has seen the latest models outperform human performance on many standard tasks. These impressive results have led the community to introspect on dataset limitations, and iterate on more nuanced challenges. In this paper, we introduce the task of HeadLine Grouping (HLG) and a corresponding dataset (HLGD) consisting of 20,056 pairs of news headlines, each labeled with a binary judgement as to whether the pair belongs within the same group. On HLGD, human annotators achieve high performance of around 0.9 F-1, while current state-of-the art Transformer models only reach 0.75 F-1, opening the path for further improvements. We further propose a novel unsupervised Headline Generator Swap model for the task of HeadLine Grouping that achieves within 3 F-1 of the best supervised model. Finally, we analyze high-performing models with consistency tests, and find that models are not consistent in their predictions, revealing modeling limits of current architectures. △ Less

Submitted 11 May, 2021; originally announced May 2021.

Comments: NAACL2021, 13 pages, 8 figures

arXiv:2105.05361 [pdf, other]

doi 10.18653/v1/2020.acl-main.460

The Summary Loop: Learning to Write Abstractive Summaries Without Examples

Authors: Philippe Laban, Andrew Hsi, John Canny, Marti A. Hearst

Abstract: This work presents a new approach to unsupervised abstractive summarization based on maximizing a combination of coverage and fluency for a given length constraint. It introduces a novel method that encourages the inclusion of key terms from the original document into the summary: key terms are masked out of the original document and must be filled in by a coverage model using the current generate… ▽ More This work presents a new approach to unsupervised abstractive summarization based on maximizing a combination of coverage and fluency for a given length constraint. It introduces a novel method that encourages the inclusion of key terms from the original document into the summary: key terms are masked out of the original document and must be filled in by a coverage model using the current generated summary. A novel unsupervised training procedure leverages this coverage model along with a fluency model to generate and score summaries. When tested on popular news summarization datasets, the method outperforms previous unsupervised methods by more than 2 R-1 points, and approaches results of competitive supervised methods. Our model attains higher levels of abstraction with copied passages roughly two times shorter than prior work, and learns to compress and merge sentences without supervision. △ Less

Submitted 11 May, 2021; originally announced May 2021.

Comments: ACL2020, 16 pages, 9 figures

Journal ref: Association for Computational Linguistics (2020) 5135-5150

arXiv:2105.00121 [pdf, other]

Lux: Always-on Visualization Recommendations for Exploratory Dataframe Workflows

Authors: Doris Jung-Lin Lee, Dixin Tang, Kunal Agarwal, Thyne Boonmark, Caitlyn Chen, Jake Kang, Ujjaini Mukhopadhyay, Jerry Song, Micah Yong, Marti A. Hearst, Aditya G. Parameswaran

Abstract: Exploratory data science largely happens in computational notebooks with dataframe APIs, such as pandas, that support flexible means to transform, clean, and analyze data. Yet, visually exploring data in dataframes remains tedious, requiring substantial programming effort for visualization and mental effort to determine what analysis to perform next. We propose Lux, an always-on framework for acce… ▽ More Exploratory data science largely happens in computational notebooks with dataframe APIs, such as pandas, that support flexible means to transform, clean, and analyze data. Yet, visually exploring data in dataframes remains tedious, requiring substantial programming effort for visualization and mental effort to determine what analysis to perform next. We propose Lux, an always-on framework for accelerating visual insight discovery in dataframe workflows. When users print a dataframe in their notebooks, Lux recommends visualizations to provide a quick overview of the patterns and trends and suggests promising analysis directions. Lux features a high level language for generating visualizations on demand to encourage rapid visual experimentation with data. We demonstrate that through the use of a careful design and three system optimizations, Lux adds no more than two seconds of overhead on top of pandas for over 98% of datasets in the UCI repository. We evaluate Lux in terms of usability via a controlled first-use study and interviews with early adopters, finding that Lux helps fulfill the needs of data scientists for visualization support within their dataframe workflows. Lux has already been embraced by data science practitioners, with over 3.1k stars on Github. △ Less

Submitted 22 December, 2021; v1 submitted 30 April, 2021; originally announced May 2021.

arXiv:2010.05129 [pdf, other]

Document-Level Definition Detection in Scholarly Documents: Existing Models, Error Analyses, and Future Directions

Authors: Dongyeop Kang, Andrew Head, Risham Sidhu, Kyle Lo, Daniel S. Weld, Marti A. Hearst

Abstract: The task of definition detection is important for scholarly papers, because papers often make use of technical terminology that may be unfamiliar to readers. Despite prior work on definition detection, current approaches are far from being accurate enough to use in real-world applications. In this paper, we first perform in-depth error analysis of the current best performing definition detection s… ▽ More The task of definition detection is important for scholarly papers, because papers often make use of technical terminology that may be unfamiliar to readers. Despite prior work on definition detection, current approaches are far from being accurate enough to use in real-world applications. In this paper, we first perform in-depth error analysis of the current best performing definition detection system and discover major causes of errors. Based on this analysis, we develop a new definition detection system, HEDDEx, that utilizes syntactic features, transformer encoders, and heuristic filters, and evaluate it on a standard sentence-level benchmark. Because current benchmarks evaluate randomly sampled sentences, we propose an alternative evaluation that assesses every sentence within a document. This allows for evaluating recall in addition to precision. HEDDEx outperforms the leading system on both the sentence-level and the document-level tasks, by 12.7 F1 points and 14.4 F1 points, respectively. We note that performance on the high-recall document-level task is much lower than in the standard evaluation approach, due to the necessity of incorporation of document structure as features. We discuss remaining challenges in document-level definition detection, ideas for improvements, and potential issues for the development of reading aid applications. △ Less

Submitted 10 October, 2020; originally announced October 2020.

Comments: Workshop on Scholarly Document Processing (SDP), EMNLP 2020

arXiv:2009.14237 [pdf, other]

Augmenting Scientific Papers with Just-in-Time, Position-Sensitive Definitions of Terms and Symbols

Authors: Andrew Head, Kyle Lo, Dongyeop Kang, Raymond Fok, Sam Skjonsberg, Daniel S. Weld, Marti A. Hearst

Abstract: Despite the central importance of research papers to scientific progress, they can be difficult to read. Comprehension is often stymied when the information needed to understand a passage resides somewhere else: in another section, or in another paper. In this work, we envision how interfaces can bring definitions of technical terms and symbols to readers when and where they need them most. We int… ▽ More Despite the central importance of research papers to scientific progress, they can be difficult to read. Comprehension is often stymied when the information needed to understand a passage resides somewhere else: in another section, or in another paper. In this work, we envision how interfaces can bring definitions of technical terms and symbols to readers when and where they need them most. We introduce ScholarPhi, an augmented reading interface with four novel features: (1) tooltips that surface position-sensitive definitions from elsewhere in a paper, (2) a filter over the paper that "declutters" it to reveal how the term or symbol is used across the paper, (3) automatic equation diagrams that expose multiple definitions in parallel, and (4) an automatically generated glossary of important terms and symbols. A usability study showed that the tool helps researchers of all experience levels read papers. Furthermore, researchers were eager to have ScholarPhi's definitions available to support their everyday reading. △ Less

Submitted 27 April, 2021; v1 submitted 29 September, 2020; originally announced September 2020.

Comments: 18 pages, 17 figures, 2 tables. To appear at the 2021 ACM CHI Conference on Human Factors in Computing Systems. For associated video, see https://youtu.be/yYcQf-Yq8B0. v2 changes: expanded discussion of design process and implementation; improved figure design. v3 changes: fixed typo in cell of Table 2; updated HEDDEx and Schwarz-Hearst accuracy in Section 5.3

ACM Class: H.5.2

arXiv:2005.12668 [pdf, other]

SciSight: Combining faceted navigation and research group detection for COVID-19 exploratory scientific search

Authors: Tom Hope, Jason Portenoy, Kishore Vasan, Jonathan Borchardt, Eric Horvitz, Daniel S. Weld, Marti A. Hearst, Jevin West

Abstract: The COVID-19 pandemic has sparked unprecedented mobilization of scientists, generating a deluge of papers that makes it hard for researchers to keep track and explore new directions. Search engines are designed for targeted queries, not for discovery of connections across a corpus. In this paper, we present SciSight, a system for exploratory search of COVID-19 research integrating two key capabili… ▽ More The COVID-19 pandemic has sparked unprecedented mobilization of scientists, generating a deluge of papers that makes it hard for researchers to keep track and explore new directions. Search engines are designed for targeted queries, not for discovery of connections across a corpus. In this paper, we present SciSight, a system for exploratory search of COVID-19 research integrating two key capabilities: first, exploring associations between biomedical facets automatically extracted from papers (e.g., genes, drugs, diseases, patient outcomes); second, combining textual and network information to search and visualize groups of researchers and their ties. SciSight has so far served over $15K$ users with over $42K$ page views and $13\%$ returns. △ Less

Submitted 20 September, 2020; v1 submitted 20 May, 2020; originally announced May 2020.

Comments: Accepted to EMNLP 2020

arXiv:1410.6751 [pdf, other]

doi 10.1007/s10032-016-0260-8

Detecting Figures and Part Labels in Patents: Competition-Based Development of Image Processing Algorithms

Authors: Christoph Riedl, Richard Zanibbi, Marti A. Hearst, Siyu Zhu, Michael Menietti, Jason Crusan, Ivan Metelsky, Karim R. Lakhani

Abstract: We report the findings of a month-long online competition in which participants developed algorithms for augmenting the digital version of patent documents published by the United States Patent and Trademark Office (USPTO). The goal was to detect figures and part labels in U.S. patent drawing pages. The challenge drew 232 teams of two, of which 70 teams (30%) submitted solutions. Collectively, tea… ▽ More We report the findings of a month-long online competition in which participants developed algorithms for augmenting the digital version of patent documents published by the United States Patent and Trademark Office (USPTO). The goal was to detect figures and part labels in U.S. patent drawing pages. The challenge drew 232 teams of two, of which 70 teams (30%) submitted solutions. Collectively, teams submitted 1,797 solutions that were compiled on the competition servers. Participants reported spending an average of 63 hours developing their solutions, resulting in a total of 5,591 hours of development time. A manually labeled dataset of 306 patents was used for training, online system tests, and evaluation. The design and performance of the top-5 systems are presented, along with a system developed after the competition which illustrates that winning teams produced near state-of-the-art results under strict time and computation constraints. For the 1st place system, the harmonic mean of recall and precision (f-measure) was 88.57% for figure region detection, 78.81% for figure regions with correctly recognized figure titles, and 70.98% for part label detection and character recognition. Data and software from the competition are available through the online UCI Machine Learning repository to inspire follow-on work by the image processing community. △ Less

Submitted 11 November, 2014; v1 submitted 24 October, 2014; originally announced October 2014.

arXiv:cmp-lg/9411022 [pdf, ps, other]

Adaptive Sentence Boundary Disambiguation

Authors: David D. Palmer, Marti A. Hearst

Abstract: Labeling of sentence boundaries is a necessary prerequisite for many natural language processing tasks, including part-of-speech tagging and sentence alignment. End-of-sentence punctuation marks are ambiguous; to disambiguate them most systems use brittle, special-purpose regular expression grammars and exception rules. As an alternative, we have developed an efficient, trainable algorithm that… ▽ More Labeling of sentence boundaries is a necessary prerequisite for many natural language processing tasks, including part-of-speech tagging and sentence alignment. End-of-sentence punctuation marks are ambiguous; to disambiguate them most systems use brittle, special-purpose regular expression grammars and exception rules. As an alternative, we have developed an efficient, trainable algorithm that uses a lexicon with part-of-speech probabilities and a feed-forward neural network. After training for less than one minute, the method correctly labels over 98.5\% of sentence boundaries in a corpus of over 27,000 sentence-boundary marks. We show the method to be efficient and easily adaptable to different text genres, including single-case texts. △ Less

Submitted 21 November, 1994; v1 submitted 16 November, 1994; originally announced November 1994.

Comments: This is a Latex version of the previously submitted ps file (formatted as a uuencoded gz-compressed .tar file created by csh script). The software from the work described in this paper is available by contacting dpalmer@cs.berkeley.edu

Journal ref: Proceedings of ANLP 94

arXiv:cmp-lg/9406037 [pdf, ps, other]

Multi-Paragraph Segmentation of Expository Text

Authors: Marti A. Hearst

Abstract: This paper describes TextTiling, an algorithm for partitioning expository texts into coherent multi-paragraph discourse units which reflect the subtopic structure of the texts. The algorithm uses domain-independent lexical frequency and distribution information to recognize the interactions of multiple simultaneous themes. Two fully-implemented versions of the algorithm are described and shown t… ▽ More This paper describes TextTiling, an algorithm for partitioning expository texts into coherent multi-paragraph discourse units which reflect the subtopic structure of the texts. The algorithm uses domain-independent lexical frequency and distribution information to recognize the interactions of multiple simultaneous themes. Two fully-implemented versions of the algorithm are described and shown to produce segmentation that corresponds well to human judgments of the major subtopic boundaries of thirteen lengthy texts. △ Less

Submitted 23 June, 1994; originally announced June 1994.

Comments: To Appear in ACL '94 Proceedings; 8 pages POSTSCRIPT format

Showing 1–25 of 25 results for author: Hearst, M A