25-Fold Resolution Enhancement of X-ray Microscopy Using Multipixel Ghost Imaging

Authors: O. Sefi, A. Ben Yehuda, Y. Klein, S. Bloch, H. Schwartz, E. Cohen, S. Shwartz

Abstract: Hard x-ray imaging is indispensable across diverse fields owing to its high penetrability. However, the resolution of traditional x-ray imaging modalities, such as computed tomography (CT) systems, is constrained by factors including beam properties, the absence of optical components, and detection resolution. As a result, typical resolution in commercial imaging systems is limited to a few hundred microns. This study advances high-photon-energy imaging by extending the concept of computational ghost imaging to multipixel ghost imaging with x-rays. We demonstrate a remarkable enhancement in resolution from 500 microns to approximately 20 microns for an image spanning 0.9 by 1 cm^2, comprised of 400,000 pixels and involving only 1000 realizations. Furthermore, we present a high-resolution CT reconstruction using our method, revealing enhanced visibility and resolution. Our achievement is facilitated by an innovative x-ray lithography technique and the computed tiling of images captured by each detector pixel. Importantly, this method can be scaled up for larger images without sacrificing the short measurement time, thereby opening intriguing possibilities for noninvasive high-resolution imaging of small features that are invisible with the present modalities.

Submitted 7 February, 2024; originally announced February 2024.

Comments: 9 pages, 4 figures

arXiv:2402.01980 [pdf, other]

SOCIALITE-LLAMA: An Instruction-Tuned Model for Social Scientific Tasks

Authors: Gourab Dey, Adithya V Ganesan, Yash Kumar Lal, Manal Shah, Shreyashee Sinha, Matthew Matero, Salvatore Giorgi, Vivek Kulkarni, H. Andrew Schwartz

Abstract: Social science NLP tasks, such as emotion or humor detection, are required to capture the semantics along with the implicit pragmatics from text, often with limited amounts of training data. Instruction tuning has been shown to improve the many capabilities of large language models (LLMs) such as commonsense reasoning, reading comprehension, and computer programming. However, little is known about the effectiveness of instruction tuning on the social domain where implicit pragmatic cues are often needed to be captured. We explore the use of instruction tuning for social science NLP tasks and introduce Socialite-Llama -- an open-source, instruction-tuned Llama. On a suite of 20 social science tasks, Socialite-Llama improves upon the performance of Llama as well as matches or improves upon the performance of a state-of-the-art, multi-task finetuned model on a majority of them. Further, Socialite-Llama also leads to improvement on 5 out of 6 related social tasks as compared to Llama, suggesting instruction tuning can lead to generalized social understanding. All resources including our code, model and dataset can be found through bit.ly/socialitellama.

Submitted 14 March, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

Comments: Short paper accepted to EACL 2024. 4 pgs, 2 tables

arXiv:2401.12492 [pdf, other]

Comparing Pre-trained Human Language Models: Is it Better with Human Context as Groups, Individual Traits, or Both?

Authors: Nikita Soni, Niranjan Balasubramanian, H. Andrew Schwartz, Dirk Hovy

Abstract: Pre-trained language models consider the context of neighboring words and documents but lack any author context of the human generating the text. However, language depends on the author's states, traits, social, situational, and environmental attributes, collectively referred to as human context (Soni et al., 2024). Human-centered natural language processing requires incorporating human context into language models. Currently, two methods exist: pre-training with 1) group-wise attributes (e.g., over-45-year-olds) or 2) individual traits. Group attributes are simple but coarse -- not all 45-year-olds write the same way -- while individual traits allow for more personalized representations, but require more complex modeling and data. It is unclear which approach benefits what tasks. We compare pre-training models with human context via 1) group attributes, 2) individual users, and 3) a combined approach on five user- and document-level tasks. Our results show that there is no best approach, but that human-centered language modeling holds avenues for different methods.

Submitted 18 July, 2024; v1 submitted 23 January, 2024; originally announced January 2024.

arXiv:2312.07751 [pdf, other]

Large Human Language Models: A Need and the Challenges

Authors: Nikita Soni, H. Andrew Schwartz, João Sedoc, Niranjan Balasubramanian

Abstract: As research in human-centered NLP advances, there is a growing recognition of the importance of incorporating human and social factors into NLP models. At the same time, our NLP systems have become heavily reliant on LLMs, most of which do not model authors. To build NLP systems that can truly understand human language, we must better integrate human contexts into LLMs. This brings to the fore a range of design considerations and challenges in terms of what human aspects to capture, how to represent them, and what modeling strategies to pursue. To address these, we advocate for three positions toward creating large human language models (LHLMs) using concepts from psychological and behavioral sciences: First, LM training should include the human context. Second, LHLMs should recognize that people are more than their group(s). Third, LHLMs should be able to account for the dynamic and temporally-dependent nature of the human context. We refer to relevant advances and present open challenges that need to be addressed and their possible solutions in realizing these goals.

Submitted 9 May, 2024; v1 submitted 8 November, 2023; originally announced December 2023.

arXiv:2311.06467 [pdf, other]

ALBA: Adaptive Language-based Assessments for Mental Health

Authors: Vasudha Varadarajan, Sverker Sikström, Oscar N. E. Kjell, H. Andrew Schwartz

Abstract: Mental health issues differ widely among individuals, with varied signs and symptoms. Recently, language-based assessments have shown promise in capturing this diversity, but they require a substantial sample of words per person for accuracy. This work introduces the task of Adaptive Language-Based Assessment (ALBA), which involves adaptively ordering questions while also scoring an individual's latent psychological trait using limited language responses to previous questions. To this end, we develop adaptive testing methods under two psychometric measurement theories: Classical Test Theory and Item Response Theory. We empirically evaluate ordering and scoring strategies, organizing into two new methods: a semi-supervised item response theory-based method (ALIRT) and a supervised Actor-Critic model. While we found both methods to improve over non-adaptive baselines, we found ALIRT to be the most accurate and scalable, achieving the highest accuracy with fewer questions (e.g., Pearson r ~ 0.93 after only 3 questions as compared to typically needing at least 7 questions). In general, adaptive language-based assessments of depression and anxiety were able to utilize a smaller sample of language without compromising validity or large computational costs.

Submitted 16 May, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

arXiv:2307.06388 [pdf, other]

Doubles of Gluck twists: a five dimensional approach

Authors: David Gabai, Patrick Naylor, Hannah Schwartz

Abstract: Using a 5-dimensional perspective, we balance algebraic and geometric handle cancellation to show that doubles of Gluck twists of certain 2-spheres with two minima are standard. This includes all 2-spheres which are unions of ribbon discs, one of which has undisking number one. As an application, we produce new examples of Schoenflies balls not known to be standard.

Submitted 12 July, 2023; originally announced July 2023.

Comments: 24 pages, 21 figures. Comments welcome!

arXiv:2306.01183 [pdf, other]

Systematic Evaluation of GPT-3 for Zero-Shot Personality Estimation

Authors: Adithya V Ganesan, Yash Kumar Lal, August Håkan Nilsson, H. Andrew Schwartz

Abstract: Very large language models (LLMs) perform extremely well on a spectrum of NLP tasks in a zero-shot setting. However, little is known about their performance on human-level NLP problems which rely on understanding psychological concepts, such as assessing personality traits. In this work, we investigate the zero-shot ability of GPT-3 to estimate the Big 5 personality traits from users' social media posts. Through a set of systematic experiments, we find that zero-shot GPT-3 performance is somewhat close to an existing pre-trained SotA for broad classification upon injecting knowledge about the trait in the prompts. However, when prompted to provide fine-grained classification, its performance drops to close to a simple most frequent class (MFC) baseline. We further analyze where GPT-3 performs better, as well as worse, than a pretrained lexical model, illustrating systematic errors that suggest ways to improve LLMs on human-level NLP tasks.

Submitted 1 June, 2023; originally announced June 2023.

Comments: Short Paper (5 pages), Accepted to (WASSA) 13th Workshop on Computational Approaches to Subjectivity, Sentiment & Social Media Analysis at ACL 2023

MSC Class: 68T50 ACM Class: J.4; I.2; I.7

arXiv:2305.14757 [pdf, other]

Psychological Metrics for Dialog System Evaluation

Authors: Salvatore Giorgi, Shreya Havaldar, Farhan Ahmed, Zuhaib Akhtar, Shalaka Vaidya, Gary Pan, Lyle H. Ungar, H. Andrew Schwartz, Joao Sedoc

Abstract: We present metrics for evaluating dialog systems through a psychologically-grounded "human" lens in which conversational agents express a diversity of both states (e.g., emotion) and traits (e.g., personality), just as people do. We present five interpretable metrics from established psychology that are fundamental to human communication and relationships: emotional entropy, linguistic style and emotion matching, agreeableness, and empathy. These metrics can be applied (1) across dialogs and (2) on turns within dialogs. The psychological metrics are compared against seven state-of-the-art traditional metrics (e.g., BARTScore and BLEURT) on seven standard dialog system data sets. We also introduce a novel data set, the Three Bot Dialog Evaluation Corpus, which consists of annotated conversations from ChatGPT, GPT-3, and BlenderBot. We demonstrate that our proposed metrics offer novel information; they are uncorrelated with traditional metrics, can be used to meaningfully compare dialog systems, and lead to increased accuracy (beyond existing traditional metrics) in predicting crowd-sourced dialog judgements. The interpretability and unique signal of our psychological metrics make them a valuable tool for evaluating and improving dialog systems.

Submitted 15 September, 2023; v1 submitted 24 May, 2023; originally announced May 2023.

arXiv:2305.12468 [pdf]

High-resolution computed tomography with scattered x-ray radiation and a single pixel detector

Authors: A. Ben Yehuda, O. Sefi, Y. Klein, R. H Shukrun, H. Schwartz, E. Cohen, S. Shwartz

Abstract: X-ray imaging is a prevalent technique for non-invasively visualizing the interior of the human body and opaque instruments. In most commercial x-ray modalities, an image is formed by measuring the x-rays that pass through the object of interest. However, despite the potential of scattered radiation to provide additional information about the object, it is often disregarded due to its inherent tendency to cause blurring. Consequently, conventional imaging modalities do not measure or utilize these valuable data. In contrast, we propose and experimentally demonstrate a high-resolution technique for x-ray computed tomography (CT) that measures scattered radiation by exploiting computational ghost imaging (CGI). We show that our method can provide sub-200 μm resolution, exceeding the capabilities of most existing x-ray imaging modalities. Our research reveals a promising technique for incorporating scattered radiation data in CT scans to improve image resolution and minimize radiation exposure for patients. The findings of our study suggest that our technique could represent a significant advancement in the fields of medical and industrial imaging, with the potential to enhance the accuracy and safety of diagnostic imaging procedures.

Submitted 21 May, 2023; originally announced May 2023.

Comments: 17 pages, 10 figures

arXiv:2305.02459 [pdf, other]

Transfer and Active Learning for Dissonance Detection: Addressing the Rare-Class Challenge

Authors: Vasudha Varadarajan, Swanie Juhng, Syeda Mahwish, Xiaoran Liu, Jonah Luby, Christian Luhmann, H. Andrew Schwartz

Abstract: While transformer-based systems have enabled greater accuracies with fewer training examples, data acquisition obstacles still persist for rare-class tasks -- when the class label is very infrequent (e.g. < 5% of samples). Active learning has in general been proposed to alleviate such challenges, but choice of selection strategy, the criteria by which rare-class examples are chosen, has not been systematically evaluated. Further, transformers enable iterative transfer-learning approaches. We propose and investigate transfer- and active learning solutions to the rare class problem of dissonance detection through utilizing models trained on closely related tasks and the evaluation of acquisition strategies, including a proposed probability-of-rare-class (PRC) approach. We perform these experiments for a specific rare class problem: collecting language samples of cognitive dissonance from social media. We find that PRC is a simple and effective strategy to guide annotations and ultimately improve model accuracy while transfer-learning in a specific order can improve the cold-start performance of the learner but does not benefit iterations of active learning.

Submitted 4 May, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

arXiv:2304.04869 [pdf, other]

doi 10.1088/1538-3873/acd1b5

The James Webb Space Telescope Mission

Authors: Jonathan P. Gardner, John C. Mather, Randy Abbott, James S. Abell, Mark Abernathy, Faith E. Abney, John G. Abraham, Roberto Abraham, Yasin M. Abul-Huda, Scott Acton, Cynthia K. Adams, Evan Adams, David S. Adler, Maarten Adriaensen, Jonathan Albert Aguilar, Mansoor Ahmed, Nasif S. Ahmed, Tanjira Ahmed, Rüdeger Albat, Loïc Albert, Stacey Alberts, David Aldridge, Mary Marsha Allen, Shaune S. Allen, Martin Altenburg, et al. (983 additional authors not shown)

Abstract: Twenty-six years ago a small committee report, building on earlier studies, expounded a compelling and poetic vision for the future of astronomy, calling for an infrared-optimized space telescope with an aperture of at least $4m$. With the support of their governments in the US, Europe, and Canada, 20,000 people realized that vision as the $6.5m$ James Webb Space Telescope. A generation of astronomers will celebrate their accomplishments for the life of the mission, potentially as long as 20 years, and beyond. This report and the scientific discoveries that follow are extended thank-you notes to the 20,000 team members. The telescope is working perfectly, with much better image quality than expected. In this and accompanying papers, we give a brief history, describe the observatory, outline its objectives and current observing program, and discuss the inventions and people who made it possible. We cite detailed reports on the design and the measured performance on orbit.

Submitted 10 April, 2023; originally announced April 2023.

Comments: Accepted by PASP for the special issue on The James Webb Space Telescope Overview, 29 pages, 4 figures

arXiv:2302.12952 [pdf]

Robust language-based mental health assessments in time and space through social media

Authors: Siddharth Mangalik, Johannes C. Eichstaedt, Salvatore Giorgi, Jihu Mun, Farhan Ahmed, Gilvir Gill, Adithya V. Ganesan, Shashanka Subrahmanya, Nikita Soni, Sean A. P. Clouston, H. Andrew Schwartz

Abstract: Compared to physical health, population mental health measurement in the U.S. is very coarse-grained. Currently, in the largest population surveys, such as those carried out by the Centers for Disease Control or Gallup, mental health is only broadly captured through "mentally unhealthy days" or "sadness", and limited to relatively infrequent state or metropolitan estimates. Through the large scale analysis of social media data, robust estimation of population mental health is feasible at much higher resolutions, up to weekly estimates for counties. In the present work, we validate a pipeline that uses a sample of 1.2 billion Tweets from 2 million geo-located users to estimate mental health changes for the two leading mental health conditions, depression and anxiety. We find moderate to large associations between the language-based mental health assessments and survey scores from Gallup for multiple levels of granularity, down to the county-week (fixed effects $β= .25$ to $1.58$; $p<.001$). Language-based assessment allows for the cost-effective and scalable monitoring of population mental health at weekly time scales. Such spatially fine-grained time series are well suited to monitor effects of societal events and policies as well as enable quasi-experimental study designs in population health and other disciplines. Beyond mental health in the U.S., this method generalizes to a broad set of psychological outcomes and allows for community measurement in under-resourced settings where no traditional survey measures - but social media data - are available.

Submitted 24 February, 2023; originally announced February 2023.

Comments: 9 pages, 7 figures, pre-print

ACM Class: J.4; I.2.7

arXiv:2205.05128 [pdf, other]

doi 10.18653/v1/2022.findings-acl.52

Human Language Modeling

Authors: Nikita Soni, Matthew Matero, Niranjan Balasubramanian, H. Andrew Schwartz

Abstract: Natural language is generated by people, yet traditional language modeling views words or documents as if generated independently. Here, we propose human language modeling (HuLM), a hierarchical extension to the language modeling problem whereby a human-level exists to connect sequences of documents (e.g. social media messages) and capture the notion that human language is moderated by changing human states. We introduce HaRT, a large-scale transformer model for the HuLM task, pre-trained on approximately 100,000 social media users, and demonstrate its effectiveness in terms of both language modeling (perplexity) for social media and fine-tuning for 4 downstream tasks spanning document- and user-levels: stance detection, sentiment classification, age estimation, and personality assessment. Results on all tasks meet or surpass the current state-of-the-art.

Submitted 10 May, 2022; originally announced May 2022.

arXiv:2112.13795 [pdf, other]

Evaluating Contextual Embeddings and their Extraction Layers for Depression Assessment

Authors: Matthew Matero, Albert Hung, H. Andrew Schwartz

Abstract: Recent works have demonstrated the ability to assess aspects of mental health from personal discourse. At the same time, pre-trained contextual word embedding models have grown to dominate much of NLP but little is known empirically on how to best apply them for mental health assessment. Using degree of depression as a case study, we do an empirical analysis on which off-the-shelf language model, individual layers, and combinations of layers seem most promising when applied to human-level NLP tasks. Notably, we find RoBERTa most effective and, despite the standard in past work suggesting the second-to-last or concatenation of the last 4 layers, we find layer 19 (sixth-to-last) is at least as good as layer 23 when using 1 layer. Further, when using multiple layers, distributing them across the second half (i.e. Layers 12+), rather than last 4, of the 24 layers yielded the most accurate results.

Submitted 28 April, 2022; v1 submitted 27 December, 2021; originally announced December 2021.

arXiv:2109.13397 [pdf, other]

A 4-dimensional light bulb theorem for disks

Authors: Hannah Schwartz

Abstract: We give a 4-dimensional light bulb theorem for properly embedded disks, generalizing recent work of Gabai and Kosanovic-Teichner in certain contexts, and extending the 4-dimensional light bulb theorem for 2-spheres due to Gabai and Schneiderman-Teichner. In particular, we provide conditions under which homotopic disks properly embedded in a compact 4-manifold X with a common dual in the interior of X are smoothly isotopic rel boundary. We also provide a new geometric interpretation of the Dax invariant, to aid in its computation.

Submitted 14 June, 2024; v1 submitted 27 September, 2021; originally announced September 2021.

Comments: Updated version -- the hypotheses and conclusion of the main result have been made more general, Figure 14 and Remarks 4.2 and 4.3 added, and additional small revisions/corrections throughout. Comments encouraged!

arXiv:2109.08113 [pdf, other]

MeLT: Message-Level Transformer with Masked Document Representations as Pre-Training for Stance Detection

Authors: Matthew Matero, Nikita Soni, Niranjan Balasubramanian, H. Andrew Schwartz

Abstract: Much of natural language processing is focused on leveraging large capacity language models, typically trained over single messages with a task of predicting one or more tokens. However, modeling human language at higher-levels of context (i.e., sequences of messages) is under-explored. In stance detection and other social media tasks where the goal is to predict an attribute of a message, we have contextual data that is loosely semantically connected by authorship. Here, we introduce Message-Level Transformer (MeLT) -- a hierarchical message-encoder pre-trained over Twitter and applied to the task of stance prediction. We focus on stance prediction as a task benefiting from knowing the context of the message (i.e., the sequence of previous messages). The model is trained using a variant of masked-language modeling; where instead of predicting tokens, it seeks to generate an entire masked (aggregated) message vector via reconstruction loss. We find that applying this pre-trained masked message-level transformer to the downstream task of stance detection achieves F1 performance of 67%.

Submitted 1 November, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

arXiv:2106.01335 [pdf, other]

On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers

Authors: Tianchu Ji, Shraddhan Jain, Michael Ferdman, Peter Milder, H. Andrew Schwartz, Niranjan Balasubramanian

Abstract: How much information do NLP tasks really need from a transformer's attention mechanism at application-time (inference)? From recent work, we know that there is sparsity in transformers and that the floating-points within its computation can be discretized to fewer values with minimal loss to task accuracies. However, this requires retraining or even creating entirely new models, both of which can be expensive and carbon-emitting. Focused on optimizations that do not require training, we systematically study the full range of typical attention values necessary. This informs the design of an inference-time quantization technique using both pruning and log-scaled mapping which produces only a few (e.g. $2^3$) unique values. Over the tasks of question answering and sentiment analysis, we find nearly 80% of attention values can be pruned to zeros with minimal ($< 1.0\%$) relative loss in accuracy. We use this pruning technique in conjunction with quantizing the attention values to only a 3-bit format, without retraining, resulting in only a 0.8% accuracy reduction on question answering with fine-tuned RoBERTa.

Submitted 2 June, 2021; originally announced June 2021.

arXiv:2105.03484 [pdf, other]

doi 10.18653/v1/2021.naacl-main.357

Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality

Authors: Adithya V Ganesan, Matthew Matero, Aravind Reddy Ravula, Huy Vu, H. Andrew Schwartz

Abstract: In human-level NLP tasks, such as predicting mental health, personality, or demographics, the number of observations is often smaller than the standard 768+ hidden state sizes of each layer within modern transformer-based language models, limiting the ability to effectively leverage transformers. Here, we provide a systematic study on the role of dimension reduction methods (principal components analysis, factorization techniques, or multi-layer auto-encoders) as well as the dimensionality of embedding vectors and sample sizes as a function of predictive performance. We first find that fine-tuning large models with a limited amount of data poses a significant difficulty which can be overcome with a pre-trained dimension reduction regime. RoBERTa consistently achieves top performance in human-level tasks, with PCA giving benefit over other reduction methods in better handling users that write longer texts. Finally, we observe that a majority of the tasks achieve results comparable to the best performance with just $\frac{1}{12}$ of the embedding dimensions.

Submitted 7 May, 2021; originally announced May 2021.

Comments: 2021 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL-HLT)

arXiv:2105.01306 [pdf, other]

Discourse Relation Embeddings: Representing the Relations between Discourse Segments in Social Media

Authors: Youngseo Son, Vasudha Varadarajan, H Andrew Schwartz

Abstract: Discourse relations are typically modeled as a discrete class that characterizes the relation between segments of text (e.g. causal explanations, expansions). However, such predefined discrete classes limit the universe of potential relationships and their nuanced differences. Analogous to contextual word embeddings, we propose representing discourse relations as points in high dimensional continuous space. However, unlike words, discourse relations often have no surface form (relations are between two segments, often with no word or phrase in that gap) which presents a challenge for existing embedding techniques. We present a novel method for automatically creating discourse relation embeddings (DiscRE), addressing the embedding challenge through a weakly supervised, multitask approach to learn diverse and nuanced relations between discourse segments in social media. Results show DiscRE can: (1) obtain the best performance on a Twitter discourse relation classification task (macro F1=0.76), (2) improve the state of the art in social media causality prediction (from F1=.79 to .81), (3) perform beyond modern sentence and contextual word embeddings at traditional discourse relation classification, and (4) capture novel nuanced relations (e.g. relations semantically at the intersection of causal explanations and counterfactuals).

Submitted 28 February, 2023; v1 submitted 4 May, 2021; originally announced May 2021.

Comments: Published in EMNLP 2022 UM-IoS

arXiv:2102.07755 [pdf]

Chemical element mapping by x-ray ghost fluorescence

Authors: Y. Klein, O. Sefi, H. Schwartz, S. Shwartz

Abstract: Chemical element mapping is an imaging tool that provides essential information on composite materials and it is crucial for a broad range of fields ranging from fundamental science to numerous applications. Methods that exploit x-ray fluorescence are very advantageous and are widely used, but require focusing of the input beam and raster scanning of the sample. Thus the methods are slow and exhibit limited resolution due to focusing challenges. We demonstrate a new focusing-free x-ray fluorescence method based on ghost imaging that overcomes those limitations. We combine our procedure with compressed sensing to reduce the measurement time and the exposure to radiation by more than 80%. Since our method does not require focusing, it opens the possibility for improving the resolution and image quality of chemical element maps with tabletop x-ray sources and for extending the applicability of x-ray fluorescence detection to new fields such as medical imaging and homeland security applications.

Submitted 15 February, 2021; originally announced February 2021.

arXiv:2012.05939 [pdf, other]

Duals of non-zero square

Authors: Hannah R. Schwartz

Abstract: In this short note, for each non-zero integer n, we construct a 4-manifold containing a smoothly concordant pair of spheres with a common dual of square n but no automorphism carrying one sphere to the other. Our examples, besides showing that the square zero assumption on the dual is necessary in Gabai's and Schneiderman-Teichner's versions of the 4D Light Bulb Theorem, have the interesting feature that both the Freedman-Quinn and Kervaire-Milnor invariants of the pair of spheres vanish. The proof gives a surprising application of results due to Akbulut-Matveyev and Auckly-Kim-Melvin-Ruberman pertaining to the well-known Mazur cork.

Submitted 27 December, 2020; v1 submitted 10 December, 2020; originally announced December 2020.

Comments: Some typos fixed, and acknowledgements added

arXiv:2011.06457 [pdf]

World Trade Center responders in their own words: Predicting PTSD symptom trajectories with AI-based language analyses of interviews

Authors: Youngseo Son, Sean A. P. Clouston, Roman Kotov, Johannes C. Eichstaedt, Evelyn J. Bromet, Benjamin J. Luft, H Andrew Schwartz

Abstract: Background: Oral histories from 9/11 responders to the World Trade Center (WTC) attacks provide rich narratives about distress and resilience. Artificial Intelligence (AI) models promise to detect psychopathology in natural language, but they have been evaluated primarily in non-clinical settings using social media. This study sought to test the ability of AI-based language assessments to predict PTSD symptom trajectories among responders. Methods: Participants were 124 responders whose health was monitored at the Stony Brook WTC Health and Wellness Program who completed oral history interviews about their initial WTC experiences. PTSD symptom severity was measured longitudinally using the PTSD Checklist (PCL) for up to 7 years post-interview. AI-based indicators were computed for depression, anxiety, neuroticism, and extraversion along with dictionary-based measures of linguistic and interpersonal style. Linear regression and multilevel models estimated associations of AI indicators with concurrent and subsequent PTSD symptom severity (significance adjusted by false discovery rate). Results: Cross-sectionally, greater depressive language (beta=0.32; p=0.043) and first-person singular usage (beta=0.31; p=0.044) were associated with increased symptom severity. Longitudinally, anxious language predicted future worsening in PCL scores (beta=0.31; p=0.031), whereas first-person plural usage (beta=-0.37; p=0.007) and longer words usage (beta=-0.36; p=0.007) predicted improvement.
Conclusions: This is the first study to demonstrate the value of AI in understanding PTSD in a vulnerable population. Future studies should extend this application to other trauma exposures and to other demographic groups, especially under-represented minorities.

Submitted 12 November, 2020; originally announced November 2020.

Comments: 20 pages, 2 figures

arXiv:2011.03983 [pdf, other]

Detecting Emerging Symptoms of COVID-19 using Context-based Twitter Embeddings

Authors: Roshan Santosh, H. Andrew Schwartz, Johannes C. Eichstaedt, Lyle H. Ungar, Sharath C. Guntuku

Abstract: In this paper, we present an iterative graph-based approach for the detection of symptoms of COVID-19, the pathology of which seems to be evolving. More generally, the method can be applied to finding context-specific words and texts (e.g. symptom mentions) in large imbalanced corpora (e.g. all tweets mentioning #COVID-19). Given the novelty of COVID-19, we also test if the proposed approach generalizes to the problem of detecting Adverse Drug Reaction (ADR). We find that the approach applied to Twitter data can detect symptom mentions substantially before being reported by the Centers for Disease Control (CDC).

Submitted 8 November, 2020; originally announced November 2020.

Comments: In proceedings of EMNLP 2020 (Empirical Methods in NLP) workshop on COVID-19

arXiv:2009.05703 [pdf, other]

doi 10.2140/agt.2022.22.973

Gluck twisting roll spun knots

Authors: Patrick Naylor, Hannah Schwartz

Abstract: We show that the smooth homotopy 4-sphere obtained by Gluck twisting the m-twist n-roll spin of any unknotting number one knot is diffeomorphic to the standard 4-sphere, for any pair of integers (m,n). It follows as a corollary that an infinite collection of twisted doubles of Gompf's infinite order corks are standard.

Submitted 11 September, 2020; originally announced September 2020.

Comments: Comments welcome!

Journal ref: Algebr. Geom. Topol. 22 (2022) 973-990

arXiv:2007.13244 [pdf, other]

doi 10.1112/topo.12209

Unknotting numbers of 2-spheres in the 4-sphere

Authors: Jason Joseph, Michael Klug, Benjamin Ruppik, Hannah Schwartz

Abstract: We compare two naturally arising notions of unknotting number for 2-spheres in the 4-sphere: namely, the minimal number of 1-handle stabilizations needed to obtain an unknotted surface, and the minimal number of Whitney moves required in a regular homotopy to the unknotted 2-sphere. We refer to these invariants as the stabilization number and the Casson-Whitney number of the sphere, respectively. Using both algebraic and geometric techniques, we show that the stabilization number is bounded above by one more than the Casson-Whitney number. We also provide explicit families of spheres for which these invariants are equal, as well as families for which they are distinct. Furthermore, we give additional bounds for both invariants, concrete examples of their non-additivity, and applications to classical unknotting number of 1-knots.

Submitted 21 September, 2021; v1 submitted 26 July, 2020; originally announced July 2020.

Comments: 29 pages, 22 figures; v2 is the final draft which has been accepted for publication in Journal of Topology; v2 includes improvements to the exposition, the numbering of the theorems in the introduction and in some of the subsequent sections has changed

Report number: MPIM-Bonn-2020 MSC Class: 57K45 (Primary) 57K10; 57K40; 57R42; 57R52 (Secondary)

Journal ref: Journal of Topology, 14.4 (2021) 1321-1350

arXiv:2004.06303 [pdf, other]

doi 10.1145/3366423.3380066

Quantifying Community Characteristics of Maternal Mortality Using Social Media

Authors: Rediet Abebe, Salvatore Giorgi, Anna Tedijanto, Anneke Buffone, H. Andrew Schwartz

Abstract: While most mortality rates have decreased in the US, maternal mortality has increased and is among the highest of any OECD nation. Extensive public health research is ongoing to better understand the characteristics of communities with relatively high or low rates. In this work, we explore the role that social media language can play in providing insights into such community characteristics. Analyzing pregnancy-related tweets generated in US counties, we reveal a diverse set of latent topics including Morning Sickness, Celebrity Pregnancies, and Abortion Rights. We find that rates of mentioning these topics on Twitter predict maternal mortality rates with higher accuracy than standard socioeconomic and risk variables such as income, race, and access to health-care, holding even after reducing the analysis to six topics chosen for their interpretability and connections to known risk factors. We then investigate psychological dimensions of community language, finding the use of less trustful, more stressed, and more negative affective language is significantly associated with higher mortality rates, while trust and negative affect also explain a significant portion of racial disparities in maternal mortality. We discuss the potential for these insights to inform actionable health interventions at the community-level.

Submitted 14 April, 2020; originally announced April 2020.

Comments: In Proceedings of The Web Conference 2020 (WWW '20)

arXiv:1912.11078 [pdf, other]

doi 10.18653/v1/2020.acl-main.468

Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview

Authors: Deven Shah, H. Andrew Schwartz, Dirk Hovy

Abstract: An increasing number of works in natural language processing have addressed the effect of bias on the predicted outcomes, introducing mitigation techniques that act on different parts of the standard NLP pipeline (data and models). However, these works have been conducted in isolation, without a unifying framework to organize efforts within the field. This leads to repetitive approaches, and puts an undue focus on the effects of bias, rather than on their origins. Research focused on bias symptoms rather than the underlying origins could limit the development of effective countermeasures. In this paper, we propose a unifying conceptualization: the predictive bias framework for NLP. We summarize the NLP literature and propose a general mathematical definition of predictive bias in NLP along with a conceptual framework, differentiating four main origins of biases: label bias, selection bias, model overamplification, and semantic bias. We discuss how past work has countered each bias origin. Our framework serves to guide an introductory overview of predictive bias in NLP, integrating existing work into a single structure and opening avenues for future research.

Submitted 12 September, 2020; v1 submitted 9 November, 2019; originally announced December 2019.

Comments: 9 pages excluding references, 1 figure, 3 pages for appendix

Journal ref: Association for Computational Linguistics. (2020) 5248--5264

arXiv:1911.05238 [pdf, other]

Benchmarking results for the Newton-Anderson method

Authors: Sara Pollock, Hunter Schwartz

Abstract: This paper primarily presents numerical results for the Anderson accelerated Newton method on a set of benchmark problems. The results demonstrate superlinear convergence to solutions of both degenerate and nondegenerate problems. The convergence for nondegenerate problems is also justified theoretically. For degenerate problems, those whose Jacobians are singular at a solution, the domain of convergence is studied. It is observed in that setting that Newton-Anderson has a domain of convergence similar to Newton, but it may be attracted to a different solution than Newton if the problems are slightly perturbed.

Submitted 12 November, 2019; originally announced November 2019.

Comments: 12 pages, 5 figures, 3 tables

MSC Class: 65B05; 65H10

arXiv:1911.03855 [pdf, other]

Correcting Sociodemographic Selection Biases for Population Prediction from Social Media

Authors: Salvatore Giorgi, Veronica Lynn, Keshav Gupta, Farhan Ahmed, Sandra Matz, Lyle Ungar, H. Andrew Schwartz

Abstract: Social media is increasingly used for large-scale population predictions, such as estimating community health statistics. However, social media users are not typically a representative sample of the intended population -- a "selection bias". Within the social sciences, such a bias is typically addressed with restratification techniques, where observations are reweighted according to how under- or over-sampled their socio-demographic groups are. Yet, restratification is rarely evaluated for improving prediction. In this two-part study, we first evaluate standard, "out-of-the-box" restratification techniques, finding they provide no improvement and often even degraded prediction accuracies across four tasks of estimating U.S. county population health statistics from Twitter. The core reasons for degraded performance seem to be tied to their reliance on either sparse or shrunken estimates of each population's socio-demographics. In the second part of our study, we develop and evaluate Robust Poststratification, which consists of three methods to address these problems: (1) estimator redistribution to account for shrinking, as well as (2) adaptive binning and (3) informed smoothing to handle sparse socio-demographic estimates. We show that each of these methods leads to significant improvement in prediction accuracies over the standard restratification approaches.
Taken together, Robust Poststratification enables state-of-the-art prediction accuracies, yielding a 53.0% increase in variance explained (R^2) in the case of surveyed life satisfaction, and a 17.8% average increase across all tasks.

Submitted 7 June, 2022; v1 submitted 10 November, 2019; originally announced November 2019.

Comments: Published at the 16th International AAAI Conference on Web and Social Media (ICWSM) 2022

arXiv:1902.02840 [pdf, other]

Higher order corks

Authors: Paul Melvin, Hannah Schwartz

Abstract: It is shown that any finite list of smooth closed simply-connected 4-manifolds homeomorphic to a given one X can be obtained by removing a single compact contractible submanifold (or cork) from X, and then regluing it by powers of a boundary diffeomorphism. We then use this result to "separate" finite families of corks embedded in a fixed 4-manifold.

Submitted 28 November, 2020; v1 submitted 7 February, 2019; originally announced February 2019.

Comments: Final version, to appear in Inventiones Mathematicae. Note that many of our infinite order results have been removed due to errors

arXiv:1811.02753 [pdf, other]

doi 10.2140/agt.2020.20.3313

A note on the complexity of h-cobordisms

Authors: Hannah R. Schwartz

Abstract: We show that the number of double points of smoothly immersed 2-spheres representing certain homology classes of an oriented, smooth, closed, simply-connected 4-manifold X must increase with the complexity of corresponding h-cobordisms from X to X. As an application, we give results restricting the minimal number of double points of immersed spheres in manifolds homeomorphic to rational surfaces.

Submitted 5 April, 2020; v1 submitted 6 November, 2018; originally announced November 2018.

Comments: Minor corrections, final version to appear in AGT

Journal ref: Algebr. Geom. Topol. 20 (2020) 3313-3327

arXiv:1810.10949 [pdf, other]

Learning Emotion from 100 Observations: Unexpected Robustness of Deep Learning under Strong Data Limitations

Authors: Sven Buechel, João Sedoc, H. Andrew Schwartz, Lyle Ungar

Abstract: One of the major downsides of Deep Learning is its supposed need for vast amounts of training data. As such, these techniques appear ill-suited for NLP areas where annotated data is limited, such as less-resourced languages or emotion analysis, with its many nuanced and hard-to-acquire annotation formats. We conduct a questionnaire study indicating that indeed the vast majority of researchers in emotion analysis deem neural models inferior to traditional machine learning when training data is limited. In stark contrast to those survey results, we provide empirical evidence for English, Polish, and Portuguese that commonly used neural architectures can be trained on surprisingly few observations, outperforming $n$-gram based ridge regression on only 100 data points. Our analysis suggests that high-quality, pre-trained word embeddings are a main factor for achieving those results.

Submitted 7 December, 2020; v1 submitted 25 October, 2018; originally announced October 2018.

Comments: Published at PEOPLES 2020

arXiv:1809.01202 [pdf, other]

Causal Explanation Analysis on Social Media

Authors: Youngseo Son, Nipun Bayas, H. Andrew Schwartz

Abstract: Understanding causal explanations - reasons given for happenings in one's life - has been found to be an important psychological factor linked to physical and mental health. Causal explanations are often studied through manual identification of phrases over limited samples of personal writing. Automatic identification of causal explanations in social media, while challenging in relying on contextual and sequential cues, offers a larger-scale alternative to expensive manual ratings and opens the door for new applications (e.g. studying prevailing beliefs about causes, such as climate change). Here, we explore automating causal explanation analysis, building on discourse parsing, and presenting two novel subtasks: causality detection (determining whether a causal explanation exists at all) and causal explanation identification (identifying the specific phrase that is the explanation). We achieve strong accuracies for both tasks but find different approaches best: an SVM for causality prediction (F1 = 0.791) and a hierarchy of Bidirectional LSTMs for causal explanation identification (F1 = 0.853). Finally, we explore applications of our complete pipeline (F1 = 0.868), showing demographic differences in mentions of causal explanation and that the association between a word and sentiment can change when it is used within a causal explanation.

Submitted 18 October, 2018; v1 submitted 4 September, 2018; originally announced September 2018.

Comments: To appear in EMNLP 2018; 10 pages

arXiv:1808.09600 [pdf, ps, other]

The Remarkable Benefit of User-Level Aggregation for Lexical-based Population-Level Predictions

Authors: Salvatore Giorgi, Daniel Preotiuc-Pietro, Anneke Buffone, Daniel Rieman, Lyle H. Ungar, H. Andrew Schwartz

Abstract: Nowcasting based on social media text promises to provide unobtrusive and near real-time predictions of community-level outcomes. These outcomes are typically regarding people, but the data is often aggregated without regard to users in the Twitter populations of each community. This paper describes a simple yet effective method for building community-level models using Twitter language aggregated by user. Results on four different U.S. county-level tasks, spanning demographic, health, and psychological outcomes, show large and consistent improvements in prediction accuracies (e.g. from Pearson r=.73 to .82 for median income prediction or r=.37 to .47 for life satisfaction prediction) over the standard approach of aggregating all tweets. We make our aggregated and anonymized community-level data, derived from 37 billion tweets -- over 1 billion of which were mapped to counties -- available for research.

Submitted 28 August, 2018; originally announced August 2018.

Comments: To appear in the proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP)

arXiv:1808.09479 [pdf, other]

Residualized Factor Adaptation for Community Social Media Prediction Tasks

Authors: Mohammadzaman Zamani, H. Andrew Schwartz, Veronica E. Lynn, Salvatore Giorgi, Niranjan Balasubramanian

Abstract: Predictive models over social media language have shown promise in capturing community outcomes, but approaches thus far largely neglect the socio-demographic context (e.g. age, education rates, race) of the community from which the language originates. For example, it may be inaccurate to assume people in Mobile, Alabama, where the population is relatively older, will use words the same way as those from San Francisco, where the median age is younger with a higher rate of college education. In this paper, we present residualized factor adaptation, a novel approach to community prediction tasks which both (a) effectively integrates community attributes, as well as (b) adapts linguistic features to community attributes (factors). We use eleven demographic and socioeconomic attributes, and evaluate our approach over five different community-level predictive tasks, spanning health (heart disease mortality, percent fair/poor health), psychology (life satisfaction), and economics (percent housing price increase, foreclosure rate). Our evaluation shows that residualized factor adaptation significantly improves 4 out of 5 community-level outcome predictions over prior state-of-the-art for incorporating socio-demographic contexts.

Submitted 28 August, 2018; originally announced August 2018.

Comments: Conference on Empirical Methods in Natural Language Processing (EMNLP 2018)

Journal ref: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 3560-3569, 2018

arXiv:1808.05668 [pdf, other]

doi 10.18653/v1/W18-0619

Predicting Human Trustfulness from Facebook Language

Authors: Mohammadzaman Zamani, Anneke Buffone, H. Andrew Schwartz

Abstract: Trustfulness -- one's general tendency to have confidence in unknown people or situations -- predicts many important real-world outcomes such as mental health and likelihood to cooperate with others such as clinicians. While data-driven measures of interpersonal trust have previously been introduced, here, we develop the first language-based assessment of the personality trait of trustfulness by fitting one's language to an accepted questionnaire-based trust score. Further, using trustfulness as a type of case study, we explore the role of questionnaire size as well as word count in developing language-based predictive models of users' psychological traits. We find that leveraging a longer questionnaire can yield greater test set accuracy, while, for training, we find it beneficial to include users who took smaller questionnaires which offers more observations for training. Similarly, after noting a decrease in individual prediction error as word count increased, we found a word count-weighted training scheme was helpful when there were very few users in the first place.

Submitted 16 August, 2018; originally announced August 2018.

Comments: CLPsych2018

Journal ref: In Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic, pages 174-181, 2018

arXiv:1806.07541 [pdf, other]

doi 10.1112/topo.12121

Equivalent non-isotopic spheres in 4-manifolds

Authors: Hannah R. Schwartz

Abstract: We construct infinitely many smooth oriented 4-manifolds containing pairs of homotopic, smoothly embedded 2-spheres that are not topologically isotopic, but that are equivalent by an ambient diffeomorphism inducing the identity on homology. These examples show that Gabai's recent "Generalized" 4D Lightbulb Theorem does not generalize to arbitrary 4-manifolds. In contrast, we also show that there are smoothly embedded 2-spheres that are both equivalent and topologically isotopic, but not smoothly isotopic.

Submitted 12 July, 2019; v1 submitted 20 June, 2018; originally announced June 2018.

Comments: Final version, accepted for publication by the Journal of Topology

arXiv:1806.05740 [pdf, other]

Using Search Queries to Understand Health Information Needs in Africa

Authors: Rediet Abebe, Shawndra Hill, Jennifer Wortman Vaughan, Peter M. Small, H. Andrew Schwartz

Abstract: The lack of comprehensive, high-quality health data in developing nations creates a roadblock for combating the impacts of disease. One key challenge is understanding the health information needs of people in these nations. Without understanding people's everyday needs, concerns, and misconceptions, health organizations and policymakers lack the ability to effectively target education and programming efforts. In this paper, we propose a bottom-up approach that uses search data from individuals to uncover and gain insight into health information needs in Africa. We analyze Bing searches related to HIV/AIDS, malaria, and tuberculosis from all 54 African nations. For each disease, we automatically derive a set of common search themes or topics, revealing a wide-spread interest in various types of information, including disease symptoms, drugs, concerns about breastfeeding, as well as stigma, beliefs in natural cures, and other topics that may be hard to uncover through traditional surveys. We expose the different patterns that emerge in health information needs by demographic groups (age and sex) and country. We also uncover discrepancies in the quality of content returned by search engines to users by topic. Combined, our results suggest that search data can help illuminate health information needs in Africa and inform discussions on health policy and targeted education efforts both on- and offline.

Submitted 17 April, 2019; v1 submitted 14 June, 2018; originally announced June 2018.

Comments: Extended version of an ICWSM 2019 paper

arXiv:1708.03208 [pdf, ps, other]

Isotopy of surfaces in 4-manifolds after a single stabilization

Authors: Dave Auckly, Hee Jung Kim, Paul Melvin, Daniel Ruberman, Hannah Schwartz

Abstract: Any two homologous surfaces of the same genus embedded in a smooth 4-manifold X with simply-connected complements are shown to be smoothly isotopic in the connected sum of X and the product of a 2-sphere with itself, if the surfaces are ordinary, and in the connected sum of X with the non-trivial sphere bundle over the sphere if they are characteristic.

Submitted 10 August, 2017; originally announced August 2017.

arXiv:1705.08038 [pdf, other]

doi 10.1371/journal.pone.0201703

Latent Human Traits in the Language of Social Media: An Open-Vocabulary Approach

Authors: Vivek Kulkarni, Margaret L. Kern, David Stillwell, Michal Kosinski, Sandra Matz, Lyle Ungar, Steven Skiena, H. Andrew Schwartz

Abstract: Over the past century, personality theory and research has successfully identified core sets of characteristics that consistently describe and explain fundamental differences in the way people think, feel and behave. Such characteristics were derived through theory, dictionary analyses, and survey research using explicit self-reports. The availability of social media data spanning millions of users now makes it possible to automatically derive characteristics from language use -- at large scale. Taking advantage of linguistic information available through Facebook, we study the process of inferring a new set of potential human traits based on unprompted language use. We subject these new traits to a comprehensive set of evaluations and compare them with a popular five factor model of personality. We find that our language-based trait construct is often more generalizable in that it often predicts non-questionnaire-based outcomes better than questionnaire-based traits (e.g. entities someone likes, income and intelligence quotient), while the factors remain nearly as stable as traditional factors. Our approach suggests a value in new constructs of personality derived from everyday human language use.

Submitted 22 May, 2017; originally announced May 2017.

Comments: In submission to PLOS One

arXiv:1503.06969 [pdf]

Hox genes underlie metazoan development, but what controls them?

Authors: Raffaele Di Giacomo, Bruno Maresca, Jeffrey H. Schwartz

Abstract: Although metazoan development is conceived as resulting from gene regulatory networks (GRNs) controlled by Hox genes, a better analogy is computer architecture: i.e., a task accomplished in sequential steps linked to an external referent that "counts" each step. A developmental "step" equals the expression of genes in specific cells at specific times and telomeres represent external "counters" wherein "counting" is a function of telomere shortening at each cell division that permits the sequential expression of Hox genes and, ultimately, complex form. Metazoan development thus best resembles a Turing machine, which could be used to model the development of any metazoan.

Submitted 24 March, 2015; originally announced March 2015.

arXiv:1407.0297

Deleting an edge of a 3-cycle in an intrinsically knotted graph gives an intrinsically linked graph

Authors: Ramin Naimi, Elena Pavelescu, Hannah Schwartz

Abstract: We show that deleting an edge of a 3-cycle in an intrinsically knotted graph gives an intrinsically linked graph.

Submitted 8 April, 2021; v1 submitted 1 July, 2014; originally announced July 2014.

Comments: The authors found an error in the proof of Proposition 2. They also found a counterexample to Proposition 2

MSC Class: 05C10 (Primary); 57M15; 57M25 (Secondary)

arXiv:1401.3907 [pdf]

doi 10.1613/jair.3384

Policy Invariance under Reward Transformations for General-Sum Stochastic Games

Authors: Xiaosong Lu, Howard M. Schwartz, Sidney N. Givigi Jr

Abstract: We extend the potential-based shaping method from Markov decision processes to multi-player general-sum stochastic games. We prove that the Nash equilibria in a stochastic game remains unchanged after potential-based shaping is applied to the environment. The property of policy invariance provides a possible way of speeding convergence when learning to play a stochastic game.

Submitted 16 January, 2014; originally announced January 2014.

Journal ref: Journal Of Artificial Intelligence Research, Volume 41, pages 397-406, 2011

Showing 1–43 of 43 results for author: Schwartz, H