Showing 1–50 of 181 results for author: Lerman, K

  1. arXiv:2410.13776  [pdf, other]

    cs.CL cs.AI

    Aggregation Artifacts in Subjective Tasks Collapse Large Language Models' Posteriors

    Authors: Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan

    Abstract: In-context Learning (ICL) has become the primary method for performing natural language tasks with Large Language Models (LLMs). The knowledge acquired during pre-training is crucial for this few-shot capability, providing the model with task priors. However, recent studies have shown that ICL predominantly relies on retrieving task priors rather than "learning" to perform tasks. This limitation i…

    Submitted 18 October, 2024; v1 submitted 17 October, 2024; originally announced October 2024.

    Comments: 12 pages, 7 figures, 2 tables

  2. arXiv:2409.16558  [pdf, other]

    cs.SI cs.CY

    Bias Reduction in Social Networks through Agent-Based Simulations

    Authors: Nathan Bartley, Keith Burghardt, Kristina Lerman

    Abstract: Online social networks use recommender systems to suggest relevant information to their users in the form of personalized timelines. Studying how these systems expose people to information at scale is difficult to do as one cannot assume each user is subject to the same timeline condition and building appropriate evaluation infrastructure is costly. We show that a simple agent-based model where us…

    Submitted 24 September, 2024; originally announced September 2024.

  3. arXiv:2409.13237  [pdf, other]

    cs.SI

    RTs != Endorsements: Rethinking Exposure Fairness on Social Media Platforms

    Authors: Nathan Bartley, Kristina Lerman

    Abstract: Recommender systems underpin many of the personalized services in the online information & social media ecosystem. However, the assumptions in the research on content recommendations in domains like search, video, and music are often applied wholesale to domains that require a better understanding of why and how users interact with the systems. In this position paper we focus on social media and a…

    Submitted 20 September, 2024; originally announced September 2024.

  4. arXiv:2409.13064  [pdf, other]

    cs.SI cs.AI

    Fear and Loathing on the Frontline: Decoding the Language of Othering by Russia-Ukraine War Bloggers

    Authors: Patrick Gerard, William Theisen, Tim Weninger, Kristina Lerman

    Abstract: Othering, the act of portraying outgroups as fundamentally different from the ingroup, often escalates into framing them as existential threats--fueling intergroup conflict and justifying exclusion and violence. These dynamics are alarmingly pervasive, spanning from the extreme historical examples of genocides against minorities in Germany and Rwanda to the ongoing violence and rhetoric targeting…

    Submitted 19 September, 2024; originally announced September 2024.

    Comments: 15 pages

  5. arXiv:2409.07710  [pdf, other]

    cs.SI cs.DL

    Surprising Resilience of Science During a Global Pandemic: A Large-Scale Descriptive Analysis

    Authors: Kian Ahrabian, Casandra Rusti, Ziao Wang, Jay Pujara, Kristina Lerman

    Abstract: The COVID-19 pandemic profoundly impacted people globally, yet its effect on scientists and research institutions has yet to be fully examined. To address this knowledge gap, we use a newly available bibliographic dataset covering tens of millions of papers and authors to investigate changes in research activity and collaboration during this period. Employing statistical methods, we analyze the pa…

    Submitted 11 September, 2024; originally announced September 2024.

  6. arXiv:2409.07684  [pdf, other]

    cs.SI cs.AI

    Modeling Information Narrative Detection and Evolution on Telegram during the Russia-Ukraine War

    Authors: Patrick Gerard, Svitlana Volkova, Louis Penafiel, Kristina Lerman, Tim Weninger

    Abstract: Following the Russian Federation's full-scale invasion of Ukraine in February 2022, a multitude of information narratives emerged within both pro-Russian and pro-Ukrainian communities online. As the conflict progresses, so too do the information narratives, constantly adapting and influencing local and global community perceptions and attitudes. This dynamic nature of the evolving information envi…

    Submitted 11 September, 2024; originally announced September 2024.

    Comments: 12 pages, International AAAI Conference on Web and Social Media 2025

  7. arXiv:2409.06173  [pdf, other]

    cs.CL cs.AI

    Larger Language Models Don't Care How You Think: Why Chain-of-Thought Prompting Fails in Subjective Tasks

    Authors: Georgios Chochlakis, Niyantha Maruthu Pandiyan, Kristina Lerman, Shrikanth Narayanan

    Abstract: In-Context Learning (ICL) in Large Language Models (LLM) has emerged as the dominant technique for performing natural language tasks, as it does not require updating the model parameters with gradient-based methods. ICL promises to "adapt" the LLM to perform the present task at a competitive or state-of-the-art level at a fraction of the computational cost. ICL can be augmented by incorporating th…

    Submitted 17 October, 2024; v1 submitted 9 September, 2024; originally announced September 2024.

    Comments: 5 pages, 2 figures, 1 table. arXiv admin note: text overlap with arXiv:2403.17125

  8. arXiv:2409.04043  [pdf, other]

    cs.CL

    Towards Safer Online Spaces: Simulating and Assessing Intervention Strategies for Eating Disorder Discussions

    Authors: Louis Penafiel, Hsien-Te Kao, Isabel Erickson, David Chu, Robert McCormack, Kristina Lerman, Svitlana Volkova

    Abstract: Eating disorders are complex mental health conditions that affect millions of people around the world. Effective interventions on social media platforms are crucial, yet testing strategies in situ can be risky. We present a novel LLM-driven experimental testbed for simulating and assessing intervention strategies in ED-related discussions. Our framework generates synthetic conversations across mul…

    Submitted 6 September, 2024; originally announced September 2024.

    Comments: 9 pages, 5 figures

  9. arXiv:2408.09366  [pdf, other]

    cs.CL cs.CY cs.SI

    Improving and Assessing the Fidelity of Large Language Models Alignment to Online Communities

    Authors: Minh Duc Chu, Zihao He, Rebecca Dorn, Kristina Lerman

    Abstract: Large language models (LLMs) have shown promise in representing individuals and communities, offering new ways to study complex social dynamics. However, effectively aligning LLMs with specific human groups and systematically assessing the fidelity of the alignment remains a challenge. This paper presents a robust framework for aligning LLMs with online communities via instruction-tuning and compr…

    Submitted 18 August, 2024; originally announced August 2024.

  10. arXiv:2407.03551  [pdf, other]

    cs.SI cs.CL cs.CY

    Feelings about Bodies: Emotions on Diet and Fitness Forums Reveal Gendered Stereotypes and Body Image Concerns

    Authors: Cinthia Sánchez, Minh Duc Chu, Zihao He, Rebecca Dorn, Stuart Murray, Kristina Lerman

    Abstract: The gendered expectations about ideal body types can lead to body image concerns, dissatisfaction, and in extreme cases, disordered eating and other psychopathologies across the gender spectrum. While research has focused on pro-anorexia online communities that glorify the 'thin ideal', less attention has been given to the broader spectrum of body image concerns or how emerging disorders like musc…

    Submitted 3 July, 2024; originally announced July 2024.

  11. arXiv:2406.12074  [pdf, other]

    cs.CL

    COMMUNITY-CROSS-INSTRUCT: Unsupervised Instruction Generation for Aligning Large Language Models to Online Communities

    Authors: Zihao He, Minh Duc Chu, Rebecca Dorn, Siyi Guo, Kristina Lerman

    Abstract: Social scientists use surveys to probe the opinions and beliefs of populations, but these methods are slow, costly, and prone to biases. Recent advances in large language models (LLMs) enable the creation of computational representations or "digital twins" of populations that generate human-like responses mimicking the population's language, styles, and attitudes. We introduce Community-Cross-Inst…

    Submitted 6 October, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

  12. arXiv:2406.01866  [pdf, other]

    cs.CL cs.CY cs.SI

    #EpiTwitter: Public Health Messaging During the COVID-19 Pandemic

    Authors: Ashwin Rao, Nazanin Sabri, Siyi Guo, Louiqa Raschid, Kristina Lerman

    Abstract: Effective communication during health crises is critical, with social media serving as a key platform for public health experts (PHEs) to engage with the public. However, it also amplifies pseudo-experts promoting contrarian views. Despite its importance, the role of emotional and moral language in PHEs' communication during COVID-19 remains underexplored. This study examines how PHEs and pseudo-…

    Submitted 10 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  13. arXiv:2406.00020  [pdf, other]

    cs.CL cs.CY

    Harmful Speech Detection by Language Models Exhibits Gender-Queer Dialect Bias

    Authors: Rebecca Dorn, Lee Kezar, Fred Morstatter, Kristina Lerman

    Abstract: Content moderation on social media platforms shapes the dynamics of online discourse, influencing whose voices are amplified and whose are suppressed. Recent studies have raised concerns about the fairness of content moderation practices, particularly for aggressively flagging posts from transgender and non-binary individuals as toxic. In this study, we investigate the presence of bias in harmful…

    Submitted 21 June, 2024; v1 submitted 23 May, 2024; originally announced June 2024.

  14. arXiv:2405.18374  [pdf, other]

    cs.CY cs.HC

    Hostile Counterspeech Drives Users From Hate Subreddits

    Authors: Daniel Hickey, Matheus Schmitz, Daniel M. T. Fessler, Paul E. Smaldino, Kristina Lerman, Goran Murić, Keith Burghardt

    Abstract: Counterspeech -- speech that opposes hate speech -- has gained significant attention recently as a strategy to reduce hate on social media. While previous studies suggest that counterspeech can somewhat reduce hate speech, little is known about its effects on participation in online hate communities, nor which counterspeech tactics reduce harmful behavior. We begin to address these gaps by identif…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 19 pages, 11 figures. arXiv admin note: text overlap with arXiv:2303.13641

  15. arXiv:2405.17838  [pdf, other]

    cs.LG cs.AI

    Trust and Terror: Hazards in Text Reveal Negatively Biased Credulity and Partisan Negativity Bias

    Authors: Keith Burghardt, Daniel M. T. Fessler, Chyna Tang, Anne Pisor, Kristina Lerman

    Abstract: Socio-linguistic indicators of text, such as emotion or sentiment, are often extracted using neural networks in order to better understand features of social media. One indicator that is often overlooked, however, is the presence of hazards within text. Recent psychological research suggests that statements about hazards are more believable than statements about benefits (a property known as negat…

    Submitted 28 May, 2024; originally announced May 2024.

    Comments: 14 pages, 16 figures

  16. arXiv:2405.17410  [pdf, other]

    cs.SI cs.CY cs.HC

    The Peripatetic Hater: Predicting Movement Among Hate Subreddits

    Authors: Daniel Hickey, Daniel M. T. Fessler, Kristina Lerman, Keith Burghardt

    Abstract: Many online hate groups exist to disparage others based on race, gender identity, sex, or other characteristics. The accessibility of these communities allows users to join multiple types of hate groups (e.g., a racist community and misogynistic community), which calls into question whether these peripatetic users could be further radicalized compared to users that stay in one type of hate group.…

    Submitted 27 May, 2024; originally announced May 2024.

    Comments: 11 pages, 6 figures

  17. arXiv:2405.05275  [pdf, other]

    cs.SI cs.AI cs.IR

    SoMeR: Multi-View User Representation Learning for Social Media

    Authors: Siyi Guo, Keith Burghardt, Valeria Pantè, Kristina Lerman

    Abstract: User representation learning aims to capture user preferences, interests, and behaviors in low-dimensional vector representations. These representations have widespread applications in recommendation systems and advertising; however, existing methods typically rely on specific features like text content, activity patterns, or platform metadata, failing to holistically model user behavior across di…

    Submitted 2 May, 2024; originally announced May 2024.

  18. arXiv:2405.03688  [pdf, other]

    cs.CL cs.LG

    Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames

    Authors: Keith Burghardt, Kai Chen, Kristina Lerman

    Abstract: Adversarial information operations can destabilize societies by undermining fair elections, manipulating public opinions on policies, and promoting scams. Despite their widespread occurrence and potential impacts, our understanding of influence campaigns is limited by manual analysis of messages and subjective interpretation of their observable behavior. In this paper, we explore whether these lim…

    Submitted 6 May, 2024; originally announced May 2024.

    Comments: 15 pages, 9 figures

  19. arXiv:2404.15457  [pdf, other]

    cs.SI

    Hidden in Plain Sight: Exploring the Intersections of Mental Health, Eating Disorders, and Content Moderation on TikTok

    Authors: Charles Bickham, Kia Kazemi-Nia, Luca Luceri, Kristina Lerman, Emilio Ferrara

    Abstract: Social media platforms actively moderate content glorifying harmful behaviors like eating disorders, which include anorexia and bulimia. However, users have adapted to evade moderation by using coded hashtags. Our study investigates the prevalence of moderation evaders on the popular social media platform TikTok and contrasts their use and emotional valence with mainstream hashtags. We notice that…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 10 pages, 5 figures, 2 tables

  20. arXiv:2403.17125  [pdf, other]

    cs.CL cs.AI

    The Strong Pull of Prior Knowledge in Large Language Models and Its Impact on Emotion Recognition

    Authors: Georgios Chochlakis, Alexandros Potamianos, Kristina Lerman, Shrikanth Narayanan

    Abstract: In-context Learning (ICL) has emerged as a powerful paradigm for performing natural language tasks with Large Language Models (LLM) without updating the models' parameters, in contrast to the traditional gradient-based finetuning. The promise of ICL is that the LLM can adapt to perform the present task at a competitive or state-of-the-art level at a fraction of the cost. The ability of LLMs to per…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: 30 pages, 27 figures

  21. arXiv:2403.16940  [pdf, other]

    cs.SI cs.MA physics.soc-ph

    Dynamics of Affective Polarization: From Consensus to Partisan Divides

    Authors: Buddhika Nettasinghe, Allon G. Percus, Kristina Lerman

    Abstract: Politically divided societies are also often divided emotionally: people like and trust those with similar political views (in-group favoritism) while disliking and distrusting those with different views (out-group animosity). This phenomenon, called affective polarization, influences individual decisions, including seemingly apolitical choices such as whether to wear a mask or what car to buy. We…

    Submitted 25 March, 2024; originally announced March 2024.

  22. arXiv:2403.04085  [pdf, other]

    cs.CL cs.CY

    Don't Blame the Data, Blame the Model: Understanding Noise and Bias When Learning from Subjective Annotations

    Authors: Abhishek Anand, Negar Mokhberian, Prathyusha Naresh Kumar, Anweasha Saha, Zihao He, Ashwin Rao, Fred Morstatter, Kristina Lerman

    Abstract: Researchers have raised awareness about the harms of aggregating labels especially in subjective tasks that naturally contain disagreements among human annotators. In this work we show that models that are only provided aggregated labels show low confidence on high-disagreement data instances. While previous studies consider such instances as mislabeled, we argue that the reason the high-disagreem…

    Submitted 6 March, 2024; originally announced March 2024.

  23. arXiv:2402.11725  [pdf, other]

    cs.CL cs.CR cs.CY

    How Susceptible are Large Language Models to Ideological Manipulation?

    Authors: Kai Chen, Zihao He, Jun Yan, Taiwei Shi, Kristina Lerman

    Abstract: Large Language Models (LLMs) possess the potential to exert substantial influence on public perceptions and interactions with information. This raises concerns about the societal impact that could arise if the ideologies within these models can be easily manipulated. In this work, we investigate how effectively LLMs can learn and generalize ideological biases from their instruction-tuning data. Ou…

    Submitted 18 June, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

  24. arXiv:2402.11114  [pdf, other]

    cs.CL cs.CY cs.SI

    Whose Emotions and Moral Sentiments Do Language Models Reflect?

    Authors: Zihao He, Siyi Guo, Ashwin Rao, Kristina Lerman

    Abstract: Language models (LMs) are known to represent the perspectives of some social groups better than others, which may impact their performance, especially on subjective tasks such as content moderation and hate speech detection. To explore how LMs represent different perspectives, existing research focused on positional alignment, i.e., how closely the models mimic the opinions and stances of differen…

    Submitted 17 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  25. arXiv:2402.05882  [pdf, other]

    cs.SI cs.CY cs.HC

    GET-Tok: A GenAI-Enriched Multimodal TikTok Dataset Documenting the 2022 Attempted Coup in Peru

    Authors: Gabriela Pinto, Keith Burghardt, Kristina Lerman, Emilio Ferrara

    Abstract: TikTok is one of the largest and fastest-growing social media sites in the world. TikTok features, however, such as voice transcripts, are often missing and other important features, such as OCR or video descriptions, do not exist. We introduce the Generative AI Enriched TikTok (GET-Tok) data, a pipeline for collecting TikTok videos and enriched data by augmenting the TikTok Research API with gene…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: Github repository: https://github.com/gabbypinto/GET-Tok-Peru

  26. arXiv:2402.01091  [pdf, other]

    cs.CL cs.CY cs.SI

    Reading Between the Tweets: Deciphering Ideological Stances of Interconnected Mixed-Ideology Communities

    Authors: Zihao He, Ashwin Rao, Siyi Guo, Negar Mokhberian, Kristina Lerman

    Abstract: Recent advances in NLP have improved our ability to understand the nuanced worldviews of online communities. Existing research focused on probing ideological stances treats liberals and conservatives as separate groups. However, this fails to account for the nuanced views of the organically formed online communities and the connections between them. In this paper, we study discussions of the 2020…

    Submitted 1 February, 2024; originally announced February 2024.

  27. arXiv:2401.09647  [pdf, other]

    cs.SI cs.CL cs.CY

    Large Language Models Help Reveal Unhealthy Diet and Body Concerns in Online Eating Disorders Communities

    Authors: Minh Duc Chu, Zihao He, Rebecca Dorn, Kristina Lerman

    Abstract: Eating disorders (ED), a severe mental health condition with high rates of mortality and morbidity, affect millions of people globally, especially adolescents. The proliferation of online communities that promote and normalize ED has been linked to this public health crisis. However, identifying harmful communities is challenging due to the use of coded language and other obfuscations. To address…

    Submitted 23 May, 2024; v1 submitted 17 January, 2024; originally announced January 2024.

  28. arXiv:2401.08202  [pdf, other]

    cs.SI cs.CY cs.DL

    IsamasRed: A Public Dataset Tracking Reddit Discussions on Israel-Hamas Conflict

    Authors: Kai Chen, Zihao He, Keith Burghardt, Jingxin Zhang, Kristina Lerman

    Abstract: The conflict between Israel and Palestinians significantly escalated after the October 7, 2023 Hamas attack, capturing global attention. To understand the public discourse on this conflict, we present a meticulously compiled dataset, IsamasRed, comprising nearly 400,000 conversations and over 8 million comments from Reddit, spanning from August 2023 to November 2023. We introduce an innovative keywo…

    Submitted 16 April, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

  29. arXiv:2401.06275  [pdf, other]

    cs.SI

    The Pulse of Mood Online: Unveiling Emotional Reactions in a Dynamic Social Media Landscape

    Authors: Siyi Guo, Zihao He, Ashwin Rao, Fred Morstatter, Jeffrey Brantingham, Kristina Lerman

    Abstract: The rich and dynamic information environment of social media provides researchers, policy makers, and entrepreneurs with opportunities to learn about social phenomena in a timely manner. However, using these data to understand social behavior is difficult due to heterogeneity of topics and events discussed in the highly dynamic online information environment. To address these challenges, we presen…

    Submitted 11 January, 2024; originally announced January 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2307.10245

  30. arXiv:2311.16831  [pdf, other]

    cs.CY

    Tracking a Year of Polarized Twitter Discourse on Abortion

    Authors: Ashwin Rao, Rong-Ching Chang, Qiankun Zhong, Kristina Lerman, Magdalena Wojcieszak

    Abstract: Abortion is one of the most contentious issues in American politics. The Dobbs v. Jackson Women's Health Organization ruling in 2022, which shifted the authority to regulate abortion from the federal government to the states, triggered intense protests and emotional debates across the nation. Yet, little is known about how online discourse about abortion rights fluctuated on social media platform…

    Submitted 28 November, 2023; originally announced November 2023.

  31. arXiv:2311.09743  [pdf, other]

    cs.CL

    Capturing Perspectives of Crowdsourced Annotators in Subjective Learning Tasks

    Authors: Negar Mokhberian, Myrl G. Marmarelis, Frederic R. Hopp, Valerio Basile, Fred Morstatter, Kristina Lerman

    Abstract: Supervised classification heavily depends on datasets annotated by humans. However, in subjective tasks such as toxicity classification, these annotations often exhibit low agreement among raters. Annotations have commonly been aggregated by employing methods like majority voting to determine a single ground truth label. In subjective tasks, aggregating labels will result in biased labeling and, c…

    Submitted 16 May, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

  32. arXiv:2311.09687  [pdf, other]

    cs.CL

    Inducing Political Bias Allows Language Models Anticipate Partisan Reactions to Controversies

    Authors: Zihao He, Siyi Guo, Ashwin Rao, Kristina Lerman

    Abstract: Social media platforms are rife with politically charged discussions. Therefore, accurately deciphering and predicting partisan biases using Large Language Models (LLMs) is increasingly critical. In this study, we address the challenge of understanding political bias in digitized discourse using LLMs. While traditional approaches often rely on finetuning separate models for each political faction,…

    Submitted 16 November, 2023; originally announced November 2023.

  33. arXiv:2310.18553  [pdf, other]

    cs.SI physics.soc-ph

    Affective Polarization and Dynamics of Information Spread in Online Networks

    Authors: Kristina Lerman, Dan Feldman, Zihao He, Ashwin Rao

    Abstract: Members of different political groups not only disagree about issues but also dislike and distrust each other. While social media can amplify this emotional divide -- called affective polarization by political scientists -- there is a lack of agreement on its strength and prevalence. We measure affective polarization on social media by quantifying the emotions and toxicity of reply interactions. W…

    Submitted 7 May, 2024; v1 submitted 27 October, 2023; originally announced October 2023.

  34. arXiv:2307.10245  [pdf, other]

    cs.SI physics.soc-ph

    Measuring Online Emotional Reactions to Events

    Authors: Siyi Guo, Zihao He, Ashwin Rao, Eugene Jang, Yuanfeixue Nan, Fred Morstatter, Jeffrey Brantingham, Kristina Lerman

    Abstract: The rich and dynamic information environment of social media provides researchers, policy makers, and entrepreneurs with opportunities to learn about social phenomena in a timely manner. However, using this data to understand social behavior is difficult due to the heterogeneity of topics and events discussed in the highly dynamic online information environment. To address these challenges, we present a…

    Submitted 28 March, 2024; v1 submitted 17 July, 2023; originally announced July 2023.

    Comments: Proceedings of the International Conference on Advances in Social Networks Analysis and Mining. 2023

  35. arXiv:2307.08541  [pdf, other]

    cs.CL cs.SI

    Discovering collective narratives shifts in online discussions

    Authors: Wanying Zhao, Siyi Guo, Kristina Lerman, Yong-Yeol Ahn

    Abstract: Narrative is a foundation of human cognition and decision making. Because narratives play a crucial role in societal discourses and the spread of misinformation, and because of the pervasive use of social media, the narrative dynamics on social media can have profound societal impact. Yet, systematic and computational understanding of online narratives faces the critical challenge of the scale and dynamics…

    Submitted 17 July, 2023; originally announced July 2023.

  36. arXiv:2305.18533  [pdf, other]

    cs.SI cs.CY

    Pandemic Culture Wars: Partisan Differences in the Moral Language of COVID-19 Discussions

    Authors: Ashwin Rao, Siyi Guo, Sze-Yuh Nina Wang, Fred Morstatter, Kristina Lerman

    Abstract: Effective response to pandemics requires coordinated adoption of mitigation measures, like masking and quarantines, to curb a virus's spread. However, as the COVID-19 pandemic demonstrated, political divisions can hinder consensus on the appropriate response. To better understand these divisions, our study examines a vast collection of COVID-19-related tweets. We focus on five contentious issues:…

    Submitted 17 October, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

  37. arXiv:2305.11867  [pdf, other]

    cs.SI

    Socio-Linguistic Characteristics of Coordinated Inauthentic Accounts

    Authors: Keith Burghardt, Ashwin Rao, Siyi Guo, Zihao He, Georgios Chochlakis, Baruah Sabyasachee, Andrew Rojecki, Shri Narayanan, Kristina Lerman

    Abstract: Online manipulation is a pressing concern for democracies, but the actions and strategies of coordinated inauthentic accounts, which have been used to interfere in elections, are not well understood. We analyze a five million-tweet multilingual dataset related to the 2017 French presidential election, when a major information campaign led by Russia called "#MacronLeaks" took place. We utilize heur…

    Submitted 30 May, 2023; v1 submitted 19 May, 2023; originally announced May 2023.

    Comments: 12 pages, 9 figures; figures updated

  38. arXiv:2305.11316  [pdf, other]

    cs.SI

    Radicalized by Thinness: Using a Model of Radicalization to Understand Pro-Anorexia Communities on Twitter

    Authors: Kristina Lerman, Aryan Karnati, Shuchan Zhou, Siyi Chen, Sudesh Kumar, Zihao He, Joanna Yau, Abigail Horn

    Abstract: The rise in eating disorders, a condition with serious health complications, has been linked to the proliferation of idealized body images on social media platforms. However, the relationship between social media and eating disorders is more complex, with online platforms potentially enabling harmful behaviors by linking people to "pro-ana" communities that promote eating disorders. We conceptua…

    Submitted 30 August, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

  39. arXiv:2305.09846  [pdf, other]

    cs.CL cs.SI

    CPL-NoViD: Context-Aware Prompt-based Learning for Norm Violation Detection in Online Communities

    Authors: Zihao He, Jonathan May, Kristina Lerman

    Abstract: Detecting norm violations in online communities is critical to maintaining healthy and safe spaces for online discussions. Existing machine learning approaches often struggle to adapt to the diverse rules and interpretations across different communities due to the inherent challenges of fine-tuning models for such context-specific tasks. In this paper, we introduce Context-aware Prompt-based Learn…

    Submitted 16 April, 2024; v1 submitted 16 May, 2023; originally announced May 2023.

  40. Clique Densification in Networks

    Authors: Haochen Pi, Keith Burghardt, Allon G. Percus, Kristina Lerman

    Abstract: Real-world networks are rarely static. Recently, there has been increasing interest in both network growth and network densification, in which the number of edges scales superlinearly with the number of nodes. Less studied but equally important, however, are scaling laws of higher-order cliques, which can drive clustering and network redundancy. In this paper, we study how cliques grow with networ…

    Submitted 7 April, 2023; originally announced April 2023.

    Comments: 14 pages, 11 figures. Paper is in press at Physical Review E

  41. arXiv:2304.02144  [pdf, other]

    cs.CL

    A Data Fusion Framework for Multi-Domain Morality Learning

    Authors: Siyi Guo, Negar Mokhberian, Kristina Lerman

    Abstract: Language models can be trained to recognize the moral sentiment of text, creating new opportunities to study the role of morality in human life. As interest in language and morality has grown, several ground truth datasets with moral annotations have been released. However, these datasets vary in the method of data collection, domain, topics, instructions for annotators, etc. Simply aggregating su…

    Submitted 4 April, 2023; originally announced April 2023.

  42. arXiv:2303.04837  [pdf, other]

    cs.SI

    Non-Binary Gender Expression in Online Interactions

    Authors: Rebecca Dorn, Negar Mokhberian, Julie Jiang, Jeremy Abramson, Fred Morstatter, Kristina Lerman

    Abstract: Many openly non-binary gender individuals participate in social networks. However, the relationship between gender and online interactions is not well understood, which may result in disparate treatment by large language models. We investigate individual identity on Twitter, focusing on gender expression as represented by users' chosen pronouns. We find that non-binary groups tend to receive less a…

    Submitted 12 September, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

  43. arXiv:2302.01439  [pdf, other]

    cs.CY cs.SI

    #RoeOverturned: Twitter Dataset on the Abortion Rights Controversy

    Authors: Rong-Ching Chang, Ashwin Rao, Qiankun Zhong, Magdalena Wojcieszak, Kristina Lerman

    Abstract: On June 24, 2022, the United States Supreme Court overturned landmark rulings made in its 1973 verdict in Roe v. Wade. The justices, by way of a majority vote in Dobbs v. Jackson Women's Health Organization, decided that abortion wasn't a constitutional right and returned the issue of abortion to the elected representatives. This decision triggered multiple protests and debates across the US, espec…

    Submitted 2 February, 2023; originally announced February 2023.

    Comments: 9 pages, 5 figures

  44. arXiv:2301.11994  [pdf, other]

    cs.SI cs.CY

    Gender and Prestige Bias in Coronavirus News Reporting

    Authors: Rebecca Dorn, Yiwen Ma, Fred Morstatter, Kristina Lerman

    Abstract: Journalists play a vital role in surfacing issues of societal importance, but their choices of what to highlight and who to interview are influenced by societal biases. In this work, we use natural language processing tools to measure these biases in a large corpus of news articles about the Covid-19 pandemic. Specifically, we identify when experts are quoted in news and extract their names and in…

    Submitted 27 January, 2023; originally announced January 2023.

  45. arXiv:2301.06615  [pdf, other]

    cs.LG cs.AI stat.ME

    Data-Driven Estimation of Heterogeneous Treatment Effects

    Authors: Christopher Tran, Keith Burghardt, Kristina Lerman, Elena Zheleva

    Abstract: Estimating how a treatment affects different individuals, known as heterogeneous treatment effect estimation, is an important problem in empirical sciences. In the last few years, there has been considerable interest in adapting machine learning algorithms to the problem of estimating heterogeneous effects from observational and experimental data. However, these algorithms often make strong assu…

    Submitted 17 October, 2024; v1 submitted 16 January, 2023; originally announced January 2023.

  46. arXiv:2212.10901  [pdf, other]

    cs.SD cs.CL cs.IR cs.MM eess.AS

    ALCAP: Alignment-Augmented Music Captioner

    Authors: Zihao He, Weituo Hao, Wei-Tsung Lu, Changyou Chen, Kristina Lerman, Xuchen Song

    Abstract: Music captioning has gained significant attention in the wake of the rising prominence of streaming media platforms. Traditional approaches often prioritize either the audio or lyrics aspect of the music, inadvertently ignoring the intricate interplay between the two. However, a comprehensive understanding of music necessitates the integration of both these elements. In this study, we delve into t…

    Submitted 21 October, 2023; v1 submitted 21 December, 2022; originally announced December 2022.

  47. arXiv:2212.03810  [pdf, other]

    cs.CY

    The Social Emotional Web

    Authors: Kristina Lerman

    Abstract: The social web has linked people on a global scale, transforming how we communicate and interact. The massive interconnectedness has created new vulnerabilities in the form of social manipulation and misinformation. As the social web matures, we are entering a new phase, where people share their private feelings and emotions. This so-called social emotional web creates new opportunities for human…

    Submitted 7 December, 2022; originally announced December 2022.

    Comments: The 8th IEEE International Conference on Collaboration and Internet Computing (IEEE CIC 2022)

  48. arXiv:2212.00339  [pdf, other]

    cs.CL cs.CY

    Anger Breeds Controversy: Analyzing Controversy and Emotions on Reddit

    Authors: Kai Chen, Zihao He, Rong-Ching Chang, Jonathan May, Kristina Lerman

    Abstract: Emotions play an important role in interpersonal interactions and social conflict, yet their function in the development of controversy and disagreement in online conversations has not been explored. To address this gap, we study controversy on Reddit, a popular network of online discussion forums. We collect discussions from a wide variety of topical forums and use emotion detection to recognize…

    Submitted 1 December, 2022; originally announced December 2022.

  49. arXiv:2211.16480  [pdf, other]

    cs.SI cs.CY

    Retweets Amplify the Echo Chamber Effect

    Authors: Ashwin Rao, Fred Morstatter, Kristina Lerman

    Abstract: The growing prominence of social media in public discourse has led to greater scrutiny of the quality of online information and the role it plays in amplifying political polarization. However, studies of polarization on social media platforms like Twitter have been hampered by the difficulty of collecting data about the social graph, specifically follow links that shape the echo chambers users j…

    Submitted 26 July, 2023; v1 submitted 29 November, 2022; originally announced November 2022.

    Comments: 8 pages, 8 figures

  50. arXiv:2211.00171  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Using Emotion Embeddings to Transfer Knowledge Between Emotions, Languages, and Annotation Formats

    Authors: Georgios Chochlakis, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, Shrikanth Narayanan

    Abstract: The need for emotional inference from text continues to diversify as more and more disciplines integrate emotions into their theories and applications. These needs include inferring different emotion types, handling multiple languages, and different annotation formats. A shared model between different configurations would enable the sharing of knowledge and a decrease in training costs, and would…

    Submitted 11 March, 2023; v1 submitted 31 October, 2022; originally announced November 2022.

    Comments: Accepted at ICASSP'23, 5 pages