-
CausalGraph2LLM: Evaluating LLMs for Causal Queries
Authors:
Ivaxi Sheth,
Bahare Fatemi,
Mario Fritz
Abstract:
Causality is essential in scientific research, enabling researchers to interpret true relationships between variables. These causal relationships are often represented by causal graphs, which are directed acyclic graphs. With the recent advancements in Large Language Models (LLMs), there is an increasing interest in exploring their capabilities in causal reasoning and their potential use to hypothesize causal graphs. These tasks require LLMs to encode the causal graph effectively for subsequent downstream tasks. In this paper, we propose a comprehensive benchmark, CausalGraph2LLM, encompassing a variety of causal graph settings to assess the causal graph understanding capability of LLMs. We categorize the causal queries into two types: graph-level and node-level queries. We benchmark both open-source and closed models for our study. Our findings reveal that while LLMs show promise in this domain, they are highly sensitive to the encoding used. Even capable models like GPT-4 and Gemini-1.5 exhibit sensitivity to encoding, with deviations of about 60%. We further demonstrate this sensitivity for downstream causal intervention tasks. Moreover, we observe that LLMs can often display biases when presented with contextual information about a causal graph, potentially stemming from their parametric memory.
Submitted 21 October, 2024;
originally announced October 2024.
-
LLM4GRN: Discovering Causal Gene Regulatory Networks with LLMs -- Evaluation through Synthetic Data Generation
Authors:
Tejumade Afonja,
Ivaxi Sheth,
Ruta Binkyte,
Waqar Hanif,
Thomas Ulas,
Matthias Becker,
Mario Fritz
Abstract:
Gene regulatory networks (GRNs) represent the causal relationships between transcription factors (TFs) and target genes in single-cell RNA sequencing (scRNA-seq) data. Understanding these networks is crucial for uncovering disease mechanisms and identifying therapeutic targets. In this work, we investigate the potential of large language models (LLMs) for GRN discovery, leveraging their learned biological knowledge alone or in combination with traditional statistical methods. We develop a task-based evaluation strategy to address the challenge of unavailable ground truth causal graphs. Specifically, we use the GRNs suggested by LLMs to guide causal synthetic data generation and compare the resulting data against the original dataset. Our statistical and biological assessments show that LLMs can support statistical modeling and data synthesis for biological research.
Submitted 21 October, 2024;
originally announced October 2024.
-
LLM2Swarm: Robot Swarms that Responsively Reason, Plan, and Collaborate through LLMs
Authors:
Volker Strobel,
Marco Dorigo,
Mario Fritz
Abstract:
Robot swarms are composed of many simple robots that communicate and collaborate to fulfill complex tasks. Robot controllers usually need to be specified by experts on a case-by-case basis via programming code. This process is time-consuming, prone to errors, and unable to take into account all situations that may be encountered during deployment. On the other hand, recent Large Language Models (LLMs) have demonstrated reasoning and planning capabilities, introduced new ways to interact with and program machines, and incorporated both domain-specific and commonsense knowledge. Hence, we propose to address the aforementioned challenges by integrating LLMs with robot swarms and show the potential in proofs of concept (showcases). For this integration, we explore two approaches. The first approach is 'indirect integration,' where LLMs are used to synthesize and validate the robot controllers. This approach may reduce development time and human error before deployment. Moreover, during deployment, it could be used for on-the-fly creation of new robot behaviors. The second approach is 'direct integration,' where each robot locally executes a separate LLM instance during deployment for robot-robot collaboration and human-swarm interaction. These local LLM instances enable each robot to reason, plan, and collaborate using natural language, as demonstrated in our showcases where the robots are able to detect a variety of anomalies, without prior information about the nature of these anomalies. To enable further research on our mainly conceptual contribution, we release the software and videos for our LLM2Swarm system: https://github.com/Pold87/LLM2Swarm.
Submitted 16 October, 2024; v1 submitted 15 October, 2024;
originally announced October 2024.
-
Language Models as Zero-shot Lossless Gradient Compressors: Towards General Neural Parameter Prior Models
Authors:
Hui-Po Wang,
Mario Fritz
Abstract:
Despite the widespread use of statistical prior models in various fields, such models for neural network gradients have long been overlooked. The inherent challenge stems from their high-dimensional structures and complex interdependencies, which complicate effective modeling. In this work, we demonstrate the potential of large language models (LLMs) to act as gradient priors in a zero-shot setting. We examine this property by considering lossless gradient compression -- a critical application in distributed learning -- that depends heavily on precise probability modeling. To achieve this, we introduce LM-GC, a novel method that integrates LLMs with arithmetic coding. Our technique converts plain gradients into text-like formats, enhancing token efficiency by up to 38 times compared to their plain representations. We ensure that this data conversion maintains a close alignment with the structure of plain gradients and the symbols commonly recognized by LLMs. Our experiments indicate that LM-GC surpasses existing state-of-the-art lossless compression methods, improving compression rates by 10% to 17.2% across various datasets and architectures. Additionally, our approach shows promising compatibility with lossy compression techniques such as quantization and sparsification. These findings highlight the significant potential of LLMs as a model for effectively handling gradients. We will release the source code upon publication.
Submitted 26 September, 2024;
originally announced September 2024.
-
A weather-driven mathematical model of Culex population abundance and the impact of vector control interventions
Authors:
Suman Bhowmick,
Patrick Irwin,
Kristina Lopez,
Megan Lindsay Fritz,
Rebecca Lee Smith
Abstract:
Even as the incidence of mosquito-borne diseases like West Nile Virus (WNV) in North America has risen over the past decade, effectively modelling mosquito population density, or abundance, has proven to be a persistent challenge. It is critical to capture the fluctuations in mosquito abundance across seasons in order to forecast the varying risk of disease transmission from one year to the next. We develop a process-based, mechanistic, weather-driven Ordinary Differential Equation (ODE) model to study the population biology of both the aquatic and terrestrial stages of the mosquito population. The progression of the mosquito lifecycle through these stages is influenced by different factors, including temperature, daylight hours, intra-species competition and the availability of aquatic habitats. The weather-driven parameters used in our work are drawn from a combination of laboratory research and literature data. In our model, we include precipitation data as a substitute for evaluating additional mortality in the mosquito population. We compute the basic offspring number of the associated model and perform sensitivity analysis. Finally, we employ our model to assess the effectiveness of various adulticide strategies and to predict the resulting reduction in mosquito population. This enhancement in the modelling of mosquito abundance can be instrumental in guiding interventions aimed at reducing mosquito populations and mitigating mosquito-borne diseases such as WNV.
Submitted 17 September, 2024;
originally announced September 2024.
-
HexaCoder: Secure Code Generation via Oracle-Guided Synthetic Training Data
Authors:
Hossein Hajipour,
Lea Schönherr,
Thorsten Holz,
Mario Fritz
Abstract:
Large language models (LLMs) have shown great potential for automatic code generation and form the basis for various tools such as GitHub Copilot. However, recent studies highlight that much LLM-generated code contains serious security vulnerabilities. While previous work tries to address this by training models that generate secure code, these attempts remain constrained by limited access to training data and labor-intensive data preparation.
In this paper, we introduce HexaCoder, a novel approach to enhance the ability of LLMs to generate secure code by automatically synthesizing secure code, which reduces the effort of finding suitable training data. HexaCoder comprises two key components: an oracle-guided data synthesis pipeline and a two-step process for secure code generation. The data synthesis pipeline generates pairs of vulnerable and fixed code for specific Common Weakness Enumeration (CWE) types by utilizing a state-of-the-art LLM for repairing vulnerable code. A security oracle identifies vulnerabilities, and a state-of-the-art LLM repairs them by extending and/or editing the code, creating data pairs for fine-tuning using the Low-Rank Adaptation (LoRA) method. Each example of our fine-tuning dataset includes the necessary security-related libraries and code that form the basis of our novel two-step generation approach. This allows the model to integrate security-relevant libraries before generating the main code, significantly reducing the number of vulnerable code samples generated by up to 85% compared to the baseline methods. We perform extensive evaluations on three different benchmarks for four LLMs, demonstrating that HexaCoder not only improves the security of the generated code but also maintains a high level of functional correctness.
Submitted 10 September, 2024;
originally announced September 2024.
-
Hypothesizing Missing Causal Variables with LLMs
Authors:
Ivaxi Sheth,
Sahar Abdelnabi,
Mario Fritz
Abstract:
Scientific discovery is a catalyst for human intellectual advances, driven by the cycle of hypothesis generation, experimental design, data evaluation, and iterative assumption refinement. This process, while crucial, is expensive and heavily dependent on the domain knowledge of scientists to generate hypotheses and navigate the scientific cycle. Central to this is causality, the ability to establish the relationship between the cause and the effect. Motivated by the scientific discovery process, in this work, we formulate a novel task where the input is a partial causal graph with missing variables, and the output is a hypothesis about the missing variables to complete the partial graph. We design a benchmark with varying difficulty levels and knowledge assumptions about the causal graph. With the growing interest in using Large Language Models (LLMs) to assist in scientific discovery, we benchmark open-source and closed models on our testbed. We show the strong ability of LLMs to hypothesize the mediation variables between a cause and its effect. In contrast, they underperform in hypothesizing the cause and effect variables themselves. We also observe surprising results where some of the open-source models outperform the closed GPT-4 model.
Submitted 4 September, 2024;
originally announced September 2024.
-
Well-posedness, long-time behavior, and discretization of some models of nonlinear acoustics in velocity-enthalpy formulation
Authors:
Herbert Egger,
Marvin Fritz
Abstract:
We study a class of models for nonlinear acoustics, including the well-known Westervelt and Kuznetsov equations, as well as a model of Rasmussen that can be seen as a thermodynamically consistent modification of the latter. Using linearization, energy estimates, and fixed-point arguments, we establish the existence and uniqueness of solutions that, for sufficiently small data, are global in time and converge exponentially fast to equilibrium. In contrast to previous work, our analysis is based on a velocity-enthalpy formulation of the problem, whose weak form reveals the underlying port-Hamiltonian structure. Moreover, the weak form of the problem is particularly well-suited for a structure-preserving discretization. This is demonstrated in numerical tests, which also highlight typical characteristics of the models under consideration.
Submitted 2 September, 2024;
originally announced September 2024.
-
Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation
Authors:
Yuxuan Zhou,
Margret Keuper,
Mario Fritz
Abstract:
Sampling-based decoding strategies have been widely adopted for Large Language Models (LLMs) in numerous applications, which target a balance between diversity and quality via temperature tuning and tail truncation (e.g., top-k and top-p sampling). Considering the high dynamic range of the candidate next-token given different prefixes, recent studies propose to adaptively truncate the tail of the LLM's predicted distribution. Although improved results have been reported with these methods on open-ended text generation tasks, the results are highly dependent on the curated truncation parameters and exemplar text. In this paper, we propose a systematic way to estimate the intrinsic capacity of a truncation sampling method by considering the trade-off between diversity and risk at each decoding step, based on our collected prefix tree which preserves the context of a full sentence. Our work provides a comprehensive comparison between existing truncation sampling methods, as well as their recommended parameters as a guideline for users.
Submitted 24 August, 2024;
originally announced August 2024.
-
Inside the Black Box: Detecting Data Leakage in Pre-trained Language Encoders
Authors:
Yuan Xin,
Zheng Li,
Ning Yu,
Dingfan Chen,
Mario Fritz,
Michael Backes,
Yang Zhang
Abstract:
Despite being prevalent in the general field of Natural Language Processing (NLP), pre-trained language models inherently carry privacy and copyright concerns due to their nature of training on large-scale web-scraped data. In this paper, we pioneer a systematic exploration of such risks associated with pre-trained language encoders, specifically focusing on the membership leakage of pre-training data exposed through downstream models adapted from pre-trained language encoders, an aspect largely overlooked in existing literature. Our study encompasses comprehensive experiments across four types of pre-trained encoder architectures, three representative downstream tasks, and five benchmark datasets. Intriguingly, our evaluations reveal, for the first time, the existence of membership leakage even when only the black-box output of the downstream model is exposed, highlighting a privacy risk far greater than previously assumed. We also present in-depth analysis and insights toward guiding future researchers and practitioners in addressing the privacy considerations in developing pre-trained language models.
Submitted 20 August, 2024;
originally announced August 2024.
-
Structure-preserving approximation of the Cahn-Hilliard-Biot system
Authors:
Aaron Brunk,
Marvin Fritz
Abstract:
In this work, we propose a structure-preserving discretisation for the recently studied Cahn-Hilliard-Biot system using conforming finite elements in space and problem-adapted explicit-implicit Euler time integration. We prove that the scheme preserves the thermodynamic structure, that is, the balance of mass and volumetric fluid content and the energy dissipation balance. The existence of discrete solutions is established under suitable growth conditions. Furthermore, it is shown that the algorithm can be realised as a splitting method, that is, the Cahn-Hilliard subsystem can be decoupled from the poro-elasticity subsystem, where the former is nonlinear and the latter is linear. The schemes are illustrated by numerical examples and a convergence test.
Submitted 17 July, 2024;
originally announced July 2024.
-
FullCert: Deterministic End-to-End Certification for Training and Inference of Neural Networks
Authors:
Tobias Lorenz,
Marta Kwiatkowska,
Mario Fritz
Abstract:
Modern machine learning models are sensitive to the manipulation of both the training data (poisoning attacks) and inference data (adversarial examples). Recognizing this issue, the community has developed many empirical defenses against both attacks and, more recently, certification methods with provable guarantees against inference-time attacks. However, such guarantees are still largely lacking for training-time attacks. In this work, we present FullCert, the first end-to-end certifier with sound, deterministic bounds, which proves robustness against both training-time and inference-time attacks. We first bound all possible perturbations an adversary can make to the training data under the considered threat model. Using these constraints, we bound the perturbations' influence on the model's parameters. Finally, we bound the impact of these parameter changes on the model's prediction, resulting in joint robustness guarantees against poisoning and adversarial examples. To facilitate this novel certification paradigm, we combine our theoretical work with a new open-source library BoundFlow, which enables model training on bounded datasets. We experimentally demonstrate FullCert's feasibility on two datasets.
Submitted 11 September, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
-
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition
Authors:
Edoardo Debenedetti,
Javier Rando,
Daniel Paleka,
Silaghi Fineas Florin,
Dragos Albastroiu,
Niv Cohen,
Yuval Lemberg,
Reshmi Ghosh,
Rui Wen,
Ahmed Salem,
Giovanni Cherubin,
Santiago Zanella-Beguelin,
Robin Schmid,
Victor Klemm,
Takahiro Miki,
Chenhao Li,
Stefan Kraft,
Mario Fritz,
Florian Tramèr,
Sahar Abdelnabi,
Lea Schönherr
Abstract:
Large language model systems face important security risks from maliciously crafted messages that aim to overwrite the system's original instructions or leak private data. To study this problem, we organized a capture-the-flag competition at IEEE SaTML 2024, where the flag is a secret string in the LLM system prompt. The competition was organized in two phases. In the first phase, teams developed defenses to prevent the model from leaking the secret. During the second phase, teams were challenged to extract the secrets protected by the defenses proposed by the other teams. This report summarizes the main insights from the competition. Notably, we found that all defenses were bypassed at least once, highlighting the difficulty of designing a successful defense and the necessity for additional research to protect LLM systems. To foster future research in this direction, we compiled a dataset with over 137k multi-turn attack chats and open-sourced the platform.
Submitted 12 June, 2024;
originally announced June 2024.
-
MultiMax: Sparse and Multi-Modal Attention Learning
Authors:
Yuxuan Zhou,
Mario Fritz,
Margret Keuper
Abstract:
SoftMax is a ubiquitous ingredient of modern machine learning algorithms. It maps an input vector onto a probability simplex and reweights the input by concentrating the probability mass at large entries. Yet, as a smooth approximation to the Argmax function, a significant amount of probability mass is distributed to other, residual entries, leading to poor interpretability and noise. Although sparsity can be achieved by a family of SoftMax variants, they often require an alternative loss function and do not preserve multi-modality. We show that this trade-off between multi-modality and sparsity limits the expressivity of SoftMax as well as its variants. We provide a solution to this tension between objectives by proposing a piece-wise differentiable function, termed MultiMax, which adaptively modulates the output distribution according to input entry range. Through comprehensive analysis and evaluation, we show that MultiMax successfully produces a distribution that suppresses irrelevant entries while preserving multi-modality, with benefits in image classification, language modeling and machine translation. The code is available at https://github.com/ZhouYuxuanYX/MultiMax.
Submitted 4 June, 2024; v1 submitted 3 June, 2024;
originally announced June 2024.
-
Are you still on track!? Catching LLM Task Drift with Activations
Authors:
Sahar Abdelnabi,
Aideen Fay,
Giovanni Cherubin,
Ahmed Salem,
Mario Fritz,
Andrew Paverd
Abstract:
Large Language Models (LLMs) are routinely used in retrieval-augmented applications to orchestrate tasks and process inputs from users and other sources. These inputs, even in a single LLM interaction, can come from a variety of sources, of varying trustworthiness and provenance. This opens the door to prompt injection attacks, where the LLM receives and acts upon instructions from supposedly data-only sources, thus deviating from the user's original instructions. We define this as task drift, and we propose to catch it by scanning and analyzing the LLM's activations. We compare the LLM's activations before and after processing the external input in order to detect whether this input caused instruction drift. We develop two probing methods and find that simply using a linear classifier can detect drift with near-perfect ROC AUC on an out-of-distribution test set. We show that this approach generalizes surprisingly well to unseen task domains, such as prompt injections, jailbreaks, and malicious instructions, without being trained on any of these attacks. Our setup does not require any modification of the LLM (e.g., fine-tuning) or any text generation, thus maximizing deployability and cost efficiency and avoiding reliance on unreliable model output. To foster future research on activation-based task inspection, decoding, and interpretability, we will release our large-scale TaskTracker toolkit, comprising a dataset of over 500K instances, representations from 5 SoTA language models, and inspection tools.
Submitted 19 July, 2024; v1 submitted 2 June, 2024;
originally announced June 2024.
-
Stealthy Imitation: Reward-guided Environment-free Policy Stealing
Authors:
Zhixiong Zhuang,
Maria-Irina Nicolae,
Mario Fritz
Abstract:
Deep reinforcement learning policies, which are integral to modern control systems, represent valuable intellectual property. The development of these policies demands considerable resources, such as domain expertise, simulation fidelity, and real-world validation. These policies are potentially vulnerable to model stealing attacks, which aim to replicate their functionality using only black-box access. In this paper, we propose Stealthy Imitation, the first attack designed to steal policies without access to the environment or knowledge of the input range. This setup has not been considered by previous model stealing methods. Lacking access to the victim's input state distribution, Stealthy Imitation fits a reward model that allows it to approximate this distribution. We show that the victim policy is harder to imitate when the distribution of the attack queries matches that of the victim. We evaluate our approach across diverse, high-dimensional control tasks and consistently outperform prior data-free approaches adapted for policy stealing. Lastly, we propose a countermeasure that significantly diminishes the effectiveness of the attack.
Submitted 11 May, 2024;
originally announced May 2024.
-
PoLLMgraph: Unraveling Hallucinations in Large Language Models via State Transition Dynamics
Authors:
Derui Zhu,
Dingfan Chen,
Qing Li,
Zongxiong Chen,
Lei Ma,
Jens Grossklags,
Mario Fritz
Abstract:
Despite tremendous advancements in large language models (LLMs) over recent years, a notably urgent challenge for their practical deployment is the phenomenon of hallucination, where the model fabricates facts and produces non-factual statements. In response, we propose PoLLMgraph, a Polygraph for LLMs, as an effective model-based white-box detection and forecasting approach. PoLLMgraph distinctly differs from the large body of existing research that concentrates on addressing such challenges through black-box evaluations. In particular, we demonstrate that hallucination can be effectively detected by analyzing the LLM's internal state transition dynamics during generation via tractable probabilistic models. Experimental results on various open-source LLMs confirm the efficacy of PoLLMgraph, outperforming state-of-the-art methods by a considerable margin, evidenced by over 20% improvement in AUC-ROC on common benchmarking datasets like TruthfulQA. Our work paves a new way for model-based white-box analysis of LLMs, motivating the research community to further explore, understand, and refine the intricate dynamics of LLM behaviors.
Submitted 6 April, 2024;
originally announced April 2024.
-
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?
Authors:
Egor Zverev,
Sahar Abdelnabi,
Soroush Tabesh,
Mario Fritz,
Christoph H. Lampert
Abstract:
Instruction-tuned Large Language Models (LLMs) show impressive results in numerous practical applications, but they lack essential safety features that are common in other areas of computer science, particularly an explicit separation of instructions and data. This makes them vulnerable to manipulations such as indirect prompt injections and generally unsuitable for safety-critical tasks. Surprisingly, there is currently no established definition or benchmark to quantify this phenomenon. In this work, we close this gap by introducing a formal measure for instruction-data separation and an empirical variant that is calculable from a model's outputs. We also present a new dataset, SEP, that allows estimating the measure for real-world models. Our results on various LLMs show that the problem of instruction-data separation is real: all models fail to achieve high separation, and canonical mitigation techniques, such as prompt engineering and fine-tuning, either fail to substantially improve separation or reduce model utility. The source code and SEP dataset are openly accessible at https://github.com/egozverev/Shold-It-Be-Executed-Or-Processed.
Submitted 3 June, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
LLM Task Interference: An Initial Study on the Impact of Task-Switch in Conversational History
Authors:
Akash Gupta,
Ivaxi Sheth,
Vyas Raina,
Mark Gales,
Mario Fritz
Abstract:
With the recent emergence of powerful instruction-tuned large language models (LLMs), various helpful conversational Artificial Intelligence (AI) systems have been deployed across many applications. When prompted by users, these AI systems successfully perform a wide range of tasks as part of a conversation. To provide some sort of memory and context, such approaches typically condition their output on the entire conversational history. Although this sensitivity to the conversational history can often lead to improved performance on subsequent tasks, we find that performance can in fact also be negatively impacted, if there is a task-switch. To the best of our knowledge, our work makes the first attempt to formalize the study of such vulnerabilities and interference of tasks in conversational LLMs caused by task-switches in the conversational history. Our experiments across 5 datasets with 15 task switches using popular LLMs reveal that many of the task-switches can lead to significant performance degradation.
Submitted 11 October, 2024; v1 submitted 28 February, 2024;
originally announced February 2024.
-
Exploring Value Biases: How LLMs Deviate Towards the Ideal
Authors:
Sarath Sivaprasad,
Pramod Kaushik,
Sahar Abdelnabi,
Mario Fritz
Abstract:
Large-Language-Models (LLMs) are deployed in a wide range of applications, and their response has an increasing social impact. Understanding the non-deliberate(ive) mechanism of LLMs in giving responses is essential in explaining their performance and discerning their biases in real-world applications. This is analogous to human studies, where such inadvertent responses are referred to as sampling. We study this sampling of LLMs in light of value bias and show that the sampling of LLMs tends to favour high-value options. Value bias corresponds to this shift of response from the most likely towards an ideal value represented in the LLM. In fact, this effect can be reproduced even with new entities learnt via in-context prompting. We show that this bias manifests in unexpected places and has implications on relevant application scenarios, like choosing exemplars. The results show that value bias is strong in LLMs across different categories, similar to the results found in human studies.
Submitted 21 February, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
Adaptive Hierarchical Certification for Segmentation using Randomized Smoothing
Authors:
Alaa Anani,
Tobias Lorenz,
Bernt Schiele,
Mario Fritz
Abstract:
Certification for machine learning is proving that no adversarial sample can evade a model within a range under certain conditions, a necessity for safety-critical domains. Common certification methods for segmentation use a flat set of fine-grained classes, leading to high abstain rates due to model uncertainty across many classes. We propose a novel, more practical setting, which certifies pixels within a multi-level hierarchy, and adaptively relaxes the certification to a coarser level for unstable components classic methods would abstain from, effectively lowering the abstain rate whilst providing more certified semantically meaningful information. We mathematically formulate the problem setup, introduce an adaptive hierarchical certification algorithm and prove the correctness of its guarantees. Since certified accuracy does not take the loss of information into account for coarser classes, we introduce the Certified Information Gain ($\mathrm{CIG}$) metric, which is proportional to the class granularity level. Our extensive experiments on the datasets Cityscapes, PASCAL-Context, ACDC and COCO-Stuff demonstrate that our adaptive algorithm achieves a higher $\mathrm{CIG}$ and lower abstain rate compared to the current state-of-the-art certification method. Our code can be found here: https://github.com/AlaaAnani/adaptive-certify.
Submitted 3 June, 2024; v1 submitted 13 February, 2024;
originally announced February 2024.
-
Towards Biologically Plausible and Private Gene Expression Data Generation
Authors:
Dingfan Chen,
Marie Oestreich,
Tejumade Afonja,
Raouf Kerkouche,
Matthias Becker,
Mario Fritz
Abstract:
Generative models trained with Differential Privacy (DP) are becoming increasingly prominent in the creation of synthetic data for downstream applications. Existing literature, however, primarily focuses on basic benchmarking datasets and tends to report promising results only for elementary metrics and relatively simple data distributions. In this paper, we initiate a systematic analysis of how DP generative models perform in their natural application scenarios, specifically focusing on real-world gene expression data. We conduct a comprehensive analysis of five representative DP generation methods, examining them from various angles, such as downstream utility, statistical properties, and biological plausibility. Our extensive evaluation illuminates the unique characteristics of each DP generation method, offering critical insights into the strengths and weaknesses of each approach, and uncovering intriguing possibilities for future developments. Perhaps surprisingly, our analysis reveals that most methods are capable of achieving seemingly reasonable downstream utility, according to the standard evaluation metrics considered in existing literature. Nevertheless, we find that none of the DP methods are able to accurately capture the biological characteristics of the real dataset. This observation suggests a potential over-optimistic assessment of current methodologies in this field and underscores a pressing need for future enhancements in model design.
Submitted 7 February, 2024;
originally announced February 2024.
-
Privacy-Aware Document Visual Question Answering
Authors:
Rubèn Tito,
Khanh Nguyen,
Marlon Tobaben,
Raouf Kerkouche,
Mohamed Ali Souibgui,
Kangsoo Jung,
Joonas Jälkö,
Vincent Poulain D'Andecy,
Aurelie Joseph,
Lei Kang,
Ernest Valveny,
Antti Honkela,
Mario Fritz,
Dimosthenis Karatzas
Abstract:
Document Visual Question Answering (DocVQA) has quickly grown into a central task of document understanding. But despite the fact that documents contain sensitive or copyrighted information, none of the current DocVQA methods offers strong privacy guarantees. In this work, we explore privacy in the domain of DocVQA for the first time, highlighting privacy issues in state-of-the-art multi-modal LLMs used for DocVQA, and explore possible solutions. Specifically, we focus on invoice processing as a realistic document understanding scenario, and propose a large-scale DocVQA dataset comprising invoice documents and associated questions and answers. We employ a federated learning scheme that reflects the real-life distribution of documents in different businesses, and we explore the use case where the data of the invoice provider is the sensitive information to be protected. We demonstrate that non-private models tend to memorise, a behaviour that can lead to exposing private information. We then evaluate baseline training schemes employing federated learning and differential privacy in this multi-modal scenario, where the sensitive information might be exposed through either or both of the two input modalities: vision (document image) or language (OCR tokens). Finally, we design attacks exploiting the memorisation effect of the model, and demonstrate their effectiveness in probing a representative DocVQA model.
Submitted 2 September, 2024; v1 submitted 15 December, 2023;
originally announced December 2023.
-
Analysis and computations of a stochastic Cahn-Hilliard model for tumor growth with chemotaxis and variable mobility
Authors:
Marvin Fritz,
Luca Scarpa
Abstract:
In this work, we present and analyze a system of PDEs, which models tumor growth by considering chemotaxis, active transport, and random effects. The stochasticity of the system is modelled by random initial data and Wiener noises that appear in the tumor and nutrient equations. The volume fraction of the tumor is governed by a stochastic phase-field equation of Cahn-Hilliard type, and the mass density of the nutrients is modelled by a stochastic reaction-diffusion equation. We allow a variable mobility function and non-increasing growth functions, such as logistic and Gompertzian growth. Via approximation and stochastic compactness arguments, we prove the existence of a probabilistic weak solution and, in the case of constant mobilities, the well-posedness of the model in the strong probabilistic sense. Lastly, we propose a numerical approximation based on the Galerkin finite element method in space and the semi-implicit Euler-Maruyama scheme in time. We illustrate the effects of the stochastic forcing in the tumor growth in several numerical simulations.
Submitted 11 December, 2023;
originally announced December 2023.
-
SecurityNet: Assessing Machine Learning Vulnerabilities on Public Models
Authors:
Boyang Zhang,
Zheng Li,
Ziqing Yang,
Xinlei He,
Michael Backes,
Mario Fritz,
Yang Zhang
Abstract:
While advanced machine learning (ML) models are deployed in numerous real-world applications, previous works demonstrate these models have security and privacy vulnerabilities. Various empirical research has been done in this field. However, most of the experiments are performed on target ML models trained by the security researchers themselves. Due to the high computational resource requirement for training advanced models with complex architectures, researchers generally choose to train a few target models using relatively simple architectures on typical experiment datasets. We argue that to understand ML models' vulnerabilities comprehensively, experiments should be performed on a large set of models trained with various purposes (not just the purpose of evaluating ML attacks and defenses). To this end, we propose using publicly available models with weights from the Internet (public models) for evaluating attacks and defenses on ML models. We establish a database, namely SecurityNet, containing 910 annotated image classification models. We then analyze the effectiveness of several representative attacks/defenses, including model stealing attacks, membership inference attacks, and backdoor detection on these public models. Our evaluation empirically shows the performance of these attacks/defenses can vary significantly on public models compared to self-trained models. We share SecurityNet with the research community and encourage researchers to perform experiments on public models to better demonstrate their proposed methods' effectiveness in the future.
Submitted 19 October, 2023;
originally announced October 2023.
-
On the well-posedness of the Cahn-Hilliard-Biot model and its applications to tumor growth
Authors:
Marvin Fritz
Abstract:
We study the Cahn-Hilliard-Biot model with respect to its mathematical well-posedness. The system models flow through deformable porous media in which the solid material has two phases with distinct material properties. The two phases of the porous material evolve according to a generalized Ginzburg-Landau energy functional, with additional influence from both viscoelastic and fluid effects. The flow-deformation coupling in the system is governed by Biot's theory. This results in a three-way coupled system that can be viewed as an extension of the Cahn-Larche equations by adding a fluid flowing through the medium. We distinguish the cases between a spatially dependent and a state-dependent Biot-Willis function. In the latter case, we consider a regularized system. In both cases, we use a Galerkin approximation to discretize the system and derive suitable energy estimates. Moreover, we apply compactness methods to pass to the limit in the discretized system. In the case of Vegard's law and homogeneous elasticity, we show that the weak solution depends continuously on the data and is unique. Lastly, we present some numerical simulations to highlight the features of the system as a tumor growth model.
Submitted 3 October, 2024; v1 submitted 10 October, 2023;
originally announced October 2023.
-
Host-feeding preferences and temperature shape the dynamics of West Nile virus: a mathematical model of assessing the abatement planning
Authors:
Suman Bhowmick,
Megan Fritz,
Rebecca Lee Smith
Abstract:
West Nile virus (WNV) is prevalent in the United States but it shows considerable divergence in transmission patterns and spatio-temporal intensity. It is to be noted that the mechanism that drives the transmission potential of WNV is described by the abilities of host species to maintain and disseminate the pathogens, pertinent to different eco-epidemiological factors that have an influence on the contact rates amongst the interacting species. There is growing evidence that several vectors exhibit strong feeding preferences towards different host communities. We construct a process-based, weather-driven ordinary differential equation (ODE) model to understand the impact of one vector species, Culex pipiens, with preferred avian and non-preferred human hosts, and compare it with surveillance data for the Culex pipiens complex collected in Cook County, Illinois, USA. In our mechanistic model, we also demonstrate that adulticide treatments produced significant reductions in the Culex pipiens population. To develop this transmission model for WNV, we take into account the feeding index, which can be described as the ratio between the observed frequency of mosquitoes feeding on one host compared to another host, divided by the expected frequency of mosquitoes feeding on these two hosts based on the presence of the particular hosts. Our findings demonstrate that the interplay between the feeding index and mosquito abatement strategy is a rather complex phenomenon and that it induces heterogeneous contact rates that should be included when modelling multi-host, multi-vector transmission.
Submitted 9 October, 2023;
originally announced October 2023.
-
Don't Miss Out on Novelty: Importance of Novel Features for Deep Anomaly Detection
Authors:
Sarath Sivaprasad,
Mario Fritz
Abstract:
Anomaly Detection (AD) is a critical task that involves identifying observations that do not conform to a learned model of normality. Prior work in deep AD is predominantly based on a familiarity hypothesis, where familiar features serve as the reference in a pre-trained embedding space. While this strategy has proven highly successful, it turns out that it causes consistent false negatives when anomalies consist of truly novel features that are not well captured by the pre-trained encoding. We propose a novel approach to AD using explainability to capture such novel features as unexplained observations in the input space. We achieve strong performance across a wide range of anomaly benchmarks by combining familiarity and novelty in a hybrid approach. Our approach establishes a new state-of-the-art across multiple benchmarks, handling diverse anomaly types while eliminating the need for expensive background models and dense matching. In particular, we show that by taking account of novel features, we reduce false negative anomalies by up to 40% on challenging benchmarks compared to the state-of-the-art. Our method gives visually inspectable explanations for pixel-level anomalies.
Submitted 26 February, 2024; v1 submitted 1 October, 2023;
originally announced October 2023.
-
Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation
Authors:
Sahar Abdelnabi,
Amr Gomaa,
Sarath Sivaprasad,
Lea Schönherr,
Mario Fritz
Abstract:
There is a growing interest in using Large Language Models (LLMs) in multi-agent systems to tackle interactive real-world tasks that require effective collaboration and assessing complex situations. Yet, we still have a limited understanding of LLMs' communication and decision-making abilities in multi-agent setups. The fundamental task of negotiation spans many key features of communication, such as cooperation, competition, and manipulation potentials. Thus, we propose using scorable negotiation to evaluate LLMs. We create a testbed of complex multi-agent, multi-issue, and semantically rich negotiation games. To reach an agreement, agents must have strong arithmetic, inference, exploration, and planning capabilities while integrating them in a dynamic and multi-turn setup. We propose multiple metrics to rigorously quantify agents' performance and alignment with the assigned role. We provide procedures to create new games and increase games' difficulty to have an evolving benchmark. Importantly, we evaluate critical safety aspects such as the interaction dynamics between agents influenced by greedy and adversarial players. Our benchmark is highly challenging; GPT-3.5 and small models mostly fail, and GPT-4 and SoTA large models (e.g., Llama-3 70b) still underperform.
Submitted 10 June, 2024; v1 submitted 29 September, 2023;
originally announced September 2023.
-
A Unified View of Differentially Private Deep Generative Modeling
Authors:
Dingfan Chen,
Raouf Kerkouche,
Mario Fritz
Abstract:
The availability of rich and vast data sources has greatly advanced machine learning applications in various domains. However, data with privacy concerns comes with stringent regulations that frequently prohibit data access and data sharing. Overcoming these obstacles in compliance with privacy considerations is key for technological progress in many real-world application scenarios that involve privacy-sensitive data. Differentially private (DP) data publishing provides a compelling solution, where only a sanitized form of the data is publicly released, enabling privacy-preserving downstream analysis and reproducible research in sensitive domains. In recent years, various approaches have been proposed for achieving privacy-preserving high-dimensional data generation by private training on top of deep neural networks. In this paper, we present a novel unified view that systematizes these approaches. Our view provides a joint design space for systematically deriving methods that cater to different use cases. We then discuss the strengths, limitations, and inherent correlations between different approaches, aiming to shed light on crucial aspects and inspire future research. We conclude by presenting potential paths forward for the field of DP data generation, with the aim of steering the community toward making the next important steps in advancing privacy-preserving learning.
Submitted 27 September, 2023;
originally announced September 2023.
-
Certified Robust Models with Slack Control and Large Lipschitz Constants
Authors:
Max Losch,
David Stutz,
Bernt Schiele,
Mario Fritz
Abstract:
Despite recent success, state-of-the-art learning-based models remain highly vulnerable to input changes such as adversarial examples. In order to obtain certifiable robustness against such perturbations, recent work considers Lipschitz-based regularizers or constraints while at the same time increasing prediction margin. Unfortunately, this comes at the cost of significantly decreased accuracy. In this paper, we propose a Calibrated Lipschitz-Margin Loss (CLL) that addresses this issue and improves certified robustness by tackling two problems: Firstly, commonly used margin losses do not adjust the penalties to the shrinking output distribution caused by minimizing the Lipschitz constant $K$. Secondly, and most importantly, we observe that minimization of $K$ can lead to overly smooth decision functions. This limits the model's complexity and thus reduces accuracy. Our CLL addresses these issues by explicitly calibrating the loss w.r.t. margin and Lipschitz constant, thereby establishing full control over slack and improving robustness certificates even with larger Lipschitz constants. On CIFAR-10, CIFAR-100 and Tiny-ImageNet, our models consistently outperform losses that leave the constant unattended. On CIFAR-100 and Tiny-ImageNet, CLL improves upon state-of-the-art deterministic $L_2$ robust accuracies. In contrast to current trends, we unlock the potential of much smaller models without $K=1$ constraints.
Submitted 12 September, 2023;
originally announced September 2023.
-
From Attachments to SEO: Click Here to Learn More about Clickbait PDFs!
Authors:
Giada Stivala,
Sahar Abdelnabi,
Andrea Mengascini,
Mariano Graziano,
Mario Fritz,
Giancarlo Pellegrino
Abstract:
Clickbait PDFs are PDF documents that do not embed malware but trick victims into visiting malicious web pages leading to attacks like password theft or drive-by download. While recent reports indicate a surge of clickbait PDFs, prior works have largely neglected this new threat, considering PDFs only as accessories of email phishing campaigns.
This paper investigates the landscape of clickbait PDFs and presents the first systematic and comprehensive study of this phenomenon. Starting from a real-world dataset, we identify 44 clickbait PDF clusters via clustering and characterize them by looking at their volumetric, temporal, and visual features. Among these, we identify three large clusters covering 89% of the dataset, exhibiting significantly different volumetric and temporal properties compared to classical email phishing, and relying on web UI elements as visual baits. Finally, we look at the distribution vectors and show that clickbait PDFs are not only distributed via attachments but also via Search Engine Optimization attacks, placing clickbait PDFs outside the email distribution ecosystem.
Clickbait PDFs seem to be a lurking threat, not subjected to any form of content-based filtering or detection: AV scoring systems, like VirusTotal, rank them considerably low, creating a blind spot for organizations. While URL blocklists can help to prevent victims from visiting the attack web pages, we observe that they have a limited coverage.
Submitted 22 December, 2023; v1 submitted 2 August, 2023;
originally announced August 2023.
-
Well-posedness and simulation of weak solutions to the time-fractional Fokker-Planck equation with general forcing
Authors:
Marvin Fritz
Abstract:
In this paper, we investigate the well-posedness of weak solutions to the time-fractional Fokker-Planck equation. Its dynamics is governed by anomalous diffusion, and we consider the most general case of space-time dependent forces. Consequently, the fractional derivatives appear on the right-hand side of the equation, and they cannot be brought to the left-hand side, which would have been preferable from an analytical perspective. For showing the model's well-posedness, we derive an energy inequality by considering nonstandard and novel testing methods that involve a series of convolutions and integrations. We close the estimate by a Henry-Gronwall-type inequality. Lastly, we propose a numerical algorithm based on a nonuniform L1 scheme and present some simulation results for various forces.
Submitted 31 July, 2023;
originally announced July 2023.
-
Analysis of a dilute polymer model with a time-fractional derivative
Authors:
Marvin Fritz,
Endre Süli,
Barbara Wohlmuth
Abstract:
We investigate the well-posedness of a coupled Navier-Stokes-Fokker-Planck system with a time-fractional derivative. Such systems arise in the kinetic theory of dilute solutions of polymeric liquids, where the motion of noninteracting polymer chains in a Newtonian solvent is modelled by a stochastic process exhibiting power-law waiting time, in order to capture subdiffusive processes associated with non-Fickian diffusion. We outline the derivation of the model from a subordinated Langevin equation. The elastic properties of the polymer molecules immersed in the solvent are modelled by a finitely extensible nonlinear elastic (FENE) dumbbell model, and the drag term in the Fokker-Planck equation is assumed to be corotational. We prove the global-in-time existence of large-data weak solutions to this time-fractional model of order $\alpha \in (\tfrac12,1)$, and derive an energy inequality satisfied by weak solutions.
Submitted 31 July, 2023;
originally announced July 2023.
-
MargCTGAN: A "Marginally" Better CTGAN for the Low Sample Regime
Authors:
Tejumade Afonja,
Dingfan Chen,
Mario Fritz
Abstract:
The potential of realistic and useful synthetic data is significant. However, current evaluation methods for synthetic tabular data generation predominantly focus on downstream task usefulness, often neglecting the importance of statistical properties. This oversight becomes particularly prominent in low sample scenarios, accompanied by a swift deterioration of these statistical measures. In this paper, we address this issue by conducting an evaluation of three state-of-the-art synthetic tabular data generators based on their marginal distribution, column-pair correlation, joint distribution and downstream task utility performance across high to low sample regimes. The popular CTGAN model shows strong utility, but underperforms in low sample settings in terms of utility. To overcome this limitation, we propose MargCTGAN that adds feature matching of de-correlated marginals, which results in a consistent improvement in downstream utility as well as statistical properties of the synthetic data.
Submitted 16 July, 2023;
originally announced July 2023.
-
B-cos Alignment for Inherently Interpretable CNNs and Vision Transformers
Authors:
Moritz Böhle,
Navdeeppal Singh,
Mario Fritz,
Bernt Schiele
Abstract:
We present a new direction for increasing the interpretability of deep neural networks (DNNs) by promoting weight-input alignment during training. For this, we propose to replace the linear transformations in DNNs by our novel B-cos transformation. As we show, a sequence (network) of such transformations induces a single linear transformation that faithfully summarises the full model computations. Moreover, the B-cos transformation is designed such that the weights align with relevant signals during optimisation. As a result, those induced linear transformations become highly interpretable and highlight task-relevant features. Importantly, the B-cos transformation is designed to be compatible with existing architectures and we show that it can easily be integrated into virtually all of the latest state of the art models for computer vision - e.g. ResNets, DenseNets, ConvNext models, as well as Vision Transformers - by combining the B-cos-based explanations with normalisation and attention layers, all whilst maintaining similar accuracy on ImageNet. Finally, we show that the resulting explanations are of high visual quality and perform well under quantitative interpretability metrics.
Submitted 15 January, 2024; v1 submitted 19 June, 2023;
originally announced June 2023.
-
From Bad to Worse: Using Private Data to Propagate Disinformation on Online Platforms with a Greater Efficiency
Authors:
Protik Bose Pranto,
Waqar Hassan Khan,
Sahar Abdelnabi,
Rebecca Weil,
Mario Fritz,
Rakibul Hasan
Abstract:
We outline a planned experiment to investigate if personal data (e.g., demographics and behavioral patterns) can be used to selectively expose individuals to disinformation such that an adversary can spread disinformation more efficiently compared to broadcasting the same information to everyone. This mechanism, if effective, will have devastating consequences as modern technologies collect and infer a plethora of private data that can be abused to target with disinformation. We believe this research will inform designing policies and regulations for online platforms.
Submitted 7 June, 2023;
originally announced June 2023.
-
Private and Collaborative Kaplan-Meier Estimators
Authors:
Shadi Rahimian,
Raouf Kerkouche,
Ina Kurth,
Mario Fritz
Abstract:
Kaplan-Meier estimators are essential tools in survival analysis, capturing the survival behavior of a cohort. Their accuracy improves with large, diverse datasets, encouraging data holders to collaborate for more precise estimations. However, these datasets often contain sensitive individual information, necessitating stringent data protection measures that preclude naive data sharing.
In this work, we introduce two novel differentially private methods that offer flexibility in applying differential privacy to various functions of the data. Additionally, we propose a synthetic dataset generation technique that enables easy and rapid conversion between different data representations. Utilizing these methods, we propose various paths that allow a joint estimation of the Kaplan-Meier curves with strict privacy guarantees. Our contribution includes a taxonomy of methods for this task and an extensive experimental exploration and evaluation based on this structure. We demonstrate that our approach can construct a joint, global Kaplan-Meier estimator that adheres to strict privacy standards ($\varepsilon = 1$) while exhibiting no statistically significant deviation from the nonprivate centralized estimator.
Submitted 29 July, 2024; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Tumor evolution models of phase-field type with nonlocal effects and angiogenesis
Authors:
Marvin Fritz
Abstract:
In this survey article, a variety of systems modeling tumor growth are discussed. In accordance with the hallmarks of cancer, the described models incorporate the primary characteristics of cancer evolution. Specifically, we focus on diffusive interface models and follow the phase-field approach that describes the tumor as a collection of cells. Such systems are based on a multiphase approach that employs constitutive laws and balance laws for individual constituents. In mathematical oncology, numerous biological phenomena are involved, including temporal and spatial nonlocal effects, complex nonlinearities, stochasticity, and mixed-dimensional couplings. Using the models, for instance, we can express angiogenesis and cell-to-matrix adhesion effects. Finally, we offer some methods for numerically approximating the models and show simulations of the tumor's evolution in response to various biological effects.
Submitted 20 March, 2023;
originally announced March 2023.
-
A phase-field model for non-small cell lung cancer under the effects of immunotherapy
Authors:
Andreas Wagner,
Pirmin Schlicke,
Marvin Fritz,
Christina Kuttler,
J. Tinsley Oden,
Christian Schumann,
Barbara Wohlmuth
Abstract:
Formulating tumor models that predict growth under therapy is vital for improving patient-specific treatment plans. In this context, we present our recent work on simulating non-small cell lung cancer (NSCLC) in a simple, deterministic setting for two different patients receiving an immunotherapeutic treatment.
At its core, our model consists of a Cahn-Hilliard-based phase-field model describing the evolution of proliferative and necrotic tumor cells. These are coupled to a simplified nutrient model that drives the growth of the proliferative cells and their decay into necrotic cells. The applied immunotherapy decreases the proliferative cell concentration. Here, we model the immunotherapeutic agent concentration in the entire lung over time by an ordinary differential equation (ODE). Finally, reaction terms provide a coupling between all these equations. By assuming spherical, symmetric tumor growth and constant nutrient inflow, we simplify this full 3D cancer simulation model to a reduced 1D model.
We can then resort to patient data gathered from computed tomography (CT) scans over several years to calibrate our model. For the reduced 1D model, we show that our model can qualitatively describe observations during immunotherapy by fitting our model parameters to existing patient data. Our model covers cases in which the immunotherapy is successful and limits the tumor size, as well as cases predicting a sudden relapse, leading to exponential tumor growth.
Finally, we move from the reduced model back to the full 3D cancer simulation in the lung tissue. Thereby, we show the predictive benefits a more detailed patient-specific simulation including spatial information could yield in the future.
Submitted 16 March, 2023;
originally announced March 2023.
-
Client-specific Property Inference against Secure Aggregation in Federated Learning
Authors:
Raouf Kerkouche,
Gergely Ács,
Mario Fritz
Abstract:
Federated learning has become a widely used paradigm for collaboratively training a common model among different participants with the help of a central server that coordinates the training. Although only the model parameters or other model updates are exchanged during the federated training instead of the participant's data, many attacks have shown that it is still possible to infer sensitive information such as membership, property, or outright reconstruction of participant data. Although differential privacy is considered an effective solution to protect against privacy attacks, it is also criticized for its negative effect on utility. Another possible defense is to use secure aggregation which allows the server to only access the aggregated update instead of each individual one, and it is often more appealing because it does not degrade model quality. However, combining only the aggregated updates, which are generated by a different composition of clients in every round, may still allow the inference of some client-specific information. In this paper, we show that simple linear models can effectively capture client-specific properties only from the aggregated model updates due to the linearity of aggregation. We formulate an optimization problem across different rounds in order to infer a tested property of every client from the output of the linear models, for example, whether they have a specific sample in their training data (membership inference) or whether they misbehave and attempt to degrade the performance of the common model by poisoning attacks. Our reconstruction technique is completely passive and undetectable. We demonstrate the efficacy of our approach on several scenarios which shows that secure aggregation provides very limited privacy guarantees in practice. The source code will be released upon publication.
Submitted 27 October, 2023; v1 submitted 7 March, 2023;
originally announced March 2023.
-
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
Authors:
Kai Greshake,
Sahar Abdelnabi,
Shailesh Mishra,
Christoph Endres,
Thorsten Holz,
Mario Fritz
Abstract:
Large Language Models (LLMs) are increasingly being integrated into various applications. The functionalities of recent LLMs can be flexibly modulated via natural language prompts. This renders them susceptible to targeted adversarial prompting, e.g., Prompt Injection (PI) attacks enable attackers to override original instructions and employed controls. So far, it was assumed that the user is directly prompting the LLM. But, what if it is not the user prompting? We argue that LLM-Integrated Applications blur the line between data and instructions. We reveal new attack vectors, using Indirect Prompt Injection, that enable adversaries to remotely (without a direct interface) exploit LLM-integrated applications by strategically injecting prompts into data likely to be retrieved. We derive a comprehensive taxonomy from a computer security perspective to systematically investigate impacts and vulnerabilities, including data theft, worming, information ecosystem contamination, and other novel security risks. We demonstrate our attacks' practical viability against both real-world systems, such as Bing's GPT-4 powered Chat and code-completion engines, and synthetic applications built on GPT-4. We show how processing retrieved prompts can act as arbitrary code execution, manipulate the application's functionality, and control how and if other APIs are called. Despite the increasing integration and reliance on LLMs, effective mitigations of these emerging threats are currently lacking. By raising awareness of these vulnerabilities and providing key insights into their implications, we aim to promote the safe and responsible deployment of these powerful models and the development of robust defenses that protect users and systems from potential attacks.
Submitted 5 May, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
Data Forensics in Diffusion Models: A Systematic Analysis of Membership Privacy
Authors:
Derui Zhu,
Dingfan Chen,
Jens Grossklags,
Mario Fritz
Abstract:
In recent years, diffusion models have achieved tremendous success in the field of image generation, becoming the state-of-the-art technology for AI-based image processing applications. Despite the numerous benefits brought by recent advances in diffusion models, there are also concerns about their potential misuse, specifically in terms of privacy breaches and intellectual property infringement. In particular, some of their unique characteristics open up new attack surfaces when considering the real-world deployment of such models. With a thorough investigation of the attack vectors, we develop a systematic analysis of membership inference attacks on diffusion models and propose novel attack methods tailored to each attack scenario specifically relevant to diffusion models. Our approach exploits easily obtainable quantities and is highly effective, achieving near-perfect attack performance (>0.9 AUCROC) in realistic scenarios. Our extensive experiments demonstrate the effectiveness of our method, highlighting the importance of considering privacy and intellectual property risks when using diffusion models in image generation tasks.
△ Less
Submitted 5 August, 2023; v1 submitted 15 February, 2023;
originally announced February 2023.
-
CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models
Authors:
Hossein Hajipour,
Keno Hassler,
Thorsten Holz,
Lea Schönherr,
Mario Fritz
Abstract:
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks. Their advances in competition-level programming problems have made them an essential pillar of AI-assisted pair programming, and tools such as GitHub Copilot have emerged as part of the daily programming workflow used by millions of developers. The training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities. This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure. While these models have been extensively assessed for their ability to produce functionally correct programs, there remains a lack of comprehensive investigations and benchmarks addressing the security aspects of these models.
In this work, we propose a method to systematically study the security issues of code language models to assess their susceptibility to generating vulnerable code. To this end, we introduce the first approach to automatically find generated code that contains vulnerabilities in black-box code generation models. To achieve this, we present an approach to approximate inversion of the black-box code generation models based on few-shot prompting. We evaluate the effectiveness of our approach by examining code language models in generating high-risk security weaknesses. Furthermore, we establish a collection of diverse non-secure prompts for various vulnerability scenarios using our method. This dataset forms a benchmark for evaluating and comparing the security weaknesses in code language models.
Submitted 23 October, 2023; v1 submitted 8 February, 2023;
originally announced February 2023.
-
FedLAP-DP: Federated Learning by Sharing Differentially Private Loss Approximations
Authors:
Hui-Po Wang,
Dingfan Chen,
Raouf Kerkouche,
Mario Fritz
Abstract:
Conventional gradient-sharing approaches for federated learning (FL), such as FedAvg, rely on aggregation of local models and often face performance degradation under differential privacy (DP) mechanisms or data heterogeneity, which can be attributed to the inconsistency between the local and global objectives. To address this issue, we propose FedLAP-DP, a novel privacy-preserving approach for FL. Our formulation involves clients synthesizing a small set of samples that approximate local loss landscapes by simulating the gradients of real images within a local region. Acting as loss surrogates, these synthetic samples are aggregated on the server side to uncover the global loss landscape and enable global optimization. Building upon these insights, we offer a new perspective to enforce record-level differential privacy in FL. A formal privacy analysis demonstrates that FedLAP-DP incurs the same privacy costs as typical gradient-sharing schemes while achieving an improved trade-off between privacy and utility. Extensive experiments validate the superiority of our approach across various datasets with highly skewed distributions in both DP and non-DP settings. Beyond the promising performance, our approach presents a faster convergence speed compared to typical gradient-sharing methods and opens up the possibility of trading communication costs for better performance by sending a larger set of synthetic images. The source is available at https://github.com/a514514772/FedLAP-DP.
Submitted 2 May, 2024; v1 submitted 2 February, 2023;
originally announced February 2023.
-
Holistically Explainable Vision Transformers
Authors:
Moritz Böhle,
Mario Fritz,
Bernt Schiele
Abstract:
Transformers increasingly dominate the machine learning landscape across many tasks and domains, which increases the importance for understanding their outputs. While their attention modules provide partial insight into their inner workings, the attention scores have been shown to be insufficient for explaining the models as a whole. To address this, we propose B-cos transformers, which inherently provide holistic explanations for their decisions. Specifically, we formulate each model component - such as the multi-layer perceptrons, attention layers, and the tokenisation module - to be dynamic linear, which allows us to faithfully summarise the entire transformer via a single linear transform. We apply our proposed design to Vision Transformers (ViTs) and show that the resulting models, dubbed Bcos-ViTs, are highly interpretable and perform competitively to baseline ViTs on ImageNet. Code will be made available soon.
Submitted 20 January, 2023;
originally announced January 2023.
-
Private Set Generation with Discriminative Information
Authors:
Dingfan Chen,
Raouf Kerkouche,
Mario Fritz
Abstract:
Differentially private data generation techniques have become a promising solution to the data privacy challenge -- it enables sharing of data while complying with rigorous privacy guarantees, which is essential for scientific progress in sensitive domains. Unfortunately, restricted by the inherent complexity of modeling high-dimensional distributions, existing private generative models are struggling with the utility of synthetic samples.
In contrast to existing works that aim at fitting the complete data distribution, we directly optimize for a small set of samples that are representative of the distribution under the supervision of discriminative information from downstream tasks, which is generally an easier task and more suitable for private training. Our work provides an alternative view for differentially private generation of high-dimensional data and introduces a simple yet effective method that greatly improves the sample utility of state-of-the-art approaches.
Submitted 7 November, 2022;
originally announced November 2022.
-
SimSCOOD: Systematic Analysis of Out-of-Distribution Generalization in Fine-tuned Source Code Models
Authors:
Hossein Hajipour,
Ning Yu,
Cristian-Alexandru Staicu,
Mario Fritz
Abstract:
Large code datasets have become increasingly accessible for pre-training source code models. However, for the fine-tuning phase, obtaining representative training data that fully covers the code distribution for specific downstream tasks remains challenging due to the task-specific nature and limited labeling resources. Moreover, fine-tuning pretrained models can result in forgetting previously acquired pre-training knowledge. These lead to out-of-distribution (OOD) generalization issues with unexpected model inference behaviors that have not been systematically studied yet. In this paper, we contribute the first systematic approach that simulates various OOD scenarios along different dimensions of source code data properties and study the fine-tuned model behaviors in such scenarios. We investigate the behaviors of models under different fine-tuning methodologies, including full fine-tuning and Low-Rank Adaptation (LoRA) fine-tuning methods. Our comprehensive analysis, conducted on four state-of-the-art pretrained models and applied to two code generation tasks, exposes multiple failure modes attributed to OOD generalization issues. Additionally, our analysis uncovers that LoRA fine-tuning consistently exhibits significantly better OOD generalization performance than full fine-tuning across various scenarios.
Submitted 30 October, 2023; v1 submitted 10 October, 2022;
originally announced October 2022.
-
UnGANable: Defending Against GAN-based Face Manipulation
Authors:
Zheng Li,
Ning Yu,
Ahmed Salem,
Michael Backes,
Mario Fritz,
Yang Zhang
Abstract:
Deepfakes pose severe threats of visual misinformation to our society. One representative deepfake application is face manipulation that modifies a victim's facial attributes in an image, e.g., changing her age or hair color. The state-of-the-art face manipulation techniques rely on Generative Adversarial Networks (GANs). In this paper, we propose the first defense system, namely UnGANable, against GAN-inversion-based face manipulation. Specifically, UnGANable focuses on defending against GAN inversion, an essential step for face manipulation. Its core technique is to search for alternative images (called cloaked images) around the original images (called target images) in image space. When posted online, these cloaked images can jeopardize the GAN inversion process. We consider two state-of-the-art inversion techniques including optimization-based inversion and hybrid inversion, and design five different defenses under five scenarios depending on the defender's background knowledge. Extensive experiments on four popular GAN models trained on two benchmark face datasets show that UnGANable achieves remarkable effectiveness and utility performance, and outperforms multiple baseline methods. We further investigate four adaptive adversaries to bypass UnGANable and show that some of them are slightly effective.
△ Less
Submitted 3 October, 2022;
originally announced October 2022.
-
Fact-Saboteurs: A Taxonomy of Evidence Manipulation Attacks against Fact-Verification Systems
Authors:
Sahar Abdelnabi,
Mario Fritz
Abstract:
Mis- and disinformation are a substantial global threat to our security and safety. To cope with the scale of online misinformation, researchers have been working on automating fact-checking by retrieving and verifying against relevant evidence. However, despite many advances, a comprehensive evaluation of the possible attack vectors against such systems is still lacking. Particularly, the automated fact-verification process might be vulnerable to the exact disinformation campaigns it is trying to combat. In this work, we assume an adversary that automatically tampers with the online evidence in order to disrupt the fact-checking model via camouflaging the relevant evidence or planting a misleading one. We first propose an exploratory taxonomy that spans these two targets and the different threat model dimensions. Guided by this, we design and propose several potential attack methods. We show that it is possible to subtly modify claim-salient snippets in the evidence and generate diverse and claim-aligned evidence. Thus, we highly degrade the fact-checking performance under many different permutations of the taxonomy's dimensions. The attacks are also robust against post-hoc modifications of the claim. Our analysis further hints at potential limitations in models' inference when faced with contradicting evidence. We emphasize that these attacks can have harmful implications on the inspectable and human-in-the-loop usage scenarios of such models, and we conclude by discussing challenges and directions for future defenses.
Submitted 16 June, 2023; v1 submitted 7 September, 2022;
originally announced September 2022.