-
Controlling Counterfactual Harm in Decision Support Systems Based on Prediction Sets
Authors:
Eleni Straitouri,
Suhas Thejaswi,
Manuel Gomez Rodriguez
Abstract:
Decision support systems based on prediction sets help humans solve multiclass classification tasks by narrowing down the set of potential label values to a subset of them, namely a prediction set, and asking them to always predict label values from the prediction sets. While this type of system has been proven effective at improving the average accuracy of the predictions made by humans, by restricting human agency, they may cause harm -- a human who has succeeded at predicting the ground-truth label of an instance on their own may have failed had they used these systems. In this paper, our goal is to control how frequently a decision support system based on prediction sets may cause harm, by design. To this end, we start by characterizing the above notion of harm using the theoretical framework of structural causal models. Then, we show that, under a natural, albeit unverifiable, monotonicity assumption, we can estimate how frequently a system may cause harm using only predictions made by humans on their own. Further, we also show that, under a weaker monotonicity assumption, which can be verified experimentally, we can bound how frequently a system may cause harm again using only predictions made by humans on their own. Building upon these assumptions, we introduce a computational framework to design decision support systems based on prediction sets that are guaranteed to cause harm less frequently than a user-specified value using conformal risk control. We validate our framework using real human predictions from two different human subject studies and show that, in decision support systems based on prediction sets, there is a trade-off between accuracy and counterfactual harm.
Submitted 10 June, 2024;
originally announced June 2024.
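The abstract above relies on conformal risk control to keep the frequency of harm below a user-specified value. As a rough illustration of that generic technique (not the authors' procedure), the sketch below picks the smallest threshold whose finite-sample-adjusted empirical risk is at most `alpha`. The function name `crc_threshold`, the grid of candidate thresholds, and the toy losses are all hypothetical; the sketch assumes per-example losses are bounded by `loss_bound` and non-increasing in the threshold.

```python
def crc_threshold(losses_by_lambda, alpha, loss_bound=1.0):
    """Generic conformal risk control over a finite grid (illustrative sketch).

    losses_by_lambda: list of (lam, losses) pairs sorted by increasing lam,
    where losses holds one calibration loss per example and each example's
    loss is assumed non-increasing in lam and bounded by loss_bound.
    Returns the smallest lam whose adjusted risk is <= alpha, or None.
    """
    for lam, losses in losses_by_lambda:
        n = len(losses)
        # (n / (n + 1)) * mean(losses) + loss_bound / (n + 1)
        adjusted_risk = (sum(losses) + loss_bound) / (n + 1)
        if adjusted_risk <= alpha:
            return lam
    return None


# Hypothetical calibration losses (1 = harm, 0 = no harm) on four examples.
toy_grid = [
    (0.1, [1, 1, 0, 1]),
    (0.5, [0, 1, 0, 0]),
    (0.9, [0, 0, 0, 0]),
]
chosen = crc_threshold(toy_grid, alpha=0.5)
```

The `loss_bound / (n + 1)` term is what distinguishes this from simply thresholding the empirical risk: it accounts for the worst-case loss of one unseen test example.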
-
Prediction-Powered Ranking of Large Language Models
Authors:
Ivi Chatzi,
Eleni Straitouri,
Suhas Thejaswi,
Manuel Gomez Rodriguez
Abstract:
Large language models are often ranked according to their level of alignment with human preferences -- a model is better than other models if its outputs are more frequently preferred by humans. One of the popular ways to elicit human preferences utilizes pairwise comparisons between the outputs provided by different models to the same inputs. However, since gathering pairwise comparisons by humans is costly and time-consuming, it has become a common practice to gather pairwise comparisons by a strong large language model -- a model strongly aligned with human preferences. Surprisingly, practitioners cannot currently measure the uncertainty that any mismatch between human and model preferences may introduce in the constructed rankings. In this work, we develop a statistical framework to bridge this gap. Given a (small) set of pairwise comparisons by humans and a large set of pairwise comparisons by a model, our framework provides a rank-set -- a set of possible ranking positions -- for each of the models under comparison. Moreover, it guarantees that, with a probability greater than or equal to a user-specified value, the rank-sets cover the true ranking consistent with the distribution of human pairwise preferences asymptotically. Using pairwise comparisons made by humans in the LMSYS Chatbot Arena platform and pairwise comparisons made by three strong large language models, we empirically demonstrate the effectiveness of our framework and show that the rank-sets constructed using only pairwise comparisons by the strong large language models are often inconsistent with (the distribution of) human pairwise preferences.
Submitted 23 May, 2024; v1 submitted 27 February, 2024;
originally announced February 2024.
-
Designing Decision Support Systems Using Counterfactual Prediction Sets
Authors:
Eleni Straitouri,
Manuel Gomez Rodriguez
Abstract:
Decision support systems for classification tasks are predominantly designed to predict the value of the ground truth labels. However, since their predictions are not perfect, these systems also need to make human experts understand when and how to use these predictions to update their own predictions. Unfortunately, this has been proven challenging. In this context, it has been recently argued that an alternative type of decision support systems may circumvent this challenge. Rather than providing a single label prediction, these systems provide a set of label prediction values constructed using a conformal predictor, namely a prediction set, and forcefully ask experts to predict a label value from the prediction set. However, the design and evaluation of these systems have so far relied on stylized expert models, questioning their promise. In this paper, we revisit the design of this type of systems from the perspective of online learning and develop a methodology that does not require, nor assumes, an expert model. Our methodology leverages the nested structure of the prediction sets provided by any conformal predictor and a natural counterfactual monotonicity assumption to achieve an exponential improvement in regret in comparison to vanilla bandit algorithms. We conduct a large-scale human subject study ($n = 2{,}751$) to compare our methodology to several competitive baselines. The results show that, for decision support systems based on prediction sets, limiting experts' level of agency leads to greater performance than allowing experts to always exercise their own agency. We have made available the data gathered in our human subject study as well as an open source implementation of our system at https://github.com/Networks-Learning/counterfactual-prediction-sets.
Submitted 16 July, 2024; v1 submitted 6 June, 2023;
originally announced June 2023.
-
Human-Aligned Calibration for AI-Assisted Decision Making
Authors:
Nina L. Corvelo Benz,
Manuel Gomez Rodriguez
Abstract:
Whenever a binary classifier is used to provide decision support, it typically provides both a label prediction and a confidence value. Then, the decision maker is supposed to use the confidence value to calibrate how much to trust the prediction. In this context, it has been often argued that the confidence value should correspond to a well calibrated estimate of the probability that the predicted label matches the ground truth label. However, multiple lines of empirical evidence suggest that decision makers have difficulties at developing a good sense on when to trust a prediction using these confidence values. In this paper, our goal is first to understand why and then investigate how to construct more useful confidence values. We first argue that, for a broad class of utility functions, there exist data distributions for which a rational decision maker is, in general, unlikely to discover the optimal decision policy using the above confidence values -- an optimal decision maker would need to sometimes place more (less) trust on predictions with lower (higher) confidence values. However, we then show that, if the confidence values satisfy a natural alignment property with respect to the decision maker's confidence on her own predictions, there always exists an optimal decision policy under which the level of trust the decision maker would need to place on predictions is monotone on the confidence values, facilitating its discoverability. Further, we show that multicalibration with respect to the decision maker's confidence on her own predictions is a sufficient condition for alignment. Experiments on four different AI-assisted decision making tasks where a classifier provides decision support to real human experts validate our theoretical results and suggest that alignment may lead to better decisions.
Submitted 23 February, 2024; v1 submitted 31 May, 2023;
originally announced June 2023.
-
On the Within-Group Fairness of Screening Classifiers
Authors:
Nastaran Okati,
Stratis Tsirtsis,
Manuel Gomez Rodriguez
Abstract:
Screening classifiers are increasingly used to identify qualified candidates in a variety of selection processes. In this context, it has been recently shown that, if a classifier is calibrated, one can identify the smallest set of candidates which contains, in expectation, a desired number of qualified candidates using a threshold decision rule. This lends support to focusing on calibration as the only requirement for screening classifiers. In this paper, we argue that screening policies that use calibrated classifiers may suffer from an understudied type of within-group unfairness -- they may unfairly treat qualified members within demographic groups of interest. Further, we argue that this type of unfairness can be avoided if classifiers satisfy within-group monotonicity, a natural monotonicity property within each of the groups. Then, we introduce an efficient post-processing algorithm based on dynamic programming to minimally modify a given calibrated classifier so that its probability estimates satisfy within-group monotonicity. We validate our algorithm using US Census survey data and show that within-group monotonicity can be often achieved at a small cost in terms of prediction granularity and shortlist size.
Submitted 7 August, 2023; v1 submitted 31 January, 2023;
originally announced February 2023.
-
Counterfactual Inference of Second Opinions
Authors:
Nina L. Corvelo Benz,
Manuel Gomez Rodriguez
Abstract:
Automated decision support systems that are able to infer second opinions from experts can potentially facilitate a more efficient allocation of resources; they can help decide when and from whom to seek a second opinion. In this paper, we look at the design of this type of support systems from the perspective of counterfactual inference. We focus on a multiclass classification setting and first show that, if experts make predictions on their own, the underlying causal mechanism generating their predictions needs to satisfy a desirable set invariant property. Further, we show that, for any causal mechanism satisfying this property, there exists an equivalent mechanism where the predictions by each expert are generated by independent sub-mechanisms governed by a common noise. This motivates the design of a set invariant Gumbel-Max structural causal model where the structure of the noise governing the sub-mechanisms underpinning the model depends on an intuitive notion of similarity between experts which can be estimated from data. Experiments on both synthetic and real data show that our model can be used to infer second opinions more accurately than its non-causal counterpart.
Submitted 30 June, 2022; v1 submitted 16 March, 2022;
originally announced March 2022.
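The abstract above builds on the Gumbel-Max structural causal model. As a minimal sketch of the basic mechanism only (not the set invariant model the paper proposes), a categorical outcome can be written as the argmax of log-probabilities plus Gumbel noise, so that the outcome becomes a deterministic function of the probabilities and the noise. The helper names `sample_gumbel` and `gumbel_max` and the toy probabilities are hypothetical; inferring the posterior over the noise given an observed outcome, which a full counterfactual query needs, is omitted here.

```python
import math
import random


def sample_gumbel(rng):
    """Draw one standard Gumbel variate via inverse transform sampling."""
    u = rng.random()
    while u == 0.0:  # avoid log(0); random() may return exactly 0.0
        u = rng.random()
    return -math.log(-math.log(u))


def gumbel_max(probs, noise):
    """Return the index maximizing log(p_k) + g_k (all p_k assumed > 0)."""
    scores = [math.log(p) + g for p, g in zip(probs, noise)]
    return max(range(len(scores)), key=scores.__getitem__)


# With the noise held fixed, the same function answers "what would the
# outcome have been under different probabilities?"
fixed_noise = [2.0, 0.0, 0.0]
factual = gumbel_max([0.2, 0.5, 0.3], fixed_noise)
alternative = gumbel_max([0.4, 0.4, 0.2], fixed_noise)

# Drawing fresh noise instead yields an ordinary categorical sample.
rng = random.Random(0)
fresh_sample = gumbel_max([0.2, 0.5, 0.3], [sample_gumbel(rng) for _ in range(3)])
```

In this toy case the probability of the factual label rises relative to the others, and the outcome under the fixed noise stays the same, which is the kind of monotone behavior that makes the Gumbel-Max model attractive for counterfactual reasoning.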
-
Improving Screening Processes via Calibrated Subset Selection
Authors:
Lequn Wang,
Thorsten Joachims,
Manuel Gomez Rodriguez
Abstract:
Many selection processes such as finding patients qualifying for a medical trial or retrieval pipelines in search engines consist of multiple stages, where an initial screening stage focuses the resources on shortlisting the most promising candidates. In this paper, we investigate what guarantees a screening classifier can provide, independently of whether it is constructed manually or trained. We find that current solutions do not enjoy distribution-free theoretical guarantees -- we show that, in general, even for a perfectly calibrated classifier, there always exist specific pools of candidates for which its shortlist is suboptimal. Then, we develop a distribution-free screening algorithm -- called Calibrated Subset Selection (CSS) -- that, given any classifier and some amount of calibration data, finds near-optimal shortlists of candidates that contain a desired number of qualified candidates in expectation. Moreover, we show that a variant of CSS that calibrates a given classifier multiple times across specific groups can create shortlists with provable diversity guarantees. Experiments on US Census survey data validate our theoretical results and show that the shortlists provided by our algorithm are superior to those provided by several competitive baselines.
Submitted 12 June, 2022; v1 submitted 2 February, 2022;
originally announced February 2022.
-
Improving Expert Predictions with Conformal Prediction
Authors:
Eleni Straitouri,
Lequn Wang,
Nastaran Okati,
Manuel Gomez Rodriguez
Abstract:
Automated decision support systems promise to help human experts solve multiclass classification tasks more efficiently and accurately. However, existing systems typically require experts to understand when to cede agency to the system or when to exercise their own agency. Otherwise, the experts may be better off solving the classification tasks on their own. In this work, we develop an automated decision support system that, by design, does not require experts to understand when to trust the system to improve performance. Rather than providing (single) label predictions and letting experts decide when to trust these predictions, our system provides sets of label predictions constructed using conformal prediction -- prediction sets -- and forcefully asks experts to predict labels from these sets. By using conformal prediction, our system can precisely trade off the probability that the true label is not in the prediction set, which determines how frequently our system will mislead the experts, and the size of the prediction set, which determines the difficulty of the classification task the experts need to solve using our system. In addition, we develop an efficient and near-optimal search method to find the conformal predictor under which the experts benefit the most from using our system. Simulation experiments using synthetic and real expert predictions demonstrate that our system may help experts make more accurate predictions and is robust to the accuracy of the classifier the conformal predictor relies on.
Submitted 30 June, 2023; v1 submitted 28 January, 2022;
originally announced January 2022.
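The trade-off described in the abstract above, between the probability that the true label falls outside the prediction set and the size of the set, can be illustrated with standard split conformal prediction. The sketch below is a generic textbook construction, not the system or the search method from the paper; the function names, the nonconformity score (one minus the classifier's probability for the label), and the toy calibration data are assumptions made for illustration.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Score threshold giving roughly (1 - alpha) coverage on exchangeable data.

    cal_probs: per-example lists of class probabilities from some classifier.
    cal_labels: the true label index of each calibration example.
    """
    n = len(cal_labels)
    # Nonconformity score: one minus the probability of the true label.
    scores = sorted(1.0 - probs[y] for probs, y in zip(cal_probs, cal_labels))
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return float("inf")  # too little calibration data: include every label
    return scores[k - 1]


def prediction_set(probs, threshold):
    """All labels whose nonconformity score does not exceed the threshold."""
    return [y for y, p in enumerate(probs) if 1.0 - p <= threshold]


# Hypothetical calibration data: four examples, three classes.
cal_probs = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.6, 0.3, 0.1]]
cal_labels = [0, 1, 2, 1]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)
labels_shown_to_expert = prediction_set([0.5, 0.3, 0.2], threshold)
```

A smaller `alpha` raises the threshold and enlarges the sets, so the true label is excluded less often but the expert faces a harder choice; a larger `alpha` does the opposite.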
-
Counterfactual Temporal Point Processes
Authors:
Kimia Noorbakhsh,
Manuel Gomez Rodriguez
Abstract:
Machine learning models based on temporal point processes are the state of the art in a wide variety of applications involving discrete events in continuous time. However, these models lack the ability to answer counterfactual questions, which are increasingly relevant as these models are being used to inform targeted interventions. In this work, our goal is to fill this gap. To this end, we first develop a causal model of thinning for temporal point processes that builds upon the Gumbel-Max structural causal model. This model satisfies a desirable counterfactual monotonicity condition, which is sufficient to identify counterfactual dynamics in the process of thinning. Then, given an observed realization of a temporal point process with a given intensity function, we develop a sampling algorithm that uses the above causal model of thinning and the superposition theorem to simulate counterfactual realizations of the temporal point process under a given alternative intensity function. Simulation experiments using synthetic and real epidemiological data show that the counterfactual realizations provided by our algorithm may give valuable insights to enhance targeted interventions.
Submitted 20 May, 2022; v1 submitted 15 November, 2021;
originally announced November 2021.
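The abstract above develops a causal model of thinning. For orientation, the sketch below shows the classical, non-causal thinning procedure for sampling an inhomogeneous Poisson process: candidate events are drawn at a constant rate that upper-bounds the intensity, and each candidate is kept with probability equal to the ratio of the intensity to that bound. This is not the counterfactual sampling algorithm from the paper; the function name, the example intensity, and the bound `lambda_max` are assumptions made for illustration.

```python
import math
import random


def sample_by_thinning(intensity, lambda_max, horizon, rng):
    """Sample event times on [0, horizon) by thinning.

    intensity: function t -> rate, assumed to satisfy
               0 <= intensity(t) <= lambda_max on [0, horizon).
    rng: a random.Random instance, passed in for reproducibility.
    """
    events = []
    t = 0.0
    while True:
        # Next candidate from a homogeneous process with rate lambda_max.
        t += rng.expovariate(lambda_max)
        if t >= horizon:
            return events
        # Accept the candidate with probability intensity(t) / lambda_max.
        if rng.random() * lambda_max < intensity(t):
            events.append(t)


rng = random.Random(0)
events = sample_by_thinning(lambda t: 1.0 + math.sin(t), 2.0, 10.0, rng)
```

The accept/reject decision for each candidate is the step a causal model of thinning would need to reason about: a counterfactual query asks which candidates would have been kept under an alternative intensity.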
-
Consequential Ranking Algorithms and Long-term Welfare
Authors:
Behzad Tabibian,
Vicenç Gómez,
Abir De,
Bernhard Schölkopf,
Manuel Gomez Rodriguez
Abstract:
Ranking models are typically designed to provide rankings that optimize some measure of immediate utility to the users. As a result, they have been unable to anticipate an increasing number of undesirable long-term consequences of their proposed rankings, from fueling the spread of misinformation and increasing polarization to degrading social discourse. Can we design ranking models that understand the consequences of their proposed rankings and, more importantly, are able to avoid the undesirable ones? In this paper, we first introduce a joint representation of rankings and user dynamics using Markov decision processes. Then, we show that this representation greatly simplifies the construction of consequential ranking models that trade off the immediate utility and the long-term welfare. In particular, we can obtain optimal consequential rankings just by applying weighted sampling on the rankings provided by models that maximize measures of immediate utility. However, in practice, such a strategy may be inefficient and impractical, especially in high-dimensional scenarios. To overcome this, we introduce an efficient gradient-based algorithm to learn parameterized consequential ranking models that effectively approximate optimal ones. We showcase our methodology using synthetic and real data gathered from Reddit and show that ranking models derived using our methodology provide ranks that may mitigate the spread of misinformation and improve the civility of online discussions.
Submitted 13 May, 2019;
originally announced May 2019.
-
Enhancing the Accuracy and Fairness of Human Decision Making
Authors:
Isabel Valera,
Adish Singla,
Manuel Gomez Rodriguez
Abstract:
Societies often rely on human experts to take a wide variety of decisions affecting their members, from jail-or-release decisions taken by judges and stop-and-frisk decisions taken by police officers to accept-or-reject decisions taken by academics. In this context, each decision is taken by an expert who is typically chosen uniformly at random from a pool of experts. However, these decisions may be imperfect due to limited experience, implicit biases, or faulty probabilistic reasoning. Can we improve the accuracy and fairness of the overall decision making process by optimizing the assignment between experts and decisions?
In this paper, we address the above problem from the perspective of sequential decision making and show that, for different fairness notions from the literature, it reduces to a sequence of (constrained) weighted bipartite matchings, which can be solved efficiently using algorithms with approximation guarantees. Moreover, these algorithms also benefit from posterior sampling to actively trade off exploitation -- selecting expert assignments which lead to accurate and fair decisions -- and exploration -- selecting expert assignments to learn about the experts' preferences and biases. We demonstrate the effectiveness of our algorithms on both synthetic and real-world data and show that they can significantly improve both the accuracy and fairness of the decisions taken by pools of experts.
Submitted 25 May, 2018;
originally announced May 2018.
-
Teaching Multiple Concepts to a Forgetful Learner
Authors:
Anette Hunziker,
Yuxin Chen,
Oisin Mac Aodha,
Manuel Gomez Rodriguez,
Andreas Krause,
Pietro Perona,
Yisong Yue,
Adish Singla
Abstract:
How can we help a forgetful learner learn multiple concepts within a limited time frame? While there have been extensive studies in designing optimal schedules for teaching a single concept given a learner's memory model, existing approaches for teaching multiple concepts are typically based on heuristic scheduling techniques without theoretical guarantees. In this paper, we look at the problem from the perspective of discrete optimization and introduce a novel algorithmic framework for teaching multiple concepts with strong performance guarantees. Our framework is both generic, allowing the design of teaching schedules for different memory models, and also interactive, allowing the teacher to adapt the schedule to the underlying forgetting mechanisms of the learner. Furthermore, for a well-known memory model, we are able to identify a regime of model parameters where our framework is guaranteed to achieve high performance. We perform extensive evaluations using simulations along with real user studies in two concrete applications: (i) an educational app for online vocabulary teaching; and (ii) an app for teaching novices how to recognize animal species from images. Our results demonstrate the effectiveness of our algorithm compared to popular heuristic approaches.
Submitted 25 October, 2019; v1 submitted 21 May, 2018;
originally announced May 2018.
-
Fake News Detection in Social Networks via Crowd Signals
Authors:
Sebastian Tschiatschek,
Adish Singla,
Manuel Gomez Rodriguez,
Arpit Merchant,
Andreas Krause
Abstract:
Our work considers leveraging crowd signals for detecting fake news and is motivated by tools recently introduced by Facebook that enable users to flag fake news. By aggregating users' flags, our goal is to select a small subset of news every day, send them to an expert (e.g., via a third-party fact-checking organization), and stop the spread of news identified as fake by an expert. The main objective of our work is to minimize the spread of misinformation by stopping the propagation of fake news in the network. It is especially challenging to achieve this objective as it requires detecting fake news with high-confidence as quickly as possible. We show that in order to leverage users' flags efficiently, it is crucial to learn about users' flagging accuracy. We develop a novel algorithm, DETECTIVE, that performs Bayesian inference for detecting fake news and jointly learns about users' flagging accuracy over time. Our algorithm employs posterior sampling to actively trade off exploitation (selecting news that maximize the objective value at a given epoch) and exploration (selecting news that maximize the value of information towards learning about users' flagging accuracy). We demonstrate the effectiveness of our approach via extensive experiments and show the power of leveraging community signals for fake news detection.
Submitted 2 March, 2018; v1 submitted 24 November, 2017;
originally announced November 2017.
-
From Parity to Preference-based Notions of Fairness in Classification
Authors:
Muhammad Bilal Zafar,
Isabel Valera,
Manuel Gomez Rodriguez,
Krishna P. Gummadi,
Adrian Weller
Abstract:
The adoption of automated, data-driven decision making in an ever expanding range of applications has raised concerns about its potential unfairness towards certain social groups. In this context, a number of recent studies have focused on defining, detecting, and removing unfairness from data-driven decision systems. However, the existing notions of fairness, based on parity (equality) in treatment or outcomes for different social groups, tend to be quite stringent, limiting the overall decision making accuracy. In this paper, we draw inspiration from the fair-division and envy-freeness literature in economics and game theory and propose preference-based notions of fairness -- given the choice between various sets of decision treatments or outcomes, any group of users would collectively prefer its treatment or outcomes, regardless of the (dis)parity as compared to the other groups. Then, we introduce tractable proxies to design margin-based classifiers that satisfy these preference-based notions of fairness. Finally, we experiment with a variety of synthetic and real-world datasets and show that preference-based fairness allows for greater decision accuracy than parity-based fairness.
Submitted 28 November, 2017; v1 submitted 30 June, 2017;
originally announced July 2017.
-
Cheshire: An Online Algorithm for Activity Maximization in Social Networks
Authors:
Ali Zarezade,
Abir De,
Hamid Rabiee,
Manuel Gomez Rodriguez
Abstract:
User engagement in social networks depends critically on the number of online actions their users take in the network. Can we design an algorithm that finds when to incentivize users to take actions to maximize the overall activity in a social network? In this paper, we model the number of online actions over time using multidimensional Hawkes processes, derive an alternate representation of these processes based on stochastic differential equations (SDEs) with jumps and, exploiting this alternate representation, address the above question from the perspective of stochastic optimal control of SDEs with jumps. We find that the optimal level of incentivized actions depends linearly on the current level of overall actions. Moreover, the coefficients of this linear relationship can be found by solving a matrix Riccati differential equation, which can be solved efficiently, and a first order differential equation, which has a closed form solution. As a result, we are able to design an efficient online algorithm, Cheshire, to sample the optimal times of the users' incentivized actions. Experiments on both synthetic and real data gathered from Twitter show that our algorithm is able to consistently maximize the number of online actions more effectively than the state of the art.
Submitted 6 March, 2017;
originally announced March 2017.
-
Fairness Beyond Disparate Treatment & Disparate Impact: Learning Classification without Disparate Mistreatment
Authors:
Muhammad Bilal Zafar,
Isabel Valera,
Manuel Gomez Rodriguez,
Krishna P. Gummadi
Abstract:
Automated data-driven decision making systems are increasingly being used to assist, or even replace humans in many settings. These systems function by learning from historical decisions, often taken by humans. In order to maximize the utility of these systems (or, classifiers), their training involves minimizing the errors (or, misclassifications) over the given historical data. However, it is quite possible that the optimally trained classifier makes decisions for people belonging to different social groups with different misclassification rates (e.g., misclassification rates for females are higher than for males), thereby placing these groups at an unfair disadvantage. To account for and avoid such unfairness, in this paper, we introduce a new notion of unfairness, disparate mistreatment, which is defined in terms of misclassification rates. We then propose intuitive measures of disparate mistreatment for decision boundary-based classifiers, which can be easily incorporated into their formulation as convex-concave constraints. Experiments on synthetic as well as real world datasets show that our methodology is effective at avoiding disparate mistreatment, often at a small cost in terms of accuracy.
Submitted 8 March, 2017; v1 submitted 26 October, 2016;
originally announced October 2016.
-
Modeling the Dynamics of Online Learning Activity
Authors:
Charalampos Mavroforakis,
Isabel Valera,
Manuel Gomez Rodriguez
Abstract:
People are increasingly relying on the Web and social media to find solutions to their problems in a wide range of domains. In this online setting, closely related problems often lead to the same characteristic learning pattern, in which people sharing these problems visit related pieces of information, perform almost identical queries or, more generally, take a series of similar actions. In this paper, we introduce a novel modeling framework for clustering continuous-time grouped streaming data, the hierarchical Dirichlet Hawkes process (HDHP), which allows us to automatically uncover a wide variety of learning patterns from detailed traces of learning activity. Our model allows for efficient inference, scaling to millions of actions taken by thousands of users. Experiments on real data gathered from Stack Overflow reveal that our framework can recover meaningful learning patterns in terms of both content and temporal dynamics, as well as accurately track users' interests and goals over time.
Submitted 18 October, 2016;
originally announced October 2016.
-
RedQueen: An Online Algorithm for Smart Broadcasting in Social Networks
Authors:
Ali Zarezade,
Utkarsh Upadhyay,
Hamid Rabiee,
Manuel Gomez Rodriguez
Abstract:
Users in social networks whose posts stay at the top of their followers' feeds the longest time are more likely to be noticed. Can we design an online algorithm to help them decide when to post to stay at the top? In this paper, we address this question as a novel optimal control problem for jump stochastic differential equations. For a wide variety of feed dynamics, we show that the optimal broadcasting intensity for any user is surprisingly simple -- it is given by the position of her most recent post on each of her followers' feeds. As a consequence, we are able to develop a simple and highly efficient online algorithm, RedQueen, to sample the optimal times for the user to post. Experiments on both synthetic and real data gathered from Twitter show that our algorithm is able to consistently make a user's posts more visible over time, is robust to volume changes on her followers' feeds, and significantly outperforms the state of the art.
Submitted 18 October, 2016;
originally announced October 2016.
-
Influence of substitutional disorder on the electrical transport and the superconducting properties of Fe$_{1+z}$Te$_{1-x-y}$Se$_{x}$S$_{y}$
Authors:
M. G. Rodríguez,
G. Polla,
C. P. Ramos,
C. Acha
Abstract:
We have carried out an investigation of the structural, magnetic, transport and superconducting properties of Fe$_{1+z}$Te$_{1-x-y}$Se$_x$S$_y$ ceramic compounds, for $z=0$ and some specific Se ($0 \leq x \leq 0.5$) and S ($0 \leq y \leq 0.12$) contents. The incorporation of Se and S into the FeTe structure produces a progressive reduction of the crystallographic parameters as well as different degrees of structural disorder associated with the differences of the ionic radius of the substituting cations. In the present study, we measure transport properties of this family of compounds and we show the direct influence of disorder in the normal and superconducting states. We notice that the structural disorder correlates with a variable range hopping conducting regime observed at temperatures $T > 200$ K. At lower temperatures, all the samples except the one with the highest degree of disorder show a crossover to a metallic-like regime, probably related to the transport of resilient quasiparticles associated with the proximity of a Fermi liquid state at temperatures below the superconducting transition. Moreover, the superconducting properties are depressed only for that particular sample, in accordance with the condition that superconductivity is affected by disorder when the electronic localization length $\xi_L$ becomes smaller than the coherence length $\xi_{SC}$.
Submitted 28 July, 2015;
originally announced July 2015.
-
Fairness Constraints: Mechanisms for Fair Classification
Authors:
Muhammad Bilal Zafar,
Isabel Valera,
Manuel Gomez Rodriguez,
Krishna P. Gummadi
Abstract:
Algorithmic decision making systems are ubiquitous across a wide variety of online as well as offline services. These systems rely on complex learning methods and vast amounts of data to optimize the service functionality, satisfaction of the end user and profitability. However, there is a growing concern that these automated decisions can lead, even in the absence of intent, to a lack of fairness, i.e., their outcomes can disproportionately hurt (or, benefit) particular groups of people sharing one or more sensitive attributes (e.g., race, sex). In this paper, we introduce a flexible mechanism to design fair classifiers by leveraging a novel intuitive measure of decision boundary (un)fairness. We instantiate this mechanism with two well-known classifiers, logistic regression and support vector machines, and show on real-world data that our mechanism allows for a fine-grained control on the degree of fairness, often at a small cost in terms of accuracy.
Submitted 23 March, 2017; v1 submitted 19 July, 2015;
originally announced July 2015.
-
COEVOLVE: A Joint Point Process Model for Information Diffusion and Network Co-evolution
Authors:
Mehrdad Farajtabar,
Yichen Wang,
Manuel Gomez Rodriguez,
Shuang Li,
Hongyuan Zha,
Le Song
Abstract:
Information diffusion in online social networks is affected by the underlying network topology, but it also has the power to change it. Online users are constantly creating new links when exposed to new information sources, and in turn these links are altering the way information spreads. However, these two highly intertwined stochastic processes, information diffusion and network evolution, have been predominantly studied separately, ignoring their co-evolutionary dynamics.
We propose a temporal point process model, COEVOLVE, for such joint dynamics, allowing the intensity of one process to be modulated by that of the other. This model allows us to efficiently simulate interleaved diffusion and network events, and generate traces obeying common diffusion and network patterns observed in real-world networks. Furthermore, we also develop a convex optimization framework to learn the parameters of the model from historical diffusion and network evolution traces. We experimented with both synthetic data and data gathered from Twitter, and show that our model provides a good fit to the data as well as more accurate predictions than alternatives.
Submitted 1 April, 2016; v1 submitted 8 July, 2015;
originally announced July 2015.
-
Learning and Forecasting Opinion Dynamics in Social Networks
Authors:
Abir De,
Isabel Valera,
Niloy Ganguly,
Sourangshu Bhattacharya,
Manuel Gomez Rodriguez
Abstract:
Social media and social networking sites have become a global pinboard for exposition and discussion of news, topics, and ideas, where social media users often update their opinions about a particular topic by learning from the opinions shared by their friends. In this context, can we learn a data-driven model of opinion dynamics that is able to accurately forecast opinions from users? In this paper, we introduce SLANT, a probabilistic modeling framework of opinion dynamics, which represents users' opinions over time by means of marked jump diffusion stochastic differential equations, and allows for efficient model simulation and parameter estimation from historical fine-grained event data. We then leverage our framework to derive a set of efficient predictive formulas for opinion forecasting and identify conditions under which opinions converge to a steady state. Experiments on data gathered from Twitter show that our model provides a good fit to the data and our formulas achieve more accurate forecasting than alternatives.
Submitted 24 May, 2016; v1 submitted 17 June, 2015;
originally announced June 2015.
-
First-Order Insulator-to-Metal Mott Transition in the Paramagnetic 3D System GaTa4Se8
Authors:
A. Camjayi,
C. Acha,
R. Weht,
M. G. Rodríguez,
B. Corraze,
E. Janod,
L. Cario,
M. J. Rozenberg
Abstract:
The nature of the Mott transition in the absence of any symmetry breaking remains a matter of debate. We study the correlation-driven insulator-to-metal transition in the prototypical 3D Mott system GaTa4Se8, as a function of temperature and applied pressure. We report novel experiments on single crystals, which demonstrate that the transition is of first order and follows from the coexistence of two states, one insulating and one metallic, that we toggle with a small bias current. We provide support for our findings by contrasting the experimental data with calculations that combine local density approximation with dynamical mean-field theory, which are in very good agreement.
Submitted 15 September, 2014;
originally announced September 2014.
-
Shaping Social Activity by Incentivizing Users
Authors:
Mehrdad Farajtabar,
Nan Du,
Manuel Gomez Rodriguez,
Isabel Valera,
Hongyuan Zha,
Le Song
Abstract:
Events in an online social network can be categorized roughly into endogenous events, where users just respond to the actions of their neighbors within the network, or exogenous events, where users take actions due to drives external to the network. How much external drive should be provided to each user, such that the network activity can be steered towards a target state? In this paper, we model social events using multivariate Hawkes processes, which can capture both endogenous and exogenous event intensities, and derive a time dependent linear relation between the intensity of exogenous events and the overall network activity. Exploiting this connection, we develop a convex optimization framework for determining the required level of external drive in order for the network to reach a desired activity level. We experimented with event data gathered from Twitter, and show that our method can steer the activity of the network more accurately than alternatives.
Submitted 19 August, 2014; v1 submitted 2 August, 2014;
originally announced August 2014.
-
Quantifying Information Overload in Social Media and its Impact on Social Contagions
Authors:
Manuel Gomez Rodriguez,
Krishna Gummadi,
Bernhard Schoelkopf
Abstract:
Information overload has become a ubiquitous problem in modern society. Social media users and microbloggers receive an endless flow of information, often at a rate far higher than their cognitive abilities to process the information. In this paper, we conduct a large-scale quantitative study of information overload and evaluate its impact on information dissemination in the Twitter social media site. We model social media users as information processing systems that queue incoming information according to some policies, process information from the queue at some unknown rates and decide to forward some of the incoming information to other users. We show how timestamped data about tweets received and forwarded by users can be used to uncover key properties of their queueing policies and estimate their information processing rates and limits. Such an understanding of users' information processing behaviors allows us to infer whether and to what extent users suffer from information overload.
Our analysis provides empirical evidence of information processing limits for social media users and the prevalence of information overloading. The most active and popular social media users are often the ones that are overloaded. Moreover, we find that the rate at which users receive information impacts their processing behavior, including how they prioritize information from different sources, how much information they process, and how quickly they process information. Finally, the susceptibility of a social media user to social contagions depends crucially on the rate at which she receives information. An exposure to a piece of information, be it an idea, a convention or a product, is much less effective for users that receive information at higher rates, meaning they need more exposures to adopt a particular contagion.
Submitted 26 March, 2014;
originally announced March 2014.
-
Scalable Influence Estimation in Continuous-Time Diffusion Networks
Authors:
Nan Du,
Le Song,
Manuel Gomez Rodriguez,
Hongyuan Zha
Abstract:
If a piece of information is released from a media site, can it spread, in 1 month, to a million web pages? This influence estimation problem is very challenging since both the time-sensitive nature of the problem and the issue of scalability need to be addressed simultaneously. In this paper, we propose a randomized algorithm for influence estimation in continuous-time diffusion networks. Our algorithm can estimate the influence of every node in a network with $|V|$ nodes and $|E|$ edges to an accuracy of $\varepsilon$ using $n=O(1/\varepsilon^2)$ randomizations and, up to logarithmic factors, $O(n|E|+n|V|)$ computations. When used as a subroutine in a greedy influence maximization algorithm, our proposed method is guaranteed to find a set of nodes with an influence of at least $(1-1/e)\,\mathrm{OPT}-2\varepsilon$, where OPT is the optimal value. Experiments on both synthetic and real-world data show that the proposed method can easily scale up to networks of millions of nodes while significantly improving over the previous state of the art in terms of the accuracy of the estimated influence and the quality of the selected nodes in maximizing the influence.
Submitted 14 November, 2013;
originally announced November 2013.
-
Measurement of the neutron electric to magnetic form factor ratio at $Q^2 = 1.58$ GeV$^2$ using the reaction $^3$He(e,e'n)pp
Authors:
B. S. Schlimme,
P. Achenbach,
C. A. Ayerbe Gayoso,
J. C. Bernauer,
R. Böhm,
D. Bosnar,
Th. Challand,
M. O. Distler,
L. Doria,
F. Fellenberger,
H. Fonvieille,
M. Gómez Rodríguez,
P. Grabmayr,
T. Hehl,
W. Heil,
D. Kiselev,
J. Krimmer,
M. Makek,
H. Merkel,
D. G. Middleton,
U. Müller,
L. Nungesser,
B. A. Ott,
J. Pochodzalla,
M. Potokar,
et al. (7 additional authors not shown)
Abstract:
A measurement of beam helicity asymmetries in the reaction $^3$He(e,e'n)pp has been performed at the Mainz Microtron in quasielastic kinematics in order to determine the electric to magnetic form factor ratio of the neutron, $G_E^n/G_M^n$, at a four-momentum transfer $Q^2 = 1.58$ GeV$^2$. Longitudinally polarized electrons were scattered on a highly polarized $^3$He gas target. The scattered electrons were detected with a high-resolution magnetic spectrometer, and the ejected neutrons with a dedicated neutron detector composed of scintillator bars. To reduce systematic errors, data were taken for four different target polarization orientations, allowing the determination of $G_E^n/G_M^n$ from a double ratio. We find $\mu_n G_E^n/G_M^n = 0.250 \pm 0.058$ (stat.) $\pm\ 0.017$ (sys.).
Submitted 29 August, 2013; v1 submitted 28 July, 2013;
originally announced July 2013.
-
Modeling Information Propagation with Survival Theory
Authors:
Manuel Gomez Rodriguez,
Jure Leskovec,
Bernhard Schoelkopf
Abstract:
Networks provide a skeleton for the spread of contagions such as information, ideas, behaviors and diseases. Often, the networks over which contagions diffuse are unobserved and need to be inferred. Here we apply survival theory to develop general additive and multiplicative risk models under which the network inference problems can be solved efficiently by exploiting their convexity. Our additive risk model generalizes several existing network inference models. We show all these models are particular cases of our more general model. Our multiplicative model allows for modeling scenarios in which a node can either increase or decrease the risk of activation of another node, in contrast with previous approaches, which consider only positive risk increments. We evaluate the performance of our network inference algorithms on large synthetic and real cascade datasets, and show that our models are able to predict the length and duration of cascades in real data.
Submitted 15 May, 2013;
originally announced May 2013.
-
Structure and Dynamics of Information Pathways in Online Media
Authors:
Manuel Gomez Rodriguez,
Jure Leskovec,
Bernhard Schölkopf
Abstract:
Diffusion of information, spread of rumors and infectious diseases are all instances of stochastic processes that occur over the edges of an underlying network. Many times networks over which contagions spread are unobserved, and such networks are often dynamic and change over time. In this paper, we investigate the problem of inferring dynamic networks based on information diffusion data. We assume there is an unobserved dynamic network that changes over time, while we observe the results of a dynamic process spreading over the edges of the network. The task then is to infer the edges and the dynamics of the underlying network.
We develop an on-line algorithm that relies on stochastic convex optimization to efficiently solve the dynamic network inference problem. We apply our algorithm to information diffusion among 3.3 million mainstream media and blog sites and experiment with more than 179 million different pieces of information spreading over the network in a one year period. We study the evolution of information pathways in the online media space and find interesting insights. Information pathways for general recurrent topics are more stable across time than for on-going news events. Clusters of news media sites and blogs often emerge and vanish in a matter of days for on-going news events. Major social movements and events involving civil population, such as the Libyan civil war or the Syrian uprising, lead to an increased number of information pathways among blogs as well as an overall increase in the network centrality of blogs and social media sites.
Submitted 6 December, 2012;
originally announced December 2012.
-
Influence Maximization in Continuous Time Diffusion Networks
Authors:
Manuel Gomez Rodriguez,
Bernhard Schölkopf
Abstract:
The problem of finding the optimal set of source nodes in a diffusion network that maximizes the spread of information, influence, and diseases in a limited amount of time depends dramatically on the underlying temporal dynamics of the network. However, this still remains largely unexplored to date. To this end, given a network and its temporal dynamics, we first describe how continuous time Markov chains allow us to analytically compute the average total number of nodes reached by a diffusion process starting in a set of source nodes. We then show that selecting the set of most influential source nodes in the continuous time influence maximization problem is NP-hard and develop an efficient approximation algorithm with provable near-optimal performance. Experiments on synthetic and real diffusion networks show that our algorithm outperforms other state of the art algorithms by at least ~20% and is robust across different network topologies.
Submitted 8 May, 2012;
originally announced May 2012.
-
Submodular Inference of Diffusion Networks from Multiple Trees
Authors:
Manuel Gomez Rodriguez,
Bernhard Schölkopf
Abstract:
Diffusion and propagation of information, influence and diseases take place over increasingly larger networks. We observe when a node copies information, makes a decision or becomes infected but networks are often hidden or unobserved. Since networks are highly dynamic, changing and growing rapidly, we only observe a relatively small set of cascades before a network changes significantly. Scalable network inference based on a small cascade set is then necessary for understanding the rapidly evolving dynamics that govern diffusion. In this article, we develop a scalable approximation algorithm with provable near-optimal performance based on submodular maximization which achieves a high accuracy in such scenarios, solving an open problem first introduced by Gomez-Rodriguez et al. (2010). Experiments on synthetic and real diffusion data show that our algorithm in practice achieves an optimal trade-off between accuracy and running time.
Submitted 8 May, 2012;
originally announced May 2012.
-
Electrical transport properties of manganite powders under pressure
Authors:
M. G. Rodríguez,
A. G. Leyva,
C. Acha
Abstract:
We have measured the electrical resistance of micrometric to nanometric powders of the La$_{5/8-y}$Pr$_y$Ca$_{3/8}$MnO$_3$ (LPCMO with $y=0.3$) manganite for hydrostatic pressures up to 4 kbar. By applying different final thermal treatments to samples synthesized by a microwave-assisted denitration process, we obtained two particular characteristic grain dimensions (40 nm and 1000 nm), which allowed us to analyze the grain size sensitivity of the electrical conduction properties of both the metal electrode interface with manganite (Pt / LPCMO) and the intrinsic intergranular interfaces formed by the LPCMO powder, conglomerated under the sole effect of external pressure. We also analyzed the effects of pressure on the phase diagram of these powders. Our results indicate that different magnetic phases coexist at low temperatures and that the electrical transport properties are related to the intrinsic interfaces, as we observe evidence of granular behavior and electronic transport dominated by the space-charge-limited current mechanism.
Submitted 17 January, 2012;
originally announced January 2012.
-
Uncovering the Temporal Dynamics of Diffusion Networks
Authors:
Manuel Gomez Rodriguez,
David Balduzzi,
Bernhard Schölkopf
Abstract:
Time plays an essential role in the diffusion of information, influence and disease over networks. In many cases we only observe when a node copies information, makes a decision or becomes infected -- but the connectivity, transmission rates between nodes and transmission sources are unknown. Inferring the underlying dynamics is of outstanding interest since it enables forecasting, influencing and retarding infections, broadly construed. To this end, we model diffusion processes as discrete networks of continuous temporal processes occurring at different rates. Given cascade data -- observed infection times of nodes -- we infer the edges of the global diffusion network and estimate the transmission rates of each edge that best explain the observed data. The optimization problem is convex. The model naturally (without heuristics) imposes sparse solutions and requires no parameter tuning. The problem decouples into a collection of independent smaller problems, thus scaling easily to networks on the order of hundreds of thousands of nodes. Experiments on real and synthetic data show that our algorithm both recovers the edges of diffusion networks and accurately estimates their transmission rates from cascade data.
Submitted 3 May, 2011;
originally announced May 2011.