-
Large Language Models Overcome the Machine Penalty When Acting Fairly but Not When Acting Selfishly or Altruistically
Authors:
Zhen Wang,
Ruiqi Song,
Chen Shen,
Shiya Yin,
Zhao Song,
Balaraju Battu,
Lei Shi,
Danyang Jia,
Talal Rahwan,
Shuyue Hu
Abstract:
In social dilemmas where the collective and self-interests are at odds, people typically cooperate less with machines than with fellow humans, a phenomenon termed the machine penalty. Overcoming this penalty is critical for successful human-machine collectives, yet current solutions often involve ethically questionable tactics, like concealing machines' non-human nature. In this study, with 1,152 participants, we explore the possibility of addressing this problem by using Large Language Models (LLMs), in scenarios where communication is possible between interacting parties. We design three types of LLMs: (i) Cooperative, aiming to assist its human associate; (ii) Selfish, focusing solely on maximizing its self-interest; and (iii) Fair, balancing its own and collective interest, while slightly prioritizing self-interest. Our findings reveal that, when interacting with humans, fair LLMs are able to induce cooperation levels comparable to those observed in human-human interactions, even when their non-human nature is fully disclosed. In contrast, selfish and cooperative LLMs fail to achieve this goal. Post-experiment analysis shows that all three types of LLMs succeed in forming mutual cooperation agreements with humans, yet only fair LLMs, which occasionally break their promises, are capable of instilling a perception among humans that cooperating with them is the social norm, and eliciting positive views on their trustworthiness, mindfulness, intelligence, and communication quality. Our findings suggest that for effective human-machine cooperation, bot manufacturers should avoid designing machines with mere rational decision-making or a sole focus on assisting humans. Instead, they should design machines capable of judiciously balancing their own interest and the interest of humans.
Submitted 8 October, 2024; v1 submitted 29 September, 2024;
originally announced October 2024.
-
Hack Me If You Can: Aggregating AutoEncoders for Countering Persistent Access Threats Within Highly Imbalanced Data
Authors:
Sidahmed Benabderrahmane,
Ngoc Hoang,
Petko Valtchev,
James Cheney,
Talal Rahwan
Abstract:
Advanced Persistent Threats (APTs) are sophisticated, targeted cyberattacks designed to gain unauthorized access to systems and remain undetected for extended periods. To evade detection, APT cyberattacks deceive defense layers with breaches and exploits, thereby complicating exposure by traditional anomaly detection-based security methods. The challenge of detecting APTs with machine learning is compounded by the rarity of relevant datasets and the significant imbalance in the data, which makes the detection process highly burdensome. We present AE-APT, a deep learning-based tool for APT detection that features a family of AutoEncoder methods ranging from a basic one to a Transformer-based one. We evaluated our tool on a suite of provenance trace databases produced by the DARPA Transparent Computing program, where APT-like attacks constitute as little as 0.004% of the data. The datasets span multiple operating systems, including Android, Linux, BSD, and Windows, and cover two attack scenarios. The outcomes showed that AE-APT has significantly higher detection rates compared to its competitors, indicating superior performance in detecting and ranking anomalies.
Submitted 27 June, 2024;
originally announced June 2024.
-
Self-Reflection Outcome is Sensitive to Prompt Construction
Authors:
Fengyuan Liu,
Nouar AlDahoul,
Gregory Eady,
Yasir Zaki,
Bedoor AlShebli,
Talal Rahwan
Abstract:
Large language models (LLMs) demonstrate impressive zero-shot and few-shot reasoning capabilities. Some propose that such capabilities can be improved through self-reflection, i.e., letting LLMs reflect on their own output to identify and correct mistakes in the initial responses. However, despite some evidence showing the benefits of self-reflection, recent studies offer mixed results. Here, we aim to reconcile these conflicting findings by first demonstrating that the outcome of self-reflection is sensitive to prompt wording; e.g., LLMs are more likely to conclude that they have made a mistake when explicitly prompted to find mistakes. Consequently, idiosyncrasies in reflection prompts may lead LLMs to change correct responses unnecessarily. We show that most prompts used in the self-reflection literature are prone to this bias. We then propose different ways of constructing prompts that are conservative in identifying mistakes and show that self-reflection using such prompts results in higher accuracy. Our findings highlight the importance of prompt engineering in self-reflection tasks. We release our code at https://github.com/Michael98Liu/mixture-of-prompts.
Submitted 14 June, 2024;
originally announced June 2024.
-
Inclusive content reduces racial and gender biases, yet non-inclusive content dominates popular media outlets
Authors:
Nouar AlDahoul,
Hazem Ibrahim,
Minsu Park,
Talal Rahwan,
Yasir Zaki
Abstract:
Images are often termed as representations of perceived reality. As such, racial and gender biases in popular media imagery could play a vital role in shaping people's perceptions of society. While inquiries into such biases have examined the frequency at which different racial and gender groups appear in different forms of media, the literature still lacks a large-scale longitudinal study that further examines the manner in which these groups are portrayed. To fill this gap, we examine three media forms, namely fashion magazines, movie posters, and advertisements. To do so, we collect a large dataset comprising over 300,000 images spanning over five decades and utilize state-of-the-art machine learning models to not only classify race and gender but also identify the posture, emotional state, and body composition of the person featured in each image. We find that racial minorities appear far less frequently than their White counterparts, and when they do appear, they are portrayed less prominently and tend to convey more negative emotions. We also find that women are more likely to be portrayed with their full bodies in images, whereas men are more frequently presented with their faces. This disparity exemplifies face-ism, where emphasizing faces over bodies has been linked to perceptions of higher competence and intelligence. Finally, through a series of survey experiments, we show that exposure to inclusive content -- rather than racially and gender-homogenized content -- significantly reduces perception biases towards minorities in areas such as household income, hiring merit, beauty standards, leadership positions, and the representation of women in the workplace. Taken together, our findings demonstrate that racial and gender biases in media continue to be an ongoing problem that may exacerbate existing stereotypes.
Submitted 10 May, 2024;
originally announced May 2024.
-
Interpretable Machine Learning Models for Predicting the Next Targets of Activist Funds
Authors:
Minwu Kim,
Sidahmed Benabderrahmane,
Talal Rahwan
Abstract:
This work develops a predictive model to identify potential targets of activist investment funds, which strategically acquire significant corporate stakes to drive operational and strategic improvements and enhance shareholder value. Predicting these targets is crucial for companies to mitigate intervention risks, for activists to select optimal targets, and for investors to capitalize on associated stock price gains. Our analysis utilizes data from the Russell 3000 index from 2016 to 2022. We tested 123 variations of models using different data imputation, oversampling, and machine learning methods, achieving a top AUC-ROC of 0.782. This demonstrates the model's effectiveness in identifying likely targets of activist funds. We applied the Shapley value method to determine the most influential factors in a company's susceptibility to activist investment. This interpretative approach provides clear insights into the driving forces behind activist targeting. Our model offers stakeholders a strategic tool for proactive corporate governance and investment strategy, enhancing understanding of the dynamics of activist investing.
Submitted 12 October, 2024; v1 submitted 24 April, 2024;
originally announced April 2024.
-
A Novel BERT-based Classifier to Detect Political Leaning of YouTube Videos based on their Titles
Authors:
Nouar AlDahoul,
Talal Rahwan,
Yasir Zaki
Abstract:
A quarter of US adults regularly get their news from YouTube. Yet, despite the massive political content available on the platform, to date no classifier has been proposed to identify the political leaning of YouTube videos. To fill this gap, we propose a novel classifier based on BERT -- a language model from Google -- to classify YouTube videos merely based on their titles into six categories, namely: Far Left, Left, Center, Anti-Woke, Right, and Far Right. We used a public dataset of 10 million YouTube video titles (under various categories) to train and validate the proposed classifier. We compare the classifier against several alternatives that we trained on the same dataset, revealing that our classifier achieves the highest accuracy (75%) and the highest F1 score (77%). To further validate the classification performance, we collect videos from YouTube channels of numerous prominent news agencies, such as Fox News and New York Times, which have widely known political leanings, and apply our classifier to their video titles. For the vast majority of cases, the predicted political leaning matches that of the news agency.
Submitted 16 February, 2024;
originally announced April 2024.
-
Google Scholar is manipulatable
Authors:
Hazem Ibrahim,
Fengyuan Liu,
Yasir Zaki,
Talal Rahwan
Abstract:
Citations are widely considered in scientists' evaluation. As such, scientists may be incentivized to inflate their citation counts. While previous literature has examined self-citations and citation cartels, it remains unclear whether scientists can purchase citations. Here, we compile a dataset of ~1.6 million profiles on Google Scholar to examine instances of citation fraud on the platform. We survey faculty at highly-ranked universities, and confirm that Google Scholar is widely used when evaluating scientists. Intrigued by a citation-boosting service that we unravelled during our investigation, we contacted the service while undercover as a fictional author, and managed to purchase 50 citations. These findings provide conclusive evidence that citations can be bought in bulk, and highlight the need to look beyond citation counts.
Submitted 7 February, 2024;
originally announced February 2024.
-
AI-generated faces influence gender stereotypes and racial homogenization
Authors:
Nouar AlDahoul,
Talal Rahwan,
Yasir Zaki
Abstract:
Text-to-image generative AI models such as Stable Diffusion are used daily by millions worldwide. However, the extent to which these models exhibit racial and gender stereotypes is not yet fully understood. Here, we document significant biases in Stable Diffusion across six races, two genders, 32 professions, and eight attributes. Additionally, we examine the degree to which Stable Diffusion depicts individuals of the same race as being similar to one another. This analysis reveals significant racial homogenization, e.g., depicting nearly all Middle Eastern men as dark-skinned, bearded, and wearing a traditional headdress. We then propose novel debiasing solutions that address the above stereotypes. Finally, using a preregistered experiment, we show that being presented with inclusive AI-generated faces reduces people's racial and gender biases, while being presented with non-inclusive ones increases such biases. This persists regardless of whether the images are labeled as AI-generated. Taken together, our findings emphasize the need to address biases and stereotypes in AI-generated content.
Submitted 10 May, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
Network Members Can Hide from Group Centrality Measures
Authors:
Marcin Waniek,
Talal Rahwan
Abstract:
Group centrality measures are a generalization of standard centrality, designed to quantify the importance of not just a single node (as is the case with standard measures) but rather that of a group of nodes. Some nodes may have an incentive to evade such measures, i.e., to hide their actual importance, in order to conceal their true role in the network. A number of studies have been proposed in the literature to understand how nodes can rewire the network in order to evade standard centrality, but no study has focused on group centrality to date. We close this gap by analyzing four group centrality measures: degree, closeness, betweenness, and GED-walk. We show that an optimal way to rewire the network can be computed efficiently given the first of these measures (degree), but the problem is NP-complete given closeness and betweenness. Moreover, we empirically evaluate a number of hiding strategies, and show that an optimal way to hide from degree group centrality is also effective in practice against the other measures. Altogether, our results suggest that it is possible to hide from group centrality measures based solely on the local information available to the group members about the network topology.
Submitted 15 December, 2023;
originally announced December 2023.
-
Coupled-Space Attacks against Random-Walk-based Anomaly Detection
Authors:
Yuni Lai,
Marcin Waniek,
Liying Li,
Jingwen Wu,
Yulin Zhu,
Tomasz P. Michalak,
Talal Rahwan,
Kai Zhou
Abstract:
Random Walks-based Anomaly Detection (RWAD) is commonly used to identify anomalous patterns in various applications. An intriguing characteristic of RWAD is that the input graph can either be pre-existing or constructed from raw features. Consequently, there are two potential attack surfaces against RWAD: graph-space attacks and feature-space attacks. In this paper, we explore this vulnerability by designing practical coupled-space attacks, investigating the interplay between graph-space and feature-space attacks. To this end, we conduct a thorough complexity analysis, proving that attacking RWAD is NP-hard. Then, we proceed to formulate the graph-space attack as a bi-level optimization problem and propose two strategies to solve it: alternative iteration (alterI-attack) or utilizing the closed-form solution of the random walk model (cf-attack). Finally, we utilize the results from the graph-space attacks as guidance to design more powerful feature-space attacks (i.e., graph-guided attacks). Comprehensive experiments demonstrate that our proposed attacks are effective in enabling the target nodes to evade detection by RWAD with a limited attack budget. In addition, we conduct transfer attack experiments in a black-box setting, which show that our feature attack significantly decreases the anomaly scores of target nodes. Our study opens the door to studying the coupled-space attack against graph anomaly detection in which the graph space relies on the feature space.
Submitted 23 October, 2023; v1 submitted 26 July, 2023;
originally announced July 2023.
-
Editors handle their collaborators' submissions despite explicit policies
Authors:
Fengyuan Liu,
Bedoor AlShebli,
Talal Rahwan
Abstract:
Editors are crucial to the integrity of the scientific publishing process, yet they themselves could face conflicts of interest (COIs), whereby their personal interests interfere with their editorial duties. One such COI stems from the fact that, apart from a few exceptions, the vast majority of editors are research-active scientists with many collaborators. Each such editor could potentially handle submissions from their recent collaborators, allowing the editor to use their power, consciously or otherwise, to treat such submissions favourably, thereby jeopardizing the integrity of the editorial decision. Naturally, a number of policies have been put in place to govern such COI, but their effectiveness remains unknown. We fill this gap by analyzing half a million papers handled by 60,000 different editors and published in 500 journals by six publishers, namely Frontiers, Hindawi, IEEE, MDPI, PLOS, and PNAS. We find numerous papers handled by editors who collaborated recently with the authors; this happens despite policies explicitly prohibiting such behavior. Overall, nearly 3% of journals have a COI rate $\geq$ 10%, and nearly half of them have a COI rate $\geq$ 2%. Moreover, leveraging three quasi-experiments, we find that COI policies have a limited, if any, effect on regulating this phenomenon. Finally, we find that editors are faster to accept submissions from their collaborators, raising the possibility of favoritism. These findings highlight the need for policy reform to assure the scientific community that all submissions are treated equally.
Submitted 3 July, 2023;
originally announced July 2023.
-
HowkGPT: Investigating the Detection of ChatGPT-generated University Student Homework through Context-Aware Perplexity Analysis
Authors:
Christoforos Vasilatos,
Manaar Alam,
Talal Rahwan,
Yasir Zaki,
Michail Maniatakos
Abstract:
As the use of Large Language Models (LLMs) in text generation tasks proliferates, concerns arise over their potential to compromise academic integrity. The education sector currently tussles with distinguishing student-authored homework assignments from AI-generated ones. This paper addresses the challenge by introducing HowkGPT, designed to identify homework assignments generated by AI. HowkGPT is built upon a dataset of academic assignments and accompanying metadata [17] and employs a pretrained LLM to compute perplexity scores for student-authored and ChatGPT-generated responses. These scores then assist in establishing a threshold for discerning the origin of a submitted assignment. Given the specificity and contextual nature of academic work, HowkGPT further refines its analysis by defining category-specific thresholds derived from the metadata, enhancing the precision of the detection. This study emphasizes the critical need for effective strategies to uphold academic integrity amidst the growing influence of LLMs and provides an approach to ensuring fair and accurate grading in educational institutions.
Submitted 7 June, 2023; v1 submitted 26 May, 2023;
originally announced May 2023.
-
Perception, performance, and detectability of conversational artificial intelligence across 32 university courses
Authors:
Hazem Ibrahim,
Fengyuan Liu,
Rohail Asim,
Balaraju Battu,
Sidahmed Benabderrahmane,
Bashar Alhafni,
Wifag Adnan,
Tuka Alhanai,
Bedoor AlShebli,
Riyadh Baghdadi,
Jocelyn J. Bélanger,
Elena Beretta,
Kemal Celik,
Moumena Chaqfeh,
Mohammed F. Daqaq,
Zaynab El Bernoussi,
Daryl Fougnie,
Borja Garcia de Soto,
Alberto Gandolfi,
Andras Gyorgy,
Nizar Habash,
J. Andrew Harris,
Aaron Kaufman,
Lefteris Kirousis,
Korhan Kocak
, et al. (14 additional authors not shown)
Abstract:
The emergence of large language models has led to the development of powerful tools such as ChatGPT that can produce text indistinguishable from human-generated work. With the increasing accessibility of such technology, students across the globe may utilize it to help with their school work -- a possibility that has sparked discussions on the integrity of student evaluations in the age of artificial intelligence (AI). To date, it is unclear how such tools perform compared to students on university-level courses. Further, students' perspectives regarding the use of such tools, and educators' perspectives on treating their use as plagiarism, remain unknown. Here, we compare the performance of ChatGPT against students on 32 university-level courses. We also assess the degree to which its use can be detected by two classifiers designed specifically for this purpose. Additionally, we conduct a survey across five countries, as well as a more in-depth survey at the authors' institution, to discern students' and educators' perceptions of ChatGPT's use. We find that ChatGPT's performance is comparable, if not superior, to that of students in many courses. Moreover, current AI-text classifiers cannot reliably detect ChatGPT's use in school work, due to their propensity to classify human-written answers as AI-generated, as well as the ease with which AI-generated text can be edited to evade detection. Finally, we find an emerging consensus among students to use the tool, and among educators to treat this as plagiarism. Our findings offer insights that could guide policy discussions addressing the integration of AI into educational frameworks.
Submitted 7 May, 2023;
originally announced May 2023.
-
Human intuition as a defense against attribute inference
Authors:
Marcin Waniek,
Navya Suri,
Abdullah Zameek,
Bedoor AlShebli,
Talal Rahwan
Abstract:
Attribute inference - the process of analyzing publicly available data in order to uncover hidden information - has become a major threat to privacy, given the recent technological leap in machine learning. One way to tackle this threat is to strategically modify one's publicly available data in order to keep one's private information hidden from attribute inference. We evaluate people's ability to perform this task, and compare it against algorithms designed for this purpose. We focus on three attributes: the gender of the author of a piece of text, the country in which a set of photos was taken, and the link missing from a social network. For each of these attributes, we find that people's effectiveness is inferior to that of AI, especially when it comes to hiding the attribute in question. Moreover, when people are asked to modify the publicly available information in order to hide these attributes, they are less likely to make high-impact modifications compared to AI. This suggests that people are unable to recognize the aspects of the data that are critical to an inference algorithm. Taken together, our findings highlight the limitations of relying on human intuition to protect privacy in the age of AI, and emphasize the need for algorithmic support to protect private information from attribute inference.
Submitted 24 April, 2023;
originally announced April 2023.
-
China and the U.S. produce more impactful AI research when collaborating together
Authors:
Bedoor AlShebli,
Shahan Ali Memon,
James A. Evans,
Talal Rahwan
Abstract:
Artificial Intelligence (AI) has become a disruptive technology, promising to grant a significant economic and strategic advantage to the nations that harness its power. China, with its recent push towards AI adoption, is challenging the U.S.'s position as the global leader in this field. Given AI's massive potential, as well as the fierce geopolitical tensions between the two nations, a number of policies have been put in place that discourage AI scientists from migrating to, or collaborating with, the other country. However, the extents of such brain drain and cross-border collaboration are not fully understood. Here, we analyze a dataset of over 350,000 AI scientists and 5,000,000 AI papers. We find that, since the year 2000, China and the U.S. have been leading the field in terms of impact, novelty, productivity, and workforce. Most AI scientists who migrate to China come from the U.S., and most who migrate to the U.S. come from China, highlighting a notable brain drain in both directions. Upon migrating from one country to the other, scientists continue to collaborate frequently with the origin country. Although the number of collaborations between the two countries has been increasing since the dawn of the millennium, such collaborations continue to be relatively rare. A matching experiment reveals that the two countries have always been more impactful when collaborating than when each of them works without the other. These findings suggest that instead of suppressing cross-border migration and collaboration between the two nations, the field could benefit from promoting such activities.
Submitted 21 April, 2023;
originally announced April 2023.
-
Gender inequality and self-publication patterns among scientific editors
Authors:
Fengyuan Liu,
Petter Holme,
Matteo Chiesa,
Bedoor AlShebli,
Talal Rahwan
Abstract:
Academic publishing is the principal medium of documenting and disseminating scientific discoveries. At the heart of its daily operations are the editorial boards. Despite their activities and recruitment often being opaque to outside observers, they play a crucial role in promoting fair evaluations and gender parity. Literature on gender inequality lacks the connection between women as editors and as research-active scientists, thereby missing the comparison between the gender balances in these two academic roles. Literature on editorial fairness similarly lacks longitudinal studies on the conflicts of interest arising from editors being research active, which motivates them to expedite the publication of their papers. We fill these gaps using a dataset of 103,000 editors, 240 million authors, and 220 million publications spanning five decades and 15 disciplines. This unique dataset allows us to compare the proportion of female editors to that of female scientists in any given year or discipline. Although women are already underrepresented in science (26%), they are even more so among editors (14%) and editors-in-chief (8%); the lack of women with long-enough publishing careers explains the gender gap among editors, but not editors-in-chief, suggesting that other factors may be at play. Our dataset also allows us to study the self-publication patterns of editors, revealing that 8% of them double the rate at which they publish in their own journal soon after the editorship starts, and this behavior is accentuated in journals where the editors-in-chief self-publish excessively. Finally, men are more likely to engage in this behaviour than women.
Submitted 23 June, 2022;
originally announced July 2022.
-
Hiding in Temporal Networks
Authors:
Marcin Waniek,
Petter Holme,
Talal Rahwan
Abstract:
Social network analysis tools can infer various attributes just by scrutinizing one's connections. Several researchers have studied the problem faced by an evader whose goal is to strategically rewire their social connections in order to mislead such tools, thereby concealing their private attributes. However, to date, this literature has only considered static networks, while neglecting the more general case of temporal networks, where the structure evolves over time. Driven by this observation, we study how the evader can conceal their importance from an adversary armed with temporal centrality measures. We consider computational and structural aspects of this problem: Is it computationally feasible to calculate optimal ways of hiding? If it is, what network characteristics facilitate hiding? This topic has been studied in static networks, but in this work, we add realism to the problem by considering temporal networks of edges changing in time. We find that it is usually computationally infeasible to find the optimal way of hiding. On the other hand, by manipulating one's contacts, one could add a surprising amount of privacy. Compared to static networks, temporal networks offer more strategies for this type of manipulation and are thus, to some extent, easier to hide in.
Submitted 28 July, 2021;
originally announced July 2021.
-
Social Diffusion Sources Can Escape Detection
Authors:
Marcin Waniek,
Manuel Cebrian,
Petter Holme,
Talal Rahwan
Abstract:
Influencing (and being influenced by) others through social networks is fundamental to all human societies. Whether this happens through the diffusion of rumors, opinions, or viruses, identifying the diffusion source (i.e., the person that initiated it) is a problem that has attracted much research interest. Nevertheless, existing literature has ignored the possibility that the source might strategically modify the network structure (by rewiring links or introducing fake nodes) to escape detection. Here, without restricting our analysis to any particular diffusion scenario, we close this gap by evaluating two mechanisms that hide the source: one stemming from the source's actions, the other from the network structure itself. This reveals that sources can easily escape detection, and that removing links is far more effective than introducing fake nodes. Thus, efforts should focus on exposing concealed ties rather than planted entities; such exposure would drastically improve our chances of detecting the diffusion source.
Submitted 11 November, 2021; v1 submitted 21 February, 2021;
originally announced February 2021.
-
Strategic Evasion of Centrality Measures
Authors:
Marcin Waniek,
Jan Woźnica,
Kai Zhou,
Yevgeniy Vorobeychik,
Talal Rahwan,
Tomasz Michalak
Abstract:
Among the most fundamental tools for social network analysis are centrality measures, which quantify the importance of every node in the network. This centrality analysis typically disregards the possibility that the network may have been deliberately manipulated to mislead the analysis. To solve this problem, a recent study attempted to understand how a member of a social network could rewire the connections therein to avoid being identified as a leader of that network. However, the study was based on the assumption that the network analyzer - the seeker - is oblivious to any evasion attempts by the evader. In this paper, we relax this assumption by modelling the seeker and evader as strategic players in a Bayesian Stackelberg game. In this context, we study the complexity of various optimization problems, and analyze the equilibria of the game under different assumptions, thereby drawing the first conclusions in the literature regarding which centralities the seeker should use to maximize the chances of detecting a strategic evader.
Submitted 26 January, 2021;
originally announced January 2021.
-
Traffic networks are vulnerable to disinformation attacks
Authors:
Marcin Waniek,
Gururaghav Raman,
Bedoor AlShebli,
Jimmy Chih-Hsien Peng,
Talal Rahwan
Abstract:
Disinformation continues to attract attention due to its increasing threat to society. Nevertheless, a disinformation-based attack on critical infrastructure has never been studied to date. Here, we consider traffic networks and focus on fake information that manipulates drivers' decisions to create congestion. We study the optimization problem faced by the adversary when choosing which streets to target to maximize disruption. We prove that finding an optimal solution is computationally intractable, implying that the adversary has no choice but to settle for suboptimal heuristics. We analyze one such heuristic, and compare the cases when targets are spread across the city of Chicago vs. concentrated in its business district. Surprisingly, the latter results in more far-reaching disruption, with its impact felt as far as 2 kilometers from the closest target. Our findings demonstrate that vulnerabilities in critical infrastructure may arise not only from hardware and software, but also from behavioral manipulation.
Submitted 8 March, 2020;
originally announced March 2020.
-
Hiding in Multilayer Networks
Authors:
Marcin Waniek,
Tomasz P. Michalak,
Talal Rahwan
Abstract:
Multilayer networks allow for modeling complex relationships, where individuals are embedded in multiple social networks at the same time. Given the ubiquity of such relationships, these networks have been increasingly gaining attention in the literature. This paper presents the first analysis of the robustness of centrality measures against strategic manipulation in multilayer networks. More specifically, we consider an "evader" who strategically chooses which connections to form in a multilayer network in order to obtain a low centrality-based ranking (thereby reducing the chance of being highlighted as a key figure in the network) while ensuring that she remains connected to a certain group of people. We prove that determining an optimal way to "hide" is NP-complete and hard to approximate for most centrality measures considered in our study. Moreover, we empirically evaluate a number of heuristics that the evader can use. Our results suggest that the centrality measures that are functions of the entire network topology are more robust to such a strategic evader than their counterparts that consider each layer separately.
Submitted 14 November, 2019;
originally announced November 2019.
-
The Impact of Informal Mentorship in Academic Collaborations
Authors:
Bedoor AlShebli,
Kinga Makovi,
Talal Rahwan
Abstract:
Inspired by the numerous benefits of mentorship in academia, we study "informal mentorship" in scientific collaborations, whereby a junior scientist is supported by multiple senior collaborators, without them necessarily having any formal supervisory roles. To this end, we analyze 2.5 million unique pairs of mentor-protégés spanning 9 disciplines and over a century of research, and we show that mentorship quality has a causal effect on the scientific impact of the papers written by the protégé post mentorship. This effect increases with the number of mentors, and persists over time, across disciplines and university ranks. The effect also increases with the academic age of the mentors until they reach 30 years of experience, after which it starts to decrease. Furthermore, we study how the gender of both the mentors and their protégé affects not only the impact of the protégé post mentorship, but also the citation gain of the mentors during the mentorship experience with their protégé. We find that increasing the proportion of female mentors decreases the impact of the protégé, while also compromising the gain of female mentors. While current policies that have been encouraging junior females to be mentored by senior females have been instrumental in retaining women in science, our findings suggest that the impact of women who remain in academia may increase by encouraging opposite-gender mentorships instead.
Submitted 10 August, 2019;
originally announced August 2019.
-
How weaponizing disinformation can bring down a city's power grid
Authors:
Gururaghav Raman,
Bedoor AlShebli,
Marcin Waniek,
Talal Rahwan,
Jimmy Chih-Hsien Peng
Abstract:
Social technologies have made it possible to propagate disinformation and manipulate the masses at an unprecedented scale. This is particularly alarming from a security perspective, as humans have proven to be the weakest link when protecting critical infrastructure in general, and the power grid in particular. Here, we consider an attack in which an adversary attempts to manipulate the behavior of energy consumers by sending fake discount notifications encouraging them to shift their consumption into the peak-demand period. We conduct surveys to assess the propensity of people to follow through on such notifications and forward them to their friends. This allows us to model how the disinformation propagates through social networks. Finally, using Greater London as a case study, we show that disinformation can indeed be used to orchestrate an attack wherein unwitting consumers synchronize their energy-usage patterns, resulting in blackouts on a city scale. These findings demonstrate that in an era when disinformation can be weaponized, system vulnerabilities arise not only from the hardware and software of critical infrastructure, but also from the behavior of the consumers.
Submitted 31 July, 2019;
originally announced August 2019.
-
Price of Anarchy in Algorithmic Matching of Romantic Partners
Authors:
Andrés Abeliuk,
Khaled Elbassioni,
Talal Rahwan,
Manuel Cebrian,
Iyad Rahwan
Abstract:
Algorithmic-matching sites offer users access to an unprecedented number of potential mates. However, they also pose a principal-agent problem with a potential moral hazard. The agent's interest is to maximize usage of the Web site, while the principal's interest is to find the best possible romantic partners. This creates a conflict of interest: optimally matching users would lead to stable couples and fewer singles using the site, which is detrimental for the online dating industry. Here, we borrow the notion of Price-of-Anarchy from game theory to quantify the decrease in social efficiency of online dating sites caused by the agent's self-interest. We derive theoretical bounds on the price-of-anarchy, showing it can be bounded by a constant that does not depend on the number of users of the dating site. This suggests that as online dating sites grow, their potential benefits scale up without sacrificing social efficiency. Further, we performed experiments involving human subjects in a matching market, and compared the social welfare achieved by an optimal matching service against a self-interest matching algorithm. We show that by introducing competition among dating sites, the selfish behavior of agents aligns with its users, and social efficiency increases.
Submitted 15 February, 2019; v1 submitted 8 January, 2019;
originally announced January 2019.
-
Attacking Similarity-Based Link Prediction in Social Networks
Authors:
Kai Zhou,
Tomasz P. Michalak,
Talal Rahwan,
Marcin Waniek,
Yevgeniy Vorobeychik
Abstract:
Link prediction is one of the fundamental problems in computational social science. A particularly common means to predict the existence of unobserved links is via structural similarity metrics, such as the number of common neighbors; node pairs with higher similarity are thus deemed more likely to be linked. However, a number of applications of link prediction, such as predicting links in gang or terrorist networks, are adversarial, with another party incentivized to minimize its effectiveness by manipulating observed information about the network. We offer a comprehensive algorithmic investigation of the problem of attacking similarity-based link prediction through link deletion, focusing on two broad classes of such approaches, one which uses only local information about target links, and another which uses global network information. While we show several variations of the general problem to be NP-hard for both local and global metrics, we exhibit a number of well-motivated special cases which are tractable. Additionally, we provide principled and empirically effective algorithms for the intractable cases, in some cases proving worst-case approximation guarantees.
Submitted 31 December, 2018; v1 submitted 21 September, 2018;
originally announced September 2018.
-
Attack Tolerance of Link Prediction Algorithms: How to Hide Your Relations in a Social Network
Authors:
Marcin Waniek,
Kai Zhou,
Yevgeniy Vorobeychik,
Esteban Moro,
Tomasz P. Michalak,
Talal Rahwan
Abstract:
Link prediction is one of the fundamental research problems in network analysis. Intuitively, it involves identifying the edges that are most likely to be added to a given network, or the edges that appear to be missing from the network when in fact they are present. Various algorithms have been proposed to solve this problem over the past decades. For all their benefits, such algorithms raise serious privacy concerns, as they could be used to expose a connection between two individuals who wish to keep their relationship private. With this in mind, we investigate the ability of such individuals to evade link prediction algorithms. More precisely, we study their ability to strategically alter their connections so as to increase the probability that some of their connections remain unidentified by link prediction algorithms. We formalize this question as an optimization problem, and prove that finding an optimal solution is NP-complete. Despite this hardness, we show that the situation is not bleak in practice. In particular, we propose two heuristics that can easily be applied by members of the general public on existing social media. We demonstrate the effectiveness of those heuristics on a wide variety of networks and against a plethora of link prediction algorithms.
Submitted 1 September, 2018;
originally announced September 2018.
-
The Preeminence of Ethnic Diversity in Scientific Collaboration
Authors:
Bedoor K AlShebli,
Talal Rahwan,
Wei Lee Woon
Abstract:
Inspired by the social and economic benefits of diversity, we analyze over 9 million papers and 6 million scientists to study the relationship between research impact and five classes of diversity: ethnicity, discipline, gender, affiliation, and academic age. Using randomized baseline models, we establish the presence of homophily in ethnicity, gender and affiliation. We then study the effect of diversity on scientific impact, as reflected in citations. Remarkably, of the classes considered, ethnic diversity had the strongest correlation with scientific impact. To further isolate the effects of ethnic diversity, we used randomized baseline models and again found a clear link between diversity and impact. To further support these findings, we use coarsened exact matching to compare the scientific impact of ethnically diverse papers and scientists with closely-matched control groups. Here, we find that ethnic diversity resulted in an impact gain of 10.63% for papers, and 47.67% for scientists.
Submitted 20 November, 2020; v1 submitted 6 March, 2018;
originally announced March 2018.
-
Game-theoretic Network Centrality: A Review
Authors:
Mateusz K. Tarkowski,
Tomasz P. Michalak,
Talal Rahwan,
Michael Wooldridge
Abstract:
Game-theoretic centrality is a flexible and sophisticated approach to identify the most important nodes in a network. It builds upon the methods from cooperative game theory and network theory. The key idea is to treat nodes as players in a cooperative game, where the value of each coalition is determined by certain graph-theoretic properties. Using solution concepts from cooperative game theory, it is then possible to measure how responsible each node is for the worth of the network.
The literature on the topic is already quite large, and is scattered among game-theoretic and computer science venues. We review the main game-theoretic network centrality measures from both bodies of literature and organize them into two categories: those that are more focused on the connectivity of nodes, and those that are more focused on the synergies achieved by nodes in groups. We present and explain each centrality, with a focus on algorithms and complexity.
Submitted 30 December, 2017;
originally announced January 2018.
-
Automatic HVAC Control with Real-time Occupancy Recognition and Simulation-guided Model Predictive Control in Low-cost Embedded System
Authors:
Muhammad Aftab,
Chien Chen,
Chi-Kin Chau,
Talal Rahwan
Abstract:
Intelligent building automation systems can reduce the energy consumption of heating, ventilation and air-conditioning (HVAC) units by sensing the comfort requirements automatically and scheduling the HVAC operations dynamically. Traditional building automation systems rely on fairly inaccurate occupancy sensors and basic predictive control using oversimplified building thermal response models, all of which prevent such systems from reaching their full potential. Such limitations can now be avoided due to the recent developments in embedded system technologies, which provide viable low-cost computing platforms with powerful processors and sizeable memory storage in a small footprint. As a result, building automation systems can now efficiently execute highly-sophisticated computational tasks, such as real-time video processing and accurate thermal-response simulations. With this in mind, we designed and implemented an occupancy-predictive HVAC control system in a low-cost yet powerful embedded system (using Raspberry Pi 3) to demonstrate the following key features for building automation: (1) real-time occupancy recognition using video-processing and machine-learning techniques, (2) dynamic analysis and prediction of occupancy patterns, and (3) model predictive control for HVAC operations guided by real-time building thermal response simulations (using an on-board EnergyPlus simulator). We deployed and evaluated our system for providing automatic HVAC control in the large public indoor space of a mosque, thereby achieving significant energy savings.
Submitted 17 August, 2017;
originally announced August 2017.
-
Hiding Individuals and Communities in a Social Network
Authors:
Marcin Waniek,
Tomasz Michalak,
Talal Rahwan,
Michael Wooldridge
Abstract:
The Internet and social media have fueled enormous interest in social network analysis. New tools continue to be developed and used to analyse our personal connections, with particular emphasis on detecting communities or identifying key individuals in a social network. This raises privacy concerns that are likely to worsen in the future. With this in mind, we ask the question: Can individuals or groups actively manage their connections to evade social network analysis tools?
By addressing this question, the general public may better protect their privacy, oppressed activist groups may better conceal their existence, and security agencies may better understand how terrorists escape detection. We first study how an individual can evade "network centrality" analysis without compromising his or her influence within the network. We prove that an optimal solution to this problem is hard to compute. Despite this hardness, we demonstrate that even a simple heuristic, whereby attention is restricted to the individual's immediate neighbourhood, can be surprisingly effective in practice. For instance, it could disguise Mohamed Atta's leading position within the WTC terrorist network by rewiring a strikingly small number of connections. Next, we study how a community can increase the likelihood of being overlooked by community-detection algorithms. We propose a measure of concealment, expressing how well a community is hidden, and use it to demonstrate the effectiveness of a simple heuristic, whereby members of the community either "unfriend" certain other members, or "befriend" some non-members, in a coordinated effort to camouflage their community.
Submitted 1 August, 2016;
originally announced August 2016.
-
Coalition Structure Generation on Graphs
Authors:
Talal Rahwan,
Tomasz P. Michalak
Abstract:
Two fundamental algorithm-design paradigms are Tree Search and Dynamic Programming. The techniques used therein have been shown to complement one another when solving the complete set partitioning problem, also known as the coalition structure generation problem [5]. Inspired by this observation, we develop in this paper an algorithm to solve the coalition structure generation problem on graphs, where the goal is to identify an optimal partition of a graph into connected subgraphs. More specifically, we develop a new depth-first search algorithm, and combine it with an existing dynamic programming algorithm due to Vinyals et al. [9]. The resulting hybrid algorithm is empirically shown to significantly outperform both its constituent parts when the subset-evaluation function happens to have certain intuitive properties.
Submitted 23 August, 2018; v1 submitted 23 October, 2014;
originally announced October 2014.
-
A Measure of Synergy in Coalitions
Authors:
Talal Rahwan,
Tomasz Michalak,
Michael Wooldridge
Abstract:
When the performance of a team of agents exceeds our expectations or falls short of them, we often explain this by saying that there was some synergy in the team, either positive (the team exceeded our expectations) or negative (they fell short). Our aim in this article is to develop a formal and principled way of measuring synergies, both positive and negative. Using characteristic function cooperative games as our underlying model, we present a formal measure of synergy, based on the idea that a synergy is exhibited when the performance of a team deviates from the norm. We then show that our synergy value is the only possible such measure that satisfies certain intuitive properties. Finally, we investigate some alternative characterisations of this measure.
Submitted 10 April, 2014;
originally announced April 2014.
-
Towards a Fair Allocation of Rewards in Multi-Level Marketing
Authors:
Talal Rahwan,
Victor Naroditskiy,
Tomasz Michalak,
Michael Wooldridge,
Nicholas R Jennings
Abstract:
An increasing number of businesses and organisations rely on existing users for finding new users or spreading a message. One of the widely used "refer-a-friend" mechanisms offers an equal reward to both the referrer and the invitee. This mechanism provides incentives for direct referrals and is fair to the invitee. On the other hand, multi-level marketing and recent social mobilisation experiments focus on mechanisms that incentivise both direct and indirect referrals. Such mechanisms share the reward for inviting a new member among the ancestors, usually in geometrically decreasing shares. A new member receives nothing at the time of joining. We study fairness in multi-level marketing mechanisms. We show how characteristic function games can be used to model referral marketing and how the canonical fairness concept of the Shapley value can be applied to this setting. We also establish the complexity of finding the Shapley value in each class, and provide a comparison of the Shapley value-based mechanism to existing referral mechanisms.
Submitted 2 April, 2014;
originally announced April 2014.
-
An Anytime Algorithm for Optimal Coalition Structure Generation
Authors:
Talal Rahwan,
Sarvapali Dyanand Ramchurn,
Nicholas Robert Jennings,
Andrea Giovannucci
Abstract:
Coalition formation is a fundamental type of interaction that involves the creation of coherent groupings of distinct, autonomous agents in order to efficiently achieve their individual or collective goals. Forming effective coalitions is a major research challenge in the field of multi-agent systems. Central to this endeavour is the problem of determining which of the many possible coalitions to form in order to achieve some goal. This usually requires calculating a value for every possible coalition, known as the coalition value, which indicates how beneficial that coalition would be if it were formed. Once these values are calculated, the agents usually need to find a combination of coalitions, in which every agent belongs to exactly one coalition, and by which the overall outcome of the system is maximized. However, this coalition structure generation problem is extremely challenging due to the number of possible solutions that need to be examined, which grows exponentially with the number of agents involved. To date, therefore, many algorithms have been proposed to solve this problem using different techniques ranging from dynamic programming, to integer programming, to stochastic search, all of which suffer from major limitations relating to execution time, solution quality, and memory requirements.
With this in mind, we develop an anytime algorithm to solve the coalition structure generation problem. Specifically, the algorithm uses a novel representation of the search space, which partitions the space of possible solutions into sub-spaces such that it is possible to compute upper and lower bounds on the values of the best coalition structures in them. These bounds are then used to identify the sub-spaces that have no potential of containing the optimal solution so that they can be pruned. The algorithm then searches through the remaining sub-spaces very efficiently using a branch-and-bound technique to avoid examining all the solutions within the searched sub-space(s). In this setting, we prove that our algorithm enumerates all coalition structures efficiently by avoiding redundant and invalid solutions automatically. Moreover, in order to effectively test our algorithm, we develop a new type of input distribution which allows us to generate more reliable benchmarks compared to the input distributions previously used in the field. Given this new distribution, we show that for 27 agents our algorithm is able to find solutions that are optimal in 0.175% of the time required by the fastest available algorithm in the literature. The algorithm is anytime, and if interrupted before it would have normally terminated, it can still provide a solution that is guaranteed to be within a bound from the optimal one. Moreover, the guarantees we provide on the quality of the solution are significantly better than those provided by the previous state-of-the-art algorithms designed for this purpose. For example, for the worst-case distribution given 25 agents, our algorithm is able to find a 90% efficient solution in around 10% of the time it takes to find the optimal solution.
Submitted 15 January, 2014;
originally announced January 2014.
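
To make the coalition structure generation problem in the abstract above concrete, here is a minimal brute-force sketch. It is deliberately not the anytime algorithm the abstract describes: it simply enumerates every partition of the agents and keeps the best one, which is only feasible for a handful of agents. The names `set_partitions`, `best_structure`, and `toy_value`, and the toy coalition values, are hypothetical.

```python
def set_partitions(agents):
    """Yield every partition of the list `agents` into non-empty coalitions."""
    if not agents:
        yield []
        return
    first, rest = agents[0], agents[1:]
    for partition in set_partitions(rest):
        # Option 1: `first` forms a coalition on its own.
        yield [[first]] + partition
        # Option 2: `first` joins one of the existing coalitions.
        for i in range(len(partition)):
            yield partition[:i] + [[first] + partition[i]] + partition[i + 1:]


def best_structure(agents, value):
    """Exhaustively search for the coalition structure with the highest total value."""
    best, best_val = None, float("-inf")
    for structure in set_partitions(agents):
        total = sum(value(frozenset(c)) for c in structure)
        if total > best_val:
            best, best_val = structure, total
    return best, best_val


def toy_value(coalition):
    # Hypothetical coalition values that depend only on coalition size.
    return {1: 1, 2: 5, 3: 4, 4: 6}[len(coalition)]


structure, total = best_structure([1, 2, 3, 4], toy_value)
print(structure, total)
```

The number of partitions is the Bell number of the agent count, so this exhaustive search becomes impractical very quickly; that growth is the motivation for bounding and pruning sub-spaces instead of examining every structure.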
-
Bounding the Estimation Error of Sampling-based Shapley Value Approximation
Authors:
Sasan Maleki,
Long Tran-Thanh,
Greg Hines,
Talal Rahwan,
Alex Rogers
Abstract:
The Shapley value is arguably the most central normative solution concept in cooperative game theory. It specifies a unique way in which the reward from cooperation can be "fairly" divided among players. While it has a wide range of real-world applications, its use is in many cases hampered by the hardness of its computation. A number of researchers have tackled this problem by (i) focusing on classes of games where the Shapley value can be computed efficiently, or (ii) proposing representation formalisms that facilitate such efficient computation, or (iii) approximating the Shapley value in certain classes of games. For the classical characteristic function representation, the only attempt to approximate the Shapley value for the general class of games is due to Castro et al. While this algorithm provides a bound on the approximation error, this bound is asymptotic, meaning that it only holds when the number of samples increases to infinity. On the other hand, when a finite number of samples is drawn, an unquantifiable error is introduced, meaning that the bound no longer holds. With this in mind, we provide non-asymptotic bounds on the estimation error for two cases: where (i) the variance, and (ii) the range, of the players' marginal contributions is known. Furthermore, for the second case, we show that when the range is significantly large relative to the Shapley value, the bound can be improved (from $O(\frac{r}{m})$ to $O(\sqrt{\frac{r}{m}})$). Finally, we propose stratified sampling as a way of improving the bounds further, and demonstrate its effectiveness.
Submitted 12 February, 2014; v1 submitted 18 June, 2013;
originally announced June 2013.
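
The sampling-based approximation discussed in the abstract above can be sketched as follows, assuming plain uniform sampling of player orderings with `m` samples. This is an illustrative sketch only: it does not implement the paper's bounds or its stratified sampling, and the names `exact_shapley`, `sampled_shapley`, `toy_game`, and the player weights are hypothetical.

```python
import random
from itertools import permutations
from math import factorial


def exact_shapley(players, v):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += v(with_p) - v(coalition)
            coalition = with_p
    n_orders = factorial(len(players))
    return {p: t / n_orders for p, t in totals.items()}


def sampled_shapley(players, v, m, seed=0):
    """Estimate Shapley values from m uniformly sampled orderings."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    order = list(players)
    for _ in range(m):
        rng.shuffle(order)
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            totals[p] += v(with_p) - v(coalition)
            coalition = with_p
    return {p: t / m for p, t in totals.items()}


# Hypothetical characteristic function: squared total weight of the coalition.
WEIGHTS = {"a": 1, "b": 2, "c": 3}


def toy_game(coalition):
    return float(sum(WEIGHTS[p] for p in coalition) ** 2)


players = list(WEIGHTS)
print(exact_shapley(players, toy_game))
print(sampled_shapley(players, toy_game, m=2000))
```

The exact computation visits every ordering and is only usable for very small games, whereas the sampled estimate trades accuracy for a cost that is linear in `m`; how far the estimate can be from the exact value for a finite `m` is the question the abstract addresses.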
-
Matching Games with Additive Externalities
Authors:
Simina Brânzei,
Tomasz P. Michalak,
Talal Rahwan,
Kate Larson,
Nicholas R. Jennings
Abstract:
Two-sided matchings are an important theoretical tool used to model markets and social interactions. In many real-life problems, the utility of an agent is influenced not only by their own choices, but also by the choices that other agents make. Such an influence is called an externality. Whereas fully expressive representations of externalities in matchings require exponential space, in this paper we propose a compact model of externalities, in which the influence of a match on each agent is computed additively. In this framework, we analyze many-to-many and one-to-one matchings under neutral, optimistic, and pessimistic behaviour, and provide both computational hardness results and polynomial-time algorithms for computing stable outcomes.
Submitted 16 July, 2012;
originally announced July 2012.