-
AI Does Not Alter Perceptions of Text Messages
Authors:
N'yoma Diamond
Abstract:
For many people, anxiety, depression, and other social and mental factors can make composing text messages an active challenge. To remedy this problem, large language models (LLMs) may yet prove to be the perfect tool to assist users who would otherwise find texting difficult or stressful. However, despite rapid uptake in LLM usage, considerations for their assistive usage in text message composition have not been explored. A primary concern is that poor public sentiment regarding AI may harm perceptions of AI-assisted text messages, making such usage counter-productive. To test this possibility, we explore how the belief that a text message did or did not receive AI assistance in composition alters its perceived tone, clarity, and ability to convey intent. In this study, we survey the perceptions of 26 participants on 18 randomly labeled pre-composed text messages. In analyzing the participants' ratings of message tone, clarity, and ability to convey intent, we find no statistically significant evidence that the belief that AI was used alters recipient perceptions. This provides hopeful evidence that LLM-based text message composition assistance can be implemented without the risk of counter-productive outcomes.
Submitted 7 February, 2024; v1 submitted 27 January, 2024;
originally announced February 2024.
-
General Performance Evaluation for Competitive Resource Allocation Games via Unseen Payoff Estimation
Authors:
N'yoma Diamond,
Fabricio Murai
Abstract:
Many high-stakes decision-making problems, such as those found within cybersecurity and economics, can be modeled as competitive resource allocation games. In these games, multiple players must allocate limited resources to overcome their opponent(s), while minimizing any induced individual losses. However, existing means of assessing the performance of resource allocation algorithms are highly disparate and problem-dependent. As a result, evaluating such algorithms is unreliable or impossible in many contexts and applications, especially when considering differing levels of feedback. To resolve this problem, we propose a generalized definition of payoff which uses an arbitrary user-provided function. This unifies performance evaluation under all contexts and levels of feedback. Using this definition, we develop metrics for evaluating player performance, and estimators to approximate them under uncertainty (i.e., bandit or semi-bandit feedback). These metrics and their respective estimators provide a problem-agnostic means to contextualize and evaluate algorithm performance. To validate the accuracy of our estimators, we explore the Colonel Blotto ($\mathcal{CB}$) game as an example. To this end, we propose a graph-pruning approach to efficiently identify feasible opponent decisions, which are used in computing our estimation metrics. Using various resource allocation algorithms and game parameters, a suite of $\mathcal{CB}$ games is simulated and used to compute and evaluate the quality of our estimates. These simulations empirically show our approach to be highly accurate at estimating the metrics associated with the unseen outcomes of an opponent's latent behavior.
Submitted 8 January, 2024;
originally announced January 2024.
-
Using Intermarket Data to Evaluate the Efficient Market Hypothesis with Machine Learning
Authors:
N'yoma Diamond,
Grant Perkins
Abstract:
In its semi-strong form, the Efficient Market Hypothesis (EMH) implies that technical analysis will not reveal any hidden statistical trends via intermarket data analysis. If technical analysis on intermarket data reveals trends which can be leveraged to significantly outperform the stock market, then the semi-strong EMH does not hold. In this work, we utilize a variety of machine learning techniques to empirically evaluate the EMH using stock market, foreign currency (Forex), international government bond, index future, and commodities future assets. We train five machine learning models on each dataset and analyze the average performance of these models for predicting the direction of future S&P 500 movement as approximated by the SPDR S&P 500 Trust ETF (SPY). From our analysis, models trained on the datasets containing bonds, index futures, and/or commodities futures data outperform baselines by substantial margins. Further, we find that the usage of intermarket data induces statistically significant positive impacts on the accuracy, macro F1 score, weighted F1 score, and area under the receiver operating characteristic curve for a variety of models at the 95% confidence level. This provides strong empirical evidence contradicting the semi-strong EMH.
Submitted 20 December, 2022; v1 submitted 16 December, 2022;
originally announced December 2022.