The risk of racial bias while tracking influenza-related content on social media using machine learning

B Lwowski, A Rios - Journal of the American Medical Informatics Association, 2021 - academic.oup.com
Objective
Machine learning is used to understand and track influenza-related content on social media. Because these systems are used at scale, they have the potential to adversely impact the people they are built to help. In this study, we explore the biases of different machine learning methods for the specific task of detecting influenza-related content. We compare the performance of each model on tweets written in Standard American English (SAE) vs African American English (AAE).
Materials and Methods
Two influenza-related datasets are used to train 3 text classification models (support vector machine, convolutional neural network, bidirectional long short-term memory) with different feature sets. The datasets match real-world scenarios in which there is a large imbalance between SAE and AAE examples. The number of AAE examples for each class ranges from 2% to 5% in both datasets. We also evaluate each model's performance using a balanced dataset via undersampling.
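The undersampling step described above can be illustrated with a minimal Python sketch. This is not the paper's code; the function name, the "dialect" field, and the SAE/AAE tags as a grouping variable are illustrative assumptions. It simply caps every dialect group at the size of the smallest group before evaluation.

```python
import random

def undersample_by_group(examples, group_key="dialect", seed=0):
    """Cap every group at the size of the smallest group (e.g., the AAE subset)."""
    rng = random.Random(seed)
    by_group = {}
    for ex in examples:
        by_group.setdefault(ex[group_key], []).append(ex)
    n_min = min(len(items) for items in by_group.values())
    balanced = []
    for items in by_group.values():
        balanced.extend(rng.sample(items, n_min))
    rng.shuffle(balanced)
    return balanced

# Toy example: a 95/5 SAE/AAE split, mirroring the 2%-5% imbalance described above.
tweets = (
    [{"text": f"sae tweet {i}", "dialect": "SAE", "label": i % 2} for i in range(95)]
    + [{"text": f"aae tweet {i}", "dialect": "AAE", "label": i % 2} for i in range(5)]
)
balanced = undersample_by_group(tweets)
print(len(balanced))  # 10: 5 SAE + 5 AAE
```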
Results
We find that all of the tested machine learning methods are biased on both datasets. The difference in false positive rates between SAE and AAE examples ranges from 0.01 to 0.35, and the difference in false negative rates ranges from 0.01 to 0.23. We also find that the neural network methods generally produce more unfair results than the linear support vector machine on the chosen datasets.
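A minimal sketch of the per-group fairness comparison reported above: the absolute gap in false positive rate and false negative rate between the SAE and AAE subsets. The function names and the dialect labels passed alongside the predictions are illustrative assumptions, not the paper's implementation.

```python
def rates(y_true, y_pred):
    """Return (false positive rate, false negative rate) for one group."""
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    fnr = fn / (fn + tp) if (fn + tp) else 0.0
    return fpr, fnr

def fairness_gaps(y_true, y_pred, dialects):
    """Absolute FPR/FNR differences between the SAE and AAE subsets."""
    def subset(group):
        idx = [i for i, d in enumerate(dialects) if d == group]
        return [y_true[i] for i in idx], [y_pred[i] for i in idx]
    fpr_sae, fnr_sae = rates(*subset("SAE"))
    fpr_aae, fnr_aae = rates(*subset("AAE"))
    return abs(fpr_sae - fpr_aae), abs(fnr_sae - fnr_aae)

# Toy usage with made-up labels and predictions.
y_true = [1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 1]
dialects = ["SAE", "SAE", "SAE", "AAE", "AAE", "AAE"]
print(fairness_gaps(y_true, y_pred, dialects))  # (FPR gap, FNR gap)
```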
Conclusions
The models that result in the most unfair predictions may vary from dataset to dataset. Practitioners should be aware of the potential harms related to applying machine learning to health-related social media data. At a minimum, we recommend evaluating fairness along with traditional evaluation metrics.