Document Zbl 1497.68454

A model of fake data in data-driven analysis. (English) Zbl 1497.68454

J. Mach. Learn. Res. 21, Paper No. 3, 26 p. (2020).

Summary: Data-driven analysis is increasingly used in decision-making processes. With more sources, including reviews, news, and pictures, now available for data analysis, the authenticity of data sources is in doubt. While previous literature attempted to detect fake data piece by piece, in the current work we try to capture the fake data sender’s strategic behavior in order to detect the fake data source. Specifically, we model the tension between a data receiver who makes data-driven decisions and a fake data sender who benefits from misleading the receiver. We propose a potentially infinite-horizon, continuous-time game-theoretic model with asymmetric information to capture the fact that the receiver does not initially know of the existence of fake data and learns about it during the course of the game. We use point processes to model the data traffic, where each piece of data can arrive at any discrete moment in a continuous time flow. We fully solve the model and use numerical examples to illustrate the players’ strategies and payoffs. In particular, our results show that maintaining some suspicion about the data sources and understanding that the sender can be strategic are very helpful to the data receiver. In addition, based on our model, we propose a methodology for detecting fake data that is complementary to previous studies on this topic, which suggested various approaches for analyzing the data piece by piece. We show that, after analyzing each piece of data, understanding a source by looking at its whole history of pushing data can be helpful.

MSC:

68T20	Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.)
60G55	Point processes (e.g., Poisson, Cox, Hawkes processes)
91A26	Rationality and learning in game theory
91B06	Decision theory

Keywords:

data-driven analysis; fake data; game theory; point process

Software:

CVIPtools

References:

[1]	Hunt Allcott and Matthew Gentzkow. Social media and fake news in the 2016 election. Technical report, National Bureau of Economic Research, 2017.
[2]	Axel Anderson and Lones Smith. Dynamic deception. The American Economic Review, 103(7):2811-2847, 2013.
[3]	Athos Antonelli, Raffaele Cappelli, Dario Maio, and Davide Maltoni. Fake finger detection by skin distortion analysis. IEEE Transactions on Information Forensics and Security, 1(3):360-373, 2006.
[4]	Richard Bellman and Robert E Kalaba. Dynamic Programming and Modern Control Theory, volume 81. Citeseer, 1965. · Zbl 0079.35202
[5]	Daryl J Daley and David Vere-Jones. An Introduction to the Theory of Point Processes: Volume II: General Theory and Structure. Springer Science & Business Media, 2007. · Zbl 1159.60003
[6]	Aditya Ganjam, Faisal Siddiqui, Jibin Zhan, Xi Liu, Ion Stoica, Junchen Jiang, Vyas Sekar, and Hui Zhang. C3: Internet-scale control plane for video quality optimization. In USENIX Symposium on Networked Systems Design and Implementation, pages 131-144, 2015.
[7]	Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672-2680, 2014.
[8]	Junchen Jiang, Shijie Sun, Vyas Sekar, and Hui Zhang. Pytheas: Enabling data-driven quality of experience optimization using group-based exploration-exploitation. In USENIX Symposium on Networked Systems Design and Implementation, 2017.
[9]	Emir Kamenica and Matthew Gentzkow. Bayesian persuasion. The American Economic Review, 101(6):2590-2615, 2011. · Zbl 1393.91020
[10]	Jooyeon Kim, Behzad Tabibian, Alice Oh, Bernhard Schölkopf, and Manuel Gomez-Rodriguez. Leveraging the crowd to detect and reduce the spread of fake news and misinformation. In ACM International Conference on Web Search and Data Mining, pages 324-332. ACM, 2018.
[11]	Michael Luca and Georgios Zervas. Fake it till you make it: Reputation, competition, and Yelp review fraud. Management Science, 62(12):3412-3427, 2016.
[12]	Justin Malbon. Taking fake online consumer reviews seriously. Journal of Consumer Policy, 36(2):139-157, 2013.
[13]	Eric Maskin and Jean Tirole. Markov perfect equilibrium: I. Observable actions. Journal of Economic Theory, 100(2):191-219, 2001. · Zbl 1011.91022
[14]	Dina Mayzlin, Yaniv Dover, and Judith Chevalier. Promotional reviews: An empirical investigation of online review manipulation. The American Economic Review, 104(8):2421-2455, 2014.
[15]	Sendhil Mullainathan and Andrei Shleifer. Media bias. Technical report, National Bureau of Economic Research, 2002.
[16]	Shereen Oraby, Lena Reed, Ryan Compton, Ellen Riloff, Marilyn Walker, and Steve Whittaker. And that’s a fact: Distinguishing factual and emotional argumentation in online dialogue. arXiv preprint arXiv:1709.05295, 2017.
[17]	Niels Ott, Ramon Ziai, Michael Hahn, and Detmar Meurers. Comet: Integrating different levels of linguistic modeling for meaning assessment. In Joint Conference on Lexical and Computational Semantics, Volume 2: International Workshop on Semantic Evaluation, volume 2, pages 608-616, 2013.
[18]	Victoria Rubin, Niall Conroy, Yimin Chen, and Sarah Cornwell. Fake news or truth? Using satirical cues to detect potentially misleading news. In Workshop on Computational Approaches to Deception Detection, pages 7-17, 2016.
[19]	Natali Ruchansky, Sungyong Seo, and Yan Liu. Csi: A hybrid deep model for fake news detection. In ACM on Conference on Information and Knowledge Management, pages 797-806. ACM, 2017.
[20]	Zhan Shi, Gene Moo Lee, and Andrew B Whinston. Toward a better measure of business proximity: Topic modeling for industry intelligence. MIS Quarterly, 40(4), 2016.
[21]	Kai Shu, Deepak Mahudeswaran, Suhang Wang, Dongwon Lee, and Huan Liu. Fakenewsnet: A data repository with news content, social context and dynamic information for studying fake news on social media. arXiv preprint arXiv:1809.01286, 2018.
[22]	Jost Tobias Springenberg. Unsupervised and semi-supervised learning with categorical generative adversarial networks. arXiv preprint arXiv:1511.06390, 2015.
[23]	Scott E Umbaugh. Computer Vision and Image Processing: A Practical Approach Using CVIPtools with CD-ROM. Prentice Hall PTR, 1997.
[24]	Lizhen Xu, Jason A Duan, and Andrew Whinston. Path to purchase: A mutually exciting point process model for online advertising and conversion. Management Science, 60(6):1392-1412, 2014.
[25]	Hu Zhang, Zhuohua Fan, Jiaheng Zheng, and Quanming Liu. An improving deception detection method in computer-mediated communication. Journal of Networks, 7(11):1811, 2012.

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible, without claiming completeness or a perfect matching.