
A commonsense knowledge-enabled textual analysis approach for financial market surveillance. (English) Zbl 1415.91336

Summary: Market surveillance systems (MSSs) are increasingly used to monitor trading activities in financial markets to maintain market integrity. Existing MSSs primarily focus on statistical analysis of market activity data and largely ignore textual market information, including, but not limited to, news reports and various social media. As suggested by both theoretical explorations in finance and prevailing market surveillance practice, unstructured market information holds major yet underexplored opportunities for surveillance. In this paper, we propose a news analysis approach with the help of commonsense knowledge to assess the risk of suspicious transactions identified in market activity analysis. Our approach explicitly models semantic relations between transactions and news articles and provides semantic references to words in news articles. We conducted experiments using data collected from a real-world market and found that our proposed approach significantly outperforms the existing methods, which are based on transaction characteristics or traditional textual analysis methods. Experiments also show that the performance advantage of the proposed approach mainly comes from the modeling of news-transaction relationships. The research contributes to the market surveillance literature and has significant practical implications.

MSC:

91G99 Actuarial science and mathematical finance
91-04 Software, source code, etc. for problems pertaining to game theory, economics, and finance

Software:

DBpedia
Full Text: DOI

References:

[1] Admati AR, Pfleiderer P (1988) Selling and trading on information in financial markets. Amer. Econom. Rev. 78(2):96-103.
[2] Aggarwal RK, Wu G (2004) Stock market manipulation–theory and evidence. American Finance Assoc. Annual Meeting, San Diego, CA.
[3] Antweiler W, Frank MZ (2004) Is all that talk just noise? The information content of internet stock message boards. J. Finance 59(3):1259-1294. CrossRef
[4] Auer S, Bizer C, Lehmann J, Kobilarov G, Cyganiak R, Ives Z (2007) DBpedia: A nucleus for a web of open data. Semantic Web: 6th Internat. Semantic Web Conf., Lecture Notes in Computer Science, Vol. 4825 (Springer, Berlin), 722-735. CrossRef
[5] Barberis N, Shleifer A, Vishny R (1998) A model of investor sentiment. J. Financial Econom. 49(3):307-343. CrossRef
[6] Bhogal J, Macfarlane A, Smith P (2007) A review of ontology based query expansion. Inform. Processing Management 43(4):866-886. CrossRef
[7] Brice P, Jiang W, Wan G (2011) A cluster-based context-tree model for multivariate data streams with applications to anomaly detection. INFORMS J. Comput. 23(3):364-376. Link · Zbl 1244.94018
[8] Chen CL, Tseng FSC, Liang T (2010) An integration of WordNet and fuzzy association rule mining for multi-label document clustering. Data Knowledge Engrg. 69(11):1208-1226. CrossRef
[9] Comerton-Forde C, Rydge J (2006) Market integrity and surveillance effort. J. Financial Services Res. 29(2):149-172. CrossRef
[10] Cornell B, Sirri ER (1992) The reaction of investors and stock prices to insider trading. J. Finance 47(3):1031-1059. CrossRef
[11] Curtis J, Cabral J, Baxter D (2006) On the application of the Cyc ontology to word sense disambiguation. Sutcliffe G, Goebel R, eds. Proc. 19th Internat. Florida Artificial Intelligence Res. Soc. Conf. (AAAI Press, Menlo Park, CA), 652-657.
[12] Das SR, Chen MY (2007) Yahoo! for Amazon: Sentiment extraction from small talk on the web. Management Sci. 53(9):1375-1388. Link
[13] Davis E (1990) Representations of Commonsense Knowledge (Morgan Kaufmann, San Francisco, CA).
[14] Gabrilovich E, Markovitch S (2007) Harnessing the expertise of 70,000 human editors: Knowledge-based feature generation for text categorization. J. Machine Learn. Res. 8:2297-2345.
[15] Gidofalvi G (2001) Using news articles to predict stock price movements. Technical report, Department of Computer Science and Engineering, University of California, San Diego.
[16] Goldberg H, Kirkland J, Lee D, Shyr P, Thakker D (2003) The NASD securities observation, news analysis and regulation system. Conf. Innovative Appl. Artificial Intelligence, Acapulco, Mexico.
[17] Green T (2004) Economic news and the impact of trading on bond prices. J. Finance 59(3):1203-1234. CrossRef
[18] Gupta R, Ratinov L (2008) Text categorization with knowledge transfer from heterogeneous data sources. National Conf. Artificial Intelligence, Chicago, Illinois.
[19] Halliday M, Hasan R (1989) Language, Context, and Text: Aspects of Language in a Social-Semiotic Perspective (Oxford University Press, Oxford, UK).
[20] Hong H, Lim T, Stein JC (2000) Bad news travels slowly: Size, analyst coverage, and the profitability of momentum strategies. J. Finance 55(1):265-295. CrossRef
[21] Hung CL, Wermter S, Smith P (2004) Hybrid neural document clustering using guided self-organization and WordNet. IEEE Intelligent Systems 19(2):68-77. CrossRef
[22] Järvelin K, Kekäläinen J, Niemi T (2001) ExpansionTool: Concept-based query expansion and construction. Inform. Retrieval 4(3-4):231-255. CrossRef
[23] Kirkland JD, Senator TE, Hayden JJ, Dybala T, Goldberg HG, Shyr P (1999) The NASD regulation advanced-detection system (ADS). AI Magazine 20(1):55-67.
[24] Lenat DB, Guha RV (1989) Building Large Knowledge-Based Systems: Representation and Inference in the Cyc Project (Addison-Wesley Longman, Boston).
[25] Lucas HC (1993) Market expert surveillance system. Comm. ACM 36(12):27-34. CrossRef
[26] Mangkorntong P, Rabhi F (2009) A domain-driven approach for detecting event patterns in e-markets. World Wide Web 12(1):69-86. CrossRef
[27] Milosavljevic M, Delort J-Y, Hachey B, Arunasalam B, Radford W, Curran JR (2010) Automating financial surveillance. User Centric Media: First International Conference, UCMedia (Springer, Berlin), 305-311. CrossRef
[28] Mittermayer MA, Knolmayer GF (2006) NewsCATS: A news categorization and trading system. Internat. Conf. Data Mining, Hong Kong. CrossRef
[29] Ogut H, Doganay MM, Aktas R (2009) Detecting stock-price manipulation in an emerging market: The case of Turkey. Expert Systems with Appl. 36(9):11944-11949. CrossRef
[30] Palshikar GK, Apte MM (2008) Collusion set detection using graph clustering. Data Mining Knowledge Discovery 16(2):135-164. CrossRef
[31] Perfetti CA (1999) Comprehending written language: A blueprint of the reader. Brown C, Hagoort P, eds. The Neurocognition of Language Processing (Oxford University Press, Oxford, UK), 167-208.
[32] Pirrong C (2004) Detecting manipulation in futures markets: The Ferruzzi soybean episode. Amer. Law Econom. Rev. 6(1):28-71. CrossRef
[33] Qu J, Qin W, Feng Y, Sai Y (2009) An outlier detection method based on Voronoi diagram for financial surveillance. Internat. Workshop Intelligent Systems Appl. (IEEE, Wuhan, China), 1-4. CrossRef
[34] Rodier M (2011) Insider fraud: Watch out, your bank could be the next one to lose billions. InformationWeek, WallStreet and Technology. Accessed November 2012, http://www.wallstreetandtech.com/articles/231700249.
[35] Schumaker RP, Chen HC (2009) Textual analysis of stock market prediction using breaking financial news: The AZFinText system. ACM Trans. Inform. Systems 27(2):Article no. 12. CrossRef
[36] Selvaretnam B, Belkhatir M (2012) Natural language technology and query expansion: Issues, state-of-the-art and perspectives. J. Intelligent Inform. Systems 38(3):709-740. CrossRef
[37] Seo HC, Chung HJ, Rim HC, Myaeng SH, Kim SH (2004a) Unsupervised word sense disambiguation using WordNet relatives. Comput. Speech Language 18(3):253-273. CrossRef
[38] Seo YW, Giampapa JA, Sycara KP (2004b) Financial news analysis for intelligent portfolio management. Technical report CMU-RI-TR-04-04, Robotics Institute, Carnegie Mellon University, Pittsburgh, PA.
[39] Steinert-Threlkeld T (2011) Spending on surveillance systems to grow at double-digit rates. Securities Technology Monitor. Accessed March 2012, http://www.securitiestechnologymonitor.com/news/tabb-market-surveillance-systems-27153-1.html.
[40] Tetlock PC (2007) Giving content to investor sentiment: The role of media in the stock market. J. Finance 62(3):1139-1168. CrossRef
[41] Tetlock PC, Saar-Tsechansky M, Macskassy S (2008) More than words: Quantifying language to measure firms’ fundamentals. J. Finance 63(3):1437-1467. CrossRef
[42] Verhoeven L, Perfetti C (2008) Advances in text comprehension: Model, process and development. Appl. Cognitive Psych. 22(3):293-301. CrossRef
[43] Watts DJ (2002) A simple model of global cascades on random networks. PNAS 99(9):5766-5771. CrossRef · Zbl 1022.90001
[44] Xu SX, Zhang X(M) (2013) Impact of Wikipedia on market information environment: Evidence on management disclosure and investor reaction. MIS Quart. 37(4):1043-1068.
[45] Zheng HT, Kang BY, Kim HG (2009) Exploiting noun phrases and semantic relationships for text document clustering. Inform. Sci. 179(13):2249-2262. CrossRef