research-article

ELEVATE: A Framework for Entity-level Event Diffusion Prediction into Foreign Language Communities

Published: 25 June 2017 Publication History

Abstract

The accessibility of news via the Web or other "traditional" media allows a rapid diffusion of information into almost every part of the world. These reports cover the full spectrum of events, ranging from locally relevant ones to those that gain global attention. The societal impact of an event can be "measured" relatively easily by the attention it attracts in the news or on social media (e.g., the number of responses it receives or provokes). However, this does not necessarily reflect its inter-cultural impact and its diffusion into other communities. To address the problem of predicting the spread of information into foreign language communities, we introduce the ELEVATE framework. ELEVATE exploits entity information from Web contents and harnesses location-related data for language-related event diffusion prediction. Our experiments on event spreading across Wikipedia communities of different languages demonstrate the viability of our approach and its improvement over state-of-the-art approaches.

Cited By

  • (2020) Towards a Better Contextualization of Web Contents via Entity-Level Analytics. Advances in Information Retrieval. 10.1007/978-3-030-45442-5_80 (613-618). Online publication date: 8-Apr-2020
  • (2018) ELEVATE-Live: Assessment and Visualization of Online News Virality via Entity-Level Analytics. Web Engineering. 10.1007/978-3-319-91662-0_40 (482-486). Online publication date: 20-May-2018
  • (2018) Towards Better Understanding Researcher Strategies in Cross-Lingual Event Analytics. Digital Libraries for Open Knowledge. 10.1007/978-3-030-00066-0_12 (139-151). Online publication date: 5-Sep-2018

      Published In

      WebSci '17: Proceedings of the 2017 ACM on Web Science Conference
      June 2017
      438 pages
      ISBN:9781450348966
      DOI:10.1145/3091478

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. multilingual web data
      2. societal events analysis

      Qualifiers

      • Research-article

      Conference

      WebSci '17
      WebSci '17: ACM Web Science Conference
      June 25 - 28, 2017
      Troy, New York, USA

      Acceptance Rates

      WebSci '17 Paper Acceptance Rate 30 of 85 submissions, 35%;
      Overall Acceptance Rate 245 of 933 submissions, 26%
