×

Inferring range of information diffusion based on historical frequent items. (English) Zbl 1494.62029

Summary: To estimate the range of information diffusion is critical for social network and user behavior analysis. Selecting nodes to constitute the range of information diffusion is challenging by the classic independent cascade and linear threshold models, due to the unknown topology of large-scale online social networks (OSNs). In this paper, we start from the mining of frequent itemsets in historical records of information diffusion, and adopt Bayesian network (BN) as the framework to represent and infer the implied dependence relations among frequent items. To make probabilistic inferences to infer the range, we first propose a greedy algorithm to select the observed nodes as the evidence of BN inference, for which we propose the metric of proximity degree and prove its submodularity. Then, we give the algorithm to construct the item-association BN (IABN) to represent the dependencies among frequent items. Following, we present an approximate algorithm to infer the range of information diffusion w.r.t. the observed nodes. Experimental results show that the observed nodes could be selected and the range of information diffusion could be inferred effectively. Empirical studies also demonstrate that our proposed IABN outperforms some state-of-the-art methods to obtain relatively complete nodes in the range of information diffusion.

MSC:

62R40 Topological data analysis
62H22 Probabilistic graphical models
91D30 Social networks; opinion dynamics
Full Text: DOI

References:

[1] Agarwal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings of the 20th international conference on very large databases (VLDB), pp 487-499
[2] Arnaboldi, V.; Passarella, A.; Conti, M., Online social networks: human cognitive constraints in Facebook and Twitter personal graphs (2015), Amsterdam: Elsevier, Amsterdam
[3] Chen W, Wang Y, Yang S (2009) Efficient influence maximization in social networks. In: Proceedings of the 15th ACMSIGKDD conference on knowledge discovery and data mining (SIGKDD), pp 199-208
[4] Christakis, Nicholas; Fowler, James, Social network sensors for early detection of contagious outbreaks, PLoS One, 5, e12948, 09 (2010) · doi:10.1371/journal.pone.0012948
[5] Cui P, Jin S, Yu L et al (2013) Cascading outbreak prediction in networks: a data-driven approach. In: Proceedings of the 19th ACM SIGKDD conference on knowledge discovery and data mining (SIGKDD), pp 901-909
[6] Devore, J., Probability and statistics for engineering and the sciences (2004), Davidson: Wadsworth Group, Davidson
[7] Filmus, Yuval, Inequalities on submodular functions via term rewriting, Inf Process Lett, 113, 13, 457-464 (2013) · Zbl 1371.90141 · doi:10.1016/j.ipl.2013.03.018
[8] George D, Hawkins J (2005) A hierarchical Bayesian model of in variant pattern recognition in the visual cortex. In: Proceedings of 2005 IEEE international joint conference on neural networks (IJCNN), pp 1812-1817
[9] Ha, C.; Wu, X.; Hu, X., Computing and pruning method for frequent pattern interestingness based on Bayesian networks, J Softw, 22, 12, 2934-2950 (2011) · doi:10.3724/SP.J.1001.2011.03978
[10] Han, J.; Kamber, M., Data mining: concepts and techniques (2001), Burlington: Morgan Kaufmann Publishers, Burlington · Zbl 1230.68018
[11] Han, J.; Cheng, H.; Xin, D., Frequent pattern mining: current status and future directions, Data Min Knowl Discov, 15, 1, 55-85 (2007) · doi:10.1007/s10618-006-0059-1
[12] Hasan M (2016) Methods and applications of network sampling
[13] Hernando, A.; Bobadilla, J.; Ortega, F., A non negative matrix factorization for collaborative filtering recommender systems based on a Bayesian probabilistic model, Knowl-Based Syst, 97, 188-202 (2016) · doi:10.1016/j.knosys.2015.12.018
[14] Hu, S.; Cautis, B.; Chen, Z., Model-free inference of diffusion networks using RKHS embeddings, Data Min Knowl Discov, 33, 499-525 (2019) · Zbl 1464.62250 · doi:10.1007/s10618-018-00611-1
[15] Kurant M, Gjoka M, Wang Y et al (2012) Coarse-grained topology estimation via graph sampling. In: Proceedings of the ACM SIGCOMM 2012 conference on data communication, pp 25-30
[16] Lee, G.; Yun, U.; Ruang, H., An uncertainty-based approach: frequent itemset mining from uncertain data with different item importance, Knowl-Based Syst, 90, 239-256 (2014) · doi:10.1016/j.knosys.2015.08.018
[17] Leskovec J, Faloutsos C (2006) Sampling from large graphs. In: Proceedings of the 12th ACMSIGKDD conference on knowledge discovery and data mining (SIGKDD), pp 631-636
[18] Leskovec J, Backstrom L, Kleinberg J (2009) Meme-tracking and the dynamics of the news cycle. In: Proceedings of the 15th ACM SIGKDD international conference on knowledge discovery and data mining (KDD), pp 497-506
[19] Liu, W.; Yue, K.; Liu, H., Associative categorization of frequent patterns based on the probabilistic graphical model, Front Comput Sci, 8, 2, 265-278 (2014) · doi:10.1007/s11704-014-3173-z
[20] Liu, W.; Yue, K.; Wu, H., Markov-network based latent link analysis for community detection in social behavioral interactions, Appl Intell, 48, 8, 2081-2096 (2018) · doi:10.1007/s10489-017-1040-y
[21] Maiya A, Berger-Wolf T (2010) Online sampling of high centrality individuals in social networks. In: Proceedings of the 14th Pacific-Asia knowledge discovery and data mining (PAKDD), pp 91-98
[22] Menon A, Chitrapura K, Garg S et al (2011) Response prediction using collaborative filtering with hierarchies and side-information. In: Proceedings of the 17th ACM SIGKDD conference on knowledge discovery and data mining (SIGKDD), pp 141-149
[23] Myers S, Zhu C, Leskovec J (2012) Information diffusion and external influence in networks. In: Proceedings of the 18th ACM SIGKDD conference on knowledge discovery and data mining (SIGKDD), pp 33-41
[24] Nemhauser, G.; Wolsey, L.; Fisher, M., An analysis of the approximations for maximizing submodular set functions, Math Program, 14, 265-294 (1978) · Zbl 0374.90045 · doi:10.1007/BF01588971
[25] Pearl, J., Probabilistic reasoning in intelligent system: networks of plausible inference (1988), Burlington: Morgan Kaufmann Publishers, Burlington · Zbl 0746.68089
[26] Rodrigues T, Benevenuto F, Cha M et al (2011) On word-of-mouth based discovery of the web. In: Proceedings of the ACM SIGCOMM on Internet measurement conference, pp 381-396
[27] Russell, J.; Norvig, P., Artificial intelligence: a modern approach (2011), Hoboken: Pearson, Hoboken · Zbl 0835.68093
[28] Smith S, Kao E, Shah D et al (2018) Influence estimation on social media networks using causal inference. In: Proceedings of IEEE statistical signal processing (SSP) workshop
[29] Vlasselaer, J.; Meert, W.; Broeck, G., Exploiting local and repeated structure in dynamic Bayesian networks, Artif Intell, 232, 43-53 (2016) · Zbl 1351.68288 · doi:10.1016/j.artint.2015.12.001
[30] Yang C, Tang J, Sun M et al (2019) Multi-scale information diffusion prediction with reinforced recurrent networks. In: Proceedings of the twenty-eighth international joint conference on artificial intelligence (IJCAI), pp 4033-4039
[31] Yang J, Leskovec J (2011) Patterns of temporal variation in online media. In: Proceedings of the 11th conference on web search and data mining (WSDM), pp 177-186
[32] Yin Z, Yue K, Wu H, Su Y (2018) Adaptive and parallel data acquisition from online big graphs. In: Proceedings of the 23rd international conference on database systems for advanced applications (DASFAA) (1), pp 323-331
[33] Yu K, Wu X, Ding W et al (2011) Causal associative classification. In: Proceedings of the 11th IEEE international conference on data mining (ICDM), pp 914-923
[34] Yu, L.; Cui, P.; Wang, F., Uncovering and predicting the dynamic process of information cascades with survival model, Knowl Inf Syst, 50, 2, 633-659 (2017) · doi:10.1007/s10115-016-0955-7
[35] Zhang Q, Gong Y, Wu J, et al. (2016) Retweet prediction with attention-based deep neural network. In: Proceedings of the 25th ACM international on conference on information and knowledge management (CIKM), pp 75-84
[36] Zhong E, Fan W, Wang J et al (2012) Comsoc: adaptive transfer of user behaviors over composite social network. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), pp 696-704
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.