
Mining periodic patterns and cascading bursts phenomenon in individual e-mail communication. (English) Zbl 1516.62412

Summary: A quantitative understanding of human activity is important because many social and economic trends are driven by human actions. We propose a novel stochastic process, the Multi-state Markov Cascading Non-homogeneous Poisson Process (M2CNPP), to analyze human e-mail communication that exhibits both periodic patterns and bursts. The model parameters are estimated with the Generalized Expectation Maximization (GEM) algorithm, with the hidden states treated as missing values. Empirical results show that the proposed model adequately captures the major temporal cascading features as well as the periodic patterns in e-mail communication.
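
Illustration: the summary does not give the M2CNPP specification, so the following is only a minimal sketch of one ingredient it names, a non-homogeneous Poisson process with a periodic intensity, simulated by the standard thinning method. The sinusoidal rate, its parameter values, and the function names `periodic_rate` and `simulate_nhpp` are assumptions made for illustration. The sketch does not include the hidden Markov states, the cascading component, or the GEM estimation.

```python
import math
import random


def periodic_rate(t, base=2.0, amplitude=1.5, period=24.0):
    """Hypothetical intensity (events per hour) with a daily cycle.

    With the default values the rate stays between 0.5 and 3.5.
    """
    return base + amplitude * math.sin(2.0 * math.pi * t / period)


def simulate_nhpp(rate, rate_max, horizon, rng):
    """Simulate event times on [0, horizon) by thinning.

    rate_max must be an upper bound on rate(t) over the whole interval.
    """
    events = []
    t = 0.0
    while True:
        # Draw the next candidate from a homogeneous process with rate rate_max.
        t += rng.expovariate(rate_max)
        if t >= horizon:
            return events
        # Accept the candidate with probability rate(t) / rate_max.
        if rng.random() * rate_max <= rate(t):
            events.append(t)


rng = random.Random(0)  # fixed seed so the run is reproducible
events = simulate_nhpp(periodic_rate, rate_max=3.5, horizon=7 * 24.0, rng=rng)
print(len(events))
```

Over one simulated week the expected number of events is the integral of the rate, here 2.0 × 168 = 336, because the sinusoidal term averages to zero over whole periods. A bursty, state-dependent model such as the one summarized above would replace the single fixed rate with rates that switch according to hidden states.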

MSC:

62-XX Statistics

Software:

plfit
