When are there too many collisions? Variants of the birthday problem. (English) Zbl 07880526

Summary: Due to restrictions on the use of unique identifiers of individuals in data sets, there may be instances in which two or more data sets have some of the individuals in common, with no direct way to detect such occurrences. More generally, a collision occurs when two or more observations are in agreement with respect to variables associated with the observations. This article discusses several possible statistical/probabilistic approaches to determining when the number of collisions (or near-collisions) exceeds what would be expected by chance if in fact the observations are all distinct. The methods and results are related to the Birthday Problem and to Occupancy Problems.
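As a point of orientation only, the baseline idea of comparing an observed collision count with what chance alone would produce can be sketched for the classical Birthday Problem. This is not the article's method. It assumes n observations falling independently and uniformly into d equally likely categories, and it uses a Poisson approximation for the number of colliding pairs. The function names are hypothetical.

```python
import math


def expected_colliding_pairs(n, d):
    """Expected number of agreeing pairs among n observations when each
    falls independently and uniformly into one of d categories (assumption)."""
    return math.comb(n, 2) / d


def prob_any_collision(n, d):
    """Exact probability that at least two of n observations agree,
    under the same uniform, independent assumption."""
    if n > d:
        return 1.0
    p_all_distinct = 1.0
    for k in range(n):
        p_all_distinct *= (d - k) / d
    return 1.0 - p_all_distinct


def poisson_tail(observed_pairs, n, d):
    """Approximate P(number of colliding pairs >= observed_pairs), treating
    the pair count as Poisson with mean C(n, 2) / d (an approximation)."""
    lam = expected_colliding_pairs(n, d)
    cdf_below = sum(
        math.exp(-lam) * lam**j / math.factorial(j)
        for j in range(observed_pairs)
    )
    return 1.0 - cdf_below


if __name__ == "__main__":
    n, d = 23, 365  # the classical birthday setting
    print(expected_colliding_pairs(n, d))
    print(prob_any_collision(n, d))
    print(poisson_tail(3, n, d))
```

A small tail probability from `poisson_tail` would suggest more collisions than chance alone explains, but only to the extent that the uniformity and independence assumptions are reasonable for the data at hand.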

MSC:

62-XX Statistics
Full Text: DOI

References:

[1] Abramson, M., and Moser, W. O. J. 1970. More birthday surprises. The American Mathematical Monthly 77 (8):856-8. · Zbl 0311.60009
[2] Allenby, R. B. J. T., and Slomson, A. 2010. How to count: An introduction to combinatorics. 2nd ed. London: Chapman and Hall/CRC.
[3] Anthonisen, N. R. 1994. Effects of smoking intervention and use of an inhaled anticholinergic bronchodilator on the rate of decline of FEV1. The Lung Health Study. The Journal of the American Medical Association 272 (19):1497-505.
[4] Borja, M. C., and Haigh, J. 2007. The birthday problem. Significance. Royal Society 4 (3):124-7.
[5] Canny, J. n.d. Lecture 5: Occupancy problems. https://people.eecs.berkely.edu/~jfc/cs174/lecs/lec5.pdf
[6] Eaton, J., Godbole, A. P., and Sinclair, B. 2010. Competition between discrete random variables, with applications to occupancy problems. Journal of Statistical Planning and Inference 140 (8):2204-12. · Zbl 1191.62016
[7] Fang, K.-T. 1982. A restricted occupancy problem. Journal of Applied Probability 19 (3):707-11. · Zbl 0494.60011
[8] Johnson, N. L., and Kotz, S. 1977. Urn models and their applications. New York: Wiley. · Zbl 0352.60001
[9] Mathis, F. H. 1991. A generalized birthday problem. SIAM Review 33 (2):265-70. · Zbl 0725.60012
[10] McKinney, E. H. 1966. Generalized birthday problem. The American Mathematical Monthly 73 (4):385-7.
[11] Wendl, M. C. 2003. Collision probability between sets of random variables. Statistics & Probability Letters 64 (3):249-54. · Zbl 1113.60302