
Statistical analysis of the Hirsch index. (English) Zbl 1253.62103

Summary: The J.E. Hirsch [Proc. Natl. Acad. Sci. USA 102, 16569–16572 (2005)] index (commonly referred to as the h-index) is a bibliometric indicator that is widely recognized as effective for measuring the scientific production of a scholar, because it summarizes both the size and the impact of the research output. In a formal setting, the h-index is an empirical functional of the distribution of the citation counts received by the scholar. Under this approach, the asymptotic theory for the empirical h-index has recently been studied for the case in which the citation counts follow a continuous distribution and, in particular, variance estimation has been considered for the Pareto-type and the Weibull-type distribution families. However, in bibliometric applications, citation counts follow a distribution supported on the integers. Thus, we provide general properties for the empirical h-index in both the small- and large-sample settings. In addition, we introduce consistent nonparametric variance estimation, which allows for the implementation of large-sample set estimation for the theoretical h-index.
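To make the notion of the empirical h-index concrete, the following minimal Python sketch computes it from a list of integer citation counts, using the standard definition from Hirsch (2005): the largest h such that at least h papers have at least h citations each. The function name `h_index` and the sample data are illustrative assumptions and do not come from the paper; the paper's variance estimator is not reproduced here.

```python
def h_index(citations):
    """Return the empirical h-index of a collection of citation counts.

    Illustrative helper (name and interface are assumptions, not the
    paper's notation): the h-index is the largest h such that at least
    h papers have received at least h citations each.
    """
    # Sort counts from most cited to least cited.
    counts = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(counts, start=1):
        if count >= rank:
            # The top `rank` papers all have at least `rank` citations.
            h = rank
        else:
            # Counts are non-increasing, so no larger rank can qualify.
            break
    return h


# Hypothetical citation record of a scholar with five papers.
example_counts = [10, 8, 5, 4, 3]
example_h = h_index(example_counts)
```

Because the function depends on the data only through the sorted counts, it is a functional of the empirical distribution of citation counts, which is the viewpoint taken in the summary.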

MSC:

62P99 Applications of statistics
62G05 Nonparametric estimation

References:

[1] Adler, Citation statistics, Statist. Sci. 24 pp 1– (2009) · Zbl 1327.01054 · doi:10.1214/09-STS285
[2] Ball, Achievement index climbs the ranks, Nature 448 pp 737– (2007) · doi:10.1038/448737a
[3] Barcza, Paretian publication patterns imply Paretian Hirsch index, Scientometrics 81 pp 513– (2009) · doi:10.1007/s11192-008-2175-8
[4] Beirlant, Asymptotics for the Hirsch index, Scand. J. Stat. 37 pp 355– (2010) · Zbl 1349.62594 · doi:10.1111/j.1467-9469.2010.00694.x
[5] Beirlant, Statistics of extremes: theory and applications (2004) · Zbl 1070.62036 · doi:10.1002/0470012382
[6] Braun, A Hirsch-type index for journals, Scientometrics 69 pp 169– (2006) · doi:10.1007/s11192-006-0147-4
[7] Christoph, Discrete stable random variables, Statist. Prob. Lett. 37 pp 243– (1998) · Zbl 1246.60026 · doi:10.1016/S0167-7152(97)00123-5
[8] Costas, The h-index: advantages, limitations and its relation with other bibliometric indicators at the micro level, J. Informetr. 1 pp 193– (2007) · doi:10.1016/j.joi.2007.02.001
[9] DasGupta, Asymptotic theory of statistics and probability (2008) · Zbl 1154.62001
[10] Egghe, Power laws in the information production process (2005)
[11] Glänzel, On the h-index - a mathematical approach to a new measure of publication activity and citation impact, Scientometrics 67 pp 315– (2006) · doi:10.1007/s11192-006-0102-4
[12] Hall, Comment: citation statistics, Statist. Sci. 24 pp 25– (2009) · Zbl 1327.01059 · doi:10.1214/09-STS285D
[13] Hirsch, An index to quantify an individual’s scientific research output, Proc. Natl. Acad. Sci. USA 102 pp 16569– (2005) · Zbl 1355.01034 · doi:10.1073/pnas.0507655102
[14] Marcheselli, Asymptotic results in jackknifing non-smooth functions of the sample mean vector, Ann. Statist. 31 pp 1885– (2003) · Zbl 1042.62047 · doi:10.1214/aos/1074290330
[15] Marcheselli, Parameter estimation for the discrete stable family, Comm. Statist. Theory Methods 37 pp 815– (2008) · Zbl 1135.62025 · doi:10.1080/03610920701570298
[16] Molinari, A new methodology for ranking scientific institutions, Scientometrics 75 pp 163– (2008) · doi:10.1007/s11192-007-1853-2
[17] Nejati, A two-dimensional approach to evaluate the scientific production of countries (case study: the basic sciences), Scientometrics 84 pp 357– (2010) · doi:10.1007/s11192-009-0103-1
[18] Steutel, Infinite divisibility of probability distributions on the real line (2004) · Zbl 1063.60001
[19] Van Noorden, Metrics: a profusion of measures, Nature 465 pp 864– (2010) · doi:10.1038/465864a