×

Penalty-based aggregation of multidimensional data. (English) Zbl 1380.68359

Summary: Research in aggregation theory is nowadays still mostly focused on algorithms to summarize tuples consisting of observations in some real interval or of diverse general ordered structures. Of course, in practice of information processing many other data types between these two extreme cases are worth inspecting. This contribution deals with the aggregation of lists of data points in \(\mathbb{R}^d\) for arbitrary \(d \geq 1\). Even though particular functions aiming to summarize multidimensional data have been discussed by researchers in data analysis, computational statistics and geometry, there is clearly a need to provide a comprehensive and unified model in which their properties like equivariances to geometric transformations, internality, and monotonicity may be studied at an appropriate level of generality. The proposed penalty-based approach serves as a common framework for all idempotent information aggregation methods, including componentwise functions, pairwise distance minimizers, and data depth-based medians. It also allows for deriving many new practically useful tools.

MSC:

68T37 Reasoning under uncertainty in the context of artificial intelligence
Full Text: DOI

References:

[1] Abellanas, M.; Claverol, M.; Hurtado, F., Point set stratification and Delaunay depth, Comput. Stat. Data Anal., 51, 2513-2530 (2007) · Zbl 1161.62362
[2] Aloupis, G., Geometric measures of data depth, (DIMACS Series in Discrete Mathematics and Theoretical Computer Science (2006)), 147-158
[3] Aloupis, G.; Cortés, C.; Gómez, F.; Soss, M.; Toussaint, G., Lower bounds for computing statistical depth, Comput. Stat. Data Anal., 40, 223-229 (2002) · Zbl 1098.62502
[4] Aloupis, G.; Langerman, S.; Soss, M.; Toussaint, G., Algorithms for bivariate medians and a Fermat-Torricelli problem for lines, Comput. Geom., 26, 69-79 (2003) · Zbl 1039.62045
[5] Beliakov, G.; Bustince, H.; Calvo, T., A Practical Guide to Averaging Functions (2016), Springer
[6] Beliakov, G.; Calvo, T.; James, S., On penalty-based aggregation functions and consensus, (Herrera-Viedma, E.; etal., Consensual Processes. Consensual Processes, STUDFUZZ, vol. 267 (2011)), 23-40
[7] Beliakov, G.; James, S., Stability of weighted penalty-based aggregation functions, Fuzzy Sets Syst., 226, 1-18 (2013) · Zbl 1284.68545
[8] Beliakov, G.; James, S., A penalty-based aggregation operator for non-convex intervals, Knowl.-Based Syst., 70, 335-344 (2014)
[9] Beliakov, G.; Pradera, A.; Calvo, T., Aggregation Functions: A Guide for Practitioners (2007), Springer-Verlag · Zbl 1123.68124
[10] Beliakov, G.; Wilkin, T., On some properties of weighted averaging with variable weights, Inf. Sci., 281, 1-7 (2014) · Zbl 1360.62154
[11] Bullen, P., Handbook of Means and Their Inequalities (2003), Springer Science+Business: Springer Science+Business Media, Dordrecht · Zbl 1035.26024
[12] Bustince, H.; Barrenechea, E.; Pagola, M., Relationship between restricted dissimilarity functions, restricted equivalence functions and normal \(e_N\)-functions: image thresholding invariant, Pattern Recognit. Lett., 29, 525-536 (2008)
[13] Bustince, H.; Beliakov, G.; Dimuro, G. P.; Bedregal, B.; Mesiar, R., On the definition of penalty functions in data aggregation, Fuzzy Sets Syst., 323, 1-18 (2017) · Zbl 1409.68288
[14] Bustince, H.; Fernandez, J.; Mesiar, R.; Pradera, A.; Beliakov, G., Restricted dissimilarity functions and penalty functions, (Galichet, S.; etal., Proc. Eusflat/LFA’11 (2011)), 79-85 · Zbl 1254.68251
[15] Calvo, T.; Beliakov, G., Aggregation functions based on penalties, Fuzzy Sets Syst., 161, 1420-1436 (2010) · Zbl 1207.68384
[16] Calvo, T.; Mesiar, R.; Yager, R. R., Quantitative weights and aggregation, IEEE Trans. Fuzzy Syst., 12, 62-69 (2004)
[17] Cena, A.; Gagolewski, M., Fuzzy \(k\)-minpen clustering and \(k\)-nearest-minpen classification procedures incorporating generic distance-based penalty minimizers, (Carvalho, J.; etal., Information Processing and Management of Uncertainty in Knowledge-Based Systems, Part II. Information Processing and Management of Uncertainty in Knowledge-Based Systems, Part II, Commun. Comput. Inf. Sci., vol. 611 (2016), Springer), 445-456 · Zbl 1460.62103
[18] Chakraborty, B.; Chaudhuri, P., On a transformation and re-transformation technique for constructing an affine equivariant multivariate median, Proc. Am. Math. Soc., 124, 2539-2547 (1996) · Zbl 0856.62046
[19] Chan, T. M., An optimal randomized algorithm for maximum Tukey depth, (Proc. 15th ACM-SIAM Symp. Discrete Algorithms. Proc. 15th ACM-SIAM Symp. Discrete Algorithms, SODA (2004)), 430-436 · Zbl 1317.68246
[20] Chenouri, S.; Small, C. G., A nonparametric multivariate multisample test based on data depth, Electron. J. Stat., 6, 760-782 (2012) · Zbl 1336.62138
[21] Demirci, M., Aggregation operators on partially ordered sets and their categorical foundations, Kybernetika, 42, 261-277 (2006) · Zbl 1249.03091
[22] Deza, M. M.; Deza, E., Encyclopedia of Distances (2014), Springer · Zbl 1301.51001
[23] Donoho, D. L., Breakdown Properties of Multivariate Location Estimates (1982), Department of Statistics, Harvard University, Ph.D. thesis
[24] Donoho, D. L.; Gasko, M., Breakdown properties of location estimates based on halfspace depth and projected outlyingness, Ann. Stat., 20, 1803-1827 (1992) · Zbl 0776.62031
[25] Dujmović, J. J., Weighted conjunctive and disjunctive means and their application in system evaluation, Publ. Elektroteh. Fak. Univ. Beogr., Mat., 461, 497, 147-158 (1974) · Zbl 0297.65012
[26] Durocher, S.; Fraser, R.; Leblanc, A.; Morrison, J.; Skala, M., On combinatorial depth measures, (Proc. 26th Canadian Conf. Computational Geometry (2014)), 206-211
[27] Dyckerhoff, R.; Koshevoy, G.; Mosler, K., Zonoid data depth: theory and computation, (Prat, A.; etal., Proc. COMPSTAT 1996 (1996), Physica-Verlag: Physica-Verlag Heidelberg), 235-240 · Zbl 0894.62081
[28] Eddy, W., Convex hull peeling, (Proc. COMPSTAT’82 (1982), Physica-Verlag: Physica-Verlag Vienna), 42-47 · Zbl 0493.62020
[29] Edelsbrunner, H., Algorithms in Combinatorial Geometry (1987), Springer-Verlag: Springer-Verlag Heidelberg · Zbl 0634.52001
[30] Elbassioni, K.; Tiwary, H. R., Complexity of approximating the vertex centroid of a polyhedron, Theor. Comput. Sci., 421, 56-61 (2012) · Zbl 1232.68072
[31] Fernández Salido, J.; Murakami, S., Extending Yager’s orness concept for the OWA aggregators to other mean operators, Fuzzy Sets Syst., 139, 515-542 (2003) · Zbl 1032.03524
[32] Gagolewski, M., Data Fusion: Theory, Methods, and Applications (2015), Institute of Computer Science, Polish Academy of Sciences: Institute of Computer Science, Polish Academy of Sciences Warsaw, Poland
[33] Gagolewski, M., Some issues in aggregation of multidimensional data, (Baczynski, M.; De Baets, B.; Mesiar, R., Proc. 8th International Summer School on Aggregation Operators. Proc. 8th International Summer School on Aggregation Operators, AGOP 2015 (2015), University of Silesia: University of Silesia Katowice, Poland), 127-132
[34] Gagolewski, M., Spread measures and their relation to aggregation functions, Eur. J. Oper. Res., 241, 469-477 (2015) · Zbl 1339.91036
[35] Gagolewski, M.; Cena, A.; Bartoszuk, M., Hierarchical clustering via penalty-based aggregation and the Genie approach, (Torra, V.; etal., Modeling Decisions for Artificial Intelligence. Modeling Decisions for Artificial Intelligence, Lecture Notes in Artificial Intelligence, vol. 9880 (2016), Springer), 191-202
[36] Gärtner, B., Fast and robust smallest enclosing balls, Lecture Notes in Computer Science, 1643, 325-338 (1999)
[37] Gärtner, B.; Schönherr, S., An efficient, exact, and generic quadratic programming solver for geometric optimization, (Proc. 16th ACM Symposium on Computational Geometry (2000)), 110-118 · Zbl 1377.68277
[38] Grabisch, M.; Marichal, J. L.; Mesiar, R.; Pap, E., Aggregation Functions (2009), Cambridge University Press · Zbl 1196.00002
[39] Green, P., Peeling bivariate data, (Barnett, V., Interpreting Multivariate Data (1981), Wiley: Wiley New York)
[40] Grübel, R., Orthogonalization of multivariate location estimators: the orthomedian, Ann. Stat., 24, 1457-1473 (1996) · Zbl 0867.62048
[41] Guerrini, L., An extension of Witzgall’s result on convex metrics, Divulg. Mat., 13, 83-89 (2005) · Zbl 1101.46301
[42] Huber, P. J., The 1972 Wald lecture robust statistics: a review, Ann. Math. Stat., 42, 1041-1067 (1972) · Zbl 0254.62023
[43] Kołacz, A.; Grzegorzewski, P., Measures of dispersion for multidimensional data, Eur. J. Oper. Res., 251, 930-937 (2016) · Zbl 1347.62100
[44] Komorníková, M.; Mesiar, R., Aggregation functions on bounded partially ordered sets and their classification, Fuzzy Sets Syst., 175, 48-56 (2011) · Zbl 1253.06004
[45] Leisch, F., A toolbox for K-centroids cluster analysis, Comput. Stat. Data Anal., 51, 526-544 (2006) · Zbl 1157.62439
[46] Li, J.; Liu, R. Y., New nonparametric tests of multivariate locations and scales using data depth, Stat. Sci., 19, 686-696 (2004) · Zbl 1100.62564
[47] Liu, R. Y., On a notion of data depth based on random simplices, Ann. Stat., 18, 405-414 (1990) · Zbl 0701.62063
[48] Liu, R. Y.; Parelius, J. M.; Singh, K., Multivariate analysis by data depth: descriptive statistics, graphics and inference, Ann. Stat., 27, 783-858 (1999) · Zbl 0984.62037
[49] Lucchetti, R., Convexity and Well-Posed Problems, CMS Books in Mathematics (2006) · Zbl 1106.49001
[50] Martín, J.; Mayor, G., Dispersion measures and multidistances on \(R^k\), (Ferraro, M. B.; etal., Soft Methods for Data Science (2017), Springer), 347-354 · Zbl 1422.62023
[51] Mesiar, R., Fuzzy set approach to the utility, preference relations, and aggregation operators, Eur. J. Oper. Res., 176, 414-422 (2007) · Zbl 1137.91372
[52] Milasevic, P.; Ducharme, G., Uniqueness of the spatial median, Ann. Stat., 15, 1332-1333 (1987) · Zbl 0631.62058
[53] Möttönen, J.; Nordhausen, K.; Oja, H., Asymptotic theory of the spatial median, (Nonparametrics and Robustness in Modern Statistical Inference and Time Series, vol. 7 (2010)), 182-193
[54] Nagumo, M., Über eine Klasse der Mittelwerte, Jpn. J. Math., 7, 71-79 (1930) · JFM 56.0198.03
[55] Oja, H., Descriptive statistics for multivariate distributions, Stat. Probab. Lett., 1, 327-332 (1983) · Zbl 0517.62051
[56] Ovchinnikov, S., Means on ordered sets, Math. Soc. Sci., 32, 39-56 (1996) · Zbl 0921.90010
[57] Ovchinnikov, S., Invariant functions on simple orders, Order, 14, 365-371 (1998) · Zbl 0910.06001
[58] Pérez-Fernández, R.; Rademaker, M.; De Baets, B., Monometrics and their role in the rationalisation of ranking rules, Inf. Fusion, 34, 16-27 (2017)
[59] Rademacher, L., Approximating the centroid is hard, (Symposium on Computational Geometry (2007)), 302-305 · Zbl 1221.68095
[60] Rademaker, M.; De Baets, B., A ranking procedure based on a natural monotonicity constraint, Inf. Fusion, 17, 74-82 (2014)
[61] Rockafellar, R. T., Convex Analysis (1970), Princeton University Press: Princeton University Press New Jersey · Zbl 0202.14303
[62] Rousseeuw, P. J.; Ruts, I., Algorithm AS 307: bivariate location depth, Appl. Stat., 45, 516-526 (1996) · Zbl 0905.62002
[63] Rousseeuw, P. J.; Ruts, I., Constructing the bivariate Tukey median, Stat. Sin., 8, 827-839 (1998) · Zbl 0905.62029
[64] Rousseeuw, P. J.; Struyf, A., Computing location depth and regression depth in higher dimensions, Stat. Comput., 8, 193-203 (1998)
[65] Rousseuw, P. J.; Ruts, I., The depth function of a population distribution, Metrika, 49, 213-244 (1999) · Zbl 1093.62540
[66] Small, C. G., A survey on multidimensional medians, Int. Stat. Rev., 58, 263-277 (1990)
[67] Sylvester, J. J., A question in the geometry of situation, Q. J. Pure Appl. Math., 1, 79 (1857)
[68] Tukey, J. W., Mathematics and the picturing of data, (Proc. Intl. Congress of Mathematicians (1974)), 523-531 · Zbl 0347.62002
[69] Vardi, Y.; Zhang, C. H., The multivariate \(L_1\)-median and associated data depth, Proc. Natl. Acad. Sci. USA, 97, 1423-1426 (2000) · Zbl 1054.62067
[70] Weiszfeld, E., Sur le point par lequel la somme des distances de \(n\) points donnés est minimum, Tohoku Math. J., 43, 355-386 (1937) · Zbl 0017.18007
[71] Wilkin, T.; Beliakov, G., Weakly monotonic averaging functions, Int. J. Intell. Syst., 30, 144-169 (2015)
[72] Witzgall, C., On convex metrics, J. Res. Natl. Bur. Stand. B, Math. Math. Phys., 69B, 175-177 (1965) · Zbl 0135.34403
[73] Yager, R. R., Toward a general theory of information aggregation, Inf. Sci., 68, 191-206 (1993) · Zbl 0765.90005
[74] Yager, R. R., On ordered weighted averaging aggregation operators in multicriteria decision making, IEEE Trans. Syst. Man Cybern., 18, 183-190 (1988) · Zbl 0637.90057
[75] Yager, R. R.; Rybalov, A., Understanding the median as a fusion operator, Int. J. Gen. Syst., 26, 239-263 (1997) · Zbl 0980.68502
[76] Zuo, Y., Projection-based depth functions and associated medians, Ann. Math. Stat., 31, 1460-1490 (2003) · Zbl 1046.62056
[77] Zuo, Y.; Serfling, R., General notions of statistical depth function, Ann. Math. Stat., 28, 461-482 (2000) · Zbl 1106.62334
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.