Statistics of robust optimization: a generalized empirical likelihood approach. (English) Zbl 1473.62292

Summary: We study statistical inference and distributionally robust solution methods for stochastic optimization problems, focusing on confidence intervals for optimal values and solutions that achieve exact coverage asymptotically. We develop a generalized empirical likelihood framework – based on distributional uncertainty sets constructed from nonparametric \(f\)-divergence balls – for Hadamard differentiable functionals, and in particular, stochastic optimization problems. As consequences of this theory, we provide a principled method for choosing the size of distributional uncertainty regions to provide one- and two-sided confidence intervals that achieve exact coverage. We also give an asymptotic expansion for our distributionally robust formulation, showing how robustification regularizes problems by their variance. Finally, we show that optimizers of the distributionally robust formulations we study enjoy (essentially) the same consistency properties as those in classical sample average approximations. Our general approach applies to quickly mixing stationary sequences, including geometrically ergodic Harris recurrent Markov chains.
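The variance-regularization claim can be illustrated in the simplest case, a sample mean. The sketch below is not taken from the paper: it assumes the chi-square divergence \(f(t) = \tfrac{1}{2}(t-1)^2\), an uncertainty ball of radius \(\rho/n\) around the empirical distribution, and a radius small enough that the nonnegativity constraint on the weights is inactive. The function names `chi_square_robust_mean` and `variance_expansion` are hypothetical. Under those assumptions the worst-case mean has a closed form equal to the empirical mean plus \(\sqrt{2\rho\,\mathrm{Var}_n/n}\), where \(\mathrm{Var}_n\) is the empirical variance.

```python
import math


def chi_square_robust_mean(z, rho):
    """Worst-case mean of z over reweightings p of the sample with
    (1/(2n)) * sum((n*p_i - 1)**2) <= rho/n and sum(p) == 1.

    Uses the closed form that is valid only while every weight stays
    nonnegative (an assumption of this sketch); raises otherwise.
    Returns (robust_value, weights).
    """
    n = len(z)
    mean = sum(z) / n
    centered = [zi - mean for zi in z]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        # Constant sample: every reweighting gives the same mean.
        return mean, [1.0 / n] * n
    # Tilt the uniform weights in the direction of the centered sample.
    scale = math.sqrt(2.0 * rho) / n
    weights = [1.0 / n + scale * c / norm for c in centered]
    if min(weights) < 0.0:
        raise ValueError("rho is too large for the closed form to apply")
    value = sum(w * zi for w, zi in zip(weights, z))
    return value, weights


def variance_expansion(z, rho):
    """Empirical mean plus sqrt(2 * rho * Var_n / n), with Var_n using 1/n."""
    n = len(z)
    mean = sum(z) / n
    var = sum((zi - mean) ** 2 for zi in z) / n
    return mean + math.sqrt(2.0 * rho * var / n)
```

For a sample such as `[1.0, 2.0, 4.0, 7.0]` with `rho = 0.5`, the two functions agree to floating-point precision, which is the exact finite-sample version of the expansion in this special case. How the radius should be chosen to obtain a given coverage level is the subject of the paper and is not reproduced here.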

MSC:

62M05 Markov processes: estimation; hidden Markov models
62G05 Nonparametric estimation
62G35 Nonparametric robustness
62L20 Stochastic approximation

Software:

Saga; AdaGrad; Convex.jl
