Asymptotic and finite-sample properties of estimators based on stochastic gradients. (English) Zbl 1378.62046
This paper deals with implicit stochastic gradient descent procedures, defined as \(\Theta^{im}_n=\Theta^{im}_{n-1}+\nu_n C_n\nabla\log f(Y_n; X_n,\Theta^{im}_n)\), where \(\nu_n>0\) is the learning rate sequence, typically \(\nu_n:=\nu_1 n^{-\nu}\) with learning rate parameter \(\nu_1>0\) and exponent \(\nu\in(0.5,1]\), and the \(C_n\) are \(p\times p\) positive definite matrices, also known as condition matrices. The authors' "theoretical analysis provides the first full characterization of the behavior of both standard and implicit stochastic gradient descent-based estimators, including finite-sample error bounds".
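The defining feature of the implicit procedure is that \(\Theta^{im}_n\) appears on both sides of the update, so each step requires solving a fixed-point equation. As a minimal illustrative sketch (not the authors' implementation), consider Gaussian linear regression with \(C_n=I\): there the implicit update \(\theta_n=\theta_{n-1}+\nu_n x_n(y_n-x_n^\top\theta_n)\) reduces to a one-dimensional fixed point in the residual and admits a closed form. All names and parameter values below are hypothetical.

```python
import numpy as np

def implicit_sgd_linreg(X, y, nu1=1.0, nu_exp=0.7):
    """Implicit SGD for Gaussian linear regression (illustrative sketch).

    For squared-error loss the implicit update
        theta_n = theta_{n-1} + nu_n * x_n * (y_n - x_n @ theta_n)
    is solved exactly: taking the inner product with x_n gives a scalar
    equation for the residual, yielding the closed form used below.
    """
    n, p = X.shape
    theta = np.zeros(p)
    for i in range(n):
        nu = nu1 * (i + 1) ** (-nu_exp)  # learning rate nu_n = nu_1 * n^{-nu}
        x, yi = X[i], y[i]
        # implicit residual: shrunk by 1 / (1 + nu * ||x||^2)
        resid = (yi - x @ theta) / (1.0 + nu * (x @ x))
        theta = theta + nu * resid * x
    return theta
```

The shrinkage factor \(1/(1+\nu_n\|x_n\|^2)\) is what gives the implicit method its numerical stability relative to standard SGD: the step size is automatically damped on large-norm inputs, so the iterates cannot diverge for any choice of \(\nu_1\).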
Reviewer: N. G. Gamkrelidze (Moskva)
MSC:
62L20 | Stochastic approximation |
62F10 | Point estimation |
62L12 | Sequential estimation |
62F12 | Asymptotic properties of parametric estimators |