Sangam: A Confluence of Knowledge Streams

An Optimal Multistage Stochastic Gradient Method for Minimax Problems

Show simple item record

dc.contributor Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.creator Fallah, Alireza
dc.creator Ozdaglar, Asuman
dc.creator Pattathil, Sarath
dc.date 2022-07-18T17:00:45Z
dc.date 2022-07-18T17:00:45Z
dc.date 2020
dc.date 2022-07-18T16:56:10Z
dc.date.accessioned 2023-03-01T18:07:42Z
dc.date.available 2023-03-01T18:07:42Z
dc.identifier https://hdl.handle.net/1721.1/143824
dc.identifier Fallah, Alireza, Ozdaglar, Asuman and Pattathil, Sarath. 2020. "An Optimal Multistage Stochastic Gradient Method for Minimax Problems." Proceedings of the IEEE Conference on Decision and Control, 2020-December.
dc.identifier.uri http://localhost:8080/xmlui/handle/CUHPOERS/278854
dc.description © 2020 IEEE. In this paper, we study the minimax optimization problem in the smooth and strongly convex-strongly concave setting when we have access to noisy estimates of gradients. In particular, we first analyze the stochastic Gradient Descent Ascent (GDA) method with constant stepsize, and show that it converges to a neighborhood of the solution of the minimax problem. We further provide tight bounds on the convergence rate and the size of this neighborhood. Next, we propose a multistage variant of stochastic GDA (M-GDA) that runs in multiple stages with a particular learning rate decay schedule and converges to the exact solution of the minimax problem. We show M-GDA achieves the lower bounds in terms of noise dependence without any assumptions on the knowledge of noise characteristics. We also show that M-GDA obtains a linear decay rate with respect to the error's dependence on the initial error, although the dependence on condition number is suboptimal. In order to improve this dependence, we apply the multistage machinery to the stochastic Optimistic Gradient Descent Ascent (OGDA) algorithm and propose the M-OGDA algorithm which also achieves the optimal linear decay rate with respect to the initial error. To the best of our knowledge, this method is the first to simultaneously achieve the best dependence on noise characteristic as well as the initial error and condition number.
dc.format application/pdf
dc.language en
dc.publisher Institute of Electrical and Electronics Engineers (IEEE)
dc.relation 10.1109/CDC42340.2020.9304033
dc.relation Proceedings of the IEEE Conference on Decision and Control
dc.rights Creative Commons Attribution-Noncommercial-Share Alike
dc.rights http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source arXiv
dc.title An Optimal Multistage Stochastic Gradient Method for Minimax Problems
dc.type Article
dc.type http://purl.org/eprint/type/ConferencePaper


Files in this item

Files Size Format View
2002.05683.pdf 300.1Kb application/pdf View/Open

This item appears in the following Collection(s)

Show simple item record

Search DSpace


Advanced Search

Browse