An Optimal Multistage Stochastic Gradient Method for Minimax Problems

Fallah, Alireza; Ozdaglar, Asuman; Pattathil, Sarath

dc.contributor	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.creator	Fallah, Alireza
dc.creator	Ozdaglar, Asuman
dc.creator	Pattathil, Sarath
dc.date	2022-07-18T17:00:45Z
dc.date	2022-07-18T17:00:45Z
dc.date	2020
dc.date	2022-07-18T16:56:10Z
dc.date.accessioned	2023-03-01T18:07:42Z
dc.date.available	2023-03-01T18:07:42Z
dc.identifier	https://hdl.handle.net/1721.1/143824
dc.identifier	Fallah, Alireza, Ozdaglar, Asuman and Pattathil, Sarath. 2020. "An Optimal Multistage Stochastic Gradient Method for Minimax Problems." Proceedings of the IEEE Conference on Decision and Control, 2020-December.
dc.identifier.uri	http://localhost:8080/xmlui/handle/CUHPOERS/278854
dc.description	© 2020 IEEE. In this paper, we study the minimax optimization problem in the smooth and strongly convex-strongly concave setting when we have access to noisy estimates of gradients. In particular, we first analyze the stochastic Gradient Descent Ascent (GDA) method with constant stepsize, and show that it converges to a neighborhood of the solution of the minimax problem. We further provide tight bounds on the convergence rate and the size of this neighborhood. Next, we propose a multistage variant of stochastic GDA (M-GDA) that runs in multiple stages with a particular learning rate decay schedule and converges to the exact solution of the minimax problem. We show M-GDA achieves the lower bounds in terms of noise dependence without any assumptions on the knowledge of noise characteristics. We also show that M-GDA obtains a linear decay rate with respect to the error's dependence on the initial error, although the dependence on condition number is suboptimal. In order to improve this dependence, we apply the multistage machinery to the stochastic Optimistic Gradient Descent Ascent (OGDA) algorithm and propose the M-OGDA algorithm which also achieves the optimal linear decay rate with respect to the initial error. To the best of our knowledge, this method is the first to simultaneously achieve the best dependence on noise characteristic as well as the initial error and condition number.
dc.format	application/pdf
dc.language	en
dc.publisher	Institute of Electrical and Electronics Engineers (IEEE)
dc.relation	10.1109/CDC42340.2020.9304033
dc.relation	Proceedings of the IEEE Conference on Decision and Control
dc.rights	Creative Commons Attribution-Noncommercial-Share Alike
dc.rights	http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source	arXiv
dc.title	An Optimal Multistage Stochastic Gradient Method for Minimax Problems
dc.type	Article
dc.type	http://purl.org/eprint/type/ConferencePaper

Files in this item

Files	Size	Format	View
2002.05683.pdf	300.1Kb	application/pdf	View/Open

This item appears in the following Collection(s)

MIT Open Access Articles [2860]

Show simple item record

Search DSpace

Advanced Search

Browse

All of DSpace
This Collection
- By Issue Date
- Authors
- Titles
- Subjects

An Optimal Multistage Stochastic Gradient Method for Minimax Problems

Files in this item

This item appears in the following Collection(s)

Search DSpace

Browse

All of DSpace

This Collection