dc.contributor |
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
|
dc.contributor |
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science |
|
dc.contributor |
Jaakkola, Tommi S. |
|
dc.contributor |
Sontag, David Alexander |
|
dc.creator |
Meshi, Ofer |
|
dc.creator |
Sontag, David Alexander |
|
dc.creator |
Jaakkola, Tommi S. |
|
dc.creator |
Globerson, Amir |
|
dc.date |
2011-05-19T21:39:04Z |
|
dc.date |
2010-01 |
|
dc.date.accessioned |
2023-03-01T18:06:16Z |
|
dc.date.available |
2023-03-01T18:06:16Z |
|
dc.identifier |
9781605589077 |
|
dc.identifier |
1605589071 |
|
dc.identifier |
http://hdl.handle.net/1721.1/62851 |
|
dc.identifier |
Meshi, Ofer et al. "Learning Efficiently with Approximate Inference via Dual Losses." Proceedings of the 27th International Conference on Machine Learning, Haifa, Israel, 2010. |
|
dc.identifier |
https://orcid.org/0000-0002-2199-0379 |
|
dc.description |
Many structured prediction tasks involve complex models where inference is computationally intractable, but where it can be well approximated using a linear programming relaxation. Previous approaches for learning for structured prediction (e.g., cutting-plane, subgradient methods, perceptron) repeatedly make predictions for some of the data points. These approaches are computationally demanding because each prediction involves solving a linear program to optimality. We present a scalable algorithm for learning for structured prediction. The main idea is to instead solve the dual of the structured prediction loss. We formulate the learning task as a convex minimization over both the weights and the dual variables corresponding to each data point. As a result, we can begin to optimize the weights even before completely solving any of the individual prediction problems. We show how the dual variables can be efficiently optimized using coordinate descent. Our algorithm is competitive with state-of-the-art methods such as stochastic subgradient and cutting-plane. |
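
The description above outlines the paper's main algorithmic idea: minimize the structured hinge loss jointly over the weights and the per-example dual variables of the loss-augmented LP relaxation, alternating block coordinate descent on the duals with subgradient steps on the weights. The sketch below is an illustrative reconstruction, not the authors' code: it assumes a toy two-node pairwise model with Hamming loss and random features, and every identifier (make_example, loss_augmented, the step sizes lam and eta) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
K, D, N = 3, 6, 20   # states per node, weight dimension, number of examples

def make_example():
    # Random feature maps for a two-node, one-edge model (illustrative only).
    x = {"f1": rng.normal(size=(K, D)),     # unary features, node 1
         "f2": rng.normal(size=(K, D)),     # unary features, node 2
         "fp": rng.normal(size=(K, K, D))}  # pairwise features
    y = (int(rng.integers(K)), int(rng.integers(K)))
    return x, y

data = [make_example() for _ in range(N)]
w = np.zeros(D)
duals = [(np.zeros(K), np.zeros(K)) for _ in range(N)]  # per-example messages
lam, eta = 1.0, 0.05

def loss_augmented(w, x, y):
    # Unary potentials plus Hamming loss, and pairwise potentials.
    b1 = x["f1"] @ w + (np.arange(K) != y[0])
    b2 = x["f2"] @ w + (np.arange(K) != y[1])
    thp = x["fp"] @ w                        # shape (K, K)
    return b1, b2, thp

for epoch in range(30):
    for i, (x, y) in enumerate(data):
        b1, b2, thp = loss_augmented(w, x, y)
        # Block-coordinate (MPLP-style) update of this example's dual
        # variables; a larger graph would sweep a few edges per visit,
        # warm-starting from duals[i].
        d1 = -0.5 * b1 + 0.5 * (thp + b2[None, :]).max(axis=1)
        d2 = -0.5 * b2 + 0.5 * (thp + b1[:, None]).max(axis=0)
        duals[i] = (d1, d2)
        # Subgradient step on w at the current duals; in general the duals
        # need not be at their optimum when this step is taken.
        x1 = int(np.argmax(b1 + d1))
        x2 = int(np.argmax(b2 + d2))
        p1, p2 = np.unravel_index(
            np.argmax(thp - d1[:, None] - d2[None, :]), (K, K))
        grad = (x["f1"][x1] + x["f2"][x2] + x["fp"][p1, p2]
                - x["f1"][y[0]] - x["f2"][y[1]] - x["fp"][y[0], y[1]])
        w -= eta * (lam * w / N + grad)
```

With a single edge, the closed-form message update happens to solve each example's dual exactly; on larger graphs only a few coordinate updates would be performed before the weights move, which is where interleaving the two optimizations pays off.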
|
dc.format |
application/pdf |
|
dc.language |
en_US |
|
dc.publisher |
International Machine Learning Society |
|
dc.relation |
http://www.icml2010.org/papers/587.pdf |
|
dc.relation |
International Conference on Machine Learning (27th, 2010) proceedings |
|
dc.rights |
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. |
|
dc.source |
MIT web domain |
|
dc.title |
Learning efficiently with approximate inference via dual losses |
|
dc.type |
Article |
|
dc.type |
http://purl.org/eprint/type/ConferencePaper |
|