Incorporating the Barzilai-Borwein adaptive step size into subgradient methods for deep network training
conference contribution
posted on 2019-01-01, 00:00 authored by Antonio Robles-Kelly, Asef Nazari

In this paper, we incorporate the Barzilai-Borwein [2] step size into gradient descent methods used to train deep networks. This allows us to adapt the learning rate using a two-point approximation to the secant equation upon which quasi-Newton methods are based. Moreover, the adaptive learning rate method presented here is quite general and can be applied to widely used gradient descent approaches such as Adagrad [7] and RMSprop. We evaluate our method using standard example network architectures on widely available datasets and compare against alternatives elsewhere in the literature. In our experiments, our adaptive learning rate shows smoother and faster convergence than that exhibited by the alternatives, with better or comparable performance.
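For intuition, the sketch below applies the classical Barzilai-Borwein (BB1) step size to plain gradient descent on a toy quadratic; it illustrates only the two-point secant approximation the abstract refers to, not the paper's actual integration with deep network training or with Adagrad/RMSprop. The quadratic objective, variable names, and the safeguard threshold are assumptions made for this example.

```python
import numpy as np

# Toy quadratic objective: f(x) = 0.5 * x^T A x - b^T x, with gradient A x - b.
A = np.array([[3.0, 0.5],
              [0.5, 1.0]])
b = np.array([1.0, -2.0])

def grad(x):
    return A @ x - b

x = np.zeros(2)
g = grad(x)
alpha = 0.1  # initial step size, used until two-point BB information is available

for k in range(50):
    x_new = x - alpha * g
    g_new = grad(x_new)
    s = x_new - x   # change in iterates
    y = g_new - g   # change in gradients
    # BB1 step size: alpha = (s^T s) / (s^T y), a two-point approximation
    # to the secant equation underlying quasi-Newton methods.
    sy = s @ y
    if sy > 1e-12:  # safeguard against division by a tiny or negative curvature estimate
        alpha = (s @ s) / sy
    x, g = x_new, g_new

print("solution:", x, "expected:", np.linalg.solve(A, b))
```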
History
Event: Digital Image Computing Techniques and Applications. International Conference (2019 : Perth, W.A.)
Pagination: 1 - 6
Publisher: IEEE
Location: Perth, W.A.
Place of publication: Piscataway, N.J.
Publisher DOI:
Start date: 2019-12-02
End date: 2019-12-04
ISBN-13: 9781728138572
Language: eng
Publication classification: E1 Full written paper - refereed
Editor/Contributor(s): Unknown
Title of proceedings: DICTA 2019 : Proceedings of the 2019 Digital Image Computing Techniques and Applications International Conference