File(s) under permanent embargo

Incorporating the Barzilai-Borwein adaptive step size into subgradient methods for deep network training

conference contribution
posted on 2019-01-01, 00:00 authored by Antonio Robles-Kelly, Asef Nazari
In this paper, we incorporate the Barzilai-Borwein [2] step size into gradient descent methods used to train deep networks. This allows us to adapt the learning rate using a two-point approximation to the secant equation upon which quasi-Newton methods are based. Moreover, the adaptive learning rate method presented here is quite general and can be applied to widely used gradient descent approaches such as Adagrad [7] and RMSprop. We evaluate our method using standard network architectures on widely available datasets and compare against alternatives elsewhere in the literature. In our experiments, our adaptive learning rate yields smoother and faster convergence than the alternatives, with better or comparable performance.
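For intuition, the Barzilai-Borwein rule sets the learning rate from the last two iterates and gradients, so that a scalar multiple of the identity approximates the Hessian in the secant equation. Below is a minimal sketch of the classical BB1 rule applied to plain gradient descent on a small quadratic; the function bb_gradient_descent and the test problem are illustrative assumptions, not the paper's experimental setup, which targets deep network training and optimisers such as Adagrad and RMSprop.

```python
import numpy as np

def bb_gradient_descent(grad, x0, alpha0=1e-3, n_iters=100, eps=1e-12):
    """Gradient descent with the Barzilai-Borwein (BB1) step size.

    grad   -- callable returning the gradient at a point
    x0     -- initial iterate
    alpha0 -- fixed step size used before two iterates are available
    """
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    x = x_prev - alpha0 * g_prev              # first step: no history yet
    for _ in range(n_iters):
        g = grad(x)
        s = x - x_prev                        # iterate difference  s_{k-1}
        y = g - g_prev                        # gradient difference y_{k-1}
        denom = s @ y
        # BB1 rule: alpha_k = (s^T s) / (s^T y), a two-point secant
        # approximation; fall back to alpha0 when curvature is unusable.
        alpha = (s @ s) / denom if denom > eps else alpha0
        x_prev, g_prev = x, g
        x = x - alpha * g
    return x

# Illustrative test: minimise f(x) = 0.5 x^T A x - b^T x, so grad f = A x - b
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, 2.0])
x_hat = bb_gradient_descent(lambda x: A @ x - b, x0=np.zeros(2))
print(x_hat, np.linalg.solve(A, b))           # the two should roughly agree
```

The guard on s^T y falls back to a fixed rate when the curvature estimate is near zero or negative, the usual practical safeguard for BB steps.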

History

Event

Digital Image Computing: Techniques and Applications. International Conference (2019 : Perth, W.A.)

Pagination

1 - 6

Publisher

IEEE

Location

Perth, W.A.

Place of publication

Piscataway, N.J.

Start date

2019-12-02

End date

2019-12-04

ISBN-13

9781728138572

Language

eng

Publication classification

E1 Full written paper - refereed

Editor/Contributor(s)

Unknown

Title of proceedings

DICTA 2019 : Proceedings of the 2019 Digital Image Computing: Techniques and Applications International Conference
