Barzilai–Borwein-based adaptive learning rate for deep learning

Jinxiu Liang (South China University of Technology, Guangzhou 510006, China), Yong Xu (South China University of Technology, Guangzhou 510006, China; Peng Cheng Laboratory, Shenzhen 510852, China), Chenglong Bao (Tsinghua University, Beijing 100084, China), Yuhui Quan (South China University of Technology, Guangzhou 510006, China), Hui Ji (National University of Singapore, Singapore 117543, Singapore)

Machine Learning mathscidoc:2206.41003

Pattern Recognition Letters, 128, (1), 197-203, 2019.12
The learning rate is arguably the most important hyper-parameter to tune when training a neural network. Because manually setting the right learning rate remains a cumbersome process, adaptive learning rate algorithms aim to automate it. Motivated by the success of the Barzilai–Borwein (BB) step-size method in many gradient descent methods for solving convex problems, this paper investigates the potential of the BB method for training neural networks. With strong motivation from related convergence analysis, the BB method is generalized to an adaptive learning rate for mini-batch gradient descent. The experiments showed that, in contrast to many existing methods, the proposed BB method is highly insensitive to the initial learning rate, especially in terms of generalization performance. The BB method also showed advantages in both learning speed and generalization performance over other available methods.
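For readers unfamiliar with the BB step size, the sketch below illustrates the classical deterministic rule that the abstract builds on: the step is alpha = (s^T s) / (s^T y), where s is the change in the iterate and y is the change in the gradient between two consecutive steps. This is not the paper's mini-batch generalization, which the abstract does not spell out. The function name `bb_gradient_descent`, its parameters, the fallback to the initial step, and the quadratic test problem are all assumptions made for illustration.

```python
import numpy as np

def bb_gradient_descent(grad, x0, alpha0=1e-3, n_iters=500, tol=1e-10):
    """Gradient descent with the classical BB step alpha = s^T s / s^T y.

    Illustrative only: full (non-stochastic) gradients, hypothetical defaults.
    """
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    x = x_prev - alpha0 * g_prev      # the first step uses the initial learning rate
    for _ in range(n_iters):
        g = grad(x)
        if np.linalg.norm(g) < tol:   # stop once the gradient is negligible
            break
        s = x - x_prev                # change in the iterate
        y = g - g_prev                # change in the gradient
        sy = s @ y
        # Assumed safeguard: reuse alpha0 if the curvature estimate is not positive.
        alpha = (s @ s) / sy if sy > 0.0 else alpha0
        x_prev, g_prev = x, g
        x = x - alpha * g
    return x

# Hypothetical convex test problem: f(x) = 0.5 * x^T A x - b^T x
A = np.diag([1.0, 10.0, 100.0])
b = np.array([1.0, 1.0, 1.0])
x_star = np.linalg.solve(A, b)        # exact minimizer, for comparison
x_bb = bb_gradient_descent(lambda x: A @ x - b, np.zeros(3))
```

On this quadratic the step size adapts to the curvature after the first iteration, so the deliberately small `alpha0` affects only the first step. This gives some intuition for the insensitivity to the initial learning rate that the abstract reports, although the example says nothing about neural network training itself.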
Uploaded 2022-06-13 22:18:43 by Baocl.
@article{jinxiu2019barzilai–borwein-based,
  title={Barzilai–Borwein-based adaptive learning rate for deep learning},
  author={Jinxiu Liang and Yong Xu and Chenglong Bao and Yuhui Quan and Hui Ji},
  journal={Pattern Recognition Letters},
  volume={128},
  number={1},
  pages={197--203},
  year={2019},
  url={http://archive.ymsc.tsinghua.edu.cn/pacm_paperurl/20220613221843254656356},
}
Jinxiu Liang, Yong Xu, Chenglong Bao, Yuhui Quan, and Hui Ji. Barzilai–Borwein-based adaptive learning rate for deep learning. Pattern Recognition Letters, 128(1):197-203, 2019. http://archive.ymsc.tsinghua.edu.cn/pacm_paperurl/20220613221843254656356.