Overview of Gradient Descent Algorithms.pdf


2025-03-17

An overview of gradient descent optimization algorithms
Sebastian Ruder
Insight Centre for Data Analytics, NUI Galway
Aylien Ltd., Dublin
ruder.sebastian@gmail.com
Abstract
Gradient descent optimization algorithms, while increasingly popular, are often
used as black-box optimizers, as practical explanations of their strengths and
weaknesses are hard to come by. This article aims to provide the reader with
intuitions with regard to the behaviour of different algorithms that will allow her
to put them to use. In the course of this overview, we look at different variants of
gradient descent, summarize challenges, introduce the most common optimization
algorithms, review architectures in a parallel and distributed setting, and investigate
additional strategies for optimizing gradient descent.
1 Introduction
Gradient descent is one of the most popular algorithms to perform optimization and by far the
most common way to optimize neural networks. At the same time, every stat


-->