Optimization Methods

[http://www.youtube.com/results?search_query=Optimization+LSTM+SGD+Adagrad+Adadelta+RMSprop+Adam+BGFS Youtube search...]
- Natural Language Processing (NLP)
- Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Recurrent Neural Network (RNN)
- Average-SGD Weight-Dropped LSTM (AWD-LSTM)
- Gradient Boosting Algorithms
Methods (a usage sketch follows the list):
- Stochastic gradient descent (SGD), with and without momentum
- Limited-memory BFGS (L-BFGS)
- Adagrad
- Adadelta
- Root Mean Square Propagation (RMSprop)
- Adaptive Moment Estimation (Adam)
- Hessian-free (HF)
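
Most of these methods correspond to ready-made optimizer classes in common deep learning frameworks. Below is a minimal sketch assuming PyTorch's torch.optim API; the toy LSTM, hyperparameter values, and placeholder loss are illustrative assumptions, not part of this page.

import torch
import torch.nn as nn

# Toy LSTM whose parameters the optimizers will update.
model = nn.LSTM(input_size=16, hidden_size=32)

# One optimizer per listed method, using PyTorch's built-in classes.
# Learning rates and other hyperparameters are illustrative only.
optimizers = {
    "SGD":            torch.optim.SGD(model.parameters(), lr=0.01),
    "SGD + momentum": torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9),
    "L-BFGS":         torch.optim.LBFGS(model.parameters(), lr=1.0),
    "Adagrad":        torch.optim.Adagrad(model.parameters(), lr=0.01),
    "Adadelta":       torch.optim.Adadelta(model.parameters(), rho=0.9),
    "RMSprop":        torch.optim.RMSprop(model.parameters(), lr=0.001, alpha=0.99),
    "Adam":           torch.optim.Adam(model.parameters(), lr=0.001, betas=(0.9, 0.999)),
    # Hessian-free (HF) optimization has no built-in torch.optim class.
}

# A typical update step with one of the first-order methods.
inputs = torch.randn(5, 3, 16)   # (seq_len, batch, input_size)
optimizer = optimizers["Adam"]
optimizer.zero_grad()
output, _ = model(inputs)
loss = output.sum()              # placeholder loss for illustration
loss.backward()
optimizer.step()                 # L-BFGS would instead require a closure re-evaluating the loss

Note that L-BFGS differs from the first-order methods in practice: its step() expects a closure that recomputes the loss, because the line search may evaluate the objective several times per update.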