Optimization Methods
- Natural Language Processing (NLP)
- Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Recurrent Neural Network (RNN)
- Average-SGD Weight-Dropped LSTM (AWD-LSTM)
- Gradient Boosting Algorithms
Methods (a minimal sketch of two of these update rules follows the list):
- Stochastic Gradient Descent (SGD), with and without momentum
- Limited-memory BFGS (L-BFGS)
- Adagrad
- Adadelta
- Root Mean Square Propagation (RMSProp)
- Adam
- Hessian-free (HF)
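The update rules for several of these optimizers are short enough to write out directly. Below is a minimal NumPy sketch, not taken from any of the linked pages, of two of the methods above: SGD with momentum and Adam, applied to a toy quadratic loss. All hyperparameter values and function names here are illustrative assumptions, not recommendations.

```python
# Illustrative sketch of two update rules from the list above.
# Hyperparameters are assumed values for the toy problem, not defaults.
import numpy as np

def grad(w):
    """Gradient of the toy loss f(w) = 0.5 * ||w||^2, minimized at w = 0."""
    return w

def sgd_momentum(w, steps=100, lr=0.1, beta=0.9):
    """SGD with (heavy-ball) momentum: v <- beta*v + grad; w <- w - lr*v."""
    v = np.zeros_like(w)
    for _ in range(steps):
        v = beta * v + grad(w)
        w = w - lr * v
    return w

def adam(w, steps=200, lr=0.05, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: bias-corrected first and second moment estimates of the gradient."""
    m = np.zeros_like(w)
    v = np.zeros_like(w)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g       # first moment (running mean)
        v = beta2 * v + (1 - beta2) * g**2    # second moment (uncentered variance)
        m_hat = m / (1 - beta1**t)            # bias correction for zero init
        v_hat = v / (1 - beta2**t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w

w0 = np.array([5.0, -3.0])
print("SGD + momentum:", sgd_momentum(w0))  # both should approach the minimum at 0
print("Adam:          ", adam(w0))
```

Note how Adam combines the two ideas that precede it in the list: a momentum-style running mean of the gradient (first moment) and an RMSProp-style per-parameter scaling by the running mean of squared gradients (second moment).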