Difference between revisions of "Optimization Methods"
[http://www.youtube.com/results?search_query=Optimization+LSTM+SGD+Adagrad+Adadelta+RMSprop+Adam+BGFS Youtube search...]

* [[Natural Language Processing (NLP), Natural Language Inference (NLI) and Recognizing Textual Entailment (RTE)]]
* [[Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Recurrent Neural Network (RNN)]]
* [[Average-SGD Weight-Dropped LSTM (AWD-LSTM)]]
* [[Gradient Boosting Algorithms]]

Methods:
* Stochastic gradient descent (SGD) (with and without momentum)
* Limited-memory BFGS (L-BFGS)
* Adagrad
* Adadelta
* Root Mean Square Propagation (RMSprop)
* Adam
* Hessian-free (HF)

<youtube>JXQT_vxqwIs</youtube>
<youtube>k8fTYJPd3_I</youtube>
<youtube>_e-LFe_igno</youtube>
<youtube>kK8-jCCR4is</youtube>
<youtube>VINCQghQRuM</youtube>
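As a rough illustration of how several of the listed methods differ, their core update rules can be sketched in plain scalar Python. This is a minimal sketch, not any library's actual implementation; the hyperparameters, function names, and toy objective below are all illustrative assumptions.

```python
def sgd_momentum(grad, w=0.0, lr=0.1, beta=0.9, steps=100):
    # Classical momentum: v <- beta*v + g, then w <- w - lr*v
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(w)
        w -= lr * v
    return w

def adagrad(grad, w=0.0, lr=0.5, eps=1e-8, steps=300):
    # Adagrad: scale each step by the accumulated sum of squared gradients
    s = 0.0
    for _ in range(steps):
        g = grad(w)
        s += g * g
        w -= lr * g / (s ** 0.5 + eps)
    return w

def rmsprop(grad, w=0.0, lr=0.02, decay=0.9, eps=1e-8, steps=500):
    # RMSprop: replace Adagrad's ever-growing sum with a decaying average
    s = 0.0
    for _ in range(steps):
        g = grad(w)
        s = decay * s + (1 - decay) * g * g
        w -= lr * g / (s ** 0.5 + eps)
    return w

def adam(grad, w=0.0, lr=0.05, b1=0.9, b2=0.999, eps=1e-8, steps=300):
    # Adam: bias-corrected first (m) and second (v) moment estimates
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (v_hat ** 0.5 + eps)
    return w

# Toy objective f(w) = (w - 3)^2, gradient 2*(w - 3); minimum at w = 3
grad = lambda w: 2.0 * (w - 3.0)
for opt in (sgd_momentum, adagrad, rmsprop, adam):
    print(opt.__name__, round(opt(grad), 3))
```

All four optimizers drive the toy parameter toward the minimum at 3; they differ mainly in how the step size is adapted per iteration (fixed with momentum, shrinking with accumulated gradients, or normalized by a running moment estimate).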
Revision as of 06:24, 24 October 2018