Optimization Methods

[http://www.youtube.com/results?search_query=Optimization+LSTM+SGD+Adagrad+Adadelta+RMSprop+Adam+BGFS Youtube search...]
 
* [[Natural Language Processing (NLP)]]
 
* [[Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and Recurrent Neural Network (RNN)]]
 
* [[Average-SGD Weight-Dropped LSTM (AWD-LSTM)]]
 
Revision as of 07:09, 5 January 2019

Methods:

  • Stochastic gradient descent (SGD) (with and without momentum)
  • Limited-memory BFGS (L-BFGS)
  • Adagrad
  • Adadelta
  • Root Mean Square Propagation (RMSprop)
  • Adam
  • Hessian-free (HF)
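
Two of the listed methods, SGD with momentum and Adam, can be sketched as parameter-update rules. This is a minimal illustrative sketch, not from the original page: the toy quadratic objective, learning rates, and step counts below are assumptions chosen only to show the updates converging.

```python
import math

def sgd_momentum(grad, w, lr=0.1, beta=0.9, steps=100):
    """SGD with momentum: v accumulates a decaying sum of past gradients."""
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(w)   # momentum buffer
        w = w - lr * v           # parameter update
    return w

def adam(grad, w, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=200):
    """Adam: per-step size adapted from first/second moment estimates."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g        # first moment (mean of gradients)
        v = b2 * v + (1 - b2) * g * g    # second moment (uncentered variance)
        m_hat = m / (1 - b1 ** t)        # bias correction for zero init
        v_hat = v / (1 - b2 ** t)
        w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

# Toy convex objective f(w) = (w - 3)^2 with gradient 2*(w - 3);
# both optimizers should drive w toward the minimum at w = 3.
grad = lambda w: 2.0 * (w - 3.0)
print(sgd_momentum(grad, 0.0))
print(adam(grad, 0.0))
```

In practice these update rules come ready-made from deep learning frameworks (e.g. torch.optim.SGD with a momentum argument, and torch.optim.Adam) rather than being hand-rolled as above.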