Optimization Methods - Revision history

BPeat at 01:27, 6 March 2024

2024-03-06T01:27:52Z

BPeat at 01:25, 6 March 2024

2024-03-06T01:25:08Z

BPeat at 01:22, 6 March 2024

2024-03-06T01:22:04Z

BPeat at 11:36, 2 March 2024

2024-03-02T11:36:24Z

BPeat at 11:28, 2 March 2024

2024-03-02T11:28:03Z

BPeat at 12:25, 31 December 2023

2023-12-31T12:25:30Z

BPeat at 00:32, 17 August 2023

2023-08-17T00:32:16Z

BPeat at 14:37, 6 August 2023

2023-08-06T14:37:05Z

BPeat at 14:33, 6 August 2023

2023-08-06T14:33:51Z

BPeat at 14:27, 6 August 2023

2023-08-06T14:27:54Z

@@ Line 20: / Line 20: @@
 [https://www.bing.com/news/search?q=ai+Optimization+method&qft=interval%3d%228%22 ...Bing News]
-* [[Agents#AI Agent Optimization|AI Agent Optimization]] ... [[Optimization Methods]] ... [[Optimizer]] ... [[Exploration]]
+* [[Agents#AI Agent Optimization|AI Agent Optimization]] ... [[Optimization Methods]] ... [[Optimizer]] ... [[Objective vs. Cost vs. Loss vs. Error Function]] ... [[Exploration]]
 * [[Backpropagation]] ... [[Feed Forward Neural Network (FF or FFNN)|FFNN]] ... [[Forward-Forward]] ... [[Activation Functions]] ...[[Softmax]] ... [[Loss]] ... [[Boosting]] ... [[Gradient Descent Optimization & Challenges|Gradient Descent]] ... [[Algorithm Administration#Hyperparameter|Hyperparameter]] ... [[Manifold Hypothesis]] ... [[Principal Component Analysis (PCA)|PCA]]
 * [[Large Language Model (LLM)]] ... [[Natural Language Processing (NLP)]]  ...[[Natural Language Generation (NLG)|Generation]] ... [[Natural Language Classification (NLC)|Classification]] ...  [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding]] ... [[Language Translation|Translation]] ... [[Natural Language Tools & Services|Tools & Services]]

@@ Line 20: / Line 20: @@
 [https://www.bing.com/news/search?q=ai+Optimization+method&qft=interval%3d%228%22 ...Bing News]
-* [[Agents#AI Agent Optimization|AI Agent Optimization]]
+* [[Agents#AI Agent Optimization|AI Agent Optimization]] ... [[Optimization Methods]] ... [[Optimizer]] ... [[Exploration]]
 * [[Backpropagation]] ... [[Feed Forward Neural Network (FF or FFNN)|FFNN]] ... [[Forward-Forward]] ... [[Activation Functions]] ...[[Softmax]] ... [[Loss]] ... [[Boosting]] ... [[Gradient Descent Optimization & Challenges|Gradient Descent]] ... [[Algorithm Administration#Hyperparameter|Hyperparameter]] ... [[Manifold Hypothesis]] ... [[Principal Component Analysis (PCA)|PCA]]
 * [[Large Language Model (LLM)]] ... [[Natural Language Processing (NLP)]]  ...[[Natural Language Generation (NLG)|Generation]] ... [[Natural Language Classification (NLC)|Classification]] ...  [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding]] ... [[Language Translation|Translation]] ... [[Natural Language Tools & Services|Tools & Services]]

@@ Line 21: / Line 21: @@
 * [[Agents#AI Agent Optimization|AI Agent Optimization]]
 * [[Backpropagation]] ... [[Feed Forward Neural Network (FF or FFNN)|FFNN]] ... [[Forward-Forward]] ... [[Activation Functions]] ...[[Softmax]] ... [[Loss]] ... [[Boosting]] ... [[Gradient Descent Optimization & Challenges|Gradient Descent]] ... [[Algorithm Administration#Hyperparameter|Hyperparameter]] ... [[Manifold Hypothesis]] ... [[Principal Component Analysis (PCA)|PCA]]
 * [[Large Language Model (LLM)]] ... [[Natural Language Processing (NLP)]]  ...[[Natural Language Generation (NLG)|Generation]] ... [[Natural Language Classification (NLC)|Classification]] ...  [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding]] ... [[Language Translation|Translation]] ... [[Natural Language Tools & Services|Tools & Services]]
-* [[Recurrent Neural Network (RNN)]]
+* [[Recurrent Neural Network (RNN)]]]]
 ** [[Average-Stochastic Gradient Descent (SGD) Weight-Dropped LSTM (AWD-LSTM)]]
 * Gradient [[Boosting]] Algorithms

@@ Line 20: / Line 20: @@
 [https://www.bing.com/news/search?q=ai+Optimization+method&qft=interval%3d%228%22 ...Bing News]
 * [[Backpropagation]] ... [[Feed Forward Neural Network (FF or FFNN)|FFNN]] ... [[Forward-Forward]] ... [[Activation Functions]] ...[[Softmax]] ... [[Loss]] ... [[Boosting]] ... [[Gradient Descent Optimization & Challenges|Gradient Descent]] ... [[Algorithm Administration#Hyperparameter|Hyperparameter]] ... [[Manifold Hypothesis]] ... [[Principal Component Analysis (PCA)|PCA]]
 * [[Large Language Model (LLM)]] ... [[Natural Language Processing (NLP)]]  ...[[Natural Language Generation (NLG)|Generation]] ... [[Natural Language Classification (NLC)|Classification]] ...  [[Natural Language Processing (NLP)#Natural Language Understanding (NLU)|Understanding]] ... [[Language Translation|Translation]] ... [[Natural Language Tools & Services|Tools & Services]]

@@ Line 32: / Line 32: @@
 SGD is a fundamental optimization algorithm used in training machine learning models. It updates the model's parameters based on the gradients of the loss function with respect to the training data. In each iteration, a random subset (mini-batch) of training data is used to compute the gradients, making it computationally efficient. The model parameters are then adjusted in the opposite direction of the gradient to minimize the loss function. SGD can be enhanced with momentum, which adds a fraction of the previous parameter update to the current update, helping to accelerate convergence in certain cases.
-<b>Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS)</b>:
+<b>Limited-[[memory]] Broyden-Fletcher-Goldfarb-Shanno (L-BFGS)</b>:
-L-BFGS is a popular optimization method for unconstrained optimization problems. It is based on the quasi-Newton method and uses a limited-memory approach to approximate the inverse Hessian matrix. This approximation allows efficient updates of the model's parameters without explicitly computing the full Hessian matrix, making it suitable for large-scale machine learning problems.
+L-BFGS is a popular optimization method for unconstrained optimization problems. It is based on the quasi-Newton method and uses a limited-[[memory]] approach to approximate the inverse Hessian matrix. This approximation allows efficient updates of the model's parameters without explicitly computing the full Hessian matrix, making it suitable for large-scale machine learning problems.
 <b>Adagrad</b>:
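
For illustration, the limited-memory idea in the L-BFGS description above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the article's or any library's implementation: the names `dot`, `lbfgs_direction`, `minimize_lbfgs`, the `memory` parameter, the backtracking line search, and the quadratic test function are all hypothetical choices made for the example. The key point is that only the most recent `memory` pairs of parameter and gradient differences are stored, and the inverse Hessian is never formed explicitly.

```python
from collections import deque

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lbfgs_direction(grad, history):
    """Two-loop recursion: approximate -(inverse Hessian) * grad from stored (s, y) pairs."""
    q = list(grad)
    alphas = []
    for s, y in reversed(history):                 # newest pair first
        rho = 1.0 / dot(y, s)
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        alphas.append((alpha, rho))
    if history:                                    # scale by s.y / y.y of the newest pair
        s, y = history[-1]
        gamma = dot(s, y) / dot(y, y)
        q = [gamma * qi for qi in q]
    for (s, y), (alpha, rho) in zip(history, reversed(alphas)):  # oldest pair first
        beta = rho * dot(y, q)
        q = [qi + (alpha - beta) * si for qi, si in zip(q, s)]
    return [-qi for qi in q]

def minimize_lbfgs(f, grad_f, x0, memory=5, max_iter=200, tol=1e-8):
    """Unconstrained minimization keeping only `memory` recent (s, y) pairs."""
    x = list(x0)
    g = grad_f(x)
    history = deque(maxlen=memory)                 # old pairs are dropped automatically
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:
            break
        d = lbfgs_direction(g, history)
        # Backtracking line search with a sufficient-decrease (Armijo) test.
        step, fx, slope = 1.0, f(x), dot(g, d)
        while (step > 1e-12 and
               f([xi + step * di for xi, di in zip(x, d)]) > fx + 1e-4 * step * slope):
            step *= 0.5
        x_new = [xi + step * di for xi, di in zip(x, d)]
        g_new = grad_f(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        if dot(s, y) > 1e-12:                      # keep only positive-curvature pairs
            history.append((s, y))
        x, g = x_new, g_new
    return x

# Hypothetical test problem: a convex quadratic with its minimum at (1, -2).
def f(x):
    return (x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2

def grad_f(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]

x_min = minimize_lbfgs(f, grad_f, [0.0, 0.0])
```

On this small quadratic the result should land close to (1, -2). Real implementations typically use a stronger line search (for example one enforcing the Wolfe conditions) than the simple backtracking assumed here.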

@@ Line 26: / Line 26: @@
 * Gradient [[Boosting]] Algorithms
-These optimization methods play a crucial role in training AI models, and their selection depends on the nature of the problem, the architecture of the model, and the size of the dataset, among other factors. Experimentation and fine-tuning of the optimization algorithm often lead to improved training performance and model convergence. Methods:
+These optimization methods play a crucial role in training AI models, and their selection depends on the nature of the problem, the architecture of the model, and the size of the dataset, among other factors. Experimentation and [[fine-tuning]] of the optimization algorithm often lead to improved training performance and model convergence. Methods:
 <b>Stochastic Gradient Descent (SGD)</b>:
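
As a rough illustration of what the article text under this heading describes (mini-batch gradients, a step against the gradient, and momentum that reuses a fraction of the previous update), here is a minimal pure-Python sketch. The function name `sgd_momentum`, the one-dimensional linear model, the synthetic data, and the hyperparameter values are assumptions chosen for the example and do not come from the article.

```python
import random

def sgd_momentum(data, lr=0.02, momentum=0.9, batch_size=5, epochs=200, seed=0):
    """Fit y = w*x + b by mini-batch SGD with momentum (illustrative only)."""
    rng = random.Random(seed)
    data = list(data)
    w, b = 0.0, 0.0            # model parameters
    vw, vb = 0.0, 0.0          # previous parameter updates
    for _ in range(epochs):
        rng.shuffle(data)      # random mini-batches each epoch
        for i in range(0, len(data), batch_size):
            batch = data[i:i + batch_size]
            # Gradient of the mean squared error over this mini-batch only.
            gw = sum(2.0 * (w * x + b - y) * x for x, y in batch) / len(batch)
            gb = sum(2.0 * (w * x + b - y) for x, y in batch) / len(batch)
            # Current update = fraction of previous update + step against the gradient.
            vw = momentum * vw - lr * gw
            vb = momentum * vb - lr * gb
            w += vw
            b += vb
    return w, b

# Synthetic, noise-free data generated from y = 3x + 1 (an assumption for the demo).
points = [(k / 10.0, 3.0 * (k / 10.0) + 1.0) for k in range(20)]
w, b = sgd_momentum(points)
```

With these settings the fitted `w` and `b` should come out close to 3 and 1. Setting `momentum=0.0` reduces the loop to plain mini-batch SGD, which makes it easy to compare how quickly the two variants converge on the same data.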

Revision as of 14:37, 6 August 2023
@@ Line 79: / Line 79: @@
 <youtube>kK8-jCCR4is</youtube>
 <youtube>VINCQghQRuM</youtube>
+<youtube>4_jiFQXPAsw</youtube>