L1 and L2 Regularization

{{#seo:
|title=PRIMO.ai
|titlemode=append
|keywords=artificial, intelligence, machine, learning, models, algorithms, data, singularity, moonshot, Tensorflow, Google, Nvidia, Microsoft, Azure, Amazon, AWS
|description=Helpful resources for your journey with artificial intelligence; videos, articles, techniques, courses, profiles, and tools
}}
[https://www.youtube.com/results?search_query=L1+L2+Regularization+Dropout+Overfitting Youtube search...]
[https://www.google.com/search?q=L1+L2+Regularization+Dropout+deep+machine+learning+ML ...Google search]
  
 
Mathematically speaking, L1 regularization adds the sum of the absolute values of the weights to the cost function as a regularization term, penalizing coefficients so they do not fit the training data so perfectly that the model overfits. There is also L2 regularization, where the regularization term is the sum of the squares of the weights.
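A minimal sketch of the two penalties, assuming <math>J_0</math> denotes the unregularized cost function, <math>\lambda</math> the regularization strength, and <math>w_i</math> the model weights (these symbols are illustrative; the page itself does not fix a notation):

: <math>J_{L1}(w) = J_0(w) + \lambda \sum_i |w_i|</math>
: <math>J_{L2}(w) = J_0(w) + \lambda \sum_i w_i^2</math>

The absolute-value penalty tends to drive some weights exactly to zero, while the squared penalty shrinks all weights smoothly toward zero.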
  
  
Good practices for addressing the [[Overfitting Challenge]]:

* add more data
* use [[Data Augmentation]]
* use [[Data Quality#Batch Norm(alization) & Standardization|Batch Norm(alization) & Standardization]]
* use architectures that generalize well
* reduce architecture complexity
* add [[Regularization]]
** [[L1 and L2 Regularization]] - update the general cost function by adding another term known as the regularization term (see the code sketch after this list)
** [[Dropout]] - at every iteration, randomly select some nodes and temporarily remove them, along with all of their incoming and outgoing connections
** [[Data Augmentation, Data Labeling, and Auto-Tagging|Data Augmentation]]
** [[Early Stopping]]
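The page itself contains no code; the following is a minimal sketch, assuming the TensorFlow/Keras API, of how the L1 and L2 penalties, [[Dropout]], and [[Early Stopping]] listed above can be combined in one model. Layer sizes, penalty strengths, and the training data are placeholders, not values taken from this page.

<pre>
import tensorflow as tf
from tensorflow.keras import layers, regularizers

# Illustrative model; layer sizes and penalty strengths are placeholders.
model = tf.keras.Sequential([
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l1(1e-4)),  # L1: penalize sum of |w|
    layers.Dropout(0.5),                                     # Dropout: randomly drop half the units each step
    layers.Dense(64, activation="relu",
                 kernel_regularizer=regularizers.l2(1e-4)),  # L2: penalize sum of w^2
    layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Early Stopping: halt training once the validation loss stops improving.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

# x_train, y_train, x_val, y_val are assumed to exist; they are not part of this page.
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=100, callbacks=[early_stop])
</pre>

In this setup Keras adds the L1 and L2 penalty terms to the training loss automatically, so the loss function itself does not need to be rewritten.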
  
  
 
<youtube>xyymDGReKdY</youtube>

<youtube>CEFcwpBneFo</youtube>
