Difference between revisions of "L1 and L2 Regularization"

Line 9:
* add more data
* use [[Data Augmentation]]
- * use [[Batch Normalization]]
+ * use [[Batch Norm(alization) & Standardization]]
* use architectures that generalize well
* reduce architecture complexity

Revision as of 18:32, 2 January 2019


Mathematically speaking, L1 regularization adds the sum of the absolute values of the weights to the loss as a penalty term, which discourages the coefficients from fitting the training data so closely that the model overfits. L2 regularization instead adds the sum of the squares of the weights.
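
As a rough illustration of the two penalty terms, here is a minimal sketch in plain Python. The function names, the `lam` scaling coefficient, and the example weights are all assumptions made for this illustration and are not taken from any particular library. Note that the L1 term uses absolute values, so negative weights do not cancel positive ones.

```python
def l1_penalty(weights):
    """L1 term: sum of the absolute values of the weights."""
    return sum(abs(w) for w in weights)


def l2_penalty(weights):
    """L2 term: sum of the squares of the weights."""
    return sum(w * w for w in weights)


def regularized_loss(data_loss, weights, lam=0.01, kind="l2"):
    """Add a scaled penalty to an already computed data loss.

    `lam` (hypothetical name) controls how strongly large weights
    are penalized; `kind` selects "l1" or "l2".
    """
    if kind == "l1":
        penalty = l1_penalty(weights)
    elif kind == "l2":
        penalty = l2_penalty(weights)
    else:
        raise ValueError("kind must be 'l1' or 'l2'")
    return data_loss + lam * penalty


# Illustrative weights only, not from a trained model.
weights = [0.5, -2.0, 0.0, 1.5]
print(l1_penalty(weights))  # 4.0
print(l2_penalty(weights))  # 6.5
```

Because the L2 term squares each weight, it penalizes the single large weight (-2.0) far more heavily than the L1 term does, which is one way to see why the two penalties shape the coefficients differently.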


Good practices for addressing the Overfitting Challenge: