L1 and L2 Regularization
Revision as of 13:37, 30 December 2018
Mathematically speaking, L1 regularization adds the sum of the absolute values of the weights to the cost function as a penalty term, discouraging the coefficients from fitting the training data so closely that the model overfits. There is also L2 regularization, which instead adds the sum of the squares of the weights.
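As a minimal sketch of the idea above, the loss of a linear model can be augmented with either penalty. The function name, the example data, and the penalty strength `lam` are illustrative choices, not from the article:

```python
import numpy as np

def regularized_loss(w, X, y, lam=0.1, penalty="l2"):
    """Mean-squared-error loss for a linear model plus a regularization term.

    penalty="l1": lam * sum(|w|)   - the L1 term (encourages sparse weights)
    penalty="l2": lam * sum(w**2)  - the L2 term (shrinks weights toward zero)
    """
    mse = np.mean((X @ w - y) ** 2)
    if penalty == "l1":
        reg = lam * np.sum(np.abs(w))
    else:
        reg = lam * np.sum(w ** 2)
    return mse + reg

w = np.array([1.0, -2.0, 0.5])
X = np.eye(3)
y = np.zeros(3)
# MSE = (1 + 4 + 0.25) / 3 = 1.75; L1 penalty = 0.1 * 3.5 = 0.35
print(regularized_loss(w, X, y, lam=0.1, penalty="l1"))  # 2.1
# L2 penalty = 0.1 * 5.25 = 0.525
print(regularized_loss(w, X, y, lam=0.1, penalty="l2"))  # 2.275
```

Because the L1 term penalizes each weight's magnitude at a constant rate, it tends to drive small weights exactly to zero, whereas the L2 term shrinks all weights proportionally.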
Good practices for addressing overfitting:
- add more data
- use Data Augmentation
- use batch normalization
- use architectures that generalize well
- reduce architecture complexity
- add Regularization
- L1 and L2 Regularization - update the general cost function by adding a regularization term.
- Dropout - at every training iteration, randomly select some nodes and temporarily remove them, along with all of their incoming and outgoing connections
- Data Augmentation
- Early Stopping
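The Dropout item above can be sketched in a few lines of NumPy. This is the common "inverted dropout" variant (an assumption on my part; the article does not specify one), in which surviving activations are rescaled during training so that inference needs no change:

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, rate=0.5, training=True):
    """Inverted dropout: during training, zero each unit with probability
    `rate` and rescale the survivors by 1/(1-rate) so the expected
    activation is unchanged; at inference time, pass values through as-is."""
    if not training or rate == 0.0:
        return activations
    mask = rng.random(activations.shape) >= rate  # True = unit is kept
    return activations * mask / (1.0 - rate)

a = np.ones(8)
# Each entry comes out as either 0.0 (dropped) or 2.0 (kept and rescaled)
print(dropout(a, rate=0.5))
print(dropout(a, rate=0.5, training=False))  # unchanged at inference
```

Dropping different random subsets of nodes each iteration prevents units from co-adapting, which is why it acts as a regularizer.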