Difference between revisions of "Dropout"


Revision as of 18:31, 2 January 2019


Deep neural nets with a large number of parameters are very powerful machine learning systems. However, overfitting is a serious problem in such networks. Large networks are also slow to use, making it difficult to deal with overfitting by combining the predictions of many different large neural nets at test time. Dropout is a technique for addressing this problem. The key idea is to randomly drop units (along with their connections) from the neural network during training. This prevents units from co-adapting too much. During training, dropout samples from an exponential number of different “thinned” networks. At test time, it is easy to approximate the effect of averaging the predictions of all these thinned networks by simply using a single unthinned network that has smaller weights. This significantly reduces overfitting and gives major improvements over other regularization methods.

Dropout: A Simple Way to Prevent Neural Networks from Overfitting | Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, Ruslan Salakhutdinov
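The train/test behaviour described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the retention probability `keep_prob = 0.8`, and the toy activations and weights are all assumptions chosen for illustration. Units are dropped at random during training, and at test time the full network is used with its weights scaled down by `keep_prob`.

```python
import numpy as np

def dropout_train(activations, keep_prob, rng):
    """Training pass: keep each unit independently with probability keep_prob."""
    mask = rng.random(activations.shape) < keep_prob  # True where the unit is retained
    return activations * mask, mask

def scale_weights_for_test(weights, keep_prob):
    """Test time: use the unthinned network with weights multiplied by keep_prob."""
    return weights * keep_prob

rng = np.random.default_rng(0)
keep_prob = 0.8                  # assumed retention probability
h = np.ones((10000, 4))          # hypothetical hidden activations, one row per sample
w = np.full((4, 1), 0.5)         # hypothetical outgoing weights

# Each row of `thinned` corresponds to one randomly sampled thinned network.
thinned, mask = dropout_train(h, keep_prob, rng)
train_mean = (thinned @ w).mean()

# A single unthinned network with smaller weights.
test_out = (h @ scale_weights_for_test(w, keep_prob))[0, 0]

print(train_mean, test_out)
```

For this single linear layer, the average output over the sampled thinned networks and the output of the scaled unthinned network agree in expectation. With nonlinear layers the scaled network is only an approximation of that average. Many libraries instead apply "inverted dropout", dividing the retained activations by `keep_prob` during training so that no rescaling is needed at test time.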


Good practices for addressing the Overfitting Challenge: