
Overfitting: Techniques to Prevent It

Overfitting: the model performs very well on the training data but very poorly on validation or test data.
Bias-variance tradeoff: Bias is the difference between the average prediction of our model and the correct value we are trying to predict. A model with high bias pays very little attention to the training data and oversimplifies the problem, which leads to high error on both training and test data. Variance refers to the variability of the model's prediction for a given data point, i.e. how spread out the predictions are. A model with high variance pays too much attention to the training data and does not generalize to data it hasn't seen before. As a result, such models perform very well on training data but have high error rates on test data.
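The tradeoff is often summarized by the standard decomposition of the expected squared error (a textbook identity, stated here only for reference):

```latex
% Standard bias-variance decomposition of the expected squared error
E\big[(y - \hat{f}(x))^2\big] = \mathrm{Bias}\big[\hat{f}(x)\big]^2 + \mathrm{Var}\big[\hat{f}(x)\big] + \sigma^2
```

where σ² is the irreducible noise in the data: high-bias models are dominated by the first term (underfitting), high-variance models by the second (overfitting).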

Techniques:

Train on a large, clean dataset (well-distributed images)

Cross-Validation:

To evaluate a model, we can use various methods of cross-validation (CV). Some of the most commonly used ones are listed below; a stratified k-fold sketch follows the list:
  1. Hold-out
  2. k-fold
  3. Leave-one-out
  4. Leave-p-out
  5. Stratified k-fold
  6. Repeated k-fold
  7. Nested k-fold
  8. Time-series CV
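
As a minimal sketch (using scikit-learn, with a placeholder dataset and model), stratified 5-fold cross-validation might look like this:

```python
# Illustrative sketch: stratified 5-fold cross-validation with scikit-learn.
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

X, y = load_iris(return_X_y=True)          # placeholder dataset
model = LogisticRegression(max_iter=1000)  # placeholder model

# Stratified folds keep the class proportions roughly equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
scores = cross_val_score(model, X, y, cv=cv)
print(f"fold accuracies: {scores}, mean: {scores.mean():.3f}")
```

A large gap between training accuracy and the cross-validated score is a typical sign of overfitting.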

Data Augmentation
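
For image data this usually means random flips, rotations, and zooms. A minimal sketch with tf.keras preprocessing layers (the specific ranges are illustrative, not from this note):

```python
# Illustrative sketch: random image augmentations as tf.keras layers.
import tensorflow as tf

data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),  # mirror images left/right
    tf.keras.layers.RandomRotation(0.1),       # rotate by up to ~36 degrees
    tf.keras.layers.RandomZoom(0.1),           # zoom in/out by up to 10%
])
# These layers can sit at the start of a model or be mapped over a tf.data
# pipeline; they are only active during training.
```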

Learning Rate Scheduling
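
One common option is to decay the learning rate over time. A minimal sketch with tf.keras (the decay values are illustrative):

```python
# Illustrative sketch: exponential learning-rate decay in tf.keras.
import tensorflow as tf

lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=1e-3,  # starting learning rate (illustrative)
    decay_steps=1000,            # apply the decay every 1000 training steps
    decay_rate=0.9,              # multiply the learning rate by 0.9 each time
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
# Pass `optimizer` to model.compile(...).
```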

Feature Selection
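
Dropping uninformative features reduces the number of parameters the model can use to memorize noise. A minimal sketch with scikit-learn's SelectKBest (the dataset and k are placeholders):

```python
# Illustrative sketch: keep the k highest-scoring features with scikit-learn.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)          # placeholder dataset (30 features)
selector = SelectKBest(score_func=f_classif, k=10)  # keep the 10 most informative features
X_reduced = selector.fit_transform(X, y)
print(X.shape, "->", X_reduced.shape)               # (569, 30) -> (569, 10)
```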

Regularization

  1. L1 Regularization
  2. L2 Regularization
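
Both penalties add a function of the weights to the loss so that large weights are discouraged. A minimal sketch with scikit-learn, using Lasso for L1 and Ridge for L2 (the dataset and alpha values are placeholders):

```python
# Illustrative sketch: L1 (Lasso) and L2 (Ridge) penalties with scikit-learn.
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Lasso, Ridge

X, y = load_diabetes(return_X_y=True)  # placeholder regression dataset

# L1 adds alpha * sum(|w|) to the loss and tends to drive weak weights to exactly zero.
lasso = Lasso(alpha=0.1).fit(X, y)

# L2 adds alpha * sum(w^2) to the loss and shrinks all weights toward zero.
ridge = Ridge(alpha=1.0).fit(X, y)
```

In neural networks the same idea shows up as a weight penalty on individual layers, e.g. the kernel_regularizer argument on Keras layers.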

Remove layers/Number of units per layer
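
A lower-capacity network has fewer parameters available to memorize the training set. A minimal sketch in tf.keras (the layer sizes are illustrative):

```python
# Illustrative sketch: shrinking model capacity in tf.keras.
import tensorflow as tf

# Higher-capacity model that may overfit a small dataset.
big_model = tf.keras.Sequential([
    tf.keras.Input(shape=(100,)),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(512, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Reduced model: one hidden layer with far fewer units.
small_model = tf.keras.Sequential([
    tf.keras.Input(shape=(100,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
```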

Dropout

Randomly ignoring (zeroing out) a subset of the network's units at each training step, so that no unit can rely too heavily on any other.
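
A minimal sketch in tf.keras (the dropout rates are illustrative):

```python
# Illustrative sketch: dropout layers in tf.keras.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(100,)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dropout(0.5),  # randomly zero 50% of the activations each training step
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dropout(0.3),  # a lighter rate deeper in the network
    tf.keras.layers.Dense(10, activation="softmax"),
])
# Dropout is only applied during training; it is automatically disabled at inference.
```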


Early Stopping
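
Training is halted once the validation metric stops improving, before the model starts to memorize the training set. A minimal sketch with a tf.keras callback (the patience value is illustrative):

```python
# Illustrative sketch: early stopping on validation loss in tf.keras.
import tensorflow as tf

early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",         # watch the validation loss
    patience=5,                 # stop after 5 epochs without improvement
    restore_best_weights=True,  # roll back to the best epoch seen
)
# model.fit(x_train, y_train, validation_split=0.2, epochs=100, callbacks=[early_stop])
```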

Ensembling

  1. Bagging: It uses complex base models and tries to "smooth out" their predictions.
  2. Boosting: It uses simple base models and tries to "boost" their aggregate complexity.
  3. Stacking: It trains a meta-model on the predictions of several base models.
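
A minimal sketch of bagging and boosting with scikit-learn (random forests stand in for bagging, gradient boosting for boosting; the dataset and hyperparameters are placeholders):

```python
# Illustrative sketch: bagging vs. boosting ensembles with scikit-learn.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)  # placeholder dataset

# Bagging: many deep (complex) trees trained on bootstrap samples, then averaged.
bagging = RandomForestClassifier(n_estimators=200, random_state=42)

# Boosting: many shallow (simple) trees added sequentially to fix earlier errors.
boosting = GradientBoostingClassifier(n_estimators=200, max_depth=2, random_state=42)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```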