In this blog, I'll try to build an intuitive understanding of overfitting and the techniques generally used to reduce it.
Let's assume you are part of a gameshow where you have to predict a number based on some given data (just like an ML model!).
| Input | Output |
|---|---|
| 1 | 3 |
| 2 | 5 |
| 3 | 7 |
| 4 | 9 |
| ⋮ | ⋮ |
There are various ways one might tackle this problem.
Guess and retry
You can come up with an algorithm in your head and use it to guess the number. Then, based on how far your prediction is from the actual value, you can adjust your algorithm.
For example, let's say that based on your knowledge of the previous inputs, you come up with an algorithm/function which predicts that the output for 5 is 10. You are told that the correct value (true value) is higher than that. So you change your algorithm, and it now predicts 12, which turns out to be higher than the correct value. You repeat these predict-and-update steps until you get the correct values for all your input data.
This method is analogous to how gradient descent works: the model makes predictions and updates its weights based on the gradient of the loss.
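As a minimal sketch of that predict-and-update loop, here is plain gradient descent fitting a linear model `y = w*x + b` to the gameshow table above (the variable names and learning rate are my own choices, not anything prescribed):

```python
# The gameshow data: output = 2 * input + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = 0.0, 0.0   # initial guess for the "algorithm"
lr = 0.05         # learning rate: how big each adjustment is

for step in range(2000):
    # Average gradient of the mean-squared error over all examples
    grad_w = grad_b = 0.0
    for x, y in zip(xs, ys):
        err = (w * x + b) - y          # prediction minus true value
        grad_w += 2 * err * x / len(xs)
        grad_b += 2 * err / len(xs)
    # Nudge the parameters against the gradient (predict, compare, update)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w * 5 + b))  # prediction for the unseen input 5 → 11
```

After enough updates, `w` and `b` settle near 2 and 1, so the model generalizes to inputs it never saw, instead of memorizing the table.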
Memorization
You can also memorize the current input and output values, hoping the game master asks you one of those exact inputs!
However, let's say you cannot remember all the data on your own. Then you can pull out your handy scratchpad and jot it all down.
This is analogous to overfitting: the model starts memorizing the training data rather than learning the underlying features, which leads to poor performance on unseen data.
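To see why memorization fails, here is a deliberately silly "model" that is just a lookup table over the gameshow data (a sketch of my own, not a real training setup):

```python
# "Memorizing" the data is just storing a lookup table.
seen = {1: 3, 2: 5, 3: 7, 4: 9}

def memorized_predict(x):
    # Perfect on inputs it has seen, clueless on anything new
    return seen.get(x)

print(memorized_predict(3))  # 7    (training data: perfect score)
print(memorized_predict(5))  # None (unseen data: no answer at all)
```

Zero error on the training data, yet completely useless on new inputs — the extreme version of what an overfitted network does.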
So how do we stop our model from overfitting? Here are some common ways.
Dropout Layers
What if it is announced that the game master can take away your scratchpad whenever he wants? Now you can't jot things down and rely on that scratchpad alone. This forces you to actually develop an algorithm.
This is how dropout layers work: by randomly zeroing out neuron activations during training, the model cannot depend too heavily on any single neuron, forcing it to learn more robust features.
Increasing the Data
If the data given to you is huge, then no matter how much you write down and memorize, you cannot capture it all. This, again, forces you to develop an algorithm to predict the output.
I have tried to give a basic understanding of overfitting in this post. This is my first blog, and I would really appreciate any feedback.
~See ya'
By Ashed