In any machine learning project, we want to find the most suitable model. To do this, we train several candidate models on the data and select the one with the best performance.
The performance of the selected model can then be improved further through hyperparameter optimization.
What is Hyperparameter Optimization in Machine Learning?
Hyperparameter optimization, or tuning, is the process of finding the combination of hyperparameter values that yields an optimal model for a given problem.
Hyperparameters can be tuned using methods such as manual search, grid search, random search, and Bayesian optimization. Performing this tuning by hand takes a lot of time and effort; scikit-learn's GridSearchCV automates the process.
What is GridSearchCV?
GridSearchCV is a scikit-learn utility that performs hyperparameter optimization by exhaustively evaluating a specified grid of hyperparameter values for a given machine learning model.
How Does GridSearchCV Work?
Step 1: A dictionary mapping hyperparameter names to lists of candidate values is passed in. For example, the grid for a decision tree could be:
- criterion: gini or entropy
- max_depth: between 5 and 50
- min_samples_split: between 2 and 5
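The grid above can be written as a plain Python dictionary. The specific values chosen here are illustrative, not prescribed by the article:

```python
# Hypothetical parameter grid for a DecisionTreeClassifier.
# Keys are the estimator's hyperparameter names; values are the
# candidate settings GridSearchCV will try.
param_grid = {
    "criterion": ["gini", "entropy"],
    "max_depth": [5, 10, 20, 50],
    "min_samples_split": [2, 3, 4, 5],
}
```

GridSearchCV will form every combination of these lists: 2 x 4 x 4 = 32 candidates.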
Step 2: GridSearchCV builds every possible combination of the supplied hyperparameter values.
Step 3: Model performance is evaluated for each combination using the Cross-Validation method.
Step 4: The combination with the highest model evaluation score is selected.
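The four steps above can be sketched end to end. This is a minimal example, assuming scikit-learn is installed and using the built-in Iris dataset as stand-in data:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Step 1: dictionary of candidate hyperparameter values (illustrative).
param_grid = {
    "criterion": ["gini", "entropy"],
    "max_depth": [5, 10, 20],
    "min_samples_split": [2, 5],
}

# Steps 2-3: try all 2 * 3 * 2 = 12 combinations, scoring each
# with 5-fold cross-validation (cv=5).
search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)

# Step 4: the best-scoring combination and its mean CV score.
print(search.best_params_)
print(search.best_score_)
```

After fitting, `best_params_` holds the winning combination and `cv_results_` records the score of every candidate.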
What is Cross-Validation?
Cross-validation is a model evaluation technique that uses different portions of the data for training and testing across different iterations, so every observation is used for testing exactly once.
Advantages of GridSearchCV
- The process is easy to understand.
- It can find the best hyperparameters for a given model.
Drawbacks of GridSearchCV
- It takes a lot of time, because it evaluates every possible combination of hyperparameter values.
- The number of combinations grows multiplicatively with each added hyperparameter, so the run time grows exponentially in the number of hyperparameters. RandomizedSearchCV can be a better option in such cases.
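For comparison, a minimal RandomizedSearchCV sketch: instead of trying all combinations, `n_iter` controls how many are sampled at random (the grid and dataset here are illustrative, not from the article):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Full grid would be 2 * 46 * 4 = 368 combinations...
param_distributions = {
    "criterion": ["gini", "entropy"],
    "max_depth": list(range(5, 51)),
    "min_samples_split": [2, 3, 4, 5],
}

# ...but n_iter=10 samples only 10 of them, each scored with 5-fold CV.
search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_distributions,
    n_iter=10,
    cv=5,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```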