Machine Learning - Grid Search
Grid search is a simple and widely used method for finding good hyperparameters for a machine learning model. This article explains what grid search is, how it works, and when to use it.
What is Grid Search? #
Grid search is a method for optimizing the hyperparameters of a machine learning model. It exhaustively tests every combination of the specified hyperparameter values to find the combination that produces the best performance. Each combination is evaluated through cross-validation, and the best-scoring combination is used to select the final model.
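The "all combinations" idea can be made concrete with a short sketch. Here is a minimal illustration (the grid values are hypothetical, chosen only for the example) of how a parameter grid expands into the candidate set that grid search evaluates:

```python
from itertools import product

# A hypothetical grid: 2 values for max_depth x 3 values for
# min_samples_split = 6 candidate models to train and evaluate.
param_grid = {"max_depth": [2, 4], "min_samples_split": [2, 5, 10]}

keys = list(param_grid)
combinations = [dict(zip(keys, values)) for values in product(*param_grid.values())]

print(len(combinations))  # 6
print(combinations[0])    # {'max_depth': 2, 'min_samples_split': 2}
```

Every entry in `combinations` corresponds to one model that grid search will train and score.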
How It Works #
Grid search starts from a range or list of hyperparameter values specified by the user. For example, suppose we are running a grid search for a decision tree classifier: the user sets candidate values for hyperparameters such as the maximum depth of the tree (max_depth) and the minimum number of samples required to split a node (min_samples_split). Grid search then trains a model for every possible combination of these values and evaluates each one using cross-validation. Common evaluation metrics include accuracy, precision, recall, and F1 score. After the evaluation, the combination with the best score is selected.
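The procedure above can be sketched by hand as a loop over combinations, each scored with cross-validation. This is a minimal sketch, not scikit-learn's implementation; the grid values and `random_state` are illustrative assumptions:

```python
from itertools import product

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Illustrative grid of candidate hyperparameter values.
grid = {"max_depth": [2, 4], "min_samples_split": [2, 10]}

best_score, best_params = -1.0, None
for values in product(*grid.values()):
    params = dict(zip(grid, values))
    model = DecisionTreeClassifier(random_state=0, **params)
    # Score this combination with 5-fold cross-validation.
    score = cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    if score > best_score:
        best_score, best_params = score, params

print("Best params:", best_params)
print("Best CV accuracy:", round(best_score, 3))
```

GridSearchCV, covered below, automates exactly this loop and adds conveniences such as result bookkeeping and refitting.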
Advantages and Disadvantages #
Advantages:
- Easy to use and understand
- Because it explores every combination in the grid, it is guaranteed to find the best combination within that grid
Disadvantages:
- Very high computational cost: as the number of parameters and candidate values grows, the number of combinations grows multiplicatively, so the required computation increases exponentially
- Consequently, finding the optimal combination can take a long time
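The cost argument is simple arithmetic: the number of models trained is the product of the grid sizes, multiplied again by the number of cross-validation folds. The grid sizes below are hypothetical, matching the decision tree example later in the article:

```python
# Hypothetical grid: 5 depths x 3 split sizes x 3 leaf sizes.
grid_sizes = [5, 3, 3]
cv_folds = 5

n_combinations = 1
for size in grid_sizes:
    n_combinations *= size  # 5 * 3 * 3 = 45 combinations

# Each combination is trained once per fold.
n_fits = n_combinations * cv_folds
print(n_combinations, n_fits)  # 45 225
```

Adding just one more parameter with, say, 4 candidate values would multiply both numbers by 4, which is why the cost is said to grow exponentially with the number of parameters.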
When to Use #
Grid search is suitable when the parameter grid is relatively small and the model trains quickly. It is also a good choice when finding the best hyperparameter combination matters and sufficient computational resources are available. For models with a very large parameter space or long training times, however, it is advisable to consider other hyperparameter optimization techniques such as random search or Bayesian optimization.
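As a point of comparison, random search samples a fixed number of combinations instead of exhausting the grid. A minimal sketch using scikit-learn's RandomizedSearchCV (the distributions and `n_iter` are illustrative choices):

```python
from scipy.stats import randint
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Sample 10 random combinations from these distributions
# rather than evaluating every grid point.
param_distributions = {
    "max_depth": randint(1, 10),
    "min_samples_split": randint(2, 11),
}
search = RandomizedSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=10,
    cv=5,
    random_state=42,
)
search.fit(X, y)
print(search.best_params_)
```

With `n_iter=10` and 5 folds, only 50 models are trained regardless of how fine-grained the distributions are, which is the main appeal when the search space is large.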
GridSearchCV #
GridSearchCV is a class in scikit-learn's model selection module (sklearn.model_selection), used to search the hyperparameter space of a given model through cross-validation and find the optimal parameters. The main parameters that can be passed to this class's constructor are as follows:
Key Parameters #
- estimator: The model to optimize. For example, it could be scikit-learn’s estimator objects like RandomForestClassifier(), SVC(), etc.
- param_grid: A dictionary mapping parameter names to the lists of values to search. For example, {'n_estimators': [100, 200], 'max_features': ['sqrt', 'log2']} means searching the values [100, 200] for n_estimators and ['sqrt', 'log2'] for max_features. (Note that 'auto' is no longer an accepted max_features value in recent scikit-learn versions.)
- scoring: The criterion for evaluating the model's performance, specified as a string such as 'accuracy' or 'f1'. Other predefined scoring names in scikit-learn can also be used.
- cv: The strategy for cross-validation splitting. For example, 5 means 5-fold cross-validation. You can also directly pass objects of scikit-learn’s splitters like KFold, StratifiedKFold, etc.
- refit: Determines whether to retrain the model on the entire dataset after finding the optimal parameters. The default value is True, meaning the model is trained on the entire dataset with the optimal parameters.
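The parameters above can be combined in one call. The sketch below (SVC, the grid values, and the metric are illustrative assumptions, not the article's later example) shows passing an explicit splitter object for cv and a scoring string:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# cv accepts either an integer (number of folds) or a splitter object;
# scoring accepts any predefined metric name such as 'f1_macro'.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(
    estimator=SVC(),
    param_grid={"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]},
    scoring="f1_macro",
    cv=cv,
    refit=True,  # refit the best model on the full training data
)
search.fit(X, y)
print(search.best_params_)
```

Because `refit=True`, `search.best_estimator_` is ready to make predictions immediately after `fit` returns.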
Implementing GridSearchCV #
The following is a simple example of using scikit-learn's GridSearchCV to find the optimal hyperparameters for a classifier, here a decision tree (DecisionTreeClassifier).
```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load data
iris = load_iris()
X = iris.data
y = iris.target

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Set up the model
estimator = DecisionTreeClassifier()

# Set the grid of parameters to search
param_grid = {
    'max_depth': [None, 2, 4, 6, 8],
    'min_samples_split': [2, 5, 10],
    'min_samples_leaf': [1, 2, 4]
}

# Set up GridSearchCV
grid_search = GridSearchCV(
    estimator=estimator,
    param_grid=param_grid,
    scoring='accuracy',
    cv=5,
    refit=True,
)

# Perform the grid search
grid_search.fit(X_train, y_train)

# Print the best parameters and the best cross-validation score
print("Best parameters:", grid_search.best_params_)
print("Best score:", grid_search.best_score_)

# Evaluate performance on the test data
test_accuracy = grid_search.score(X_test, y_test)
print("Test accuracy:", test_accuracy)
```
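Beyond `best_params_` and `best_score_`, a fitted GridSearchCV exposes the full per-combination results in `cv_results_` and the refit model in `best_estimator_`. A small self-contained sketch (the two-value grid is deliberately tiny for illustration):

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

grid_search = GridSearchCV(
    DecisionTreeClassifier(random_state=0),
    param_grid={"max_depth": [2, 4]},  # tiny illustrative grid
    cv=5,
)
grid_search.fit(X, y)

# cv_results_ holds one row per combination; rank_test_score
# shows how each combination compared.
results = pd.DataFrame(grid_search.cv_results_)
print(results[["params", "mean_test_score", "rank_test_score"]])

# best_estimator_ is the model refit on all of X, y (refit=True by default).
print(grid_search.best_estimator_)
```

Inspecting `cv_results_` this way is useful for checking whether the best score was a clear winner or nearly tied with neighboring combinations.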