What is Ensemble learning in Machine Learning?
Ensemble learning is a method in which several weak learners (individual machine learning models) are combined to form a strong learner. Multiple machine learning algorithms are trained, and their outputs are combined to produce the final result.
Types of Ensemble Learning Methods in Machine Learning
There are three main types of ensemble learning methods: bagging, boosting, and stacking. This article focuses on stacking.
Stacking is an ensemble learning technique that combines the output from different machine learning algorithms to generate the final predictions.
Steps to Implement Stacking Models
Step 1: The dataset is divided into training and validation sets using k-fold cross-validation. In each iteration, k-1 folds are used as training data and the remaining fold is used as validation data.
Step 2: Training data is passed to different machine learning models called base models.
Step 3: Output from all the models is generated for the validation dataset.
Step 4: The outputs generated from different machine learning models are combined into a new dataset.
Step 5: The new dataset is used as input to a meta-model.
Step 6: The meta-model generates the final output.
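The steps above can be sketched with scikit-learn's `StackingClassifier`, which performs the k-fold split and meta-model fitting internally (a minimal sketch; the dataset and model choices are illustrative):

```python
# Minimal stacking sketch using scikit-learn (model choices are illustrative).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Base models (Step 2) -- heterogeneous learners.
base_models = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=42)),
    ("svc", SVC(probability=True, random_state=42)),
]

# The meta-model (Steps 5-6) is trained on the base models'
# out-of-fold predictions, built with cv=5 (Steps 1, 3-4).
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)
print(stack.score(X_test, y_test))
```

Passing different estimator types as base models is what makes this a stacking ensemble rather than bagging or boosting.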
The meta-model is sometimes called a "model of models."
How Stacking Models are Different from Bagging and Boosting Models
1. The individual models in a stacking ensemble may be heterogeneous, i.e., different machine learning algorithms can be combined, whereas the individual models in bagging and boosting are homogeneous (usually decision trees).
2. In stacking, a meta-model combines the outputs of the weak learners to produce the final output, whereas bagging and boosting use no meta-model.
Stacking Ensemble Family
Below are a few ensemble methods that belong to the stacking family:
1. Voting Ensembles
A voting ensemble is a meta-model, i.e., it is applied to the predictions of several different models. It can be used for both regression and classification problems.
Below is how a voting ensemble makes predictions for regression and classification problems:
- Regression: Takes the average of predictions from individual models.
- Classification: The class which is predicted by most of the models is considered the output.
There are two types of voting classifiers:
- Hard Voting: Predicts the class with the highest sum of votes from the base models.
- Soft Voting: Predicts the class with the largest sum of probabilities from the base models.
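The hard/soft distinction can be demonstrated with scikit-learn's `VotingClassifier` (a minimal sketch; the base models are illustrative):

```python
# Hard vs. soft voting with scikit-learn (base models are illustrative).
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)
models = [
    ("lr", LogisticRegression()),
    ("nb", GaussianNB()),
    ("dt", DecisionTreeClassifier(random_state=0)),
]

# Hard voting: the class predicted by the majority of models wins.
hard = VotingClassifier(estimators=models, voting="hard").fit(X, y)
# Soft voting: the class with the largest summed probability wins.
soft = VotingClassifier(estimators=models, voting="soft").fit(X, y)

print(hard.predict(X[:5]))
print(soft.predict(X[:5]))
```

Soft voting requires every base model to expose `predict_proba`, which is why the models above all support probability estimates.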
2. Weighted Average Ensemble
In a voting ensemble, all base learners are weighted equally. A weighted average ensemble is similar, except that each base model is weighted according to its individual performance.
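For regression, this amounts to a weighted average of the base models' predictions. A minimal sketch using scikit-learn's `VotingRegressor` (the weights here are illustrative; in practice they would be derived from each model's validation performance):

```python
# Weighted average ensemble for regression (weights are illustrative).
from sklearn.datasets import make_regression
from sklearn.ensemble import VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

X, y = make_regression(n_samples=200, random_state=0)

ens = VotingRegressor(
    estimators=[
        ("lin", LinearRegression()),
        ("tree", DecisionTreeRegressor(random_state=0)),
        ("knn", KNeighborsRegressor()),
    ],
    # Assumed weights: a stronger model gets a larger share of the average.
    weights=[0.5, 0.3, 0.2],
)
ens.fit(X, y)
print(ens.predict(X[:3]))
```

With `weights=None` this reduces to the plain (equal-weight) voting ensemble described above.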
3. Blending Ensemble
The stacking ensemble uses k-fold cross-validation to generate the base models' predictions. Blending is the same as stacked generalization except that it uses the holdout method instead of k-fold cross-validation: the meta-model is trained on the base models' predictions for a single held-out split.
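Since scikit-learn has no built-in blending estimator, the holdout procedure can be written out by hand (a minimal sketch; dataset and model choices are illustrative):

```python
# Blending sketch: base models fit on a training split, meta-model fit on
# their predictions for a held-out split (no k-fold cross-validation).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, random_state=1)
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.3, random_state=1
)

# Base models see only the training split.
base_models = [
    RandomForestClassifier(n_estimators=50, random_state=1),
    LogisticRegression(max_iter=1000),
]
for m in base_models:
    m.fit(X_train, y_train)

# The base models' holdout predictions become the meta-model's features.
meta_X = np.column_stack(
    [m.predict_proba(X_hold)[:, 1] for m in base_models]
)
meta_model = LogisticRegression().fit(meta_X, y_hold)
print(meta_model.score(meta_X, y_hold))
```

The single `train_test_split` in place of k-fold cross-validation is exactly what distinguishes blending from stacking.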
4. Super Learner Ensemble
In the Super Learner ensemble, the data is split into k folds and the same data is passed to all the base learners. Each base learner is evaluated on out-of-fold data, and its predictions are stored. The meta-model is then fit on the out-of-fold predictions from each base model.
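This out-of-fold procedure can be sketched with scikit-learn's `cross_val_predict` (a minimal sketch; dataset and model choices are illustrative):

```python
# Super Learner sketch: out-of-fold predictions from each base model are
# stacked column-wise as features for the meta-model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=400, random_state=2)
base_models = [
    RandomForestClassifier(n_estimators=50, random_state=2),
    GaussianNB(),
]

# Each column holds one base model's out-of-fold probability predictions:
# every sample is predicted by a model that never saw it during training.
oof = np.column_stack(
    [
        cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
        for m in base_models
    ]
)

# The meta-model is fit on the out-of-fold predictions.
meta_model = LogisticRegression().fit(oof, y)

# The base models are then refit on the full data for use at prediction time.
for m in base_models:
    m.fit(X, y)
print(meta_model.score(oof, y))
```

Fitting the meta-model only on out-of-fold predictions keeps it from learning the base models' training-set overfitting.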