# Boosting Technique in Ensemble Learning

Updated on Oct 3, 2023 12:17 IST

This article covers the boosting technique in ensemble learning.

Theoretically, in Machine Learning, we focus on building one ML model for our application – such as a Naïve Bayes Classifier, Decision Tree Classifier, Linear Regression model, etc. We feed data into the model and train it to predict an outcome or make a decision.

But in the real world, different models perform well for different applications, and it is usually not advisable to rely on one specific model to give us the most accurate outcome. It is often more practical to train several models and combine their outputs into a final model that best fits our application.

This final model is called an ensemble model, and the process of combining multiple models is known as ensemble learning. In this article, we will discuss one common ensemble learning technique in detail: boosting.

## Quick Intro to Ensemble Models in Machine Learning

An ensemble model is created using an ensemble learning technique. Ensemble learning strategically combines multiple classification or prediction models to improve on their individual performance on a particular problem. The base models are generally weak learners, which produce a strong learner (the ensemble model) when combined.

The two most popular ensemble methods are bagging and boosting.

## What is Boosting in Ensemble Learning?

Boosting is an ensemble learning method that trains homogeneous weak learners sequentially, so that each base model depends on the previously fitted base models. These base learners are then combined adaptively to obtain an ensemble model.

In boosting, the ensemble model is the weighted sum of all constituent base learners. There are two meta-algorithms in boosting that differ in how the base models are aggregated: Adaptive Boosting (AdaBoost) and Gradient Boosting.

## How Does Boosting Work?

Boosting fits multiple weak learners iteratively, so that each new learner either gives more weight to, or is trained only on, the observations that the previous learners classified poorly.

At the end of this process, we obtain a strong learner (the ensemble model) with lower bias than the individual base models composing it. Hence, boosting techniques help avoid underfitting. Boosting is therefore typically applied to base models that have low variance but high bias. Another reason to choose such models is that they are generally less computationally expensive to fit.

Once we have decided upon the type of our base model, we need to ask a few questions:

• What information from the previous learners will be considered when fitting the current learner?
• How would the learners (base models) be aggregated?

### How is a boosting model trained to make predictions?

1. Samples generated from the training set are all assigned the same weight to start with. These samples are used to train a homogeneous weak learner (base model).
2. The prediction error for each sample is calculated. The greater the error, the more the sample's weight increases, so the sample becomes more important for training the next base model.
3. Each individual learner is weighted too: a learner that predicts well is assigned a higher weight, so it has a greater say in the final decision.
4. The weighted data is then passed on to the next base model, and steps 2 and 3 are repeated until the error falls below a chosen threshold.
5. When new data is fed into the boosting model, it is passed through all the individual base models, and each model makes its own prediction.
6. The models' weights are then used to scale and aggregate these predictions into the final prediction.
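The last two steps (each base model predicts, then the predictions are scaled by the model weights and aggregated) can be sketched in a few lines of plain Python. This is only an illustration: the three threshold rules, their weights, and the `ensemble_predict` helper are hypothetical and not part of any library, and labels are assumed to be encoded as -1 / +1.

```python
# Hypothetical weak learners: each is a simple threshold rule on one number.
# Labels are encoded as -1 / +1 so a weighted sum can be thresholded at 0.
def learner_a(x):
    return 1 if x > 2.0 else -1

def learner_b(x):
    return 1 if x > 5.0 else -1

def learner_c(x):
    return 1 if x < 8.0 else -1

# (learner, weight) pairs. The weights are made-up values standing in for
# the "say" each learner earned during training.
ensemble = [(learner_a, 0.9), (learner_b, 0.4), (learner_c, 0.6)]

def ensemble_predict(x, ensemble):
    """Return the sign of the weighted sum of the individual predictions."""
    score = sum(weight * learner(x) for learner, weight in ensemble)
    return 1 if score >= 0 else -1

for x in (1.0, 3.0, 9.0):
    print(x, ensemble_predict(x, ensemble))
```

For `x = 1.0`, two learners vote -1 with a combined weight of 1.3 against a single +1 vote of weight 0.6, so the ensemble predicts -1. For `x = 9.0`, the two learners voting +1 outweigh the one voting -1.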

## Adaptive Boosting (AdaBoost)

AdaBoost updates the weights attached to each of the misclassified training data samples and to the corresponding weak learners.

• As discussed above, once an individual base model is trained, each training sample is assigned a weight that reflects how accurately (or inaccurately) it was predicted.
• The weighted samples are then used to train the next base learner, which focuses more on the samples with greater weight and tries to predict them better.
• The weights of the samples that are still misclassified are updated again, and the re-weighted data is fed into the next individual learner.
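The re-weighting loop described above can be sketched from scratch in plain Python. This is a minimal sketch under stated assumptions, not scikit-learn's implementation: it uses the classic discrete AdaBoost update with labels encoded as -1 / +1, one-dimensional decision stumps as weak learners, and a made-up six-point dataset. All function names are hypothetical.

```python
import math

# Toy 1-D dataset (made up for illustration); labels are -1 / +1.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1, 1, -1, -1, 1, 1]

def stump_predict(x, threshold, polarity):
    """Decision stump: `polarity` left of the threshold, -polarity otherwise."""
    return polarity if x < threshold else -polarity

def fit_stump(X, y, weights):
    """Pick the (threshold, polarity) pair with the lowest weighted error."""
    best = None
    for threshold in [x + 0.5 for x in X] + [min(X) - 0.5]:
        for polarity in (1, -1):
            error = sum(w for x, t, w in zip(X, y, weights)
                        if stump_predict(x, threshold, polarity) != t)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best

def adaboost_fit(X, y, n_rounds):
    n = len(X)
    weights = [1.0 / n] * n            # every sample starts with the same weight
    ensemble = []                      # list of (alpha, threshold, polarity)
    for _ in range(n_rounds):
        error, threshold, polarity = fit_stump(X, y, weights)
        error = min(max(error, 1e-10), 1 - 1e-10)     # guard against log(0)
        alpha = 0.5 * math.log((1 - error) / error)   # weight of this learner
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified samples, lower the others.
        weights = [w * math.exp(-alpha * t * stump_predict(x, threshold, polarity))
                   for x, t, w in zip(X, y, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]        # renormalise to sum to 1
    return ensemble

def adaboost_predict(x, ensemble):
    score = sum(alpha * stump_predict(x, thr, pol) for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1

def accuracy(ensemble):
    return sum(adaboost_predict(x, ensemble) == t for x, t in zip(X, y)) / len(X)

print("1 stump :", accuracy(adaboost_fit(X, y, n_rounds=1)))
print("3 stumps:", accuracy(adaboost_fit(X, y, n_rounds=3)))
```

On this toy data no single stump can classify more than four of the six points correctly, because the labels switch sign twice. After three rounds the weighted combination classifies all six correctly.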

### Problem Statement:

Let’s build a boosting ensemble model using AdaBoost. For this, we will implement a Decision Tree as the base learner using the scikit-learn library in Python.

### Dataset Description:

We are going to make use of the breast_cancer dataset already present in the scikit-learn library.

Here is a preview of the dataset (the full description is too long to include):

The target column is used to predict whether a tumor is cancerous or not.

1. Load the data
2. Split the data into training and testing sets
3. Build a Decision Tree Classifier and get its Accuracy Score
4. Build an AdaBoost Model and get its Accuracy Score
5. Compare the Accuracy Scores

#### Step 1 – Load the data

```python
from sklearn.datasets import load_breast_cancer

# Load the breast cancer dataset
x, y = load_breast_cancer(return_X_y=True)
```

#### Step 2 – Split the data into training and testing sets

```python
from sklearn.model_selection import train_test_split

# Split the dataset into 70% training set and 30% testing set
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=23)
```

#### Step 3 – Build a Decision Tree Classifier and get its Accuracy Score

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Train a Decision Tree classifier (a single-split "stump")
dtree = DecisionTreeClassifier(max_depth=1, random_state=23)
dtree.fit(x_train, y_train)
dt_pred = dtree.predict(x_test)

# Accuracy on the held-out test set, rounded to 3 decimals
dt_acc = round(accuracy_score(y_test, dt_pred), 3)
print(f"Decision Tree Classifier Accuracy Score: {dt_acc}")
```

#### Step 4 – Build an AdaBoost Model and get its Accuracy Score

We will initialize this AdaBoost ensemble model with the following parameters:

• estimator = Decision Tree (default; this parameter was named base_estimator before scikit-learn 1.2)
• n_estimators = 50 – Train at most 50 decision tree base models, one after another
• learning_rate = 0.6 – Scales the contribution of each learner model by the value given

```python
from sklearn.ensemble import AdaBoostClassifier

# AdaBoost model using the default Decision Tree base learner
ada = AdaBoostClassifier(n_estimators=50, learning_rate=0.6)
ada.fit(x_train, y_train)
ada_pred = ada.predict(x_test)

# Accuracy on the held-out test set, rounded to 3 decimals
ada_acc = round(accuracy_score(y_test, ada_pred), 3)
print(f"Decision Tree AdaBoost Model Accuracy Score: {ada_acc}")
```

#### Step 5 – Compare the Accuracy Scores

```python
import numpy as np
import matplotlib.pyplot as plt

# Compare the accuracy scores through visualization
plt.figure(figsize=(10, 2))
plt.barh(np.arange(2), [dt_acc, ada_acc], tick_label=['Decision Tree', 'AdaBoost'])
plt.show()
```

Thus, we can see how AdaBoost improves the performance and accuracy of the above Decision Tree Classifier.

## Gradient Boosting

Gradient Boosting updates the target values that each new learner is trained on, rather than the sample weights. Here, the weak learner models are combined sequentially using gradient descent. Let’s understand how this is done.

For every data sample, we compute the pseudo-residual. For a squared-error loss, this value is simply the difference between the target and the predicted value.

pseudo_residuals = Yᵗᵃʳᵍᵉᵗ – Yᵖʳᵉᵈ

These residuals indicate the direction in which the successive learner should be updated to get the right value of the data sample.
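Why the residual indicates a "direction" can be seen from a short derivation for the squared-error case. The symbols below are notation assumed here rather than taken from the article: $F(x_i)$ is the current ensemble prediction for sample $i$, $y_i$ its target, and $L$ the loss. The pseudo-residual $r_i$ is the negative gradient of the loss with respect to the current prediction:

```latex
L\bigl(y_i, F(x_i)\bigr) = \tfrac{1}{2}\bigl(y_i - F(x_i)\bigr)^2
\quad\Longrightarrow\quad
r_i = -\frac{\partial L\bigl(y_i, F(x_i)\bigr)}{\partial F(x_i)} = y_i - F(x_i)
```

Moving the prediction a small step along $r_i$ is therefore a gradient descent step on the loss. For other losses the pseudo-residual is still the negative gradient, but it no longer equals the plain difference.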

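• At first, the prediction for every sample is set to the average of the known targets, and the pseudo-residuals are computed against it.
• Each weak learner model is then trained to predict the pseudo-residuals left by the learner models before it.
• Its scaled predictions are added to the running prediction, and the pseudo-residuals thus obtained are the targets for the following weak learner model.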
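The three steps above can be sketched in plain Python for a small regression problem. This is a simplified illustration under stated assumptions, not scikit-learn's implementation: it uses a squared-error loss, one-split regression stumps as weak learners, and a made-up dataset. The function names are hypothetical.

```python
# Toy regression data (made up for illustration); X must be sorted.
X = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.2, 1.9, 3.1, 3.9, 5.2, 5.8]

def fit_stump(X, residuals):
    """Regression stump: choose the split with the lowest squared error,
    predicting the mean residual on each side of the threshold."""
    best = None
    for threshold in [x + 0.5 for x in X[:-1]]:
        left = [r for x, r in zip(X, residuals) if x < threshold]
        right = [r for x, r in zip(X, residuals) if x >= threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

def stump_predict(x, stump):
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean

def gradient_boost_fit(X, y, n_rounds, learning_rate):
    initial = sum(y) / len(y)                  # start from the mean target
    predictions = [initial] * len(X)
    stumps = []
    for _ in range(n_rounds):
        # Pseudo-residuals for squared error: target minus current prediction.
        residuals = [t - p for t, p in zip(y, predictions)]
        stump = fit_stump(X, residuals)        # the learner's targets are the residuals
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(x, stump)
                       for x, p in zip(X, predictions)]
    return initial, stumps

def gradient_boost_predict(x, model, learning_rate):
    initial, stumps = model
    return initial + learning_rate * sum(stump_predict(x, s) for s in stumps)

def mse(model, learning_rate):
    return sum((t - gradient_boost_predict(x, model, learning_rate)) ** 2
               for x, t in zip(X, y)) / len(X)

rate = 0.5
print("MSE after  1 round :", mse(gradient_boost_fit(X, y, 1, rate), rate))
print("MSE after 20 rounds:", mse(gradient_boost_fit(X, y, 20, rate), rate))
```

Each round removes part of the remaining residual, so the training error shrinks as learners are added. The `learning_rate` factor plays the same shrinking role as the parameter of the same name used in the demo below.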

### Problem Statement:

Let’s build a boosting ensemble model using Gradient Boosting. For this, we will make use of the gender classification dataset available on Kaggle.

### Dataset Description:

The dataset has 7 features and a target variable:

• longhair – Binary column where 1 is “long hair” and 0 is “not long hair”
• nosewide – Binary column where 1 is “wide nose” and 0 is “not wide nose”
• noselong – Binary column where 1 is “long nose” and 0 is “not long nose”
• lipsthin – Binary column where 1 is “thin lips” and 0 is “not thin lips”
• distancenosetoliplong – Binary column where 1 is “long distance between nose and lips” and 0 is “short distance between nose and lips”
• gender – “Male” or “Female”

The gender column is the target column used to predict the gender of an individual.

1. Read the data
2. Split the data into training and testing sets
3. Build a Decision Tree Classifier and get its Accuracy Score
4. Build a Gradient Boost Model and get its Accuracy Score
5. Compare the Accuracy Scores

#### Step 1 – Read the data

```python
import pandas as pd

# Read the dataset
data = pd.read_csv('gender_classification_v7.csv')
```

#### Step 2 – Split the data into training and testing sets

```python
from sklearn.model_selection import train_test_split

# Separate the independent features from the dependent (target) column
x = data.iloc[:, :-1]
y = data.iloc[:, -1]

# Divide the dataset into train and test sets, keeping 30% for testing
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=7)
```

#### Step 3 – Build a Decision Tree Classifier and get its Accuracy Score

We are going to use a Decision Tree with fixed parameters as the base learner.

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Train a Decision Tree classifier (a single-split "stump")
dtree = DecisionTreeClassifier(max_depth=1, random_state=23)
dtree.fit(x_train, y_train)
dt_pred = dtree.predict(x_test)

# Accuracy on the held-out test set, rounded to 3 decimals
dt_acc = round(accuracy_score(y_test, dt_pred), 3)
print(f"Decision Tree Classifier Accuracy Score: {dt_acc}")
```

#### Step 4 – Build a Gradient Boost Model and get its Accuracy Score

We will initialize this Gradient Boost ensemble model with the following parameters:

• n_estimators = 1000 – Run 1000 boosting stages, each adding one decision tree base model
• learning_rate = 0.5 – Shrinks the contribution of each learner model by the value given

```python
from sklearn.ensemble import GradientBoostingClassifier

# Gradient Boost model using Decision Tree base learners
gb = GradientBoostingClassifier(n_estimators=1000, learning_rate=0.5)
gb.fit(x_train, y_train)
gb_pred = gb.predict(x_test)

# Accuracy on the held-out test set, rounded to 3 decimals
gb_acc = round(accuracy_score(y_test, gb_pred), 3)
print(f"Decision Tree Gradient Boost Model Accuracy Score: {gb_acc}")
```

#### Step 5 – Compare the Accuracy Scores

```python
import numpy as np
import matplotlib.pyplot as plt

# Visualize the accuracy scores
plt.figure(figsize=(10, 2))
plt.barh(np.arange(2), [dt_acc, gb_acc], tick_label=['Decision Tree', 'Gradient Boost'])
plt.show()
```

Thus, we can see how Gradient Boosting improves the performance and accuracy of the above Decision Tree Classifier.

## Endnotes

We have discussed how ensemble learning techniques, and boosting in particular, aim to improve an ML model by alleviating overfitting and underfitting. Artificial Intelligence & Machine Learning is a rapidly growing domain that has hugely impacted big businesses worldwide.
