# Difference between Correlation and Regression

Vikram Singh
Assistant Manager - Content
Updated on Feb 9, 2024 16:51 IST

In statistics, correlation and regression are both used to quantify the direction and strength of the relationship between two or more numeric variables. Correlation measures the degree to which two variables are related, while regression describes how one variable changes as another changes. In this article, we will discuss the difference between correlation and regression.

## Correlation vs. Regression: Difference between Correlation and Regression

## What is Correlation?

### Definition

The word correlation combines two words, co (together) and relation (connection). It describes the relationship between two variables, x and y.

• It is a statistical technique used to measure the strength of the relationship between pairs of variables.
• Correlation can be positive, negative, or zero.
• Positive correlation: both variables move in the same direction.
  • Example: height and weight (taller people tend to be heavier, and vice versa).
• Negative correlation: the variables move in opposite directions.
  • Example: price and demand (as price increases, demand tends to decrease, and vice versa).
• Zero correlation: the correlation statistic does not indicate a relationship between the two variables.
  • This does not mean that there is no relationship between the variables; it only means that there is no linear relationship between them.
  • Example: coffee consumption and the height of students in a class.
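
The three cases above can be illustrated with a small sketch. The helper `pearson_r` and the data are made up for this example, not taken from any library. The last case shows why zero correlation does not rule out a relationship: y is fully determined by x, yet the linear correlation is zero.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Illustrative data, symmetric around zero.
x = [-3, -2, -1, 0, 1, 2, 3]

r_positive = pearson_r(x, [2 * v + 1 for v in x])   # y rises as x rises
r_negative = pearson_r(x, [-2 * v + 1 for v in x])  # y falls as x rises
r_zero = pearson_r(x, [v ** 2 for v in x])          # y depends on x, but not linearly

print(r_positive, r_negative, r_zero)
```

Here the quadratic case gives a coefficient of zero because the positive and negative halves of x cancel out in the covariance.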

## Formula of Correlation

The correlation of two random variables is given by:

r = COV(X, Y) / (S.D.(X) × S.D.(Y))

Where,

• COV(X, Y): covariance of X and Y
• S.D.(X): standard deviation of X
• S.D.(Y): standard deviation of Y

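The correlation coefficient r takes values from -1 to 1. Values near 1 indicate a strong positive linear relationship, values near -1 indicate a strong negative linear relationship, and values near 0 indicate little or no linear relationship.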
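
As a minimal sketch of the formula, the snippet below computes r from the covariance and the two standard deviations. The height and weight numbers are made up for illustration, and `sample_covariance` is a hypothetical helper written for this example. It uses the sample (n - 1) versions of both covariance and standard deviation so that the two are consistent.

```python
from statistics import mean, stdev

# Made-up illustrative data: heights (cm) and weights (kg) of five people.
heights = [150, 160, 165, 172, 180]
weights = [52, 58, 63, 70, 77]

def sample_covariance(xs, ys):
    """Sample covariance (divides by n - 1, matching statistics.stdev)."""
    mean_x, mean_y = mean(xs), mean(ys)
    total = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return total / (len(xs) - 1)

cov_xy = sample_covariance(heights, weights)

# r = COV(X, Y) / (S.D.(X) * S.D.(Y))
r = cov_xy / (stdev(heights) * stdev(weights))
print(round(r, 3))
```

For this data r comes out close to 1, which matches the positive relationship between height and weight described earlier.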

### Types of Correlation

Three types of correlation are mainly used.

• Pearson: Used with continuous variables. It measures only the linear relationship between the two variables, so it can understate or miss a non-linear relationship.
• Spearman Rank: Used for ordinal and continuous variables. It works on ranks, so it captures monotonic relationships, whether linear or non-linear.
• Kendall Tau: A non-parametric measure of rank correlation for ordinal variables. Like Spearman rank, it captures monotonic relationships, whether linear or non-linear.
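
The difference between the three measures can be sketched on a monotonic but non-linear data set (y = x³). The functions below are simplified illustrations written for this article: they assume there are no tied values, and `kendall_tau` is the basic tau-a variant. In practice a statistics library would normally be used instead.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def ranks(values):
    """Rank values from 1 to n (assumes no ties, to keep the sketch short)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(xs), ranks(ys))

def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) pairs / all pairs, no ties assumed."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)

x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [v ** 3 for v in x]  # monotonic, but not linear

p = pearson_r(x, y)
s = spearman_rho(x, y)
t = kendall_tau(x, y)
print(p, s, t)
```

Pearson stays below 1 because the points do not lie on a straight line, while both rank-based measures reach 1 because y always increases when x increases.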

### Application of Correlation

• E-commerce
  • Time spent vs. product purchases by a customer
  • Number of unique customers vs. sales in a day

The correlation between these variables helps a company decide which customers to target and how to attract new ones.

• Education
  • Years of study vs. salary

The correlation result can help the government decide what changes to make to the current education policy so that unemployment decreases.

• Real Estate
  • Income vs. location of flats
  • Location of flats vs. price of flats

These correlations help contractors and real estate companies decide the market price of flats, choose the location of a site, and identify target customers.

## What is Regression?

### Definition

Regression is a statistical technique used to estimate the change in the value of the dependent variable due to a change in one or more independent variables.

• It implies that the outcome depends on one or more variables (the independent variables).
• Regression provides a detailed look at the data and includes an equation that can be used to predict future values.
• The main uses of regression analysis are determining the strength of predictors, forecasting an effect, and trend forecasting.

### Formula of Regression

Because regression represents the relationship between the dependent and independent variables, simple linear regression can be written as:

Y = a + bX + c, where

Y: dependent variable
X: independent variable
a: intercept
b: slope
c: error (residual)
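
A minimal sketch of fitting Y = a + bX by ordinary least squares is shown below. The function name `fit_line` and the data are made up for illustration. The slope b is the sum of cross-deviations divided by the sum of squared deviations of X, and the intercept a makes the line pass through the point of means.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx            # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

# Made-up data, roughly following y = 1 + 2x with some noise.
x = [1, 2, 3, 4, 5]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

a, b = fit_line(x, y)

# The error term c for each observation is the residual: actual minus predicted.
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]

print(round(a, 2), round(b, 2))
print(round(a + b * 6, 2))  # prediction for a new value x = 6
```

With an intercept in the model, the least-squares residuals always sum to zero (up to rounding), which is a quick check that the fit was computed correctly.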

### Example of Regression

• Rainfall can be predicted from humidity, wind direction, wind speed, etc.
• The price of a house depends on its location, number of rooms, available facilities, pollution, etc.

### Application of Regression

• Epidemiology: A linear regression model can relate smoking and mortality, where smoking is the independent variable and life span is the dependent variable.
• Environmental Studies: Environmental scientists use polynomial regression to help forecast events such as tsunamis, thunderstorms, and sandstorms.
• Geology: Regression is used to estimate the total natural gas available at different sites in the world.

Beyond these, regression is very useful in archaeology, medicine, finance, and economics.

## Key Difference between Correlation and Regression

• In correlation, the variables X and Y are interchangeable. In contrast, regression models how the value of Y changes as the value of X changes, and the result will change if X and Y are swapped.
• In correlation, both variables are treated as random, while in classical regression the dependent variable is random and the independent variable is treated as fixed.
• Correlation is a single statistic, while regression produces an entire equation.
• Correlation does not capture causality. Regression assumes a direction of dependence (Y depends on X), but a fitted regression does not by itself prove that X causes Y.
• Graphically, correlation summarizes a scatter plot as a single number, while linear regression is represented by a line fitted through it.
• The correlation between X and Y is the same as the correlation between Y and X, whereas the regression of Y on X generally gives a different equation from the regression of X on Y.
• When the correlation is negative (or positive), the slope of the regression line is also negative (or positive).
• As a similarity, both correlation and regression quantify the strength and the direction of the relationship between two numeric variables.
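
Several of these points can be checked with a short sketch. The helpers `pearson_r` and `slope` and the data are made up for this example. It swaps X and Y and compares the results: the correlation is unchanged, the regression slope is not, and both carry the same sign.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def slope(xs, ys):
    """Least-squares slope for the regression of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    return s_xy / s_xx

# Made-up illustrative data.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 6]

r_xy = pearson_r(x, y)   # correlation of X with Y
r_yx = pearson_r(y, x)   # correlation of Y with X
b_y_on_x = slope(x, y)   # regression of Y on X
b_x_on_y = slope(y, x)   # regression of X on Y

print(r_xy, r_yx)          # the same value
print(b_y_on_x, b_x_on_y)  # two different slopes, both positive like r
```

For simple linear regression the two slopes multiply to r², which is why they coincide only in special cases, such as when both variables have the same spread.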

## Conclusion

In this article, we have discussed the differences between correlation and regression, with examples and applications of each.