# Chi-Square Test: Definition and Example

Vikram Singh
Assistant Manager - Content
Updated on Dec 9, 2022 08:46 IST

## Introduction

The chi-square test is a statistical significance test used in hypothesis testing with categorical data.

There are 3 steps in hypothesis testing:

• State the null and alternate hypotheses
• Perform the statistical test
• Reject or fail to reject the null hypothesis

## What is the Chi-Square Test?

It is a statistical method used to test whether the observed frequencies of categorical variables differ from the expected frequencies, or whether two categorical variables in a dataset are related.

Example: A food delivery company wants to find the relationship between gender, location and food choices of people in India.

It is used to determine whether a difference between observed and expected frequencies of categorical variables is:

• Due to chance, or
• Due to a relationship

## Types of Chi-square Test:

• Goodness of fit test
• Test for independence

## Goodness of fit test:

• Number of variables = 1
• Used to determine whether the sample follows the distribution expected for the population
• Degrees of freedom = number of categories - 1

To know more about sample and population and degrees of freedom, read the articles Basics of Statistics for Data Science and z-test.

Example:

Problem Statement:

Given the observed and expected frequencies of the numbers appearing when a die is rolled, use the chi-square test at the 5% significance level to determine whether the observed frequencies differ from the expected frequencies.

Solution:

Step-1: State Null and Alternate Hypothesis:

Null Hypothesis:

There is no difference between the observed and expected frequencies of the outcomes of rolling the die.

Alternate Hypothesis:

There is a difference between the observed and expected frequencies of the outcomes of rolling the die.

Step-2: Significance level and Degree of Freedom:

Significance level = 5%

Degrees of Freedom = 6-1 = 5

Corresponding critical chi-square value (from the chi-square distribution table) = 11.07
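The critical value is normally read from a chi-square table or obtained from a statistics library (for example `scipy.stats.chi2.ppf`). As a rough sketch that needs only the Python standard library, the value can also be approximated numerically. The function names `chi2_cdf` and `chi2_critical` are illustrative, and the series-plus-bisection approach is one possible method, not the one used by the article.

```python
import math


def chi2_cdf(x, df):
    """Chi-square CDF, using the power series for the regularized
    lower incomplete gamma function P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a          # first term of the series
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= z / (a + n)  # next term: multiply by z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_critical(alpha, df):
    """Value x such that P(X > x) = alpha, found by bisection."""
    target = 1.0 - alpha
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < target:  # grow the bracket until it contains x
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if chi2_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return hi


print(round(chi2_critical(0.05, 5), 2))  # 11.07, as used in this example
print(round(chi2_critical(0.05, 1), 2))  # 3.84, the value for 1 degree of freedom
```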

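Step-3: Find the chi-square value:

The chi-square value is the sum, over all six faces, of (observed - expected)^2 / expected. For the frequencies in this example, it works out to 0.1186.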
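A minimal Python sketch of the goodness-of-fit calculation, which sums (observed - expected)^2 / expected over all categories. The die counts below are hypothetical and chosen only for illustration; they are not the frequencies from this example, and the helper name `chi_square_statistic` is an assumption.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for 60 rolls of a die (NOT the article's data).
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6  # a fair die: 60 rolls / 6 faces

statistic = chi_square_statistic(observed, expected)
critical_value = 11.07  # 5% significance level, 5 degrees of freedom

print(round(statistic, 4))         # 1.0
print(statistic > critical_value)  # False: fail to reject the null hypothesis
```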

Step-4: Comparing with the critical value:

From step-2 and step-3, we have:

0.1186 < 11.07

So, we fail to reject the Null Hypothesis.

There is no evidence of a difference between the observed and expected frequencies of the outcomes of rolling the die.

## Test for independence

• Number of variables = 2
• Used to determine whether the two variables are independent or related
• Degrees of freedom = (number of rows - 1) x (number of columns - 1)

Example:

Problem Statement: The Election Commission wants to find the relationship between gender and casting a vote.

A sample of 10,000 voters was taken, and the results were summarized in a 2 x 2 table of gender against voting.

Solution:

Step-1: State Null and Alternate Hypothesis

Null Hypothesis: Gender and voting are independent.

Alternate Hypothesis: Gender and voting are not independent.

Step-2: Significance level and Degree of Freedom

Significance level = 5%

Degrees of Freedom = (2-1) x (2-1) = 1

Corresponding critical chi-square value (from the chi-square distribution table) = 3.84

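Step-3: Find the chi-square value

For each cell of the table, the expected frequency is (row total x column total) / grand total, and the chi-square value is the sum over all cells of (observed - expected)^2 / expected. For this example, it works out to 6.6.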
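A short Python sketch of the test-for-independence calculation: expected counts come from the row and column totals, and the statistic sums (observed - expected)^2 / expected over all cells. The 2 x 2 table below is hypothetical and is not the Election Commission sample from this example; the function name `independence_statistic` is an assumption.

```python
def independence_statistic(table):
    """Return (statistic, degrees of freedom, expected counts)
    for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count under independence: row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df, expected


# Hypothetical counts (NOT the article's data): rows = gender, columns = voted / did not vote
table = [[30, 20],
         [20, 30]]

statistic, df, expected = independence_statistic(table)
print(statistic, df)     # 4.0 1
print(statistic > 3.84)  # True: reject the null hypothesis at the 5% level
```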

Step-4: Comparing with the critical value

From step-2 and step-3, we have,

6.6 > 3.84

Hence, we reject the null hypothesis.

i.e. Gender and Voting are not independent of each other.
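The same decision can be expressed as a p-value instead of a comparison with the critical value. The sketch below covers only the case of 1 degree of freedom, where a chi-square variable is the square of a standard normal variable, so the upper-tail probability can be written with `math.erfc`. The function name `chi2_p_value_df1` is illustrative.

```python
import math


def chi2_p_value_df1(statistic):
    """Upper-tail probability P(X > statistic) for a chi-square
    variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


p_value = chi2_p_value_df1(6.6)  # chi-square value from Step-3
print(round(p_value, 4))         # about 0.01
print(p_value < 0.05)            # True: reject the null hypothesis at the 5% level
```

A p-value below the 5% significance level leads to the same conclusion as 6.6 > 3.84.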

## Conclusion:

The chi-square test is a statistical significance test used for hypothesis testing (null and alternative hypotheses) when the variables are categorical.
