68-95-99.7 Rule: Definition and Implementation

# 68-95-99.7 Rule: Definition and Implementation

Updated on Jan 2, 2023 15:14 IST

68-95-99.7 Rule or the empirical rule is based on mean and standard deviation. It is a shorthand for remembering percentage of values lying within interval estimate in the normal distribution.

68-95-99.7 rule is an Empirical Rule followed by all the data following a normal distribution. Through this article, you will learn about the rule in detail.

In order to understand the 68-95-99.7 rule let’s first understand the normal distribution with the below scenario:

• This is Jimmy
• Jimmy is an average 40-year Software Engineer
• He works 40hrs per week
• Jimmy earns 1.5lakh rupee per month

Now here’s the question:

Q: How often do you think that you are going to meet someone who earns 5x that of Jimmy?

Q: How often do you think you are going to meet a person who works 5x more per week?

Well, the answer will definitely differ from the kind of data you are looking at. It might be possible that 5x greater than average is common.

## What is Normal Distribution?

Normal Distribution or Gaussian Distribution is defined as a symmetrical bell-shaped curve. The peak of the curve denotes the average of the data points. When compiled and graphed, many continuous large datasets in nature exhibit this bell-shaped curve.

Let’s check out some of the common examples of simple data following Normal Distribution:

Height of Person

The population’s height is an example of normal distribution. The majority of persons in a given population are of average height. The number of persons who are taller and shorter than the average height is almost equal, and only a few people are either extraordinarily tall or extremely short. As a result, when we plot the height of all the people in the world, the graph will follow a bell-shaped curve denoting the normal distribution.

Explore normal distribution certifications

Rolling a Dice

A fair dice (more than 1 dice) roll also creates a bell-shaped curve denoting normal distribution. The normal distribution graph will become increasingly sophisticated as the number of dice increases.

Similarly, there are many more things in the world that follow this distribution:

• IQ of a population sample
• Income Distribution In Econom
• Performance of employees in the company
• Shoe size of a person
• Birth weight of a population, and many more

## Understanding 68-95-99.7 Rule

After the observation of multiple normal distribution curves, statisticians found a common pattern in the data. They observed that all the data showed some pattern at 68%, 95%, and 99.5%. Since this was a common observation in all the normal distribution graphs, it became a rule and we call it the Empirical Rule of Normal Distribution (also known as the 68-95-99 rule).

In other words, the Empirical Rule is a rule based on experience and observation. It gives how much of the data lies within one, two, or three standard deviations of the mean.

The 68–95–99.7 was first discovered and named by Abraham de Moivre in 1733 during his experimentation of flipping 100 fair coins. This term was coined 75 years before the normal distribution model even was introduced.

Let’s understand this 68-95-99.7 rule more clearly,

As per Empirical Rule:

• 68% of observed data points lie within one standard deviation of the mean
• 95% fall within two standard deviations, and
• 99.7% occurs within three standard deviations.

The empirical rule is also known as three-sigma because practically all data fall within the three standard deviations. This rule establishes both upper and lower bounds of a statistical chart at +/- three standard deviations. Since 99.7 percent of all numbers fall inside it, this limit is also used to identify the outliers. All the data points which lie beyond +/- 3 standard deviations are considered an outlier.

Learn What is an outlier?

## Implementing the 68-95-99.7 Rule using Python

Let us now understand the implementation of this empirical rule using Python:

` `
`import matplotlib.pyplot as pltimport numpy as npfrom scipy.stats import norm # setting the values of# mean and Standard Deviationmean = 0SD = 1 # value of Cumulative distribution function (cdf) # between one, two and three S.D. around the mean one_sd = norm.cdf(SD, mean, SD) - norm.cdf(-SD, mean, SD)two_sd = norm.cdf(2 * SD, mean, SD) - norm.cdf(-2 * SD, mean, SD)three_sd = norm.cdf(3 * SD, mean, SD) - norm.cdf(-3 * SD, mean, SD) # printing the value of fractions print("Values within 1 SD =", one_sd)print("Values within 2 SD =", two_sd)print("Values within 3 SD =", three_sd)Copy code`

Output:

` `
`Values within 1 SD = 0.6826894921370859Values within 2 SD = 0.9544997361036416Values within 3 SD = 0.9973002039367398Copy code`

The above output clearly verifies the empirical rule by denoting the fraction of values within 1, 2, and 3 Standard Deviation.

Explore free online certifications

## Conclusion

This observation of 68-95-99.7 is a rule because it is universally accepted for all the data following a normal distribution curve. The important part is to figure out whether or not your data follow a normal distribution or not.

_______________

Recently completed any professional course/certification from the market? Tell us what you liked or disliked in the course for more curated content.