Vikram Singh
Assistant Manager - Content
Updated on Dec 5, 2022 11:36 IST

In Machine Learning Algorithms, we use distance metrics such as Euclidean, Manhattan, Minkowski, and Hamming.
In this article, we will briefly discuss one such metric, i.e., Manhattan Distance.

Different distance metrics are used in machine-learning models. These metrics are the foundation of many machine-learning algorithms, whether supervised (k-nearest neighbor) or unsupervised (k-means clustering). This article discusses one such distance metric: Manhattan distance.

In both k-means clustering and the k-nearest neighbor algorithm, you have to choose a value of k and decide which data points are close enough to be grouped together or treated as nearest neighbors. To measure that closeness, we use distance metrics such as Euclidean, Manhattan, Minkowski, or Hamming.

Types of Distance Metrics in Machine Learning

Four distance metrics are mainly used in Machine Learning.

Euclidean

It is one of the most common distance metrics in machine learning. It calculates the distance between two real-valued vectors as the length of the straight line joining them, which is the shortest distance between two points.

Mathematical Formula

• The formula for Euclidean Distance (2-D):

$$d = \left[ (x_1 - y_1)^2 + (x_2 - y_2)^2 \right]^{1/2}$$

• General Formula (n-D):

$$d = \left[ (x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2 + \dots + (x_n - y_n)^2 \right]^{1/2}$$
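The n-D formula can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `euclidean` and the sample points are assumptions chosen for this example, not part of any library.

```python
from math import sqrt

# Hypothetical helper: Euclidean distance between two equal-length vectors
def euclidean(a, b):
    # square the difference in each dimension, add them up, take the square root
    return sqrt(sum((v1 - v2) ** 2 for v1, v2 in zip(a, b)))

# sample points chosen so the result is the familiar 3-4-5 triangle
P = (0, 0)
Q = (3, 4)

print(euclidean(P, Q))  # 5.0
```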

Manhattan Distance

It is the sum of the absolute differences between the coordinates of two points across all dimensions.

• It calculates the distance between real-valued vectors.
• It is also called Taxicab distance or City Block Distance.

Mathematical Formula

• 2-D

$$d = |x_1 - y_1| + |x_2 - y_2|$$

• General Formula (n-D)

$$d = |x_1 - y_1| + |x_2 - y_2| + |x_3 - y_3| + |x_4 - y_4| + \dots + |x_n - y_n|$$

Minkowski Distance

It is the generalization of Euclidean and Manhattan Distance.

Mathematical Formula

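$$d = \left[ |x_1 - y_1|^p + |x_2 - y_2|^p + |x_3 - y_3|^p + \dots + |x_n - y_n|^p \right]^{1/p}$$

where p is the order of the norm. Setting p = 1 gives the Manhattan distance, and p = 2 gives the Euclidean distance.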
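To see how the order p changes the result, here is a minimal sketch of the Minkowski formula in plain Python. The function name `minkowski` and the sample points are assumptions made for illustration only.

```python
# Hypothetical helper: Minkowski distance of order p between two vectors
def minkowski(a, b, p):
    # raise each absolute difference to the power p, sum, then take the p-th root
    return sum(abs(v1 - v2) ** p for v1, v2 in zip(a, b)) ** (1 / p)

P = (0, 0)
Q = (3, 4)

print(minkowski(P, Q, 1))  # p = 1 behaves like Manhattan distance: 3 + 4
print(minkowski(P, Q, 2))  # p = 2 behaves like Euclidean distance: sqrt(9 + 16)
```

With these points, the p = 1 result equals the sum of the absolute differences (7), and the p = 2 result equals the straight-line distance (5).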

Hamming Distance

The Hamming distance between two strings of equal length is the number of positions at which the corresponding characters or symbols differ.

• In simple terms, it is the number of substitutions required to change one string into the other.

Example:

Let there be two strings, “Naukri” and “Pujari”.

Since both strings are of the same length, we can calculate the Hamming distance.

The first four positions in the two strings differ (N/P, a/u, u/j, k/a), and the last two positions hold the same characters (r and i).

Hence, the Hamming distance here will be 4.

Note: A larger Hamming distance implies greater dissimilarity between the two strings, and a smaller Hamming distance implies greater similarity.
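The "Naukri" and "Pujari" example can be checked with a short Python sketch. The function name `hamming` is an assumption for this illustration; it simply counts the positions where the two strings differ.

```python
# Hypothetical helper: Hamming distance between two equal-length strings
def hamming(s1, s2):
    # the Hamming distance is only defined for strings of the same length
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    # count the positions at which the characters differ
    return sum(c1 != c2 for c1, c2 in zip(s1, s2))

print(hamming("Naukri", "Pujari"))  # 4
```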

Now, let us look at Manhattan Distance in more detail.

Manhattan Distance

The Manhattan distance between two points X (x1, x2, x3, ….., xn) and Y (y1, y2, y3, ….., yn) in n-dimensional space is the sum of the absolute differences between their coordinates in each dimension.

It is called the Manhattan distance because it is the distance a car would drive in a city (e.g., Manhattan), where the buildings are laid out in square blocks, and the straight streets intersect at right angles.
Now, you also know why it is called the taxicab distance and the city block distance.

Manhattan Distance using Python:

Calculating the Manhattan distance by defining a function

```python
# define a function that returns the Manhattan distance between two vectors
def manhattan(a, b):
    return sum(abs(v1 - v2) for v1, v2 in zip(a, b))

# define the points
X = [1, 2, 3, 4, 5]
Y = [6, 7, 8, 9, 10]

# calculate the distance
print(manhattan(X, Y))
```

Output

25

Note: You can also calculate the Manhattan distance using the scikit-learn library of Python.

`from sklearn.metrics.pairwise import manhattan_distances`
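A minimal usage sketch follows, assuming scikit-learn is installed. Note that `manhattan_distances` works on 2-D inputs (one row per sample) and returns a matrix of pairwise distances, so each point is wrapped in an outer list here and the single result is read from position [0][0].

```python
from sklearn.metrics.pairwise import manhattan_distances

# each input is a list of samples, so a single point becomes a one-row 2-D list
X = [[1, 2, 3, 4, 5]]
Y = [[6, 7, 8, 9, 10]]

# result is a 1 x 1 matrix: the distance between the only row of X and the only row of Y
result = manhattan_distances(X, Y)

print(result[0][0])  # 25.0
```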

Properties of Manhattan Distance

• Between two points on a grid, there can be several different paths whose length equals the Manhattan distance, but their number is finite.
• In 2-D, all points at a given Manhattan distance from a given point lie on a square centered on that point, rotated 45° so that its corners sit on the axes.
• A path with a length equal to the Manhattan distance has only two permitted moves:
  • Horizontal
  • Vertical
• Manhattan distance is a particular case of Minkowski distance:
  • For p = 1, Minkowski Distance = Manhattan Distance
• Manhattan distance is often preferred over Euclidean distance when the data is high-dimensional.
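The first two properties can be illustrated with a small sketch. It assumes points with integer coordinates on a 2-D grid; the helper names `count_shortest_grid_paths` and `points_at_distance` are made up for this example.

```python
from math import comb

def manhattan(a, b):
    return sum(abs(v1 - v2) for v1, v2 in zip(a, b))

# Hypothetical helper: number of grid paths between two 2-D integer points
# that use only horizontal and vertical unit moves and never backtrack.
# Each such path has a length equal to the Manhattan distance.
def count_shortest_grid_paths(a, b):
    dx = abs(a[0] - b[0])
    dy = abs(a[1] - b[1])
    # choose which of the dx + dy unit moves are the horizontal ones
    return comb(dx + dy, dx)

# Hypothetical helper: integer points at Manhattan distance r from a center
def points_at_distance(center, r):
    cx, cy = center
    return [(cx + dx, cy + dy)
            for dx in range(-r, r + 1)
            for dy in range(-r, r + 1)
            if abs(dx) + abs(dy) == r]

start, end = (0, 0), (2, 3)
print(manhattan(start, end))                  # 5
print(count_shortest_grid_paths(start, end))  # 10 different paths of length 5
print(sorted(points_at_distance((0, 0), 2)))  # 8 points forming a diamond shape
```

Plotting the points returned by `points_at_distance` shows the rotated square: its corners are (2, 0), (0, 2), (-2, 0), and (0, -2).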

Conclusion

In this article, we have discussed the different types of distance metrics that are used in Machine Learning. We also covered Manhattan Distance in more detail, with its properties and an example in Python.
We hope you liked the article.
Keep Learning!!
Keep Sharing!!