Dot products are an important concept in data science and are used in a variety of applications, including machine learning, natural language processing, and recommendation systems.
A dot product, also known as a scalar product or inner product, is a mathematical operation that takes two vectors and returns a scalar. The dot product is calculated by multiplying the corresponding elements of the two vectors and then summing the results. For example, the dot product of two vectors a and b can be written as:
a · b = (a1 * b1) + (a2 * b2) + … + (an * bn)
where a1, a2, …, an are the elements of vector a and b1, b2, …, bn are the elements of vector b.
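The formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation; the function name `dot` and the sample vectors are assumptions made for the example.

```python
def dot(a, b):
    """Return the dot product of two equal-length sequences of numbers."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Multiply corresponding elements, then sum the products.
    return sum(x * y for x, y in zip(a, b))


# Hypothetical sample vectors chosen for illustration.
a = [1, 2, 3]
b = [4, 5, 6]

result = dot(a, b)  # (1 * 4) + (2 * 5) + (3 * 6) = 32
print(result)
```

The length check is a design choice: `zip` would otherwise silently truncate the longer vector.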
As noted above, the dot product goes by several names:
- Scalar product, because the result is a single scalar number
- Inner product, because it is the standard inner product on coordinate vectors in Euclidean geometry
- Projection product, because it measures how far one vector extends along the direction of another
The dot product of two vectors is a scalar, which means it is a single number rather than a vector. Its value depends on the magnitudes of the two vectors and on the angle between them. For non-zero vectors, if the vectors are perpendicular, the dot product is zero. If the vectors are parallel and pointing in the same direction, the dot product is positive. And if the vectors are parallel and pointing in opposite directions, the dot product is negative.
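The three cases can be checked with a short sketch. The helper names `dot` and `angle_degrees` and the sample vectors are assumptions made for this illustration; the angle is recovered from the relation a · b = |a| |b| cos θ.

```python
import math


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def angle_degrees(a, b):
    """Angle between two non-zero vectors, from a · b = |a| |b| cos θ."""
    cos_theta = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Hypothetical 2-D vectors chosen so each case is easy to see.
a = [2, 0]
perpendicular = dot(a, [0, 3])    # 0: the vectors are at 90 degrees
same_direction = dot(a, [5, 0])   # 10: positive
opposite = dot(a, [-5, 0])        # -10: negative

print(perpendicular, same_direction, opposite)
```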
Properties of dot products
There are several important properties of dot products that are useful to know. Firstly, the dot product is commutative, which means the order of the vectors doesn’t matter. For example, a · b = b · a.
Secondly, the dot product is distributive, which means you can distribute the dot product over addition. For example, (a + c) · b = a · b + c · b.
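Finally, the dot product is compatible with scalar multiplication, which means a scalar factor can be moved outside the dot product. For example, (p a) · b = p (a · b). Note that the dot product is not associative in the usual sense: a · b is a scalar, so an expression such as (a · b) · c is not itself a dot product.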
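Commutativity and distributivity can be spot-checked numerically. The sketch below uses integer vectors so the comparisons are exact; the helper names and the sample values are assumptions made for the example, and a numeric check on one set of vectors is an illustration, not a proof.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def add(a, b):
    """Element-wise sum of two equal-length sequences."""
    return [x + y for x, y in zip(a, b)]


# Hypothetical integer vectors, so equality checks are exact.
a = [1, -2, 3]
b = [4, 0, -1]
c = [2, 5, 7]

# Commutative: a · b = b · a
commutative = dot(a, b) == dot(b, a)

# Distributive: (a + c) · b = a · b + c · b
distributive = dot(add(a, c), b) == dot(a, b) + dot(c, b)

print(commutative, distributive)
```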
The main properties are summarized below:
- Property 1:
The dot product of two vectors a and b is commutative, i.e. a · b = b · a = |a| |b| cos θ.
- Property 2:
If a · b = 0, then either a or b is the zero vector, or cos θ = 0, i.e. θ = π/2. In other words, at least one of the vectors is zero or the two vectors are perpendicular to each other.
- Property 3:
Scalar factors can be moved outside the dot product, i.e. (p a) · (q b) = (p b) · (q a) = pq (a · b).
- Property 4:
The dot product of a vector with itself is the square of its magnitude, i.e. a · a = |a| |a| cos 0 = |a|².
- Property 5:
The dot product is distributive over vector addition, i.e. a · (b + c) = a · b + a · c.
- Property 6:
For the mutually perpendicular unit vectors of an orthogonal coordinate system, î · î = ĵ · ĵ = k̂ · k̂ = 1 and î · ĵ = ĵ · k̂ = k̂ · î = 0.
- Property 7:
In terms of unit vectors, if a = a1 î + a2 ĵ + a3 k̂ and b = b1 î + b2 ĵ + b3 k̂, then a · b = (a1 î + a2 ĵ + a3 k̂) · (b1 î + b2 ĵ + b3 k̂) = a1 b1 + a2 b2 + a3 b3.
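Properties 3, 4 and 6 lend themselves to a quick numeric check. The sketch below is only an illustration with assumed sample values; the helper names `dot` and `scale` are hypothetical. It also checks that two different unit vectors have a dot product of zero.

```python
import math


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def scale(p, v):
    """Multiply every element of v by the scalar p."""
    return [p * x for x in v]


# Hypothetical sample values.
a = [3, 4, 0]
b = [1, 2, 2]
p, q = 2, 5

# Property 3: (p a) · (q b) = pq (a · b)
lhs = dot(scale(p, a), scale(q, b))   # 110
rhs = p * q * dot(a, b)               # 2 * 5 * 11 = 110

# Property 4: a · a is the squared magnitude of a
magnitude_squared = dot(a, a)             # 25
magnitude = math.sqrt(magnitude_squared)  # 5.0

# Property 6: unit vectors along the coordinate axes
i, j, k = [1, 0, 0], [0, 1, 0], [0, 0, 1]
self_products = (dot(i, i), dot(j, j), dot(k, k))    # (1, 1, 1)
cross_products = (dot(i, j), dot(j, k), dot(k, i))   # (0, 0, 0)

print(lhs == rhs, magnitude_squared, self_products, cross_products)
```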
Two worked examples follow.
Dot Product of Two Vectors Example Questions
Example 1: Let there be two vectors [7, 2, -1] and [6, -5, 2]. Find the dot product of the vectors.
Let the given vectors [7, 2, -1] and [6, -5, 2] be a and b respectively.
a · b = (7)(6) + (2)(-5) + (-1)(2)
= 42 – 10 – 2
= 30
Example 2: Let there be two vectors a and b with magnitudes |a| = 6 and |b| = 2, and let the angle between them be θ = 60°. Find their dot product.
a · b = |a| |b| cos θ = 6 × 2 × cos 60° = 12 × (1/2) = 6
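Both examples can be reproduced in plain Python as a sanity check. The variable names are assumptions made for this sketch. The second result is rounded before printing because cos 60° is not exactly 0.5 in floating-point arithmetic.

```python
import math

# Example 1: component form.
a = [7, 2, -1]
b = [6, -5, 2]
example_1 = sum(x * y for x, y in zip(a, b))  # 42 - 10 - 2 = 30

# Example 2: magnitude-and-angle form, a · b = |a| |b| cos θ.
magnitude_a = 6
magnitude_b = 2
theta = math.radians(60)  # math.cos expects radians
example_2 = magnitude_a * magnitude_b * math.cos(theta)  # approximately 6

print(example_1)
print(round(example_2, 10))
```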
Applications of dot products in data science
One common application of dot products in data science is in calculating the similarity between vectors. In machine learning, it is often useful to compare the similarity of two vectors in order to classify or cluster data. The dot product measures similarity through the angle between the vectors: dividing a · b by the product of the magnitudes |a| |b| gives cos θ, which is known as the cosine similarity.
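A minimal sketch of this similarity measure is shown below. The function name `cosine_similarity` and the sample vectors are assumptions made for the example; in practice a library routine would usually be used instead.

```python
import math


def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b: (a · b) / (|a| |b|)."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot(a, b) / (norm_a * norm_b)


# Hypothetical feature vectors.
v1 = [1, 2, 0]
v2 = [2, 4, 0]   # same direction as v1, different length
v3 = [0, 0, 5]   # perpendicular to v1

similar = cosine_similarity(v1, v2)     # close to 1
unrelated = cosine_similarity(v1, v3)   # 0

print(round(similar, 6), unrelated)
```

Because the result is normalized by the magnitudes, two vectors pointing the same way score close to 1 regardless of their lengths.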
Dot products are also used to calculate the projection of one vector onto another. A projection is a vector that represents the component of one vector that is parallel to another vector. The projection of a onto b is obtained by dividing the dot product a · b by the dot product of b with itself, b · b, and multiplying b by the resulting scalar: proj_b(a) = ((a · b) / (b · b)) b.
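The projection can be sketched as follows, using the standard formula ((a · b) / (b · b)) b for projecting a onto b. The function name `project` and the sample vectors are assumptions made for the illustration.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def project(a, b):
    """Project vector a onto vector b: ((a · b) / (b · b)) * b."""
    b_dot_b = dot(b, b)
    if b_dot_b == 0:
        raise ValueError("cannot project onto the zero vector")
    factor = dot(a, b) / b_dot_b
    return [factor * x for x in b]


# Hypothetical 2-D vectors: b lies along the x-axis.
a = [3, 4]
b = [2, 0]

projection = project(a, b)  # [3.0, 0.0]: the part of a that is parallel to b

# What remains after removing the projection is perpendicular to b.
remainder = [x - y for x, y in zip(a, projection)]  # [0.0, 4.0]

print(projection, remainder, dot(remainder, b))
```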
In addition to these applications, dot products appear throughout linear regression. Linear regression is a statistical method used to model the relationship between a dependent variable and one or more independent variables. Each predicted value is the dot product of the coefficient vector with a feature vector, and the quantity that linear regression minimizes, the sum of the squared errors between the predicted values and the actual values, is the dot product of the error vector with itself.
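As a rough illustration, the sketch below fits a one-variable least-squares line using only dot products. It assumes the simplest setting (one independent variable plus an intercept) and uses made-up data; the text does not say which regression setup is meant, so this is one possible reading.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical data, roughly following y = 2x + 1.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 10.0]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Center both variables around their means.
x_centered = [value - mean_x for value in x]
y_centered = [value - mean_y for value in y]

# Least-squares slope and intercept, written with dot products.
slope = dot(x_centered, y_centered) / dot(x_centered, x_centered)  # about 2.3
intercept = mean_y - slope * mean_x                                # about 0.5

# Sum of squared errors: the residual vector dotted with itself.
residuals = [actual - (slope * value + intercept) for value, actual in zip(x, y)]
sse = dot(residuals, residuals)                                    # about 0.3

print(round(slope, 6), round(intercept, 6), round(sse, 6))
```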
There are many other applications of dot products in data science, including natural language processing and recommendation systems. In natural language processing, dot products are often used to calculate the similarity between words or phrases in order to classify or cluster text data. In recommendation systems, dot products can be used to measure the similarity between users or items in order to make recommendations.
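For the recommendation case, a minimal sketch might score each item by the dot product of a user preference vector with the item's feature vector. All names and numbers below are hypothetical and chosen only to illustrate the idea.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical preference vector for one user (three made-up features).
user = [0.9, 0.1, 0.5]

# Hypothetical item feature vectors in the same feature space.
items = {
    "item_a": [1.0, 0.0, 0.0],
    "item_b": [0.0, 1.0, 0.0],
    "item_c": [0.5, 0.0, 1.0],
}

# A higher dot product means the item matches the user's preferences better.
scores = {name: dot(user, features) for name, features in items.items()}

# Rank items from best to worst match.
ranked = sorted(scores, key=scores.get, reverse=True)

print(ranked)
```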
Some basic applications of dot products are:
- Cosine similarity, one of the most important similarity metrics, is built directly on the dot product.
- Neural networks use dot products to compute weighted sums of their inputs efficiently.
- Computations in orthogonal coordinate systems, such as checking whether two vectors are perpendicular, rely on the dot product.
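The weighted sum that neural networks compute can be sketched for a single neuron as below. The function name, the ReLU activation, and the sample numbers are assumptions made for the illustration; real networks apply the same idea to whole layers at once.

```python
def dot(a, b):
    """Dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through ReLU."""
    weighted_sum = dot(inputs, weights) + bias
    # ReLU is an assumed activation; other choices work the same way.
    return max(0.0, weighted_sum)


# Hypothetical inputs, weights and bias.
inputs = [1.0, 2.0, 3.0]
weights = [0.5, -1.0, 2.0]
bias = 0.5

output = neuron_output(inputs, weights, bias)  # 0.5 - 2.0 + 6.0 + 0.5 = 5.0
print(output)
```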
The following Python example illustrates the use of the dot product.
The NumPy library for Python provides the dot() function, which calculates the dot product of vectors stored as NumPy arrays.
import numpy as np
There are two ways to compute the dot product of NumPy arrays:
The first is dot(), the built-in function of the NumPy library mentioned above.
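The arrays used in this example are not shown in the text, so the sketch below uses assumed values for u and v, chosen so that the result matches the value of 70 stated later in this section.

```python
import numpy as np

# Assumed example values: the original arrays are not shown in the text.
u = np.array([1, 2, 3, 4])
v = np.array([5, 6, 7, 8])

# Function form of the dot product.
result = np.dot(u, v)  # 1*5 + 2*6 + 3*7 + 4*8 = 70

# Equivalent method form on the array itself.
result_method = u.dot(v)

print(result)
```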
The output would be 70.
The output is computed as follows:
- Each number in the first array, u, is multiplied by the corresponding number in the second array, v, and the products are added together.
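The step-by-step calculation can be sketched in NumPy as an element-wise multiplication followed by a sum. The values of u and v are assumptions, since the original arrays are not shown; they are chosen to give the stated result of 70.

```python
import numpy as np

# Assumed example values: the original arrays are not shown in the text.
u = np.array([1, 2, 3, 4])
v = np.array([5, 6, 7, 8])

# Step 1: multiply corresponding elements.
products = u * v        # [5, 12, 21, 32]

# Step 2: add the products together.
total = products.sum()  # 5 + 12 + 21 + 32 = 70

print(products, total)
```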
The second technique is the shorthand “@” operator, which also calculates the dot product and is available only in Python 3.5 and later.
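A minimal sketch of the operator form is shown below, again with assumed values for u and v because the original arrays are not shown in the text.

```python
import numpy as np

# Assumed example values: the original arrays are not shown in the text.
u = np.array([1, 2, 3, 4])
v = np.array([5, 6, 7, 8])

# For two 1-D arrays, @ gives the same value as np.dot(u, v).
result = u @ v
print(result)
```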
The output is again 70.
For 2-D arrays, the dot() function behaves the same as matrix multiplication (the NumPy matmul() function).
Let’s take a closer look at both:
- Using .dot():
The result would be: array([[19, 22], [43, 50]])
- Using the matmul() function:
The output would be the same: array([[19, 22], [43, 50]])
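Both results can be reproduced with the sketch below. The input matrices are not shown in the text, so A and B are assumed values, chosen because their product is the array given above.

```python
import numpy as np

# Assumed example matrices: the originals are not shown in the text.
A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

with_dot = A.dot(B)             # matrix product via .dot()
with_matmul = np.matmul(A, B)   # matrix product via matmul()
with_operator = A @ B           # operator shorthand for matmul

print(with_dot)
print(np.array_equal(with_dot, with_matmul))
```

For 2-D inputs the three forms agree; they differ only for arrays with more than two dimensions.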
Generally, when working with matrices (2-D arrays), it is better to use matmul() or the @ operator, which is also what the NumPy documentation recommends for this case.
In conclusion, dot products are an important concept in data science and have a wide range of applications, including machine learning, natural language processing, and recommendation systems.
They are used to calculate the similarity between vectors, project one vector onto another, and optimize linear regression models. By understanding dot products and how they work, data scientists can effectively use them. If you’re interested in learning more about dot products and their applications, there are many resources available online that can help you further your understanding of this important concept.