Ways to Measure Distance
There are several ways to measure distance between two points in a multidimensional space. The choice of distance metric depends on the specific problem and the characteristics of the data.
Euclidean Distance
The Euclidean distance is the straight-line distance between two points in a multidimensional space. It is calculated as the square root of the sum of the squared differences between the corresponding coordinates of the two points.
d(A, B) = sqrt((x1 - y1)^2 + (x2 - y2)^2 + ... + (xn - yn)^2)
For example, let’s say we have two points in a 2D space:
- Point A: (1, 2)
- Point B: (4, 6)
The Euclidean distance between these two points is calculated as:
sqrt((1 - 4)^2 + (2 - 6)^2) = sqrt(9 + 16) = sqrt(25) = 5
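The same calculation can be sketched in Python. This is a minimal illustration, assuming points are given as equal-length tuples of numbers; the function name euclidean_distance is chosen here for illustration and is not part of the original text.

```python
import math


def euclidean_distance(a, b):
    """Return the straight-line distance between two points.

    Assumes a and b are equal-length sequences of numbers.
    """
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


point_a = (1, 2)
point_b = (4, 6)
print(euclidean_distance(point_a, point_b))  # 5.0
```

In Python 3.8 and later, math.dist(point_a, point_b) from the standard library computes the same value.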
Manhattan Distance
The Manhattan distance is the sum of the absolute differences between the corresponding coordinates of the two points. It corresponds to the distance traveled when movement is restricted to the coordinate axes, like walking along the blocks of a city grid.
d(A, B) = |x1 - y1| + |x2 - y2| + ... + |xn - yn|
For example, let’s say we have two points in a 2D space:
- Point A: (1, 2)
- Point B: (4, 6)
The Manhattan distance between these two points is calculated as:
|1 - 4| + |2 - 6| = 3 + 4 = 7
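A minimal Python sketch of this sum, assuming points are given as equal-length tuples of numbers; the name manhattan_distance is illustrative only.

```python
def manhattan_distance(a, b):
    """Return the sum of absolute coordinate differences between two points.

    Assumes a and b are equal-length sequences of numbers.
    """
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    return sum(abs(x - y) for x, y in zip(a, b))


print(manhattan_distance((1, 2), (4, 6)))  # 7
```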
Minkowski Distance
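The Minkowski distance is a generalization of the Euclidean and Manhattan distances. It is calculated as the p-th root of the sum of the absolute differences between the corresponding coordinates of the two points, each raised to the power p. With p = 1 it reduces to the Manhattan distance, and with p = 2 it reduces to the Euclidean distance.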
d(A, B) = (|x1 - y1|^p + |x2 - y2|^p + ... + |xn - yn|^p)^(1/p)
For example, let’s say we have two points in a 2D space:
- Point A: (1, 2)
- Point B: (4, 6)
The Minkowski distance between these two points with a power of 2 is calculated as:
(|1 - 4|^2 + |2 - 6|^2)^(1/2) = (9 + 16)^(1/2) = 25^(1/2) = 5
The Minkowski distance between these two points with a power of 3 is calculated as:
(|1 - 4|^3 + |2 - 6|^3)^(1/3) = (27 + 64)^(1/3) = 91^(1/3) ≈ 4.498
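The formula d(A, B) = (|x1 - y1|^p + ... + |xn - yn|^p)^(1/p) can be sketched in Python as follows. This is an illustration under stated assumptions: points are equal-length tuples, p is at least 1, and the name minkowski_distance is hypothetical. Note that the sum of powers is raised to 1/p at the end, as the formula requires.

```python
def minkowski_distance(a, b, p):
    """Return the Minkowski distance of order p between two points.

    Assumes a and b are equal-length sequences of numbers and p >= 1.
    """
    if len(a) != len(b):
        raise ValueError("points must have the same number of coordinates")
    if p < 1:
        raise ValueError("p must be at least 1")
    # Sum the p-th powers of the absolute differences, then take the p-th root.
    total = sum(abs(x - y) ** p for x, y in zip(a, b))
    return total ** (1 / p)


point_a = (1, 2)
point_b = (4, 6)
print(minkowski_distance(point_a, point_b, 1))            # 7.0 (Manhattan)
print(minkowski_distance(point_a, point_b, 2))            # 5.0 (Euclidean)
print(round(minkowski_distance(point_a, point_b, 3), 3))  # 4.498
```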
Hamming Distance
The Hamming distance is a measure of the difference between two strings of equal length. It is calculated as the number of positions at which the corresponding symbols are different.
For example, let’s say we have two strings:
- String A: AABB
- String B: ABBC
The Hamming distance between these two strings is calculated as:
- Position 1: A vs A (same)
- Position 2: A vs B (different)
- Position 3: B vs B (same)
- Position 4: B vs C (different)
d(A, B) = 2
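The position-by-position comparison above can be sketched in Python. This is a minimal example that assumes both inputs are strings of equal length; the name hamming_distance is illustrative.

```python
def hamming_distance(s, t):
    """Return the number of positions at which two equal-length strings differ."""
    if len(s) != len(t):
        raise ValueError("strings must have the same length")
    # Each mismatched pair contributes 1 to the count.
    return sum(c1 != c2 for c1, c2 in zip(s, t))


print(hamming_distance("AABB", "ABBC"))  # 2
```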
Jaccard Distance
The Jaccard distance is a measure of the dissimilarity between two sets. It is calculated as one minus the Jaccard similarity, which is the size of the intersection divided by the size of the union of the two sets.
Jaccard Similarity = |A ∩ B| / |A ∪ B|
For example, let’s say we have two sets:
- Set A: {1, 2, 3}
- Set B: {2, 3, 4}
The Jaccard similarity between these two sets is calculated as:
size(A ∩ B) / size(A ∪ B) = size({2, 3}) / size({1, 2, 3, 4}) = 2 / 4 = 0.5
Therefore, the Jaccard distance is: 1 - 0.5 = 0.5
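A short Python sketch of this calculation using built-in sets. The name jaccard_distance is illustrative, and returning 0.0 for two empty sets is an assumption made here, since the ratio is undefined when the union is empty.

```python
def jaccard_distance(a, b):
    """Return 1 minus the Jaccard similarity of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Assumption: two empty sets are treated as identical (distance 0).
        return 0.0
    similarity = len(a & b) / len(union)
    return 1 - similarity


print(jaccard_distance({1, 2, 3}, {2, 3, 4}))  # 0.5
```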
Cosine Distance
The cosine distance is a measure of the dissimilarity in direction between two vectors. It is calculated as one minus the cosine similarity, which is the dot product of the two vectors divided by the product of their magnitudes.
Cosine Similarity = (A · B) / (||A|| × ||B||)
For example, let’s say we have two vectors:
- Vector A: [1, 2, 3]
- Vector B: [2, 3, 4]
The cosine similarity between these two vectors is calculated as:
A · B = 1 × 2 + 2 × 3 + 3 × 4 = 2 + 6 + 12 = 20
||A|| = sqrt(1^2 + 2^2 + 3^2) = sqrt(14)
||B|| = sqrt(2^2 + 3^2 + 4^2) = sqrt(29)
Cosine Similarity = 20 / (sqrt(14) × sqrt(29)) ≈ 20 / 20.149 ≈ 0.9926
Therefore, the cosine distance is: 1 - 0.9926 = 0.0074
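The steps above (dot product, magnitudes, similarity, then distance) can be sketched in Python. This is a minimal illustration that assumes equal-length lists of numbers; the name cosine_distance is hypothetical, and raising an error for zero vectors is a choice made here because the similarity is undefined when a magnitude is zero.

```python
import math


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # Assumption: treat the zero vector as an error, since the ratio is undefined.
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1 - dot / (norm_a * norm_b)


print(round(cosine_distance([1, 2, 3], [2, 3, 4]), 4))  # 0.0074
```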





