Maths encyclopedia and lessons  

Mahalanobis distance

In statistics, the Mahalanobis distance is a distance measure introduced by P. C. Mahalanobis in 1936. It is based on the correlations between variables, through which different patterns can be identified and analysed, and it is a useful way of determining the similarity of an unknown sample set to a known one. It differs from Euclidean distance in that it takes into account the correlations of the data set.

Formally, the Mahalanobis distance of a multivariate vector x = ( x_1, x_2, x_3, \dots, x_p ) from a group of values with mean \mu = ( \mu_1, \mu_2, \mu_3, \dots, \mu_p ) and covariance matrix \Sigma is defined as follows, where the prime denotes transposition:

D_M(x) = \sqrt{(x - \mu)' \Sigma^{-1} (x - \mu)}.
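
The definition above can be sketched in code. The following is a minimal illustration in plain Python, not a reference implementation: the function names `solve_linear` and `mahalanobis`, and the example mean, covariance matrix, and points, are all assumptions chosen for illustration. Rather than forming \Sigma^{-1} explicitly, the sketch solves the linear system \Sigma z = x - \mu by Gaussian elimination, which yields the same quadratic form.

```python
import math


def solve_linear(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's matrix is not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (m[i][n] - s) / m[i][i]
    return z


def mahalanobis(x, mu, cov):
    """Return sqrt((x - mu)' cov^{-1} (x - mu))."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve_linear(cov, diff)  # z = cov^{-1} (x - mu)
    return math.sqrt(sum(d * zi for d, zi in zip(diff, z)))


# Illustrative data (assumed): two positively correlated variables.
mu = [0.0, 0.0]
cov = [[2.0, 1.0], [1.0, 2.0]]

# Both points are at Euclidean distance sqrt(2) from the mean, but the
# point lying along the direction of correlation is closer in the
# Mahalanobis sense than the point lying against it.
print(mahalanobis([1.0, 1.0], mu, cov))
print(mahalanobis([1.0, -1.0], mu, cov))
```

In this example the two points are equally far from the mean in the Euclidean sense, yet their Mahalanobis distances differ, which is the effect of taking the correlations into account.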

The Mahalanobis distance can also be defined as a dissimilarity measure between two random vectors \vec{x} and \vec{y} drawn from the same distribution with covariance matrix \Sigma:

d(\vec{x},\vec{y}) = \sqrt{(\vec{x}-\vec{y})' \Sigma^{-1} (\vec{x}-\vec{y})}.

If the covariance matrix is the identity matrix, the Mahalanobis distance reduces to the Euclidean distance. If the covariance matrix is diagonal, the resulting distance measure is called the normalized Euclidean distance:

d(\vec{x},\vec{y}) = \sqrt{\sum_{i=1}^p \frac{(x_i - y_i)^2}{\sigma_i^2}},

where \sigma_i is the standard deviation of x_i over the sample set.
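
The diagonal case follows because the inverse of a diagonal covariance matrix with entries \sigma_i^2 is the diagonal matrix with entries 1 / \sigma_i^2, so the quadratic form collapses to a sum over components. A minimal Python sketch of both special cases is shown below; the function names and the sample values are assumptions for illustration only.

```python
import math


def euclidean(x, y):
    """Ordinary Euclidean distance between two vectors."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def normalized_euclidean(x, y, sigma):
    """Mahalanobis distance for a diagonal covariance matrix.

    sigma[i] is the standard deviation of component i, so each
    difference is rescaled by its own spread before squaring.
    """
    return math.sqrt(
        sum(((xi - yi) / si) ** 2 for xi, yi, si in zip(x, y, sigma))
    )


# Illustrative data (assumed): the second variable has a much larger spread.
x = [1.0, 200.0]
y = [3.0, 500.0]
sigma = [1.0, 100.0]

# The raw Euclidean distance is dominated by the second component;
# after normalization both components contribute on a comparable scale.
print(euclidean(x, y))
print(normalized_euclidean(x, y, sigma))

# With unit standard deviations (identity covariance) the two agree.
print(math.isclose(normalized_euclidean(x, y, [1.0, 1.0]), euclidean(x, y)))
```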


01-04-2007 01:18:14
The contents of this article are licensed from Wikipedia.org under the GNU Free Documentation License.