In mathematics, the hypergeometric distribution is a discrete probability distribution that describes the number of successes in a sequence of n draws from a finite population without replacement.
A typical example is the following: There is a shipment of N objects in which D are defective. The hypergeometric distribution describes the probability that in a sample of n distinct objects drawn from the shipment exactly k objects are defective.
In general, if a random variable X follows the hypergeometric distribution with parameters N, D and n, then the probability of getting exactly k successes is given by
$$P(X = k) = \frac{\binom{D}{k}\binom{N-D}{n-k}}{\binom{N}{n}}.$$
The probability is positive exactly when k is an integer with max{ 0, D + n − N } ≤ k ≤ min{ n, D }.
The formula can be understood as follows: There are $\binom{N}{n}$ possible samples (without replacement). There are $\binom{D}{k}$ ways to obtain k defective objects and there are $\binom{N-D}{n-k}$ ways to fill out the rest of the sample with non-defective objects.
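The counting argument above translates directly into code. The following is a minimal sketch, assuming Python 3.8 or later for `math.comb`; the function name `hypergeom_pmf` and the example shipment are illustrative choices, not part of the original text.

```python
from math import comb

def hypergeom_pmf(k, N, D, n):
    """Probability of exactly k defective objects in a sample of n,
    drawn without replacement from N objects of which D are defective."""
    # Outside the support the probability is zero.
    if k < max(0, D + n - N) or k > min(n, D):
        return 0.0
    # (ways to pick k defective) * (ways to pick n - k non-defective)
    # divided by (number of possible samples).
    return comb(D, k) * comb(N - D, n - k) / comb(N, n)

# Hypothetical shipment: 5 objects, 2 defective, sample of 2.
for k in range(3):
    print(k, hypergeom_pmf(k, N=5, D=2, n=2))
```

The explicit support check keeps `math.comb` from being called with a negative argument, which would raise `ValueError`.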
When the population size N is large compared to the sample size n, the hypergeometric distribution can be approximated reasonably well by a binomial distribution with parameters n (number of trials) and p = D / N (probability of success in a single trial).
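The quality of the binomial approximation can be checked numerically. The sketch below compares the two probability mass functions while the population grows and the defective fraction D / N and the sample size n stay fixed. The helper names and the specific population sizes are assumptions made for illustration.

```python
from math import comb

def hypergeom_pmf(k, N, D, n):
    # Sampling without replacement.
    return comb(D, k) * comb(N - D, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    # Sampling with replacement (independent trials).
    return comb(n, k) * p**k * (1 - p)**(n - k)

def max_abs_gap(N, D, n):
    """Largest pointwise difference between the two distributions
    over the support of the hypergeometric distribution."""
    p = D / N
    lo, hi = max(0, D + n - N), min(n, D)
    return max(abs(hypergeom_pmf(k, N, D, n) - binom_pmf(k, n, p))
               for k in range(lo, hi + 1))

# Keep p = 0.2 and n = 10 fixed while the population grows.
for scale in (1, 10, 100):
    print(50 * scale, max_abs_gap(N=50 * scale, D=10 * scale, n=10))
```

The printed gap should shrink as N grows, since removing a drawn object changes the remaining proportion of defective objects less and less.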
The fact that the sum of the probabilities, as k runs through the range of possible values, is equal to 1, is essentially Vandermonde's identity from combinatorics.
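In the notation used here, Vandermonde's identity reads $\sum_{k} \binom{D}{k}\binom{N-D}{n-k} = \binom{N}{n}$; dividing both sides by $\binom{N}{n}$ gives a total probability of 1. A small sketch that checks the identity with exact integer arithmetic follows; the function name and the test parameters are illustrative assumptions.

```python
from math import comb

def vandermonde_holds(N, D, n):
    """Check sum_k C(D, k) * C(N - D, n - k) == C(N, n) exactly."""
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so terms outside the support contribute nothing to the sum.
    total = sum(comb(D, k) * comb(N - D, n - k) for k in range(n + 1))
    return total == comb(N, n)

print(vandermonde_holds(N=50, D=5, n=10))
```

Because the check uses integers only, it is not affected by floating-point rounding.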