Sahithyan's S3 — Applied Statistics

Hypergeometric Distribution

Hypergeometric Experiment

From a population of size $N$ containing $k$ successes, a sample of size $n$ is drawn without replacement. The number of successes in the sample is observed and denoted by $X$.

The hypergeometric distribution describes the probability of obtaining $x$ successes in a hypergeometric experiment. It is denoted by $h(x; N, n, k)$.

$$P(x) = \frac{^kC_x \times\; ^{N-k}C_{n-x}}{^NC_n}$$

Here:

  • $^kC_x$ - Number of ways to choose $x$ successes from $k$ successes.
  • $^{N-k}C_{n-x}$ - Number of ways to choose $n-x$ failures from $N-k$ failures.
  • $^NC_n$ - Total number of ways to choose $n$ items from $N$ items.
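
As a rough illustration of the formula above, the probability can be computed directly from binomial coefficients. The function name `hypergeom_pmf` and the example numbers are assumptions made for this sketch, not part of the notes.

```python
from math import comb


def hypergeom_pmf(x: int, N: int, n: int, k: int) -> float:
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items that contains k successes."""
    # x must leave room for the failures: n - x cannot exceed N - k.
    if x < max(0, n - (N - k)) or x > min(n, k):
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


# Illustrative values: N = 10 items, k = 4 successes, sample of n = 3.
# (4C1 * 6C2) / 10C3 = (4 * 15) / 120
print(hypergeom_pmf(1, N=10, n=3, k=4))  # 0.5
```

Summing the function over every feasible $x$ should give 1, which is a quick sanity check on any set of parameters.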

Mean

$$\mu_x = \frac{nk}{N}$$

Variance

$$V_x = \frac{nk(N-k)(N-n)}{N^2 (N - 1)}$$
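
The mean and variance formulas can be checked numerically by summing over the probability mass function. The sketch below uses exact fractions to avoid rounding; the helper `pmf` and the parameter values are illustrative assumptions.

```python
from fractions import Fraction
from math import comb


def pmf(x: int, N: int, n: int, k: int) -> Fraction:
    """Exact hypergeometric probability P(X = x)."""
    return Fraction(comb(k, x) * comb(N - k, n - x), comb(N, n))


# Illustrative values, not taken from the notes.
N, n, k = 10, 3, 4

# Feasible numbers of successes in the sample.
support = range(max(0, n - (N - k)), min(n, k) + 1)

# Mean and variance computed from the definition of expectation.
mean = sum(x * pmf(x, N, n, k) for x in support)
var = sum((x - mean) ** 2 * pmf(x, N, n, k) for x in support)

# Closed-form expressions from the formulas above.
mean_formula = Fraction(n * k, N)
var_formula = Fraction(n * k * (N - k) * (N - n), N**2 * (N - 1))

print(mean, mean_formula)  # 6/5 6/5
print(var, var_formula)    # 14/25 14/25
```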

Cumulative Hypergeometric Distribution

Refers to the probability that the hypergeometric random variable is greater than (or equal to) a lower limit, or less than (or equal to) an upper limit. For example, $P(X \le x) = \sum_{i=0}^{x} h(i; N, n, k)$.
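
A cumulative probability is just a sum of individual hypergeometric probabilities. The sketch below assumes the "at most" form $P(X \le x)$ and obtains the "at least" form by complement; the function names and numbers are hypothetical.

```python
from math import comb


def hypergeom_pmf(x: int, N: int, n: int, k: int) -> float:
    """P(X = x) for the hypergeometric distribution."""
    if x < max(0, n - (N - k)) or x > min(n, k):
        return 0.0
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)


def hypergeom_cdf(x: int, N: int, n: int, k: int) -> float:
    """P(X <= x): sum the pmf from the smallest feasible value up to x."""
    lowest = max(0, n - (N - k))
    highest = min(x, n, k)
    return sum(hypergeom_pmf(i, N, n, k) for i in range(lowest, highest + 1))


# Illustrative values: N = 10, k = 4 successes, sample of n = 3.
at_most_1 = hypergeom_cdf(1, N=10, n=3, k=4)   # P(X <= 1) = 1/6 + 1/2
at_least_2 = 1 - at_most_1                     # P(X >= 2) by complement
print(at_most_1, at_least_2)
```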

Multivariate Hypergeometric Distribution

Suppose a population of size $N$ contains $k$ different types of items, with $N_1, N_2, \ldots, N_k$ items of each type respectively.

$$\sum_{i=1}^{k} N_i = N$$

The multivariate hypergeometric distribution describes the probability of obtaining $x_1, x_2, \ldots, x_k$ items of each type in a sample of size $n = \sum_{i=1}^{k} x_i$ drawn without replacement from the above population. It is denoted by $h(\mathbf{x}; N, n, \mathbf{N})$.

$$P(x_1, x_2, \dots, x_k) = \frac{\binom{N_1}{x_1} \cdot \binom{N_2}{x_2} \cdot \ldots \cdot \binom{N_k}{x_k}}{\binom{N}{n}}$$
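
The multivariate formula can be sketched in the same way as the single-type case. Here the sample size $n$ is taken to be the sum of the requested counts; the function name `multivariate_hypergeom_pmf` and the urn example are assumptions for illustration only.

```python
from math import comb, prod


def multivariate_hypergeom_pmf(xs: list[int], Ns: list[int]) -> float:
    """Probability of drawing xs[i] items of type i, without replacement,
    from a population holding Ns[i] items of type i."""
    if len(xs) != len(Ns):
        raise ValueError("xs and Ns must have the same length")
    # A count below zero or above the available stock is impossible.
    if any(x < 0 or x > Ni for x, Ni in zip(xs, Ns)):
        return 0.0
    N = sum(Ns)  # population size
    n = sum(xs)  # sample size implied by the requested counts
    return prod(comb(Ni, x) for Ni, x in zip(Ns, xs)) / comb(N, n)


# Illustrative urn: 5 red, 4 blue, 3 green; draw 2 red, 1 blue, 1 green.
# (5C2 * 4C1 * 3C1) / 12C4 = 120 / 495
p = multivariate_hypergeom_pmf([2, 1, 1], [5, 4, 3])
print(p)
```

With only two types (successes and failures), the same function reduces to the ordinary hypergeometric probability.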