Sahithyan's S3 — Applied Statistics

Sampling Distribution

The probability distribution of a statistic obtained from a large number of samples drawn from a specific population. It is the distribution that results when the following process is repeated many times:

A random sample of size $n$ is drawn from a population.
A statistic (e.g. the mean, a proportion, or the variance) is calculated for that sample.
The frequency distribution of the statistic is plotted.
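The repeated process above can be sketched as a small simulation. This is only an illustration: the population (the integers 1 to 100), the sample size, the number of repetitions, and the helper name `sampling_distribution` are all assumptions chosen for the example, not part of the notes.

```python
import random
import statistics

def sampling_distribution(population, n, num_samples, statistic=statistics.mean, seed=0):
    """Repeatedly draw a random sample of size n and record the statistic."""
    rng = random.Random(seed)
    return [statistic(rng.sample(population, n)) for _ in range(num_samples)]

# Hypothetical population: the integers 1..100 (population mean 50.5)
population = list(range(1, 101))
sample_means = sampling_distribution(population, n=10, num_samples=2000)

# Crude text "plot": frequency of sample means in bins of width 5
bins = {}
for m in sample_means:
    lower = int(m // 5) * 5
    bins[lower] = bins.get(lower, 0) + 1
for lower in sorted(bins):
    print(f"{lower:3d}-{lower + 5:3d}: {bins[lower]}")
```

Passing `statistic=statistics.variance` instead would give a sketch of the sampling distribution of the sample variance.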

Variability

The variability of a sampling distribution is measured by its variance or standard deviation. It depends on:

The total number of observations in the population
The number of observations in each sample
The way the samples are selected
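As a rough illustration of one of these factors, the sketch below estimates the standard deviation of the sample mean for a few sample sizes. The uniform population, the sample sizes, and the helper name `spread_of_sample_means` are assumptions made for the example; it does not address the other two factors.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population: 10,000 values drawn uniformly from [0, 100)
population = [rng.uniform(0, 100) for _ in range(10_000)]

def spread_of_sample_means(n, num_samples=1000):
    """Standard deviation of the means of num_samples samples of size n."""
    means = [statistics.fmean(rng.sample(population, n)) for _ in range(num_samples)]
    return statistics.stdev(means)

# Larger samples should give a less variable sampling distribution
spreads = {n: spread_of_sample_means(n) for n in (5, 20, 80)}
for n, s in spreads.items():
    print(f"n={n:2d}: std dev of sample means = {s:.2f}")
```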

Central Limit Theorem

Also known as the CLT. It states that the sampling distribution of the sample mean will be normal or nearly normal, provided the sample size is large enough. As a rule of thumb, a sample size of 30 is considered large enough.
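A minimal simulation sketch of this idea, assuming a strongly right-skewed population (exponential with rate 1, chosen only for illustration). It compares the skewness of the sample means for a small and a large sample size; a value closer to 0 indicates a more symmetric, more nearly normal shape. The helper names are hypothetical.

```python
import random
import statistics

rng = random.Random(42)

def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean([((v - m) / s) ** 3 for v in values])

def sample_means(n, num_samples=5000):
    """Means of num_samples samples of size n from an exponential population."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(num_samples)
    ]

skew_by_n = {n: skewness(sample_means(n)) for n in (2, 30)}
for n, g in skew_by_n.items():
    print(f"n={n:2d}: skewness of sample means = {g:.2f}")
```

With these assumptions, the skewness for the larger sample size should come out noticeably closer to 0 than for the smaller one.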

The sampling distribution can also be treated as normal in any of the following cases:

The population is normally distributed
The population distribution is symmetric, unimodal, without outliers, and the sample size is 15 or less.
The population distribution is moderately skewed, unimodal, without outliers, and the sample size is between 16 and 40.
The sample size is greater than 40, without outliers.

Sampling Distribution of Mean

Suppose $\mu$ is the population mean and $\sigma$ is the population standard deviation.

If the below conditions are met:

The population is normally distributed OR the sample size is large enough.
The population standard deviation is known.

Then:

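$$\overline{x} \sim N \left(\mu, \frac{\sigma^2}{n} \right)$$

Here the second parameter is the variance, so the standard deviation of $\overline{x}$ (the standard error) is $\frac{\sigma}{\sqrt{n}}$.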
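A short worked example may help, using made-up numbers: assume $\mu = 100$, $\sigma = 15$ and $n = 36$, and ask for the probability that the sample mean exceeds 105. The sketch uses `statistics.NormalDist` from the Python standard library, which takes the standard deviation (not the variance) as its second argument.

```python
import math
from statistics import NormalDist

# Hypothetical values, chosen only for illustration
mu, sigma, n = 100.0, 15.0, 36

# Standard deviation of the sample mean (standard error): sigma / sqrt(n)
standard_error = sigma / math.sqrt(n)

# Sampling distribution of the mean; NormalDist expects a standard deviation
x_bar_dist = NormalDist(mu, standard_error)

# P(sample mean > 105)
prob_above_105 = 1 - x_bar_dist.cdf(105)
print(standard_error, round(prob_above_105, 4))
```

Here the standard error is 2.5, so 105 lies two standard errors above the mean and the probability is about 0.023.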

Sampling Distribution of Proportion

Suppose that, from a population of size $N$, all possible samples of size $n$ are drawn and the sampling distribution of the proportion of successes is calculated. Let $p$ be the population proportion of successes and $\overline{p}$ the sample proportion.

If the below conditions are met:

The sample size is large enough (a common rule of thumb is that both $np$ and $n(1-p)$ are at least 10).
The population proportion is known.

Then:

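$$\overline{p} \sim N \left(p,\frac{p(1-p)}{n} \right)$$

Here the second parameter is the variance, so the standard deviation of $\overline{p}$ is $\sqrt{\frac{p(1-p)}{n}}$.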
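As a worked sketch with made-up numbers, assume $p = 0.4$ and $n = 100$, and ask for the probability that the sample proportion is at most 0.45. The standard deviation used is $\sqrt{p(1-p)/n}$, and `statistics.NormalDist` takes a standard deviation as its second argument.

```python
import math
from statistics import NormalDist

# Hypothetical values, chosen only for illustration
p, n = 0.4, 100

# Standard deviation of the sample proportion: sqrt(p * (1 - p) / n)
standard_error = math.sqrt(p * (1 - p) / n)

# Normal approximation to the sampling distribution of the proportion
p_bar_dist = NormalDist(p, standard_error)

# P(sample proportion <= 0.45)
prob_at_most_045 = p_bar_dist.cdf(0.45)
print(round(standard_error, 4), round(prob_at_most_045, 3))
```

With these numbers the standard deviation is about 0.049, so 0.45 lies roughly one standard deviation above $p$.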