Central Limit Theorem

Learn about the central limit theorem and its implications in this interactive tutorial.


In this 10-15 minute tutorial we'll go over the Central Limit Theorem. The first few slides cover foundational knowledge you'll need, followed by interactive demos of the major conclusions of the Central Limit Theorem.

You can also use the arrow keys to move forward or backwards. It's highly recommended you do not skip slides.


FOUNDATIONAL KNOWLEDGE

[Image: graph of a normal distribution]

This is the graph of a normal distribution (sometimes called a bell curve). The Central Limit Theorem (CLT) tells us that with large enough samples (a common rule of thumb is n > 30), the distribution of the sample mean (referred to as Xbar) will be approximately normal, centered on the population mean (referred to as mu).

Let's say you took a sample from a population. The sample mean would fall somewhere on the graph above. Notice, however, that the further a value is from the center of the graph (the population mean), the less frequently it occurs.
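
In symbols, the idea above can be sketched as follows. The symbols sigma (the population standard deviation) and n (the sample size) are standard notation introduced here, not taken from the slide, and the statement assumes independent draws from a population with finite variance.

```latex
\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i
\qquad\Longrightarrow\qquad
\bar{X} \;\overset{\text{approx.}}{\sim}\; \mathcal{N}\!\left(\mu,\ \frac{\sigma^{2}}{n}\right)
\quad \text{for large } n
```

The variance term shrinks as n grows, which is why sample means far from mu become less frequent for larger samples.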


FOUNDATIONAL KNOWLEDGE

[Image: formulas for the sample mean (top) and the variance (bottom)]

The mean (top formula - referred to as Xbar) of any sample is the sum of the data points divided by the number of points.
The variance (bottom formula - referred to as S2) is, roughly, the average of the squared differences from the mean (for a sample, the sum of squared differences is conventionally divided by n - 1 rather than n).
Enter numbers below, separated by spaces, to have these statistics generated.
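
For readers following along without the interactive widget, here is a minimal sketch of the same calculation in Python. The function name `summarize` and the example input are hypothetical, and returning zeros for empty input is an assumption about how such a widget might behave.

```python
import statistics

def summarize(text):
    """Parse space-separated numbers and return (mean, sample variance)."""
    values = [float(token) for token in text.split()]
    if not values:
        # Assumption: report zeros when nothing has been entered yet
        return 0.0, 0.0
    mean = statistics.fmean(values)
    # statistics.variance divides by n - 1 and needs at least two data points
    variance = statistics.variance(values) if len(values) > 1 else 0.0
    return mean, variance

mean, variance = summarize("2 4 4 6")
print(f"Mean: {mean} | Variance: {variance}")
```

If you prefer the version that divides by n, `statistics.pvariance` can be swapped in for `statistics.variance`.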



FOUNDATIONAL KNOWLEDGE

Consider rolling a six-sided die. Roll the die a few times to create a sample.
Click here to roll the die.

Roll the die to get a sample mean.

The sample mean is YOUR specific result. If you had 100 of your friends do the same thing, their sample means would differ from yours. If you graphed all of those sample means, the distribution would look approximately normal, like the bell-shaped graph we saw earlier.
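
The "100 friends" thought experiment can be simulated with a short sketch. The choice of 5 rolls per friend, the fixed seed, and the helper name `roll_sample_mean` are assumptions made for illustration.

```python
import random
import statistics

def roll_sample_mean(rng, rolls=5):
    """Roll a fair six-sided die `rolls` times and return the sample mean."""
    return statistics.fmean(rng.randint(1, 6) for _ in range(rolls))

rng = random.Random(42)  # fixed seed so the run is repeatable
friend_means = [roll_sample_mean(rng) for _ in range(100)]

# Crude text histogram: bucket each sample mean to the nearest 0.5
buckets = {}
for m in friend_means:
    key = round(m * 2) / 2
    buckets[key] = buckets.get(key, 0) + 1
for key in sorted(buckets):
    print(f"{key:>4}: {'#' * buckets[key]}")

print("mean of the sample means:", statistics.fmean(friend_means))
```

The bars should pile up near 3.5 (the expected value of a fair die) and thin out toward 1 and 6, which is the rough bell shape described above.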


Taking Samples

Before getting into further details, let's collect some data (a sample) that we'll use as the basis of this tutorial.

In this sample, you will be guessing the number of months it takes for a cat to get adopted from a shelter, based on the appearance of the cat.

[Image: photo of a cat]

How many months before this cat was adopted?

[Chart: Sample Results. Axis: Adoption Time]

Notice that your guesses don't seem to follow any pattern (they don't follow a particular distribution). If several other people did the same experiment, their guesses would vary from yours but would likely still not follow a particular distribution.

Above is the graph of your adoption time guesses for each cat. Notice how this graph doesn't seem to follow any kind of distribution? The results look completely random!
Imagine you just played the game but rated 50 cats instead of just 5. If more cats were rated, the graph would look just as random, if not more so. We'll see this in the next slide.


These are the estimated adoption times for 50 different cats. We treat the graph above as one sample. Notice that there still doesn't seem to be a particular probability distribution. The way the cats are rated is essentially random. Most experiments will use multiple samples to improve accuracy. Now let's see...
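
A single sample of 50 guesses can be mimicked with the sketch below. It assumes each guess is a whole number of months from 0 to 5 chosen uniformly at random, which is a simplification and not necessarily how the tutorial generates its data.

```python
import random
from collections import Counter

rng = random.Random(7)  # fixed seed so the run is repeatable
# Assumption: each guess is a whole number of months from 0 to 5, chosen uniformly
guesses = [rng.randint(0, 5) for _ in range(50)]

counts = Counter(guesses)
for months in range(6):
    print(f"{months} months: {'#' * counts[months]}")
```

The bars come out uneven with no bell shape, since individual guesses stay random no matter how many are collected.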

What happens?

Below is a graph of the means of very large samples. Notice that as you increase the size of the sample, the means of the experiments conform more closely to a normal distribution.

[Chart: average adoption time across samples. Y-axis: Number of Times an Average Occurs. X-axis: Average Time Until Adoption (months), from 0 to 5]
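
The effect of sample size on the sample means can be checked numerically with a rough sketch. It assumes guesses are uniform between 0 and 5 months, and the helper name `sample_means`, the sample sizes, and the 2000 repetitions are arbitrary choices for illustration.

```python
import random
import statistics

def sample_means(rng, sample_size, num_samples=2000):
    """Return `num_samples` sample means, each from `sample_size` uniform guesses."""
    return [
        statistics.fmean(rng.uniform(0, 5) for _ in range(sample_size))
        for _ in range(num_samples)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
results = {}
for n in (1, 5, 50):
    means = sample_means(rng, n)
    center = statistics.fmean(means)
    spread = statistics.stdev(means)
    # For a normal distribution, about 68% of values lie within one standard deviation
    within_one_sd = sum(abs(m - center) <= spread for m in means) / len(means)
    results[n] = (center, spread, within_one_sd)
    print(f"n={n:>2}  center={center:.2f}  spread={spread:.2f}  within 1 SD={within_one_sd:.0%}")
```

As n grows, the spread of the sample means shrinks and the share within one standard deviation settles near the 68% expected of a normal distribution, even though the individual guesses are flat rather than bell-shaped.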

Thank You

Thanks for going through this interactive tutorial. If you'd like a deeper understanding, check out some of the external links below. This tutorial was created by Amir and Nick.


Want to learn more?

Normal Distribution - MathIsFun
Normal Distribution - Khan Academy
Explained Simply - YouTube