Standard Deviation and Variance
Deviation
The deviation of a particular data point $x_i$ is its distance from the mean $\mu$ of the dataset: $x_i - \mu$.
Variance
The variance of a dataset is the average of the squared distances from the mean:
$$\sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2$$
where:
- $N$ is the population size
- $\mu$ is the mean of the dataset
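As a rough illustration of the definition above, the following sketch computes the population variance step by step. The function name `population_variance` and the dataset are hypothetical, chosen only for this example.

```python
def population_variance(data):
    """Average of the squared distances from the mean."""
    n = len(data)                          # population size
    mean = sum(data) / n                   # mean of the dataset
    deviations = [x - mean for x in data]  # distance of each point from the mean
    return sum(d * d for d in deviations) / n


# Hypothetical dataset, used only for illustration.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_variance(sample))  # 4.0
```

Here the mean is 5, the squared deviations sum to 32, and dividing by the 8 data points gives 4.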
Rewriting this in terms of expected value, given a random variable $X$:
$$\operatorname{Var}(X) = E\left[(X - E[X])^2\right] = E[X^2] - (E[X])^2$$
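- If $c$ is constant, then $\operatorname{Var}(cX) = c^2 \operatorname{Var}(X)$
- If $X$ and $Y$ are independent, then $\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)$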
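The expected-value form and the two properties can be checked exactly on small discrete distributions. The sketch below is only an illustration: the helper names, the fair die, and the fair coin are hypothetical, and it assumes the constant rule in question is the scaling rule Var(cX) = c^2 Var(X).

```python
from fractions import Fraction
from itertools import product


def expectation(dist, f=lambda v: v):
    """E[f(X)] for a finite distribution given as {value: probability}."""
    return sum(p * f(v) for v, p in dist.items())


def variance(dist):
    """Var(X) = E[(X - E[X])^2]."""
    mu = expectation(dist)
    return expectation(dist, lambda v: (v - mu) ** 2)


# Hypothetical example: X is a fair six-sided die, Y is a fair coin (0 or 1).
die = {v: Fraction(1, 6) for v in range(1, 7)}
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Shortcut form: E[X^2] - (E[X])^2
shortcut = expectation(die, lambda v: v * v) - expectation(die) ** 2

# Scaling by a constant c (assumed rule: Var(cX) = c^2 Var(X)).
c = 3
scaled = {c * v: p for v, p in die.items()}

# Distribution of X + Y; independence means joint probability = product of marginals.
total = {}
for (x, px), (y, py) in product(die.items(), coin.items()):
    total[x + y] = total.get(x + y, 0) + px * py

print(variance(die))                                      # 35/12
print(variance(die) == shortcut)                          # True
print(variance(scaled) == c ** 2 * variance(die))         # True
print(variance(total) == variance(die) + variance(coin))  # True
```

Using `Fraction` keeps every probability exact, so the equalities hold without floating-point tolerance.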
Standard Deviation
Standard deviation is a measure of how spread out numbers are. It is defined as the square root of the variance.
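A minimal sketch of that relationship, using Python's standard `statistics` module for the population variance and taking the square root by hand. The dataset is hypothetical.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical dataset

variance = statistics.pvariance(data)  # population variance
std_dev = math.sqrt(variance)          # square root of the variance

print(std_dev)  # 2.0
# The library's own population standard deviation agrees with the manual result.
print(math.isclose(std_dev, statistics.pstdev(data)))  # True
```

Note that `pvariance` and `pstdev` divide by the population size; `statistics.variance` and `statistics.stdev` divide by one less than the sample size and would give different values.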
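To see why the differences are squared, suppose we have a dataset in which every value is equally far from the mean, half above it and half below.
If we just sum the differences from the mean, the positive and negative differences cancel and we would have a total of zero, whatever the spread.
If we try the absolute value of each difference, we have a usable number: the average absolute difference.
But consider a second dataset with the same average absolute difference, in which some values lie far from the mean and others close to it. The absolute value cannot tell the two datasets apart.
If we square the differences instead, the large differences count for more, and we find a larger result for the second dataset than for the first,
so the standard deviation will be larger if the data is more spread out.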
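The comparison can be sketched numerically. The two datasets below are hypothetical, chosen so that both have a mean of zero and the same average absolute difference; they are not taken from the original notes.

```python
import math


def spread_measures(data):
    """Return three candidate measures of spread for a dataset."""
    n = len(data)
    mean = sum(data) / n
    diffs = [x - mean for x in data]
    return {
        "sum_of_differences": sum(diffs),
        "mean_absolute_difference": sum(abs(d) for d in diffs) / n,
        "standard_deviation": math.sqrt(sum(d * d for d in diffs) / n),
    }


# Hypothetical datasets, both with mean 0.
even = [4, 4, -4, -4]     # every value is 4 away from the mean
uneven = [7, 1, -6, -2]   # same total distance, spread less evenly

a = spread_measures(even)
b = spread_measures(uneven)

print(a["sum_of_differences"], b["sum_of_differences"])              # 0.0 0.0
print(a["mean_absolute_difference"], b["mean_absolute_difference"])  # 4.0 4.0
print(a["standard_deviation"] < b["standard_deviation"])             # True
```

Both the plain sum and the average absolute difference give identical results for the two datasets, while the squared version separates them: 4.0 for the first and roughly 4.74 for the second.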