# Statistics Formulas and Notes

Mean, Median, Variance, Standard Deviation, Range

Mean

The mean is the sum of all data values divided by the number of data values.
\bar{x} (x bar) is used to denote the mean.
\bar{x}=\frac{\sum x_{i}}{n}
Example
Data set: 3, 4, 5, 6, 7.
\bar{x}=\frac{\sum x_{i}}{n}
\bar{x}=\frac{ 3+4+5+6+7}{5}
\bar{x}=\frac{ 25}{5}
\bar{x}=5
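
The same calculation can be sketched in Python. This is a minimal illustration, not a reference implementation; the function name `mean` is chosen here for the example and does not come from the notes.

```python
def mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


data = [3, 4, 5, 6, 7]
print(mean(data))  # 5.0
```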

Median

The median is the middle value of a dataset when the dataset is arranged in ascending order. If the dataset has an even number of values, the median is the average of the two middle values.
Example 1
Data set: 9, 3, 1, 2, 8, 6, 7.
In ascending order: 1 , 2 , 3 , 6 , 7 , 8 , 9.
6 is the value in the middle. 6 is the median.

Example 2
Data set: 7, 6, 4, 9, 5, 1, 2, 8.
In ascending order: 1 , 2 , 4 , 5 , 6 , 7 , 8 , 9.
In this case we have two numbers that are in the middle: 5 and 6. We add these numbers together and divide by 2 to get the median.
median = \frac{5+6}{2}
median = 5.5
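
Both cases (odd and even number of values) can be handled by one short function. The sketch below is illustrative only; the name `median` and the error handling are assumptions made for this example.

```python
def median(values):
    """Return the middle value of the sorted data, or the average
    of the two middle values when the count is even."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: exactly one middle value.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([9, 3, 1, 2, 8, 6, 7]))     # 6
print(median([7, 6, 4, 9, 5, 1, 2, 8]))  # 5.5
```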

Sample Variance

The variance is a measure of how spread out numbers are. s^{2} is used to denote the sample variance.
s^{2}=\frac{\sum \left( x_{i}- \bar{x} \right)^{2}}{n-1}
\bar{x}: mean
n: number of values
Example 1
Data set: 1 , 2 , 3 , 8 , 9.
\bar{x} = 4.6
n = 5
s^{2}=\frac{\sum \left( x_{i}- \bar{x} \right)^{2}}{n-1}
s^{2}=\frac{ \left( 1 - 4.6 \right)^{2} + \left( 2 - 4.6 \right)^{2} + \left( 3 - 4.6 \right)^{2} + \left( 8 - 4.6 \right)^{2} + \left( 9 - 4.6 \right)^{2} }{5-1}
s^{2}=\frac{ 53.2 }{4}
s^{2}= 13.3
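
As a rough check of the arithmetic above, the sample variance formula can be written out in Python. The function name `sample_variance` is hypothetical, and the result is rounded when printed because floating-point arithmetic may leave a tiny residual error.

```python
def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample variance needs at least two values")
    x_bar = sum(values) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return squared_deviations / (n - 1)


data = [1, 2, 3, 8, 9]
print(round(sample_variance(data), 4))  # 13.3
```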

Sample Standard Deviation

The standard deviation is the square root of the variance.
s= \sqrt{ \frac{\sum \left( x_{i}- \bar{x} \right)^{2}}{n-1}}
For the example above, the sample variance is 13.3.
s= \sqrt{ 13.3 }
s \approx 3.647
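
A minimal Python sketch of the same step, assuming the data set from the sample variance example (1, 2, 3, 8, 9). The function name `sample_std_dev` is made up for this illustration; `statistics.stdev` from the standard library is used only as a cross-check.

```python
import math
import statistics


def sample_std_dev(values):
    """Square root of the sample variance (divisor n - 1)."""
    n = len(values)
    if n < 2:
        raise ValueError("the sample standard deviation needs at least two values")
    x_bar = sum(values) / n
    variance = sum((x - x_bar) ** 2 for x in values) / (n - 1)
    return math.sqrt(variance)


data = [1, 2, 3, 8, 9]
print(round(sample_std_dev(data), 3))    # 3.647
print(round(statistics.stdev(data), 3))  # 3.647
```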

Population Variance

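The variance is a measure of how spread out numbers are. \sigma^{2} is used to denote the population variance. Unlike the sample variance, the sum is divided by n rather than n-1.
\sigma^{2}=\frac{\sum \left( x_{i}- \mu \right)^{2}}{n}
\mu: population mean
n: number of values in the population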

Population Standard Deviation

The standard deviation is the square root of the variance.
\sigma= \sqrt{ \frac{\sum \left( x_{i}- \mu \right)^{2}}{n}}
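
The notes give no worked example for the population formulas, so the sketch below reuses the data set 1, 2, 3, 8, 9 and treats it, purely as an assumption for illustration, as a complete population. The function name `population_variance` is hypothetical. The only difference from the sample version is the divisor n.

```python
import math


def population_variance(values):
    """Sum of squared deviations from the mean, divided by n."""
    n = len(values)
    if n == 0:
        raise ValueError("the variance of an empty data set is undefined")
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


data = [1, 2, 3, 8, 9]  # assumed here to be the whole population
variance = population_variance(data)
print(round(variance, 4))             # 10.64
print(round(math.sqrt(variance), 3))  # 3.262
```

The sum of squared deviations is 53.2 in both cases; dividing by 5 instead of 4 gives 10.64 rather than 13.3.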

Range

Range = highest value - lowest value
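
A one-line illustration in Python, using an arbitrary data set chosen for this example. The name `value_range` is hypothetical and avoids shadowing Python's built-in `range`.

```python
def value_range(values):
    """Highest value minus lowest value."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


print(value_range([1, 2, 3, 8, 9]))  # 8
```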

Regression, Correlation

Correlation Coefficient

The correlation coefficient measures the strength and direction of the linear relationship between two variables. r is used to denote the correlation coefficient.
r= \frac{n\left(\sum{xy} \right)-\left(\sum{x} \right) \left(\sum{y} \right) }{ \sqrt{ \left[ n \left( \sum{x^{2}} \right)-\left( \sum{x} \right)^{2} \right] \left[ n \left( \sum{y^{2}} \right)-\left( \sum{y} \right)^{2} \right] }}

The value of r is between -1 and 1. r=1 indicates a perfect positive linear relationship. r=-1 indicates a perfect negative linear relationship. r=0 indicates no linear relationship. The closer |r| is to 1, the stronger the relationship.

Example

| x | y |
|---|---|
| 2 | 3 |
| 5 | 6 |
| 7 | 9 |
| 9 | 4 |
| 4 | 4 |

n = 5
\sum{x} = 27
\sum{y} = 26
\sum{xy} = 151
\sum{x^{2}} = 175
\sum{y^{2}} = 158
r= \frac{n\left(\sum{xy} \right)-\left(\sum{x} \right) \left(\sum{y} \right) }{ \sqrt{ \left[ n \left( \sum{x^{2}} \right)-\left( \sum{x} \right)^{2} \right] \left[ n \left( \sum{y^{2}} \right)-\left( \sum{y} \right)^{2} \right] }}

r= \frac{ 5 \left( 151 \right)-\left( 27 \right) \left( 26 \right) }{ \sqrt{ \left[ 5 \left( 175 \right)-\left( 27 \right)^{2} \right] \left[ 5 \left( 158 \right)-\left( 26 \right)^{2} \right] }}

r= \frac{ 53 }{ \sqrt{ 16644 } }

r= 0.4108156844
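
The computation above can be reproduced with a short Python sketch that follows the same sums-based formula. The function name `correlation_coefficient` and the x, y pairs (2, 3), (5, 6), (7, 9), (9, 4), (4, 4) as read from the example data are assumptions of this illustration.

```python
import math


def correlation_coefficient(xs, ys):
    """Correlation coefficient r computed from the raw sums."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


xs = [2, 5, 7, 9, 4]
ys = [3, 6, 9, 4, 4]
print(round(correlation_coefficient(xs, ys), 4))  # 0.4108
```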

Permutations and Combinations

Permutations

A permutation is an arrangement of r items chosen from n distinct items in which the order matters.
nPr = P(n,r) = \frac{n!}{(n-r)!}
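
As a small illustration (the letters A, B, C are an example chosen here, not taken from the notes), the formula can be checked against an explicit listing of ordered arrangements. The function name `count_permutations` is hypothetical; `itertools.permutations` comes from the Python standard library.

```python
import math
from itertools import permutations


def count_permutations(n, r):
    """P(n, r) = n! / (n - r)!"""
    if not 0 <= r <= n:
        raise ValueError("r must satisfy 0 <= r <= n")
    return math.factorial(n) // math.factorial(n - r)


# Every ordered way to pick 2 letters out of 3: AB and BA count separately.
arrangements = list(permutations("ABC", 2))
print(len(arrangements))         # 6
print(count_permutations(3, 2))  # 6
```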

Combinations

A combination is a selection of r items chosen from n distinct items in which the order does not matter.
nCr = C(n,r) = \frac{n!}{r!(n-r)!}
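
A comparable sketch for combinations, again with the letters A, B, C as an assumed example. The function name `count_combinations` is hypothetical; `itertools.combinations` comes from the Python standard library.

```python
import math
from itertools import combinations


def count_combinations(n, r):
    """C(n, r) = n! / (r! (n - r)!)"""
    if not 0 <= r <= n:
        raise ValueError("r must satisfy 0 <= r <= n")
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))


# Every unordered way to pick 2 letters out of 3: AB and BA count once.
selections = list(combinations("ABC", 2))
print(len(selections))           # 3
print(count_combinations(3, 2))  # 3
```

With n = 3 and r = 2 there are 6 permutations but only 3 combinations, because each unordered pair corresponds to r! = 2 orderings.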