Statistics Formulas and Notes
Mean, Median, Variance, Standard Deviation, Range
Mean
The mean is the sum of all data values divided by the number of data values.
\bar{x} (x bar) is used to denote the mean.
\bar{x}=\frac{\sum x_{i}}{n}
Example
Data set: 3, 4, 5, 6, 7.
\bar{x}=\frac{\sum x_{i}}{n}
\bar{x}=\frac{ 3+4+5+6+7}{5}
\bar{x}=\frac{ 25}{5}
\bar{x}=5
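The arithmetic above can be checked with a few lines of Python. This is only a sketch; the names `data` and `x_bar` are chosen here for illustration and are not part of the notes.

```python
import statistics

data = [3, 4, 5, 6, 7]

# Direct translation of the formula: sum of the values divided by their count.
x_bar = sum(data) / len(data)

# The standard library offers the same calculation.
library_mean = statistics.mean(data)

print(x_bar)         # 5.0
print(library_mean)  # 5
```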
Median
The median is the middle value of a data set when the values are arranged in ascending order. If the data set has an even number of values, the median is the mean of the two middle values.
Example 1
Data set: 9, 3, 1, 2, 8, 6, 7.
In ascending order: 1, 2, 3, 6, 7, 8, 9.
6 is the value in the middle. 6 is the median.
Example 2
Data set: 7, 6, 4, 9, 5, 1, 2, 8.
In ascending order: 1, 2, 4, 5, 6, 7, 8, 9.
In this case we have two numbers that are in the middle: 5 and 6. We add these numbers
together and divide by 2 to get the median.
median = \frac{5+6}{2}
median = 5.5
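Both median examples follow the same procedure: sort, then take the middle value or the average of the two middle values. A minimal sketch in Python, where the function name `median_of` is a hypothetical helper introduced here:

```python
def median_of(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)   # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a single middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_of([9, 3, 1, 2, 8, 6, 7]))     # 6
print(median_of([7, 6, 4, 9, 5, 1, 2, 8]))  # 5.5
```

Python's built-in `statistics.median` gives the same results and is usually the better choice in practice.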
Sample Variance
The variance is a measure of how spread out the values are. s^{2}
is used to denote the sample variance.
s^{2}=\frac{\sum \left( x_{i}- \bar{x} \right)^{2}}{n-1}
\bar{x}: mean
n: number of values
Example
Data set: 1, 2, 3, 8, 9.
\bar{x} = 4.6
n = 5
s^{2}=\frac{\sum \left( x_{i}- \bar{x} \right)^{2}}{n-1}
s^{2}=\frac{ \left( 1 - 4.6 \right)^{2} + \left( 2 - 4.6 \right)^{2} + \left( 3 - 4.6 \right)^{2} + \left( 8 - 4.6 \right)^{2} + \left( 9 - 4.6 \right)^{2} }{5-1}
s^{2}=\frac{ 53.2 }{4}
s^{2}= 13.3
Sample Standard Deviation
The sample standard deviation is the square root of the sample variance.
s= \sqrt{ \frac{\sum \left( x_{i}- \bar{x} \right)^{2}}{n-1}}
For the example above, the sample variance is 13.3.
s= \sqrt{ 13.3 }
s \approx 3.647
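The sample variance and standard deviation for the data set 1, 2, 3, 8, 9 can be reproduced as follows. This is a sketch that mirrors the formulas term by term; the variable names are assumptions made for this illustration.

```python
import math
import statistics

data = [1, 2, 3, 8, 9]
n = len(data)
x_bar = sum(data) / n  # 4.6

# Sum of squared deviations from the mean, divided by n - 1.
sample_var = sum((x - x_bar) ** 2 for x in data) / (n - 1)

# The standard deviation is the square root of the variance.
sample_sd = math.sqrt(sample_var)

print(round(sample_var, 1))  # 13.3
print(round(sample_sd, 3))   # 3.647

# statistics.variance and statistics.stdev also use the n - 1 divisor.
assert math.isclose(sample_var, statistics.variance(data))
assert math.isclose(sample_sd, statistics.stdev(data))
```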
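Population Variance
When the data set contains every member of the population rather than a sample, the sum of squared deviations is divided by n instead of n-1. \sigma^{2}
is used to denote the population variance.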
\sigma^{2}=\frac{\sum \left( x_{i}- \mu \right)^{2}}{n}
\mu: population mean
n: number of values in the population
The population standard deviation is the square root of the population variance.
\sigma= \sqrt{ \frac{\sum \left( x_{i}- \mu \right)^{2}}{n}}
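To see the effect of dividing by n instead of n-1, the sketch below reuses the data set 1, 2, 3, 8, 9 from the earlier example and treats it as a complete population. That treatment is an assumption made purely for illustration.

```python
import math
import statistics

data = [1, 2, 3, 8, 9]  # assumed here to be the whole population
n = len(data)
mu = sum(data) / n       # 4.6

# Sum of squared deviations from the mean, divided by n (not n - 1).
pop_var = sum((x - mu) ** 2 for x in data) / n
pop_sd = math.sqrt(pop_var)

print(round(pop_var, 2))  # 10.64
print(round(pop_sd, 3))   # 3.262

# statistics.pvariance and statistics.pstdev use the same n divisor.
assert math.isclose(pop_var, statistics.pvariance(data))
assert math.isclose(pop_sd, statistics.pstdev(data))
```

The population variance (10.64) is smaller than the sample variance of the same values (13.3) because the divisor is larger.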
Range = highest value - lowest value
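As a short illustration of the range formula, the sketch below again uses the data set 1, 2, 3, 8, 9; the name `data_range` is chosen here for the example.

```python
data = [1, 2, 3, 8, 9]

# Range = highest value - lowest value.
data_range = max(data) - min(data)

print(data_range)  # 8
```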
Regression, Correlation
Correlation Coefficient
The correlation coefficient measures the strength and direction of the linear relationship between two variables. r is used to denote the correlation coefficient.
r=
\frac{n\left(\sum{xy} \right)-\left(\sum{x} \right) \left(\sum{y} \right)
}{ \sqrt{ \left[ n \left( \sum{x^{2}} \right)-\left( \sum{x} \right)^{2} \right]
\left[ n \left( \sum{y^{2}} \right)-\left( \sum{y} \right)^{2} \right]
}}
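The value of r is between -1 and 1. r=1 indicates a perfect positive linear relationship.
r=-1 indicates a perfect negative linear relationship. r=0 indicates no linear relationship. The closer r is to 1 or -1, the stronger the relationship.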
Example
x | y |
2 | 3 |
5 | 6 |
7 | 9 |
9 | 4 |
4 | 4 |
n = 5
\sum{x} = 27
\sum{y} = 26
\sum{xy} = 151
\sum{x^{2}} = 175
\sum{y^{2}} = 158
r=
\frac{n\left(\sum{xy} \right)-\left(\sum{x} \right) \left(\sum{y} \right)
}{ \sqrt{ \left[ n \left( \sum{x^{2}} \right)-\left( \sum{x} \right)^{2} \right]
\left[ n \left( \sum{y^{2}} \right)-\left( \sum{y} \right)^{2} \right]
}}
r=
\frac{ 5 \left( 151 \right)-\left( 27 \right) \left( 26 \right)
}{ \sqrt{ \left[ 5 \left( 175 \right)-\left( 27 \right)^{2} \right]
\left[ 5 \left( 158 \right)-\left( 26 \right)^{2} \right]
}}
r= \frac{ 53 }{ \sqrt{ 16644 } }
r= 0.4108156844
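The worked example can be verified by computing each sum and then substituting into the formula. The following Python sketch follows the same steps; the variable names are assumptions introduced for this illustration.

```python
import math

xs = [2, 5, 7, 9, 4]
ys = [3, 6, 9, 4, 4]
n = len(xs)

# The five sums used by the formula.
sum_x = sum(xs)                              # 27
sum_y = sum(ys)                              # 26
sum_xy = sum(x * y for x, y in zip(xs, ys))  # 151
sum_x2 = sum(x * x for x in xs)              # 175
sum_y2 = sum(y * y for y in ys)              # 158

numerator = n * sum_xy - sum_x * sum_y       # 53
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)                                            # sqrt(16644)

r = numerator / denominator
print(round(r, 4))  # 0.4108
```

A value of about 0.41 suggests a moderate positive linear relationship between x and y in this small data set.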
Permutations and Combinations
Permutations
In permutations, the order of the selected items matters. The number of ways to arrange r items chosen from n distinct items is:
nPr = P(n,r) = \frac{n!}{(n-r)!}
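Combinations
In combinations, the order of the selected items does not matter. The number of ways to choose r items from n distinct items is: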
nCr = C(n,r) = \frac{n!}{r!(n-r)!}
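The notes give no numeric example for these two formulas, so the sketch below uses n = 5 and r = 2, values picked here only for illustration. It evaluates both formulas with factorials and compares them with `math.perm` and `math.comb`, which are available in Python 3.8 and later.

```python
import math

n, r = 5, 2  # illustrative values, not taken from the notes

# Permutations: n! / (n - r)!
n_p_r = math.factorial(n) // math.factorial(n - r)

# Combinations: n! / (r! (n - r)!)
n_c_r = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(n_p_r)  # 20
print(n_c_r)  # 10

# The standard library functions agree with the formulas.
assert n_p_r == math.perm(n, r)
assert n_c_r == math.comb(n, r)
```

There are twice as many permutations as combinations here because each pair of items can be ordered in 2! = 2 ways.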