Analysis of variance (ANOVA) is a method for analyzing data from more than two groups. The goal is to decide whether or not the population means of those groups are equal. If you are comparing only two groups, you can simply use a t-test to decide whether the population means differ.

Some example applications of the ANOVA analysis:

• Compare the gasoline mileage for a sample size of 20 Fords, Toyotas, and Mercedes-Benz

• Compare the effects of a blood pressure lowering medication in patients from three different risk groups (healthy, pre-hypertension, hypertension)

• Give groups of 20 students the same test with different time limits (60, 90, or 120 minutes)

As mentioned before, the goal is to figure out whether the population means differ. To that end, the null and alternative hypotheses of a one-way ANOVA test are as follows:

```
let m1, m2, ..., mI = population means of groups 1 through I
Ho : m1 == m2 == m3 == ... == mI
Ha : at least two of the population means are different
```

The main idea is to compute the ANOVA test statistic, denoted F, and compare it to the critical value.

**If the test statistic is greater than the critical value, we reject Ho. Otherwise, we fail to reject Ho.**

**Equivalently, if the `p-value` is less than the `level of significance`, we reject Ho. Otherwise, we fail to reject Ho.**

```
let SST = Total sum of squares
let SSTr = Treatment sum of squares
let SSE = error sum of squares
let MSTr = Mean Square for Treatment
let MSE = Mean Square for Error
let F = MSTr / MSE
let cm = Correction for the mean: (sum of all data)^2 / (I * J), for I groups of J observations each
```
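Written out as formulas, the quantities above look as follows. This is a sketch of the shortcut (computational) form for equal group sizes; the notation $x_{ij}$ for observation $j$ in group $i$ is an assumption introduced here, with $I$ groups of $J$ observations each, and $cm$ is taken to be the correction term (the squared sum of all data divided by $IJ$).

```latex
\begin{aligned}
cm   &= \frac{1}{IJ}\Bigl(\sum_{i=1}^{I}\sum_{j=1}^{J} x_{ij}\Bigr)^{2} \\
SST  &= \sum_{i=1}^{I}\sum_{j=1}^{J} x_{ij}^{2} - cm \\
SSTr &= \frac{1}{J}\sum_{i=1}^{I}\Bigl(\sum_{j=1}^{J} x_{ij}\Bigr)^{2} - cm \\
SSE  &= SST - SSTr \\
MSTr &= \frac{SSTr}{I-1}, \qquad MSE = \frac{SSE}{I(J-1)}, \qquad F = \frac{MSTr}{MSE}
\end{aligned}
```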

```
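data = [[5.2, 4.5, 6.0, 6.1, 6.7, 5.8],
        [6.5, 8.0, 6.1, 7.5, 5.9, 5.6],
        [5.8, 4.7, 6.4, 4.9, 6.0, 5.2],
        [8.3, 6.1, 7.8, 7.0, 5.5, 7.2]]
I = len(data)     # number of groups
J = len(data[0])  # number of elements in each group (equal group sizes)
# cm = correction for the mean = (sum of all data)^2 / (I * J)
cm = sum(sum(group) for group in data) ** 2 / (I * J)
SSTr = 0.0
SST = 0.0
for group in data:
    SSTr += sum(group) ** 2 / len(group)
    for item in group:
        SST += item ** 2
SSTr -= cm
SST -= cm
SSE = SST - SSTr
MSTr = SSTr / (I - 1)
MSE = SSE / (I * (J - 1))
F = MSTr / MSE
# F-table value for df1 = 3, df2 = 20, assuming a 0.05 significance level
critical_value = 3.10
if F > critical_value:
    print("Reject Ho")
else:
    print("Fail to reject Ho")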
```
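The shortcut computation above can be cross-checked against the definitional sums of squares (squared deviations from the group means and from the grand mean). The following is a minimal sketch using only the Python standard library; the function name `one_way_anova` and its return layout are illustrative choices, and it assumes every group has the same size.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (SSTr, SSE, MSTr, MSE, F) for equally sized groups."""
    I = len(groups)        # number of groups
    J = len(groups[0])     # observations per group
    if any(len(g) != J for g in groups):
        raise ValueError("this sketch assumes equal group sizes")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Between-group variation: group means around the grand mean.
    sstr = J * sum((m - grand_mean) ** 2 for m in group_means)
    # Within-group variation: observations around their own group mean.
    sse = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    mstr = sstr / (I - 1)
    mse = sse / (I * (J - 1))
    return sstr, sse, mstr, mse, mstr / mse


data = [[5.2, 4.5, 6.0, 6.1, 6.7, 5.8],
        [6.5, 8.0, 6.1, 7.5, 5.9, 5.6],
        [5.8, 4.7, 6.4, 4.9, 6.0, 5.2],
        [8.3, 6.1, 7.8, 7.0, 5.5, 7.2]]

sstr, sse, mstr, mse, f_stat = one_way_anova(data)
print(f"SSTr={sstr:.4f} SSE={sse:.4f} F={f_stat:.4f}")
```

For the sample data, F comes out at roughly 3.96, which exceeds the F-table value of about 3.10 for 3 and 20 degrees of freedom at the 0.05 level, so Ho would be rejected at that level.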

```
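from scipy.stats import f

I, J = 4, 6                # 4 groups of 6 observations, as in the data above
df1 = I - 1                # numerator degrees of freedom (column of the F-table)
df2 = I * (J - 1)          # denominator degrees of freedom (row of the F-table)
significance_level = 0.05  # given; 0.05 is only an example value

def critical_value(df1, df2, significance_level):
    # Same value an F-table lookup gives: the upper-tail quantile of F(df1, df2).
    return f.ppf(1 - significance_level, df1, df2)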
```

As stated earlier, when we reject the null hypothesis, further investigation may be in order. We do this by performing Tukey's procedure or the t procedure.

Tukey's procedure helps us find which sample means differ significantly from one another. If we fail to reject the null hypothesis, the data give no evidence that the population means differ. Otherwise, at least one of the means differs from the rest, and Tukey's procedure identifies which pairs of groups differ.

```
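from itertools import combinations
from math import sqrt
from scipy.stats import studentized_range  # requires SciPy 1.7 or newer

def getQ(number_of_means, degrees_of_freedom, significance_level):
    # Same value a Q table (studentized range table) lookup gives.
    return studentized_range.ppf(1 - significance_level, number_of_means, degrees_of_freedom)

def tukey(means, mse, n, degrees_of_freedom, significance_level):
    # n is the number of observations per group; t is the smallest significant difference.
    t = sqrt(mse / n) * getQ(len(means), degrees_of_freedom, significance_level)
    # Compare the absolute difference of every pair of sample means with t.
    for i, j in combinations(range(len(means)), 2):
        if abs(means[i] - means[j]) > t:
            print(f"There is a difference between groups {i} and {j}")
        else:
            print(f"No significant difference between groups {i} and {j}")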
```
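As a worked illustration, the pairwise comparison can be applied to the sample data from the ANOVA computation. This is a minimal standard-library sketch: the function name `tukey_pairs` is hypothetical, equal group sizes are assumed, and the value `q = 3.96` is an assumption taken from a Q table for 4 means, 20 degrees of freedom, and a 0.05 significance level.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def tukey_pairs(groups, q):
    """Return (threshold, significant index pairs) for equally sized groups.

    q is the studentized range critical value, looked up by the caller.
    """
    I = len(groups)
    J = len(groups[0])
    means = [mean(g) for g in groups]
    # MSE: pooled within-group variation, as in the ANOVA computation.
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mse = sse / (I * (J - 1))
    threshold = q * sqrt(mse / J)
    significant = [(i, j) for i, j in combinations(range(I), 2)
                   if abs(means[i] - means[j]) > threshold]
    return threshold, significant


data = [[5.2, 4.5, 6.0, 6.1, 6.7, 5.8],
        [6.5, 8.0, 6.1, 7.5, 5.9, 5.6],
        [5.8, 4.7, 6.4, 4.9, 6.0, 5.2],
        [8.3, 6.1, 7.8, 7.0, 5.5, 7.2]]

q = 3.96  # assumed Q-table value: 4 means, 20 degrees of freedom, alpha = 0.05
threshold, significant = tukey_pairs(data, q)
print(f"threshold={threshold:.3f}", significant)
```

With these inputs the threshold is about 1.41, and only the third and fourth groups (indices 2 and 3, with sample means 5.5 and about 6.98) differ by more than that.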