## Memo


A statistical hypothesis is a claim about a population parameter, such as a mean or a proportion. A test involves two contradictory statements: the null hypothesis, denoted H0, and the alternative hypothesis, denoted Ha (not the same as alternative facts). The null hypothesis is the claim that is initially assumed to be true, while the alternative contradicts it.

The goal is to decide whether to reject or fail to reject the null hypothesis. We never test the alternative directly; we test the null hypothesis. Rejecting the null hypothesis means the data favor the alternative. Failing to reject it means the data do not give enough evidence against the null hypothesis, so we keep it as our working claim; this does not prove that it is true.

• The null hypothesis should always be phrased as an equality,
while the alternative hypothesis is phrased as an inequality
(not equal to, less than, or greater than).

• A test statistic is calculated from the sample data. Its form
depends on what you are testing and how you are testing it. If you
are performing a one-way or two-way ANOVA, your test statistic is
the F statistic, and if you are calculating the probability of an
event using the Central Limit Theorem, you would use the z-score.

• A rejection region is the set of test statistic values for
which you reject the null hypothesis.
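
As a rough illustration of a rejection region for a z-based test, the sketch below looks up the critical value with Python's standard-library `statistics.NormalDist`. The function name `in_rejection_region` and its `tail` argument are made up for this note, and the sketch assumes the test statistic follows a standard normal distribution when the null hypothesis is true.

```python
from statistics import NormalDist

def in_rejection_region(z, alpha=0.05, tail="two-sided"):
    """Return True if z falls in the rejection region of a z test.

    Assumes z is standard normal under the null hypothesis.
    `tail` is a hypothetical argument: "two-sided", "upper", or "lower".
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two-sided":
        # Split alpha between both tails.
        critical = std_normal.inv_cdf(1 - alpha / 2)
        return abs(z) >= critical
    if tail == "upper":
        return z >= std_normal.inv_cdf(1 - alpha)
    if tail == "lower":
        return z <= std_normal.inv_cdf(alpha)
    raise ValueError("tail must be 'two-sided', 'upper', or 'lower'")

print(in_rejection_region(2.3))                # True
print(in_rejection_region(1.7))                # False
print(in_rejection_region(1.7, tail="upper"))  # True
```

The same statistic (1.7) is rejected in the upper-tailed test but not in the two-sided one, because the two-sided test puts only half of the significance level in each tail.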

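The p-value is the probability, assuming the null hypothesis is true, of getting a test statistic at least as extreme as the one observed. For a z test, it is the corresponding tail area under the standard normal bell curve. If the p-value is smaller than the significance level (usually 0.05 if not specified), then we reject the null hypothesis. Otherwise, we fail to reject.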
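
To make the decision rule concrete, here is a minimal sketch that turns a z statistic into a p-value using the standard-library `statistics.NormalDist`. The helper name `p_value_from_z` and the `tail` argument are assumptions for illustration, and the sketch only covers tests whose statistic is standard normal under the null hypothesis.

```python
from statistics import NormalDist

def p_value_from_z(z, tail="two-sided"):
    """Tail area under the standard normal curve for a z statistic.

    `tail` is a hypothetical argument: "two-sided", "upper", or "lower".
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two-sided":
        return 2 * (1 - std_normal.cdf(abs(z)))
    if tail == "upper":
        return 1 - std_normal.cdf(z)
    if tail == "lower":
        return std_normal.cdf(z)
    raise ValueError("tail must be 'two-sided', 'upper', or 'lower'")

alpha = 0.05  # significance level
p = p_value_from_z(2.0)
print(round(p, 4))  # 0.0455
print("reject H0" if p < alpha else "fail to reject H0")  # reject H0
```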

The test statistic for a population mean is as follows:

```
# p_0 = hypothesized (given) population mean
# p_1 = sample mean
# s = sample standard deviation (point estimate of the population's)
def test_statistic_population_mean(data, p_0):
    n = len(data)
    p_1 = sum(data) / n
    s = 0
    for x in data:
        s += (x - p_1) ** 2       # sum of squared deviations
    s = (s / (n - 1)) ** 0.5      # divide by n - 1, then take the root
    return (p_1 - p_0) / (s / n ** 0.5)
```

If your sample size is larger than or equal to 40, we reference the z-table. Otherwise, we refer to the t-table with n - 1 degrees of freedom, with the caveat that you are assuming the population is normal. Of course, you can always find the p-value using software.
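
The large-sample case can be sketched end to end with the standard library alone. The function name `large_sample_mean_test`, the two-sided alternative, and the sample data are assumptions made for this note; the small-sample branch is left out because the t distribution is not in Python's standard library.

```python
from statistics import NormalDist, mean, stdev

def large_sample_mean_test(data, mu_0, alpha=0.05):
    """Two-sided z test for a population mean (large samples only).

    Returns (z, p_value, reject), where reject is True if H0 is rejected.
    """
    n = len(data)
    if n < 40:
        # Below the cutoff used in this note, the t-table applies instead.
        raise ValueError("sample too small for the z-table; use the t-table")
    # stdev() divides by n - 1, matching the sample standard deviation.
    z = (mean(data) - mu_0) / (stdev(data) / n ** 0.5)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha

# Made-up sample of 40 values between 10 and 14.
sample = [10 + (i % 5) for i in range(40)]
z, p_value, reject = large_sample_mean_test(sample, mu_0=11)
print(z, p_value, reject)
```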

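The test statistic for a population proportion is as follows:

```
# n = sample size
# p_0 = hypothesized (given) population proportion
# p_1 = sample proportion
def test_statistic_population_proportion(p_0, p_1, n):
    return (p_1 - p_0) / ((p_0 * (1 - p_0)) / n) ** 0.5
```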