**Data Science Day 21: F-test and t-test**

From last time we know that the t-test is used for comparing the **means of a 2-level** categorical variable, and ANOVA is used for comparing the **means of a categorical variable with 3 or more levels**.

**Question:**

However, one question has been bugging me: if both the t-test and ANOVA compare mean values, why does **one p-value come from the t-test while the other is derived from the F-test**?

I did a bit of research into this and discussed it with little Rain, and we found that the key to the answer is the equivalence of the F-test and the t-test.

**Answer:**

$$F = t^{2}$$

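The underlying reason is that when both samples are **normally distributed**, the **ratio of the between-group variance to the within-group variance** follows an F distribution with (1, n − 2) degrees of freedom, where n is the total number of observations. This is exactly the **distribution of the square** of a t-distributed variable with n − 2 degrees of freedom. Therefore, when there are only two groups, the two-sided pooled-variance t-test and the F-test generate the same p-value.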
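For two groups, the identity can be worked out directly. The notation below is an assumption introduced for this sketch and does not come from the original post: group sizes $n_1, n_2$, sample means $\bar{x}_1, \bar{x}_2$, grand mean $\bar{x}$, sample variances $s_1^2, s_2^2$, and pooled variance $s_p^2$.

$$
\begin{aligned}
t &= \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2} \\
F &= \frac{\mathrm{MS}_{\text{between}}}{\mathrm{MS}_{\text{within}}} = \frac{n_1(\bar{x}_1-\bar{x})^2 + n_2(\bar{x}_2-\bar{x})^2}{s_p^2} \\
&= \frac{\frac{n_1 n_2}{n_1+n_2}(\bar{x}_1-\bar{x}_2)^2}{s_p^2} = \frac{(\bar{x}_1-\bar{x}_2)^2}{s_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)} = t^2
\end{aligned}
$$

The middle step uses $\bar{x}_1 - \bar{x} = \frac{n_2}{n_1+n_2}(\bar{x}_1 - \bar{x}_2)$ and the analogous expression for the second group. With two groups the between-group mean square has one degree of freedom, so it equals the between-group sum of squares.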

**Example: F-test vs t-test in the Blood pressure decrease dataset**

We want to know if the blood pressure medication has changed the blood pressure for 15 patients after 6 months.

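    import pandas as pd
    import scipy.stats

    # Blood pressure decrease for the 15 patients, plus a zero-change reference group
    test = pd.DataFrame({"score_decrease": [-5, -8, 0, 0, 0, 2, 4, 6, 8, 10, 10, 10, 18, 26, 32]})
    center = pd.DataFrame({"score_remained": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]})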

#### F-test results:

    scipy.stats.f_oneway(test, center)
    # F_onewayResult(statistic=array([ 7.08657734]), pvalue=array([ 0.01272079]))

#### t-test results:

    scipy.stats.ttest_ind(test, center)
    # Ttest_indResult(statistic=array([ 2.66206261]), pvalue=array([ 0.01272079]))

As we can see, the F-test and t-test have the **same p-value** = 0.0127, and the F statistic is the square of the t statistic: 2.66206261² ≈ 7.0866.
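The relation can also be checked by hand on this dataset. Below is a minimal sketch using only the Python standard library; the helper names `pooled_t` and `one_way_f` are hypothetical and introduced just for this illustration, and the t statistic assumes the pooled (equal-variance) form, which is the default in `scipy.stats.ttest_ind`.

```python
from statistics import mean, variance

# Data from the example above.
score_decrease = [-5, -8, 0, 0, 0, 2, 4, 6, 8, 10, 10, 10, 18, 26, 32]
score_remained = [0] * 15


def pooled_t(a, b):
    """Two-sample t statistic with a pooled (equal-variance) estimate."""
    n1, n2 = len(a), len(b)
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / (n1 + n2 - 2)
    return (mean(a) - mean(b)) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5


def one_way_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group mean square."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (n_total - len(groups))
    return ms_between / ms_within


t_stat = pooled_t(score_decrease, score_remained)
f_stat = one_way_f(score_decrease, score_remained)
print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.4f}, F = {f_stat:.4f}")
# t = 2.6621, t^2 = 7.0866, F = 7.0866
```

Both statistics share the same denominator (the pooled variance), which is why squaring t reproduces F when there are exactly two groups.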

I used **SAS** to generate a graph:

    ods graphics on;
    proc ttest h0=0 plots(showh0) sides=u alpha=0.05;
      var decrease;
    run;
    ods graphics off;
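For readers without SAS, the test requested by the `proc ttest` call (one sample, `h0=0`, upper-tailed via `sides=u`) can be approximated in Python. This is only a sketch: it assumes the SAS variable `decrease` holds the same 15 values as the example above, and it does not reproduce the SAS plot.

```python
from scipy import stats

# Assumption: the SAS variable `decrease` holds the same 15 values as above.
decrease = [-5, -8, 0, 0, 0, 2, 4, 6, 8, 10, 10, 10, 18, 26, 32]

# One-sample t-test of H0: mean = 0 against the upper-tailed alternative
# mean > 0 (the `alternative` argument needs SciPy 1.6 or newer).
result = stats.ttest_1samp(decrease, popmean=0, alternative="greater")
print(f"t = {result.statistic:.4f}, one-sided p = {result.pvalue:.4f}")
```

The t statistic coincides with the two-sample value here because the reference group is all zeros and has the same size. The degrees of freedom are 14 rather than 28 and the test is one-sided, so the p-value differs from the 0.0127 reported for the two-sample tests.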

**Summary:**

Besides $F = t^{2}$, I put together a table comparing the F-test and the t-test.

Basic comparison | t-test | F-test |
---|---|---|
Assumptions | 1. Observations are independent and random. 2. Populations are normally distributed. 3. No outliers. | 1. Observations are independent and random. 2. Populations are normally distributed. 3. No outliers. |
Null hypothesis | The means of two groups are the same (μ1 = μ2), or the mean of one group equals a given value μ0. | The means of three or more groups are the same: μ1 = μ2 = μ3 = ... |
Feature | The population standard deviation is not known; the sample size is small. | The variances of the normal populations are not known. |
Application | 1. Compare the means of two groups. 2. Compare the mean of a group with a particular number. | 1. Compare the variances of two populations. 2. ANOVA: compare the means of 3 or more groups. |

#### Happy Studying! 😉