### Data Science Day 4:

Chi-Square test application 1:

**Goodness-of-fit test.**

*We use the goodness-of-fit test to check whether observed categorical data follow a hypothesized (expected) distribution.*
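For reference, the statistic behind this test is the standard Pearson chi-square statistic. The symbols are introduced here and are not part of the original notes: $O_i$ and $E_i$ are the observed and expected counts in category $i$, and $k$ is the number of categories.

$$
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}
$$

Under the null hypothesis this statistic approximately follows a chi-square distribution with $k - 1$ degrees of freedom, and the p-value is the upper-tail probability beyond the computed value.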

**Example 1: P-value Interpretation**

Suppose `f_exp` holds the expected number of boys in each of six grade 1 classes, and `f_obs` holds the observed number of boys in those same classes. We want to test whether the observed counts follow the expected distribution.

**H0 (Null Hypothesis):** the observed distribution of boy students is consistent with the expected distribution.

**Boy Students Distribution**

| f_obs (observed) | f_exp (expected) |
| --- | --- |
| 18 | 10 |
| 15 | 5 |
| 5 | 7 |
| 8 | 18 |
| 4 | 10 |
| 3 | 11 |

We use the following **Python** call to obtain the p-value (`chisquare` here is taken to be the function from `scipy.stats`):

`chisquare(f_obs=[18, 15, 5, 8, 4, 3], f_exp=[10, 5, 7, 18, 10, 11])`

For this particular example, the **p-value = 6.02e-08**, which is far **smaller than 0.05**. So we **reject H0** and conclude that the observed distribution of boy students is **different** from the expected distribution.
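As a rough cross-check, we can compute the same test by hand. This is a minimal sketch, assuming NumPy and SciPy are installed; the variable names are my own. One caveat: the observed counts sum to 53 while the expected counts sum to 61, and recent SciPy releases of `chisquare` raise a `ValueError` when the two totals disagree. The sketch therefore applies the Pearson formula directly and reads the p-value from `scipy.stats.chi2.sf` with 5 degrees of freedom (6 classes minus 1).

```python
import numpy as np
from scipy.stats import chi2

# Observed and expected counts of boys per class, copied from the example.
f_obs = np.array([18, 15, 5, 8, 4, 3], dtype=float)
f_exp = np.array([10, 5, 7, 18, 10, 11], dtype=float)

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
statistic = np.sum((f_obs - f_exp) ** 2 / f_exp)

# Degrees of freedom: number of categories minus one.
dof = len(f_obs) - 1

# p-value: upper-tail probability of the chi-square distribution.
p_value = chi2.sf(statistic, dof)

print(f"statistic = {statistic:.3f}, dof = {dof}, p-value = {p_value:.3g}")
```

The statistic comes out at about 41.95, and the resulting p-value is far below 0.05, which agrees with the decision to reject H0.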

**Example 2: Data visualization Interpretation**

We draw a sample of size 1000 from a chi-square distribution with 5 degrees of freedom, graph it, and use Kernel Density Estimation (KDE) to fit the graph. The KDE curve turns out to be a pretty good fit.

To be continued…
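The numerical side of this example can be sketched as follows. It assumes NumPy and SciPy, uses `scipy.stats.gaussian_kde` with its default bandwidth as the KDE (the notes do not say which implementation was used), and picks the seed and the evaluation grid arbitrarily. Instead of drawing the graph, it measures how far the KDE curve sits from the true chi-square density.

```python
import numpy as np
from scipy.stats import chi2, gaussian_kde

# Draw 1000 samples from a chi-square distribution with 5 degrees of freedom.
rng = np.random.default_rng(seed=0)  # arbitrary seed, for reproducibility only
samples = rng.chisquare(df=5, size=1000)

# Fit a Gaussian kernel density estimate (default bandwidth: Scott's rule).
kde = gaussian_kde(samples)

# Evaluate the KDE and the true density on a common grid.
grid = np.linspace(0, 20, 200)
kde_density = kde(grid)
true_density = chi2.pdf(grid, df=5)

# Largest pointwise gap between the two curves.
max_gap = np.max(np.abs(kde_density - true_density))
print(f"largest gap between KDE and true density: {max_gap:.4f}")
```

To reproduce the graph itself, `grid`, `kde_density`, and `true_density` can be passed to a plotting library such as matplotlib, together with a histogram of `samples`.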