#### Data Science Day 6:

#### Chi-square application 3:

**Test for Homogeneity of One Categorical Variable across several sample spaces.**

*We use the Chi-square test for Homogeneity to evaluate whether a single categorical variable has a similar distribution (or frequency proportion) across two or more sample spaces (or populations).*

**Example:**

A couple of makeup companies wish to determine whether the sales market differs among China, USA, and Spain.

Customers | China | USA | Spain |
---|---|---|---|
Buy | 10000 | 2000 | 1000 |
Not Buy | 2013 | 391 | 212 |

**H0 (Null Hypothesis):** The sales market has the **same distribution** (frequency proportion) in China, USA, and Spain.
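To make H0 concrete, here is a minimal sketch of the counts we would expect in each cell if the buy/not-buy proportion really were identical in all three countries, together with the Pearson chi-square statistic built from them. It uses only NumPy; the variable names are illustrative.

```python
import numpy as np

# Observed counts from the table above: rows are Buy / Not Buy,
# columns are China, USA, Spain.
observed = np.array([[10000, 2000, 1000],
                     [2013, 391, 212]])

row_totals = observed.sum(axis=1, keepdims=True)  # shape (2, 1)
col_totals = observed.sum(axis=0, keepdims=True)  # shape (1, 3)
grand_total = observed.sum()

# Under H0, each cell is expected to hold (row total * column total) / grand total.
expected = row_totals * col_totals / grand_total

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi2_stat = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(np.round(expected, 1))
print(f"chi2 = {chi2_stat:.3f}, dof = {dof}")
```

The closer the observed counts are to the expected ones, the smaller the statistic and the larger the p-value.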

**Solution:**

We will use the **SciPy** package and its *chi2_contingency* function in Python.

**Python Code:**

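*import numpy as np*

*from scipy import stats*

*country_buyer = np.array([[10000, 2013], [2000, 391], [1000, 212]])  # rows: China, USA, Spain; columns: Buy, Not Buy*

*stats.chi2_contingency(country_buyer)*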

**Result:**

We have **p-value = 0.69**, which is well above the usual 0.05 significance level, so we **fail to reject** the Null Hypothesis: the data give no evidence that the makeup customer **distribution differs** among China, USA, and Spain.
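For readers who want to pull the p-value out of the result, here is a small sketch that unpacks what *chi2_contingency* returns (statistic, p-value, degrees of freedom, expected counts). The 0.05 significance level is an assumption for illustration.

```python
import numpy as np
from scipy import stats

# Rows: China, USA, Spain; columns: Buy, Not Buy.
country_buyer = np.array([[10000, 2013],
                          [2000, 391],
                          [1000, 212]])

# The result unpacks into four values.
chi2_stat, p_value, dof, expected = stats.chi2_contingency(country_buyer)

alpha = 0.05  # assumed significance level
reject_h0 = p_value < alpha

print(f"chi2 = {chi2_stat:.3f}, dof = {dof}, p-value = {p_value:.2f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Transposing the array (countries as columns instead of rows) gives the same statistic and p-value, so either layout of the table works.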

**Data visualization:**

As the graph shows, makeup customers have a similar distribution in China, USA, and Spain, which supports our statistical result.

Code:
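One way to draw such a graph is sketched below, assuming matplotlib and a simple bar chart of each country's buy share. The function names `buy_share` and `plot_buy_share` are illustrative, and the chart type is a guess.

```python
import numpy as np

countries = ["China", "USA", "Spain"]

# Rows: China, USA, Spain; columns: Buy, Not Buy.
country_buyer = np.array([[10000, 2013],
                          [2000, 391],
                          [1000, 212]])


def buy_share(counts):
    """Return the fraction of customers who buy in each row (country)."""
    counts = np.asarray(counts, dtype=float)
    return counts[:, 0] / counts.sum(axis=1)


def plot_buy_share(labels, counts):
    """Draw a bar chart of the buy share per country (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so buy_share works without it

    shares = buy_share(counts)
    fig, ax = plt.subplots()
    ax.bar(labels, shares)
    ax.set_ylim(0, 1)
    ax.set_ylabel("Share of customers who buy")
    ax.set_title("Makeup buyers by country")
    return fig, ax


shares = buy_share(country_buyer)
print(dict(zip(countries, np.round(shares, 3))))
```

Calling `fig, ax = plot_buy_share(countries, country_buyer)` and then `fig.savefig("buy_share.png")` writes the chart to a file. All three bars land between roughly 82% and 84%, which is the visual counterpart of the large p-value.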

I think it is human nature to pursue beauty, and a decent amount of makeup does help men and women boost their confidence, but I always remind myself not to be obsessed with external beauty. **Everyone shines in their own way**, and we are already pretty enough! After all, Venus has a broken arm.

To be continued…