Chi Square 4

Data Science Day 6:

Chi-square application 3:

Test for Homogeneity of One Categorical Variable across several sample spaces.

We use the Chi-square test for homogeneity to evaluate whether a single categorical variable has a similar distribution (or frequency proportion) across two or more sample spaces (or populations).

Example: 
Several makeup companies wish to determine whether the sales market differs among China, USA, and Spain.

Customers    China    USA     Spain
Buy          10000    2000    1000
Not Buy      2013     391     212

 

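H0 (null hypothesis): The sales market has the same distribution (frequency proportion of Buy and Not Buy) in China, USA, and Spain.

H1 (alternative hypothesis): The distribution differs in at least one of the three countries.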
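To make the null hypothesis concrete, here is a minimal sketch in plain Python of what the test computes. Under H0 every country shares the pooled Buy / Not Buy split, so the expected count in each cell is (row total × column total) / grand total, and the statistic sums (observed − expected)² / expected over all cells. The variable names are our own, and the shortcut p-value = exp(−chi2 / 2) only works because this table has exactly 2 degrees of freedom.

```python
import math

# Observed counts from the table: [Buy, Not Buy] for each country.
observed = {
    "China": [10000, 2013],
    "USA": [2000, 391],
    "Spain": [1000, 212],
}

grand_total = sum(sum(row) for row in observed.values())
# Column totals: all buyers and all non-buyers across the three countries.
col_totals = [sum(row[j] for row in observed.values()) for j in range(2)]

chi2 = 0.0
for country, row in observed.items():
    row_total = sum(row)
    for j, obs in enumerate(row):
        # Expected count if this country followed the pooled Buy / Not Buy split.
        expected = row_total * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1) = (3 - 1) * (2 - 1) = 2.
dof = (len(observed) - 1) * (2 - 1)
# For exactly 2 degrees of freedom the chi-square survival function is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi2 = {chi2:.3f}, dof = {dof}, p-value = {p_value:.2f}")
```

A small statistic means the observed counts sit close to what H0 predicts, which is why the p-value here is large.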

Solution:

We will use the chi2_contingency function from the SciPy package in Python.

Python Code:
import numpy as np
from scipy import stats
country_buyer = np.array([[10000, 2013], [2000, 391], [1000, 212]])  # rows: China, USA, Spain; columns: Buy, Not Buy
stats.chi2_contingency(country_buyer)

Result:

We have p-value = 0.69, which is far above the usual 0.05 significance level, so we fail to reject the null hypothesis: the data give no evidence that the makeup customer distribution differs among China, USA, and Spain.
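The call returns more than the p-value. As a short sketch, we can unpack the statistic, the p-value, the degrees of freedom, and the expected counts, then compare the p-value with a significance level. The 0.05 threshold is our assumption; the example itself does not state one.

```python
import numpy as np
from scipy import stats

# Rows: China, USA, Spain; columns: Buy, Not Buy.
country_buyer = np.array([[10000, 2013], [2000, 391], [1000, 212]])

# chi2_contingency returns the test statistic, the p-value, the degrees of
# freedom, and the table of expected counts under the null hypothesis.
chi2, p_value, dof, expected = stats.chi2_contingency(country_buyer)

alpha = 0.05  # assumed significance level, not given in the example

print(f"chi2 = {chi2:.3f}, p-value = {p_value:.2f}, dof = {dof}")
print("Expected counts under H0:")
print(expected.round(1))

if p_value < alpha:
    print("Reject H0: the distribution differs in at least one country.")
else:
    print("Fail to reject H0: no evidence of a difference between countries.")
```

Comparing the expected counts with the observed table shows how close each country is to the pooled proportion.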

Data visualization:

We can see from the graph that the makeup customers have a similar distribution in China, USA, and Spain. This is consistent with our statistical results.

Code:
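One way to draw such a graph is a stacked bar chart of the Buy / Not Buy proportions per country. This is only a sketch that assumes matplotlib is installed; the chart type, labels, and variable names are our own choices and may differ from the figure described above.

```python
import matplotlib.pyplot as plt

countries = ["China", "USA", "Spain"]
buy = [10000, 2000, 1000]
not_buy = [2013, 391, 212]

# Share of customers who buy / do not buy in each country.
buy_share = [b / (b + n) for b, n in zip(buy, not_buy)]
not_buy_share = [1 - s for s in buy_share]

fig, ax = plt.subplots()
# Stack the two shares so that every bar adds up to 1.
ax.bar(countries, buy_share, label="Buy")
ax.bar(countries, not_buy_share, bottom=buy_share, label="Not Buy")
ax.set_ylabel("Proportion of customers")
ax.set_title("Makeup customers by country")
ax.legend()
plt.show()
```

Plotting proportions rather than raw counts keeps the three bars comparable even though China has far more customers in the sample.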

 

I think it is human nature to pursue beauty, and a decent amount of makeup does help men and women boost their confidence, but I always remind myself not to be obsessed with external beauty. Everyone shines in their own way, and we are already pretty enough! After all, Venus has a broken arm.

To be continued…
