Category: Data Science

What is your dream?
- To become an influencer in the data science world

To fulfill my dream, I will organize the lecture notes, homework, and projects from Harrisburg University. Meanwhile, I will publish posts on data analysis topics.

Online publication links:

RPubs, GitHub, Kaggle

Risk Ratio

Data Science Day 15: Risk Ratio
Last time, we gave a SAS example of Risk Difference to test whether two groups experience the same proportion of a certain event. To understand the topic better, we will now go over Risk Ratio. Definition: Risk Ratio, or Relative Risk (RR), is the probability that an event occurs in a group 1…

Python Network Graph

Python Day 1: Neuron Network Graph
Suppose we would like to build a basic network graph showing that a student’s grade is affected by IQ and Study. In addition, Interest and Method affect the result of Study.
# libraries
import pandas as pd
import numpy as np
import networkx as nx
import matplotlib.pyplot as plt
# build dataframe with connections:
df…

Odds Ratio

Data Science Day 12: Odds Ratio
Learning Objective: Probability vs. Odds vs. Odds Ratio
1. Probability = Event / Sample Space
2. Odds = Prob(Event) / Prob(Non-Event)
3. Odds Ratio = Odds(Group 1) / Odds(Group 2)
Interpretation: The Odds Ratio is a measure of association between exposure and outcome. OR = Odds(Group 1) / Odds(Group 2) > 1 indicates the increased occurrence of an event in Group 1 compared to Group…
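
The three definitions above can be sketched in a few lines of Python. This is only an illustration: the function names and the counts (30 of 100 and 10 of 100) are made up for the example and do not come from the original post.

```python
def probability(events, sample_space):
    """Probability = Event / Sample Space."""
    return events / sample_space


def odds(p):
    """Odds = Prob(Event) / Prob(Non-Event)."""
    return p / (1 - p)


def odds_ratio(p_group1, p_group2):
    """Odds Ratio = Odds(Group 1) / Odds(Group 2)."""
    return odds(p_group1) / odds(p_group2)


# Hypothetical counts: 30 of 100 events in Group 1, 10 of 100 in Group 2.
p1 = probability(30, 100)
p2 = probability(10, 100)
ratio = odds_ratio(p1, p2)

# A ratio above 1 points to a higher occurrence of the event in Group 1.
print(f"OR = {ratio:.3f}")
```

With these made-up counts the ratio is above 1, which matches the interpretation of an increased occurrence in Group 1.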

CMH Test

Data Science Day 11: CMH Test
We know the Chi-square test can check the independence between two categorical variables in one sample population. What if we need to check the independence relation among three or more categorical variables? Cochran-Mantel-Haenszel (CMH): Given 3 categorical variables, we want to test whether two of them are independent after controlling for the third. Usually, the third nominal…

Feature Selection 2

Data Science Day 10: Sequential Backward Selection
Backward Selection is the selection method that starts from the whole set and arrives at the attribute set by removing, in each step, the element whose removal results in the smallest decrease of the Objective Function.
Sequential Backward Selection Algorithm
1. Let Y = X.
2. Find x in Y such that F(Y - {x}) is maximized.
3. Set Y = Y - {x}, and repeat step 2.
If we…
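
A minimal sketch of the backward-selection loop in Python, assuming the objective F is to be maximized and that each step drops the feature whose removal leaves the highest-scoring subset. The stopping rule (a target size `k`), the feature names, and the toy weights are all assumptions for illustration, since the excerpt is cut off before it states a stopping condition.

```python
def sequential_backward_selection(features, objective, k):
    """Start from the full set and drop one feature per step until k remain."""
    selected = list(features)  # Step 1: Y = X
    while len(selected) > k:
        # Step 2: build every subset with one feature removed and keep
        # the one that scores highest under the objective.
        candidates = [[f for f in selected if f != x] for x in selected]
        selected = max(candidates, key=objective)  # Step 3: Y = Y - {x}
    return selected


# Hypothetical objective: the sum of made-up usefulness weights.
weights = {"IQ": 3, "Study": 5, "Interest": 1, "Method": 2}


def toy_objective(subset):
    return sum(weights[f] for f in subset)


result = sequential_backward_selection(list(weights), toy_objective, k=2)
print(result)
```

In practice the objective would usually be a cross-validated model score rather than a fixed weight table.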

SAS Bonferroni Method

SAS Day 10: Bonferroni Method
Background: We know ANOVA is good for testing whether the mean values differ among groups. Null Hypothesis: If the p-value for ANOVA is < 0.05, we know that at least one group has a mean value different from the others. However, we do not know which groups have significantly different mean values. If we…
