To fulfill my dream, I am organizing my lecture notes, homework, and projects from Harrisburg University. Along the way, I will also publish posts on data analysis topics.

RPubs, GitHub, Kaggle

Sample Size Calculation with R

Background Story: One day, my boss asked me to check whether the data had a certain number of events so that we could perform an efficacy analysis. I was curious how he came up with that number; later I realized he must have done a sample size calculation. Today we will go over the basics of sample size calculation and how to apply it in R.

Randomization Method

Background Story: Last time we emphasized the importance of randomization: it balances the characteristics of the treated and placebo groups, so the groups are exchangeable. Today we will introduce three common randomization methods for different clinical trial purposes, along with the R code for implementing them: Simple Randomization, Block Randomization, and Stratified Randomization.
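The post itself implements these methods in R. As a rough illustration of how the three schemes differ, here is a minimal Python sketch. The function names, the two-arm "treatment"/"placebo" labels, the block size of 4, and the example strata are all assumptions made for illustration, not the post's actual code.

```python
import random

ARMS = ["treatment", "placebo"]  # assumed two-arm trial

def simple_randomization(n, rng):
    """Assign each participant independently, like a coin flip."""
    return [rng.choice(ARMS) for _ in range(n)]

def block_randomization(n, block_size, rng):
    """Shuffle balanced blocks so the arms stay equal within each block."""
    if block_size % len(ARMS) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")
    assignments = []
    while len(assignments) < n:
        block = ARMS * (block_size // len(ARMS))
        rng.shuffle(block)
        assignments.extend(block)
    return assignments[:n]

def stratified_randomization(strata_sizes, block_size, rng):
    """Run block randomization separately inside each stratum."""
    return {
        stratum: block_randomization(count, block_size, rng)
        for stratum, count in strata_sizes.items()
    }

rng = random.Random(2024)  # fixed seed so the allocation is reproducible
simple = simple_randomization(12, rng)
blocked = block_randomization(12, 4, rng)
stratified = stratified_randomization({"male": 8, "female": 8}, 4, rng)

print(simple)
print(blocked)
print(stratified)
```

Simple randomization can leave the arms unbalanced in small samples, while the blocked and stratified versions guarantee equal arm sizes within every complete block.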

Unix Command

Today I want to share some basic Unix commands I have been using recently in PuTTY.
Directory: navigate to a certain location
cd /…
cd ~ {home directory}
pwd {show current working directory}
Files: list current files
ls {path}
ls -l {date, size, permission}
Check/Change Access: Read is 4. Write is 2. Execute is 1.
ls -lah xyz (user, group, everyone else)…
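The permission arithmetic above (read = 4, write = 2, execute = 1) can be sketched in a few lines of Python. This is only an illustration of how each octal digit decodes into the rwx flags shown by ls -l; the helper names and the example mode "754" are hypothetical.

```python
def digit_to_rwx(digit):
    """Decode one octal digit: read = 4, write = 2, execute = 1."""
    return (
        ("r" if digit & 4 else "-")
        + ("w" if digit & 2 else "-")
        + ("x" if digit & 1 else "-")
    )

def mode_to_string(mode):
    """Decode a three-digit mode: user, group, everyone else."""
    return "".join(digit_to_rwx(int(d)) for d in mode)

perms = mode_to_string("754")  # hypothetical example mode
print(perms)  # rwxr-xr--
```

Here 7 = 4 + 2 + 1 gives the user full access, 5 = 4 + 1 gives the group read and execute, and 4 gives everyone else read only.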

T.Test: One vs Two Sample

Data Science Day 25: T.Test is one of the most commonly used statistical tools for comparing the mean of a continuous outcome across groups defined by a binary explanatory variable. Today we will go over three basic pieces of knowledge for T.Test: the two basic assumptions (observations are i.i.d.; the sample is normally distributed), One Sample vs Two Sample T.Test…
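To make the one-sample vs two-sample distinction concrete, here is a minimal Python sketch that computes only the t statistics, using the standard library. The data are made up, and the two-sample version uses Welch's unequal-variance form, which is an assumption here; the post may use a pooled-variance test instead.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t statistic for H0: the population mean equals mu0."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - mu0) / se

def two_sample_t(a, b):
    """Welch's t statistic for H0: both groups share the same mean."""
    se = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    return (statistics.mean(a) - statistics.mean(b)) / se

# Made-up illustration data
t_one = one_sample_t([5.1, 4.9, 5.0, 5.2, 4.8], mu0=5.0)
t_two = two_sample_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])

print(t_one, t_two)
```

Turning these statistics into p-values requires the t distribution, which a statistics package would normally supply.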

Python Read/Write Text File

Python Day 34: Plain text files are widely used in data science. For example, in NLP (Natural Language Processing), we usually import plain text files for sentiment analysis, such as movie reviews: “good”, “bad”… So today we will learn how to read text files in Python.
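As a small taste of the topic, here is a minimal sketch of writing and then reading a plain text file with Python's built-in open(). The file name reviews.txt and the helper names are hypothetical, and the file is created in a temporary directory so the example cleans up after itself.

```python
import tempfile
from pathlib import Path

def write_lines(path, lines):
    """Write one item per line, using UTF-8 explicitly."""
    with open(path, "w", encoding="utf-8") as f:
        for line in lines:
            f.write(line + "\n")

def read_lines(path):
    """Read the file back, dropping newlines and blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "reviews.txt"  # hypothetical file name
    write_lines(path, ["good", "bad"])
    reviews = read_lines(path)

print(reviews)  # ['good', 'bad']
```

Using with ensures the file is closed even if an error occurs while reading or writing.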

