SAS Boxplot

SAS Day 33: Box Plot

Definition:
A box plot (also called a box-and-whisker plot) displays the distribution of a dataset through its five-number summary: minimum, first quartile (Q1), median, third quartile (Q3), and maximum.

Interpreting quartiles:

The five-number summary divides the data into 4 sections, each containing approximately 25% of the data.
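As a quick illustration, the five-number summary can be requested directly from PROC MEANS. This is a minimal sketch, assuming the sashelp.class dataset used later in this post; maxdec=1 is only a display choice.

```sas
/* Five-number summary of weight, one row per sex */
proc means data=sashelp.class min q1 median q3 max maxdec=1;
  class sex;
  var weight;
run;
```

Each pair of neighbouring statistics (min to Q1, Q1 to median, and so on) bounds one of the four sections.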

Explore a little more

If we want to look at the outliers, we define the points below Q1 - 1.5(Q3 - Q1) or above Q3 + 1.5(Q3 - Q1) as outliers. The difference Q3 - Q1 is the interquartile range (IQR).
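The outlier rule can be sketched in SAS as follows. This is only an illustration, assuming sashelp.class; the dataset name fences and the variables iqr, lower_fence and upper_fence are made up for this example.

```sas
/* Step 1: Q1 and Q3 of weight for each sex */
proc means data=sashelp.class noprint nway;
  class sex;
  var weight;
  output out=fences q1=q1 q3=q3;
run;

/* Step 2: apply the 1.5 * (Q3 - Q1) rule */
data fences;
  set fences;
  iqr         = q3 - q1;
  lower_fence = q1 - 1.5*iqr;   /* points below this are outliers */
  upper_fence = q3 + 1.5*iqr;   /* points above this are outliers */
run;

proc print data=fences noobs;
  var sex q1 q3 iqr lower_fence upper_fence;
run;
```

Any weight outside the interval from lower_fence to upper_fence for its group would be drawn as an individual outlier marker.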

Note: for normally distributed data, Q1 and Q3 fall at the mean ± 0.6745σ, so the box (Q1 to Q3) covers the central 50% of the distribution, the region around the peak of the normal curve.
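To connect this note with the 1.5 × (Q3 - Q1) outlier rule, here is the arithmetic for an ideal normal distribution with mean μ and standard deviation σ (a worked illustration, not a property of any particular dataset):

```latex
\begin{align*}
Q_1 &= \mu - 0.6745\,\sigma, \qquad Q_3 = \mu + 0.6745\,\sigma \\
\mathrm{IQR} &= Q_3 - Q_1 \approx 1.349\,\sigma \\
Q_1 - 1.5\,\mathrm{IQR} &\approx \mu - 2.698\,\sigma, \qquad
Q_3 + 1.5\,\mathrm{IQR} \approx \mu + 2.698\,\sigma
\end{align*}
```

So for normal data only about 0.7% of the points are expected to fall outside the fences and be flagged as outliers.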


Example:

We will use sashelp.class as the example dataset and draw the box plot in two ways, with PROC SGPLOT and with PROC TEMPLATE. Both produce the same result!

Basic Box-Plot 

Interpretation:
Reading the plot, the median weight of female students is a little below 90. About 25% of female students weigh between 75 and 82 (the lower whisker), about 25% weigh between 105 and 115 (the upper whisker), and the middle 50% weigh between 85 and 102 (the box).

Code:

SGPLOT

proc sgplot data=sashelp.class;
  title "Distribution of Weight by Sex";
  /* one vertical box per value of sex */
  vbox weight / category=sex;
run;

TEMPLATE

/* define a reusable GTL template with one box per value of sex */
proc template;
  define statgraph ClassBox;
    begingraph;
      entrytitle "Distribution of Weight by Sex";
      layout overlay;
        boxplot y=weight x=sex;
      endlayout;
    endgraph;
  end;
run;

proc sort data=sashelp.class out=class;
by sex;
run;
proc sgrender data=class template=ClassBox;
run;

 

Advanced Box Plot:

Code:

proc univariate data=sashelp.class;

var weight ;
class sex;
ods output quantiles =q;
run;

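/* keep only the rows whose label has a second word (Max, Q3, Median, Q1, Min) */
data q2(rename=(estimate=weight) where=(quantile ne " "));
  set q;
  /* "75% Q3" becomes "Q3"; labels such as "99%" become blank */
  quantile = scan(quantile, 2, " ");
run;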

/* boxplotparm draws the boxes from precomputed statistics */
proc template;
  define statgraph bpp;
    begingraph;
      entrytitle "Distribution of Weight by Sex";
      layout overlay;
        boxplotparm y=weight x=sex stat=quantile;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data=q2 template=bpp;
run;

With the extra PROC UNIVARIATE step, we also get a summary dataset that can be used to cross-check the graph.
For example, it confirms that the minimum weight among female students is about 50.
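One way to do that check is to print the summary rows for the female students. This is a small sketch that assumes the q2 dataset created above, with the class variable Sex carried through from the PROC UNIVARIATE output.

```sas
/* List the statistics that feed the box for sex = "F" */
proc print data=q2 noobs;
  where sex = "F";
  var sex quantile weight;
run;
```

The Min row can then be compared with the bottom of the lower whisker in the graph.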

 

 

Reference:

https://www.khanacademy.org/math/statistics-probability/summarizing-quantitative-data/box-whisker-plots/a/box-plot-review

https://towardsdatascience.com/understanding-boxplots-5e2df7bcbd51

Warren F. Kuhfeld, Creating Statistical Graphics in SAS

Happy Practicing!  🍹
