SAS Iris dataset Plot

SAS Day 34:

Background Story:

Once, in my machine learning class, the professor asked what software we use for data science. One student answered, “SAS”.
Then the professor laughed and said, “Oh dear, you must be in the wrong class, nobody uses SAS in the data science industry”.

SAS stands for Statistical Analysis System, and it is most widely used in health-related fields. Although Python and R are known as the most popular data science languages, I think SAS has its strengths as well (better than Excel!!).

At least it came with the Iris Dataset!

Fotomanie / Pixabay

Today we will use the Iris dataset to draw scatter plots:

Scatter Plot Matrix


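/* Set the output size for ODS Graphics */
ods graphics on / height=500px width=500px;
/* Scatter plot matrix for the Virginica species only */
proc sgscatter data=sashelp.iris(where=(species="Virginica"));
   title "Fisher Iris Data";
   /* Prediction ellipses in each cell, histograms with density curves on the diagonal */
   matrix petallength petalwidth sepallength /
          ellipse=(type=predicted)
          diagonal=(histogram normal kernel);
run;
/* Restore the default ODS Graphics settings */
ods graphics on / reset=all;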


Panel of scatter plots 



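/* Set the output size for ODS Graphics */
ods graphics on / height=500px width=500px;
/* Scatter plot of petal length against petal width for all species */
proc sgscatter data=sashelp.iris;
   title "Fisher Iris Data";
   plot petallength*petalwidth;
run;
/* Restore the default ODS Graphics settings */
ods graphics on / reset=all;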
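
To compare the three species in one panel, the points can be colored by species. Here is a minimal sketch: the group=species option and the second plot request (sepallength*sepalwidth) are my additions for illustration and are not part of the code above.

```sas
/* Sketch only: group=species and the second plot request are assumptions
   added for illustration; they are not in the original code. */
ods graphics on / height=500px width=500px;
proc sgscatter data=sashelp.iris;
   title "Fisher Iris Data by Species";
   /* Two plot requests produce a two-cell panel; group= colors points by species */
   plot petallength*petalwidth sepallength*sepalwidth / group=species;
run;
title;  /* clear the title so it does not carry over to later output */
ods graphics on / reset=all;
```

With grouping turned on, each species gets its own marker color and a shared legend, so any separation between the species shows up directly in the panel.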

As we can observe from the previous graphs, Setosa stands apart from Versicolor and Virginica, which is consistent with our Iris Dataset Cluster Analysis with Python.
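
One way to put numbers behind that visual impression is to summarize each measurement by species. This is a rough sketch, assuming the usual variable names in sashelp.iris (Species, PetalLength, PetalWidth, SepalLength, SepalWidth). PROC MEANS and this particular set of statistics are my own choice for the check, not something from the plots above.

```sas
/* Sketch only: a per-species summary added as a cross-check on the plots. */
proc means data=sashelp.iris mean std min max maxdec=1;
   class species;                                      /* one row block per species */
   var petallength petalwidth sepallength sepalwidth;  /* the four measurements */
run;
```

If the species really differ, it should show up as a clear gap between the per-species ranges (min and max) of at least one variable.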


Personal Thought:

I used to feel a bit ashamed that I use SAS more often than Python or R, because those programs sound a lot cooler. Now I think SAS deserves my appreciation as well, like the lyrics “Wild Lily also has Spring (野百合也有春天)”! SAS is wonderful software, and it comes with the Iris Dataset!
