SAS Day 34:
Once, in my machine learning class, the professor asked what software we use for data science. One student answered: “SAS”.
Then the professor laughed and said: “Oh dear, you must be in the wrong class, nobody uses SAS in the data science industry”.
SAS originally stood for Statistical Analysis System, and it is most widely used in health-related fields. Although Python and R are known as the most popular data science languages, I think SAS has its strengths as well (better than Excel!!).
At least it came with the Iris Dataset!
Today we will use the Iris dataset to draw scatter plots:
Scatter Plot Matrix
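/* Set the output size for this figure */
ods graphics on / height=500px width=500px;

proc sgscatter data=sashelp.iris(where=(species="Virginica"));
  title "Fisher Iris Data";
  /* Pairwise scatter plots with prediction ellipses;
     the diagonal shows histograms with normal and kernel density curves */
  matrix petallength petalwidth sepallength /
         ellipse=(type=predicted)
         diagonal=(histogram normal kernel);
run;

/* Restore the default ODS Graphics settings */
ods graphics on / reset=all;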
Panel of scatter plots
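/* Set the output size for this figure */
ods graphics on / height=500px width=500px;

proc sgscatter data=sashelp.iris;
  title "Fisher Iris Data";
  /* Four scatter plots in one panel, with points colored by species */
  plot petallength*petalwidth
       sepallength*sepalwidth
       petallength*sepallength
       petalwidth*sepalwidth / group=species;
run;

/* Restore the default ODS Graphics settings */
ods graphics on / reset=all;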
As we can observe from the grouped panel, Setosa is more clearly separated from Versicolor and Virginica than those two are from each other, which is consistent with our Iris Dataset Cluster Analysis with Python.
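To put some numbers behind that visual impression, here is a small sketch that summarizes the petal measurements by species. It assumes the usual sashelp.iris variable names (Species, PetalLength, PetalWidth); the choice of statistics is my own and is not part of the plots above.

```sas
/* Sketch: per-species summary of the petal measurements.
   The statistic list (n mean std min max) is an illustrative choice. */
proc means data=sashelp.iris n mean std min max maxdec=1;
  class species;
  var petallength petalwidth;
run;
```

If the minimum and maximum for Setosa do not overlap with the ranges for the other two species, that supports what the scatter plots suggest.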
I used to feel a bit ashamed that I use SAS more often than Python or R, because those languages sound a lot cooler. Now I think SAS deserves my appreciation as well, like the lyric “Wild lilies also have their spring” (野百合也有春天)! SAS is a wonderful piece of software, and it comes with the Iris dataset!