SAS Day 22: Merge
Sometimes we need information that lives in different datasets. How do we combine two or more datasets in SAS?
In most cases we use the MERGE statement. However, depending on the data structure, we need SQL instead when the mapping is many-to-many.
P.S. Whether we use MERGE or SQL, the datasets need at least one common variable to match on.
We want to combine the information from ADSL (Patients General Information) and ADTTE (Time-to-Event Dataset).
First we sort the ADSL and ADTTE datasets by the common variable “USUBJID”, then we merge them, keeping only the records that appear in both datasets.
/* Always remember to sort before merging */
proc sort data=adsl out=pop;
  by usubjid;
run;

proc sort data=adam.adtte out=adtte;
  by usubjid adt;
run;

/* Merge, keeping only the records in both ADTTE and POP */
data survival;
  merge adtte(in=b) pop(in=a);
  by usubjid;
  if a and b;
run;
There are 3 steps for the MERGE statement to work properly:
- Sort the datasets that need to be merged.
- Make sure the common “BY” variable has the same name, type, and length in every dataset.
- Merge the datasets with the desired selections using the BY variable.
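To make the three steps concrete, here is a minimal sketch on toy data. The dataset names demog and events, and all the values in them, are made up for illustration; they only stand in for ADSL and ADTTE.

```sas
/* Hypothetical toy data, not the real ADSL/ADTTE */
data demog;
  length usubjid $8 arm $10;
  input usubjid $ arm $;
  datalines;
S003 Placebo
S001 Drug
S002 Drug
;
run;

data events;
  length usubjid $8 paramcd $8;
  input usubjid $ paramcd $ aval;
  datalines;
S002 OS 120
S001 OS 300
S001 PFS 150
;
run;

/* Step 1: sort each input by the BY variable */
proc sort data=demog;
  by usubjid;
run;

proc sort data=events;
  by usubjid;
run;

/* Step 2: confirm usubjid has the same type and length in both datasets */
proc contents data=demog;
run;

proc contents data=events;
run;

/* Step 3: merge and keep only the subjects found in both datasets */
data both;
  merge demog(in=a) events(in=b);
  by usubjid;
  if a and b;
run;
```

With this toy data, both should contain three records: two for S001 (OS and PFS) and one for S002. S003 is dropped because it has no record in events.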
The following code summarizes the most common cases when merging two datasets:
data survival;
  merge adtte(in=b) pop(in=a);
  by usubjid;
  if a and b;          /* Select the records in both a and b */
  * if a;              /* Select the records in a */
  * if a and not b;    /* Select the records in a but not in b */
run;
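The IN= variables are temporary and are not written to the output dataset, so it can help to copy them into regular variables while checking a merge. The sketch below does that on toy data; the dataset names ds_a and ds_b and the variables in_a and in_b are hypothetical.

```sas
/* Hypothetical toy data, already in id order */
data ds_a;
  input id $ x;
  datalines;
A 1
B 2
;
run;

data ds_b;
  input id $ y;
  datalines;
B 20
C 30
;
run;

/* Keep every record and save the IN= flags as regular variables */
data flagged;
  merge ds_a(in=a) ds_b(in=b);
  by id;
  in_a = a;   /* 1 if this id has a record in ds_a, otherwise 0 */
  in_b = b;   /* 1 if this id has a record in ds_b, otherwise 0 */
run;

proc print data=flagged;
run;
```

Here id A has in_a=1 and in_b=0, B has both flags set to 1, and C has in_a=0 and in_b=1. So "if a and b" would keep only B, "if a" would keep A and B, and "if a and not b" would keep only A.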
Merge more than two datasets:
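/* All three inputs must already be sorted by usubjid */
data a b c1;
  merge adtte(in=b) pop(in=a) adresp(in=c);
  by usubjid;
  if a and b then output a;       /* in POP and ADTTE */
  if a and c then output b;       /* in POP and ADRESP */
  if a and not c then output c1;  /* in POP but not in ADRESP */
run;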
It is important to know that MERGE works for one-to-one or one-to-many mapping. Next time we will go over SQL for many-to-many mapping.
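To see why MERGE is not suited to many-to-many mapping, here is a small sketch on toy data (the dataset names one, two and mm are hypothetical). Both inputs have repeated values of the BY variable.

```sas
/* Hypothetical toy data: id K repeats in both datasets */
data one;
  input id $ x $;
  datalines;
K x1
K x2
;
run;

data two;
  input id $ y $;
  datalines;
K y1
K y2
K y3
;
run;

/* MERGE pairs the records in order within the BY group */
data mm;
  merge one two;
  by id;
run;

proc print data=mm;
run;
```

The result has 3 records (x1 with y1, x2 with y2, and x2 again with y3), not the 6 combinations a true many-to-many match would give. SAS also writes a note to the log saying that more than one dataset has repeats of BY values.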
Happy Studying! 😇