I have table in SAS Enterprise Guide like below:
ID | val1| val2
----|-----|-----
123 | M | M
123 | M | P
123 | P | P
444 | PR | PR
444 | PR | PR
567 | PR | M
567 | M | M
99 | PR | P
And I need to creeate new column "col1" with values 0/1:
- If some ID never changed the value PR from column "val1" to the value of M or P in column "val2" then this ID has 1 else 0
So, as a result I need something like below:
ID | val1| val2| col1
----|-----|-----|----
123 | M | M | 1
123 | M | P | 1
123 | P | P | 1
444 | PR | PR | 1
444 | PR | PR | 1
567 | PR | M | 0
567 | M | M | 0
99 | PR | P | 0
Because:
- 123 - has 1 in "col1" because has never changed PR to M or P
- 444 - has 1 in "col1" because has never changed PR to M or P
- 567 - has 0 because changed PR to M
- 99 - has 0 because changed PR to P
How can I do that in PROC SQL in SAS Enterprise Guide ?
CodePudding user response:
So you want a single value at the ID level that is replicated onto all observations for that level of ID? PROC SQL makes that easy because it will automatically remerge aggregate values back to the detailed observations.
It sounds like the test you want is
val1='PR' and val2 in ('M' 'P')
You then want the overall result to be 1 (TRUE) when that expression is never true.
data have ;
input ID val1 $ val2 $ EXPECT ;
cards;
123 M M 1
123 M P 1
123 P P 1
444 PR PR 1
444 PR PR 1
567 PR M 0
567 M M 0
99 PR P 0
;
proc sql;
create table want as
select *
, min(not (val1='PR' and val2 in ('M' 'P'))) as COL1
from have
group by id
;
quit;
Result:
Obs ID val1 val2 EXPECT COL1
1 99 PR P 0 0
2 123 P P 1 1
3 123 M M 1 1
4 123 M P 1 1
5 444 PR PR 1 1
6 444 PR PR 1 1
7 567 M M 0 0
8 567 PR M 0 0
CodePudding user response:
Solution if the can be changed:
proc sort data=have;
by ID;
run;
data want;
* for every ID, read the data you have twice *;
set have (in=first_pass) have (in=last_pass);
by ID;
* From the first pass, remember if any observation "changed" from PR to P or M *;
retain col1;
if first.ID then col1 = 0;
if val1 eq 'PR' and val2 in ('P', 'M') then col1 = 1;
* only output the results from the second pass *;
if last_pass;
run;