Simplified table below:
ID | LPA_indx |
---|---|
13 | "LPA" |
13 | " NO LPA" |
13 | " NO LPA" |
14 | " NO LPA" |
14 | " NO LPA" |
14 | " NO LPA" |
14 | " NO LPA" |
16 | "LPA" |
16 | "LPA" |
16 | "LPA" |
I want to either change the LPA_indx column or create a new column that shows if any of the ID's has LPA.
Meaning, if even one of the LPA_indx for each ID number says LPA then I want it to show that for all of the rows for that ID. If none show it then I still want it to read "NO LPA". Any ideas? If it was a numeric value then I would think I can do a group by and a max() function, but I am unsure how to do it with strings like this.
As one more clarifcation, below is the table I wish to output.
ID | LPA_indx |
---|---|
13 | "LPA" |
13 | "LPA" |
13 | "LPA" |
14 | " NO LPA" |
14 | " NO LPA" |
14 | " NO LPA" |
14 | " NO LPA" |
16 | "LPA" |
16 | "LPA" |
16 | "LPA" |
Thanks!
CodePudding user response:
You can use the FINDW() function inside a DoW Loop.
data want;
do _n_=1 by 1 until (last.id);
set have;
by id;
cond = 0;
if findw(lpa_indx, '"LPA"') then cond = 1;
sum=sum(sum, cond);
end;
do until (last.id);
set have;
by id;
if sum > 0 then lpa_indx = '"LPA"';
else lpa_indx = '" NO LPA"';
output;
end;
drop cond sum;
run;
id lpa_indx
13 "LPA"
13 "LPA"
13 "LPA"
14 " NO LPA"
14 " NO LPA"
14 " NO LPA"
14 " NO LPA"
16 "LPA"
16 "LPA"
16 "LPA"
CodePudding user response:
SQL allows you to apply the max/min function on character variables. Since LPA is before "NO LPA" it means you want the minimum in this case.
proc sql;
create table want as
select *, min(LPA_indx) as lpa_indx_min
from have
group by ID
order by ID;
quit;
From an efficiency view, I suspect the data step/DoW loop will run faster. From a maintenance view, the SQL is easier to understand. However, if you ever add new different codes then the data step is also better as it will still work and the SQL may not depending on the alphabetical order of the new code.