In the data below I would like proc sql to select the minimum date for subject 123 as the missing date.
data visit;
input subject $1-3 dtc $4-24 ;
cards;
123 2014-01-15T00:00
123
123 2014-01-17T00:00:00
124 2014-01-15T00:00:00
124 2014-01-15T00:00:00
124 2014-01-17T00:00:00
;
run;
proc sql;
create table want. as
select distinct subject, min(dtc) as mindt format = date9.
from have
where subject ne ''
group by subject;
quit;
CodePudding user response:
MIN()
will discard missing values from the aggregate computation. Thus, you need to test separately if there are any missing values.
Example:
Use a CASE
expression to calculate the MIN
you want.
data have;
input subject $1-3 dtc $5-27 ;
cards;
123 2014-01-15T00:00
123 .
123 2014-01-17T00:00:00
124 2014-01-15T00:00:00
124 2014-01-15T00:00:00
124 2014-01-17T00:00:00
;
proc sql ;
create table want as
select
subject
, case when nmiss(dtc) then '' else min(dtc) end as mindtc
, input (calculated mindtc, ? yymmdd10.) as mindt format=date9.
from have
where subject ne ''
group by subject
;
quit;
CodePudding user response:
Here is an alternative solution in SAS:
First, create an index or sort your data by subject and dtc.
proc sort data=have out=have_sorted;
by subject dtc;
quit;
Then you can apply a data step with by grouping and use the first.[column] to get the minimum for each subject including missing values:
data minima;
set have_sorted;
by subject dtc;
if first.subject;
run;