Home > Blockchain >  What do the following lines of the code do?
What do the following lines of the code do?

Time:11-02

I have the following code with me which was shared by another user on this forum:

ods exclude all;
ods output nlevels=nlevels;
proc freq data=sashelp.cars nlevels;
  tables _all_ / noprint ;
run;
ods select all;
proc contents data=sashelp.cars noprint out=contents;
run;

proc sql;
create table want(drop=table:) as 
  select c.varnum,c.name,c.type,n.*
  from contents c inner join nlevels n
  on c.name=n.TableVar
  order by varnum
;
quit;

filename code temp;
data _null_;
  set contents end=eof;
  length nliteral $65 dsname $80;
  nliteral=nliteral(name);
  dsname = catx('.',libname,nliteral(memname));
  file code;
  if _n_=1 then put 'create table counts as select ' / ' ' @ ;
  else put ',' @;
  put 'nmiss(' nliteral ') as missing_' varnum 
    /',count(distinct ' nliteral ') as distinct_' varnum
  ;
  if eof then put 'from ' dsname ';';
run;
proc sql;
%include code /source2;
quit;
proc transpose data=counts out=count2 name=name ;
run;

proc sql ;
create table want as 
  select c.varnum, c.name, c.type 
       , m.col1 as nmissing
       , d.col1 as ndistinct
  from contents c 
  left join count2 m on m.name like 'missing%' and c.varnum=input(scan(m.name,-1,'_'),32.)
  left join count2 d on d.name like 'distinct%' and c.varnum=input(scan(d.name,-1,'_'),32.)
  order by varnum
;
quit;

The above code gives the following output:

|Obs|    VARNUM |   NAME |          TYPE|    nmissing|    ndistinct|
  1       1      Make             2            0           38
  2       2      Model            2            0          425
  3       3      Type             2            0            6
...

My question is what does the following part of the above code do:

filename code temp;
data _null_;
  set contents end=eof;
  length nliteral $65 dsname $80;
  nliteral=nliteral(name);
  dsname = catx('.',libname,nliteral(memname));
  file code;
  if _n_=1 then put 'create table counts as select ' / ' ' @ ;
  else put ',' @;
  put 'nmiss(' nliteral ') as missing_' varnum 
    /',count(distinct ' nliteral ') as distinct_' varnum
  ;
  if eof then put 'from ' dsname ';';
run;
proc sql;
%include code /source2;
quit;

Could someone please explain what each line of the above portion of the code is doing? Thank you.

CodePudding user response:

The SAS data step is a good tool for writing report files. In this case the report being written is SAS code.

filename code temp;

Makes a temporary file and points the fileref CODE to that location.

data _null_;

Starts a data step that writes out no dataset.

  set contents end=eof;

Reads from the dataset CONTENTS and sets the variable EOF to true when the End Of the File is reached.

  length nliteral $65 dsname $80;

Defines two character variables.

  nliteral=nliteral(name);

Sets the variable NLITERAL to a name literal string version of the current variable's name that is stored in the NAME variable.

  dsname = catx('.',libname,nliteral(memname));

Sets the variable DSNAME to the name of the dataset specified in the LIBNAME and MEMNAME variables.

  file code;

Directs PUT messages to the fileref CODE instead of the default of sending them to the SAS log.

  if _n_=1 then put 'create table counts as select ' / ' ' @ ;
  else put ',' @;

Writes the beginning of an SQL CREATE TABLE statement as the first line of the file. Otherwise writes a comma before the next set of columns that will be written by the next statement.

  put 'nmiss(' nliteral ') as missing_' varnum 
    /',count(distinct ' nliteral ') as distinct_' varnum
  ;

Writes code to generate two variables. The variables are named MISSING_xx and DISTINCT_xx where the XX part is the variable's position number in the original dataset that was stored in the VARNUM dataset. The NLITERAL variable is used to print the name of the variable being summarized.

  if eof then put 'from ' dsname ';';

When the End Of the File is reached write the end part of the CREATE TABLE statement.

run;

Ends the data step definition.

proc sql;

Starts PROC SQL.

%include code /source2;

Runs the CREATE TABLE statement that the previous data step wrote to the fileref CODE.

quit;

Ends the PROC SQL step.

CodePudding user response:

This appears to be dynamically creating SQL code using the dataset contents and saving it to a temporary file called code. This is used in the following SQL statement:

proc sql;
%include code /source2;
quit;

The filename code contains the SQL code. %include grabs that code and puts it in the program. They could have done this using a macro variable, but I'm assuming the code must be so long that it cannot fit into a single macro variable. Instead, they chose to use a temporary file.

  • Related