I have the following dataset in SAS;
City grade1 grade2 grade3
NY A. A. A
CA. B. A. C
CO. A. B. B
I would "combine" the three variables grades and get a proc freq that tells me the number of grades for each City; the expected output should therefore be:
A. B. C
NY 3. 0. 0
CA. 1. 1. 1
CO. 1. 2. 0
How could I do that in SAS?
CodePudding user response:
Quite a few steps but it gives the expected result.
*-- Creating sample data --*;
data have;
infile datalines delimiter="|";
input City $ grade1 $ grade2 $ grade3 $;
datalines;
NY|A|A|A
CA|B|A|C
CO|A|B|B
;
*-- Sorting in order to use the transpose procedure --*;
proc sort data=have; by city; quit;
*-- Transposing from wide to tall format --*;
proc transpose data=have out=stage1(rename=(col1=grade) drop= _name_);
by city;
var grade:;
run;
*-- Assigning a value to 1 for each record for later sum --*;
data stage2;
set stage1;
val = 1;
run;
*-- Tabulate to create val_sum --*;
ods exclude all; *turn off default tabulate print;
proc tabulate data=stage2 out=stage3;
class city grade;
var val;
table city,grade*sum=''*val='';
run;
ods select all; *turn on;
*-- Transpose back using val_sum --*;
proc transpose data=stage3 out=stage4(drop=_name_);
by city;
id grade;
var val_sum;
run;
*-- Replace missing values by 0 to achieve desired output --*;
proc stdize data=stage4 out=want reponly missing=0;run;
City A B C
CA 1 1 1
CO 1 2 0
NY 3 0 0
CodePudding user response:
In general:
- Transpose data to a long format
- Use PROC FREQ with the SPARSE option to generate the counts
- Save the output from PROC FREQ to a data set
- Transpose the output from PROC FREQ to the desired output format
*create sample data;
data have;
input City $ grade1 $ grade2 $ grade3 $;
cards;
NY A A A
CA B A C
CO A B B
;;;;
*sort;
proc sort data=have; by City;run;
*transpose to long format;
proc transpose data=have out=want1 prefix=Grade;
by City;
var grade1-grade3;
run;
*displayed output and counts;
proc freq data=want1;
table City*Grade1 / sparse out=freq norow nopercent nocol;
run;
*output table in desird format;
proc transpose data=freq out=want2;
by city;
id Grade1;
var count;
run;