Home > database >  SAS New Variable Array from Single Variable
SAS New Variable Array from Single Variable

Time:04-27

Should be an easy question that I cannot answer. I was given some survey data that had several questions that allowed "select as many that apply." For these questions, the response options are stored as a single variable separated by commas.

For example, say that question 1 (Q1) had 9 different response options. For that question, Person 1 might have three responses stored as: 1,3,10 whereas person 2 might have four responses stored as: 2,3,8,9.

From this single variable (Q1), I want to create 9 separate variables, one for each response option (Q1_1 to Q1_9). I believe I can do this in one DATA step with an ARRAY and a DO loop, but neither the INDEX or IN functions are working. Below is the code I've been using.

DATA Final; SET Final; 
     ARRAY Q1b{9} Q1_1 - Q1_9; *new variables I want to create;
     DO i = 1 TO 9;
         IF NOT MISSING(Q1) THEN Q1b{i} = 0; *works;
         IF INDEX(Q1,"i") THEN Q1b{i} = 1; *Doesn't work;
         IF Q1 IN: ("i") THEN Q1b{i} = 1; *Doesn't work either;
     END;
RUN;

Using responses from the example people above, the value of Q1_1 for person 1 would be 1, whereas for person 2 it would be 0. Likewise, the value of Q1_2 for person 1 would be 0, whereas for person 2 it would be 1. I'm guessing it has something with how SAS stores the looping i (I'm thinking of it similar to a macro variable without the &). Thought are appreciated.

Thanks, Ryan

CodePudding user response:

  • Generally shouldn't use the same input and output data set name. Makes it hard to debug and you sometimes get unexpected output.
  • Check if the variable is empty before the loop to save you from loop unneccesarily
  • Remove the quotes from the i to use the number i. In quotes it's looking for the letter i, not the iterator variable i.

Something like this maybe:

DATA Final_expanded; /*1*/
SET Final; 
     ARRAY Q1b{9} Q1_1 - Q1_9; *new variables I want to create;
     if not missing(q1) then DO i = 1 TO 9; /*2*/
         IF INDEX(Q1, i) THEN Q1b{i} = 1; /*3*/
     END;
RUN;

EDIT: couldn't get that to work for some reason but this method does work.

DATA Final_expanded; /*1*/
SET have; 
     ARRAY Q1b{9} Q1_1 - Q1_9; *new variables I want to create;
     nwords = countw(q1);
     if not missing(q1) then DO i = 1 TO nwords; /*2*/
         index = scan(q1, i);
         q1b(index) = 1;
     END;
RUN;

CodePudding user response:

INDEX() needs character strings.

IF INDEX(Q1,cats(i)) THEN Q1b{i} = 1;

Note that will not scale if there are more than 9 possible responses since the digit '1' will appear in both '1' and '10'. So you might want to use INDEXW() instead.

IF INDEXW(Q1,cats(i),',') THEN Q1b{i} = 1;

Boolean expressions evaluate to 0 or 1. So your program could just be:

data WANT;
  set HAVE;
  array q1_ [9] ;
  do i = 1 to dim(q1_);
    q1_[i] = 0<indexw(q1,cats(i),',');
  end;
run;
  • Related