Sunday, May 15, 2022

If only I had an Array

Modeling data sets often have binary flag variables that indicate whether a condition is true or false. Sometimes those indicators need to be collapsed or recoded into a single value. The code below shows one technique: an if/else chain that assigns the label of the first flag that is set.
       if s_del30postscratch_ind  = 1 then curr_delinquency = "D30";
  else if s_del60postscratch_ind  = 1 then curr_delinquency = "D60";
  else if s_del90postscratch_ind  = 1 then curr_delinquency = "D90";
  else if s_del120postscratch_ind = 1 then curr_delinquency = "D120";
  else if s_del150postscratch_ind = 1 then curr_delinquency = "D150";
  else if s_del180postscratch_ind = 1 then curr_delinquency = "D180+";
  else curr_delinquency = "Current";
An alternative way to do the same thing in SAS using two arrays is as follows:
  array avars s_del30postscratch_ind s_del60postscratch_ind s_del90postscratch_ind 
              s_del120postscratch_ind s_del150postscratch_ind s_del180postscratch_ind;
  array anames[6] $8 _temporary_ ('D30', 'D60', 'D90', 'D120', 'D150', 'D180+');
  
  do _n_ = 1 to dim(avars);
    if avars[_n_] = 1 then do;
      curr_delinquency = anames[_n_];
      leave;  /* stop at the first flag set, matching the if/else chain */
    end;
  end;
  /* no flag set: fall back to the default label */
  if sum(of avars[*]) = 0 then curr_delinquency = 'Current';
Here is the entire array technique, complete with sample data for five of the flags. The temporary array has the same number of elements as the variable array, so the label at each position lines up with the flag at the same position. That makes this a more elegant technique that is easy to extend: supporting another condition means adding one variable and one label rather than another if statement. Notice the double dash (--), a shortcut that selects every variable from the start column through the stop column, in the order they were defined. Just a different way to do the same thing.
data x;
  length curr_delinquency $8;
  input s_del30postscratch_ind s_del60postscratch_ind s_del90postscratch_ind 
        s_del120postscratch_ind s_del180postscratch_ind;
  array avars s_del30postscratch_ind -- s_del180postscratch_ind;
  array anames[5] $8 _temporary_ ('D30', 'D60', 'D90', 'D120', 'D180+');
  
  do _n_ = 1 to dim(avars);
    if avars[_n_] = 1 then do;
      curr_delinquency = anames[_n_];
      leave;  /* stop at the first flag set, matching the if/else chain */
    end;
  end;
  if sum(of avars[*]) = 0 then curr_delinquency = 'Current';
  datalines;
  1 0 0 0 0
  0 1 0 0 0
  0 0 1 0 0
  0 0 0 1 0
  0 0 0 0 1
  1 1 1 1 1 
  0 0 0 0 0
  ;
run;
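
One caveat with the double dash is worth a small sketch: the range is positional, so it picks up every variable that sits between the two endpoints, whether or not it is a flag. The data set demo and the variables flag_a, other, flag_b and n_vars below are hypothetical names, used only to illustrate the behavior.

```sas
/* Hypothetical example: all names here are made up for illustration. */
data demo;
  input flag_a other flag_b;
  /* Positional range: flag_a, other and flag_b all fall inside it. */
  array by_position flag_a -- flag_b;
  /* dim() reports how many variables the range picked up: 3, not 2. */
  n_vars = dim(by_position);
  datalines;
1 9 0
;
run;
```

Because other sits between the two flags, the array holds three elements instead of two. Listing the variables explicitly, as in the first array example, avoids depending on column order.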