Saturday, April 27, 2019

X out the X command - use SYSTASK instead

This is an easy one when it comes to best practices. There are times when you may need to shell out to the operating system in order to perform a task that SAS does not otherwise handle.

The easy way to do this is via the X command, but is it the best technique to use? The same question applies to %SYSEXEC and CALL SYSTEM. Look at the comparison table below.

Feature                    X Command   SYSTASK
Asynchronous Processing?   No          Yes
Return Code?               No          Yes

Be sure to review the SAS documentation on SYSTASK. Here is the syntax in a nutshell:

SYSTASK COMMAND "os command" <WAIT|NOWAIT> <TASKNAME=taskname>
    <STATUS=statusvar> <SHELL<='shell-command'>> <CLEANUP>;

  • WAIT | NOWAIT determines whether the request is handled asynchronously (NOWAIT, the default) or synchronously (WAIT).
  • TASKNAME = taskname uniquely identifies the task. Use this with the WAITFOR statement.
  • STATUS = statusvar names a unique macro variable that stores the status of the task.
  • SHELL specifies that the command should be executed by the OS shell. You can name a shell; otherwise the default shell is used.
  • CLEANUP specifies that the task should be removed from the LISTTASK output when the task completes. This allows you to reuse the taskname. NOTE: This option is not available under the Windows operating system.

SYSTASK LIST <_ALL_ | taskname> <STATE> <STATVAR>;

SYSTASK KILL taskname <taskname>;
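
As a rough sketch of how LIST and KILL fit together, the following starts a long-running task, reports on it and then stops it. The task name slow, the status variable slow_rc and the sleep command are made up for illustration, and the example assumes a UNIX host with XCMD enabled.

```sas
/* Hypothetical long-running task: "sleep 60" assumes a UNIX shell */
systask command "sleep 60" taskname=slow status=slow_rc shell;

/* Write the state and the status variable name of the task to the log */
systask list slow state statvar;

/* Stop the task early instead of waiting for it to finish */
systask kill slow;
```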

The two biggest reasons to use SYSTASK are the ability to run processes in parallel (asynchronously) and the ability to get back a status code so you can handle any issues. As noted, the X command supports neither. SYSTASK, on the other hand, can run asynchronously with NOWAIT or synchronously with WAIT. Here is an example of copying three files at once:

/* If using Windows or not using the CLEANUP option then do this first */
/* SYSTASK KILL t1 t2 t3; */

SYSTASK COMMAND "cp &in/f1.txt &out/f1.txt" taskname=t1 status=s1 shell cleanup;
SYSTASK COMMAND "cp &in/f2.txt &out/f2.txt" taskname=t2 status=s2 shell cleanup;
SYSTASK COMMAND "cp &in/nofile.txt &out/nofile.txt" taskname=t3 status=s3 shell cleanup;

/* wait for tasks 1, 2 and 3 to complete before moving on to the code that follows */
WAITFOR _all_ t1 t2 t3;  

data _null_;
   if &s1 ne 0 then putlog "ERR" "OR: issue copying f1.txt";
   if &s2 ne 0 then putlog "ERR" "OR: issue copying f2.txt";
   if &s3 ne 0 then putlog "ERR" "OR: issue copying nofile.txt";
run;

ERROR: issue copying nofile.txt

UPDATE: After some initial feedback, it is important to note that, yes, the XCMD option must be on for the X command, FILENAME PIPE, %SYSEXEC, CALL SYSTEM, SYSTASK and other commands. I used a simple example of the UNIX cp (copy) command, but this could have been handled with the built-in SAS function FCOPY(). I have covered the use of FCOPY() in this previous blog post.
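
For reference, here is a minimal sketch of the FCOPY() route, which does not depend on XCMD. The paths assigned to in and out and the filerefs src and dst are placeholders of my own, and RECFM=N is used so that the file is copied byte for byte.

```sas
/* Placeholder directories - replace with real paths */
%let in  = /tmp/in;
%let out = /tmp/out;

/* Shows XCMD or NOXCMD depending on how the session was started */
%put NOTE: XCMD setting is %sysfunc(getoption(xcmd));

filename src "&in/f1.txt"  recfm=n;
filename dst "&out/f1.txt" recfm=n;

data _null_;
  length msg $200;
  rc = fcopy("src", "dst");       /* 0 means the copy worked */
  if rc ne 0 then do;
    msg = sysmsg();               /* text of the last warning or error */
    putlog "ERR" "OR: issue copying f1.txt - " msg;
  end;
run;

filename src clear;
filename dst clear;
```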

Thursday, April 11, 2019

Binary Permutations

Would you like to store more than one value in a single variable or column? Doing so can save considerable space, resulting in less network traffic and faster throughput. Inspiration and credit for this post came from the SAS Global Forum paper "Deciphering PROC COMPARE Codes: The Use of the bAND Function" by Hinson and Coughlin.

To keep things simple, consider that we have four books A, B, C and D and need to track every combination of those four. That is, someone may have no books, or only B and C. To make this all work, assign each item a power of 2, starting with an exponent of 0, as follows:

  • 1 as 2^0 = 1
  • 2 as 2^1 = 2
  • 3 as 2^2 = 4
  • 4 as 2^3 = 8
  • 5 as 2^4 = 16
  • 6 as 2^5 = 32
  • 7 as 2^6 = 64
  • 8 as 2^7 = 128

Each of the 4 books is either present or absent (2 states), so there are 2^4 or 16 possible combinations. All of them can be stored in a single column via the use of bitwise operators such as SAS's band function (the bitwise logical AND of two arguments). The following code will help better illustrate how this works.

data x(drop = value);
  value = 15;
  do num = 0 to 15;
    binary = put(num, binary4.);
    if num > 0 then binval = 2 ** (num - 1);
    binsum + binval;
    if num > 0 then match = band(binval, value);
    format binval binsum comma8.;
    output;
  end;
run;

The binval column contains the value associated with the num column. So if you want books B and C, the stored value is the sum 2 + 4, or 6. If all four books are desired, the value is 15, which is what the binsum column (the cumulative sum of binval) shows at num = 4. The binary column shows num in binary format; read from right to left, each bit stands for one book (A, B, C, D), so the 16 rows cover every combination.
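
To make the idea concrete, here is a small sketch that packs books B and C into one number and then unpacks it again. The variable names owned, has_b, has_d and list are mine, and bor (bitwise OR) is used to combine the values, although plain addition gives the same result when each book is added only once.

```sas
data _null_;
  length list $20;

  /* bit values for books A-D are 1, 2, 4 and 8 */
  owned = bor(2, 4);            /* books B and C */
  has_b = band(owned, 2) > 0;   /* bit for B is set */
  has_d = band(owned, 8) > 0;   /* bit for D is not set */
  put owned= has_b= has_d=;     /* owned=6 has_b=1 has_d=0 */

  /* walk the four bits and collect the matching book letters */
  do i = 1 to 4;
    if band(owned, 2 ** (i - 1)) > 0 then list = catx(", ", list, byte(64 + i));
  end;
  put list=;                    /* list=B, C */
run;
```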

Below is the sample books data set followed by the actual macro function code. Low-level functions are used to read the data and return the matched column values. This assumes that the data set is sorted in the correct order and has exactly as many rows as needed and no more. The limit to this process is 15 items, for a maximum value of 32,767, which is one less than 2^15.

data books;
  length i 3 book $6;
  do i = 1 to 4;
    book = cat("Book ", byte(64 + i));
    output;
  end;
run;

%macro band_permutations(
     dsn       =
   , column    =
   , value     =
   , separator = %str(,)
);
  %local dsid position type i binval retval;

  %let dsid = %sysfunc(open(&dsn, i));  /* open data set */
  %if &dsid %then %do;
    %let position = %sysfunc(varnum(&dsid, &column)); /* find position of column */
    %let type = %sysfunc(vartype(&dsid, &position));  /* is it 'C'har or 'N'um */
    %do i = 1 %to %sysfunc(attrn(&dsid, nlobs));      /* how many rows in data set */
      %let binval = %eval(2 ** (&i - 1));             /* calc the binary value */

      %if %sysfunc(band(&binval, &value)) > 0 %then %do;  /* did value match */
        %if %sysfunc(fetchobs(&dsid, &i)) = 0 %then %do;  /* retrieve the row */
          %if %length(&retval) = 0 %then
            %let retval = %sysfunc(getvar&type(&dsid, &position)); /* read column value */
          %else %let retval = &retval&separator %sysfunc(getvar&type(&dsid, &position));
        %end;
        %else %put %sysfunc(sysmsg());  /* write out message */
      %end;
    %end;
    %let dsid = %sysfunc(close(&dsid)); /* close data set */
  %end;
  %else %put %sysfunc(sysmsg());  /* unable to open the data set */

  /* return the value */
  &retval
%mend;

/* below resolves to result = Book B, Book C */
%put result = %band_permutations(dsn=books, column = book, value = 6);
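
Going the other way, the stored value can be built from the same books data set. This is only a sketch: the macro variable name wanted and the choice of books A and D are mine, and it relies on the i column in books holding the row number.

```sas
/* Add up the bit value of each wanted book; i is the row number stored in books */
data _null_;
  set books end = last;
  if book in ("Book A", "Book D") then value + 2 ** (i - 1);
  if last then call symputx("wanted", value);
run;

%put wanted = &wanted;   /* 1 + 8 = 9 */

/* should list Book A and Book D */
%put result = %band_permutations(dsn=books, column = book, value = &wanted);
```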