Saturday, April 27, 2019

X out the X command - use SYSTASK instead

This is an easy one when it comes to best practices. There are times when you may need to shell out to the operating system in order to perform a task that SAS does not otherwise handle.

The easy way to do this is via the X command, but is it the best technique to use? The same can be said for the use of %SYSEXEC and CALL SYSTEM commands. Look at the comparison table below.

Feature                     X Command   SYSTASK
Asynchronous processing?    No          Yes
Return code?                No          Yes

Be sure to review the SAS documentation on SYSTASK. Here is the syntax in a nutshell:

SYSTASK COMMAND "os command" <WAIT|NOWAIT> <TASKNAME=taskname>
    <STATUS=statusvar> <SHELL<='shell-command'>> <CLEANUP>;

  • WAIT | NOWAIT determines whether the request is handled asynchronously (NOWAIT, the default) or synchronously (WAIT).
  • TASKNAME = taskname uniquely identifies the task. Use this name with the WAITFOR statement.
  • STATUS = statusvar names a unique macro variable that stores the status of the task.
  • SHELL specifies that the command should be executed by the OS shell. You can name a specific shell; otherwise the default shell is used.
  • CLEANUP specifies that the task should be removed from the LISTTASK output when the task completes. This allows you to reuse the taskname. NOTE: This option is not available under the Windows operating system.
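
As a rough illustration of the options above, the sketch below runs one command synchronously and then reports the outcome. The task name lsjob, the macro variable ls_rc and the /tmp directory are placeholders chosen for this example, and it assumes a UNIX host with XCMD enabled.

```sas
/* Run the command synchronously; control returns when it finishes */
systask command "ls -l /tmp" wait taskname=lsjob status=ls_rc shell;

/* SYSRC reflects whether the SYSTASK statement itself could be executed; */
/* ls_rc holds the status returned by the operating system command.       */
%put NOTE: SYSTASK statement rc=&sysrc, command status=&ls_rc;
```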

SYSTASK LIST <_ALL_ | taskname> <STATE> <STATVAR>;

SYSTASK KILL taskname <taskname>;
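
A minimal sketch of LIST and KILL in use, assuming a UNIX host where the sleep command is available. The task name sleeper and the macro variable sleep_rc are made up for illustration.

```sas
/* Start a long-running placeholder command without waiting for it */
systask command "sleep 60" nowait taskname=sleeper status=sleep_rc;

/* Show every known task with its state and status variable in the log */
systask list _all_ state statvar;

/* Stop the task; the same statement can name several tasks at once */
systask kill sleeper;
```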

The two biggest reasons to use SYSTASK are the ability to run processes in parallel (asynchronously) and to get back a status code so you can handle any issues. As noted, the X command supports neither feature, while SYSTASK can run asynchronously with NOWAIT or synchronously with WAIT. Here is an example of copying three files at once:

/* If using Windows or not using the CLEANUP option then do this first */
/* SYSTASK KILL t1 t2 t3; */

SYSTASK COMMAND "cp &in/f1.txt &out/f1.txt" taskname=t1 status=s1 shell cleanup;
SYSTASK COMMAND "cp &in/f2.txt &out/f2.txt" taskname=t2 status=s2 shell cleanup;
SYSTASK COMMAND "cp &in/nofile.txt &out/nofile.txt" taskname=t3 status=s3 shell cleanup;

/* wait for tasks 1, 2 and 3 to complete before running the code that follows */
WAITFOR _all_ t1 t2 t3;  

data _null_;
   if &s1 ne 0 then putlog "ERR" "OR: issue copying f1.txt";
   if &s2 ne 0 then putlog "ERR" "OR: issue copying f2.txt";
   if &s3 ne 0 then putlog "ERR" "OR: issue copying nofile.txt";
run;

ERROR: issue copying nofile.txt

UPDATE: After some initial feedback, it is important to note that yes, the XCMD option must be on for the X command, FILENAME pipe, %SYSEXEC, CALL SYSTEM, SYSTASK and other commands. I used a simple example of the UNIX cp (copy) command, but this could have been handled with the built-in SAS function FCOPY(). I have covered the use of FCOPY() in this previous blog post.
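
For comparison, here is a hedged sketch of the same single-file copy done with FCOPY(), preceded by a check of the XCMD setting. It assumes the same &in and &out macro variables used in the SYSTASK example; the filerefs src and dst are placeholder names.

```sas
/* Report whether shelling out is allowed (resolves to XCMD or NOXCMD) */
%put NOTE: XCMD setting is %sysfunc(getoption(xcmd));

/* RECFM=N treats the files as byte streams, so any file type can be copied */
filename src "&in/f1.txt" recfm=n;
filename dst "&out/f1.txt" recfm=n;

data _null_;
   length msg $256;
   rc = fcopy('src', 'dst');   /* 0 means the copy succeeded */
   if rc ne 0 then do;
      msg = sysmsg();
      putlog "ERR" "OR: issue copying f1.txt - " msg;
   end;
run;

filename src clear;
filename dst clear;
```

FCOPY() works even when NOXCMD is in effect, which is one reason to prefer it for plain file copies.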

Thursday, April 11, 2019

Binary Permutations

Would you like to store more than one value in a single variable or column? Doing so can save considerable space resulting in less network traffic and faster throughput. Inspiration and credit for this post came from the SAS Global Forum paper "Deciphering PROC COMPARE Codes: The Use of the bAND Function" by Hinson and Coughlin.

To keep things simple, consider that we have four books A, B, C and D and need to track every combination of those four. That is, someone may have no books, or only B and C. To make this all work, assign each item a power of 2, starting with an exponent of 0, as follows:

  • 1 as 2^0 = 1
  • 2 as 2^1 = 2
  • 3 as 2^2 = 4
  • 4 as 2^3 = 8
  • 5 as 2^4 = 16
  • 6 as 2^5 = 32
  • 7 as 2^6 = 64
  • 8 as 2^7 = 128

Each of the 4 books has 2 possible states (have it or not), so there are 2^4 or 16 possible combinations. These can be stored in a single column via the use of bitwise operators such as SAS's band function (the bitwise logical AND of two arguments). The following code will help better illustrate how this works.

data x(drop = value);
  value = 15;
  do num = 0 to 15;
    binary = put(num, binary4.);
    if num > 0 then binval = 2 ** (num - 1);
    binsum + binval;
    if num > 0 then match = band(binval, value);
    format binval binsum comma8.;
    output;
  end;
run;

The binval column contains the value associated with the num column. So if you want books B and C, that is the sum of 2 + 4, or 6. If all books were desired, the value is 15 in the binsum column, which is the cumulative sum of the binval column. You can look at the binary column to see these values in binary format. Reading from right to left, every combination is covered.
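
To make the encoding concrete, the short sketch below stores books B and C in one value and then tests individual bits. It uses the BOR function to combine the values, which is my choice for illustration rather than something the original code relies on.

```sas
data _null_;
   /* Book A = 1, Book B = 2, Book C = 4, Book D = 8 */
   value = bor(2, 4);            /* combine books B and C */
   has_b = band(value, 2) > 0;   /* 1 when book B's bit is set */
   has_d = band(value, 8) > 0;   /* 0 because book D's bit is not set */
   put value= has_b= has_d=;     /* writes value=6 has_b=1 has_d=0 */
run;
```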

Below is the sample books data set followed by the actual macro function code. Low-level functions are used to read the data and return the matched column values. This assumes that the data set is in the correct sorted order and has exactly as many rows as needed and no more. The limit to this process is 15 items, for a maximum combined value of one less than 2^15, or 32,767.

data books;
  length i 3 book $6;
  do i = 1 to 4;
    book = cat("Book ", byte(64 + i));
    output;
  end;
run;

%macro band_permutations(
     dsn       =
   , column    =
   , value     = 
   , seperator = %str(,)
);
  %local dsid position type i binval retval;
    
  %let dsid = %sysfunc(open(&dsn, i));  /* open data set */
  %if &dsid %then %do;
    %let position = %sysfunc(varnum(&dsid, &column)); /* find position of column */
    %let type = %sysfunc(vartype(&dsid, &position));  /* is it 'C'har or 'N'um */
    %do i = 1 %to %sysfunc(attrn(&dsid, nlobs));      /* how many rows in data set */
      %let binval = %eval(2 ** (&i - 1));             /* calc the binary value */

      %if %sysfunc(band(&binval, &value)) > 0 %then %do;  /* did value match */
        %if %sysfunc(fetchobs(&dsid, &i)) = 0 %then %do;  /* retrieve the row */
          %if %length(&retval) = 0 %then
            %let retval = %sysfunc(getvar&type(&dsid, &position)); /* read column value */
          %else %let retval = &retval&seperator %sysfunc(getvar&type(&dsid, &position));
        %end;
        %else %put %sysfunc(sysmsg());  /* write out message */
      %end;
    %end;
    %let dsid = %sysfunc(close(&dsid)); /* close data set */
  %end;
  %else %put %sysfunc(sysmsg());  /* unable to open the data set */
   
  /* return the value */
  &retval
%mend;

/* below resolves to result = Book B, Book C */
%put result = %band_permutations(dsn=books, column = book, value = 6);  
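
For readers who do not need a macro function, the same decoding can be sketched in a DATA step. This is an alternative written for illustration, not the approach used above. It relies on the variable i stored in the books data set to identify each row's bit, and the value 6 is hard-coded.

```sas
data _null_;
   length result $200;
   retain result;
   set books end = last;
   /* i is the row position stored in BOOKS, so 2 ** (i - 1) is that row's bit */
   if band(2 ** (i - 1), 6) > 0 then result = catx(', ', result, book);
   if last then put result=;   /* writes result=Book B, Book C */
run;
```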

Thursday, March 28, 2019

Please wait...

Keeping users informed on the progress of any activity is not only a courtesy but is also very critical. This is especially true for SAS web-based stored processes. I can run the same code and get results back as quickly as three seconds or as slowly as over a minute. I cannot explain this inconsistency, so I cannot expect an end user to be patient and understanding.

As a result, it is a very good idea to display a message letting the user know that the process is underway and to please wait. Failure to do so can cause the user to submit the same code multiple times.

I wanted to do this without using an image but still be able to display a good-looking message with a minimal number of lines of HTML/CSS or JavaScript. To do this I utilized the great free resource, CodePen.

CodePen is a cloud-based service that allows users to share HTML, CSS and JavaScript code snippets. Its well-designed interface makes it easy to share ideas and to provide others with code they can use to better illustrate questions. Code can easily be "forked" and saved to your own pen. The pen I used was forked from Arlina Design, then modified and saved. You can see it here: https://codepen.io/tbellmer/pen/aMedyM/

Here is how I used the wait message:

data _null_;
   file _webout;
   input;
   put _infile_;
   datalines4;
<div id = 'pwcontainer' class='pwcontainer'>
   <h1 class='ctr'>Please Wait...</h1>
   <div class='pwfillme'>
      <div class='pwfillme-line'></div>
   </div>
</div>
;;;;
run;

/* your SAS code goes here */

data _null_;
   file _webout;
   input;
   put _infile_;
   datalines4;
<script>
   document.querySelector('#pwcontainer').style.display = 'none';
</script>
;;;;
run;
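
If the message is needed in several stored processes, the two steps could be wrapped in a small macro. The sketch below is one possible way to do that; the macro name wait_message and its action parameter are hypothetical, and it uses PUT statements in place of the datalines4 blocks because data lines cannot be used inside a macro.

```sas
%macro wait_message(action = show);
   data _null_;
      file _webout;
      %if %upcase(&action) = SHOW %then %do;
         /* write the same markup used for the wait message */
         put "<div id='pwcontainer' class='pwcontainer'>";
         put "   <h1 class='ctr'>Please Wait...</h1>";
         put "   <div class='pwfillme'>";
         put "      <div class='pwfillme-line'></div>";
         put "   </div>";
         put "</div>";
      %end;
      %else %do;
         /* hide the container once the real output is ready */
         put "<script>";
         put "   document.querySelector('#pwcontainer').style.display = 'none';";
         put "</script>";
      %end;
   run;
%mend wait_message;

%wait_message(action = show)
/* your SAS code goes here */
%wait_message(action = hide)
```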

I would love to see SAS Institute provide a service similar to codepen, maybe call it SASPen. Imagine how much easier it would be for someone to provide others actual code to test and modify directly online. Until then, we will just have to, please wait...

Tuesday, March 19, 2019

Histogram / BoxPlot using GTL

I have been a longtime fan of Sanjay Matange and Dan Heath when it comes to the SAS Graph Template Language, or GTL. I was lucky enough to meet these two individuals in person at the 2018 SAS Global Forum in Denver and to talk about some of the GTL features. Be sure to check out their SAS blog Graphically Speaking.

GTL is a unique language with its own syntax that shares very little, if anything, with the SAS foundation language. However, it is a very powerful piece of software that should not be overlooked.

The above image provides a great sample of what GTL can do. In this case, I am using the SAS supplied SASHELP.HEART data set containing 5,209 observations and creating a histogram along with a subordinate fringe plot of the same data. The top left area of the graph is used to display textual values of key statistical measures. Below the histogram and taking up just 15% of the area is a horizontal box plot.

While a histogram breaks data into bins, a boxplot (aka box and whiskers) is a visual representation of key statistical values. Inside the box is a vertical line that marks the median, while the diamond shape is the mean. The box ranges from the first quartile (25th percentile) to the third quartile (75th percentile). The vertical bar on each end sits at the min/max value, or at the most extreme value within Q1 minus 1.5 * the interquartile range on the low side and Q3 plus 1.5 * the interquartile range on the high side. Values beyond that range are true outliers. See below.
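
The 1.5 * IQR limits described above can be computed directly. This is a small sketch using PROC MEANS on the same SASHELP.HEART systolic variable; the data set name stats and the variable names are placeholders picked for the example.

```sas
proc means data = sashelp.heart noprint;
   var systolic;
   output out = stats q1 = q1 q3 = q3 qrange = iqr min = minval max = maxval;
run;

data _null_;
   set stats;
   lower_fence = q1 - 1.5 * iqr;   /* values below this are outliers */
   upper_fence = q3 + 1.5 * iqr;   /* values above this are outliers */
   put q1= q3= iqr= lower_fence= upper_fence= minval= maxval=;
run;
```

Comparing minval and maxval against the two fences shows on which side the whisker stops at an actual data extreme and on which side outliers are drawn.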

All the code needed to generate the graph is included below. The key thing to understand is that a template is made using define statgraph. Once the template has been created, it is referenced along with the data set and rendered using proc sgrender.

proc template;
  define statgraph distribution;
    dynamic 
      VAR 
      VARLABEL 
      NORMAL
      TITLE1 
    ;

    begingraph;
      entrytitle TITLE1;
      layout lattice / 
        columns         = 1 
        rows            = 2 
        rowgutter       = 2px
        rowweights      = (.85 .15) 
        columndatarange = union
      ;

        columnaxes;
          columnaxis / 
            label   = VARLABEL 
            display = (ticks tickvalues label);
        endcolumnaxes;

        layout overlay / 
          yaxisopts = (
          offsetmin   = .035 
          griddisplay = on);

          layout gridded / 
            columns   = 2 
            border    = true 
            autoalign = (topleft topright);
           
            entry halign = left "Nobs";
            entry halign = right eval(strip(put(n(VAR), comma6.)));
            entry halign = left "Min";
            entry halign = right eval(strip(put(min(VAR), comma6.)));
            entry halign = left "Q1";
            entry halign = right eval(strip(put(q1(VAR), comma6.)));
            entry halign = left "Median";
            entry halign = right eval(strip(put(median(VAR), comma6.)));
            entry halign = left "Mean";
            entry halign = right eval(strip(put(mean(VAR), comma6.)));
            entry halign = left "Q3";
            entry halign = right eval(strip(put(q3(VAR), comma6.)));
            entry halign = left "Max";
            entry halign = right eval(strip(put(max(VAR), comma6.)));
            entry halign = left "StdDev";
            entry halign = right eval(strip(put(stddev(VAR), comma6.)));
            entry halign = left "IQR";
            entry halign = right eval(strip(put(qrange(VAR), comma6.)));
          endlayout;

          histogram VAR / scale = percent;
          if (exists(NORMAL))
            densityplot VAR / 
              normal() 
              name        = 'norm' 
              legendlabel = 'Normal'
            ;
            densityplot VAR / 
              kernel() 
              name        = 'kern' 
              legendlabel = 'Kernel' 
              lineattrs   = (
                color    = red
                pattern  = dash
              );
          endif;

          fringeplot VAR / datatransparency = .7;

          discretelegend "norm" "kern" / 
            location  = inside 
            across    = 1 
            autoalign = (topright topleft) 
            opaque    = true;
        endlayout;

        boxplot y = VAR / orient = horizontal;
      endlayout;
    endgraph;
  end;
run;

proc sgrender data = sashelp.heart template = distribution;
  dynamic var      = "systolic" 
          varlabel = "Systolic" 
          normal   = "yes"
          title1   = "Systolic Blood Pressure" 
  ;
run;
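
Because the template is driven by dynamic variables, it can be reused for other columns. As a sketch, the call below renders the Cholesterol variable from the same data set and leaves out the normal dynamic, so the if (exists(NORMAL)) block should skip both density curves. I have not confirmed how the legend behaves in that case; I would expect it to be omitted since the plots it names are not drawn.

```sas
proc sgrender data = sashelp.heart template = distribution;
  dynamic var      = "cholesterol"
          varlabel = "Cholesterol"
          title1   = "Cholesterol"
  ;
run;
```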

Sunday, March 3, 2019

udemy.com - best training bargain out there

This blog is primarily related to SAS programming. However, there are times when we need to expand our horizons to topics that are tangentially related to SAS. One of those items for me is to better understand and implement HTML and CSS as it relates to using SAS stored processes.

While SAS does supply a built-in prompting framework to create interfaces, it is very limiting and inflexible. Of course, the way around this is to roll your own via HTML, CSS and JavaScript as needed. If you utilize the _WEBOUT stream destination in a file statement, you can create whatever you like.

So the question becomes, how do you learn HTML and CSS to get started? I live near a very nice and respected community college that offers a course on HTML/CSS with 40 hours of classroom material. The cost for this course is $1,199.00, which is not exactly cheap, though it may be a bargain compared with what a major university charges.

As Mick Jagger so eloquently put it in the song Street Fighting Man, "Well, then what can a poor boy do Except to sing for a rock 'n' roll band 'Cause in sleepy London town There's no place for a street fighting man No Hey!".

The answer is the online course Modern HTML & CSS From The Beginning (Including Sass) by Brad Traversy via udemy.com. Brad Traversy is a truly amazing talent who has put out over 650 free videos on his YouTube channel, Traversy Media, which has over 600,000 followers. This particular course is 21 hours of material and is rated 4.8 out of 5 based on 1,173 reviews. If you look, you will be able to purchase this $149.99 class for as little as $11.99, which is 99% less than the week-long class mentioned above. The nice thing is that you can take this course on your own schedule, as time permits, and it does not expire. Udemy.com also offers hundreds of other courses and is a fantastic bargain that I highly recommend.