Friday, May 29, 2015

Updating an ODS tagset

Chevell Parker has written a few excellent papers on the TAGSETS.TABLEEDITOR tagset that can be found here, here and here. ODS TAGSETS such as TABLEEDITOR are saved as item stores and need to be updated as the code can change over time. I do not update tagsets often, so created this blog entry for future reference and to share with others looking to do the same.

You can see the templates that are available by issuing the ODS PATH SHOW command. Individual tagsets in a library can be seen using SAS display manager by clicking on the Explorer window then at the bottom click on the Results tab. After that, left-mouse click on the Results node then Templates. Alternatively, you could issues ODSTEMPLATES in the SAS command window.

In the case of the TAGSETS.TABLEEDITOR I wanted to know the date and version that I was using. That can be achieved by adding OPTIONS( doc = "quick" ).

See the below code for the steps I used to update the tagset. You can also refer to Usage Note 32394: Installing and Storing Updated Tagsets for ODS MARKUP.

  libname common "pathgoeshere" ;

  ods path show ;
      current ODS PATH is:

  ods tagsets.tableeditor 
    file = "c:\temp\junk.html" 
    options( doc = "quick" ) ;
  ods tagsets.tableeditor close ;
      NOTE: Writing TAGSETS.TABLEEDITOR body file: c:\temp\junk.html
      v2.65  4/25/2013

  *-- set the template mode from READ to UPDATE ;
  ods path common.templat(update) ;

  filename temp url 
  %inc temp ;
     NOTE: Overwritting existing template/link: Tagsets.Tableeditor
     NOTE: TAGSET 'Tagsets.Tableeditor' has been save to: COMMON.TEMPLAT
  filename temp clear ;

  ods path reset ;
  ods path show ;
     Current ODS PATH list is:

  *-- reset common.templat as the first to be read in READ mode ;
  ods path(prepend) common.templat(read) ;
  ods path show ;
      current ODS PATH is:

  *-- show that the tagset has been updated as it has ;
  ods tagsets.tableeditor 
    file = "c:\temp\junk.html" 
    options( doc = "quick" ) ;
  ods tagsets.tableeditor close ;
      NOTE: Writing TAGSETS.TABLEEDITOR body file: c:\temp\junk.html
      v3.14  3/2/2015  

Monday, May 25, 2015

Four Short Macro Functions

SAS macro functions are very useful routines to return a value like a regular SAS or custom function created by proc fcmp. This blog entry contains a collection of four short, simple and to the point macro functions that you may find of some value.

In his paper, "Is This Macro Parameter Blank" Chang Y Chung does a great job testing ways to determine if a macro variable is blank. The most solid way to do this is as follows:

  %macro isBlank( param ) ;
    %sysevalf( %superq( param ) =, boolean )
  %mend ;

The idea for the next function was obtained from Art Capenter's paper which returns the name of the SAS program being executed. It is important to test for SYSIN value first as that is how you get the program name from a batch invocation.

  %macro pgmname() ;
    %local retval ;
    %let retval = %sysfunc( getoption( sysin ) ) ; /* batch mode */
    %if %isblank( &retval. ) %then 
      %let retval = %sysget( sas_execfilepath ) ; /* interactive mode */
  %mend ;

SAS veteran, Peter Crawford was a featured presenter a SAS Global Forum 2015. I really liked his %now function to return the time, day of week and date using the TWMDY format. You can see his video here: Learn Hidden Ideas in Base SAS® to Impress Colleagues

  %macro now( fmt = twmdy. ) / des = 'timestamp' ;
    %sysfunc( strip( %sysfunc( datetime(), &fmt ) ) ) 
  %mend ;

The final short macro function is one I created to obtain attributes from a SAS data set. It can handle character or numeric return values by testing the value of the attribute to return. This macro utilizes the ATTRC and ATTRN functions.

  %macro getattr( dsn =, attr = nobs ) ;
    %local attrtype clist dsid retval ;
    %if %isblank( &dsn ) %then %do ;
      %put %str(E)RROR: the DSN value is missing ;
      %return ;
    %end ;

    %let retval = . ;


    %if %index( &clist., %upcase( &attr. ) ) %then %let attrtype = attrc ;
    %else %let attrtype = attrn ;

    %let dsid = %sysfunc( open( &dsn. ) ) ;
    %if &dsid. %then %do ;
      %let retval = %sysfunc( &attrtype.( &dsid., &attr. ) ) ;
      %let dsid = %sysfunc( close( &dsid. ) ) ;
    %end ;

  %mend ;

Putting it all together in a program looks like this:

  data values ;
    length name value $64 ;
    name = "Program Name:" ;
    value = "%pgmname()" ;
    output ;
    name = "Now:" ;
    value = "%now()" ;
    output ;
    name = "Engine:" ;
    value = "%getattr( dsn = sashelp.class, attr=engine )" ;
    output ;
    name = "Obs:" ;
    value = "%getattr( dsn = sashelp.class )" ;
    output ;
  run ;

Friday, May 22, 2015

Reading multiple CSV files containing a header row

SAS allows you to read in multiple files with a wildcard which works out great for files that do not contain a header row. The firstobs = 2 can be used on a single file being processed but that only works on the first file of a concatenated set of files to be processed.

The way to get around this limitation is to use the EOV=variable option on the INFILE statement. It is important to understand that EOV is set to 1 when it encounters the first row of the 2nd and subsequent files. However, EOV stays set to 1 so you need to reset it to 0 in your code. Also the EOV=variable is treated as an automatic variable such as _N_ and _ERROR_ and it not written to the data set.

The below code uses prior blog entries split DSN to CSV and delete external files using wildcards in the creation of the sample code. The key to testing for the first row of a new file is the trailing @ on the input @ line. If that sets EOV = 1 then EOV is reset to 0 and the row is deleted. Otherwise that row is re-read on the next input statement.

    dsn          = sashelp.class
  , byvar        = sex
  , outputprefix = c:\temp\temp_
  , outputsuffix = csv
  , header       = Y )

filename fileref 'c:\temp\temp_*.csv' ;
data class2 ;
  infile fileref
    firstobs = 2 
    eov      = eov ;

  input @ ;

  if eov then do ;
    /* EOV = 1 after first row of next file and needs to be reset to 0 */
    eov = 0 ;
    delete ;
  end ;

  input name $ sex $ age height weight ;
run ;
filename fileref clear ;

%deletefiles( filename = c:\temp\temp_?.csv )

Thursday, May 21, 2015

split DSN to CSV

The below macro can be used to split a SAS data set into comma separated value (CSV) files based on the &byvar macro variable. The macro parameters support the option of including or ignoring the header row.

After some initial parameter verification, the dictionary.columns table is used to obtain the name of columns in the correct order. The input data set is then sorted on the &byvar macro variable to a new temp_&memname data set.

A data _NULL_ (do not create a SAS data set) step creates the output files via the use of the filevar= option. The filevar=variable (character string containing the physical filename) is used to dynamically specify the file used by the put statement. Interestingly, the filevar = variable is not written to the output file rather it is treated like other SAS automatic variables such as _N_ or _ERROR_.

The delimiter sensitive data (DSD) option uses a comma ( , ) as the default delimiter, so there is no need to specify a DLM= value. The put ( _ALL_ ) ( : ) is a formatted output statement [syntax is ( var list ) ( format list )] that writes out all non automatic variables to the output file. You could add the _N_ automatic variable such as put ( _N_ _ALL_ ) ( : ) if you wanted to show the row number. Just be sure to also alter the header and understand that _N_ does not get reset to zero when changing by groups. The last step is to clean up (delete) the temporary data set via proc delete.

%macro splitdsntocsv
      dsn          = sashelp.class
    , byvar        = sex
    , outputprefix = c:\temp\temp_
    , outputsuffix = csv
    , header       = Y 
  ) ;

  %local libname memname names ;

  %if %sysfunc( exist( &dsn. ) ) = 0 %then %do ;
    %put %str(E)RROR: the DSN = &dsn. does not exist. ;
    %return ;
  %end ;

  %if &byvar. = %then %do ;
    %put %str(E)RROR: the BYVAR = macro variable was not supplied. ;
    %return ;
  %end ;

  %let header = %upcase( %substr( &header., 1, 1 ) ) ;

  %if %index( &dsn., . ) %then %do ;
    %let libname = %scan( &dsn., 1 ) ;
    %let memname = %scan( &dsn., 2 ) ;
  %end ;
  %else %do ;
    %let libname = work ;
    %let memname = &dsn. ;
  %end ;

  proc sql noprint ;
    select         name
      into         :names separated by ","
      from         dictionary.columns
      where            libname = "%upcase( &libname. )"
                   and memname = "%upcase( &memname. )"
      order by     varnum ;     

    %if not %sysfunc( find( "&names.", &byvar., i ) ) %then %do ;
      %put %str(E)RROR: The BYVAR = &byvar. does not exist. ;
      quit ;
      %return ;
    %end ; 

    create table temp_&memname. as
      select       *
        from       &dsn.
        order by   &byvar. ;
  quit ;

  data _null_  ;
    set temp_&memname. ;
      by &byvar. ;
    temp_var = cats( "&outputprefix.", &byvar., ".", "&outputsuffix." ) ;
    file tempfref dsd lrecl = 32767 filevar = temp_var ;

    if first.&byvar. and "&header." = "Y" then put "&names." ;
    put ( _all_ ) ( : ) ;
  run ;

  proc delete data = temp_&memname. ;
  run ;
%mend ;

Sunday, May 10, 2015

Delete External Files using Wildcards

In the age of cloud computing you may not know the operating system (OS) that SAS is using. As a result, it is best to use SAS functions and statements that are OS independent.

SAS functions are used in the below code to delete one or multiple external files via DOS like wildcard characters. This means that the question mark (?) represents a single character while an asterisk (*) is used for one or more characters. However, those DOS wildcard symbols do not directly translate, so Perl Regular Expression (PRX) functions included in the base SAS language are used.

The FINDC() function is used to find the last slash (\ used by Windows) or backslash (/ used by Linux/UNIX). FINDC() will look for either the / or \ character starting from the end of the string via the -9999 parameter. Once the file portion of the &filename macro variable has been parsed, the TRANWRD function is used to substitute the correct Perl Regular Expression (PRX) characters. PRX reserves the period (.) to represent character so it needs to be escaped (\) as in \. to find a literal period. The entire regular expression is enclosed between the /^ (match beginning of string) to /i (case insensitive)

From there the path or folder is opened using the DOPEN function. The number of entries in the folder are determined by the DNUM function and then the filenames in the folder are read using the DREAD function. After that, the PRXMATCH function is used to determine if the file matches the supplied pattern. Files that match the pattern are then verified with FEXIST and deleted using the FDELETE function.

Here are a few examples of how this can be used followed by the source code:

%deletefiles( filename = c:\temp\xyz.csv ) /* single file */
%deletefiles( filename = c:\temp\xy?.csv ) /* ? for single character */
%deletefiles( filename = c:\temp\x*.csv ) /* files with X ends with .CSV */

%macro deletefiles( filename = ) ;
  %if %sysevalf( %superq( filename ) =,  boolean ) %then %do ;
    %put %str(E)RROR: filename was not supplied.;
    %return ;
  %end ;

  data _null_ ;
      path file name fullfilename prxstring $256 
      dir filep $8 
      rc did i cnt 4 ;

    path      = substr( "&filename.", 1, findc( "&filename", "\/", -9999 ) ) ;
    file      = substr( "&filename.", findc( "&filename", "\/", -9999 ) + 1 ) ;
    prxstring = cats( "/^"
              , tranwrd(tranwrd(tranwrd(file,".", "\."),"*",".*" ),"?",".?")
              , "/i" ) ;
    rc = filename( dir, path ) ;
    if rc = 0 then do ;
      did = dopen( dir ) ;
      if did > 0 then do ;
        do i = 1 to dnum( did ) ;
          name = dread( did, i ) ;

          if prxmatch( prxstring, name ) then do ;
            fullfilename = cats( path, name ) ;
            rc = filename( filep, fullfilename ) ;
            if rc = 0 and fexist( filep ) then do ;
              rc = fdelete( filep ) ;
              if rc = 0 then put "NOTE: " fullfilename "has been deleted." ;
              cnt + 1 ;
            end ;
            rc = filename( filep, " " ) ;
          end ;
        end ;
        rc = dclose( did ) ;
      end ;
      rc = filename( dir, " " ) ;
    end ;
    put / "NOTE: " cnt "files were deleted." ;

%mend ;