Pages

SyntaxHighlighter

Wednesday, November 5, 2014

LagLead Macro

UPDATE: If you have SAS/ETS installed, you can use proc expand as an alternative. There are a ton of options with this procedure but here is the syntax to emulate the below code:

proc expand data = sashelp.class out = example method = none ;
  convert age = sumage / transform = ( cmovsum 5 trimleft 2 trimright 2 ) ;
run ;

The SAS supplied LAG() function is used to retain previously read variables. There is no supplied LEAD function so the below lagLead function was created.

The key to making this work is to LINK to a DO LOOP that utilizes the POINT= option on the SET statement to read the data in random access mode. A RENAME= option was required to not overwrite the existing VAR of interest in the program data vector (PDV).

In the below example, while sitting on row 5 the prior two, current row and next two rows are summed resulting in a value of 65 ( 13 + 14 + 14 + 12 + 12 ).

/**************************************************************************
*     Program: lagLead.sas
*      Author: Tom Bellmer
*     Created: 05NOV2014 
*     Purpose: read and sum X rows before and after current row
**************************************************************************/

%macro lagLead( 
    dsn      = sashelp.class
  , outdsn   = example
  , var      = age
  , outvar   = sumage
  , lag      = 2
  , lead     = 2 
) ;

  %local tempvar ;
  %let tempvar = temp_%sysfunc(ceil( %sysfunc( ranuni(0) ) * 10000 ) ) ;

  data &outdsn.( drop = &tempvar. ) ;
    set &dsn. nobs = totalrows ;
      if _n_ > &lag. and _n_ <= ( totalrows - &lead. ) then link laglead ;
      output ;
    return ;

    laglead:
      do i = ( _n_ - &lag. ) to ( _n_ + &lead. ) ;
        set &dsn.( keep = &var. rename = ( &var. = &tempvar. ) ) point = i ;
        &outvar. = sum( &outvar., &tempvar. ) ;
      end ;
    return ;
  run ;
%mend ;

/* EOF: lagLead.sas  */

3 comments:

  1. The description for the lag function is incorrect, as it is named misleading.
    A better name would be a Queue (FIFO) as it is pushing values on it and retrieving the oldest in one statement. It can be used for retrieving values from previous records but doe not need to be used solely that way.

    ReplyDelete
  2. Hi Jaap, in my case I needed to sum the previous 26, current and future 26 rows of data then write the average of that sum on the current row. A very odd thing to do for sure but that was the requirement. This seems to work very well so is it really just a matter of semantics? I can see in my case I did not use the LAG function but just read data via the POINT= option in another data set buffer.

    ReplyDelete
  3. Tom that was a nice bit of code. We don't have ETS module yet so that was pretty nifty way to accomplish the same thing. Thanks for posting that. - Kevin

    ReplyDelete