Pages

SyntaxHighlighter

Monday, December 7, 2020

DOW Sort

Sorting data in ascending or descending order is a very easy and straightforward process.  However, there are times when a simple sort does not suffice.  Consider a scenario of submitting a large number of processes in a SAS grid environment that has a limit on the number of active parallel processes.

In that scenario, you will want to mix tasks that take a long time with those quicker running tasks.  There is normally a key driver that can be used to identify longer running processes.  The number of rows and columns are prime candidates to sort the data.

This example uses the available SASHELP.CLASS data set and sorts it by age and height in ascending order.  The ability to read SAS data sets in random order via the POINT= option in a SET statement make this relatively easy.  The idea is to read the highest or last row with the first X rows that are lower in value.  The technique used is a variant of the DOW Loop created and popularized by SAS Gurus Don Henderson, Paul Dorfman and Ian Whitlock.

The below data set is sorted by age and height.



The following data set shows the result of reading the last row then the top 5 rows.  After that the second to the last row is output followed by rows 6 through 10



/*******************************************************************************

     Program: sort_dow.sas

      Author: Tom Bellmer

     Created: 12/07/2020 @ 1:25:37 PM

SAS Version: SAS 9.4 (TS1M6)

          OS: LIN X64

     Purpose: Places subrows under the highest value via a DOW loop

       Notes: Assumes the input data set is sorted in ascending order

              will be used to stuff low value rows below high for RSUBMITS

              to reduce expected time in each parallel session

 

                             Modifications in descending order

FL-YYYYMMDD                             Description

----------- --------------------------------------------------------------------

 

         1    1    2    2    3    3    4    4    5    5    6    6    7    7    8

....5....0....5....0....5....0....5....0....5....0....5....0....5....0....5....0

*******************************************************************************/

 

%macro sort_dow(

    dsnin   =

  , subrows = 5

  , dsnout  = sort_dow

);

 

  %if not %sysfunc(exist(&dsnin)) %then %do;

    %put %str(E)RROR: Invalid input data set: &dsnin..;

    %return;

  %end;

 

  data &dsnout;

    if 0 then set &dsnin nobs = totobs;

 

    do until(totobs = i);

      set &dsnin point = totobs;

      totobs = totobs - 1;

      output;

      if totobs = i then stop;

 

      do _n_ = 1 to &subrows;

        i + 1;

        set &dsnin point = i;

        output;

        if totobs = i then stop;

      end;

    end;

    stop;

  run;

%mend;

 

/*EOF: sort_dow.sas */

 

 

 

proc sort data = sashelp.class out = class;

  by age height;

run;

 

%sort_dow(

    dsnin   = class

  , subrows = 5

  , dsnout  = sort_dow

);