Friday, May 2, 2014

Hash Object Throwdown: SetCur() vs Find_Next() methods

Is the SAS hash iterator object's setcur() method faster than the hash object's find_next() method when extracting all of the rows stored under a single key value? In the code below, 5 million rows are created for each of the key values 'A', 'B' and 'C' (15 million rows in total), and then every row with the key 'B' is looked up and extracted.

It turns out that, in this test, the hash object's find()/find_next() loop is about 25% faster than the iterator's setcur()/next() loop.

/* Build the test data: 5 million rows for each of the keys 'A', 'B' and 'C'. */
data input ;
  length
    key $1
    sat  5 ;

  do key = 'A', 'B', 'C' ;
    do sat = 1 to 5000000 ;
      output ;
    end ;
  end ;
run ;
 
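/* Extract every 'B' row twice, once per method, and time each loop. */
data
  xiterator( keep = key sat )
  xhash( keep = key sat ) ;
  /* Pull the variable definitions into the PDV without reading any rows. */
  if 0 then set input ;

  /* Load INPUT into a hash table; multidata: 'y' keeps every row of a duplicate key. */
  dcl hash hh( dataset: 'input', ordered: 'a', multidata: 'y' ) ;
  dcl hiter hi( 'hh' ) ;
  hh.definekey( 'key' ) ;
  hh.definedata( 'key', 'sat' ) ;
  hh.definedone() ;

  findthis = 'B' ;

  /* Method 1: setcur() positions the iterator on the first 'B' item, */
  /* then next() walks forward until the key changes.                 */
  temp_start = datetime() ;
  do rc = hi.setcur( key: findthis ) by 0 while( rc = 0 and key = findthis ) ;
    output xiterator ;
    rc = hi.next() ;
  end ;
  temp_end = datetime() - temp_start ;
  /* Elapsed time is written to the log. */
  put temp_end time10.4 ;

  /* Method 2: find() retrieves the first 'B' item, */
  /* then find_next() retrieves each further item with the same key. */
  temp_start = datetime() ;
  do rc = hh.find( key: findthis ) by 0 while( rc = 0 ) ;
    output xhash ;
    rc = hh.find_next() ;
  end ;
  temp_end = datetime() - temp_start ;
  put temp_end time10.4 ;

  /* The SET statement never executes, so the step must be ended explicitly. */
  stop ;
run ;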
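
Before reading too much into the timings, it is worth checking that both loops extracted the same rows. The following is only a sketch of one way to do that, and it assumes the xiterator and xhash data sets created by the step above are still in the WORK library. Both extracts are sorted first, as a precaution, so that any difference in retrieval order between the two methods cannot affect the comparison.

```sas
/* Sort both extracts so that row order cannot influence the comparison. */
proc sort data = xiterator ;
  by key sat ;
run ;

proc sort data = xhash ;
  by key sat ;
run ;

/* Compare the two extracts; PROC COMPARE reports any differing values  */
/* and the number of observations in each data set. If both loops       */
/* extracted every 'B' row, each data set should hold 5,000,000 rows.   */
proc compare base = xiterator compare = xhash ;
run ;
```

If the comparison reports no unequal values and the same observation count on both sides, the two methods returned the same result and only their speed differs.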

For more information, read this excellent paper on hash objects: Black Belt Hashigana
