KanSAS Code: SAS vs SQL Update

Saturday, February 28, 2015

SAS vs SQL Update

I have always preferred to use SQL versus SAS syntax because of the wider usage of SQL for general data manipulation. However, I have heard that the SAS update statement is much more efficient than SQL for updates, so I wrote a test case to verify.

Sure enough, the SAS update statement is several orders of magnitude faster when updating roughly 5,000 rows from one table to a 100,000 row table.

data sample ;
  do id = 1 to 100000 ;
    lname = 'Smith' ;
    output ;
  end ;
run ;

data upd ;
  set sample ;
    if ranuni( 1 ) <= .05 then do ;
      lname = 'Jones' ;
      output ;
    end ;
run ;

data sample ;
  update sample upd  ;
    by id ;
run ;

proc sql ;
  update sample s
    set lname = 
      ( select   lname 
          from   upd u
          where  u.id = s.id 
      )  
      where id in
      ( select   id
          from   upd u 
      ) ;
quit ;

proc sql ;
  update sample s
    set lname = 
      ( select   lname 
          from   upd u
          where  u.id = s.id 
      )  
      where exists
      ( select   * 
          from   upd u 
          where  u.id = s.id 
      ) ; 
quit ;

2 comments:

UnknownMay 30, 2015 at 12:15 AM
Thank you so much for sharing this wonderful information. I find this very useful and informative. keep up this good work.

SAS training in Chennai
ReplyDelete
Replies
UnknownJune 14, 2017 at 2:18 AM
Thats a big difference!
The SQL-code is explicit about which column is being updated, and which column the values come from. On the other hand, the data step is implicitly referring to all columns (I guess). How do I make the data step explicit about which columns are being updated from which source column?
ReplyDelete
Replies

Add comment

KanSAS Code

Pages

SyntaxHighlighter

Saturday, February 28, 2015

SAS vs SQL Update

2 comments: