It is always best to allow a remote database to handle as much work as possible and return as little data as possible over the network. SQL is a very universal language used by most, if not all database systems.
Recently I had occasion to extract data from SQL Server and found it most efficient to utilize its system tables to do just that. In my case, I wanted both the total record count as well as the number of character and numeric data types. While this information is available using SAS dictionary tables from an ODBC libname statement, it was taking up to four minutes to return the data.
The code below creates two derived tables that contain the desired information. The problem was that there is not a common key to perform the required join. While a cross-join is possible, those joins in PROC SQL write a WARNING to the SAS log which is not allowed in my parselog macro.
The solution was to join the derived table using the ON 1 = 1 syntax. This worked and returned the correct answer in under one second which is much faster than the four minutes using a libname.
proc sql; connect to odbc(dsn=mydsn authdomain="myauth"); create table results as select totalrows , char_cnt , num_cnt from connection to odbc( select cnt.totalrows , col.char_cnt , col.num_cnt from ( select sum(p.rows) as totalrows from sys.tables t join sys.partitions p on t.object_id = p.object_id and p.index_id IN (0, 1) where t.name = 'balancechanges' ) as cnt join ( select sum(iif(ty.name in('nvarchar', 'varchar', 'char' , 'nchar', 'text'), 1, 0)) as char_cnt , sum(iif(ty.name in('nvarchar', 'varchar', 'char' , 'nchar', 'text'), 0, 1)) as num_cnt from sys.tables t join sys.columns c on t.object_id = c.object_id join sys.types ty on c.user_type_id = ty.user_type_id where t.name = 'balancechanges' ) as col on 1 = 1 ); disconnect from odbc; quit;
No comments:
Post a Comment