Thursday, March 15, 2012

Make a frequency function in SAS/IML


Aggregation is probably the most common operation in data work. R comes with a handy table() function for it, and in SAS the FREQ procedure usually handles the job. It would be great if SAS/IML had an equivalent function, so I wrote a user-defined function (a module) for this purpose. Because it relies on a DO loop, it is not very efficient: on a simulated data set of one million records it was consistently about 10 times slower than PROC FREQ.

/* 1 - Use IML for simulation and aggregation */
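proc iml;
   /* freq: return a two-column matrix of unique levels and their counts */
   start freq(invec);
      x = t(unique(invec));     /* sorted unique values as a column */
      y = repeat(x, 1, 2);      /* column 1 = level, column 2 = placeholder */
      do i = 1 to nrow(x);
         /* count the elements of invec equal to the i-th level */
         y[i, 2] = ncol(loc(invec = y[i, 1]));
      end;
      return(y);
   finish;
   store module = freq;         /* save the module for later sessions */
quit;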

proc iml;
   load module = freq;
   /* Simulate a vector with 1 million values */
   test = abs(floor(rannor(1:1e6)*100));
   /* Time the aggregation */
   t0 = time();
   result = freq(test);
   timer = time() - t0;
   print timer;
   /* Output the result matrix as a SAS data set with named columns */
   create a from result[colname = {"level" "frequency"}];
   append from result;
   close a;
quit;

proc sgplot data = a;
   series x = level y = frequency;
run;
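
As a quick sanity check of what the module returns, here is a minimal sketch on a tiny made-up vector. The vector v and the matrix name counts are mine, for illustration only, and the sketch assumes the freq module has already been stored by the first PROC IML step above.

```sas
proc iml;
   load module = freq;
   /* Hypothetical six-element input, not part of the simulation */
   v = {3 1 2 3 3 1};
   counts = freq(v);
   /* Expected rows (level, frequency): (1, 2), (2, 1), (3, 3) */
   print counts[colname = {"level" "frequency"}];
quit;
```

The levels come back sorted in ascending order because UNIQUE returns the distinct values sorted.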

/* 2 - Use the DATA step for simulation and PROC FREQ for aggregation */
data test;
   /* Simulate 1 million values, as in the IML step */
   do i = 1 to 1e6;
      test = abs(floor(rannor(1234)*100));
      output;
   end;
   drop i;
run;

/* FULLSTIMER writes detailed timing information to the log */
options fullstimer;
proc freq data = test noprint;
   table test / out = b;
run;
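
For readers on SAS/IML 9.3 or later, the built-in TABULATE call may be worth trying as a loop-free way to get the same two-column result. The following is only a sketch under that version assumption: the small vector v is made up, and I have not timed this approach against either the module or PROC FREQ.

```sas
proc iml;
   /* Hypothetical six-element input for illustration */
   v = {3 1 2 3 3 1};
   /* TABULATE returns the sorted unique levels and their counts as row vectors */
   call tabulate(levels, counts, v);
   /* Combine them into the same level/frequency layout the freq module returns */
   result = t(levels) || t(counts);
   /* Expected rows (level, frequency): (1, 2), (2, 1), (3, 3) */
   print result[colname = {"level" "frequency"}];
quit;
```

Swapping v for the one-million-value test vector and wrapping the call between two time() calls, as in the first part, would give a like-for-like timing.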
