Wednesday, July 27, 2011

SAS dataset declassified by Matt Shotwell


Matt Shotwell’s new R package ‘sas7bdat’ is a great achievement in bridging SAS and R. Earlier this year Revolution R, a commercial competitor to SAS, launched an RxSasData() function to read SAS’s proprietary ‘sas7bdat’ data format. However, we prefer the free lunch provided by community R.

Now R has free access to SAS’s datasets, including many of SAS’s own help datasets. With R we can do a lot of tricks with those datasets, in many areas that SAS can’t reach or for which we haven’t paid for a license. For example, SAS ships a SASHELP.LAKE dataset to demonstrate its surface plot feature. We can use R to read it directly and draw a picture that combines a contour plot and a surface plot.


library('sas7bdat')
library('lattice')
x = read.sas7bdat('c:/program files/sas/sasfoundation/9.2/graph/sashelp/lake.sas7bda')

# Custom 3-D panel: draw the wireframe, then project contour lines onto the top of the box
panel.3d.contour <-
    function(x, y, z, rot.mat, distance,
             nlevels = 20, zlim.scaled, ...) {
        add.line <- trellis.par.get("add.line")
        # Draw the surface itself
        panel.3dwire(x, y, z, rot.mat, distance,
                     zlim.scaled = zlim.scaled, ...)
        # Compute contour lines from the same grid
        clines <-
            contourLines(x, y, matrix(z, nrow = length(x), byrow = TRUE),
                         nlevels = nlevels)
        # Rotate each contour line into the 3-D view and draw it at the upper z limit
        for (ll in clines) {
            m <- ltransform3dto3d(rbind(ll$x, ll$y, zlim.scaled[2]),
                                  rot.mat, distance)
            panel.lines(m[1,], m[2,], col = add.line$col,
                        lty = add.line$lty, lwd = add.line$lwd)
        }
    }

wireframe(-Depth ~ Length * Width, x, panel.aspect = 0.6,
          panel.3d.wireframe = "panel.3d.contour", shade = TRUE,
          screen = list(z = -30, x = 50), lwd = 0.01,
          xlab = "Length", ylab = "Width",
          zlab = "Depth")
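For readers without a SAS installation, the two ingredients of the picture (a long-form grid for wireframe() and contour lines from contourLines()) can be tried on R's built-in volcano matrix. This is only a minimal sketch under that assumption: the names grid, clines and p are illustrative, and it draws a plain shaded surface rather than the custom panel above.

```r
library(lattice)

# Reshape the built-in volcano matrix (87 x 61) into long form, one row per grid cell
grid <- expand.grid(row = seq_len(nrow(volcano)),
                    col = seq_len(ncol(volcano)))
grid$height <- as.vector(volcano)  # column-major order matches expand.grid()

# Contour lines from the same base function that the custom panel calls
clines <- contourLines(seq_len(nrow(volcano)), seq_len(ncol(volcano)),
                       volcano, nlevels = 20)

# Plain shaded surface with the same viewing angle as the lake plot
p <- wireframe(height ~ row * col, data = grid, shade = TRUE,
               panel.aspect = 0.6, screen = list(z = -30, x = 50))
print(p)
```

Each element of clines is a list with a level and the x and y coordinates of one contour line, which is what the panel function rotates into the 3-D view.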


I also ran a small test to evaluate the speed of the read.sas7bdat() function. Reading the SAS dataset SASHELP.LAKE 30 times took only 1.64 seconds on my 3-year-old desktop, which is certainly much faster than converting it to a CSV file and importing that.

library('sas7bdat')
# Time n consecutive reads of the same dataset
test <- function(n = 30) {
    system.time(
        for (i in 1:n)
            read.sas7bdat('c:/program files/sas/sasfoundation/9.2/graph/sashelp/lake.sas7bda')
    )
}
test()

   user  system elapsed
   1.60    0.05    1.64
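The CSV route was not timed here. One way to put a number on its import half is a small helper that times any reader function in the same way. The sketch below is an assumption-laden stand-in: time_reads is a hypothetical helper, and the built-in mtcars data replaces the lake dataset, so the result is not comparable with the figure above and leaves out the time SAS would need to export the CSV.

```r
# Hypothetical helper: time n repeated calls of a reader function on one file
time_reads <- function(reader, path, n = 30) {
  system.time(for (i in seq_len(n)) reader(path))
}

# Stand-in data: write a built-in data frame to a temporary CSV file
csv_path <- tempfile(fileext = ".csv")
write.csv(mtcars, csv_path, row.names = FALSE)

# Time 30 reads of the CSV file with base R's read.csv()
csv_time <- time_reads(read.csv, csv_path)
print(csv_time)

# With the sas7bdat package and a real file, the same helper would apply:
# time_reads(sas7bdat::read.sas7bdat, "path/to/file.sas7bdat")
```

Passing the reader as an argument keeps the loop identical for both formats, so any difference in the timings comes from the readers themselves.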
I hope Matt continues to improve this wonderful package:
  1. add support for SAS datasets generated by 64-bit systems;
  2. add a write.sas7bdat() function (that will be so cool!).
