Friday, September 28, 2012

SAS and VBA (4) : fill missing values with last valid observation

In many data management routines, it is common to fill the missing values with the last valid one. For example, we want to maintain the patient visit log about several patients, which records their weight for each visit. Given these patients’ absence for the appointments, the data analyst has to fill the the empty weight value with the last valid observation. This log includes three columns: patient ID, visit ID and weight.
ptno visit weight
1 1 122
1 2
1 3
1 4 123
2 1 156
2 3
3 1 112
3 2
4 1 125
4 2
4 3

VBA 
VBA is quite flexible at those occasions. If the cell has missing value, we can assign a R1C1 formula to the cell to obtain non-missing value directly from its top neighboring cell. As the result, the logic is a simple one-sentence clause.
Sub Locf()
' If a cell in the 3rd column is blank then fill with the previous non-missing value
Range("C1").CurrentRegion.SpecialCells(xlCellTypeBlanks).FormulaR1C1 = "=R[-1]C"
' Format patient ID with 000
Columns("A").NumberFormat = "000"
End Sub

SAS

In SAS, we need to set up a temporary variable in a DATA STEP to memorize the valid value by the RETAIN statement. Then a conditional structure is used to exchange the values between the weight variable and the temporary variable.
data patient;  
input @1 ptno @3 visit @5 weight;
infile datalines missover;
cards;
1 1 122
1 2
1 3
1 4 123
2 1 156
2 3
3 1 112
3 2
4 1 125
4 2
4 3
;;;
run;

data result;
set patient;
* Format patient ID with 000;
format ptno z3. ;
retain tempvar 0;
if missing(weight) = 1 then weight = tempvar;
else tempvar = weight;
drop tempvar;
run;

Conclusion
SAS is a procedural language, while VBA enjoy its power based on its many objects and properties. However, one common thing in writing good codes for both of them is to avoid the unnecessary explicit loops.

1 comment:

  1. if the first obs of a patient is missing then it carries forward the previous patients last obs (filled or not).

    ReplyDelete