Thursday, July 7, 2011

Using SAS/IML for high performance VaR backtesting


Thanks to Rick’s new book about SAS/IML, I realized that SAS/IML is far beyond PROC IML, and recently it added an advanced IML language called IMLPlus that enjoys many new features. I also start to play with SAS/IML Studio, an IDE exclusively for IMPlus in addition to the SAS enhanced editor we are familiar with (Interestingly, a number of emerging IDEs for matrix languages prefer to use names like XXX Studio. For example, a hot IDE of R is called RStudio). Although SAS users/programmers are notably conservative, I had a rough intuition that learning SAS/IML will be helpful in some occasions of high-demanding computation.

For instance, backtesting of Value-at-risk(VaR) requires a lot of computer resource. At the beginning, VaR based on different methods should be calculated. Jon Danielsson introduced how to use Matlab and R to find VaRs by four ways: exponential moving average, moving average, historical simulation, and generalized autoregressive conditional heteroskedasticity (GARCH), separately [Ref. 1]. And he maintains a daily-updating plot on his website to show the trends [Ref. 2]. As Jon described, if a 4000-day data is fetched and a 1000-day rolling window for calculation is applied, it means that 3000 loops will be needed to have the VaRs , which would make the computation efficiency a concern. Fortunately, we can easily translate the codes by Matlab and R into SAS codes through PROC IML. If there is any hurdle in this porting process, I can always find tips from Rick’s resourceful blog [Ref.3]. The first three chapters of his book are also good reference [Ref. 4].

That is what I did. First I have to use R to download S&P 500 daily returns from 1995-08-09 to 2011-06-28, total 4000 observations. SAS/IML Studio can call R’s function. However, the R on my laptop is version 2.13.0, which is not supported by SAS/IML Studio currently. So I directly used RStudio, my favorite IDE for R, and saved the data as CSV.

######################(0) DOWNLOAD STEP###############################
library("tseries")
p = get.hist.quote(instrument = "^gspc",
Start = "1995-08-09",end="2011-06-28",quote="AdjClose",quiet=T)
y=diff(log(p))
nrow(y)
result=as.data.frame(y)
write.csv(result,file='c:/tmp/result.csv')
####################END OF STEP 0#####################################
Then I used PROC IML to calculate VaRs by exponential moving average, moving average and historical simulation, and eventually inserted them into a macro. Unlike R or Matlab, SAS depends on the procedures rather than the functions. Since PROC IML cannot call other SAS’s procedures, I used PROC FCMP to make PROC AUTOREG as a user-defined function for GARCH. SAS now provides virtual hard disk with physical memory specified by the MEMLIB command, which is indeed a big help for me to slash the computer’s operation time. After all, the total time to run the program below on my 2-year-old laptop is 9 minutes, which is acceptable.

**********************(1) INPUT STEP*********************************;
***********(1.1) INPUT DAILY RETURN DATA FROM R**********************;
data _tmp01;
infile "c:\tmp\result.csv" delimiter = ',' missover dsd
lrecl=32767 firstobs=2;
format date yymmdd10. r best32.;
informat date yymmdd10. r best32.;
input date $ r;
run;

***********(1.2) LOAD DATA INTO MEMORY LIBRARY***********************;
libname mem 'd:\tmp' memlib;
data mem._tmp01;
set _tmp01(keep=r);
run;
********************END OF STEP 1************************************;

**********************(2) MODULE-BUILDING STEP***********************;
***********(2.1) CREATE A FUNCTION FOR 3 TYPES OF VARS BY AVERAGING**;
%macro var123_macro();
proc iml;
lambda = 0.94;
use mem._tmp01;
read all var{r} into y;
close mem._tmp01;
t1 = &start - 1000;
t2 = &start - 1;
window = y[t1:t2];
start Var(A);
n = nrow(A);
return( (A-A[:,])[##,]/(n-1) );
finish Var;
if exist("mem.b") then do;
use mem.b;
read all var{sl1} into sl1;
close mem.b;
end;
else do;
sl1 = Var(y[1:30]);
do t = 2 to 1000;
sl1 = lambda * sl1 + (1-lambda) * y[t-1]**2;
end;
end;
sl1 = lambda * sl1 + (1-lambda) * y[&start-1]**2;
var1 = -probit(&pvalue) * sqrt(sl1) * &value;
call symput('var1', left(char(var1)));
create mem.b var{sl1};
append;
close mem.b;
call sort(window, {1});
var2 = -sqrt(Var(window)) * probit(&pvalue) * &value;
var3 = -window[1000*&pvalue] *&value;
call symput('var2', left(char(var2)));
call symput('var3', left(char(var3)));
quit;
%mend var123_macro;

proc fcmp outlib = work.myfunclib.finance;
subroutine var123(start, pvalue, value, var1, var2, var3);
outargs var1, var2, var3;
rc1 = run_macro('var123_macro', start, pvalue, value, var1, var2, var3);
endsub;
quit;

***********(2.2) CREATE A FUNCTION FOR VAR BY GARCH******************;
%macro var4_macro();
ods select none;
%let start = %eval(&start - 1000);
%let end = %eval(&start + 999);
proc autoreg data = mem._tmp01(firstobs = &start obs = &end);
model r = / noint garch=(q=1, p=1);
output out = mem._gar01 ht = ht;
run;
data _null_;
set mem._gar01 end = eof;
if eof then do;
var4 = -sqrt(ht) * probit(&pvalue) * &value;
call symput('var4', var4);
end;
run;
ods select all;
%mend;

proc fcmp outlib = work.myfunclib.finance;
function var4(start, pvalue, value);
rc1 = run_macro('var4_macro', start, pvalue, value, var4);
return(var4);
endsub;
quit;
********************END OF STEP 2************************************;

**********************(3) IMPLEMENTATION STEP************************;
options cmplib = (work.myfunclib);
data mem.one;
pvalue = 0.01;
value = 1000;
do i = 1001 to 4000;
call var123(i, pvalue, value, var1, var2, var3);
var4 = var4(i, pvalue, value);
output;
end;
run;
********************END OF STEP 3************************************;
The four VaRs can be displayed in a time series plot. Then the backtests can be conducted by Bernoulli coverage test and independence coverage test.

**********************(4) VISUALIZATION STEP*************************;
data _tmp02;
set _tmp01;
i = _n_;
run;

data _tmp03(where=(i>1000));
merge _tmp02 mem.one;
by i;
label var1 = "Exponential Moving Average"
var2 = "Moving Average"
var3 = "Historical Simulation"
var4 = "GARCH";
run;

ods html gpath = "c:\tmp" style = harvest;
proc sgplot data = _tmp03;
series x = date y = var1;
series x = date y = var2;
series x = date y = var3;
series x = date y = var4;
format var1 dollar8.2;
yaxis grid label = 'Value-at-risk ';
xaxis label = ' ';
run;
ods html close;
********************END OF STEP 4************************************;
The enhanced version of PROC IML, IMLPlus, can call SAS’s procedures within it. Rick gave a number of illustrations in the 4, 12-14 chapters of his book [Ref.4]. The GARCH part above can be re-written in SAS/IML Studio like:

value = 1000;
pvalue = 0.01;
var4 = j(4000, 1);

do i = 1001 to 4000;
start = i - 1000;
end = i - 1;
submit start end;
proc autoreg data = sasuser._tmp01(firstobs = &start obs = &end);
model r = / noint garch=(q=1, p=1);
output out = _gar01 ht = ht;
quit;
endsubmit;
use _gar01;
read all var{ht} into ht;
close _gar01;
x = ht[1000];
getvar4 = -sqrt(x)*probit(pvalue)*value;
var4[i, 1] = getvar4;
end;

create sasuser.garch from var{var4};
append;
close sasuser.garch;
In conclusion, I had a lot of fun in studying SAS/IML this summer. SAS/IML Studio is a friendly environment to write code. And IMLPlus provides a different perspective to think about SAS programming.

References:
1. Jon Danielsson. ‘Financial Risk Forecasting’. Wiley, 2011.
2. Jon Danielsson . http://www.financialriskforecasting.com/results
3. Rick Wicklin. The Do Loop. http://blogs.sas.com/iml/
4. Rick Wicklin. ’ Statistical Programming with SAS/IML Software’. SAS Publishing, 2010.

No comments:

Post a Comment