Thursday, June 27, 2013

The complexity for Fibonacci numbers in SAS

Fibonacci numbers have a lot of applications. There have been a lot of interesting researches regarding Fibonacci numbers since 800 years ago. Mathematically they are the numbers from a sequence, which is defined as F_n = F_{n-1} + F_{n-2}\hspace{5}with\hspace{5}F_0 =0, \hspace{2} F_1=1.

Experiment and Result

Fibonacci numbers can be derived from either the original definition that is a typical example of recursion, or a mathematically simplified form that is called closed-from expression. In SAS, a user-defined Fib1 function in PROC FCMP can realize the recursion expression, and another Fib2 function is created to express the closed-form expression. I insert both functions to a loop from 10 to 40 with step size of 5 and record the real time costed.
According to the figure above, both algorithms have no actually difference while N is relative small. When N becomes 30 or bigger, the recursion expression function spends a lot of system time. When N is even greater than 40, SAS enters a freezing status and I have to break from the command manually. Thus, the curve for the recursion expression seems to fit an exponential relationship fairly well. On the contrary, the algorithm of closed-from expression takes almost constant time for the execution,

Complexity

  • Recursion expression
    T(n) = T(n-1) + T(n-2) + O(1), which may imply O(2^{n-1}) + O(2^{n-2}) + O(1) = O(2^n).
  • Closed-form expression
    The equation is fairly straightforward and described as F_n= {\phi^n-(-\phi)^{-n}\over\sqrt{5}}\hspace{2}with\hspace{2} \phi={(1+\sqrt{5})\over2}.  Here \phi is the well-known golden ratio. Its complexity is a constant, which suggests O(n) = O(n-1) = O(1).

Conclusion

Mathematics sometimes significantly decreases the complexity of the computation, in this case, from O(2^n) to O(1). It is also useful for us to estimate how much resources are needed for specific purpose.

****(1) Bulid user-defined functions *****************;
proc fcmp outlib = work.test.fibonacci;
* Create the 1st function based on recursion;
function fib1(n);
if n<0 then return(0);
else if n = 1 or n = 2 then return(1);
else return(fib1(n-1) + fib1(n-2));
endsub;
* Create the 2nd function based on closed form;
function fib2(n);
phi = (1 + sqrt(5))/2;
return(round((phi**n - (1-phi)**n) / sqrt(5)));
endsub;
quit;

****(2) Run the tests sequentially********************;
options cmplib = work.test fullstimer;
%macro test();
* Create datasets for verification
%do j = 10 %to 40 %by 5;
data _fib1_&j;
do i = 1 to &j;
x = fib1(i);
output;
end;
%put Recursion method at &j;
run;
data _fib2_&j;
do i = 1 to &j;
x = fib2(i);
output;
end;
%put Closed form method at &j;
run;
%end;
%mend;

* Output SAS log to text file;
proc printto log = "c:\tmp\test.log" new;
run;
%test()
proc printto;
run;

****(3) Display the results***************************;
proc datasets nolist;
delete _:;
quit;

data one;
infile "c:\tmp\test.log" truncover;
input @1 line $80.;
if index(line, 'real') > 0;
run;

data two;
set one;
length method $20.;
retain number 5;
* Ignore the time used by PROC PRINTTO;
if _n_ > 1;
if mod(_n_ , 2) = 0 then do;
method = "Recursion";
number + 5;
end;
else method = "Closed form";
time = input(substr(line, 21, 5), 5.2);
run;

proc sgplot data = two;
series x = number y = time / group = method;
yaxis label = "Time elapsed by seconds" grid;
run;

No comments:

Post a Comment