The newest data science success story is that Google search predicts box office results with 94 percent accuracy. I go to the movie theater often, and it would be great to put Google's impressive research result to use.
There are quite a few offerings this summer. For now I am considering five upcoming movies.
Title | Date |
---|---|
This is the End | Wednesday, June 12 |
World War Z | Friday, June 21 |
Man of Steel | Friday, June 14 |
Monsters University | Friday, June 21 |
The Internship | Friday, June 7 |
Google Trends reflects which keywords people are searching for, and it is a reliable and free data source. Let's use SAS to do some scripting work and generate the URL query based on the GET method.
data one;
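  input @1 title $25.;
  cards;
This is the End
World War Z
Man of Steel
Monsters University
The Internship
;;;
run;

/* Split each title into words; every title except the last gets an encoded comma (%2C) */
data two(where=(missing(word)=0));
  set one nobs = nobs;
  if _n_ ne nobs then
    title1 = cat(title, "%2C");
  else title1 = title;
  /* Scan up to 10 words per title; the WHERE= option drops the empty ones */
  do i = 1 to 10;
    word = scan(title1, i, " ");
    output;
  end;
  keep word;
run;

/* Join all words with an encoded space (%20) into the macro variable STRING */
proc sql noprint;
  select word into: string separated by "%20"
  from two;
quit;

/* Assemble the full Google Trends URL */
data three;
  length fullstring $500.;
  fullstring = cats("http://www.google.com/trends/explore?q=", "&string", '&geo=US&date=today%203-m&cmpt=q');
run;

proc print;
run;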
From SAS, I print the resulting URL. Once I paste the URL into a browser, the chart clearly suggests that the box office winners are going to be Man of Steel and World War Z. That makes my choice easier: I will surely not miss the two hottest movies.
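The same URL could probably be built in a single DATA step instead of splitting the titles into words first. What follows is only a minimal sketch, assuming the data set `one` created earlier. The variable names `q` and `url` are hypothetical, chosen for illustration, and the URL pattern is simply the one used in this post.

```sas
/* Sketch: build the Google Trends URL in one pass over data set ONE. */
/* Q and URL are illustrative names, not part of the original program. */
data _null_;
  set one end=last;
  length q $400 url $500;
  retain q;
  /* Encode blanks inside a title as %20, then join titles with %2C. */
  /* CATX skips the blank initial value of Q, so no leading separator appears. */
  q = catx('%2C', q, tranwrd(strip(title), ' ', '%20'));
  if last then do;
    /* Single quotes keep the macro processor from touching & and % */
    url = cats('http://www.google.com/trends/explore?q=', q,
               '&geo=US&date=today%203-m&cmpt=q');
    put url=;  /* write the finished URL to the SAS log */
  end;
run;
```

This version writes the URL to the log instead of printing a data set, and it avoids the intermediate macro variable, so the `%20` and `%2C` sequences never pass through macro resolution.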
Nice.
I wonder if there is a way of downloading the data to a SAS dataset.