Friday, June 7, 2013

Use Google Trends and SAS to select movies to watch


The newest success story in data science is that Google search predicts box office with 94 percent accuracy. I am a frequent moviegoer, and it would be great to put Google's impressive research result to use.
There are quite a few offerings this summer. For now I am considering five upcoming movies.
Title                   Date
This is the End         Wednesday, June 12
World War Z             Friday, June 21
Man of Steel            Friday, June 14
Monsters University     Friday, June 21
The Internship          Friday, June 7

Google Trends reflects which keywords people are searching for, and it is a free and reliable data source. Let's do some scripting in SAS to generate the query URL for an HTTP GET request.
/* Store one movie title per row */
data one;
    input @1 title $25.;
    cards;
This is the End
World War Z
Man of Steel
Monsters University
The Internship
;
run;

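/* Split each title into words; append an encoded comma (%2C)
   after every title except the last one */
data two(where=(missing(word)=0));
    set one nobs = nobs;
    if _n_ ne nobs then
        title1 = cat(title, "%2C");
    else title1 = title;
    /* No title has more than 10 words; blank results are dropped by WHERE= */
    do i = 1 to 10;
        word = scan(title1, i, " ");
        output;
    end;
    keep word;
run;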

/* Join all words into one macro variable, using %20 (an encoded space) as the separator */
proc sql noprint;
    select word into: string separated by "%20"
    from two;
quit;

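/* Assemble the full URL; single quotes keep &geo, &date and &cmpt from being read as macro variables */
data three;
    length fullstring $500.;
    fullstring = cats("http://www.google.com/trends/explore?q=", "&string", '&geo=US&date=today%203-m&cmpt=q');
run;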

/* Print the URL so it can be copied into a browser */
proc print data=three;
run;
SAS prints the resulting URL. Once I paste the URL into a browser, the chart suggests that the box office winners are going to be Man of Steel and World War Z. That makes my choice easier: I will surely not miss the two hottest movies.
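
As a side note, the word-by-word encoding could probably be shortened with the URLENCODE function, which percent-encodes a string (a blank becomes %20). The following is only a minimal sketch, not the code I ran above: the dataset name movies and the variables query and url are made up for illustration, and the titles are joined with a plain comma so that the encoding of the separator is left entirely to URLENCODE.

```sas
/* Hypothetical dataset name; same five titles as above */
data movies;
    input @1 title $25.;
    datalines;
This is the End
World War Z
Man of Steel
Monsters University
The Internship
;
run;

data _null_;
    set movies end=last;
    length query $200 url $500;
    retain query;
    /* CATX strips blanks and skips the still-missing query on the first row */
    query = catx(",", query, title);
    if last then do;
        /* URLENCODE percent-encodes the whole comma-separated list;
           single quotes keep &geo, &date and &cmpt away from the macro processor */
        url = cats("http://www.google.com/trends/explore?q=",
                   urlencode(strip(query)),
                   '&geo=US&date=today%203-m&cmpt=q');
        put url;   /* writes the URL to the SAS log */
    end;
run;
```

The URL ends up in the log instead of the output window, so it can be copied from there into the browser in the same way.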

1 comment:

  1. Nice.
    I wonder if there is a way of downloading the data to a sas dataset.
