Visualizing Movies Gross Income

[This article was first published on Jkunst - R category, and kindly contributed to R-bloggers].

Shh... this post is an excuse to test the brand new subtitles and
captions in #ggplot2, powered by @hrbrmstr.

The recent (I remember this movie like it was yesterday) SW7 ($930,901,726 gross income)
and the not so standard Deadpool ($329,397,732) are top 1 and top 7 (and climbing) in
terms of gross income according to the http://www.boxofficemojo.com/ site.
Have you ever asked yourself how much gross income movies produce? A lot, I guess!
Which movies are the most successful in a particular saga? I don't know, so let's write
some code to scrape the data and find out, because http://www.boxofficemojo.com/ has
all this data and we’re here to visualize it.

Data

We’ll extract the (US only) gross income for the top 200 movies (you can get more if
you want to test the visualizations with 1000 movies) and then, for each movie, extract
the daily chart section, which contains the gross income per day for every day since
the release date! This is just fantastic. So here we go!!
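
Since every movie needs its own page request, it helps to keep each result on disk so an interrupted run can resume without fetching everything again. Here is a minimal sketch of that idea in base R. The helper `cached()` and the stand-in `fake_fetch()` are hypothetical names used only for illustration; no real request is made.

```r
# Hypothetical helper: return the value stored for `key`, computing and
# saving it only the first time it is requested.
cached <- function(key, compute, dir) {
  path <- file.path(dir, paste0(key, ".rds"))
  if (file.exists(path)) {
    return(readRDS(path))
  }
  value <- compute()
  saveRDS(value, file = path)
  value
}

# A fresh directory so the example never picks up files from earlier runs.
cache_dir <- tempfile("boxoffice-cache-")
dir.create(cache_dir)

# Stand-in for a slow network request; counts how often it really runs.
n_requests <- 0
fake_fetch <- function() {
  n_requests <<- n_requests + 1
  data.frame(box_id = "starwars7", stringsAsFactors = FALSE)
}

first  <- cached("starwars7", fake_fetch, cache_dir)
second <- cached("starwars7", fake_fetch, cache_dir)  # read back from disk
```

The second call never reaches `fake_fetch()`, so `n_requests` stays at 1.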

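<span class="c1">#### scrape ####
</span><span class="c1"># packages inferred from the functions used below
</span><span class="n">library</span><span class="p">(</span><span class="n">dplyr</span><span class="p">)</span>
<span class="n">library</span><span class="p">(</span><span class="n">purrr</span><span class="p">)</span>
<span class="n">library</span><span class="p">(</span><span class="n">stringr</span><span class="p">)</span>
<span class="n">library</span><span class="p">(</span><span class="n">rvest</span><span class="p">)</span>
<span class="n">library</span><span class="p">(</span><span class="n">lubridate</span><span class="p">)</span>
<span class="n">library</span><span class="p">(</span><span class="n">scales</span><span class="p">)</span>
<span class="n">library</span><span class="p">(</span><span class="n">ggplot2</span><span class="p">)</span>
<span class="n">library</span><span class="p">(</span><span class="n">RImagePalette</span><span class="p">)</span>

<span class="n">url</span> <span class="o"><-</span> <span class="s2">"http://www.boxofficemojo.com/alltime/domestic.htm"</span>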

<span class="n">urls</span> <span class="o"><-</span> <span class="n">paste0</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">sprintf</span><span class="p">(</span><span class="s2">"?page=%s&p=.htm"</span><span class="p">,</span> <span class="m">1</span><span class="o">:</span><span class="m">2</span><span class="p">))</span>

<span class="n">dfmovie</span> <span class="o"><-</span> <span class="n">map_df</span><span class="p">(</span><span class="n">urls</span><span class="p">,</span> <span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">){</span>
  <span class="c1"># x <- sample(size = 1, urls)
</span>  <span class="n">urlmovie</span> <span class="o"><-</span> <span class="n">read_html</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="o">%>%</span> 
    <span class="n">html_nodes</span><span class="p">(</span><span class="s2">"table table tr a"</span><span class="p">)</span> <span class="o">%>%</span>
    <span class="n">html_attr</span><span class="p">(</span><span class="s2">"href"</span><span class="p">)</span> <span class="o">%>%</span> 
    <span class="err">.</span><span class="p">[</span><span class="n">str_detect</span><span class="p">(</span><span class="err">.</span><span class="p">,</span> <span class="s2">"movies"</span><span class="p">)]</span>
  
  <span class="n">read_html</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="o">%>%</span> 
    <span class="n">html_nodes</span><span class="p">(</span><span class="s2">"table table"</span><span class="p">)</span> <span class="o">%>%</span> 
    <span class="n">html_table</span><span class="p">(</span><span class="n">fill</span> <span class="o">=</span> <span class="n">TRUE</span><span class="p">)</span> <span class="o">%>%</span> 
    <span class="err">.</span><span class="p">[[</span><span class="m">4</span><span class="p">]]</span> <span class="o">%>%</span> 
    <span class="n">tbl_df</span><span class="p">()</span> <span class="o">%>%</span> 
    <span class="err">.</span><span class="p">[</span><span class="m">-1</span><span class="p">,</span> <span class="p">]</span> <span class="o">%>%</span> 
    <span class="n">setNames</span><span class="p">(</span><span class="n">c</span><span class="p">(</span><span class="s2">"rank"</span><span class="p">,</span> <span class="s2">"title"</span><span class="p">,</span> <span class="s2">"studio"</span><span class="p">,</span> <span class="s2">"gross"</span><span class="p">,</span> <span class="s2">"year"</span><span class="p">))</span> <span class="o">%>%</span> 
    <span class="n">mutate</span><span class="p">(</span><span class="n">url_movie</span> <span class="o">=</span> <span class="n">urlmovie</span><span class="p">)</span>
  
<span class="p">})</span> 

<span class="n">dfmovie</span> <span class="o"><-</span> <span class="n">dfmovie</span> <span class="o">%>%</span> 
  <span class="n">mutate</span><span class="p">(</span><span class="n">year</span> <span class="o">=</span> <span class="n">str_extract</span><span class="p">(</span><span class="n">year</span><span class="p">,</span> <span class="s2">"\\d+"</span><span class="p">),</span>
         <span class="n">year</span> <span class="o">=</span> <span class="n">as.numeric</span><span class="p">(</span><span class="n">year</span><span class="p">),</span>
         <span class="n">have_release</span> <span class="o">=</span> <span class="n">str_detect</span><span class="p">(</span><span class="n">url_movie</span><span class="p">,</span> <span class="s2">"releases"</span><span class="p">),</span>
         <span class="n">box_id</span> <span class="o">=</span> <span class="n">str_extract</span><span class="p">(</span><span class="n">url_movie</span><span class="p">,</span> <span class="s2">"id=.*"</span><span class="p">),</span>
         <span class="n">box_id</span> <span class="o">=</span> <span class="n">str_replace_all</span><span class="p">(</span><span class="n">box_id</span><span class="p">,</span> <span class="s2">"^id=|\\.htm$"</span><span class="p">,</span> <span class="s2">""</span><span class="p">))</span>

<span class="n">dfmovie2</span> <span class="o"><-</span> <span class="n">map_df</span><span class="p">(</span><span class="n">dfmovie</span><span class="o">$</span><span class="n">box_id</span><span class="p">,</span> <span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">){</span>
  <span class="c1"># x <- "starwars2"
</span>  <span class="c1"># x <- sample(dfmovie$box_id, size =1); 
</span>  <span class="n">message</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
  
  <span class="k">if</span> <span class="p">(</span><span class="n">file.exists</span><span class="p">(</span><span class="n">sprintf</span><span class="p">(</span><span class="s2">"data/%s-p2.rds"</span><span class="p">,</span> <span class="n">x</span><span class="p">)))</span> <span class="p">{</span>
    <span class="c1"># I always have connection issues, so to avoid
</span>    <span class="c1"># losing data I save each result to disk.
</span>    <span class="n">dfm</span> <span class="o"><-</span> <span class="n">readRDS</span><span class="p">(</span><span class="n">sprintf</span><span class="p">(</span><span class="s2">"data/%s-p2.rds"</span><span class="p">,</span> <span class="n">x</span><span class="p">))</span>
    <span class="k">return</span><span class="p">(</span><span class="n">dfm</span><span class="p">)</span>
  <span class="p">}</span>
  
  <span class="n">html</span> <span class="o"><-</span> <span class="n">sprintf</span><span class="p">(</span><span class="s2">"http://www.boxofficemojo.com/movies/?page=main&id=%s.htm"</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span> <span class="o">%>%</span> 
    <span class="n">read_html</span><span class="p">()</span>
  
  <span class="n">img_url</span> <span class="o"><-</span> <span class="n">html</span> <span class="o">%>%</span> 
    <span class="n">html_nodes</span><span class="p">(</span><span class="s2">"table table table img"</span><span class="p">)</span> <span class="o">%>%</span> 
    <span class="err">.</span><span class="p">[[</span><span class="m">1</span><span class="p">]]</span> <span class="o">%>%</span> 
    <span class="n">html_attr</span><span class="p">(</span><span class="s2">"src"</span><span class="p">)</span>
  
  <span class="n">tmp</span> <span class="o"><-</span> <span class="n">tempfile</span><span class="p">(</span><span class="n">fileext</span> <span class="o">=</span> <span class="s2">".jpg"</span><span class="p">)</span>
  <span class="n">download.file</span><span class="p">(</span><span class="n">img_url</span><span class="p">,</span> <span class="n">tmp</span><span class="p">,</span> <span class="n">mode</span> <span class="o">=</span> <span class="s2">"wb"</span><span class="p">,</span> <span class="n">quiet</span> <span class="o">=</span> <span class="n">TRUE</span><span class="p">)</span>
  <span class="n">img</span> <span class="o"><-</span> <span class="n">jpeg</span><span class="o">::</span><span class="n">readJPEG</span><span class="p">(</span><span class="n">tmp</span><span class="p">)</span>
  <span class="n">imgpltt</span> <span class="o"><-</span> <span class="n">image_palette</span><span class="p">(</span><span class="n">img</span><span class="p">,</span> <span class="n">n</span> <span class="o">=</span> <span class="m">1</span><span class="p">,</span> <span class="n">choice</span> <span class="o">=</span> <span class="n">median</span><span class="p">)</span>
  
  <span class="c1"># par(mfrow = c(1, 2))
</span>  <span class="c1"># display_image(img)
</span>  <span class="c1"># show_col(imgpltt)
</span>  
  <span class="n">dfaux</span> <span class="o"><-</span> <span class="n">html</span> <span class="o">%>%</span> 
    <span class="n">html_nodes</span><span class="p">(</span><span class="s2">"table  table  table"</span><span class="p">)</span> <span class="o">%>%</span> 
    <span class="err">.</span><span class="p">[[</span><span class="m">2</span><span class="p">]]</span> <span class="o">%>%</span> 
    <span class="n">html_table</span><span class="p">(</span><span class="n">fill</span> <span class="o">=</span> <span class="n">TRUE</span><span class="p">)</span> <span class="o">%>%</span> 
    <span class="err">.</span><span class="p">[</span><span class="m">-1</span><span class="p">,</span> <span class="m">1</span><span class="o">:</span><span class="m">2</span><span class="p">]</span> <span class="o">%>%</span> 
    <span class="n">tbl_df</span><span class="p">()</span>

  <span class="n">dfm</span> <span class="o"><-</span> <span class="n">data_frame</span><span class="p">(</span>
    <span class="n">box_id</span> <span class="o">=</span> <span class="n">x</span><span class="p">,</span>
    <span class="n">distributor</span> <span class="o">=</span> <span class="n">str_replace</span><span class="p">(</span><span class="n">dfaux</span><span class="p">[</span><span class="m">2</span><span class="p">,</span> <span class="m">1</span><span class="p">],</span> <span class="s2">"Distributor: "</span><span class="p">,</span> <span class="s2">""</span><span class="p">),</span>
    <span class="n">genre</span> <span class="o">=</span> <span class="n">str_replace</span><span class="p">(</span><span class="n">dfaux</span><span class="p">[</span><span class="m">3</span><span class="p">,</span> <span class="m">1</span><span class="p">],</span> <span class="s2">"Genre: "</span><span class="p">,</span> <span class="s2">""</span><span class="p">),</span>
    <span class="n">mpaa_rating</span> <span class="o">=</span> <span class="n">str_replace</span><span class="p">(</span><span class="n">dfaux</span><span class="p">[</span><span class="m">4</span><span class="p">,</span> <span class="m">1</span><span class="p">],</span> <span class="s2">"MPAA Rating: "</span><span class="p">,</span> <span class="s2">""</span><span class="p">),</span>
    <span class="n">runtime</span> <span class="o">=</span> <span class="n">str_replace</span><span class="p">(</span><span class="n">dfaux</span><span class="p">[</span><span class="m">3</span><span class="p">,</span> <span class="m">2</span><span class="p">],</span> <span class="s2">"Runtime: "</span><span class="p">,</span> <span class="s2">""</span><span class="p">),</span>
    <span class="n">production_budget</span> <span class="o">=</span> <span class="n">str_extract</span><span class="p">(</span><span class="n">dfaux</span><span class="p">[</span><span class="m">4</span><span class="p">,</span> <span class="m">2</span><span class="p">],</span> <span class="s2">"\\d+"</span><span class="p">),</span>
    <span class="n">img_url</span> <span class="o">=</span> <span class="n">img_url</span><span class="p">,</span>
    <span class="n">img_main_color</span> <span class="o">=</span> <span class="n">imgpltt</span>
  <span class="p">)</span>
  
  <span class="n">saveRDS</span><span class="p">(</span><span class="n">dfm</span><span class="p">,</span> <span class="n">file</span> <span class="o">=</span> <span class="n">sprintf</span><span class="p">(</span><span class="s2">"data/%s-p2.rds"</span><span class="p">,</span> <span class="n">x</span><span class="p">))</span>
  
  <span class="n">dfm</span>
    
<span class="p">})</span>

<span class="n">dfgross</span> <span class="o"><-</span> <span class="n">map_df</span><span class="p">(</span><span class="n">dfmovie</span><span class="o">$</span><span class="n">box_id</span><span class="p">,</span> <span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">){</span>
  <span class="c1"># x <- sample(dfmovie$box_id, size =1)
</span>  <span class="n">message</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>
  
  <span class="k">if</span> <span class="p">(</span><span class="n">file.exists</span><span class="p">(</span><span class="n">sprintf</span><span class="p">(</span><span class="s2">"data/%s.rds"</span><span class="p">,</span> <span class="n">x</span><span class="p">)))</span> <span class="p">{</span>
    <span class="n">dfgr</span> <span class="o"><-</span> <span class="n">readRDS</span><span class="p">(</span><span class="n">sprintf</span><span class="p">(</span><span class="s2">"data/%s.rds"</span><span class="p">,</span> <span class="n">x</span><span class="p">))</span>
    <span class="k">return</span><span class="p">(</span><span class="n">dfgr</span><span class="p">)</span>
  <span class="p">}</span>
    
  <span class="n">dfgr</span> <span class="o"><-</span> <span class="n">sprintf</span><span class="p">(</span><span class="s2">"http://www.boxofficemojo.com/movies/?page=daily&view=chart&id=%s.htm"</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span>  <span class="o">%>%</span> 
    <span class="n">read_html</span><span class="p">()</span> <span class="o">%>%</span> 
    <span class="n">html_nodes</span><span class="p">(</span><span class="s2">"table table table"</span><span class="p">)</span> <span class="o">%>%</span> 
    <span class="n">html_table</span><span class="p">(</span><span class="n">fill</span> <span class="o">=</span> <span class="n">TRUE</span><span class="p">)</span> <span class="o">%>%</span> 
    <span class="n">last</span><span class="p">()</span> <span class="o">%>%</span> 
    <span class="n">tbl_df</span><span class="p">()</span>
  
  <span class="k">if</span> <span class="p">(</span><span class="n">nrow</span><span class="p">(</span><span class="n">dfgr</span><span class="p">)</span> <span class="o">==</span> <span class="m">1</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">dfgr</span> <span class="o"><-</span> <span class="n">data_frame</span><span class="p">(</span><span class="n">box_id</span> <span class="o">=</span> <span class="n">x</span><span class="p">)</span>
  <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
    <span class="n">dfgr</span> <span class="o"><-</span> <span class="n">dfgr</span> <span class="o">%>%</span> 
      <span class="err">.</span><span class="p">[</span><span class="m">-1</span><span class="p">,</span> <span class="p">]</span> <span class="o">%>%</span> 
      <span class="n">setNames</span><span class="p">(</span><span class="n">c</span><span class="p">(</span><span class="s2">"day"</span><span class="p">,</span> <span class="s2">"date"</span><span class="p">,</span> <span class="s2">"rank"</span><span class="p">,</span> <span class="s2">"gross"</span><span class="p">,</span> <span class="s2">"pd"</span><span class="p">,</span><span class="s2">"na"</span><span class="p">,</span>
                 <span class="s2">"theatres_avg"</span><span class="p">,</span> <span class="s2">"na2"</span><span class="p">,</span> <span class="s2">"gross_to_date"</span><span class="p">,</span> <span class="s2">"day_number"</span><span class="p">))</span> <span class="o">%>%</span> 
      <span class="n">mutate</span><span class="p">(</span><span class="n">box_id</span> <span class="o">=</span> <span class="n">x</span><span class="p">)</span> <span class="o">%>%</span> 
      <span class="n">filter</span><span class="p">(</span><span class="o">!</span><span class="n">is.na</span><span class="p">(</span><span class="n">day_number</span><span class="p">))</span>
  <span class="p">}</span>
  
  <span class="n">saveRDS</span><span class="p">(</span><span class="n">dfgr</span><span class="p">,</span> <span class="n">file</span> <span class="o">=</span> <span class="n">sprintf</span><span class="p">(</span><span class="s2">"data/%s.rds"</span><span class="p">,</span> <span class="n">x</span><span class="p">))</span>
  
  <span class="n">dfgr</span>
  
<span class="p">})</span>

<span class="c1"># This is only necessary if your R session uses a non-English locale
</span><span class="n">try</span><span class="p">(</span><span class="n">x</span> <span class="o"><-</span> <span class="n">Sys.setlocale</span><span class="p">(</span><span class="s2">"LC_TIME"</span><span class="p">,</span> <span class="s2">"en_US.UTF-8"</span><span class="p">))</span>
<span class="n">try</span><span class="p">(</span><span class="n">x</span> <span class="o"><-</span> <span class="n">Sys.setlocale</span><span class="p">(</span><span class="s2">"LC_TIME"</span><span class="p">,</span> <span class="s2">"English"</span><span class="p">))</span>

<span class="n">dfgross</span> <span class="o"><-</span> <span class="n">dfgross</span> <span class="o">%>%</span> 
  <span class="n">mutate</span><span class="p">(</span><span class="n">gross</span> <span class="o">=</span> <span class="n">as.numeric</span><span class="p">(</span><span class="n">str_replace_all</span><span class="p">(</span><span class="n">gross</span><span class="p">,</span> <span class="s2">"\\$|\\,"</span><span class="p">,</span> <span class="s2">""</span><span class="p">)),</span>
         <span class="n">gross_to_date</span> <span class="o">=</span> <span class="n">as.numeric</span><span class="p">(</span><span class="n">str_replace_all</span><span class="p">(</span><span class="n">gross_to_date</span><span class="p">,</span> <span class="s2">"\\$|\\,"</span><span class="p">,</span> <span class="s2">""</span><span class="p">)),</span>
         <span class="n">day_number</span> <span class="o">=</span> <span class="n">as.numeric</span><span class="p">(</span><span class="n">day_number</span><span class="p">),</span>
         <span class="n">date2</span> <span class="o">=</span> <span class="n">str_replace_all</span><span class="p">(</span><span class="n">date</span><span class="p">,</span> <span class="s2">"\t|\\."</span><span class="p">,</span> <span class="s2">""</span><span class="p">),</span>
         <span class="n">date2</span> <span class="o">=</span> <span class="n">as.Date</span><span class="p">(</span><span class="n">date2</span><span class="p">,</span> <span class="s2">"%b %d, %Y"</span><span class="p">),</span>
         <span class="n">decade</span> <span class="o">=</span> <span class="n">year</span><span class="p">(</span><span class="n">date2</span><span class="p">)</span><span class="o">/</span><span class="m">100</span><span class="p">,</span>
         <span class="n">movieserie</span> <span class="o">=</span> <span class="n">str_extract</span><span class="p">(</span><span class="n">box_id</span><span class="p">,</span> <span class="s2">"^[A-Za-z]+|\\d{2,3}"</span><span class="p">),</span>
         <span class="n">serienumber</span> <span class="o">=</span> <span class="n">str_extract</span><span class="p">(</span><span class="n">box_id</span><span class="p">,</span> <span class="s2">"\\d{1,2}$"</span><span class="p">),</span>
         <span class="n">serienumber</span> <span class="o">=</span> <span class="n">ifelse</span><span class="p">(</span><span class="n">is.na</span><span class="p">(</span><span class="n">serienumber</span><span class="p">),</span> <span class="m">1</span><span class="p">,</span> <span class="n">serienumber</span><span class="p">))</span> <span class="o">%>%</span> 
  <span class="n">filter</span><span class="p">(</span><span class="o">!</span><span class="n">is.na</span><span class="p">(</span><span class="n">date</span><span class="p">))</span> 


<span class="n">dfmovie</span> <span class="o"><-</span> <span class="n">dfmovie</span> <span class="o">%>%</span> 
  <span class="n">left_join</span><span class="p">(</span><span class="n">dfmovie2</span><span class="p">,</span> <span class="n">by</span> <span class="o">=</span> <span class="s2">"box_id"</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">left_join</span><span class="p">(</span><span class="n">dfgross</span> <span class="o">%>%</span> 
              <span class="n">group_by</span><span class="p">(</span><span class="n">box_id</span><span class="p">)</span> <span class="o">%>%</span> 
              <span class="n">summarise</span><span class="p">(</span><span class="n">max_day</span> <span class="o">=</span> <span class="n">max</span><span class="p">(</span><span class="n">day_number</span><span class="p">)),</span>
            <span class="n">by</span> <span class="o">=</span> <span class="s2">"box_id"</span><span class="p">)</span>

<span class="n">dfmovie</span> <span class="o"><-</span> <span class="n">dfmovie</span> <span class="o">%>%</span> 
  <span class="n">mutate</span><span class="p">(</span><span class="n">rank</span> <span class="o">=</span> <span class="n">as.numeric</span><span class="p">(</span><span class="n">rank</span><span class="p">),</span>
         <span class="n">gross</span> <span class="o">=</span> <span class="n">as.numeric</span><span class="p">(</span><span class="n">str_replace_all</span><span class="p">(</span><span class="n">gross</span><span class="p">,</span> <span class="s2">"\\$|\\,"</span><span class="p">,</span> <span class="s2">""</span><span class="p">)),</span>
         <span class="n">studio</span> <span class="o">=</span> <span class="n">str_replace_all</span><span class="p">(</span><span class="n">studio</span><span class="p">,</span> <span class="s2">"\\."</span><span class="p">,</span> <span class="s2">""</span><span class="p">),</span>
         <span class="n">production_budget</span> <span class="o">=</span> <span class="m">1e6</span> <span class="o">*</span> <span class="n">as.numeric</span><span class="p">(</span><span class="n">production_budget</span><span class="p">)</span>
  <span class="p">)</span>

<span class="n">rm</span><span class="p">(</span><span class="n">dfmovie2</span><span class="p">)</span>
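
The `movieserie` and `serienumber` columns come from splitting `box_id` with two regular expressions. To see what those patterns do, here is a small sketch in base R. `extract_first()` is a hypothetical stand-in for `str_extract()`, and the `"avatar"` ID is made up to show the no-number case; the other IDs appear in this post.

```r
# Base R stand-in for stringr::str_extract(): first match, or NA if none.
extract_first <- function(x, pattern) {
  m <- regexpr(pattern, x, perl = TRUE)
  out <- rep(NA_character_, length(x))
  out[m != -1] <- regmatches(x, m)
  out
}

# "avatar" is an assumed ID, used only to illustrate a movie without a number.
box_id <- c("starwars7", "starwars2", "starwars4", "avatar")

# Leading letters give the series name; trailing digits give the episode.
movieserie  <- extract_first(box_id, "^[A-Za-z]+|\\d{2,3}")
serienumber <- extract_first(box_id, "\\d{1,2}$")

# Movies without a trailing number are treated as the first of their series.
serienumber <- ifelse(is.na(serienumber), "1", serienumber)

data.frame(box_id, movieserie, serienumber)
```

Note that `serienumber` stays a character column here, just as it does after the `ifelse()` in the original pipeline.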

Finally we have the movies data with some interesting columns like
production_budget, the total lifetime gross income gross, and the
max_day column, which counts the days in theatres. Here are
the top 10 movies:

<span class="n">dfmovie</span> <span class="o">%>%</span>
  <span class="n">select</span><span class="p">(</span><span class="n">rank</span><span class="p">,</span> <span class="n">title</span><span class="p">,</span> <span class="n">year</span><span class="p">,</span> <span class="n">gross</span><span class="p">,</span> <span class="n">genre</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">mutate</span><span class="p">(</span><span class="n">gross</span> <span class="o">=</span> <span class="n">dollar</span><span class="p">(</span><span class="n">gross</span><span class="p">))</span> <span class="o">%>%</span> 
  <span class="n">head</span><span class="p">(</span><span class="m">10</span><span class="p">)</span>
rank  title                                      year  gross         genre
1     Star Wars: The Force Awakens               2015  $931,216,133  Sci-Fi Fantasy
2     Avatar                                     2009  $760,507,625  Sci-Fi Adventure
3     Titanic                                    1997  $658,672,302  Romance
4     Jurassic World                             2015  $652,270,625  Sci-Fi Horror
5     Marvel’s The Avengers                      2012  $623,357,910  Action / Adventure
6     The Dark Knight                            2008  $534,858,444  Action / Adventure
7     Star Wars: Episode I – The Phantom Menace  1999  $474,544,677  Sci-Fi Fantasy
8     Star Wars                                  1977  $460,998,007  Sci-Fi Fantasy
9     Avengers: Age of Ultron                    2015  $459,005,868  Action / Adventure
10    The Dark Knight Rises                      2012  $448,139,099  Action Thriller

Phantom Menace and Jurassic World in the top 10?

We have the daily income for every movie too, so we can plot time
series and compare! The data is just telling us what to do. Here’s
a sample of the detailed data by day.

<span class="n">dfgross</span> <span class="o">%>%</span>
  <span class="n">filter</span><span class="p">(</span><span class="n">box_id</span> <span class="o">==</span> <span class="s2">"starwars7"</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">mutate</span><span class="p">(</span><span class="n">gross</span> <span class="o">=</span> <span class="n">dollar</span><span class="p">(</span><span class="n">gross</span><span class="p">),</span>
         <span class="n">gross_to_date</span> <span class="o">=</span> <span class="n">dollar</span><span class="p">(</span><span class="n">gross_to_date</span><span class="p">))</span> <span class="o">%>%</span> 
  <span class="n">select</span><span class="p">(</span><span class="n">box_id</span><span class="p">,</span> <span class="n">date</span><span class="p">,</span> <span class="n">day_number</span><span class="p">,</span> <span class="n">gross</span><span class="p">,</span> <span class="n">gross_to_date</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">head</span><span class="p">(</span><span class="m">10</span><span class="p">)</span>
box_id     date           day_number  gross         gross_to_date
starwars7  Dec. 18, 2015  1           $119,119,282  $119,119,282
starwars7  Dec. 19, 2015  2           $68,294,204   $187,413,486
starwars7  Dec. 20, 2015  3           $60,553,189   $247,966,675
starwars7  Dec. 21, 2015  4           $40,109,742   $288,076,417
starwars7  Dec. 22, 2015  5           $37,361,729   $325,438,146
starwars7  Dec. 23, 2015  6           $38,022,183   $363,460,329
starwars7  Dec. 24, 2015  7           $27,395,725   $390,856,054
starwars7  Dec. 25, 2015  8           $49,325,663   $440,181,717
starwars7  Dec. 26, 2015  9           $56,731,532   $496,913,249
starwars7  Dec. 27, 2015  10          $43,145,665   $540,058,914
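
As a quick sanity check on the scraped values, gross_to_date should simply be the running total of gross. A small base R sketch using the ten rows shown above (the helper `to_number()` is a hypothetical name; it mirrors the `$` and `,` stripping done in the cleaning step):

```r
# Daily and cumulative gross for starwars7, copied from the sample above.
gross <- c("$119,119,282", "$68,294,204", "$60,553,189", "$40,109,742",
           "$37,361,729", "$38,022,183", "$27,395,725", "$49,325,663",
           "$56,731,532", "$43,145,665")
gross_to_date <- c("$119,119,282", "$187,413,486", "$247,966,675",
                   "$288,076,417", "$325,438,146", "$363,460,329",
                   "$390,856,054", "$440,181,717", "$496,913,249",
                   "$540,058,914")

# Hypothetical helper: drop "$" and "," and convert to a number.
to_number <- function(x) as.numeric(gsub("\\$|,", "", x))

# The running total of the daily values should reproduce gross_to_date.
running_total <- cumsum(to_number(gross))
all(running_total == to_number(gross_to_date))
```

If the comparison ever returned FALSE for a movie, that would suggest missing days in the scraped table.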

Plot

Okay, here we take a breath. There are a lot of ideas, but we can only
code them one at a time. Let's start by considering the release date of every
movie and the evolution of its gross income.

First we'll use the color extracted for every movie with the
nice RImagePalette package, and then select the top movies and
the movies with the most days in theatres to annotate them in the plot.

<span class="c1">#### plot ####
</span><span class="n">cols</span> <span class="o"><-</span> <span class="n">setNames</span><span class="p">(</span><span class="n">dfmovie</span><span class="o">$</span><span class="n">img_main_color</span><span class="p">,</span> <span class="n">dfmovie</span><span class="o">$</span><span class="n">box_id</span><span class="p">)</span>

<span class="n">ntoplabel</span> <span class="o"><-</span> <span class="m">10</span> <span class="o">+</span> <span class="m">1</span> <span class="c1"># rm starwars
</span><span class="n">nmostlong</span> <span class="o"><-</span> <span class="m">10</span>

<span class="n">moviestop</span> <span class="o"><-</span> <span class="n">dfmovie</span> <span class="o">%>%</span>
  <span class="n">arrange</span><span class="p">(</span><span class="n">rank</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">head</span><span class="p">(</span><span class="n">ntoplabel</span><span class="p">)</span> <span class="o">%>%</span>
  <span class="err">.</span><span class="o">$</span><span class="n">box_id</span>

<span class="n">movieslng</span> <span class="o"><-</span> <span class="n">dfmovie</span> <span class="o">%>%</span>
  <span class="n">arrange</span><span class="p">(</span><span class="n">desc</span><span class="p">(</span><span class="n">max_day</span><span class="p">))</span> <span class="o">%>%</span> 
  <span class="n">select</span><span class="p">(</span><span class="n">max_day</span><span class="p">,</span> <span class="n">box_id</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">head</span><span class="p">(</span><span class="n">nmostlong</span><span class="p">)</span> <span class="o">%>%</span>
  <span class="err">.</span><span class="o">$</span><span class="n">box_id</span>

<span class="n">movieslbl</span> <span class="o"><-</span> <span class="n">unique</span><span class="p">(</span><span class="n">c</span><span class="p">(</span><span class="n">moviestop</span><span class="p">,</span> <span class="n">movieslng</span><span class="p">))</span>
<span class="n">movieslbl</span> <span class="o"><-</span> <span class="n">setdiff</span><span class="p">(</span><span class="n">movieslbl</span><span class="p">,</span> <span class="n">c</span><span class="p">(</span><span class="s2">"starwars4"</span><span class="p">))</span>

<span class="n">fmt_dllr_mm</span> <span class="o"><-</span> <span class="k">function</span><span class="p">(</span><span class="n">x</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">x</span> <span class="o">%>%</span> 
    <span class="p">{</span><span class="err">.</span><span class="o">/</span><span class="m">1000000</span><span class="p">}</span> <span class="o">%>%</span> 
    <span class="n">dollar</span><span class="p">()</span>
<span class="p">}</span>

<span class="n">tt1</span> <span class="o"><-</span> <span class="s2">"Cumulative Gross Income"</span>
<span class="n">stt1</span> <span class="o"><-</span> <span class="s2">"Titanic (1997), Avatar (2009) and Star Wars VII (2015) are the movies with the highest gross income in film history."</span>
<span class="n">cptn</span> <span class="o"><-</span> <span class="s2">"jkunst.com | Data from boxofficemojo.com"</span>

<span class="n">dfgross</span> <span class="o">%>%</span> 
  <span class="n">ggplot</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">date2</span><span class="p">,</span> <span class="n">gross_to_date</span><span class="p">,</span>
             <span class="n">color</span> <span class="o">=</span> <span class="n">box_id</span><span class="p">,</span> <span class="n">label</span> <span class="o">=</span> <span class="n">str_to_title</span><span class="p">(</span><span class="n">box_id</span><span class="p">)))</span> <span class="o">+</span> 
  <span class="n">geom_line</span><span class="p">(</span><span class="n">alpha</span> <span class="o">=</span> <span class="m">0.25</span><span class="p">)</span> <span class="o">+</span> 
  <span class="n">scale_color_manual</span><span class="p">(</span><span class="n">values</span> <span class="o">=</span> <span class="n">cols</span><span class="p">)</span> <span class="o">+</span> 
  <span class="n">geom_label</span><span class="p">(</span><span class="n">data</span> <span class="o">=</span> <span class="n">dfgross</span> <span class="o">%>%</span>
               <span class="n">filter</span><span class="p">(</span><span class="n">box_id</span> <span class="o">%</span><span class="k">in</span><span class="o">%</span> <span class="n">movieslbl</span><span class="p">)</span> <span class="o">%>%</span> 
               <span class="n">arrange</span><span class="p">(</span><span class="n">desc</span><span class="p">(</span><span class="n">day_number</span><span class="p">))</span> <span class="o">%>%</span> 
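               <span class="n">distinct</span><span class="p">(</span><span class="n">box_id</span><span class="p">,</span> <span class="n">.keep_all</span> <span class="o">=</span> <span class="n">TRUE</span><span class="p">))</span> <span class="o">+</span> 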
  <span class="n">theme</span><span class="p">(</span><span class="n">legend.position</span> <span class="o">=</span> <span class="s2">"none"</span><span class="p">)</span> <span class="o">+</span>
  <span class="n">xlim</span><span class="p">(</span><span class="n">as.Date</span><span class="p">(</span><span class="n">min</span><span class="p">(</span><span class="n">dfgross</span><span class="o">$</span><span class="n">date2</span><span class="p">)),</span> <span class="n">as.Date</span><span class="p">(</span><span class="n">ymd</span><span class="p">(</span><span class="m">20170101</span><span class="p">)))</span> <span class="o">+</span> 
  <span class="n">scale_y_continuous</span><span class="p">(</span><span class="n">labels</span> <span class="o">=</span> <span class="n">fmt_dllr_mm</span><span class="p">)</span> <span class="o">+</span>
  <span class="n">labs</span><span class="p">(</span><span class="n">title</span> <span class="o">=</span> <span class="n">tt1</span><span class="p">,</span> <span class="n">subtitle</span> <span class="o">=</span> <span class="n">stt1</span><span class="p">,</span> <span class="n">caption</span> <span class="o">=</span> <span class="n">cptn</span><span class="p">,</span>
       <span class="n">x</span> <span class="o">=</span> <span class="s2">"Date"</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="s2">"Cumulative Gross (millions)"</span><span class="p">)</span>

plot of chunk unnamed-chunk-6

Mmm the first conclusion I get from this:

You don’t need only network data to get a spaghetti-like plot.

Mmm I think this is a nice result for a first try (this is a lie, I made more tries
before this plot XD). We can clearly see each release date and compare the
gross income between the movies. It’s nice to see and remember old classics like ET and
Back to the Future. However, the time scale is so wide that we can’t tell how long
each movie was in theatres. To get a fairer comparison we plot every movie
with x as the number of days since release. I’m not sure the grosses are comparable
given the different release dates, but we’ll keep the data as is.
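
As a rough sketch of the days-since-release idea, the snippet below rebuilds the two columns the next plot relies on, `day_number` and `gross_to_date`, from per-day grosses. The toy data frame, its made-up numbers and the `gross` column are assumptions for illustration only, as is the choice to count the release day as day 1. It sticks to base R to stay dependency-free.

```r
# Toy daily grosses (in millions) for two hypothetical movies; made-up numbers
daily <- data.frame(
  box_id = c("movie_a", "movie_a", "movie_a", "movie_b", "movie_b"),
  date2  = as.Date(c("1997-12-19", "1997-12-20", "1997-12-21",
                     "2009-12-18", "2009-12-19")),
  gross  = c(8, 10, 9, 26, 25)
)

# Sort so cumulative sums run in chronological order within each movie
daily <- daily[order(daily$box_id, daily$date2), ]

# Days since release: distance to each movie's first date, counting that day as 1
first_day <- ave(as.numeric(daily$date2), daily$box_id, FUN = min)
daily$day_number <- as.integer(as.numeric(daily$date2) - first_day) + 1L

# Cumulative gross per movie
daily$gross_to_date <- ave(daily$gross, daily$box_id, FUN = cumsum)

daily
```

With both movies starting at day 1, their curves share an origin on the x axis, no matter how many years apart they were released.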

<span class="n">tt2</span> <span class="o"><-</span> <span class="s2">"Cumulative Gross Income by Days"</span>
<span class="n">stt2</span> <span class="o"><-</span> <span class="s2">"Only 3 movies were more than a year in theaters: Jurassic Park (497 days), ET and Gladiator."</span>


<span class="n">dfgross</span> <span class="o">%>%</span> 
  <span class="n">ggplot</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">day_number</span><span class="p">,</span> <span class="n">gross_to_date</span><span class="p">,</span>
             <span class="n">color</span> <span class="o">=</span> <span class="n">box_id</span><span class="p">,</span> <span class="n">label</span> <span class="o">=</span> <span class="n">str_to_title</span><span class="p">(</span><span class="n">box_id</span><span class="p">)))</span> <span class="o">+</span> 
  <span class="n">geom_line</span><span class="p">(</span><span class="n">alpha</span> <span class="o">=</span> <span class="m">0.25</span><span class="p">)</span> <span class="o">+</span> 
  <span class="n">geom_label</span><span class="p">(</span><span class="n">data</span> <span class="o">=</span> <span class="n">dfgross</span> <span class="o">%>%</span>
               <span class="n">filter</span><span class="p">(</span><span class="n">box_id</span> <span class="o">%</span><span class="k">in</span><span class="o">%</span> <span class="n">movieslbl</span><span class="p">)</span> <span class="o">%>%</span> 
               <span class="n">arrange</span><span class="p">(</span><span class="n">desc</span><span class="p">(</span><span class="n">day_number</span><span class="p">))</span> <span class="o">%>%</span> 
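               <span class="n">distinct</span><span class="p">(</span><span class="n">box_id</span><span class="p">,</span> <span class="n">.keep_all</span> <span class="o">=</span> <span class="n">TRUE</span><span class="p">))</span> <span class="o">+</span> 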
  <span class="n">scale_color_manual</span><span class="p">(</span><span class="n">values</span> <span class="o">=</span> <span class="n">cols</span><span class="p">)</span> <span class="o">+</span> 
  <span class="n">theme</span><span class="p">(</span><span class="n">legend.position</span> <span class="o">=</span> <span class="s2">"none"</span><span class="p">)</span> <span class="o">+</span>
  <span class="n">xlim</span><span class="p">(</span><span class="n">NA</span><span class="p">,</span> <span class="m">550</span><span class="p">)</span> <span class="o">+</span> 
  <span class="n">scale_y_continuous</span><span class="p">(</span><span class="n">labels</span> <span class="o">=</span> <span class="n">fmt_dllr_mm</span><span class="p">)</span> <span class="o">+</span>
  <span class="n">labs</span><span class="p">(</span><span class="n">title</span> <span class="o">=</span> <span class="n">tt2</span><span class="p">,</span> <span class="n">subtitle</span> <span class="o">=</span> <span class="n">stt2</span><span class="p">,</span>  <span class="n">caption</span> <span class="o">=</span> <span class="n">cptn</span><span class="p">,</span>
       <span class="n">x</span> <span class="o">=</span> <span class="s2">"Days since release"</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="s2">"Cumulative Gross (millions)"</span><span class="p">)</span> <span class="o">+</span>
  <span class="n">annotate</span><span class="p">(</span><span class="s2">"segment"</span><span class="p">,</span> <span class="n">x</span> <span class="o">=</span> <span class="m">365</span><span class="p">,</span> <span class="n">xend</span> <span class="o">=</span> <span class="m">365</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="m">0</span><span class="p">,</span> <span class="n">yend</span> <span class="o">=</span> <span class="m">925000000</span><span class="p">,</span> <span class="n">colour</span> <span class="o">=</span> <span class="s2">"gray"</span><span class="p">)</span> <span class="o">+</span>
  <span class="n">annotate</span><span class="p">(</span><span class="s2">"text"</span><span class="p">,</span> <span class="n">label</span> <span class="o">=</span> <span class="s2">"One Year"</span><span class="p">,</span> <span class="n">x</span> <span class="o">=</span> <span class="m">365</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="m">950000000</span><span class="p">)</span>

plot of chunk unnamed-chunk-7

Jurassic Park and ET were in theaters for more than a year! The plot still looks
like spaghetti, but an info-tasty spaghetti.
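
To check the "more than a year" claim without squinting at the plot, one option is to take the largest `day_number` per movie and keep those above 365. This is only a sketch: the toy data frame and movie ids are hypothetical, and it assumes `day_number` counts days since release as in the plot.

```r
# Hypothetical per-day records; only the columns needed here
dfgross_toy <- data.frame(
  box_id     = c("movie_a", "movie_a", "movie_b", "movie_b", "movie_c"),
  day_number = c(1, 497, 1, 120, 380)
)

# Longest observed day per movie, as a named numeric vector
days_in_theaters <- vapply(
  split(dfgross_toy$day_number, dfgross_toy$box_id),
  max,
  numeric(1)
)

# Keep the movies that stayed more than 365 days, longest run first
long_runs <- sort(days_in_theaters[days_in_theaters > 365], decreasing = TRUE)
long_runs
```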

Now we can compare each movie with the others in its saga to
see which part number is, in general, the most successful in terms of
income.

<span class="n">moviessaga</span> <span class="o"><-</span> <span class="n">dfgross</span> <span class="o">%>%</span> 
  <span class="n">distinct</span><span class="p">(</span><span class="n">movieserie</span><span class="p">,</span> <span class="n">serienumber</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">count</span><span class="p">(</span><span class="n">movieserie</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">arrange</span><span class="p">(</span><span class="n">desc</span><span class="p">(</span><span class="n">n</span><span class="p">))</span> <span class="o">%>%</span> 
  <span class="n">filter</span><span class="p">(</span><span class="n">n</span> <span class="o">>=</span> <span class="m">4</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="err">.</span><span class="o">$</span><span class="n">movieserie</span>

<span class="n">tt3</span> <span class="o"><-</span> <span class="s2">"Comparing Gross Income between Sagas"</span>
<span class="n">stt3</span> <span class="o"><-</span> <span class="s2">"An interesting pattern and order shows up in Pirates of the Caribbean,
Shrek and Transformers, where the second movie has the greatest income"</span>
<span class="n">st</span> <span class="o"><-</span> <span class="n">gsub</span><span class="p">(</span><span class="s2">"\\n"</span><span class="p">,</span> <span class="s2">" "</span><span class="p">,</span> <span class="n">stt3</span><span class="p">)</span>

<span class="n">dfgross</span> <span class="o">%>%</span>
  <span class="n">filter</span><span class="p">(</span><span class="n">movieserie</span> <span class="o">%</span><span class="k">in</span><span class="o">%</span> <span class="n">moviessaga</span><span class="p">)</span> <span class="o">%>%</span>
  <span class="n">mutate</span><span class="p">(</span><span class="n">movieserie</span> <span class="o">=</span> <span class="n">factor</span><span class="p">(</span><span class="n">movieserie</span><span class="p">,</span> <span class="n">levels</span> <span class="o">=</span> <span class="n">moviessaga</span><span class="p">))</span> <span class="o">%>%</span> 
  <span class="n">ggplot</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">day_number</span><span class="p">,</span> <span class="n">gross_to_date</span><span class="p">,</span> <span class="n">label</span> <span class="o">=</span> <span class="n">serienumber</span><span class="p">))</span> <span class="o">+</span> 
  <span class="n">geom_line</span><span class="p">(</span><span class="n">aes</span><span class="p">(</span><span class="n">color</span> <span class="o">=</span> <span class="n">box_id</span><span class="p">),</span> <span class="n">alpha</span> <span class="o">=</span> <span class="m">0.5</span><span class="p">)</span> <span class="o">+</span>
  <span class="n">geom_label</span><span class="p">(</span><span class="n">data</span> <span class="o">=</span> <span class="n">dfgross</span> <span class="o">%>%</span>
               <span class="n">filter</span><span class="p">(</span><span class="n">movieserie</span> <span class="o">%</span><span class="k">in</span><span class="o">%</span> <span class="n">moviessaga</span><span class="p">)</span> <span class="o">%>%</span>
               <span class="n">mutate</span><span class="p">(</span><span class="n">serienumber</span> <span class="o">=</span> <span class="n">ifelse</span><span class="p">(</span><span class="n">box_id</span> <span class="o">==</span> <span class="s2">"transformers06"</span><span class="p">,</span> <span class="m">1</span><span class="p">,</span> <span class="n">serienumber</span><span class="p">),</span>
                      <span class="n">movieserie</span> <span class="o">=</span> <span class="n">factor</span><span class="p">(</span><span class="n">movieserie</span><span class="p">,</span> <span class="n">levels</span> <span class="o">=</span> <span class="n">moviessaga</span><span class="p">))</span> <span class="o">%>%</span>
               <span class="n">arrange</span><span class="p">(</span><span class="n">desc</span><span class="p">(</span><span class="n">day_number</span><span class="p">))</span> <span class="o">%>%</span>
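               <span class="n">distinct</span><span class="p">(</span><span class="n">box_id</span><span class="p">,</span> <span class="n">.keep_all</span> <span class="o">=</span> <span class="n">TRUE</span><span class="p">))</span> <span class="o">+</span>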
  <span class="n">facet_wrap</span><span class="p">(</span><span class="o">~</span><span class="n">movieserie</span><span class="p">,</span> <span class="n">scales</span> <span class="o">=</span> <span class="s2">"free_y"</span><span class="p">)</span> <span class="o">+</span> 
  <span class="n">scale_y_continuous</span><span class="p">(</span><span class="n">labels</span> <span class="o">=</span> <span class="n">fmt_dllr_mm</span><span class="p">)</span> <span class="o">+</span>
  <span class="n">labs</span><span class="p">(</span><span class="n">title</span> <span class="o">=</span> <span class="n">tt3</span><span class="p">,</span> <span class="n">subtitle</span> <span class="o">=</span> <span class="n">stt3</span><span class="p">,</span> <span class="n">caption</span> <span class="o">=</span> <span class="n">cptn</span><span class="p">,</span>
       <span class="n">x</span> <span class="o">=</span> <span class="s2">"Days since release"</span><span class="p">,</span> <span class="n">y</span> <span class="o">=</span> <span class="s2">"Gross (millions)"</span><span class="p">)</span> <span class="o">+</span> 
  <span class="n">theme</span><span class="p">(</span><span class="n">legend.position</span> <span class="o">=</span> <span class="s2">"none"</span><span class="p">)</span>

plot of chunk unnamed-chunk-8

Aha! There is a nice 2-3-1-4 pattern in Pirates of the Caribbean, Shrek and Transformers:
the first movie has a long run in theatres but isn’t more popular
than the second one (or the 3rd) in the saga, and the 4th is the movie with
the lowest gross income.
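
The 2-3-1-4 reading can also be derived from the data instead of the picture. The sketch below takes the last cumulative value of each part and orders the parts from highest to lowest final gross. The numbers are made up for a hypothetical four-part saga; only the column names `serienumber` and `gross_to_date` come from the code above.

```r
# Made-up cumulative grosses (millions) for a hypothetical four-part saga,
# two observations per part: an early day and the final day
saga <- data.frame(
  serienumber   = c(1, 1, 2, 2, 3, 3, 4, 4),
  gross_to_date = c(100, 300, 150, 420, 140, 310, 90, 240)
)

# Final gross per part: the maximum of a cumulative series is its last value
final_gross <- vapply(
  split(saga$gross_to_date, saga$serienumber),
  max,
  numeric(1)
)

# Part numbers ordered from highest to lowest final gross
ranking <- names(final_gross)[order(final_gross, decreasing = TRUE)]
pattern <- paste(ranking, collapse = "-")
pattern
```

Running the same steps per `movieserie` would give one such pattern string per saga, which makes the sagas easy to compare at a glance.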

Now we’ll try to implement the scatter version of gross vs production_budget.

<span class="c1">#### chart ####
</span><span class="n">dsmovie</span> <span class="o"><-</span> <span class="n">dfmovie</span> <span class="o">%>%</span> 
  <span class="n">filter</span><span class="p">(</span><span class="o">!</span><span class="n">is.na</span><span class="p">(</span><span class="n">production_budget</span><span class="p">))</span> <span class="o">%>%</span> 
  <span class="n">mutate</span><span class="p">(</span><span class="n">x</span> <span class="o">=</span> <span class="n">gross</span><span class="p">,</span>
         <span class="n">y</span> <span class="o">=</span> <span class="n">production_budget</span><span class="p">,</span>
         <span class="n">gross_budget_ratio</span> <span class="o">=</span> <span class="n">percent</span><span class="p">(</span><span class="n">gross</span><span class="o">/</span><span class="n">production_budget</span><span class="p">),</span>
         <span class="n">production_budget</span> <span class="o">=</span> <span class="n">fmt_dllr_mm</span><span class="p">(</span><span class="n">production_budget</span><span class="p">),</span>
         <span class="n">gross</span> <span class="o">=</span> <span class="n">fmt_dllr_mm</span><span class="p">(</span><span class="n">gross</span><span class="p">),</span>
         <span class="n">name</span> <span class="o">=</span> <span class="n">title</span><span class="p">,</span>
         <span class="n">color</span> <span class="o">=</span> <span class="n">img_main_color</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">list.parse3</span><span class="p">()</span> 

<span class="n">t</span> <span class="o"><-</span> <span class="n">c</span><span class="p">(</span><span class="s2">"gross_budget_ratio"</span><span class="p">,</span> <span class="s2">"production_budget"</span><span class="p">,</span> <span class="s2">"gross"</span><span class="p">,</span> <span class="s2">"distributor"</span><span class="p">,</span> <span class="s2">"mpaa_rating"</span><span class="p">)</span>
<span class="n">x</span> <span class="o"><-</span> <span class="n">t</span> <span class="o">%>%</span> <span class="n">str_to_title</span><span class="p">()</span> <span class="o">%>%</span> <span class="n">gsub</span><span class="p">(</span><span class="s2">"_"</span><span class="p">,</span> <span class="s2">" "</span><span class="p">,</span> <span class="err">.</span><span class="p">)</span>
<span class="n">y</span> <span class="o"><-</span> <span class="n">sprintf</span><span class="p">(</span><span class="s2">"{point.%s}"</span><span class="p">,</span> <span class="n">t</span><span class="p">)</span>

<span class="n">tooltip</span> <span class="o"><-</span> <span class="n">tooltip_table</span><span class="p">(</span>
  <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span>
  <span class="n">img</span> <span class="o">=</span> <span class="n">tags</span><span class="o">$</span><span class="n">img</span><span class="p">(</span><span class="n">src</span> <span class="o">=</span> <span class="s2">"{point.img_url}"</span><span class="p">,</span> <span class="n">width</span> <span class="o">=</span> <span class="m">150</span><span class="p">,</span> <span class="n">height</span> <span class="o">=</span> <span class="m">222</span><span class="p">,</span>
                 <span class="n">style</span> <span class="o">=</span> <span class="s2">"display: block;margin-left: auto;margin-right:auto"</span><span class="p">),</span>
  <span class="sb">`min-height`</span> <span class="o">=</span> <span class="m">300</span> 
<span class="p">)</span>

<span class="n">hcscttr</span> <span class="o"><-</span> <span class="n">highchart</span><span class="p">()</span> <span class="o">%>%</span> 
  <span class="n">hc_chart</span><span class="p">(</span><span class="n">zoomType</span> <span class="o">=</span> <span class="s2">"xy"</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">hc_title</span><span class="p">(</span><span class="n">text</span> <span class="o">=</span> <span class="s2">"Gross Income versus Production Budget"</span><span class="p">)</span> <span class="o">%>%</span>
  <span class="n">hc_add_series</span><span class="p">(</span><span class="n">data</span> <span class="o">=</span> <span class="n">dsmovie</span><span class="p">,</span> <span class="n">type</span> <span class="o">=</span> <span class="s2">"scatter"</span><span class="p">,</span> <span class="n">showInLegend</span> <span class="o">=</span> <span class="n">FALSE</span><span class="p">)</span> <span class="o">%>%</span>
  <span class="n">hc_xAxis</span><span class="p">(</span><span class="n">title</span> <span class="o">=</span> <span class="n">list</span><span class="p">(</span><span class="n">text</span> <span class="o">=</span> <span class="s2">"Gross income"</span><span class="p">))</span> <span class="o">%>%</span> 
  <span class="n">hc_yAxis</span><span class="p">(</span><span class="n">title</span> <span class="o">=</span> <span class="n">list</span><span class="p">(</span><span class="n">text</span> <span class="o">=</span> <span class="s2">"Production Budget"</span><span class="p">))</span> <span class="o">%>%</span> 
  <span class="n">hc_tooltip</span><span class="p">(</span><span class="n">useHTML</span> <span class="o">=</span> <span class="n">TRUE</span><span class="p">,</span>
             <span class="n">headerFormat</span> <span class="o">=</span> <span class="n">as.character</span><span class="p">(</span><span class="n">tags</span><span class="o">$</span><span class="n">small</span><span class="p">(</span><span class="s2">"{point.key}"</span><span class="p">)),</span>
             <span class="n">pointFormat</span> <span class="o">=</span> <span class="n">tooltip</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">hc_add_theme</span><span class="p">(</span><span class="n">hc_theme_smpl</span><span class="p">())</span> 

<span class="n">hcscttr</span>

Mmm, not sure we see an interesting pattern, but the chart is
good for an exploratory process. For example, we can see that Superman
Returns has a gross-to-budget ratio of ~1.
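
For reference, the ratio shown in the tooltip is just gross divided by production budget. Here is a minimal sketch with made-up movies and numbers. It uses `sprintf()` as a stand-in for the `percent()` call in the chart code (presumably `scales::percent`, which is an assumption on my part).

```r
# Hypothetical movies with made-up figures (US dollars)
movies <- data.frame(
  title             = c("movie_a", "movie_b", "movie_c"),
  gross             = c(200e6, 450e6, 90e6),
  production_budget = c(200e6, 150e6, NA)
)

# Drop movies without a known budget, mirroring filter(!is.na(production_budget))
movies <- movies[!is.na(movies$production_budget), ]

# Ratio of gross income to production budget; 1 means the movie grossed its budget
movies$ratio <- movies$gross / movies$production_budget

# Percent label for display (stand-in for the percent() helper used above)
movies$gross_budget_ratio <- sprintf("%.0f%%", 100 * movies$ratio)

movies[, c("title", "ratio", "gross_budget_ratio")]
```

A ratio near 1, like `movie_a` here, is the situation described for Superman Returns: the domestic gross roughly equals the production budget.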

Now we’ll replicate the previous plots using highcharter
to have tooltips with more information ;D. Remember, you can zoom the
chart to view it in more detail.

<span class="n">x</span> <span class="o"><-</span> <span class="n">c</span><span class="p">(</span><span class="s2">"Income:"</span><span class="p">,</span> <span class="s2">"Genre"</span><span class="p">,</span> <span class="s2">"Runtime"</span><span class="p">)</span>
<span class="n">y</span> <span class="o"><-</span> <span class="n">c</span><span class="p">(</span><span class="s2">"$ {point.y}"</span><span class="p">,</span> <span class="s2">"{point.series.options.extra.genre}"</span><span class="p">,</span> <span class="s2">"{point.series.options.extra.runtime}"</span><span class="p">)</span>

<span class="n">tooltip</span> <span class="o"><-</span> <span class="n">tooltip_table</span><span class="p">(</span>
  <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">,</span>
  <span class="n">tags</span><span class="o">$</span><span class="n">img</span><span class="p">(</span><span class="n">src</span> <span class="o">=</span> <span class="s2">"{point.series.options.extra.img_url}"</span><span class="p">,</span> <span class="n">width</span> <span class="o">=</span> <span class="m">150</span><span class="p">,</span> <span class="n">height</span> <span class="o">=</span> <span class="m">222</span><span class="p">,</span>
           <span class="n">style</span> <span class="o">=</span> <span class="s2">"display: block;margin-left: auto;margin-right:auto"</span><span class="p">)</span>
<span class="p">)</span>


<span class="c1"># This function is a little tricky. We put the 
# title (not the value) only if the point is
# the LAST point in the data
</span><span class="n">fmtrr</span> <span class="o"><-</span> <span class="s2">"function() {
  if (this.point.x == this.series.data[this.series.data.length-1].x && 
       this.series.options.showlabel) {
      return this.series.options.extra.title;
  } else {
      return null;
  }
}"</span>

<span class="n">hcgross</span> <span class="o"><-</span> <span class="n">highchart</span><span class="p">()</span> <span class="o">%>%</span> 
  <span class="n">hc_chart</span><span class="p">(</span><span class="n">zoomType</span> <span class="o">=</span> <span class="s2">"x"</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">hc_tooltip</span><span class="p">(</span><span class="n">followPointer</span> <span class="o">=</span>  <span class="n">FALSE</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">hc_yAxis</span><span class="p">(</span><span class="n">title</span> <span class="o">=</span> <span class="n">list</span><span class="p">(</span><span class="n">text</span> <span class="o">=</span> <span class="s2">"Gross income"</span><span class="p">))</span> <span class="o">%>%</span>
  <span class="n">hc_tooltip</span><span class="p">(</span>
    <span class="n">useHTML</span> <span class="o">=</span> <span class="n">TRUE</span><span class="p">,</span>
    <span class="n">pointFormat</span> <span class="o">=</span> <span class="n">tooltip</span>
  <span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">hc_plotOptions</span><span class="p">(</span>
    <span class="n">series</span> <span class="o">=</span> <span class="n">list</span><span class="p">(</span>
      <span class="n">dataLabels</span> <span class="o">=</span> <span class="n">list</span><span class="p">(</span>
        <span class="n">enabled</span> <span class="o">=</span> <span class="n">TRUE</span><span class="p">,</span>
        <span class="n">align</span> <span class="o">=</span> <span class="s2">"left"</span><span class="p">,</span>
        <span class="n">verticalAlign</span> <span class="o">=</span> <span class="s2">"middle"</span><span class="p">,</span>
        <span class="n">formatter</span> <span class="o">=</span> <span class="n">JS</span><span class="p">(</span><span class="n">fmtrr</span><span class="p">),</span>
        <span class="n">crop</span> <span class="o">=</span> <span class="n">FALSE</span><span class="p">,</span>
        <span class="n">overflow</span> <span class="o">=</span> <span class="n">FALSE</span>
      <span class="p">)</span>
    <span class="p">)</span>
  <span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">hc_add_theme</span><span class="p">(</span><span class="n">hc_theme_smpl</span><span class="p">())</span> 

<span class="n">hcgross1</span> <span class="o"><-</span> <span class="n">hcgross</span> <span class="o">%>%</span> 
  <span class="n">hc_title</span><span class="p">(</span><span class="n">text</span> <span class="o">=</span> <span class="n">tt1</span><span class="p">)</span> <span class="o">%>%</span>
  <span class="n">hc_subtitle</span><span class="p">(</span><span class="n">text</span> <span class="o">=</span> <span class="n">stt1</span><span class="p">)</span> <span class="o">%>%</span>
  <span class="n">hc_xAxis</span><span class="p">(</span><span class="n">title</span> <span class="o">=</span> <span class="n">list</span><span class="p">(</span><span class="n">text</span> <span class="o">=</span> <span class="s2">"Date"</span><span class="p">))</span> <span class="o">%>%</span>
  <span class="n">hc_xAxis</span><span class="p">(</span><span class="n">type</span> <span class="o">=</span> <span class="s2">"datetime"</span><span class="p">)</span>

<span class="n">hcgross2</span> <span class="o"><-</span> <span class="n">hcgross</span> <span class="o">%>%</span> 
  <span class="n">hc_title</span><span class="p">(</span><span class="n">text</span> <span class="o">=</span> <span class="n">tt2</span><span class="p">)</span> <span class="o">%>%</span> 
  <span class="n">hc_subtitle</span><span class="p">(</span><span class="n">text</span> <span class="o">=</span> <span class="n">stt2</span><span class="p">)</span> <span class="o">%>%</span>
  <span class="n">hc_xAxis</span><span class="p">(</span><span class="n">title</span> <span class="o">=</span> <span class="n">list</span><span class="p">(</span><span class="n">text</span> <span class="o">=</span> <span class="s2">"Days since release"</span><span class="p">))</span> <span class="o">%>%</span> 
  <span class="n">hc_tooltip</span><span class="p">(</span><span class="n">headerFormat</span> <span class="o">=</span> <span class="n">as.character</span><span class="p">(</span><span class="n">tags</span><span class="o">$</span><span class="n">small</span><span class="p">(</span><span class="s2">"{point.key} days since release"</span><span class="p">)))</span>

<span class="k">for</span> <span class="p">(</span><span class="n">id</span> <span class="k">in</span> <span class="n">unique</span><span class="p">(</span><span class="n">dfgross</...

To leave a comment for the author, please follow the link and comment on their blog: Jkunst - R category.
