Looking back in 2017 and plans for 2018

[This article was first published on Marcelo S. Perlin, and kindly contributed to R-bloggers].

As we come close to the end of 2017, it's time to look back. This has
been a great year for me in many ways. This blog started as a way to
write short pieces about using R for finance and promote my
book in an organic way.
Today, I’m very happy with my decision. Discovering and trying new
writing styles keeps my interest alive. Academic research is very
strict about what you can write and publish. It is satisfying to see that I
can promote my work and have an impact in different ways, not only
through the publication of academic papers.

My blog is built using a Jekyll template, meaning the whole
site, including individual posts, is built and controlled with editable
text files and GitHub. All files related to posts follow the same
structure, meaning I can easily gather the textual data and organize it
in a nice tibble. Let’s first have a look at all post files:

post.folder <- '~/GitRepo/msperlin.github.io/_posts/'

my.f.posts <- list.files(post.folder, full.names = TRUE)
my.f.posts

##  [1] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-15-First-post.md"                  
##  [2] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-16-BatchGetSymbols.md"             
##  [3] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-17-predatory.md"                   
##  [4] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-18-GetHFData.md"                   
##  [5] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-19-CalculatingBetas.md"            
##  [6] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-30-Exams-with-dynamic-content.md"  
##  [7] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-05-R-and-Tennis.md"                
##  [8] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-06-My-Book-is-out.md"              
##  [9] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-10-Shiny_Exams.md"                 
## [10] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-13-R-and-Tennis-Players.md"        
## [11] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-16-Writing-a-book.md"              
## [12] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-03-05-Prophet-and_stock-market.md"    
## [13] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-03-26-pmdR-exercises.md"              
## [14] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-04-pafdR-is-out.md"                
## [15] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-09-Studying-Pkg-Names.md"          
## [16] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-15-R-Finance.md"                   
## [17] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-29-Update-GetHFData-1-3.md"        
## [18] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-06-01-Instaling-R-in-Linux.md"        
## [19] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-08-24-Reinstalling_R_Packages.md"     
## [20] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-08-24-Switching_to_Linux.md"          
## [21] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-04-Package-GetLattesData.md"       
## [22] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-10-Update-GetHFData-1-4.md"        
## [23] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-14-Brazilian-Yield-Curve.md"       
## [24] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-29-_Package-GetITRData.md"         
## [25] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-12-06-_Package-GetDFPData.md"         
## [26] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-12-13-_Serving-shiny-apps-internet.md"

I published 26 posts during 2017. Notice how every file name begins
with the post's date. I can easily convert that to a Date object using
as.Date. Let’s organize it all in a nice tibble.

library(tidyverse)

## ── Attaching packages ─────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──

## ✔ ggplot2 2.2.1     ✔ purrr   0.2.4
## ✔ tibble  1.4.1     ✔ dplyr   0.7.4
## ✔ tidyr   0.7.2     ✔ stringr 1.2.0
## ✔ readr   1.1.1     ✔ forcats 0.2.0

## ── Conflicts ────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

df.posts <- tibble(ref.date = as.Date(basename(my.f.posts)),  # parses the leading YYYY-MM-DD of each file name
                   ref.month = format(ref.date, '%m'), 
                   content = sapply(my.f.posts, function(x) paste0(readLines(x), collapse = '\n') ),
                   char.length = nchar(content)) %>%  # length includes code and output, not only prose
  filter(ref.date >= as.Date('2017-01-01') & ref.date < as.Date('2018-01-01') ) # keep 2017 only; not really necessary now but useful in the future

glimpse(df.posts)

## Observations: 26
## Variables: 4
## $ ref.date    <date> 2017-01-15, 2017-01-16, 2017-01-17, 2017-01-18, 2...
## $ ref.month   <chr> "01", "01", "01", "01", "01", "01", "02", "02", "0...
## $ content     <chr> "---\nlayout: post\ntitle: \"My first post!\"\nsub...
## $ char.length <int> 1734, 5833, 6632, 17265, 23414, 12974, 18899, 1779...
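
The date handling above can also be written out more explicitly. The sketch below uses two file names from the listing and base R only; `post.files`, `date.str` and `in.2017` are illustrative names, not part of the original script. It cuts the 10-character date prefix before calling as.Date, and it combines the two bounds of the year with `&`, since a post belongs to 2017 only if both conditions hold.

```r
# Two file names following the Jekyll "YYYY-MM-DD-title.md" convention
post.files <- c('_posts/2017-01-15-First-post.md',
                '_posts/2017-12-13-_Serving-shiny-apps-internet.md')

# Keep only the first 10 characters of the base name, i.e. the date prefix
date.str <- substr(basename(post.files), 1, 10)
post.dates <- as.Date(date.str, format = '%Y-%m-%d')

# A date is in 2017 only if it satisfies both bounds, hence `&` and not `|`
in.2017 <- post.dates >= as.Date('2017-01-01') &
  post.dates < as.Date('2018-01-01')

print(data.frame(file = basename(post.files), ref.date = post.dates, in.2017))
```

Cutting the prefix explicitly is a design choice: it makes the intent visible and does not depend on how as.Date treats the trailing text of the file name.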

First, let’s look at the frequency of posts by month:

print(ggplot(df.posts, aes(x = ref.month)) + geom_bar())  # geom_bar() counts the posts in each month

It is not accidental that January was the month with the highest number
of posts. This is when I had material reserved for the book. June and
July (0!) were the worst months as I traveled a lot. In June I attended
R and Finance in Chicago and SER in Rio de Janeiro, and in July I was
visiting Goethe University in Germany for the whole month. On average, I
created about 2.17 posts per month (26 posts over 12 months), which
feels quite all right. I hope I can keep that pace for the upcoming years.
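
As a rough check of the monthly average, the counts can be recomputed directly from the dates in the file listing. This is only a sketch in base R; `post.dates`, `posts.per.month` and `avg.posts` are illustrative names. It uses tabulate with a fixed number of bins so that months without any post show up as zero instead of being dropped.

```r
# Post dates taken from the file names listed earlier
post.dates <- as.Date(c(
  '2017-01-15', '2017-01-16', '2017-01-17', '2017-01-18', '2017-01-19',
  '2017-01-30', '2017-02-05', '2017-02-06', '2017-02-10', '2017-02-13',
  '2017-02-16', '2017-03-05', '2017-03-26', '2017-05-04', '2017-05-09',
  '2017-05-15', '2017-05-29', '2017-06-01', '2017-08-24', '2017-08-24',
  '2017-09-04', '2017-09-10', '2017-09-14', '2017-09-29', '2017-12-06',
  '2017-12-13'))

# Count posts in each of the 12 months; empty months are kept as zeros
month.idx <- as.integer(format(post.dates, '%m'))
posts.per.month <- tabulate(month.idx, nbins = 12)
names(posts.per.month) <- month.abb

# Average over the whole year, including months with no posts
avg.posts <- sum(posts.per.month) / 12

print(posts.per.month)
print(round(avg.posts, 2))
```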

As for the length of posts, below we can see a nice pattern for its
distribution conditional on the months of the year.

print(ggplot(df.posts, aes(x=ref.month, y = char.length)) + geom_boxplot())

I was not very productive from May to August, writing few and short
posts compared with other months. This was probably due to my
travels.
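
Note that char.length counts everything in each file, including code and its output. If one wanted to measure prose only, a minimal sketch could look like the function below. It is a hypothetical helper, not something used in this post, and it assumes that each post starts with a YAML front matter delimited by `---` lines and that code sits inside fences opened and closed by three backticks.

```r
# Hypothetical helper: count characters of prose lines only.
# Assumes a leading '---' front matter and ``` code fences.
count.prose.chars <- function(lines) {
  # Drop the YAML front matter: everything up to the second '---' line
  fm.idx <- which(lines == '---')
  if (length(fm.idx) >= 2 && fm.idx[1] == 1) {
    lines <- lines[-seq_len(fm.idx[2])]
  }

  # A line is inside a fence when an odd number of fences has been seen so far
  is.fence <- grepl('^```', lines)
  inside <- cumsum(is.fence) %% 2 == 1

  # Keep lines that are neither fence markers nor inside a fenced block
  prose <- lines[!inside & !is.fence]
  sum(nchar(prose))
}

# Small made-up post to try the helper on
example.post <- c('---', 'layout: post', 'title: "My first post!"', '---',
                  'Some text.', '```r', 'x <- 1', '```', 'More text.')
n.prose <- count.prose.chars(example.post)
print(n.prose)
```

Unlike the char.length column, this count ignores line breaks, so the two numbers are not directly comparable.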

Plans for 2018

Besides the usual effort in research and teaching, my plans for 2018
are:

  • Work on the second edition of the Portuguese book. It significantly
    lags the English version in content and this needs to be fixed. I
    already have some ideas laid out for new chapters and new packages
    to cover. I’ll write more about this update as soon as I have it
    figured out.

  • Start a portal for financial data in Brazil. I want to make it
    easy for people to visualize and download organized financial data,
    especially those without programming experience. It will include the
    usual datasets such as prices in equity/bond/derivative markets for
    various frequencies, historical yield curves, financial statements
    of companies, and so on. The idea is to offer the datasets in
    various file formats, facilitating their use in research.

That's it. If you got this far, happy new year! Enjoy your family and the
holidays!
