Articles by Anindya Mozumdar

An update to “An adventure in downloading books”

May 2, 2020 | Anindya Mozumdar

I received an email from Bernardo Lares as feedback on my previous article. You can also view some of his other cool work at this link. His script is provided below. He uses the rvest package and the %>% operator to keep it really short and simple.
library(rvest)
library(dplyr)
library(stringr)

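# Article that lists the free Springer books
list <- "https://towardsdatascience.com/springer-has-released-65-machine-learning-and-data-books-for-free-961f8181f189"
# Read the page text and split it at every Springer book link
aux <- read_html(list) %>%
  html_node("div") %>%
  html_text() %>%
  str_split("http://link.springer.com/openurl\\?genre=book&isbn=")
# Keep the 17 characters that follow each link prefix; drop the text before the first link
ids <- substr(unlist(aux), 1, 17)[-1]
# Download each book as a PDF named after its id
sapply(ids, function(x) {
  url <- paste0("https://link.springer.com/content/pdf/10.1007%2F", x, ".pdf")
  download.file(url, paste0(x, ".pdf"), mode = "wb")
})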
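
To see what the splitting step does without touching the network, here is a minimal offline sketch. It is not part of Bernardo's script: the page text and the two ISBN-like ids are made-up placeholders, and base R's strsplit stands in for str_split so that it runs without extra packages.

```r
# Made-up page text; the two ISBN-like ids are placeholders, not real books.
prefix <- "http://link.springer.com/openurl?genre=book&isbn="
page_text <- paste0(
  "Intro text ",
  prefix, "978-3-319-12345-6 Some title. ",
  prefix, "978-1-4939-0000-1 Another title."
)

# Split on the literal prefix (fixed = TRUE avoids escaping the "?").
pieces <- strsplit(page_text, prefix, fixed = TRUE)[[1]]

# Drop the text before the first link, then keep the first 17 characters
# of each remaining piece (the script does the same two steps in the other order).
ids <- substr(pieces[-1], 1, 17)

# Build the download URLs the same way the script does, without downloading.
urls <- paste0("https://link.springer.com/content/pdf/10.1007%2F", ids, ".pdf")

print(ids)
print(urls)
```

The sketch relies on the same assumption as the script: every id is exactly 17 characters long, which holds for a hyphenated ISBN-13.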
[Read more...]

An adventure in downloading books

April 26, 2020 | Anindya Mozumdar

Earlier today, I noticed a tweet from well-known R community member Jozef Hajnala. The tweet was about Springer releasing around 65 books related to data science and machine learning for free to download as PDFs. Following the link in his tweet, I learned that Springer has released 408 books in total, ... [Read more...]

A guide to encoding categorical features using R

February 1, 2020 | Anindya Mozumdar

In this article, we will look at various options for encoding categorical features. We will also present R code for each of the encoding techniques. Categorical feature encoding is an important data processing step required for using these features in many statistical modelling and machine learning algorithms. The material in ... [Read more...]

Shiny splash screen using modules and shinyjs

December 10, 2019 | Anindya Mozumdar

A while ago I was researching how to create a splash screen for a Shiny application. My gut feeling was that a package for this would be readily available. I was surprised to find that not much information turned up in a 10-minute Google search. The top StackOverflow ... [Read more...]

R Vocabulary – Part 4

March 6, 2019 | Anindya Mozumdar

This is the fourth and final part in the series of articles on R vocabulary. In this series, we explore most of the functions mentioned in Chapter 2 of the book Advanced R. The first, second and third part of the series can be read here, here and here. In this ... [Read more...]

R Vocabulary – Part 3

February 11, 2019 | Anindya Mozumdar

This is the third part of the series of articles on R vocabulary. In this series, we explore most of the functions mentioned in Chapter 2 of the book Advanced R. The first part of the series can be read here and the second part of the series can be read ... [Read more...]

R Vocabulary – Part 2

January 25, 2019 | Anindya Mozumdar

This is the second part of the series of articles on R vocabulary. In this series, we explore most of the functions mentioned in Chapter 2 of the book Advanced R. The first part of the series can be read here. The keyword function is used to define what is technically ... [Read more...]

R Vocabulary – Part 1

December 22, 2018 | Anindya Mozumdar

To be a proficient R user, you need to read and understand the material in the book Advanced R by Hadley Wickham. The second chapter in this book is on vocabulary - a list of functions from the base, stats and utils packages which all R users should be familiar ... [Read more...]

Exploring R Packages – plyr

December 4, 2018 | Anindya Mozumdar

In this post, we explore the functionality provided by the plyr package. The ideas behind this package are described in this paper by Hadley Wickham. However, rather than trying to understand the theoretical underpinnings of the package, we look at some of the useful functions provided by this package and ... [Read more...]

Analysing IPL matches using Cricsheet data – Part 1

December 31, 2017 | Anindya Mozumdar

In a series of articles, I will be analysing Indian Premier League (IPL) cricket matches using data from cricsheet and using the R programming language. Cricsheet is an excellent website which provides ball-by-ball data for a large number of cricket matches. The IPL is a professional Twenty20 cricket league in ... [Read more...]
