April 2016

New nasadata R package

April 30, 2016 | 0 Comments

This package aims to provide a hassle-free way to access some of NASA’s open-source APIs to build applications or models. Because the documentation seems inconsistent and there are tons of APIs, I have concentrated my efforts on three which I believe provide the best “bang for my ... [Read more...]

Register now for Hadley Wickham’s Master R in Amsterdam

April 30, 2016 | 0 Comments

On May 19 and 20, 2016, Hadley Wickham will teach his two-day Master R Developer Workshop in the centrally located European city of Amsterdam. This is the first time we’ve offered Hadley’s workshop in Europe. It’s a rare chance to learn from Hadley in person. Only 3 public Master R ...
[Read more...]

Introduction to R for Data Science

April 30, 2016 | 0 Comments

Branko Kovač, Data Analyst at CUBE, Data Science Mentor at Springboard, Institut savremenih nauka, Data Science Serbia, and Goran S. Milovanović, Data Science Serbia, are giving a free introductory course on R for Data Science in Belgrade, Serbia. All course materials - slides, R scripts, data sets, summaries ... [Read more...]

Why I fart? (or how small data changes life)

April 30, 2016 | 0 Comments

I have had a gas problem for quite a while. Usually, right after I have lunch, gas starts to accumulate in my belly. Then comes the fart. It was really annoying, especially when I sat in the front row of a class. Sometimes it was even painful as there are too ... [Read more...]

Introduction to R for Data Science :: Session 1

April 30, 2016 | 0 Comments

Welcome to Introduction to R for Data Science Session 1! The course is co-organized by Data Science Serbia and Startit. You will find all course material (R scripts, data sets, SlideShare presentations, readings) on these pages. [in Serbian] Lecturers dipl. ing Branko Kovač, Data Analyst at CUBE, Data Science Mentor at ... [Read more...]

Lattice exercises – part 1

April 30, 2016 | 0 Comments

In the exercises below we will use the lattice package. First, we have to install this package with install.packages("lattice") and then load it with library(lattice). The lattice package lets us create univariate, bivariate and trivariate plots. For this set of exercises we will see univariate ... [Read more...]

Base R Nostalgia — by, tapply, ave, …

April 30, 2016 | 0 Comments

photo credit: Paul Yoakum. This evening I was feeling nostalgic for base R group-bys. Before there was dplyr, there was apply and its cousins. I thought it’d be nice to get out the ol’ photo-album. To start off, the base R proto-ancestor of magrittr piping for me was the ... [Read more...]

Identify, describe, plot, and remove the outliers from the dataset

April 30, 2016 | 0 Comments

In statistics, an outlier is defined as an observation that lies far away from most other observations. Often an outlier is present due to measurement error. Therefore, one of the most important tasks in data analysis is to identify and (if necessary) remove the outliers. ... [Read more...]

Data science with Docker

April 29, 2016 | 0 Comments

Using Docker to facilitate your data science pipelines. Until recently, like many other fellow data scientists I have talked to, I built data science pipelines on my local machine or a remote host while relying on virtual environments. In doing so, I ensured some degree of replicability by keeping ... [Read more...]

Bad Neighbours (no, not the movie)

April 29, 2016 | 0 Comments

Another day, another compulsion to see if I can do any better than someone’s solution. This one also comes from the FiveThirtyEight Puzzler challenge, courtesy of Xi’an. The original challenge this time was The misanthropes are coming. Suppose there is...
[Read more...]

Tufte-style graphics in R

April 29, 2016 | 0 Comments

It's not an overstatement to say that, at least for me personally, Edward Tufte's book The Visual Display of Quantitative Information was transformative. Reading this book got me and, I feel confident saying, many, many other data scientists passionate about visualizing data. This is the book that popularized Minard's chart ... [Read more...]

Reasons to Move your Surveys Online

April 29, 2016 | 0 Comments

When I was collecting data for my last project, I printed off reams upon reams of paper for my questionnaires, information sheets etc. I did not particularly like it at the time but I could not see a different way of doing it. However, when it was completed and I ...
[Read more...]

Cross-Validation: Estimating Prediction Error

April 29, 2016 | 0 Comments

What is cross-validation? Cross-validation is a technique used in model selection to better estimate the test error of a predictive model. The idea behind cross-validation is to create a number of partitions of the sample observations, known as validation sets, from the training data set. After fitting a model on ... [Read more...]

testthat 1.0.0

April 28, 2016 | 0 Comments

testthat 1.0.0 is now available on CRAN. Testthat makes it easy to turn your existing informal tests into formal automated tests that you can rerun quickly and easily. Learn more at http://r-pkgs.had.co.nz/tests.html. Install the latest version with install.packages("testthat"). This version of testthat saw ...
[Read more...]

Talk on regtools and P-Values

April 28, 2016 | 0 Comments

I’m deeply grateful to Hui Lin and the inimitable Yihui Xie for arranging for me to give a “virtual seminar talk” to the Central Iowa R Users Group. You can view my talk, including an interesting Q&A session, online. (The actual start is at 0:34.) There are two separate ...
[Read more...]

Playing with Twitter Data

April 28, 2016 | 0 Comments

Last Friday, the Institute for Social Sciences hosted a great one-day conference on various aspects of the reproducibility crisis, Making Social Science Transparent. It was the first time I’ve done much tweeting during an event like this, and while it felt a little silly, it was also fun, it ... [Read more...]

The Life-Changing Magic of Tidying Text

April 28, 2016 | 0 Comments

When I went to the rOpenSci unconference about a month ago, I started work with Dave Robinson on a package for text mining using tidy data principles. What is this tidy data you keep hearing so much about? As described by Hadley Wickham, tidy data has a specific structure: each ... [Read more...]