DIY – cheat sheets

March 20, 2017

I recently found that, in addition to a great list of cheatsheets designed by RStudio, one can also download a template for new cheatsheets from the RStudio Cheat Sheets webpage. With this template you can design your own cheatsheet and submit it to the collection of Contributed Cheatsheets (Garrett Grolemund will help to improve the submission … Read...

Read more »

How Do You Discover R Packages?

March 19, 2017

As I mentioned in my last blog post, I am contributing to a session at useR! 2017 this coming July that will focus on discovering and learning about R packages. This is an increasingly important issue for R users as we all decide which of the 10,000+...

Read more »

Practical Data Science with R: ACM SIGACT News Book Review and Discount!

March 19, 2017

Our book Practical Data Science with R has just been reviewed in the Association for Computing Machinery Special Interest Group on Algorithms and Computation Theory (ACM SIGACT) News by Dr. Allan M. Miller (U.C. Berkeley)! The book is half off at Manning from March 21st 2017 using the following code (please share/Tweet): Deal of the Day … Continue...

Read more »

Exploring 2017 Retail Store Closings with R

March 19, 2017

A story about one of the retail chains (J.C. Penney) releasing its list of stores closing in 2017 crossed paths with my Feedly reading list today and jogged my memory that there were a number of chains closing many of their doors this year, and I wanted to see the impact that might have on... Continue reading...

Read more »

News and Updates Surrounding plotly for R

March 19, 2017

The plotly R package will soon release version 4.6.0 which includes new features that are over a year in the making. The NEWS file lists all the new features and changes. This webinar highlights the most important new features including animations and multiple linked views. Concrete examples with code that you can run yourself will

Read more »

Data science for Doctors: Inferential Statistics Exercises (part-2)

March 19, 2017

Data science enhances people’s decision making. Doctors and researchers make critical decisions every day, so it is essential for them to have some basic knowledge of data science. This series aims to help people who work in or around the medical field to enhance their data science skills. We will work with a health related...

Read more »

Linear Regression and ANOVA shaken and stirred (Part 1)

March 19, 2017

Linear Regression and ANOVA are understood as separate concepts most of the time. The truth is that they are closely related, ANOVA being a particular case of Linear Regression. Even worse, it's quite common for students to memorize equations and tests instead of trying to understand the Linear Algebra and Statistics concepts that can keep you away...

Read more »

The U.S. Has Been At War 222 Out of 239 Years

March 19, 2017

This morning, I discovered an interesting statistic, “America Has Been At War 93% of the Time – 222 Out of 239 Years – Since 1776”, i.e. the U.S. has been at peace for less than 20 years in total since its birth. I wanted to check it, get a better understanding, and look at other countries in the world. As always,...

Read more »

Preparing Datetime Data for Analysis with padr and dplyr

March 19, 2017

Two months ago padr was introduced, followed by an improved version that allowed for applying pad on group level. See the introduction blogs or the vignette("padr") for more package information. In this blog I give four more elaborate examples on how to go from raw data to insight with padr, dplyr and ggplot2. They might serve as recipes for...

Read more »

Rcpp 0.12.10: Some small fixes

March 19, 2017

The tenth update in the 0.12.* series of Rcpp just made it to the main CRAN repository, which by now provides GNU R with over 10,000 packages. Windows binaries for Rcpp, as well as updated Debian packages, will follow in due course. This 0.12.10 release follows the 0.12.0 release from late July, the

Read more »

tidyquant Integrates Quandl: Getting Data Just Got Easier

Today I’m very pleased to introduce the new Quandl API integration that is available in the development version of tidyquant. Normally I’d introduce this feature during the next CRAN release (v0.5.0 coming soon), but it’s really useful and honest...

Read more »

Faces of #rstats Twitter

March 18, 2017

This week I was impressed by this tweet where Daniel Pett, Digital Humanities Lead at the British Museum, presented a collage of Twitter profile pics of all his colleagues. He made this piece of art using R (for collecting the usernames) and Python. I’...

Read more »

Python & R vs. SPSS & SAS

March 18, 2017

When we’re working for clients we mostly come across the statistical programming languages SAS, SPSS, R and Python. Of these, SAS and SPSS are probably the most used. However, interest in the open source languages R and Python is increasing. In recent years, some of our clients migrated from using SAS or SPSS to

Read more »

Presentation “R for Data Science”

March 18, 2017

Some weeks ago I gave a presentation at my workplace about “R for data science” that I’d like to share with you. I’ve written the slides in R and rmarkdown and uploaded them to rpubs.com. I chose to use rmarkdown for my slides, although we have great company PowerPoint templates, because I wanted to … Continue...

Read more »

My 3 video presentations on “Essential R”

March 17, 2017

In this post I include my 3 video presentations on the topic “Essential R”. In these 3 presentations I cover the entire landscape of R. I cover the following: R Language – The essentials; Key R Packages (dplyr, lubridate, ggplot2, etc.); How to create R Markdown and share reports; A look at Shiny apps; How … Continue...

Read more »

Contours of statistical penalty functions as GIF images

March 17, 2017

Many statistical modeling problems reduce to a minimization problem of the general form (1) or (2-3), where $f$ is some type of loss function, $\mathbf{X}$ denotes the data, and $g$ is a penalty, also referred to by other names, such as “regularization term” (problems (1) and (2-3) are often equivalent, by the way). Of course, both $f$ and $g$ may depend on further...
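The displayed forms for problems (1) and (2-3) are not shown in this excerpt. A plausible sketch of the penalized and constrained versions follows. The parameter vector $\boldsymbol{\beta}$, the tuning parameter $\lambda \ge 0$ and the bound $t$ are assumptions, since the excerpt names only $f$, $\mathbf{X}$ and $g$:

```latex
% (1) penalized form: loss plus weighted penalty
\min_{\boldsymbol{\beta}} \; f(\boldsymbol{\beta}; \mathbf{X}) + \lambda \, g(\boldsymbol{\beta}) \tag{1}

% (2-3) constrained form: minimize the loss subject to a bound on the penalty
\min_{\boldsymbol{\beta}} \; f(\boldsymbol{\beta}; \mathbf{X})
\quad \text{subject to} \quad g(\boldsymbol{\beta}) \le t \tag{2-3}
```

Under these assumptions, one direction of the equivalence is easy to check: any minimizer $\hat{\boldsymbol{\beta}}$ of (1) also solves the constrained problem with $t = g(\hat{\boldsymbol{\beta}})$.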

Read more »

Data Science at StitchFix

March 17, 2017

If you want to see a great example of how data science can inform every stage of a business process, from product concept to operations, look no further than Stitch Fix's Algorithms Tour. Scroll down through this explainer to see how this personal styling service uses data and statistical inference to suggest clothes their customers will love, ship them...

Read more »

One way MANOVA exercises

March 17, 2017

In ANOVA our interest lies in knowing if one continuous dependent variable is affected by one or more categorical independent variables. MANOVA is an extension of ANOVA where we are now able to understand how several dependent variables are affected by independent variables. For example, consider an investigation where a medical investigator has developed 3...

Read more »

Experimenting With Sankey Diagrams in R and Python

March 17, 2017

A couple of days ago, I spotted a post by Oli Hawkins on Visualising migration between the countries of the UK which linked to a Sankey diagram demo of Internal migration flows in the UK. One of the things that interests me about the Jupyter and RStudio centred reproducible research ecosystems is their support for

Read more »

DataChats: An Interview with Hank Roark

March 17, 2017

Hola! We just released episode 14 of our DataChats video series, and you'll like this one. In this episode, we interview Hank Roark. Hank is a Senior Data Scientist at Boeing and a long-time user of the R language. Prior to his current role, he led the C...

Read more »

Another R [Non-]Standard Evaluation Idea

March 17, 2017

Jonathan Carroll had an interesting R language idea: to use @-notation to request value substitution in a non-standard evaluation environment (inspired by MySQL User-Defined Variables). He even picked the right image: The idea is kind of the reverse of some Lisp ideas ("evaled unless ticked"), but an interesting possibility. We can play along with it … Continue...

Read more »

Quandl and Forecasting

March 17, 2017

A Reproducible Finance with R Post by Jonathan Regenstein. Welcome to another installment of Reproducible Finance with R. Today we are going to shift focus in recognition of the fact that there’s more to Finance than stock prices, and there’s more to data download than quantmod/getSymbols. In this post, we will explore commodity prices using

Read more »

Because it's Friday… The IKEA Billy index

March 17, 2017

Introduction: Because it is Friday, another ‘playful and frivolous’ data exercise 🙂 IKEA is more than a store; it is a very nice experience to go through. I can drop off my two kids at Småland, have some ‘quality time’ … Continue reading →

Read more »

How many digits into pi do you need to go to find your birthday?

March 17, 2017

FIND YOUR BIRTHDAY IN PI, IN THREE DIFFERENT FORMATS. It was Pi Day (March 14, like 3/14, like 3.14, get it?) recently and Time Magazine did a fun interactive app in which you can find your birthday inside the digits of pi. However: 1) They only used one format of birthday, in which July 4th...

Read more »

Summary of the BaselR meetup – March 2017

Last week I took part in the BaselR meeting hosted in the very nice Roche Learning Center Auditorium and organised by Mango Solutions. This post will summarise this meetup. FFTrees: FFTrees was presented by its creator, Nathaniel D. Phillips. This ...

Read more »

what does more efficient Monte Carlo mean?

March 16, 2017

“I was just thinking that there might be a magic trick to simulate directly from this distribution without having to go for less efficient methods.” A simple question on X validated a few days ago included the remark that the person asking the question wanted a direct simulation method

Read more »

Book Review: Testing R Code

March 16, 2017

When it comes to getting things right in data science, most of the focus goes to the data and the statistical methodology used. But when a misplaced parenthesis can throw off your results entirely, ensuring correctness in your programming is just as important. A new book published by CRC Press, Testing R Code by Richard (Richie) Cotton, provides all...

Read more »

Mapping Housing Data with R

March 16, 2017

What is my home worth?  Many homeowners in America ask themselves this question, and many have an answer.  What does the market think, though?  The best way to estimate a property’s value is by looking at other, similar properties that have sold recently in the same area – the comparable sales approach.  In an effort … Continue...

Read more »
