Announcing ggraph: A grammar of graphics for relational data

February 22, 2017
I am absolutely thrilled to announce that ggraph has finally been released on CRAN. ggraph is my most ambitious package to date and its very early genesis has been described in a prior post. If any mention of ggraph is completely new to you, then in s...

Euler Problem 13: Large Sum of 1000 Digits

February 22, 2017
Euler Problem 13 asks to add one hundred numbers with fifty digits. This would be a simple problem were it not that most computers are not designed to deal with numbers with a lot of digits. For example: When asking …
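
To illustrate the precision issue the excerpt alludes to, here is a minimal sketch in base R that adds non-negative integers stored as digit strings, carrying by hand. The helper name add_digit_strings is hypothetical and is not taken from the original post, which may well solve the problem differently.

```r
# Hypothetical helper: add two non-negative integers given as digit strings.
add_digit_strings <- function(a, b) {
  # Split into single digits, least significant digit first
  x <- rev(as.integer(strsplit(a, "")[[1]]))
  y <- rev(as.integer(strsplit(b, "")[[1]]))
  n <- max(length(x), length(y))
  # Pad the shorter number with zeros
  x <- c(x, rep(0L, n - length(x)))
  y <- c(y, rep(0L, n - length(y)))
  out <- integer(n + 1)
  carry <- 0L
  for (i in seq_len(n)) {
    s <- x[i] + y[i] + carry
    out[i] <- s %% 10L
    carry <- s %/% 10L
  }
  out[n + 1] <- carry
  # Drop a leading zero, if any, and rebuild the string
  digits <- rev(out)
  if (digits[1] == 0L && length(digits) > 1) digits <- digits[-1]
  paste(digits, collapse = "")
}

# Doubles cannot represent every integer above 2^53, so this comparison is TRUE
2^53 == 2^53 + 1

# The string-based sum keeps every digit
add_digit_strings("9007199254740992", "1")
```

A whole vector of digit strings can be summed with Reduce(add_digit_strings, numbers), where numbers is a character vector.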

The difference between R and Excel

February 22, 2017
If you're an Excel user (or any other spreadsheet, really), adapting to learn R can be hard. As this blog post by Gordon Shotwell explains, one of the reasons is that simple things can be harder to do in R than Excel. But it's worth persevering, because complex things can be easier. While Excel (ahem) excels at things like...

February 22, 2017
This post was originally published on SmartCat, 22 Feb 2017. My inaugural blog as a Data Science Consultant for SmartCat. The code that accompanies the analyses presented here is available at the respective GitHub repository. On how to use R to estimate the optimal time during the day for aliens to invade Earth and a few more...

February 22, 2017
Radiohead is known for having some fairly maudlin songs, but of all of their tracks, which is the most depressing? Data scientist and R enthusiast Charlie Thompson ranked all of their tracks according to a "gloom index", and created the following chart of gloominess for each of the band's nine studio albums. (Click for the interactive version, created with...

How to Teach R: Common mistakes

February 22, 2017
by Garrett Grolemund Would you like to teach people to use R? If so, I would like to jump-start your efforts. I’m one half of RStudio’s education team, and I’ve taught thousands of people to use R, usually in face-to-face workshops. Over time, I’ve come to appreciate that teaching R in a short workshop is

Quick tip: knitr Python Windows setup checklist

February 22, 2017
One of the nifty things about using R is that you can use it for many different purposes and even other languages! If you want to use Python in your knitr docs or the newish RStudio R notebook functionality, you might encounter some fiddliness getting all the moving parts running on Windows. This is a …

leaflet 1.1.0

February 22, 2017
Leaflet 1.1.0 is now available on CRAN! The Leaflet package is a tidy wrapper for the Leaflet.js mapping library, and makes it incredibly easy to generate interactive maps based on spatial data you have in R. This release was nearly a year in the making, and includes many important new features. Easily add textual labels on

Data Transformation in R: The #Tidyverse-Approach of Organizing Data #rstats

February 22, 2017
Yesterday, I had the pleasure of giving a talk at the 8th Hamburg R User-Group meeting. I talked about data wrangling and data transformation, and how the…

Part 3: Spatial analysis of geotagged data

February 21, 2017
See the other parts in this series of blog posts. In parts 1 and 2 we extracted spatial coordinates from our photos and then made an interactive web map that included data associated with those photos. Here I...

Raccoon | Ch 2.5 – Unbalanced and Nested Anova

February 21, 2017
Raccoon is a free web-book about Statistical Models with R. This chapter tackles two Anova special cases: Unbalanced Anova and Nested Anova. The post Raccoon | Ch 2.5 – Unbalanced and Nested Anova appeared first on Quantide - R training & consulting.

The Zero Bug

February 21, 2017
I am going to write about an insidious statistical, data analysis, and presentation fallacy I call “the zero bug” and the habits you need to cultivate to avoid it. Here is the zero bug in a nutshell: common data aggregation tools often cannot “count to zero” from examples, and this causes …
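
One way to picture what “counting to zero” could mean is sketched below in base R. The data and group names are made up for illustration, and the original post may use different tools and examples.

```r
# Made-up data: no events were observed for group "c"
events <- data.frame(group = c("a", "a", "b"))
all_groups <- c("a", "b", "c")

# Counting straight from the examples silently drops group "c"
observed_only <- table(events$group)
observed_only

# Declaring the full set of levels lets the count reach zero
counts <- table(factor(events$group, levels = all_groups))
counts
```

The second table includes an explicit zero for "c" because the factor levels state which groups exist, independently of which groups happen to appear in the data.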

February 21, 2017
Announcing: DataCamp for the classroom, a new free plan for Academics. We want to support every student that wants to learn Data Science. That is why, as of today, professors/teachers/TA’s/… can give their students 6 months of FREE access to the f...

Linear Regression and ANOVA shaken and stirred

February 21, 2017
Linear Regression and ANOVA are understood as separate concepts most of the time. The truth is that they are closely related, ANOVA being a particular case of Linear Regression. Even worse, it's quite common for students to memorize equations and tests instead of trying to understand Linear Algebra and Statistics concepts that can keep you away...
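
As a small illustration of the claim that ANOVA is a particular case of linear regression, the sketch below fits the same one-way model with lm() and aov() on R's built-in PlantGrowth dataset and compares the F statistics. The choice of dataset is an assumption made here for illustration and is not taken from the original post.

```r
# One-way layout: plant weight explained by treatment group
fit_lm  <- lm(weight ~ group, data = PlantGrowth)
fit_aov <- aov(weight ~ group, data = PlantGrowth)

# F statistic for the group term from the regression's ANOVA table
f_lm <- anova(fit_lm)[["F value"]][1]

# F statistic for the group term from the classical ANOVA summary
f_aov <- summary(fit_aov)[[1]][["F value"]][1]

# Both routes give the same test statistic
all.equal(f_lm, f_aov)
```

The two agree because aov() fits the same linear model with dummy-coded group indicators; only the presentation of the results differs.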

Text Mining on Wine Description

February 21, 2017
Here is an example of text mining with correspondence analysis. Within the context of research into the characteristics of the wines from Chenin vines in the Loire Valley (French wines), a set of 10 dry white wines from Touraine were studied: 5 Touraine Protected Appellation of Origin (AOC) from Sauvignon vines, and 5 Vouvray AOC from Chenin

Three R Shiny tricks to make your Shiny app shines (2/3): Semi-collapsible sidebar

February 21, 2017
EDIT: Actually there is a much easier way to do so, by just adding the code below to the UI: tags$script(HTML("$('body').addClass('sidebar-mini');")) Thanks to @_pvictorr for suggesting it! Original post: In this tutorial sequence, we are going to see three tricks to do the following in a Shiny app: Add Next and Previous buttons to navigate

Coming soon!

February 21, 2017
We've just received a picture of the cover of the BCEA book, which is really, really close to being finally published! I did mention this in a few other posts (for example here and here) and it has been in fact a rather long process, so much so that I...

How to make a global map in R, step by step

February 21, 2017
In this post, I want to walk you through the logic of building a map, step by step ... The post How to make a global map in R, step by step appeared first on SHARP SIGHT LABS.

Use switch() instead of ifelse() to return a NULL

February 21, 2017
Have you ever tried to return a NULL with the ifelse() function? This function is a simple vectorized workflow for conditional statements. However, one can’t just return a NULL value as a result of this evaluation. Check a tricky workaround solution...
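
A minimal sketch of the idea in the title follows. The variable names are hypothetical, and the workaround shown in the original post may differ from this one.

```r
use_default <- TRUE

# ifelse() fills a result vector element by element, so a zero-length NULL
# cannot be assigned into it; the call fails and we capture the error here
res_ifelse <- tryCatch(
  ifelse(use_default, NULL, "custom"),
  error = function(e) "error"
)

# switch() with a character selector returns the matching branch's value,
# which is allowed to be NULL
res_switch <- switch(as.character(use_default),
                     "TRUE" = NULL,
                     "FALSE" = "custom")

is.null(res_switch)
```

Unlike ifelse(), switch() is not vectorized, so this only applies when the condition is a single value; a plain if/else expression would also work in that case.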

Who is Alan Turing?

February 21, 2017
“This government is committed to introducing posthumous pardons for people with certain historical sexual offence convictions who would be innocent of any crime now” (British Government Spokesperson, September 2016). Last September, the British government announced its intention to pursue what has become known as the Alan Turing law, offering exoneration to the tens of thousands …

Sentiment Analysis in R

February 21, 2017
Current research in finance and the social sciences utilizes sentiment analysis to understand human decisions in response to textual materials. While sentiment analysis has received great traction lately, the available tools are not yet living up to the needs of researchers. R in particular does not yet have the capabilities that most research requires. Our package “SentimentAnalysis” performs …

How to Create a Data Visualization from the New York Times in R

February 20, 2017
Undoubtedly, the New York Times publishes the best data visualizations and infographics that are data intensive, yet are elegant. The elegance comes from carefully studying the data, identifying the key patterns and simplifying the graphics to show these patterns or trends. Here’s what Amanda Cox, editor of The Upshot, said in an interview: “we probably …

First commit or initial commit?

February 20, 2017
When I create a new .git repository, my first commit message tends to be “1st commit”. I’ve been wondering what other people use as initial commit message. Today I used the gh package to get first commits of all repositories of the ropensci and r...

coauthorship and citation networks

February 20, 2017
As I discovered (!) the Annals of Applied Statistics in my mailbox just prior to taking the local train to Dauphine for the first time in 2017 (!), I started reading it on the way, but did not get any further than the first discussion paper by Pengsheng Ji and Jiashun Jin on coauthorship and

Training Neural Networks with MXNet

February 20, 2017
A multilayer perceptron (MLP) is the simplest feed-forward neural network. It mitigates the constraints of the original perceptron, which was able to learn only linearly separable patterns from the data. It achieves this by introducing at least one hidden layer in order to learn a representation of the data that would enable linear separation. In the first layer, the MLP applies linear...

R Weekly

February 20, 2017
During my Monday morning ritual of avoiding work, I found this publication that is written in R, for people who use R: R Weekly. The authors do a pretty awesome job of aggregating useful, entertaining, and informative content about what’s happening surrounding our favorite programming language. Check it out, give the authors some love on GitHub, and leave …

rxNeuralNet vs. xgBoost vs. H2O

February 20, 2017
Recently, I did a session at a local user group in Ljubljana, Slovenia, where I introduced the new algorithms that are available with the MicrosoftML package for Microsoft R Server 9.0.3. For datasets, I used two from (still currently) running sessions on Kaggle. In the last part, I did image detection and prediction on the MNIST dataset …

strcode – structure your code better

I am pleased to announce my package strcode, a package that should make structuring code easier. You can install it from GitHub; a CRAN submission is planned at a later stage.

Data Science for Doctors – Part 4 : Inferential Statistics (1/5)

February 20, 2017
Data science enhances people’s decision making. Doctors and researchers are making critical decisions every day. Therefore, it is absolutely necessary for those people to have some basic knowledge of data science. This series aims to help people in and around the medical field to enhance their data science skills. We will work with a health related …