Updates to the ‘forecast’ package for R

June 20, 2016

The forecast package for R, created and maintained by Professor Rob Hyndman of Monash University, is one of the more useful R packages available on CRAN. Statistical forecasting (the process of predicting the future value of a time series) is used in just about every realm of data analysis, whether it's trying to predict a future...

Read more »

How to reshape data in R: tidyr vs reshape2

June 20, 2016

Reshape your data from long to wide, split a column, aggregate: a comparison between the tidyr and reshape2 R packages to tidy data. The post How to reshape data in R: tidyr vs reshape2 appeared first on MilanoR.

Read more »

Analyzing Public Health Data in R

June 20, 2016

Today’s post is by Thomas Yokota, an epidemiologist in Hawaii. I’ve been corresponding with Thomas via email and telephone for a while. I asked Thomas if he could write an introduction to how R, mapping and open data are used in the public health community. This is his reply. What is public health? Public health is a...

Read more »

R for Publication by Page Piccinini: Lesson 5 – Analysis of Variance (ANOVA)

June 20, 2016

In today’s lesson we’ll take care of the baseline issue we had in the last lesson when we have a linear model with an interaction. To do that we’ll be learning about analysis of variance or ANOVA. We’ll also be going over how to make barplots with error bars, but not without hearing my reasons Lesson 5: Analysis...

Read more »

A Call to Arms[list] Data Analysis!

June 19, 2016

The NPR vis team contributed to a recent story about Armslist, a “craigslist for guns”. Now, I’m neither pro-“gun” nor anti-“gun” since this subject, like most heated ones, has more than two sides. What I am is pro-data, and the U.S. Congress is so deep in the pockets of the NRA that there’s no way... Continue reading →

Read more »

Amazing Things That Happen When You Toss a Coin 12 Times

June 19, 2016

If there is a God, he’s a great mathematician (Paul Dirac). Imagine you toss a coin 12 times and count how many heads and tails you have obtained after each throw (the coin is fair, so the probability of heads or tails is the same). At some point, it can happen that the number of heads and the number of tails … Continue reading...

Read more »

Using xda with googlesheets in R

June 18, 2016

Want to do a quick, exploratory data analysis in R of your data that’s stored in a spreadsheet on Google Drive? You’re in luck, because now you can use the new xda

Read more »

Venn Diagram Comparison of Boruta, FSelectorRcpp and GLMnet Algorithms

June 18, 2016

Feature selection is a process of extracting valuable features that have significant influence on the dependent variable. This is still an active field of research and machine wandering. In this post I compare a few feature selection algorithms: traditional GLM with regularization, computationally demanding Boruta and an entropy-based filter from the FSelectorRcpp (free of Java/Weka) package....

Read more »

The ffanalytics R Package for Fantasy Football Data Analysis

June 18, 2016

Introduction: We are continuously looking to provide users ways to replicate our analyses and improve their performance in fantasy football. To that aim, we are introducing the ffanalytics R package that includes... The post The ffanalytics R Package for Fantasy Football Data Analysis appeared first on Fantasy Football Analytics.

Read more »

PSA: R’s rnorm() and mvrnorm() use different spreads

June 17, 2016

Quick public service announcement for my fellow R nerds: R has two commonly-used random-Normal generators: rnorm and MASS::mvrnorm. I was foolish and assumed that their parameterizations were equivalent when you’re generating univariate data. But nope: Base R can generate univariate … Continue reading →
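
The excerpt above is truncated, so the following is only a minimal sketch based on the documented arguments of the two functions, not the author's own code. rnorm() takes a standard deviation through its sd argument, while MASS::mvrnorm() takes a covariance matrix through Sigma, which in the univariate case is a variance. The seed, sample size and the value 4 are arbitrary choices made for illustration.

```r
library(MASS)  # provides mvrnorm(); MASS ships with standard R installations

set.seed(1)    # arbitrary seed, only for reproducibility
n <- 1e5       # arbitrary sample size

# rnorm(): the spread argument is a standard deviation
x_rnorm <- rnorm(n, mean = 0, sd = 4)

# mvrnorm(): Sigma is a covariance matrix, so a 1x1 Sigma is a variance
x_mvrnorm <- as.numeric(mvrnorm(n, mu = 0, Sigma = matrix(4)))

sd(x_rnorm)    # close to 4
sd(x_mvrnorm)  # close to 2, i.e. sqrt(4)

# To match rnorm(sd = 4), pass the squared value as Sigma
x_matched <- as.numeric(mvrnorm(n, mu = 0, Sigma = matrix(4^2)))
sd(x_matched)  # close to 4
```

Passing the same number to both functions therefore gives samples with different spreads unless that number happens to be 1.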

Read more »

Venn Diagram Comparison of Boruta, FSelectorRcpp and GLMnet Algorithms

June 17, 2016

Feature selection is a process of extracting valuable features that have significant influence on the dependent variable. This is still an active field of research and machine wandering. In this post I compare a few feature selection algorithms: traditional GLM with regularization, computationally demanding Boruta and an entropy-based filter from the FSelectorRcpp (free of Java/Weka) package....

Read more »

R packaging industry close-up: How fast are we growing?

June 17, 2016

I worked a bit over the weekend preparing my talk to be delivered at the seminar organized by IBPAD this week at University of Brasilia, addressing the interfaces of Big Data and Society. I was invited to present the R package SciencesPo for an eclectic crowd. Eclectic in terms of background as well as familiarity with R, so, I...

Read more »

Can’t compute the standard deviation in your head? Divide the range by four.

June 17, 2016

Suppose you divide the range by four instead of taking the standard deviation. How accurate will you be? The post Can’t compute the standard deviation in your head? Divide the range by four. appeared first on Decision Science News.
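
One way to explore that question is a small simulation. The sketch below is not taken from the original post: it assumes normally distributed data, and the sample sizes, number of replications and helper names (range_over_four, compare) are illustrative choices.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# The shortcut: range of the sample divided by four
range_over_four <- function(x) diff(range(x)) / 4

# Average ratio of the shortcut to sd() over many simulated normal samples
compare <- function(n, reps = 2000) {
  ratios <- replicate(reps, {
    x <- rnorm(n)
    range_over_four(x) / sd(x)
  })
  mean(ratios)  # 1 would mean the shortcut matches sd() on average
}

sizes <- c(10, 30, 100, 1000)
result <- data.frame(n = sizes, mean_ratio = sapply(sizes, compare))
print(result)
```

Under these assumptions the ratio grows with the sample size, because the range of a normal sample keeps widening as more points are drawn while the standard deviation does not. The shortcut is therefore tied to a particular range of sample sizes.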

Read more »

DataCamp course: Importing and managing financial data

June 17, 2016

The team at DataCamp announced a new R/Finance course series in a recent email:
Subject: Data Mining Tutorial, R/Finance course series, and more!
R/Finance - A new course series in the works
We are working on a whole new course series on applied finance u...

Read more »

Data Journalism Awards Data Visualization of the Year, 2016

June 17, 2016

Congratulations to Peter Aldhous and Charles Seife of BuzzFeed News, winners of the 2016 Data Journalism Award for Data Visualization of the Year. They were recognized for their reporting for Spies in the Sky, which analyzed FAA air traffic records to visualize the domestic surveillance activities of the US government. Aldhous and Seife used the R language to create...

Read more »

Introducing xda: R package for exploratory data analysis

June 17, 2016

This R package contains several tools to perform initial exploratory analysis on any input dataset. It includes custom functions for plotting the data as well as performing different kinds of analyses such as univariate, bivariate and multivariate investigation which is the first step of any predictive modeling pipeline. This package can be used to get

Read more »

Summary Statistics With Aggregate()

June 16, 2016

The aggregate() function splits data frames and time series data into subsets, then computes summary statistics for each subset. The structure of the aggregate() function is aggregate(x, by, FUN). Answers to the exercises are available here. Exercise 1: Aggregate the “airquality” data by “airquality$Month”, returning means on each of the numeric variables. Also, remove “NA” values. Exercise 2: Aggregate the “airquality”
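
As a minimal sketch of the aggregate(x, by, FUN) structure, here is one possible reading of Exercise 1. It is not the official answer linked above; the choice of columns and the name monthly_means are assumptions made for illustration.

```r
# airquality ships with base R (datasets package)
data(airquality)

# x: the data to summarise; by: a list of grouping vectors; FUN: the summary
monthly_means <- aggregate(
  x     = airquality[, c("Ozone", "Solar.R", "Wind", "Temp")],
  by    = list(Month = airquality$Month),
  FUN   = mean,
  na.rm = TRUE  # passed on to mean(), which drops NA values column by column
)

print(monthly_means)
```

Naming the element of the by list (Month = ...) gives the grouping column a readable name in the result; extra arguments such as na.rm are forwarded to FUN.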

Read more »

Your data vis “Spidey-sense” & the need for a robust “utility belt”

June 16, 2016

@theboysmithy did a great piece on coming up with an alternate view for a timeline for an FT piece. Here’s an excerpt (read the whole piece, though, it’s worth it): Here is an example from a story recently featured in the FT: emerging-market populations are expected to age more rapidly than those in developed... Continue reading →

Read more »

Mapping US Counties in R with FIPS

June 16, 2016

Anyone who’s spent any time around data knows primary keys are your friend. Enter the FIPS code. FIPS is the Federal Information Processing Standard and appears in most data sets published by the US government. Name Matching: The map below is an example of the “wrong way” to do something like this. This map uses
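
The post's own code is not shown in this excerpt, so the following is only a hedged sketch of joining on a FIPS key rather than on county names. County FIPS codes are five characters (a two-digit state code plus a three-digit county code); the data frames, column names and values below are invented for illustration.

```r
# Hypothetical county-level values; the numbers are invented for illustration
values <- data.frame(
  fips  = c(6037, 17031, 48201),  # read as numbers, so the leading zero is lost
  value = c(1.2, 3.4, 5.6)
)

# Lookup table keyed by 5-character FIPS codes
counties <- data.frame(
  fips   = c("06037", "17031", "48201"),
  county = c("Los Angeles County, CA", "Cook County, IL", "Harris County, TX"),
  stringsAsFactors = FALSE
)

# Restore the zero-padded, 5-character key before joining
values$fips <- sprintf("%05d", as.integer(values$fips))

# Join on the key instead of on county names
joined <- merge(counties, values, by = "fips")
print(joined)
```

Storing the code as a zero-padded character string keeps keys such as "06037" intact; reading them as numbers silently drops the leading zero and breaks the join.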

Read more »

The R Packages of UseR! 2016

June 16, 2016

by Joseph Rickert. It is always a delight to discover a new and useful R package, and it is especially nice when the discovery comes with context and a testimonial to its effectiveness. It is also satisfying to be able to check in once in a while and get an idea of what people think is hot, or current or...

Read more »

A Return.Portfolio Wrapper to Automate Harry Long Seeking Alpha Backtests

June 16, 2016

This post will cover a function to simplify creating Harry Long type rebalancing strategies from SeekingAlpha for interested readers. As … Continue reading →

Read more »

Visualizing obesity across United States by using data from Wikipedia

June 16, 2016

In this post I will show how to collect data from a webpage and analyze or visualize it in R. For this task I will use the rvest package and will get the data from Wikipedia. I got the idea to write this post from Fisseha Berhane. I will gain access to the prevalence of obesity...

Read more »

Radar charts in R using Plotly

June 16, 2016

This post is inspired by this question on Stack Overflow. We’ll show how to create Excel-style radar charts in R using the plotly package.

Read more »

Hyperparameter Optimization in H2O: Grid Search, Random Search and the Future

June 15, 2016

“Good, better, best. Never let it rest. ‘Til your good is better and your better is best.” – St. Jerome tl;dr H2O now has random hyperparameter search with time- and metric-based early stopping. Bergstra and Bengio write on p. 281: Compared with neural networks configured by a pure grid search, we find that random...

Read more »

Euro 2016 Squads

June 15, 2016

This weekend I was having fun in France watching some Euro 2016 matches, visiting friends and avoiding Russian hooligans. Before my flight over I scraped some tables on the tournament’s Wikipedia page with my newly acquired rvest skills, with the idea to build up a bilateral database of Euro 2016 squads and their players’ clubs. … Continue reading...

Read more »

Taking a closer look at Quantum gates and their operations

June 15, 2016

This post is a continuation of my earlier post ‘Exploring Quantum gate operations with QCSimulator’. Here I take a closer look at more quantum gates and their operations, besides implementing these new gates in my Quantum Computing simulator, the QCSimulator in R. Disclaimer: This article represents the author’s viewpoint only and doesn’t necessarily represent IBM’s

Read more »

Gender ratios of programmers, by language

June 15, 2016

While there are many admirable efforts to increase participation by women in STEM fields, in many programming teams men still outnumber women, often by a significant margin. Specifically by how much is a fraught question, and accurate statistics are hard to come by. Another interesting question is whether the gender disparity varies by language, and how to define a...

Read more »

Monthly Regional Tourism Estimates

June 15, 2016

A big 18-month project at work culminated today in the release of new Monthly Regional Tourism Estimates for New Zealand. Great work by the team in an area where we’ve pioneered the way, using administrative data from electronic transactions to supplement traditional sources in producing official statistics. Here’s a screenshot from one of the pages letting...

Read more »

Calculate your nutrients with my new package: NutrientData

June 15, 2016

I have created a new package: NutrientData. This package contains data sets with the composition of foods (raw, processed, prepared), along with two functions to search and calculate nutrients. The source of the data is the USDA National Nutrient Database for Standard Reference, Release 28 (2015). You can download it from GitHub: devtools::install_github("56north/NutrientData") Let's first...

Read more »
