## Updates to the ‘forecast’ package for R

June 20, 2016

The forecast package for R, created and maintained by Professor Rob Hyndman of Monash University, is one of the more useful R packages available on CRAN. Statistical forecasting (the process of predicting the future value of a time series) is used in just about every realm of data analysis, whether it's trying to predict a future...

## How to reshape data in R: tidyr vs reshape2

June 20, 2016

Reshape your data from long to wide, split a column, aggregate: a comparison between the tidyr and reshape2 R packages for tidying data. The post How to reshape data in R: tidyr vs reshape2 appeared first on MilanoR.

## Analyzing Public Health Data in R

June 20, 2016

Today’s post is by Thomas Yokota, an epidemiologist in Hawaii. I’ve been corresponding with Thomas via email and telephone for a while. I asked Thomas if he could write an introduction to how R, mapping and open data are used in the public health community. This is his reply. What is public health? Public health is a...

## R for Publication by Page Piccinini: Lesson 5 – Analysis of Variance (ANOVA)

June 20, 2016

In today’s lesson we’ll take care of the baseline issue we had in the last lesson when we have a linear model with an interaction. To do that we’ll be learning about analysis of variance or ANOVA. We’ll also be going over how to make barplots with error bars, but not without hearing my reasons Lesson 5: Analysis...

## A Call to Arms[list] Data Analysis!

June 19, 2016

The NPR vis team contributed to a recent story about Armslist, a “craigslist for guns”. Now, I’m neither pro-“gun” nor anti-“gun” since this subject, like most heated ones, has more than two sides. What I am is pro-data, and the U.S. Congress is so deep in the pockets of the NRA that there’s no way... Continue reading →

## Amazing Things That Happen When You Toss a Coin 12 Times

June 19, 2016

If there is a God, he’s a great mathematician (Paul Dirac). Imagine you toss a coin 12 times and count how many heads and tails you have obtained after each throw (the coin is fair, so the probability of a head or a tail is the same). At some point, it can happen that the number of heads and the number of tails … Continue reading...
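The setup in this teaser is small enough to explore by brute force. The sketch below is an illustration only, not the original post's code: it enumerates every possible sequence of 12 fair tosses and measures how often the running counts of heads and tails are level after at least one toss. All variable names are made up for this example.

```r
n_tosses <- 12

# Every one of the 2^12 = 4096 equally likely sequences; +1 is a head, -1 a tail.
seqs <- as.matrix(expand.grid(rep(list(c(1, -1)), n_tosses)))

# Running value of (heads - tails) after each toss, one row per sequence.
running_diff <- t(apply(seqs, 1, cumsum))

# A sequence "ties" if heads and tails are level after at least one toss.
tied_at_some_point <- apply(running_diff == 0, 1, any)
p_tie <- mean(tied_at_some_point)
p_tie

# One random sequence, to see where the counts are level.
set.seed(12)
tosses <- sample(c("H", "T"), n_tosses, replace = TRUE)
heads <- cumsum(tosses == "H")
tails <- cumsum(tosses == "T")
which(heads == tails)  # toss numbers at which the two counts are equal
```

Enumeration is feasible here because 4096 sequences is tiny; for many more tosses, sampling random sequences with `sample()` would be the more practical route.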

## Using xda with googlesheets in R

June 18, 2016

Want to do a quick, exploratory data analysis in R of your data that’s stored in a spreadsheet on Google Drive? You’re in luck, because now you can use the new xda

## Venn Diagram Comparison of Boruta, FSelectorRcpp and GLMnet Algorithms

June 18, 2016

Feature selection is a process of extracting valuable features that have significant influence on the dependent variable. This is still an active field of research and machine wandering. In this post I compare a few feature selection algorithms: traditional GLM with regularization, the computationally demanding Boruta and an entropy-based filter from the FSelectorRcpp (free of Java/Weka) package....

## The ffanalytics R Package for Fantasy Football Data Analysis

June 18, 2016

Introduction We are continuously looking to provide users ways to replicate our analyses and improve their performance in fantasy football. To that aim, we are introducing the ffanalytics R package that includes The post The ffanalytics R Package for Fantasy Football Data Analysis appeared first on Fantasy Football Analytics.

## PSA: R’s rnorm() and mvrnorm() use different spreads

June 17, 2016

Quick public service announcement for my fellow R nerds: R has two commonly-used random-Normal generators: rnorm and MASS::mvrnorm. I was foolish and assumed that their parameterizations were equivalent when you’re generating univariate data. But nope: Base R can generate univariate … Continue reading →
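The teaser is cut off, but the difference is visible in the documented arguments of the two functions: `rnorm()` takes a standard deviation (`sd`), while `MASS::mvrnorm()` takes a covariance matrix (`Sigma`), which in one dimension is a variance. A minimal sketch, assuming the MASS package is installed (it ships with standard R distributions); the sample size and seed are arbitrary choices for this example:

```r
library(MASS)  # provides mvrnorm()

set.seed(42)
n <- 1e5

# rnorm() is parameterized by the standard deviation.
x <- rnorm(n, mean = 0, sd = 2)

# mvrnorm() is parameterized by the covariance matrix; a 1 x 1 matrix
# holds a variance, so this draws values with standard deviation sqrt(2).
y <- as.vector(mvrnorm(n, mu = 0, Sigma = matrix(2)))

# To match rnorm(sd = 2), pass the variance 2^2 = 4 instead.
z <- as.vector(mvrnorm(n, mu = 0, Sigma = matrix(4)))

c(sd_x = sd(x), sd_y = sd(y), sd_z = sd(z))
```

Passing the same number to both functions therefore gives different spreads unless that number happens to be 1.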

## R packaging industry close-up: How fast are we growing?

June 17, 2016

I worked a bit over the weekend preparing my talk to be delivered at the seminar organized by IBPAD this week at the University of Brasilia, addressing the interfaces of Big Data and Society. I was invited to present the R package SciencesPo for an eclectic crowd. Eclectic in terms of background as well as familiarity with R, so, I...

## Can’t compute the standard deviation in your head? Divide the range by four.

June 17, 2016

Suppose you divide the range by four instead of taking the standard deviation. How accurate will you be? The post Can’t compute the standard deviation in your head? Divide the range by four. appeared first on Decision Science News.
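One way to get a feel for the question is to simulate it. The sketch below is not taken from the linked post: it assumes normally distributed data, and the helper names `range_rule()` and `ratio_for_n()` are invented for this example. It compares range/4 with the sample standard deviation for one sample, then averages the ratio of the two over many samples of different sizes.

```r
set.seed(1)

# The rule of thumb: estimate the standard deviation as range / 4.
range_rule <- function(x) diff(range(x)) / 4

# A single sample of 30 simulated values.
x <- rnorm(30, mean = 50, sd = 10)
c(sd = sd(x), range_over_4 = range_rule(x))

# Average ratio (range / 4) / sd over many simulated samples of size n.
# A ratio near 1 means the shortcut is close to the real thing.
ratio_for_n <- function(n, reps = 2000) {
  mean(replicate(reps, {
    s <- rnorm(n)
    range_rule(s) / sd(s)
  }))
}

sizes <- c(10, 30, 100, 500)
ratios <- sapply(sizes, ratio_for_n)
data.frame(n = sizes, mean_ratio = ratios)
```

Because the range keeps widening as more observations arrive while the standard deviation settles down, the ratio depends on the sample size, so the shortcut cannot be equally accurate for every n.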

## DataCamp course: Importing and managing financial data

June 17, 2016

The team at DataCamp announced a new R/Finance course series in a recent email: Subject: Data Mining Tutorial, R/Finance course series, and more! R/Finance - A new course series in the works. We are working on a whole new course series on applied finance u...

## Data Journalism Awards Data Visualization of the Year, 2016

June 17, 2016

Congratulations to Peter Aldhous and Charles Seife of BuzzFeed News, winners of the 2016 Data Journalism Award for Data Visualization of the Year. They were recognized for their reporting for Spies in the Sky, which analyzed FAA air traffic records to visualize the domestic surveillance activities of the US government. Aldhous and Seife used the R language to create...

## Introducing xda: R package for exploratory data analysis

June 17, 2016

This R package contains several tools to perform initial exploratory analysis on any input dataset. It includes custom functions for plotting the data as well as performing different kinds of analyses such as univariate, bivariate and multivariate investigation which is the first step of any predictive modeling pipeline. This package can be used to get

## Summary Statistics With Aggregate()

June 16, 2016

The aggregate() function subsets data frames and time series data, then computes summary statistics. The structure of the aggregate() function is aggregate(x, by, FUN). Answers to the exercises are available here. Exercise 1 Aggregate the “airquality” data by “airquality\$Month”, returning means on each of the numeric variables. Also, remove “NA” values. Exercise 2 Aggregate the “airquality”
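As a rough illustration of the `aggregate(x, by, FUN)` structure, here is one possible way to approach the first exercise using the built-in `airquality` data set. This is a sketch, not the official answer linked from the post; the choice of columns and the name `monthly_means` are assumptions made for this example.

```r
# x:   the columns to summarize
# by:  a (named) list of grouping variables
# FUN: the summary function; extra arguments such as na.rm are passed on to it
monthly_means <- aggregate(
  x   = airquality[, c("Ozone", "Solar.R", "Wind", "Temp")],
  by  = list(Month = airquality$Month),
  FUN = mean,
  na.rm = TRUE
)

monthly_means  # one row per month, one column of means per variable
```

Naming the element of the `by` list (`Month = ...`) gives the grouping column a readable name in the result instead of the default `Group.1`.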

## Your data vis “Spidey-sense” & the need for a robust “utility belt”

June 16, 2016

@theboysmithy did a great piece on coming up with an alternate view for a timeline for an FT piece. Here’s an excerpt (read the whole piece, though, it’s worth it): Here is an example from a story recently featured in the FT: emerging-market populations are expected to age more rapidly than those in developed... Continue reading →

## Mapping US Counties in R with FIPS

June 16, 2016

Anyone who’s spent any time around data knows primary keys are your friend. Enter the FIPS code. FIPS is the Federal Information Processing Standard and appears in most data sets published by the US government. Name Matching The map below is an example of the “wrong way” to do something like this. This map uses

## The R Packages of UseR! 2016

June 16, 2016

by Joseph Rickert It is always a delight to discover a new and useful R package, and it is especially nice when the discovery comes with context and a testimonial to its effectiveness. It is also satisfying to be able to check in once in a while and get an idea of what people think is hot, or current or...

## A Return.Portfolio Wrapper to Automate Harry Long Seeking Alpha Backtests

June 16, 2016

This post will cover a function to simplify creating Harry Long type rebalancing strategies from SeekingAlpha for interested readers. As … Continue reading →

## Visualizing obesity across United States by using data from Wikipedia

June 16, 2016

In this post I will show how to collect data from a webpage and analyze or visualize it in R. For this task I will use the rvest package and will get the data from Wikipedia. I got the idea to write this post from Fisseha Berhane. I will gain access to the prevalence of obesity...

## Radar charts in R using Plotly

June 16, 2016

This post is inspired by this question on Stack Overflow. We’ll show how to create Excel-style radar charts in R using the plotly package.

## Hyperparameter Optimization in H2O: Grid Search, Random Search and the Future

June 15, 2016

“Good, better, best. Never let it rest. ‘Til your good is better and your better is best.” – St. Jerome tl;dr H2O now has random hyperparameter search with time- and metric-based early stopping. Bergstra and Bengio write on p. 281: Compared with neural networks configured by a pure grid search, we find that random...

## Euro 2016 Squads

June 15, 2016

This weekend I was having fun in France watching some Euro 2016 matches, visiting friends and avoiding Russian hooligans. Before my flight over I scraped some tables on the tournament’s Wikipedia page with my newly acquired rvest skills, with the idea to build up a bilateral database of Euro 2016 squads and their players’ clubs. … Continue reading...

## Taking a closer look at Quantum gates and their operations

June 15, 2016

This post is a continuation of my earlier post ‘Exploring Quantum gate operations with QCSimulator’. Here I take a closer look at more quantum gates and their operations, besides implementing these new gates in my Quantum Computing simulator, the QCSimulator in R. Disclaimer: This article represents the author’s viewpoint only and doesn’t necessarily represent IBM’s

## Gender ratios of programmers, by language

June 15, 2016

While there are many admirable efforts to increase participation by women in STEM fields, in many programming teams men still outnumber women, often by a significant margin. Specifically by how much is a fraught question, and accurate statistics are hard to come by. Another interesting question is whether the gender disparity varies by language, and how to define a...

## Monthly Regional Tourism Estimates

June 15, 2016

A big 18-month project at work culminated today in the release of new Monthly Regional Tourism Estimates for New Zealand. Great work by the team in an area where we’ve pioneered the way, using administrative data from electronic transactions to supplement traditional sources in producing official statistics. Here’s a screenshot from one of the pages letting...

## Calculate your nutrients with my new package: NutrientData

June 15, 2016

I have created a new package: NutrientData. This package contains data sets with the composition of foods (raw, processed, prepared). The source of the data is the USDA National Nutrient Database for Standard Reference, Release 28 (2015). The package also comes with two functions to search and calculate nutrients. You can download it from GitHub: devtools::install_github("56north/NutrientData") Let’s first... Read more »
