Your data vis “Spidey-sense” & the need for a robust “utility belt”

June 16, 2016

@theboysmithy did a great piece on coming up with an alternate view for a timeline for an FT piece. Here’s an excerpt (read the whole piece, though, it’s worth it): Here is an example from a story recently featured in the FT: emerging-market populations are expected to age more rapidly than those in developed...

Mapping US Counties in R with FIPS

June 16, 2016

Anyone who’s spent any time around data knows primary keys are your friend. Enter the FIPS code. FIPS is the Federal Information Processing Standard and appears in most data sets published by the US government. Name Matching: The map below is an example of the “wrong way” to do something like this. This map uses

The R Packages of UseR! 2016

June 16, 2016

by Joseph Rickert. It is always a delight to discover a new and useful R package, and it is especially nice when the discovery comes with context and a testimonial to its effectiveness. It is also satisfying to be able to check in once in a while and get an idea of what people think is hot, or current or...

A Return.Portfolio Wrapper to Automate Harry Long Seeking Alpha Backtests

June 16, 2016

This post will cover a function to simplify creating Harry Long type rebalancing strategies from SeekingAlpha for interested readers. As …

Visualizing obesity across United States by using data from Wikipedia

June 16, 2016

In this post I will show how to collect data from a webpage and analyze or visualize it in R. For this task I will use the rvest package and will get the data from Wikipedia. I got the idea to write this post from Fisseha Berhane. I will gain access to the prevalence of obesity

Radar charts in R using Plotly

June 16, 2016

This post is inspired by this question on Stack Overflow. We’ll show how to create Excel-style radar charts in R using the plotly package.

Hyperparameter Optimization in H2O: Grid Search, Random Search and the Future

June 15, 2016

“Good, better, best. Never let it rest. ‘Til your good is better and your better is best.” – St. Jerome. tl;dr: H2O now has random hyperparameter search with time- and metric-based early stopping. Bergstra and Bengio write on p. 281: Compared with neural networks configured by a pure grid search, we find that random...

Euro 2016 Squads

June 15, 2016

This weekend I was having fun in France watching some Euro 2016 matches, visiting friends and avoiding Russian hooligans. Before my flight over I scraped some tables on the tournament’s Wikipedia page with my newly acquired rvest skills, with the idea to build up a bilateral database of Euro 2016 squads and their players’ clubs. …

Taking a closer look at Quantum gates and their operations

June 15, 2016

This post is a continuation of my earlier post ‘Exploring Quantum gate operations with QCSimulator’. Here I take a closer look at more quantum gates and their operations, besides implementing these new gates in my Quantum Computing simulator, the QCSimulator in R. Disclaimer: This article represents the author’s viewpoint only and doesn’t necessarily represent IBM’s

Gender ratios of programmers, by language

June 15, 2016

While there are many admirable efforts to increase participation by women in STEM fields, in many programming teams men still outnumber women, often by a significant margin. Specifically by how much is a fraught question, and accurate statistics are hard to come by. Another interesting question is whether the gender disparity varies by language, and how to define a...

Monthly Regional Tourism Estimates

June 15, 2016

A big 18-month project at work culminated today in the release of new Monthly Regional Tourism Estimates for New Zealand. Great work by the team in an area where we’ve pioneered the way, using administrative data from electronic transactions to supplement traditional sources in producing official statistics. Here’s a screenshot from one of the pages letting...

Calculate your nutrients with my new package: NutrientData

June 15, 2016

I have created a new package: NutrientData. This package contains data sets with the composition of foods: raw, processed, prepared. The source of the data is the USDA National Nutrient Database for Standard Reference, Release 28 (2015), along with two functions to search and calculate nutrients. You can download it from GitHub: devtools::install_github("56north/NutrientData") Let’s first...

Iraq-Wikileaks Analysis with R

June 14, 2016

“In a place of extreme violence and devoid of order, the practical subsumes the principle. I drifted down the path of bribery and corruption endemic to the streets of Baghdad.” Jason Whiteley, Father of Money: Buying Peace in Baghdad. As I mentioned in a previous post, I wanted to explore the Wikileaks data of the US...

jsonlite 0.9.22: distinguish between double and integer

June 14, 2016

Today a new version of the jsonlite package was released to CRAN. This update includes a few internal enhancements and one new feature. Doubles vs integers: The new always_decimal parameter forces formatting of doubles in decimal notation...

R, Yelp and the Search for Good Indian Food – An Open Course

June 14, 2016

New Free Course by Springboard and DataCamp. Are all Yelp restaurant reviews created equal? Should we place greater trust in reviews made by people who know the cuisine well? How about reviews of ethnic restaurants by people of that ethnicity or reviews by seasoned Yelpers? We may not be able to find the perfect restaurant all the time,...

Intro to The data.table Package

June 14, 2016

Data Frames: R provides a helpful data structure called the “data frame” that gives the user an intuitive way to organize, view, and access data. Many of the functions that you would use to read in external files (e.g. read.csv) or connect to databases (RMySQL) will return a data frame structure by default. While there

Using Microsoft R Server on a single machine for experiments with 600 million taxi rides.

June 14, 2016

by Dmitry Pechyoni, Microsoft Data Scientist. The New York City taxi dataset is one of the largest publicly available datasets. It has about 1.1 billion taxi rides in New York City. Previously this dataset was explored and visualized in a number of blog posts, where the authors used various technologies (e.g., PostgreSQL and Apache Elastic Search). Moreover, in a...

githubinstall: New R Package for Easy to Install R Packages on GitHub

June 14, 2016

1. Overview: A growing number of R packages are being created by people around the world. Part of the reason is the devtools package, which makes it easy to develop R packages. The devtools package not only facilitates developing R packages but also provides another way to distribute them. When developers...

My knitr LaTeX template: manuscript and supplement interleaved in one source file

June 14, 2016

Most of the time between starting a manuscript and having it accepted after peer review is spent writing, re-writing and re-arranging content. In Word, keeping track of figure numbers is a big pain, even more so when figures are moved between the main ma...

R Hero saves Backup City with archivist and GitHub

June 14, 2016

Have you ever suffered because of the impossibility of reproducing graphs, tables or analysis results in R? Have you ever been bothered by not being able to share R objects (e.g., plots or final analysis models) within your reports, posters or articles? Or maybe you simply have too many objects you can’t manage to store in a convenient...

Le Monde puzzle [#965]

June 13, 2016

A game-related Le Monde mathematical puzzle: Starting with a pile of 10⁴ tokens, Bob plays the following game: at each round, he picks one of the existing piles with at least 3 tokens, takes away one of the tokens in this pile, and separates the remaining ones into two non-empty piles of arbitrary size. Bob

8 new R jobs from all over the world (2016-06-13)

June 13, 2016

Here are the new R Jobs for 2016-06-13. To post your R job on the next post, just visit this link and post a new R job to the R community. You can either post a job for free, or pay $50 to have your job featured. Current R jobs: Job seekers, please follow the links below to learn more and apply for your R job of interest:...

R holds top ranking in KDnuggets software poll

June 13, 2016

The open-source R language is the most frequently used analytics / data science software, selected by 49% of the 2895 voters of the 2016 KDnuggets Software Poll. (R was also the top selection in last year's poll.) Python was a close second at 45.8%, and SQL was third at 35.5%. (Respondents could select multiple tools in the poll, and...

tidyr 0.5.0

June 13, 2016

I’m pleased to announce tidyr 0.5.0. tidyr makes it easy to “tidy” your data, storing it in a consistent form so that it’s easy to manipulate, visualise and model. Tidy data has a simple convention: put variables in the columns and observations in the rows. You can learn more about it in the tidy data

Open to Non-Conference Attendees – R Workshops at EARL 2016

June 13, 2016

EARL is a Conference for users and developers of the open source R programming language. The primary focus of the Conference is the commercial usage of R across a range of industry sectors with the aim of sharing knowledge and …

Presentation slides on using graphics

June 13, 2016

Last week I gave a seminar for around 40 analysts from another government agency on using graphics to represent data. In doing such presentations, I usually focus on different purposes of graphics: exploratory; as part of the analysis workflow (e.g. as diagnosis for statistical models); and for presenting results. Exactly what the purpose is makes quite a difference to...

R for Publication by Page Piccinini: Lesson 4 – Multiple Regression

June 13, 2016

Introduction: Today we’ll see what happens when you have not one, but two variables in your model. We will also continue to use some old and new dplyr calls, as well as another parameter for our ggplot2 figure. I’ll be taking for granted some of the set-up steps from Lesson 1, so if you haven’t done Lesson 4: Multiple...

Germany most likely to win Euro 2016

June 13, 2016

After World Cup 2014 we finally are facing the next spectacular football event now: Euro 2016. With billions of football fans spread all over the world, football still seems to be the single most popular sport. Might have something to do with the fact that football is a game of underdogs: David could beat Goliath any day. Just take

Manhattanly: R package for Interactive Manhattan Plots

June 13, 2016

Introduction: The new R package, manhattanly, creates interactive Manhattan plots using the plotly.js engine. The plots are usable from the R console, the RStudio viewer pane, R Markdown documents, in Shiny apps, embeddable in websites and can be exported as .png files. By hovering the mouse over a point, you can see annotation information such
