## Rsurveygizmo: An R package for interacting with the Survey Gizmo API

June 6, 2016

Several years ago our team began using SurveyGizmo for our online surveys (and, actually, a bunch of other projects as well, from polls to data entry templates). At the time, SurveyGizmo provided a nice balance between cost and customization when compared to similar products from, e.g., Qualtrics and SurveyMonkey. Over the years SurveyGizmo has greatly expanded the kinds of user...

## Time Series Analysis Using Max/Min… and some Neuroscience.

June 6, 2016

Introduction: Time series have maximum and minimum points as general patterns. Sometimes the noise in the data makes the general behavior hard to spot. In this post, we will smooth a time series, reducing noise, to maximize the story that the data has to tell us. Then an easy formula will be applied to find and plot max/min...
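As a rough illustration of the idea in this excerpt, here is a minimal sketch in base R. It assumes `loess` as the smoother and a sign-change rule on first differences as the "easy formula"; the original post may use different tools. The helper `find_extrema` and the simulated series are hypothetical.

```r
# Hypothetical helper: a point is a local maximum where the slope changes
# from positive to negative, and a local minimum where it changes from
# negative to positive. Flat plateaus are handled only loosely.
find_extrema <- function(v) {
  turn <- diff(sign(diff(v)))
  list(maxima = which(turn < 0) + 1,
       minima = which(turn > 0) + 1)
}

# Simulated noisy series (an assumption, not data from the post)
set.seed(42)
x <- seq(0, 4 * pi, length.out = 200)
y <- sin(x) + rnorm(length(x), sd = 0.2)

# Smooth first, so that noise does not create spurious turning points
fit <- loess(y ~ x, span = 0.2)
smoothed <- predict(fit)

ext <- find_extrema(smoothed)

# Plot raw data, smoothed curve, and the detected extrema
plot(x, y, col = "grey", main = "Smoothed series with local max/min")
lines(x, smoothed, lwd = 2)
points(x[ext$maxima], smoothed[ext$maxima], col = "red", pch = 19)
points(x[ext$minima], smoothed[ext$minima], col = "blue", pch = 19)
```

Running the extrema search on the raw `y` instead of `smoothed` typically flags many more points, which is why the smoothing step comes first.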

## Ross Ihaka on the history of the R project

June 6, 2016

Ross Ihaka, one of the co-creators of R (along with Robert Gentleman), recently gave an interview to the University of Auckland's alumni magazine, Ingenio. In the article, he shares the story of the genesis of R in the early 1990s: The story all began back in the early 1990s when the internet was in its infancy and computers at...

## What are the Best Machine Learning Packages in R?

June 6, 2016

Guest post by Khushbu Shah. The most common question asked by prospective data scientists is – “What is the best programming language for Machine Learning?” The answer to this question always results in a debate over whether to choose R, Python or MATLAB for Machine Learning. Nobody can, in reality, answer the question as to whether Python or R is best...

## XGBoost workshop and meetup talk with Tianqi Chen

June 6, 2016

XGBoost is a fantastic open source implementation of Gradient Boosting Machines, a general purpose supervised learning method...

## Handling missing data with MICE package; a simple approach

June 6, 2016

This is a quick, short and concise tutorial on how to impute missing data. Previously, we published an extensive tutorial on imputing missing values with the MICE package. The current tutorial aims to be simple and user friendly for those who are just starting to use R. Preparing the dataset: I have created a simulated dataset, which you...

## idbr: access the US Census Bureau International Data Base in R

June 6, 2016

Today’s post is by Kyle Walker, a professor of geography at Texas Christian University. I’ve been a fan of Kyle’s work for a while. When I saw that he wrote a package for accessing the Census Bureau’s International Data Base, I asked him to write a guest post about it. The US Census Bureau’s International...

## Frequency analysis challenge – a console-based game for R/python

June 6, 2016

Six months ago we introduced ‘The Proton’, a console-based R game with six data wrangling puzzles. Around 15-30 minutes of fun with data. The game is on CRAN in the package BetaBit. And just a few days ago we added a second game, frequon(). Eight puzzles related to frequency analysis of encoded messages...

## R profiling

June 5, 2016

Profiling in R: R has a built-in performance and memory profiling facility: Rprof. Type  into your console to learn more. The profiler works as follows: you start the profiler by calling Rprof, providing a filename where the profiling data should be stored; you call the R functions that you want to analyse; you...
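The workflow described in this excerpt can be sketched as follows. The excerpt is cut off after the second step, so stopping the profiler with `Rprof(NULL)` and reading the results with `summaryRprof()` are added here as the usual follow-up, not as a quote from the post. The function `slow_sum` is a hypothetical workload.

```r
# Hypothetical workload: a deliberately slow loop so the sampler has
# enough time to collect samples
slow_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) {
    total <- total + sqrt(i)
  }
  total
}

# 1. Start the profiler, giving it a file to store the profiling data
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file)

# 2. Call the R functions you want to analyse
result <- slow_sum(1e7)

# 3. Stop the profiler (assumed next step; the excerpt is truncated here)
Rprof(NULL)

# 4. Summarise the collected samples by function
prof_summary <- summaryRprof(prof_file)
print(head(prof_summary$by.self))
```

Rprof is a sampling profiler, so code that finishes within a few milliseconds may produce no samples at all; the workload has to run long enough to be observed.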

## Building the Data Matrix for the Task at Hand and Analyzing Jointly the Resulting Rows and Columns

June 5, 2016

Someone decided what data ought to go into the matrix. They placed the objects of interest in the rows and the features that differentiate among those objects into the columns. Decisions were made either to collect information or to store what was gathered for other purposes (e.g., data mining). A set of mutually constraining choices...

## Understanding data.table Rolling Joins

June 5, 2016
By Robert Norberg

Introduction: Rolling joins in data.table are incredibly useful, but not that well documented. I wrote this to help myself figure out how to use them and perhaps it can help you too. library(data.table) The Setup: Imagine we have an eCommerce website that uses a third party (like PayPal) to handle payments. We track user sessions on our website and...
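To make the setup concrete, here is a minimal sketch of a rolling join, assuming the data.table package is installed. The tables, column names, and timestamps are hypothetical stand-ins for the sessions and payments described above, not the post's own data.

```r
library(data.table)

# Hypothetical website sessions, one row per session start
sessions <- data.table(
  user_id = c(1L, 1L, 2L),
  session_time = as.POSIXct(c("2016-01-01 10:00:00",
                              "2016-01-02 09:00:00",
                              "2016-01-01 12:00:00"), tz = "UTC")
)

# Hypothetical payments reported by the third party
payments <- data.table(
  user_id = c(1L, 2L),
  payment_time = as.POSIXct(c("2016-01-01 10:05:00",
                              "2016-01-01 11:00:00"), tz = "UTC")
)

# Copy each timestamp into a shared column to join on, then key both tables
sessions[, join_time := session_time]
payments[, join_time := payment_time]
setkey(sessions, user_id, join_time)
setkey(payments, user_id, join_time)

# For each payment, roll forward the most recent session of the same user
# that started at or before the payment time
result <- sessions[payments, roll = TRUE]
print(result)
```

In this sketch the payment by user 1 is matched to the session that started five minutes earlier, while the payment by user 2 gets a missing `session_time`, because that user's only session started after the payment.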

## Exploring Quantum Gate operations with QCSimulator

June 5, 2016

Introduction: Ever since I was initiated into Quantum Computing through IBM’s Quantum Experience, I have been hooked. My initial encounter with the domain made me very excited and all wound up. The reason behind this, I think, is that there is an air of mystery around ‘Quantum’ anything. After my early rush with the Quantum Experience...

## Bootstrap and cross-validation for evaluating modelling strategies

Modelling strategies I’ve been re-reading Frank Harrell’s Regression Modelling Strategies, a must read for anyone who ever fits a regression model, although be prepared - depending on your background, you might get 30 pages in and suddenly become convinced you’ve been doing nearly everything wrong before, which can be disturbing. I wanted to evaluate three simple modelling strategies in dealing...

## Curated list of R tutorials for Data Science

June 3, 2016

Here is a topic-wise list of R tutorials for Data Science, Time Series Analysis, Natural Language Processing and Machine Learning. This list also serves as a reference guide for several common data analysis tasks. The R Language: Awesome-R Repository on GitHub; R Reference Card: Cheatsheet; R bloggers: blog aggregator; R Resources on GitHub; Awesome R...

## Using geom_step

June 3, 2016

geom_step is an interesting geom supplied by the R package ggplot2. It is an appropriate rendering option for financial market data and we will show how and why to use it in this article. Let’s take a simple example of plotting market data. In this case we are plotting the "ask price" (the publicly published...

## Visualizing a flood with R

June 3, 2016

As more settlements in Texas and France are impacted by severe flooding, this is a good time to thank the hydrologists at NOAA who forecast river level rises in advance and give residents in affected areas time to move to higher ground. Along with topographic, rainfall, and weather data, monitoring stations maintained by NOAA and the USGS along...

## RQGIS – integrating R with QGIS

June 3, 2016

## Images as x-axis labels (updated)

June 2, 2016

They say "if you want to find an answer on the internet, just present a wrong one as fact. Then wait." It didn't take long, actually. Despite my searches while trying to get images into x-axis labels it seems I...

## Using caret to compare models

June 2, 2016

by Joseph Rickert. The model table on the caret package website lists more than 200 variations of predictive analytics models that are available within the caret framework. All of these models may be prepared, tuned, fit and evaluated with a common set of caret functions. All on its own, the table is an impressive testament to the utility and...

## methylKit v0.9.6

June 2, 2016

We released a new version of methylKit, which is a package for DNA methylation analysis with bisulfite-seq data. This version comes with many changes, summarized below. You can also have a look at the release notes. The vignette is now converted to...

## Remote Pair Programming in R

June 2, 2016

Recently I’ve been doing a lot of remote pair programming with clients. A few people have asked how this works. Rather than try to explain it, I recorded one of my sessions with a client last week, and you can watch it below: The actual session was about an hour long, though I edited it...

## Images as x-axis labels

June 2, 2016

Open-source software is awesome. If I found that a piece of closed-source software was missing a feature that I wanted, well, bad luck. I probably couldn't even tell if it was actually missing or if I just didn't know about it...

## R for Publication by Page Piccinini: Lesson 2 – Linear Regression

June 2, 2016

This is our first lesson where we actually learn and use a new statistic in R. For today’s lesson we’ll be focusing on linear regression. I’ll be taking for granted some of the set-up steps from Lesson 1, so if you haven’t done that yet be sure to go back and do it. By the Lesson 2: Linear...

## A demonstration of vtreat data preparation

June 1, 2016

This article is a demonstration of the use of the R vtreat variable preparation package followed by caret controlled training. In previous writings we have gone to great lengths to document, explain and motivate vtreat. That necessarily gets long and unnecessarily feels complicated. In this example we are going to show what building a predictive model...

## Le Monde puzzle [#964]

June 1, 2016

A not so enticing Le Monde mathematical puzzle: Find the minimal value of a five-digit number divided by the sum of its digits. This can be formalised as finding the minimum of N/(a+b+c+d+e) when N writes abcde. And solved by brute force. Using a rough approach to finding the digits of a five-digit number, the...
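A brute-force search of this kind might look like the sketch below. The digit extraction by repeated division is an assumption, since the excerpt is cut off before showing the author's own approach, and `digit_sum` is a hypothetical helper.

```r
# Hypothetical helper: sum the decimal digits of each element of n
# by peeling off the last digit repeatedly (vectorised over n)
digit_sum <- function(n) {
  s <- numeric(length(n))
  while (any(n > 0)) {
    s <- s + n %% 10
    n <- n %/% 10
  }
  s
}

# All five-digit numbers N = abcde
N <- 10000:99999

# Ratio N / (a + b + c + d + e) for every candidate
ratio <- N / digit_sum(N)

# The minimiser and the minimal value
best <- N[which.min(ratio)]
print(best)        # 10999
print(min(ratio))  # 10999 / 28, about 392.82
```

The result makes intuitive sense: the leading digits are kept as small as possible to hold N down, while the trailing digits are all 9 to push the digit sum up.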

## Reference semantics in R

June 1, 2016

Question: I recently got a mail from Václav on reference semantics in data.tree, reading as follows: Dear Christoph, I am rather inexperienced when it comes to environments in R and henceforth I apologize if my question is basic; however, my colleagues are no better than me to answer my question. I would have a question iro...

## Covcalc: Shiny App for Calculating Coverage Depth or Read Counts for Sequencing Experiments

June 1, 2016

How many reads do I need? What's my sequencing depth? These are common questions I get all the time. Calculating how much sequence data you need to hit a target depth of coverage, or the inverse, what's the coverage depth given a set amount of sequenci...
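The two calculations mentioned here can be sketched in a few lines of R, assuming the standard relation coverage = (number of reads × read length) / genome size. The function names are hypothetical and this is not the Covcalc app's actual code, which may also account for details such as paired-end reads.

```r
# Hypothetical helper: average coverage depth from a given amount of sequencing
coverage_depth <- function(n_reads, read_length, genome_size) {
  n_reads * read_length / genome_size
}

# Hypothetical helper: number of reads needed to reach a target depth
# (the inverse of the relation above, rounded up to whole reads)
reads_needed <- function(target_depth, read_length, genome_size) {
  ceiling(target_depth * genome_size / read_length)
}

# One million 100 bp reads over a 10 Mb genome
print(coverage_depth(n_reads = 1e6, read_length = 100, genome_size = 1e7))   # 10

# Reads of 100 bp needed for 30x coverage of a 3 Gb genome
print(reads_needed(target_depth = 30, read_length = 100, genome_size = 3e9)) # 9e8
```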

## Trisurf Plots in R using Plotly

June 1, 2016

In this post we’ll show how to create Triangular Surface Plots in R. This post is based on timelyportfolio’s gist. Moebius Strip; 2D Surface over a disk; Chopper from python

## Scripting Loops In R

June 1, 2016

An R programmer can determine the order in which commands are processed by using the control statements repeat{}, while(), for(), break, and next. Answers to the exercises are available here. Exercise 1: The repeat{} loop processes a block of code until the condition specified by the break statement (which is mandatory within the repeat{} loop)...
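For readers new to these statements, here is a minimal sketch of each one in base R. The variable names and values are made up for illustration and are not taken from the exercises.

```r
# repeat{} runs until break is reached, so break is mandatory here
i <- 0
repeat {
  i <- i + 1
  if (i >= 3) break
}

# while() checks its condition before every pass
total <- 0
j <- 0
while (j < 5) {
  j <- j + 1
  total <- total + j
}

# for() iterates over a vector; next skips the rest of the current pass
odds <- integer()
for (k in 1:6) {
  if (k %% 2 == 0) next
  odds <- c(odds, k)
}

print(i)      # 3
print(total)  # 15
print(odds)   # 1 3 5
```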