Useful dplyr Functions (w/examples)

July 10, 2017

The R package dplyr is an extremely useful resource for data cleaning, manipulation, visualisation and analysis. It contains a large number of very useful functions and is, without doubt, one of my top 3 R packages today (ggplot2 and reshape2 being the others). When I was learning how to use dplyr for the first time,…

Read more »

What is magrittr’s future in the tidyverse?

July 10, 2017

For many R users the magrittr pipe is a popular way to arrange computation and famously part of the tidyverse. The tidyverse itself is a rapidly evolving, centrally controlled package collection. The tidyverse authors publicly appear to be interested in re-basing the tidyverse in terms of their new rlang/tidyeval package. So it is natural to …

Read more »

Volatility modelling in R exercises (Part-3)

July 10, 2017

This is the third part of the series on volatility modelling. For other parts of the series follow the tag volatility. In this exercise set we will use GARCH models to forecast volatility. Answers to the exercises are available here. Exercise 1: Load the rugarch and the FinTS packages. Next, load the m.ibmspln dataset from...

Read more »

The R Shiny packages you need for your web apps!

July 10, 2017

Shiny is an R package for deploying web apps using an R backend. Let’s face it, Shiny is awesome! It brings all the power of R to a simple web app with interactivity, user inputs, and interactive visualizations. If you don’t know Shiny yet, you can access a selection of apps on Show me shiny. As...

Read more »

The R Journal, Volume 9, Issue 1, June 2017 – is online!

July 10, 2017

The new issue of The R Journal is now available! You may download the complete issue, or choose your topic of interest from the following links:

Table of contents

Editorial, Roger Bivand

Contributed Research Articles

iotools: High-Performance I/O Tools for R, Taylor Arnold, Michael J. Kane and Simon Urbanek

IsoGeneGUI: Multiple Approaches for Dose-Response Analysis of Microarray Data Using R ...

Read more »

Downloading S&P 500 Stock Data from Google/Quandl with R (Command Line Script)

July 10, 2017

In this article, I provide an R script for easily downloading closing prices for stocks included in the S&P 500 index.

Read more »

Assessing the Accuracy of our models (R Squared, Adjusted R Squared, RMSE, MAE, AIC)

July 10, 2017

There are several ways to check the accuracy of our models: some are printed directly in R within the summary output, while others are just as easy to calculate with specific functions. R-Squared: This is probably the most commonly used statistic and allows us to understand the percentage of variance in the target variable explained by the...

Read more »

Linear Mixed Effects Models in Agriculture

July 10, 2017

This post was originally part of my previous post about linear models. However, I later decided to split it into several texts because it was too long and complex to navigate. If you struggle to follow the code in this page, please refer to this post (for example, for the necessary packages): Linear Models (lm, ANOVA and ANCOVA) in Agriculture. Linear...

Read more »

Generalized Linear Models and Mixed-Effects in Agriculture

July 10, 2017

After publishing my previous post, I realized that it was way too long, so I decided to split it into 2-3 parts. If you think something is missing in the explanation here, it may be because this was originally part of the previous post (http://r-video-tutorial.blogspot.co.uk/2017/06/linear-models-anova-glms-and-mixed.html), so please look there first (otherwise please post your...

Read more »

A tour of the tibble package

July 9, 2017

Dataframes are used in R to hold tabular data. Think of the prototypical spreadsheet or database table: a grid of data arranged into rows and columns. That’s a dataframe. The tibble R package provides a fresh take on dataframes to fix some longstan...

Read more »

Using simulation for power analysis: an example based on a stepped wedge study design

July 9, 2017

Simulation can be super helpful for estimating power or sample size requirements when the study design is complex. This approach has some advantages over an analytic one (i.e. one based on a formula), particularly the flexibility it affords in setting up the specific assumptions in the planned study, such as time trends, patterns of missingness, or effects of different levels...

Read more »

How to Scrape Images from Google

July 9, 2017

In my last post, I tried to train a deep neural net to detect brand logos in images. For that, I downloaded the Flickr27 dataset, containing 270 images of 27 different brands. As this dataset is rather small, I turned to Google image search and wrote a...

Read more »

Summer of data science 1: Genomic prediction machines #SoDS17

July 9, 2017

Genetics is a data science, right? One of my Summer of data science learning points was to play with out of the box prediction tools. So let’s try out a few genomic prediction methods. The code is on GitHub, and the simulated data are on Figshare. Genomic selection is the happy melding of quantitative and

Read more »

sweep: Extending broom for time series forecasting

We’re pleased to introduce a new package, sweep, now on CRAN! Think of it like broom for the forecast package. The forecast package is the most popular package for forecasting, and for good reason: it has a number of sophisticated forecast modeling f...

Read more »

My Presentation at useR! 2017, Etc.

July 8, 2017

I gave a talk titled “Parallel Computation in R: What We Want, and How We (Might) Get It” at last week’s useR! 2017 conference in Brussels. You can view my slides here, and I think the conference organizers said the videos would be placed online, though I am not sure of that. The goal of the talk …

Read more »

Hacking statistics or: How I Learned to Stop Worrying About Calculus and Love Stats Exercises (Part-2)

July 8, 2017

Statistics is often taught in school by and for people who like mathematics. As a consequence, in those classes emphasis is put on learning equations, solving calculus problems and creating mathematical models instead of building an intuition for probabilistic problems. But if you are reading this, you know a bit of R programming and have access...

Read more »

Learning Club 16: Genetic Algorithms

July 8, 2017

Some time ago I published a blog post with the title Know your data structures!. In this previous post I explained how I improved the running time of a genetic algorithm. I promised to go into more detail about other noteworthy things in the code in a separate article, since not everything was straightforward when …

Read more »

Improving state-space modelling of the Australian 2007 federal election

After I wrote a couple of weeks back about state-space modelling of the Australian 2007 federal election, I received some very helpful feedback from Bob Carpenter, a Columbia University research scientist and one of the core developers of the Stan probabilistic language. Some of the amendments I made in response to his comments and my evolving thinking are...

Read more »

xts 0.10-0 on CRAN!

July 7, 2017

A new, and long overdue, release of xts is now on CRAN! The major change is the completely new plot.xts(), written by Michael Weylandt and Ross Bennett and based on Jeff Ryan's quantmod::chart_Series code. Do note that the new plot.xts() includes breaking changes to the original (and rather limited) plot.xts(). However, we believe the new functionality more than compensates...

Read more »

Stan Weekly Roundup, 7 July 2017

July 7, 2017

Holiday weekend, schmoliday weekend. Ben Goodrich and Jonah Gabry shipped RStan 2.16.2 (their numbering is a little beyond base Stan, which is at 2.16.0). This reintroduces error reporting that got lost in the 2.15 refactor, so please upgrade if you want to debug your Stan programs! Joe Haupt translated the JAGS examples in the second...

Read more »

In praise of syntactic sugar

July 7, 2017

There has been some talk of adding native pipe notation to R (for example here, here, and here). I think a critical aspect of such an extension would be to treat such a notation as syntactic sugar and not insist such a pipe match magrittr semantics, or worse yet give a platform for authors to …

Read more »

XGBoost support added to Rattle

July 7, 2017

by Fang Zhou, Data Scientist; and Graham Williams, Director of Data Science, all at Microsoft

Rattle (the R Analytical Tool To Learn Easily) is a popular open-source GUI for data mining using R. It presents statistical and visual summaries of data, transforms data that can be readily modelled, builds both unsupervised and supervised models from the data,...

Read more »

Generalized Additive Models

July 6, 2017

This is also a flexible and smooth technique which captures the nonlinearities in the data and helps us to fit nonlinear models. In this article I am going to discuss the implementation of GAMs in R using the 'gam' package. Simply put, GAMs are just a generalized version of linear models in which the...

Read more »

useR!2017

July 6, 2017

The o2r team members Daniel and Edzer had the pleasure of participating in the largest conference of R developers and users, useR!2017 in Brussels, Belgium.

Daniel Nüst @nordholmen presenting containerit, creates a docker img from an R session to archive reproducibly @o2r_project @cboettig pic.twitter.com/o65O8s8jXY (Edzer Pebesma, @edzerpebesma, July 6, 2017)

Daniel presented a new R extension package, containerit, in the Data...

Read more »

Introducing tidygraph

I’m very pleased to announce that my new package tidygraph is now available on CRAN. As the name suggests, tidygraph is an entry into the tidyverse that provides a tidy framework for all things relational (networks/graphs, trees, etc.). tidygraph ...

Read more »

Working With R and Big Data: Use Replyr

July 6, 2017

In our latest R and Big Data article we discuss replyr. Why replyr? replyr stands for REmote PLYing of big data for R. Why should R users try replyr? Because it lets you take a number of common working patterns and apply them to remote data (such as databases or Spark). replyr allows users to …

Read more »

Motor vehicle collisions in New York City – R / Shiny Data Visualization

July 6, 2017

Introduction: Shiny by RStudio is a web application framework targeted at R programmers. It allows our analysis to be extended as an interactive web application that can be... The post Motor vehicle collisions in New York City - R / Shiny Data Visualization appeared first on NYC Data Science Academy Blog.

Read more »

Set Theory Ordered Pairs and Cartesian Product with R

July 6, 2017

Part 5 of 5 in the series Set Theory. Ordered and Unordered Pairs: A pair set is a set with two members, for example, , which can also be thought of as an unordered pair, in that . However, we seek a more strict and rich object that tells us...

Read more »

Parallel Computing Exercises: Snowfall (Part-1)

July 6, 2017

R has a lot of tools to speed up computations by making use of multiple CPU cores, either on one computer or on multiple machines. This series of exercises aims to introduce the basic techniques for implementing parallel computations using multiple CPU cores on one machine. The initial step in preparation for parallelizing computations is to...

Read more »
