a Galton-Watson riddle

December 29, 2016

This week's Riddler has an extinction riddle which summarises as follows: One observes a population of N individuals, each with a probability of 10⁻⁴ to kill the observer each day. From one day to the next, the population decreases by one individual with probability K√N 10⁻⁴. What is the value of K that

Sampling from shifted Gompertz distribution

December 29, 2016

Using the accept-reject method. The shifted Gompertz distribution is a useful distribution that can describe the time needed to adopt a new innovation within a market. Recent studies showed that it outperforms the Bass model of diffusion in some cases [1]. Its pdf is given by ... Below we...
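
A minimal sketch of one way to sample from this distribution by accept-reject, which may differ from the approach taken in the post. It assumes the standard parameterisation with scale b > 0 and shape eta > 0, with density f(x) = b exp(-b x) exp(-eta exp(-b x)) [1 + eta (1 - exp(-b x))] for x >= 0. The Exponential(b) proposal is an assumption made here; with it, the ratio f/g is bounded by M = 1 + eta. The function names `dsgompertz`, `psgompertz` and `rsgompertz` are hypothetical.

```r
# Density of the shifted Gompertz distribution (scale b > 0, shape eta > 0).
dsgompertz <- function(x, b, eta) {
  e <- exp(-b * x)
  ifelse(x < 0, 0, b * e * exp(-eta * e) * (1 + eta * (1 - e)))
}

# Closed-form CDF, handy for checking the sampler.
psgompertz <- function(x, b, eta) {
  e <- exp(-b * x)
  ifelse(x < 0, 0, (1 - e) * exp(-eta * e))
}

# Accept-reject sampler with an Exponential(b) proposal (an assumption).
# f(x) / g(x) = exp(-eta * e) * (1 + eta * (1 - e)) <= 1 + eta = M.
rsgompertz <- function(n, b, eta) {
  M <- 1 + eta
  out <- numeric(0)
  while (length(out) < n) {
    m <- ceiling((n - length(out)) * M) + 10   # batch size, roughly n / acceptance rate
    x <- rexp(m, rate = b)                     # proposals
    e <- exp(-b * x)
    accept_prob <- exp(-eta * e) * (1 + eta * (1 - e)) / M
    out <- c(out, x[runif(m) <= accept_prob])  # keep accepted proposals
  }
  out[seq_len(n)]
}

# Usage: draw a sample and compare the empirical CDF with the exact one.
set.seed(1)
x <- rsgompertz(10000, b = 0.4, eta = 2)
c(empirical = mean(x <= 3), exact = psgompertz(3, b = 0.4, eta = 2))
```

With this proposal the expected acceptance rate is 1 / (1 + eta), so the sketch becomes inefficient for large eta and a different proposal would be preferable there.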

7 Visualizations You Should Learn in R

December 29, 2016

With an ever-increasing volume of data, it is impossible to tell stories without visualizations. Data visualization is the art of turning numbers into useful knowledge. R lets you learn this art by offering a set of built-in functions and libraries to build visualizations and present...

Quadratic Discriminant Analysis of Two Groups

December 29, 2016

As mentioned in the post on classification with linear discriminant analysis, LDA assumes the groups in question have equal covariance matrices. Therefore, when the groups do not have equal covariance matrices, observations are frequently assigned to groups with large variances on the diagonal of their corresponding covariance matrix...

Using R to prevent food poisoning in Chicago

December 29, 2016

There are more than 15,000 restaurants in Chicago, but fewer than 40 inspectors tasked with making sure they comply with food-safety standards. To help prioritize the facilities targeted for inspection, the City of Chicago used R to create a model that predicts which restaurants are most likely to fail an inspection. Using this model to deploy inspectors, the City...

Intermediate Tree 1

December 29, 2016

If you followed through the Basic Decision Tree exercise, this should be useful for you. This is like a continuation, but we add so much more. We are working with bigger and badder datasets. We will also be using techniques we learned from model evaluation and work with ROC, accuracy and other metrics.

An Interview With Jo Hardin, author of Foundations of Inference

December 29, 2016

Hey R fans! A new episode of DataCamp's DataChats video series is out! In this episode, we interview Jo Hardin. Jo is a Professor of Mathematics at Pomona College with many years of R experience. She has a pure passion for education and has been w...

Reactive acronym list in stratvis, a timevis-based Shiny app

December 29, 2016

Abstract: I present a method for reactively updating a table of acronyms from a Shiny interactive timeline using renderDataTable and timevis. The method is used in the new Shiny app, stratvis. The stratvis app: The stratvis Shiny app provides a rich a...

Euler Problem 5: Smallest Multiple

December 29, 2016

Solution to Euler Problem 5 in the R Language for Statistical Computing: What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20? The post Euler Problem 5: Smallest Multiple appeared first on The Devil is in the Data.
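
One possible way to approach the question, sketched here under the assumption that the least common multiple is built up pairwise with Euclid's algorithm; the original post may solve it differently. The helper names `gcd`, `lcm` and `smallest_multiple` are hypothetical.

```r
# Greatest common divisor by Euclid's algorithm.
gcd <- function(a, b) {
  while (b != 0) {
    r <- a %% b
    a <- b
    b <- r
  }
  a
}

# Least common multiple; dividing before multiplying keeps intermediates small.
lcm <- function(a, b) a / gcd(a, b) * b

# Smallest positive number evenly divisible by every integer from 1 to n.
smallest_multiple <- function(n) Reduce(lcm, seq_len(n))

smallest_multiple(10)  # 2520
smallest_multiple(20)  # 232792560
```

Folding `lcm` over 1 to n avoids testing candidate numbers one by one, which would be far slower for n = 20.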

The Instant Rise of Machine Intelligence?

December 28, 2016

Currently the news is filled with articles about the rise of machine intelligence, artificial intelligence and deep learning. For the average reader it seems that there was this single technical breakthrough that made AI possible. While I strongly bel...

Tip: Optimize your Rcpp loops

December 28, 2016

In this post, I will show you how to optimize your Rcpp loops so that they are 2 to 3 times faster than a standard implementation. Context Real data example For this post, I will use a big.matrix which represents genotypes for 15,283 individuals, corresponding to the number of mutations (0, 1 or 2) at 287,155 different loci. Here, I will use...

Combine choropleth data with raster maps using R

December 28, 2016

Switzerland is a country with lots of mountains, and several large lakes. While the political subdivisions (called municipalities) cover the high mountains and lakes, nothing much of economic interest happens in these places. (Raclette and sailing are wonderful, but don't count for our purposes.) For this reason, the Swiss Federal Statistical Office publishes the boundaries of the "productive" parts...

Exploratory Data Analysis Using R (Part-I)

December 28, 2016

"The greatest value of a picture is when it forces us to notice what we never expected to see." (John W. Tukey, Exploratory Data Analysis) Why do we use exploratory graphs in data analysis? Understand data properties. Find patterns in data. Suggest mod...

Celebrating our 100th R exercise set

December 28, 2016

Yesterday we published our 100th set of exercises on R-exercises. Kudos and many thanks to Avi, Maria Elisa, Euthymios, Francisco, Imtiaz, John, Karolis, Mary Anne, Matteo, Miodrag, Paritosh, Sammy, Siva, Vasileios, and Walter for contributing so much great material to practice R programming! Even more thanks to Onno, who is working (largely) behind the scenes

Making Shiny apps awesome

A week before Christmas, our CTO, Marek Rogala, gave a talk at an R enthusiasts meeting in Warsaw about ways to make Shiny apps do much more than usual. In case you missed the event, we have published his presentation online: NOTE! in this pr...

R code to accompany Real-World Machine Learning (Chapters 2-4 Updates)

December 28, 2016

Abstract: I updated the R code to accompany Chapters 2-4 of the book “Real-World Machine Learning” by Henrik Brink, Joseph W. Richards, and Mark Fetherolf to be more consistent with the listings and figures as presented in the book. rwml-R Chapters...

Behind the scenes of CRAN

December 27, 2016

(Just from my point of view as a package maintainer.) New users of R might not appreciate the full benefit of CRAN, and new package maintainers may not appreciate the importance of keeping their packages updated and free of warnings and errors. This is something I only came to realize myself in the last few years

More on Orthogonal Regression

December 27, 2016

Some time ago I wrote a post about orthogonal regression. This is where we fit a regression line so that we minimize the sum of the squares of the orthogonal (rather than vertical) distances from the data points to the regression line. Subsequently, I received the following email comment: "Thanks for this blog post. I enjoyed reading it. I'm wondering...
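
The idea of minimizing orthogonal distances can be illustrated with a short sketch. It relies on the standard result that, for two variables, the orthogonal regression line passes through the centroid of the data and points along the first principal component. The function name `orthogonal_fit` and the simulated data are hypothetical and not taken from the post.

```r
# Orthogonal (total least squares) line for two variables.
# The direction is the leading eigenvector of the sample covariance matrix.
orthogonal_fit <- function(x, y) {
  v <- eigen(cov(cbind(x, y)))$vectors[, 1]  # eigenvalues come in decreasing order
  slope <- v[2] / v[1]                       # undefined for a perfectly vertical line
  intercept <- mean(y) - slope * mean(x)     # the line passes through the centroid
  c(intercept = intercept, slope = slope)
}

# Usage on simulated (made-up) data, next to ordinary least squares.
set.seed(42)
x <- rnorm(50)
y <- 1 + 2 * x + rnorm(50, sd = 0.5)
orthogonal_fit(x, y)
coef(lm(y ~ x))
```

Unlike `lm`, this fit treats x and y symmetrically, so it depends on the units of both variables; rescaling one of them changes the fitted line.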

R For Beginners: Some Simple R Code to do Common Statistical Procedures, Part Two

December 27, 2016

An R tutorial by D. M. Wiig. This posting contains an embedded Word document. To view the document full screen, click on the icon in the lower right-hand corner of the embedded document.

Analyzing the 2015 California Health Interview Survey in R

December 27, 2016

A few years ago, I wrote about how to analyze the 2012 California Health Interview Survey in R. In 2012, plans for Covered California (Obamacare in California) were just beginning to take shape. Today, Covered California is a relatively mature program and it is arguably the most successful implementation of the Affordable Care Act in the United...

List Vol.2 Exercises

December 27, 2016

Answers to the exercises are available here. Exercise 1: Consider 3 vectors, day, month and year:
year=c(2005:2016)
month=c(1:12)
day=c(1:31)
Define a list Date such as:
Date=
$year 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016
$month 1 2 3 4 5 6 7 8 9 10 11 12
$day
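
A minimal sketch of how such a list could be built, assuming a named list with one element per vector is what the exercise intends; the official answers linked from the post may differ.

```r
# The three vectors given in the exercise.
year  <- 2005:2016
month <- 1:12
day   <- 1:31

# A named list keeps each vector under its own name, in the order given.
Date <- list(year = year, month = month, day = day)

str(Date)       # compact overview of the three elements
Date$month[3]   # elements are accessed by name, then indexed as usual
```

Naming the elements in the call to `list()` is what produces the `$year`, `$month` and `$day` headers when the list is printed.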

Parallelizing Data Analytics on Azure with the R Interface Tool

December 27, 2016

By Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft). In data science, developing a model with optimal performance often involves exploratory experiments on different sets of hyper-parameters. Preliminary analyses on smaller data can be performed on a single machine, while the experimental one on large-scale data by sweeping multi-sets of parameters can...

add layer to specific panel of facet_plot output

December 27, 2016

This is a question from the ggtree Google group: Dear ggtree team, how can I apply a geom_xxx to only one facet panel? For example if i want to get geom_hline(yintersect=1:30) or a geom_text() in the dot panel? I cant see the facet_grid(. ~ var) function call, so I don’t know which subsetting to use. I have already read http://stackoverflow.com/questions/29873155/geom-text-and-facets-not-working ...

An Overview of Room Rentals in Sydney | RSelenium, rvest, Leaflet, googleway

December 27, 2016

Started another data scraping script similar to the post about rental rates in Houston, except this time I picked Sydney, Australia. The site that I’ve selected uses an awful lot of JavaScript, so the rvest package won’t be enough in …

Why you should master R (even if it might eventually become obsolete)

December 27, 2016

In last week’s blog post I asked, “How much data science do you actually remember?” It’s a critical question. If you study data science, but forget everything that you learn, you’ll be in big trouble when you go in for an interview. Or, you’ll be in big trouble if you actually get a data science...

Spatial analysis pipelines with simple features in R

December 27, 2016

In November, sf, the new simple features package for R, hit CRAN. The package is like rgdal, sp, and rgeos rolled into one, is much faster, and allows for data processing with dplyr verbs! Also, as sf objects are represented in a much simpler way than sp objects, it allows for spatial analysis in R within...

Start with wordcloud

I followed my good resolutions on practising data analysis in my previous post and started to play with the French drug database. After importing the data, I started classically with data visualisation. In this database, there is a lot of text data. To visualise this, a word cloud is always welcome. Word clouds may not be accurate at all but...

Data Preparation, Long Form and tl;dr Form

December 26, 2016

Data preparation and cleaning are some of the most important steps of predictive analytics and data science tasks. They are laborious, where most of the errors are made, your last line of defense against wild data, and hold the biggest opportunities for outcome improvement. No matter how much time you spend on them, they …

The Basics of Bayesian Statistics

December 26, 2016

Bayesian Inference is a way of combining information from data with things we think we already know. For example, if we wanted to get an estimate of the mean height of people, we could use our prior knowledge that people are generally between 5 and 6 feet tall to inform the results from the data we collect. If our...
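
The height example can be made concrete with a small sketch. It assumes a normal prior on the mean and a known observation standard deviation, which is one simple conjugate setup and not necessarily the one used in the post. The prior (mean 5.5 feet, standard deviation 0.25, so that roughly 95% of prior mass lies between 5 and 6 feet), the observation standard deviation and the three heights are all made-up illustrative values, and the function name `posterior_normal_mean` is hypothetical.

```r
# Posterior for a normal mean with known observation sd and a normal prior.
# Precisions (inverse variances) add, and the posterior mean is a
# precision-weighted average of the prior mean and the sample mean.
posterior_normal_mean <- function(y, sigma, prior_mean, prior_sd) {
  n <- length(y)
  prior_prec <- 1 / prior_sd^2
  data_prec  <- n / sigma^2
  post_var   <- 1 / (prior_prec + data_prec)
  post_mean  <- post_var * (prior_prec * prior_mean + data_prec * mean(y))
  c(mean = post_mean, sd = sqrt(post_var))
}

# Usage with made-up heights in feet.
heights <- c(5.8, 5.9, 6.0)
posterior_normal_mean(heights, sigma = 0.3, prior_mean = 5.5, prior_sd = 0.25)
```

The posterior mean always falls between the prior mean and the sample mean, and it moves toward the sample mean as more data are collected.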
