Intermediate Tree 1

December 29, 2016

If you followed through the Basic Decision Tree exercise, this should be useful for you. This is a continuation, but with much more added. We are working with a bigger and badder dataset. We will also be using techniques we learned from model evaluation, working with ROC, accuracy and other metrics. Answers
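As a rough illustration of the metrics mentioned above, here is a minimal base R sketch of accuracy and the area under the ROC curve. The labels, scores, the 0.5 cutoff and the helper name `auc` are all made up for illustration and are not taken from the exercise set.

```r
# Hypothetical true classes and predicted probabilities (illustrative only)
labels <- c(0, 0, 1, 1)
scores <- c(0.1, 0.4, 0.35, 0.8)

# Accuracy: share of correct predictions at an assumed 0.5 cutoff
predicted <- as.integer(scores >= 0.5)
accuracy <- mean(predicted == labels)

# AUC via the rank-sum (Mann-Whitney) identity: the probability that a
# randomly chosen positive case is scored above a randomly chosen negative one
auc <- function(labels, scores) {
  r <- rank(scores)
  n_pos <- sum(labels == 1)
  n_neg <- sum(labels == 0)
  (sum(r[labels == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}
auc_value <- auc(labels, scores)
```

Accuracy depends on the chosen cutoff, while the AUC summarises ranking quality across all cutoffs, which is why the two are usually reported together.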

Read more »

An Interview With Jo Hardin, author of Foundations of Inference

December 29, 2016

Hey R fans! A new episode of DataCamp's DataChats video series is out! In this episode, we interview Jo Hardin. Jo is a Professor of Mathematics at Pomona College with many years of R experience. She has a pure passion for education and has been w...

Read more »

Reactive acronym list in stratvis, a timevis-based Shiny app

December 29, 2016

Abstract: I present a method for reactively updating a table of acronyms from a Shiny interactive timeline using renderDataTable and timevis. The method is used in the new Shiny app, stratvis. The stratvis app: The stratvis Shiny app provides a rich a...

Read more »

Euler Problem 5: Smallest Multiple

December 29, 2016

Solution to Euler Problem 5 in the R Language for Statistical Computing: What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20? The post Euler Problem 5: Smallest Multiple appeared first on The Devil is in the Data.
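One possible way to approach the question, not necessarily the one used in the linked post, is to fold a least-common-multiple function over the numbers 1 to 20. The helper names `gcd` and `lcm` below are defined here for illustration.

```r
# Greatest common divisor by Euclid's algorithm
gcd <- function(a, b) if (b == 0) a else gcd(b, a %% b)

# Least common multiple; divide first to keep intermediate values small
lcm <- function(a, b) a / gcd(a, b) * b

# Fold lcm over 1..20 to get the smallest number divisible by all of them
smallest_multiple <- Reduce(lcm, 1:20)
print(smallest_multiple)
```

The same fold over 1:10 gives 2520, the example quoted in the original Project Euler problem statement.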

Read more »

The Instant Rise of Machine Intelligence?

December 28, 2016

Currently the news is filled with articles about the rise of machine intelligence, artificial intelligence and deep learning. For the average reader it seems that there was this single technical breakthrough that made AI possible. While I strongly bel...

Read more »

Tip: Optimize your Rcpp loops

December 28, 2016

In this post, I will show you how to optimize your Rcpp loops so that they are 2 to 3 times faster than a standard implementation. Context: Real data example. For this post, I will use a big.matrix which represents genotypes for 15,283 individuals, corresponding to the number of mutations (0, 1 or 2) at 287,155 different loci. Here, I will use...

Read more »

Combine choropleth data with raster maps using R

December 28, 2016

Switzerland is a country with lots of mountains, and several large lakes. While the political subdivisions (called municipalities) cover the high mountains and lakes, nothing much of economic interest happens in these places. (Raclette and sailing are wonderful, but don't count for our purposes.) For this reason, the Swiss Federal Statistical Office publishes the boundaries of the "productive" parts...

Read more »

Exploratory Data Analysis Using R (Part-I)

December 28, 2016

"The greatest value of a picture is when it forces us to notice what we never expected to see." (John W. Tukey, Exploratory Data Analysis) Why do we use exploratory graphs in data analysis? Understand data properties. Find patterns in data. Suggest mod...

Read more »

Celebrating our 100th R exercise set

December 28, 2016

Yesterday we published our 100th set of exercises on R-exercises. Kudos and many thanks to Avi, Maria Elisa, Euthymios, Francisco, Imtiaz, John, Karolis, Mary Anne, Matteo, Miodrag, Paritosh, Sammy, Siva, Vasileios, and Walter for contributing so much great material to practice R programming! Even more thanks to Onno, who is working (largely) behind the scenes

Read more »

Making Shiny apps awesome


A week before Christmas our CTO, Marek Rogala, gave a talk at an R enthusiasts' meeting in Warsaw about ways to make Shiny apps do much more than usual. In case you missed the event, we have published his presentation online: NOTE! in this pr...

Read more »

R code to accompany Real-World Machine Learning (Chapters 2-4 Updates)

December 28, 2016

Abstract: I updated the R code to accompany Chapters 2-4 of the book “Real-World Machine Learning” by Henrik Brink, Joseph W. Richards, and Mark Fetherolf to be more consistent with the listings and figures as presented in the book. rwml-R Chapters...

Read more »

Behind the scenes of CRAN

December 27, 2016

(Just from my point of view as a package maintainer.) New users of R might not appreciate the full benefit of CRAN, and new package maintainers may not appreciate the importance of keeping their packages updated and free of warnings and errors. This is something I only came to realize myself in the last few years

Read more »

More on Orthogonal Regression

December 27, 2016

Some time ago I wrote a post about orthogonal regression. This is where we fit a regression line so that we minimize the sum of the squares of the orthogonal (rather than vertical) distances from the data points to the regression line. Subsequently, I received the following email comment: "Thanks for this blog post. I enjoyed reading it. I'm wondering...
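To make the definition of orthogonal regression concrete, here is a minimal sketch that takes the line through the data's centroid along the first principal component, which minimizes the sum of squared orthogonal distances. The function name `orthogonal_fit` and the toy data are assumptions for illustration, and this is not necessarily the method used in the linked post.

```r
# Orthogonal (total least squares) line via the first principal component
orthogonal_fit <- function(x, y) {
  pc <- prcomp(cbind(x, y))           # centers the data by default
  v <- pc$rotation[, 1]               # direction of greatest variance
  slope <- unname(v[2] / v[1])        # assumes the fitted line is not vertical
  intercept <- mean(y) - slope * mean(x)
  c(intercept = intercept, slope = slope)
}

# Toy data lying exactly on y = 2x + 1 (hypothetical)
x <- c(1, 2, 3, 4, 5)
y <- 2 * x + 1
fit <- orthogonal_fit(x, y)
print(fit)
```

Unlike ordinary least squares, this fit treats x and y symmetrically, so rescaling one of the variables changes the fitted line.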

Read more »

R For Beginners: Some Simple R Code to do Common Statistical Procedures, Part Two

December 27, 2016

An R tutorial by D. M. Wiig. This posting contains an embedded Word document. To view the document full screen, click on the icon in the lower right-hand corner of the embedded document.

Read more »

Analyzing the 2015 California Health Interview Survey in R

December 27, 2016

A few years ago, I wrote about how to analyze the 2012 California Health Interview Survey in R. In 2012, plans for Covered California (Obamacare in California) were just beginning to take shape. Today, Covered California is a relatively mature program and it is arguably the most successful implementation of the Affordable Care Act in the United...

Read more »

List Vol.2 Exercises

December 27, 2016

Answers to the exercises are available here.
Exercise 1
Consider 3 vectors, day, month and year:
year=c(2005:2016)
month=c(1:12)
day=c(1:31)
Define a list Date such as:
Date=
$year
2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016
$month
1 2 3 4 5 6 7 8 9 10 11 12
$day
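One way to build such a list, offered as a sketch rather than the official answer, is to pass the three vectors to `list()` as named elements in the order shown in the expected output.

```r
# The three vectors given in the exercise
year <- c(2005:2016)
month <- c(1:12)
day <- c(1:31)

# Named list whose elements print as $year, $month and $day
Date <- list(year = year, month = month, day = day)
str(Date)
```

Elements can then be accessed by name, for example `Date$month` or `Date[["day"]]`.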

Read more »

Parallelizing Data Analytics on Azure with the R Interface Tool

December 27, 2016

by Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft). In data science, to develop a model with optimal performance, exploratory experiments on different sets of hyper-parameters are often performed. Preliminary analyses on smaller data can be performed on a single machine, while experiments on large-scale data that sweep multiple sets of parameters can...

Read more »

add layer to specific panel of facet_plot output

December 27, 2016

This is a question from the ggtree Google group: Dear ggtree team, how can I apply a geom_xxx to only one facet panel? For example if i want to get geom_hline(yintersect=1:30) or a geom_text() in the dot panel? I cant see the facet_grid(. ~ var) function call, so I don’t know which subsetting to use. I have already read http://stackoverflow.com/questions/29873155/geom-text-and-facets-not-working ...

Read more »

An Overview of Room Rentals in Sydney | RSelenium, rvest, Leaflet, googleway

December 27, 2016

I started another data scraping script similar to the post about rental rates in Houston, except this time I picked Sydney, Australia. The site that I’ve selected uses an awful lot of JavaScript, so the rvest package won’t be enough in …

Read more »

Why you should master R (even if it might eventually become obsolete)

December 27, 2016

In last week’s blog post I asked, “How much data science do you actually remember?” It’s a critical question. If you study data science, but forget everything that you learn, you’ll be in big trouble when you go in for an interview. Or, you’ll be in big trouble if you actually get a data science

Read more »

Spatial analysis pipelines with simple features in R

December 27, 2016

In November, sf, the new simple features package for R, hit CRAN. The package is like rgdal, sp, and rgeos rolled into one, is much faster, and allows for data processing with dplyr verbs! Also, as sf objects are represented in a much simpler way than sp objects, it allows for spatial analysis in R within...

Read more »

Start with wordcloud


Following the good resolutions on practising data analysis that I made in my previous post, I started to play with the French drug database. After importing the data, I started in the classic way, with data visualisation. This database contains a lot of text data. To visualise it, a word cloud is always welcome. Word clouds may not be accurate at all but...

Read more »

Data Preparation, Long Form and tl;dr Form

December 26, 2016

Data preparation and cleaning are some of the most important steps of predictive analytics and data science tasks. They are laborious, they are where most of the errors are made, they are your last line of defense against wild data, and they hold the biggest opportunities for outcome improvement. No matter how much time you spend on them, they …

Read more »

The Basics of Bayesian Statistics

December 26, 2016

Bayesian Inference is a way of combining information from data with things we think we already know. For example, if we wanted to get an estimate of the mean height of people, we could use our prior knowledge that people are generally between 5 and 6 feet tall to inform the results from the data we collect. If our...
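The height example can be sketched with a conjugate normal model, in which the posterior mean is a precision-weighted average of the prior mean and the sample mean. Everything numeric below is an assumption made for illustration: the prior "between 5 and 6 feet" is encoded as a normal distribution with mean 5.5 and standard deviation 0.25, the sampling standard deviation is treated as known, and the measurements are invented.

```r
# Prior belief about the mean height in feet (assumed encoding of "5 to 6 feet")
prior_mean <- 5.5
prior_sd <- 0.25

# Hypothetical measurements and an assumed known sampling standard deviation
heights <- c(5.9, 5.7, 6.1, 5.8, 6.0)
sigma <- 0.3
n <- length(heights)

# Precisions (inverse variances) of the prior and of the sample mean
prior_prec <- 1 / prior_sd^2
data_prec <- n / sigma^2

# Posterior for the mean: precision-weighted average of prior and data
post_mean <- (prior_prec * prior_mean + data_prec * mean(heights)) /
  (prior_prec + data_prec)
post_sd <- sqrt(1 / (prior_prec + data_prec))

print(c(post_mean = post_mean, post_sd = post_sd))
```

The posterior mean lands between the prior mean and the sample mean, and it moves closer to the sample mean as more data are collected.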

Read more »

Descriptive Analytics-Part 5: Data Visualisation (Spatial data)

December 25, 2016

Descriptive Analytics is the examination of data or content, usually manually performed, to answer the question “What happened?”. To be able to solve this set of exercises, you should have solved part 0, part 1, part 2, part 3, and part 4 of this series, and you should also run this script which

Read more »

Building Shiny App exercises part 3

December 25, 2016

ADD CONTROL WIDGETS: Welcome to the third part of our series. In this part you will learn how to build and place inside your app the rest of the widgets which were mentioned in part 2. More specifically we will analyze: 1) helpText, 2) numericInput, 3) radioButtons, 4) selectInput, 5) sliderInput and 6) textInput. As

Read more »

Googly: An interactive app for analyzing IPL players, matches and teams using R package yorkr

December 25, 2016

Presenting ‘Googly’, a cool Shiny app that I developed over the last couple of days. This interactive Shiny app was on my mind for quite some time, and I finally got down to implementing it. The Googly Shiny app is based on my R package ‘yorkr’ which is now available on CRAN. The R package …

Read more »

Extracting data on shadow economy from PDF tables

December 25, 2016

Data on the shadow economy? I’m reading Kenneth Rogoff’s The Curse of Cash. It was one of Bloomberg’s Best Books of 2016 and the Financial Times’ Best Economics Books of 2016, and I recommend it. It’s an excellent and convincing book, makin...

Read more »

Christmas Tree with ggplot

December 25, 2016

# create data
x <- c(8,7,6,7,6,5,6,5,4,5,4,3,4,3,2,3,2,1,0.5,0.1)
dat1 <- data.frame(x1 = 1:length(x), x2 = x)
dat2 <- data.frame(x1 = 1:length(x), x2 = -x)
dat1$xvar <- dat2$xvar <- NA
dat1$yvar <- dat2$yvar <- NA
dat1$siz <- dat2$siz <- NA
dat1$col <- dat2$col dec_threshold){
dat1$xvar <- row #sample(1:dat1$x1,1)
dat1$yvar <- sample(1:dat1$x2-1,1)
dat1$siz <- runif(1,0.5,1.5)
dat1$col dec_threshold){
dat2$xvar <-

Read more »
