Answer probability questions with simulation

August 20, 2017

Probability is at the heart of data science. Simulation is also commonly used in algorithms such as the bootstrap. After completing this exercise, you will have a slightly stronger intuition for probability and for writing your own simulation algorithms. Most of the problems in this set have an exact analytical solution, which is not the case … Related exercise sets: Hacking statistics...
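
As a minimal sketch of the idea (not taken from the exercise set itself), a probability question with a known exact answer can be checked by simulation. The example below assumes two fair dice and estimates the chance that they sum to 7; the seed and the number of simulated rolls are arbitrary choices.

```r
# Estimate P(sum of two fair dice == 7) by simulation.
# The exact answer is 6/36 = 1/6, so the estimate can be checked against it.
set.seed(42)        # arbitrary seed, used only to make the run repeatable
n_sims <- 100000    # number of simulated rolls (an assumption, not from the post)

die1 <- sample(1:6, n_sims, replace = TRUE)
die2 <- sample(1:6, n_sims, replace = TRUE)

estimate <- mean(die1 + die2 == 7)  # share of rolls where the sum is 7
exact <- 1 / 6

cat("simulated:", estimate, " exact:", exact, "\n")
```

With this many rolls the simulated share should sit close to the exact value; a smaller n_sims gives a noisier estimate.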

#10: Compacting your Shared Libraries, After The Build

August 20, 2017

Welcome to the tenth post in the rarely ranting R recommendations series, or R4 for short. A few days ago we showed how to tell the linker to strip shared libraries. As discussed in the post, there are two options. One can either set up ~/.R/Makevars by passing the strip-debug option to the linker. Alternatively, one can adjust src/Makevars...

joyplot for GSEA result

August 20, 2017

I am very glad to find that someone figured out how to use ggjoy with ggtree. I really love ggjoy and believe it can be a good tool to visualize gene set enrichment analysis (GSEA) results. DOSE/clusterProfiler support several visualization methods. running...

DART: Dropout Regulation in Boosting Ensembles

August 19, 2017

The dropout approach developed by Hinton has been widely employed in deep learning to prevent deep neural networks from over-fitting, as shown in https://statcompute.wordpress.com/2017/01/02/dropout-regularization-in-deep-neural-networks. In the paper http://proceedings.mlr.press/v38/korlakaivinayak15.pdf, dropout is also proposed to address the over-fitting in tree boosting ensembles, e.g. MART, caused by the so-called “over-specialization”. In particular, while

How to install wgrib2 in OSX

August 19, 2017

Prompted by both my own struggles with wgrib2 compilation and a plea on the rNOMADS email listserv, I’m going to describe how to compile and install wgrib2 on Mac OS. First of all, some background: wgrib2 is an excellent utility written by Wesley Ebisuzaki at NOAA.  It allows for a number of swift and stable

Is dplyr Easily Comprehensible?

August 19, 2017

dplyr is one of the most popular R packages. It is powerful and important. But is it in fact easily comprehensible? dplyr makes sense to those of us who use it a lot. And we can teach part-time R users a lot of the common good use patterns. But, is it an easy task to … Continue reading Is...

simplyR

August 19, 2017

simplyR is a web space where we’ll be posting practical and easy guides for solving real, important problems using the R programming language. As we aren’t fans of unnecessary complications, we’ll keep the content of our tutorials / R codes as simpl...

Zillow Rent Analysis

August 19, 2017

Hello Readers, This is a notification post – Did you realize our website has moved? The blog is live at New JA Blog under the domain http://www.journeyofanalytics.com . You can read about the rent analysis post here. If you received this post AND an email from anu_analytics, then please disregard this post. If you received this … Continue reading Zillow...

More things with the New Zealand Election Study by @ellis2013nz

August 19, 2017

A new cross tab tool: I recently put up a simple web app, built with R Shiny, to let users explore the relationship between party vote in the 2014 New Zealand general election and a range of demographic and attitudinal questions in the New Zealand Elec...

Obstacles to performance in parallel programming

August 18, 2017

Making your code run faster is often the primary goal when using parallel programming techniques in R, but sometimes the effort of converting your code to use a parallel framework leads only to disappointment, at least initially. Norman Matloff, author of Parallel Computing for Data Science: With Examples in R, C++ and CUDA, has shared chapter 2 of that...

Starting a Rmarkdown Blog with Bookdown + Hugo + Github

August 18, 2017

Finally, after 24h of failed attempts, I got my favourite Hugo theme up and running with RStudio and Blogdown. All the steps I followed are detailed in my new Blogdown entry, which is also a GitHub repo. After … Continue reading →

ggvis Exercises (Part-2)

August 18, 2017

Introduction: The ggvis package is used to make interactive data visualizations. The fact that it combines shiny’s reactive programming model and dplyr’s grammar of data transformation makes it a useful tool for data scientists. This package allows us to implement features like interactivity, but on the other hand every interactive ggvis plot must be … Related exercise sets: How to...

GoTr – R wrapper for An API of Ice And Fire

August 18, 2017

By Ava Yang. It’s Game of Thrones time again as the battle for Westeros is heating up. There are tons of ideas, ingredients and interesting analyses out there and I was craving my own flavour. So step zero, where is the data? Jenny Bryan’s purrr tutorial introduced the list got_chars, representing character information from the first five books, which seems not...

Estimating Gini coefficient when we only have mean income by decile by @ellis2013nz

August 18, 2017

Income inequality data: Ideally, the Gini coefficient used to estimate inequality is based on original household survey data with hundreds or thousands of data points. Often this data isn’t available due to access restrictions from privacy or other concer...
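
To make the setting concrete, here is a rough sketch (not the method from the post, which is not visible in this teaser) of a Gini coefficient computed from group means alone. It assumes ten equal-sized deciles and uses the mean absolute difference formula. The income values and the function name gini_from_group_means are hypothetical. Because everyone in a decile is treated as earning that decile's mean, within-decile inequality is ignored and the result understates the Gini from the full survey data.

```r
# Hypothetical mean incomes for ten equal-sized deciles (illustrative values only).
decile_means <- c(5, 9, 12, 15, 18, 22, 27, 33, 42, 70)

# Gini from group means, assuming every group has the same population share.
# G = sum_i sum_j |m_i - m_j| / (2 * n^2 * mean(m))
gini_from_group_means <- function(m) {
  n <- length(m)
  sum(abs(outer(m, m, "-"))) / (2 * n^2 * mean(m))
}

gini_deciles <- gini_from_group_means(decile_means)
cat("Gini from decile means:", round(gini_deciles, 3), "\n")
```

Two quick sanity checks on the formula: identical group means give 0, and two groups with means 0 and 1 give 0.5.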

Oil leakage… those old BMW’s are bad :-)

August 18, 2017

Introduction: My first car was a 13-year-old Mitsubishi Colt; I paid 3000 Dutch Guilders for it. I can still remember a friend who did not want me to park this car in front of his house because of possible … Continue reading →

RcppArmadillo 0.7.960.1.0

August 17, 2017

The bi-monthly RcppArmadillo release is out with a new version 0.7.960.1.0 which is now on CRAN, and will get to Debian in due course. And it is a big one. Lots of nice upstream changes from Armadillo, and lots of work on our end as the Google Summe...

2017 App Update

August 17, 2017

As you may have noticed, we have made a few changes to our apps for the 2017 season to bring you a smoother and quicker experience while also adding more … The post 2017 App Update appeared first on Fantasy Football Analytics.

Chapman University DataFest Highlights

August 17, 2017

Editor’s Note: The 2017 Chapman University DataFest was held during the weekend of April 21-23. The 2018 DataFest will be held during the weekend of April 27-29. DataFest was founded by Rob Gould in 2011 at UCLA with 40 students. In just seven years, it has grown to 31 sites in three countries. Have a look at Mine Çetinkaya-Rundel’s post...

I made a 3D movie with ggplot2 once – here’s how I did it

August 17, 2017

Some time ago (last year actually 😳) I had a blast developing a feature for ggforce which had been on my mind for far too long than its limited utility warranted. The idea was to showcase the new facetting extension powers I’d added to ggplot2 by...

RStudio Server Pro is ready for BigQuery on the Google Cloud Platform

August 17, 2017

RStudio is excited to announce the availability of RStudio Server Pro on the Google Cloud Platform. RStudio Server Pro GCP is identical to RStudio Server Pro, but with additional convenience for data scientists, including pre-installation o...

20 years of the R Core Group

August 17, 2017

The first "official" version of R, version 1.0.0, was released on February 29, 2000. But the R Project had already been underway for several years before then. Sharing this tweet, from yesterday, from R Core member Peter Dalgaard (@pdalgd): "It was twenty years ago today, Ross Ihaka got the band to play.... #rstats" pic.twitter.com/msSpPz2kyA ...

Simpson’s Rule for Approximating Definite Integrals in R

August 17, 2017

Part 9 of 9 in the series Numerical Analysis. Simpson’s rule is another closed Newton-Cotes formula for approximating integrals over an interval with equally spaced nodes. Unlike the trapezoidal rule, which employs straight lines to approximate a definite integral, Simpson’s rule uses a second-degree Lagrange polynomial (a parabola through three nodes) to approximate the definite integral... The post Simpson’s Rule for Approximating Definite Integrals in R...
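
The rule can be sketched in a few lines of R. This is an illustrative composite version written for this listing, not the code from the linked post. The function name simpson and the default n = 100 are assumptions, and f is assumed to be vectorized.

```r
# Composite Simpson's rule on [a, b] with n equal subintervals (n must be even).
simpson <- function(f, a, b, n = 100) {
  stopifnot(n >= 2, n %% 2 == 0)
  h <- (b - a) / n                        # spacing between nodes
  x <- seq(a, b, length.out = n + 1)      # n + 1 equally spaced nodes
  # Node weights follow the pattern 1, 4, 2, 4, ..., 2, 4, 1
  w <- c(1, rep(c(4, 2), length.out = n - 1), 1)
  (h / 3) * sum(w * f(x))
}

# The integral of sin(x) over [0, pi] is exactly 2.
approx_val <- simpson(sin, 0, pi)
cat("Simpson approximation:", approx_val, "\n")
```

Because each pair of subintervals is fitted with a parabola, the rule is exact for polynomials up to degree three, which gives a convenient check: simpson(function(x) x^3, 0, 1, n = 2) should return 0.25.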

Probability functions beginner

August 17, 2017

In this set of exercises, we are going to explore some of the probability functions in R with practical applications. Basic probability knowledge is required. Note: We are going to use random number functions and random process functions in R such as runif. A problem with these functions is that every time you run them … Related exercise sets: Hacking statistics...
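
The teaser is cut off before it names the problem. A common concern with runif is that repeated calls return different draws, and the usual remedy is set.seed. Assuming that is the point being made, a small sketch (the seed value 123 is arbitrary):

```r
# Without a fixed seed, two calls to runif() give different draws.
a <- runif(3)
b <- runif(3)
print(identical(a, b))

# Resetting the same seed before each call reproduces the same draws.
set.seed(123)
x <- runif(3)
set.seed(123)
y <- runif(3)
print(identical(x, y))
```

Fixing the seed at the top of a script is what lets readers of an exercise set compare their numbers with the published solutions.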

Tesseract and Magick: High Quality OCR in R

August 17, 2017

Last week we released an update of the tesseract package to CRAN. This package provides R bindings to Google's OCR library Tesseract. Install it with install.packages("tesseract"). The new version ships with the latest libtesseract 3.05.01 on Windows and MacOS. Furthermore it includes enhancements for managing language data and using tesseract together with the magick package. Installing Language Data: The new version has several improvements for installing...

Update on Our ‘revisit’ Package

August 16, 2017

On May 31, I made a post here about our R package revisit, which is designed to help remedy the reproducibility crisis in science. The intended user audience includes reviewers of research manuscripts submitted for publication, scientists who wish to confirm the results in a published paper, and explore alternate analyses, and members of the … Continue reading Update...

Visualising Water Consumption using a Geographic Bubble Chart

August 16, 2017

A geographic bubble chart is a straightforward method to visualise quantitative information with a geospatial relationship. Last week I was in Vietnam helping the Phú Thọ Water Supply Joint Stock Company with their data science. They asked me to create … Continue reading → The post Visualising Water Consumption using a Geographic Bubble Chart appeared first on The Devil is...

Use the LENGTH statement to pre-set the lengths of character variables in SAS – with a comparison to R

I often create character variables (i.e. variables with strings of text as their values) in SAS, and they sometimes don’t render as expected. Here is an example involving the built-in data set SASHELP.CLASS. Here is the code:

data c1;
     set sashelp.class;
     * define a new character variable to classify someone as tall or

How to build an image recognizer in R using just a few images

August 16, 2017

Microsoft Cognitive Services provides several APIs for image recognition, but if you want to build your own recognizer (or create one that works offline), you can use the new Image Featurizer capabilities of Microsoft R Server. The process of training an image recognition system requires LOTS of images — millions and millions of them. The process involves feeding those...

Thank You For The Very Nice Comment

August 16, 2017

Somebody nice reached out and gave us this wonderful feedback on our new Supervised Learning in R: Regression (paid) video course. Thanks for a wonderful course on DataCamp on XGBoost and Random forest. I was struggling with Xgboost earlier and Vtreat has made my life easy now :). Supervised Learning in R: Regression covers a … Continue reading Thank...
