An architecture for real-time scoring with R

March 1, 2019

Let's say you've developed a predictive model in R, and you want to embed predictions (scores) from that model into another application (like a mobile or Web app, or some automated service). If you expect a heavy load of requests, R running on a single server isn't going to cut it: you'll need some kind of distributed architecture with...

The delta method and its implementation in R

March 1, 2019

Suppose that you have a sample of a variable of interest, e.g. the heights of men in a certain population, and for some obscure reason you are interested not in the mean height μ but in its square μ². How would you perform inference on μ², e.g. test a hypothesis or calculate a confidence interval? The delta …
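
As a hedged sketch of where this excerpt is heading (the linked post may set things up differently): assume an i.i.d. sample of size n with sample mean x̄ and sample standard deviation s. This notation is an assumption and does not appear in the excerpt. The first-order delta method applied to g(μ) = μ² then gives, provided μ ≠ 0:

```latex
\begin{align*}
\sqrt{n}\,\bigl(g(\bar{X}) - g(\mu)\bigr) &\xrightarrow{d} N\bigl(0,\; g'(\mu)^2 \sigma^2\bigr),
  \qquad g(\mu) = \mu^2,\quad g'(\mu) = 2\mu, \\
\widehat{\operatorname{se}}\bigl(\bar{X}^2\bigr) &\approx \frac{2\,|\bar{x}|\,s}{\sqrt{n}}, \\
\text{approximate 95\% CI for } \mu^2 &: \quad \bar{x}^2 \pm 1.96 \cdot \frac{2\,|\bar{x}|\,s}{\sqrt{n}}.
\end{align*}
```

The approximation relies on g'(μ) being nonzero; at μ = 0 the first-order term vanishes and this interval is not valid.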

Powerball demystified

March 1, 2019

The US Powerball lottery hysteria took another step when no one won the big jackpot in the last draw that took place on October 20, 2018. So, the total jackpot is now 2.22 billion dollars. I am sure that you want to win this jackpot. I myself want to win it. Actually, there are two different …

R Journal publication

March 1, 2019

The R Journal is the open access, refereed journal of the R project for statistical computing. It features short to medium length articles covering topics that should be of interest to users or developers of R. Christoph Weiss, Gernot Roetzer and I have joined forces to write an R package and the accompanying paper: Forecast...

A brief history of clinical trials

March 1, 2019

The earliest report of a clinical trial is probably provided in the Book of Daniel. Daniel and a group of other Jewish people who stayed at the palace of the king of Babylon did not want to eat the king’s non-Kosher food and preferred a vegetarian diet. To show that a vegetarian and Kosher diet is healthier, …

Bayesian state space modelling of the Australian 2019 election by @ellis2013nz

So I’ve been back in Australia for five months now. While things have been very busy in my new role at Nous Group, it’s not so busy that I’ve failed to notice there’s a Federal election due some time by November this year. I’m keen to apply some of the techniques I used in New Zealand in the richer...

What is logistic in the logistic regression?

March 1, 2019

Suppose that you are interviewed for a data scientist role. You are asked about logistic regression, and you answer all sorts of questions: how to run it in Python, how you would perform feature selection, and how you would use it for prediction. For the last question you answer that if you have the estimates of the regression …

Some comments on AB testing implementation

March 1, 2019

Many job postings in the field of technology (mainly for Data Scientist jobs, but not only) require knowledge and/or experience in “AB testing”. What is AB testing? A brief look at Wikipedia reveals that this is a method for assessing the impact of a certain change when it is carried out. For example, one may seek …

Binning Data in a Database

February 28, 2019

Roz King just wrote an interesting article on binning data (a common data analytics step) in a database. He compares a case-based approach (where the bin divisions are stuffed into code) with a join-based approach. He shares code and timings. Best of all: rquery gets some attention and turns out to be the dominant …

Binning Columns in Remote Tables with dplyr and rquery

February 28, 2019

We’ll benchmark the performance of three methods for binning columns in remote database tables in R. The CASE WHEN statement (dplyr::case_when) and a natural join from dplyr will be compared to a natural join with the rquery package.

Creating a Favicon with R

February 28, 2019

I use the Hugo Coder theme for this website, but I don’t like the default favicon, so I decided to make a new one using ggplot2. For those of you who don’t know, a favicon is the little icon that shows up on your browser tab next to the website name (in most browsers). How hard could it be, right?...

Some R Packages for ROC Curves

February 28, 2019

In a recent post, I presented some of the theory underlying ROC curves, and outlined the history leading up to their present popularity for characterizing the performance of machine learning models. In this post, I describe how to search CRAN for packages to plot ROC curves, and highlight six useful packages. Although I began with a few ideas about packages...

htmlunitjars Updated to 2.34.0

February 28, 2019

The in-dev htmlunit package for javascript-“enabled” web-scraping without the need for Selenium, Splash or headless Chrome relies on the HtmlUnit library, and said library just released version 2.34.0 with a wide array of changes that should make it possible to scrape more gnarly javascript-“enabled” sites. The Chrome emulation is now also on par with Chrome 72...

How to make children eat more vegetables

February 28, 2019

Will plates with paintings of vegetables and fruit cause children to eat more vegetables and fruits? Here is an example of how not to test this hypothesis.

EARL London early bird tickets now on sale

February 28, 2019

Early bird tickets for the Enterprise Applications of the R Language Conference are now on sale! The EARL Conference is in its sixth year; it’s a cross-sector conference that focuses on the commercial use of the R programming language. Take a look at our highlights from last year: We are busy putting together another brilliant agenda, but there’s still...

drat All The 📦! : Enabling Easier Package Discovery and Installation with Your Own CRAN-like Repo for Your Packages

February 28, 2019

I’ve got a work-in-progress drat-ified CRAN-like repo for (eventually) all my packages over at CINC🔗 (“CINC is not CRAN” and it also sounds like “sync”). This is in parallel with a co-location/migration of all my packages to SourceHut (just waiting for the sr.ht alpha API to be baked) and a self-hosted public Gitea instance. Everything...

RStudio Instructor Training

February 27, 2019

We are pleased to announce the launch of RStudio’s instructor training and certification program. Its goal is to help people apply modern evidence-based teaching practices to teach data science using R and RStudio’s products, and to help people who need such training find the trainers they need. Like the training programs for flight instructors, the ski patrol, and the Carpentries,...

CDSBMexico: remember to apply for BioC2019 travel scholarships

February 27, 2019

This blog post was first published at the CDSBMexico website.

#CDSBMexico: remember to apply for BioC2019 travel scholarships!! Due date is March 15th https://t.co/iegG0qQzwu Let us help you! Here we give you some ideas 💡 We can also give you feedback via Slack ✅ #rstats #bioconductor @Bioconductor #bioc2019 #diversity #LatAm #rstatsES pic.twitter.com/EORg8d2Qxj - ComunidadBioInfo (@CDSBMexico), March 1, 2019

About 10 months ago we announced our plans to...

A wee look at group_map and group_split in dplyr

February 27, 2019

dplyr 0.8.0 launched recently, which you probably already know, but just in case you missed it... Two new functions have been catching my eye: group_map and group_split. The aim of this post: take a first look at these and try and get a ne...

Tips for drawing a normal distribution

February 27, 2019

KNOW THY INFLECTION POINT

We have drawn a lot of sorry-looking normal distributions in our life. It’s a shape that’s hard to get down without a lot of practice. Here are a few tips that can make it easier. Start by marking out standard deviations on the x axis from -3 to +3...
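
The headline tip can be made precise. For a normal density with mean μ and standard deviation σ (symbols assumed here for illustration, not taken from the excerpt), the curvature changes sign exactly one standard deviation from the mean:

```latex
\begin{align*}
f(x) &= \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \\
f''(x) &= \frac{f(x)}{\sigma^2}\left(\frac{(x-\mu)^2}{\sigma^2} - 1\right) = 0
  \iff x = \mu \pm \sigma, \\
\frac{f(\mu \pm \sigma)}{f(\mu)} &= e^{-1/2} \approx 0.607 .
\end{align*}
```

On an axis marked from -3 to +3 standard deviations, the curve therefore bends downward between -1 and +1 and bends upward outside that range, and it crosses the inflection points at roughly 61% of the peak height.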

Individual patch connectivity

February 27, 2019

Answering questions from users is actually a good way to find potentially interesting things to post… With the obvious advantage that other people might find it useful too! So… let’s start… This MetaLandSim user wanted a way to derive the contribution of each individual patch to overall landscape connectivity. Here I used the method presented …

Investigating words distribution with R – Zipf’s law

February 27, 2019

Hello again! Typically I would start by describing a complicated problem that can be solved using machine or deep learning methods, but today I want to do something different: I want to show you some interesting probabilistic phenomena! Have you heard of Zipf’s law? I hadn’t until recently. Zipf’s law is an empirical law that...

KDA–Robustness Results

February 27, 2019

This post will display some robustness results for KDA asset allocation. Ultimately, the two canary instruments fare much better using …

You Need to Start Branding Your Graphs. Here’s How, with ggplot!

February 26, 2019

In today's post I want to help you incorporate your company's branding into your ggplot graphs. Why should you care about this? I'm glad you asked! Have you ever seen a graph that looks like this? Of course you have! This is the default ggplot theme, and these graphs are everywhere. Now, look, I like the way this graph looks. The...

Robust Regressions: Dealing with Outliers in R

February 26, 2019

It is often the case that a dataset contains significant outliers, that is, observations that are significantly out of range from the majority of other observations in our dataset. Let us see how we can use robust regressions to deal with this issue. I described in another tutorial how we can run...

handlr: convert among citation formats

Citations are a crucial piece of scholarly work. They hold metadata on each scholarly work, including which people were involved, what year the work was published, where it was published, and more. The links between citations facilitate insight into many questions about scholarly work. Citations come in many different formats including BibTeX, RIS, JATS, and many more. This is not...

“If You Were an R Function, What Function Would You Be?”

February 26, 2019

We’ve been getting some good uptake on our piping in R article announcement. The article is necessarily a bit technical. But one of its key points comes from the observation that piping into names is a special opportunity to give general objects the following personality quiz: “If you were an R function, what function would …

uRos2019: tutorials, keynote speakers, registration and call for papers!

February 26, 2019

The 7th use of R in Official Statistics conference is the event for all things R in the production and use of government statistics. The 7th installment of this conference will take place from 20 to 21 May 2019 at …
