5 Reasons to Learn H2O for High-Performance Machine Learning

January 12, 2020

H2O is the scalable, open-source Machine Learning library that features AutoML. Here are 5 Reasons why it's an essential library for creating production data science code. Full-Stack Data Science Series This is part of a series of articles on ...

Read more »

pointblank v0.3

January 12, 2020

The newest release of the pointblank package makes it really easy to validate your data with workflows attuned to your data quality needs. You can install pointblank 0.3 from CRAN with: install.packages("pointblank") The design goals of pointblank are to enable two important data validation workflows with a common set of validation step functions and to have the code work seamlessly with...

Read more »

OMG O2G!

January 12, 2020

The oil-to-gas ratio was recently at its highest level since October 2013, as Middle East saber-rattling and a recovering global economy supported oil, while natural gas remained oversupplied despite entering the major draw season. Even though the ratio has eased in the last week, it remains over one standard deviation above its long-term average. Is now the time to...

Read more »

No Framework, No Problem! Structuring your project folder and creating custom Shiny components

January 12, 2020

Pedro Coutinho Silva is a software engineer at Appsilon Data Science. It is not always possible to create a dashboard that fully meets your expectations or requirements using only existing libraries. Maybe you want a specific function that needs to be custom built, or maybe you want to add your own style or company branding. Whatever the case, a moment...

Read more »

How to reverse engineer a heat map into its underlying values

January 12, 2020

Astrolabe Diagnostics is a fully bootstrapped five-person biotech startup. We offer the Antibody Staining Data Set (ASDS), a free service that helps immunologists find out the expression of different molecules (markers) across subsets in the immune system. Essentially, the ASDS is a big table of numbers, where every row is a subset and every column a … Continue reading How...

Read more »

Start 2020 with mad new skills you learned at rstudio::conf. Final Call

January 11, 2020

There will be no better time or place this year to accelerate your knowledge of all things R and RStudio than at rstudio::conf 2020 in San Francisco. While we’re approaching capacity, there’s still room for you! Whether you’re a dedic...

Read more »

Data Science Essentials

January 11, 2020

One of the greatest strengths of R for data science work is the vast number and variety of packages and capabilities that are available. However, it can be intimidating to navigate this large and dynamic open source ecosystem, especially for a newcomer. All the information you need is out there, but it is often fragmented across numerous Stack Overflow threads...

Read more »

New vtreat Feature: Nested Model Bias Warning

January 11, 2020

For quite a while we have been teaching that estimating variable re-encodings on the exact same data that is later naively used to train a model leads to an undesirable nested model bias. The vtreat package (both the R version and the Python version) incorporates a cross-frame method that allows one to use all the … Continue reading New...

Read more »

postdoc at Warwick on robust SMC [call]

January 11, 2020

Here is a call for a research fellow at the University of Warwick to work with Adam Johansen and Théo Damoulas on the EPSRC and Lloyd's Register Foundation funded project “Robust Scalable Sequential Monte Carlo with application to Urban Air Quality”. To quote: The position will be based primarily at the Department of Statistics of

Read more »

`R` you ready for python (gentle introduction to reticulate package)

January 10, 2020

Just like how Thanos claimed to be inevitable in The Avengers, the direct or indirect use of Python has become inevitable for R users in recent years. Fret not, R users: you don’t have to abandon your favourite IDE, RStudio, when using Python. With the reticulate package you can use Python in RStudio and even have a mixture of...

Read more »

Heterogeneous treatment effects and homogeneous outcome variances

January 10, 2020

Recently there have been a couple of meta-analyses investigating heterogeneous treatment effects by analyzing the ratio of the outcome variances in the treatment and control groups. The argument made in these articles is that if individuals differ in their response, then observed variances in the treatment and control groups in RCTs should differ. For instance, Winkelbeiner et al. (2019)...

Read more »

rfoaas 2.1.0: New upstream so new access point!

January 9, 2020

FOAAS, having been resting upstream for some time, released version 2.1.0 of its wonderful service this week! So without too much further ado we went to work and added support for it. And now we are in fact thrilled to announce that release 2.1.0 of...

Read more »

heatmaply 1.0.0 – beautiful interactive cluster heatmaps in R

January 9, 2020

I’m excited to announce that heatmaply version 1.0.0 has been published to CRAN! (getting started vignette is available here) What is heatmaply? heatmaply is an R package for easily creating interactive cluster heatmaps that can be shared online as a stand-alone HTML file. Interactivity includes a tooltip display of values when hovering over cells, as … Continue reading "heatmaply...

Read more »

glueformula: simply build regression formulas from vectors with variable names

January 9, 2020

The small new package glueformula with a single function gf facilitates constructing regression formulas from vectors with variable names. The syntax is similar to glue strings. Here is an example: # Example: build a formula # for ivreg with gf libra...

Read more »

A Year in Flow

January 8, 2020

Revisiting the river flow profile plot from an earlier post, the video below loops each day's flow profile for the Delaware River in 2019.  Data is from USGS gages processed using R and Windows Live Movie Maker.

Read more »

World map of visited countries in R

January 8, 2020

If, like me, you like traveling as much as R, you might want to draw a world map of the countries you have visited in R. Below is an example with the countries I have visited as of January 2020: To draw this map in R, you will need the following packages: library(highcharter) library(dplyr) library(maps) As usual, you need the packages to be installed on...

Read more »

Nonlinear combinations of model parameters in regression

Nonlinear regression plays an important role in my research and teaching activities. While I often use the ‘drm()’ function in the ‘drc’ package for my research work, I tend to prefer the ‘nls()’ function for teaching purposes, mainly because, in my opinion, the transition from linear models to nonlinear models is smoother, for beginners. One problem with ‘nls()’ is...

Read more »

BH 1.72.0-3 on CRAN

January 8, 2020

The BH 1.72.0-1 release of BH required one update 1.72.0-2 when I botched a hand-edited path (to comply with the old-school path-length-inside-tar limit). Turns out another issue needed a fix. This release improved on prior ones by starting from a p...

Read more »

Vignette: Podlover – A Package to Analyze Podcasting Data

January 8, 2020

Note: Some of the code blocks below got reformatted by the WordPress editor. To see the working code, please visit the original vignette in the Repo’s README.md file. The Backstory: Podlove – a WordPress plugin for Podcasting The Podlove podcasting suite is an open source toolset to help you publish and manage a podcast within … Continue reading Vignette:...

Read more »

Validate Me! Simple Test vs. Holdout Samples in R

January 8, 2020

In statistics, it is often necessary to not only model data but test that model as well. To do this, you need to randomly separate the data into two groups ensuring even samples regardless of … The post Validate Me! Simple Test vs. Holdout Samples in R appeared first on ProgrammingR.

Read more »

How to use your Garmin watch to tell your team you’re going for a run

January 7, 2020

Building an API in NodeJS and R to send a message to Slack from your Garmin watch. Why on earth? ThinkR is a remote company, meaning that we all work from home. On top of other cool things about remote work, this allows me to skip my lunch break and...

Read more »

From Shock to Competence: How Not to Panic When You Receive E-mail from CRAN about Failed Checks

January 7, 2020

This post was contributed by Julia Romanowska, Researcher at the University of Bergen, Norway. Thank you, Julia! I’m involved in the development of the Haplin R package, which enables fast genetic association analyses (very useful for those involved in genetic epidemiology research). We are a team of scientists with various backgrounds, from genetics through bioinformatics to statistics. Here’s a short...

Read more »

Python vs. R for Data Science: What’s the Difference?

January 7, 2020

If you’re new to data science, or your organization is, you’ll need to pick a language to analyze your data and a thoughtful way to make that decision. Full disclosure: While I can write Python, my background is mostly in the R community—but I'll try my best to be non-partisan. The good news is that you don't need to sweat...

Read more »

Some More Thoughts on Impostering

January 7, 2020

Two years ago, I wrote about meta-learning to fight imposter feelings. In this blog I made a distinction between impostering because you don’t feel you are up to the job, and because you feel you ought to know something which you don’t. The meta-learning blog focuses on how you define yourself as a data scientist and what, as a...

Read more »

Psst, don’t tell anybody: The World is getting more rational!

January 7, 2020

Happy New Year to all of you! 2020 is here and it seems that we are being overwhelmed by more and more irrationality, especially fake news and conspiracy theories. In this post, I will give you some indication that this might actually not be the case (shock horror: good news alert!). We will be using … Continue reading "Psst,...

Read more »

How to create Bar Race Animation Charts in R

January 7, 2020

Bar Race Animation Charts have started going viral on social media, leaving a lot of data enthusiasts wondering how these charts are made. The objective of this post is to explain how to build such Bar Race Animation Charts using R and the power of its versatile packages. Packages The packages that are required to build animated...

Read more »

Game of Life, the DTerminal edition

January 6, 2020

Yet another offshoot from my Advent of Code 2019 adventures (first half, second half, all solutions): on day 24, the challenge was to program a variant of Conway’s game of life, and I figured I might as well try my approach on the real thing! A quick google for existing implementations yields three main approaches: nested for, shifting of matrix...

Read more »

The ‘Spelling Bee Honeycomb’ puzzle: efficient computation in R

January 6, 2020

Previously in this series: The “lost boarding pass” puzzle The “deadly board game” puzzle The “knight on an infinite chessboard” puzzle The “largest stock profit or loss” puzzle The “birthday paradox” puzzle I love 538’s Riddler column, and the January 3 puzzle is a fun one. I’ll quote: The New York Times recently launched some new...

Read more »

Working with Windows CMD system commands in R

January 6, 2020

From time to time when developing in R, wrangling data, or preparing for machine learning projects, one still needs to access operating system commands from within R. In this blog post, let’s take…

Read more »
