Big News: Porting vtreat to Python

July 20, 2019

We at Win-Vector LLC have some big news. We are finally porting a streamlined version of our R vtreat variable preparation package to Python. vtreat is a great system for preparing messy data for supervised machine learning. The new implementation is based on Pandas, and we are experimenting with pushing the sklearn.pipeline.Pipeline APIs to their …

Read more »

Program Evaluation: Difference-in-differences in R

July 20, 2019

Category: Advanced Modeling. Tags: Linear Regression, Logistic Regression, R Programming. Regression analysis is one of the most demanding machine learning methods...

Read more »

GEDCOM Reader for the R Language: Analysing Family History

July 20, 2019

Understanding who you are is strongly related to understanding your family history. Discovering ancestors is now a popular hobby, as many archives are available on the internet. The GEDCOM...

Read more »

Impressions from useR! 2019

July 20, 2019

This year, the greater R community gathering useR! took place in sunny Toulouse in July, bringing together over 1000 practitioners from both academia and industry. The event spanned over five...

Read more »

Generating a Gallery of Visualizations for a Static Website (using R)

July 19, 2019

While I was browsing the website of fellow R blogger Ryo Nakagawara, I was intrigued by his “Visualizations” page. The concept of creating an online “portfolio” is not novel, but I...

Read more »

Adding Syntax Highlight

July 19, 2019

Syntax highlighting: Previously, I posted entries without any syntax highlighting, as I was satisfied using basic blogdown and Hugo functions, until a Disqus member commented in the previous post to...

Read more »

Germination data and time-to-event methods: comparing germination curves

Very often, seed scientists need to compare the germination behaviour of different seed populations, e.g., different plant species, or one single plant species submitted to different temperatures, light conditions,...

Read more »

Analysis of a Flash Flood

July 19, 2019

Flash floods seem to be increasing in many areas. This post will show how to download local USGS flow and precipitation data and generate a 3-panel chart of flow,...

Read more »

Watch keynote presentations from the useR!2019 conference

July 19, 2019

The keynote presentations from last week's useR!2019 conference in Toulouse are now available for everyone to view on YouTube. (The regular talks were also recorded and video should follow...

Read more »

Time series forecast cross-validation by @ellis2013nz

Time series cross-validation is an important part of the toolkit for good evaluation of forecasting models. forecast::tsCV makes it straightforward to implement, even with different combinations of explanatory regressors...

Read more »

How to make 3D Plots in R (from 2D Plots of ggplot2)

July 19, 2019

Category: Visualizing Data. Tags: Best R Packages, Data Visualisation, R Programming. 3D Plots built in the right way for the right...

Read more »

What NOT to do when building a shiny app (lessons learned the hard way)

I’ve been building R shiny apps for a while now, and ever since I started working with shiny, it has significantly increased the set of services I offer my...

Read more »

Statistical matching, or when one single data source is not enough

I was recently asked how to go about matching several datasets where different samples of individuals were interviewed. This sounds like a big problem; say that you have dataset A...

Read more »

An R Users Guide to JSM 2019

July 18, 2019

If you are like me, and rather last minute about making a plan to get the most out of a large conference, you are just starting to think about...

Read more »

Dotplot – the single most useful yet largely neglected dataviz type

July 18, 2019

I have to confess that the core message of this post is not really a fresh saying. But if I was given a chance to deliver one piece of dataviz advice...

Read more »

Wordcloud of conference abstracts – FOSS4G Edinburgh

July 18, 2019

FOSS4G conference wordcloud of abstracts. Code included!

Read more »

RStudio Trainer Directory Launches

July 17, 2019

Several dozen people have taken part in RStudio’s instructor training and certification program since it was announced earlier this year. Since our last update, many of them have completed...

Read more »

Plotting Bayes Factors for multiple comparisons using ggsignif

This week my post is relatively short and very focused. What makes it interesting (at least to me) is whether it will be seen as a useful “bridge” between frequentist methods...

Read more »

Processing satellite image collections in R with the gdalcubes package

July 17, 2019

Contents: The problem; Introduction and overview of gdalcubes; Installation; Demo dataset; Creating image collections; Creating and processing data cubes; Chaining data cube...

Read more »

rOpenSci Hiring for New Position in Statistical Software Testing and Peer Review

Are you passionate about statistical methods and software? If so, we would love...

Read more »

Combining momentum and value into a simple strategy to achieve higher returns

July 17, 2019

In this post I'll introduce a simple investing strategy that is well diversified and has been shown to work across different markets. In short, buying cheap and uptrending stocks...

Read more »

An Ad-hoc Method for Calibrating Uncalibrated Models

July 16, 2019

In the previous article in this series, we showed that common ensemble models like random forest and gradient boosting are uncalibrated: they are not guaranteed to estimate aggregates or...

Read more »

Three Strategies for Working with Big Data in R

July 16, 2019

For many R users, it’s obvious why you’d want to use R with big data, but not so obvious how. In fact, many people (wrongly) believe that R just...

Read more »

101 Machine Learning Algorithms for Data Science with Cheat Sheets

July 16, 2019

Your one-stop-shop for machine learning algorithms. Each algorithm is complete with a short description and links to examples. If you would like to take the algorithms with you, click...

Read more »

shinymeta — a revolution for reproducibility

July 16, 2019

Joe Cheng presented shinymeta, which enables reproducibility in shiny, at useR! in July 2019. I am really thankful for this. This article shows a…

Read more »

Shiny Modules

July 16, 2019

“Tidiness is half the life”: this is a German saying that you might not necessarily...

Read more »

eRum2020 in Milan

July 16, 2019

The European R conference will visit Milan in 2020! Mirai Solutions is delighted to actively support and participate in the organization of the event. The European R Users Meeting (eRum)...

Read more »

Reinforcement Learning: Life is a Maze

July 16, 2019

It can be argued that most important decisions in life are some variant of an exploitation-exploration problem. Shall I stick with my current job or look for a new...

Read more »

Bojack Horseman and Tidy Data Principles (Part 1)

July 15, 2019

Motivation: After reading The Life Changing Magic of Tidying Text and A tidy text analysis of Rick and Morty I wanted to do something similar for Rick and Morty and...

Read more »
