Scraping Machinery Parts

November 10, 2019

I’ve been exploring the feasibility of aggregating data on prices of replacement parts for heavy machinery. There are a number of websites which list this sort of data. I’m focusing on the static sites for the moment. I’m using R with {rvest} (and a few other Tidyverse packages thrown in for good measure). library(glue) library(dplyr) library(purrr) library(stringr) library(rvest) The data are paginated. Fortunately the URL...

Read more »

A comparison of methods for predicting clothing classes using the Fashion MNIST dataset in RStudio and Python (Part 1)

November 10, 2019

Florianne Verkroost is a PhD candidate at Nuffield College at the University of Oxford, with a passion for data science and a background in mathematics and econometrics. She applies...

Read more »

Cleaning the Table

November 10, 2019

While I’m talking about getting data into R this weekend, here’s another quick example that came up in class this week. The mortality data in the previous example were...

Read more »

#TidyTuesday: horror films, squirrels and commuters

November 10, 2019

Tidy Tuesday is a fun weekly activity where a lot of R enthusiasts make different visualisations, and possibly modelling, of the same dataset. You can read more about it...

Read more »

Dangerous streets of Bratislava! Animated maps using open data in R

November 9, 2019

At work recently, I wanted to make an interesting animated visualization ready for a start-up pitch (presentation), and got my first experience with spatial data (e.g. polygons). I enjoyed working...

Read more »

future 1.15.0 – Lazy Futures are Now Launched if Queried

November 9, 2019

No dogs were harmed while making this release. future 1.15.0 is now on CRAN, accompanied by a recent, related update of future.callr 0.5.0. The main update is a change...

Read more »

Reading in Data

November 9, 2019

Here’s a common situation: you have a folder full of similarly-formatted CSV or otherwise structured text files that you want to get into R quickly and easily. Reading data...

Read more »

Rcpp 1.0.3: More Spit and Polish

November 9, 2019

The third maintenance release 1.0.3 of Rcpp, following up on the 10th anniversary and the 1.0.0 release both pretty much exactly one year ago, arrived on CRAN yesterday....

Read more »

Using Spark from R for performance with arbitrary code – Part 4 – Using the lower-level invoke API to manipulate Spark’s Java objects from R

November 9, 2019

Introduction: In the previous parts of this series, we have shown how to write functions as both combinations of dplyr verbs and SQL query generators that can be executed by...

Read more »

Learning Linux – the wrong way – day 2

November 8, 2019

Unborking the borked laptop - Recap: I’m trying to learn some Linux. Ostensibly to do some data science at the command line, because it feels...

Read more »

Instrumental variable regression and machine learning

Intro: Just as the question “what’s the difference between machine learning and statistics” has spilled a lot of ink (since at least Breiman (2001)), the same question but where...

Read more »

A small simple random sample will often be better than a huge not-so-random one by @ellis2013nz

November 8, 2019

An interesting big data thought experiment: The other day on Twitter I saw someone referencing a paper or a seminar or something that was reported to examine the following situation:...

Read more »

Gold-Mining Week 10 (2019)

November 7, 2019

Week 10 Gold Mining and Fantasy Football Projection Roundup now available. The post Gold-Mining Week 10 (2019) appeared first on Fantasy Football Analytics.

Read more »

OddsPlotty – the first official package I have ‘officially’ launched

November 7, 2019

Motivation for this: The background to this package is linked to a project I undertook about a year ago. The video relates to the project and the how R really...

Read more »

New R Support in Azure Machine Learning

November 7, 2019

Azure Machine Learning has added support for the R language, it was announced at the Ignite conference in Orlando this week. A new R package azuremlsdk (available to install...

Read more »

Tidyverse evolutions: curly-curly operator and pivoting (feat. tidytuesday data & leaflet visuals)

The tidyverse ecosystem is steadily growing and adapting to the needs of its users. As part of this evolution, existing tools are being replaced by new and better methods....

Read more »

Package Manager 1.1.0 – No Interruptions

November 6, 2019

No interruptions. That was our team’s goal for RStudio Package Manager 1.1.0 - we set out to make R package installation fast enough that it wouldn’t interrupt your work. More and...

Read more »

3D GPS data animation – virtually climb the Alps

November 6, 2019

Using the amazing package rayshader I wanted to render a video of my tour to Alpe d'Huez. Now I created an R package that can use any GPX file...

Read more »

Combining Price Elasticities and Sales Forecastings for Sales Improvement

November 6, 2019

How can you adjust your prices to meet your sales quota better? By combining sales forecasts and price elasticity estimations, you can make recommendations to increase the profit. In...

Read more »

styler 1.2.0

November 5, 2019

We are pleased to announce that styler 1.2.0 is now available on CRAN. All the features below were added after styler 1.1.0, except the ones listed under Other changes, which were added somewhere...

Read more »

rOpenSci Announces a New Award From The Gordon and Betty Moore Foundation to Improve the Scientific Package Ecosystem for R

Today we are pleased to announce that we have received new funding from the Gordon and Betty Moore Foundation. The $894k grant will help us improve infrastructure for R...

Read more »

renv: Project Environments for R

November 5, 2019

We’re excited to announce that renv is now available on CRAN! You can install renv with: install.packages("renv") renv is an R dependency manager. Use renv to make your projects more: Isolated: Each...

Read more »

KGC Climate Classification and Solar Irradiance through R Packages

November 5, 2019

I obviously haven't been blogging lately, but that doesn't mean that I haven't been thinking about what ought to be my next blog post. Fortunately,...

Read more »

Implicit Tax Rates on Consumption and Labor in Europe

November 5, 2019

The aim of this blog post is to compute the implicit tax rates (ITR) on consumption, labour and corporate income for France, Italy, Spain, Germany and the Euro Area...

Read more »

Data Science on Rails: Analyzing Customer Churn

November 5, 2019

Customer Relationship Management (CRM) is not only about acquiring new customers but especially about retaining existing ones. That is because acquisition is often much more expensive than retention. In...

Read more »

A First Look at Confidence Distributions

November 4, 2019

Using a probability distribution to characterize uncertainty is at the core of statistical inference. So, it seems natural to try to summarize the information about the parameters in statistical...

Read more »

RSiteCatalyst Version 1.4.16 Release Notes

November 4, 2019

It’s been a while since the last update, but RSiteCatalyst is still going strong! Thanks to Wen for submitting a fix/enhancement to enable the ability to use multiple columns...

Read more »

Spatial Data Analysis with INLA

November 4, 2019

by Virgilio Gómez Rubio. Introduction: In this session I will focus on Bayesian inference using the integrated nested Laplace approximation (INLA) method. As described...

Read more »

tidync: scientific array data from NetCDF in R

In May 2019 version 0.2.0 of tidync was approved by rOpenSci and accepted to CRAN. Here we provide a quick overview of the typical workflow with some pseudo-code for...

Read more »
