Implied risk premia

May 14, 2020

In our last post, we applied machine learning to the Capital Asset Pricing Model (CAPM) to try to predict future returns for the S&P 500. This analysis was part of our overall project to analyze the various methods to set return expectations when seeking to build a satisfactory portfolio. Others include historical averages and discounted cash flow models we...

R doesn’t need to throttle AWS Athena anymore

I am happy to announce that RAthena-1.9.0 and noctua-1.7.0 have been released to CRAN. They both bring two key features: more stability when working with AWS Athena, focusing on AWS Rate Exceeded throttling errors; and a new helper function to convert AWS S3 backend files to save cost. NOTE: RAthena and noctua features correspond to each other; as a result I will refer...

glmnet v4.0: generalizing the family parameter

May 14, 2020

I’ve had the privilege of working with Trevor Hastie on an extension of the glmnet package which has just been released. In essence, the glmnet() function’s family parameter can now be any object of class family. This enables the user …
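As a rough illustration of what passing a family object looks like, here is a minimal sketch. It assumes glmnet 4.0 or later is installed; the simulated data and the variable names (x, y, fit_builtin, fit_family) are made up for this example and do not come from the post.

```r
library(glmnet)

# Hypothetical toy data: 100 observations, 5 predictors, binary response
set.seed(1)
x <- matrix(rnorm(100 * 5), nrow = 100, ncol = 5)
y <- rbinom(100, size = 1, prob = 0.5)

# The long-standing interface: family given as a character string
fit_builtin <- glmnet(x, y, family = "binomial")

# The generalized interface: family given as an object of class "family",
# the same kind of object that stats::glm() accepts
fit_family <- glmnet(x, y, family = binomial())
```

The string form still uses the specialised built-in code path, so it is likely the faster choice when the model is one of the built-in families; the family-object form is the one that opens the door to other link functions and families.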

Overview of the yuima and yuimaGUI R packages

May 14, 2020

The YUIMA Project is an open source academic project aimed at developing a complete environment for estimation and simulation of Stochastic Differential Equations and other Stochastic Processes via the R package called yuima and its Graphical User Interface yuimaGUI.

Quickstart

# install the package
install.packages('yuima')
# load the package
require(yuima)

The YUIMA Object

The main object is the yuima object, which allows one to describe the model...

Data Privacy in the Age of COVID-19

May 14, 2020

Hugo Bowne-Anderson, the host of DataFramed, the DataCamp podcast, recently interviewed Katharine Jarmul, Head of Product at Cape Privacy. Introducing Katharine Jarmul Hugo Bowne Anderson: Hey Katharine. Katharine Jarmul: Hi Hugo. Hugo Bowne Anderson: How are you? Katharine Jarmul: Good. How are you? Hugo Bowne Anderson: Pretty good. So I'm going to read your bio as Ryan read mine so everyone knows who you are....

Using {drake} for Machine Learning

May 14, 2020

A few weeks ago, Miles McBain took us on a tour through his project organisation in this blogpost. Not surprisingly, given Miles’ frequent shoutouts about the package, it is completely centered around drake. About a year ago on Twitter, he convinced me to take this package for a spin. I was immediately sold. It cured a number of pains...

RStudio Team Admin Training – Remotely

May 14, 2020

We train you in RStudio Server Pro, RStudio Connect and RStudio Package Manager – remotely. RStudio Team is a bundle of professional software for statistical data analysis, package management and data product exchange. As a certified RStudio partner, we will train you in the proper use of RStudio Team – in German. We will help you get started with

Use existing functions and data through packages

May 14, 2020

Packages give you access to a huge set of functions and datasets, most of which are provided by the generous R community. They are the secret sauce which makes it possible to use R for pretty much anything you can imagine. Additionally, lots of packag...

Checking your R package on Solaris

May 13, 2020

TL;DR To check your package on Solaris, call rhub::check() as usual and choose one of our Solaris builders. Bookmark this page, in case you get an email from CRAN about your package failing on Solaris. Oracle Solaris Oracle Solaris is a non-free Unix operating system. CRAN regularly tests R packages on Solaris 10. See this talk by Uwe Ligges if you want to know more about the motivation for...

Intro to {polite} Web Scraping of Soccer Data with R!

May 13, 2020

Fans of soccer/football have been left bereft of their prime form of entertainment these past few months and I’ve seen a huge uptick in the amount of casual fans and bloggers turning to learning programming languages such as R o...

Create and deploy a Custom Vision predictive service in R with AzureVision

May 13, 2020

The AzureVision package is an R frontend to Azure Computer Vision and Azure Custom Vision. These services let you leverage Microsoft’s Azure cloud to carry out visual recognition tasks using advanced image processing models, with minimal machine learning expertise. The basic idea behind Custom Vision is to take a pre-built image recognition model supplied by Azure, and customise it...

Summer school registration opened and bootnet version 1.4 on CRAN

May 13, 2020

Psychological Networks Amsterdam Summer School – The online edition From June 29 to July 3, we will host an online edition of the Psychological Networks Amsterdam Summer School! During this week, we will make a video lecture series available together with exercises and solutions, and will be available for assistance throughout the week on a

Book Review: Introductory Time Series with R

May 12, 2020

I'm a big fan of R and time series analysis, so I was excited to read the book "Introductory Time Series with R". I've been using the book for about 9 years, so I thought it was about time for a review! In this review, I'm going to cover the following t...

Multinomial classification with tidymodels and #TidyTuesday volcano eruptions

May 12, 2020

Lately I’ve been publishing screencasts demonstrating how to use the tidymodels framework, from first steps in modeling to how to evaluate complex models. Today’s screencast demonstrates how to implement multiclass or multinomial classification with this week’s #TidyTuesday dataset on volcanoes. 🌋 Here is the code I used in the video, for those who prefer reading instead of...

Calculating ratios with Tidyverse

May 12, 2020

Calculating percentages is a fairly common operation, right? However, doing it without leaving the pipeflow always forces me to do some bizarre piping, such as double grouping and summarise. I am again using the nuclear accidents dataset, and trying to calculate the percentage of accidents that happened in Europe each year. nuclear_accidents %>% mutate(Year = Date %>% mdy() %>% year(), In_Europe...
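For readers who want to see the general idea in runnable form, here is a minimal sketch of computing a per-year percentage inside a single pipe. It is not the author's code: the accidents tibble is invented toy data, and only the column names Year and In_Europe are borrowed from the teaser above.

```r
library(dplyr)

# Hypothetical toy data standing in for the nuclear accidents dataset
accidents <- tibble(
  Year      = c(2000, 2000, 2001, 2001, 2001),
  In_Europe = c(TRUE, FALSE, TRUE, TRUE, FALSE)
)

# The mean of a logical vector is the share of TRUE values,
# so one group_by() + summarise() yields the percentage per year
pct_by_year <- accidents %>%
  group_by(Year) %>%
  summarise(pct_in_europe = 100 * mean(In_Europe))
```

Taking the mean of a logical column avoids the double grouping mentioned above, at the cost of only working when the category of interest can be expressed as a single TRUE/FALSE flag.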

coronavirus v0.2.0 is now on CRAN

May 12, 2020

Version 0.2.0 of the coronavirus R data package was pushed today to CRAN. The coronavirus package provides a tidy format for the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) Coronavirus dataset. Version 0.2.0 catches up with the significant changes in the data that took place since the initial release on February 24, changing the package status...

One-proportion and goodness of fit test (in R and by hand)

May 12, 2020

Introduction: In a previous article, I presented the Chi-square test of independence in R, which is used to test the independence between two categorical variables. In this article, I show how...
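To make the two tests named in the title concrete, here is a small sketch in base R that runs each test and then repeats the calculation by hand. All counts and proportions below are hypothetical numbers chosen for illustration; they are not taken from the article.

```r
# One-proportion test (hypothetical data: 60 successes in 100 trials, H0: p = 0.5)
successes <- 60
n <- 100
p0 <- 0.5

approx_test <- prop.test(successes, n, p = p0, correct = FALSE)  # normal approximation
exact_test  <- binom.test(successes, n, p = p0)                  # exact binomial test

# By hand: z statistic for a proportion; prop.test() without continuity
# correction reports z^2 as its X-squared statistic
p_hat <- successes / n
z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Chi-square goodness of fit test (hypothetical observed counts vs. expected shares)
observed   <- c(30, 50, 20)
expected_p <- c(0.25, 0.50, 0.25)
gof <- chisq.test(observed, p = expected_p)

# By hand: sum over categories of (observed - expected)^2 / expected
expected <- sum(observed) * expected_p
chi_sq_by_hand <- sum((observed - expected)^2 / expected)
```

Comparing z^2 with approx_test$statistic, and chi_sq_by_hand with gof$statistic, is a quick way to check that the manual and built-in calculations agree.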

Filtering with string statements in dplyr

May 12, 2020

A question came up recently at work about how to use a filter statement entered as a complete string variable inside dplyr’s filter() function – for example dplyr::filter(my_data, "var1 == 'a'"). There does not seem to be much out there on this and I was not sure how to do it either but luckily jakeybob had a neat solution...
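One way this can be done is sketched below. It is not necessarily the solution the post goes on to describe: it assumes the rlang package is available and uses rlang::parse_expr() to turn the string into an expression, which is then unquoted with !! inside filter(). The my_data tibble and the filter_string variable are invented for the example.

```r
library(dplyr)
library(rlang)

# Hypothetical toy data
my_data <- tibble(
  var1 = c("a", "b", "a"),
  var2 = 1:3
)

# The filter condition arrives as a complete string
filter_string <- "var1 == 'a'"

# parse_expr() converts the string to an unevaluated expression;
# !! splices that expression into the filter() call
filtered <- filter(my_data, !!parse_expr(filter_string))
```

Because the string is parsed and evaluated as R code, this pattern is best kept to trusted input, such as conditions written by the analyst, rather than text coming from end users.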

Building R 4+ for Windows with OpenBLAS

May 12, 2020

This post outlines the steps needed to build R 4+ for Windows with OpenBLAS. The release of R 4.0 includes significant changes to the Windows build system from prior versions—for the better! Before anything, we all owe Jeroen Ooms significant gratitude for the many hours he spent working on the build system. Thank you, Jeroen!! Read the full article... The...

Collaborative data science: High level guidance for ethical scientific peer reviews

May 12, 2020

Preamble. (Image: Catalan Castellers are collaborating; Wikipedia.) The availability of distributed code-tracking tools and associated collaborative tools makes life much easier when building collaborative scientific tools and products. This is now especially important in data science, as it is applied in many different industries as a de facto standard. Essentially, a computational science field from academia has now become industry-wide practice. Peer-review is...

Say It Ain’t So: using Weezer album cover colours in R

May 12, 2020

I’m a long-term fan of Weezer. Such was the brilliance of their first two albums that I have stuck with them through thick and thin. And dear me, there has been some very thin music. Nonetheless I own every album – thirteen of them. Among them are six albums entitled “Weezer”. These records are colloquially

Will AI become conscious any time soon?

May 12, 2020

We all know the classical Sci-Fi trope of intelligent machines becoming conscious and all the potential ramifications that could follow from there (free will, fighting their human creators, ethical dilemmas and so forth). Now, is this a realistic scenario? As a researcher in the area of AI (see e.g. So, what is AI really?), with …

How to schedule R scripts

May 11, 2020

Running R with taskscheduleR and cronR. In a previous post, we talked about how to run R from the Windows Task Scheduler. This article will talk about two additional approaches to scheduling R scripts: using the taskscheduleR package on Windows and the cronR package on Linux. For scheduling Python code, check out this post.

The USMS ePostal Over the Last 20+ Years

In a previous post we discussed the 2020 USMS ePostal results, and hypothesized that declines in average distances swum by older age groups are caused by higher proportions of weaker swimmers participating in older age groups, rather than age-based declines. We also mentioned how USMS ePostal results are observational data, and so we can’t draw any strict causal...

The Hitchhiker’s Guide to Ggplot2 is for sale at $10.00

May 11, 2020

The Hitchhiker’s Guide to Ggplot2 is for sale at $10.00 (the original price was $29.99) and includes the R Markdown notebooks to ease your study. It will be at the discounted price for 24 hours. You can get it from Leanpub.

AMMI analyses for GE interactions

The CoViD-19 situation in Italy is little by little improving and I feel a bit more optimistic. It’s time for a new post! I will go back to a subject that is rather important for most agronomists, i.e. the selection of crop varieties. All farmers are perfectly aware that crop performances are affected both by the genotype and by the environment....

In defence of the 95% CI

May 11, 2020

TLDR: BayestestR currently uses an 89% threshold by default for Credible Intervals (CI). Should we change that? If so, to what? Join the discussion here. Magical numbers, or conventional thresholds, get bad press in statistics, and there are many of...

To stratify or not to stratify? It might not actually matter

May 11, 2020

Continuing with the theme of exploring small issues that come up in trial design, I recently used simulation to assess the impact of stratifying (or not) in the context of a multi-site Covid-19 trial with a binary outcome. The investigators are concern...
