R Community Explorer – Google Summer of Code Projects

February 21, 2020

By Benaiah Ubah, Claudia Vitolo and Rick Pack. Introduction: Google Summer of Code (GSoC) is an annual 3-month open-source software development (coding) program that provides a platform for mentors and... The post R Community Explorer – Google Summer of Code Projects appeared first on R Consortium.

Illuminating the Illuminated – Part Four: Tempora Mutantur | Changepoint Analysis of the Voynich Manuscript

February 21, 2020

Our past interrogation of the Voynich Manuscript has deconstructed its esoteric symbols into a form more suitable for our ends, subjected its statistical properties to comparison with more mundane texts, and unearthed its hidden internal structures via the esoteric process of topic modelling. In this final post, we

Tidy Discounted Cash Flow Analysis in R (for Company Valuation)

February 20, 2020

The tidy data principles are a cornerstone of financial data management and the data modeling workflow. The foundation for tidy data management is the tidyverse, a collection of R packages that work in harmony, are built for scalability, and are taugh...

rOpenSci’s Leadership in #rstats Culture

At their closing keynote at the 2020 RStudio Conference, Hilary Parker and Roger Peng mentioned that they hatched the idea for their excellent Not So Standard Deviations podcast following their reunion at the 2015 rOpenSci unconf (“runconf15”). That statement went straight to my heart because it pinpointed how I had been feeling throughout the week of RStudio Conference that...

Rebalancing! Really?

February 20, 2020

In our last post, we introduced benchmarking as a way to analyze our hero’s investment results apart from comparing them to alternate weightings or Sharpe ratios. In this case, the benchmark was meant to capture the returns available to a global aggregate of investable risk assets. If you could own almost every stock and bond globally and in the...

A classification approach to predicting air crash survival

February 20, 2020

Introduction: Historically there have been several instances of airplane crashes. This study is an attempt to explore the possible causes of such air crashes, and to determine if air travel is a safe option. Objective: The objectives of this study are ...

Analysing tweets with the rtweet package

This is a brief post on collecting and analysing tweets. I will show how to use the rtweet package to extract Twitter posts about the R community. This ties into last week’s post on rstudio::conf and community values, and is also related to my previous intro post on web scraping. First, we load rtweet and other (tidyverse) packages we will...

DALEX v 1.0 and the Explanatory Model Analysis

February 20, 2020

The DALEX package version 1.0 CRAN release is scheduled for Feb 20. It brings lots of improvements and changes. Below I will briefly summarise how this package helps to develop better and safer predictive models. To see code snippets scroll to the end....

EARL Conference 2020 – Why YOU should Submit an Abstract

February 20, 2020

In 2014 we launched the EARL (Enterprise Application of the R Language) Conference aimed at connecting and inspiring business users... The post EARL Conference 2020 – Why YOU should Submit an Abstract appeared first on Mango Solutions.

Testing for a causal effect (with 2 time series)

February 19, 2020

A few days ago, I came back to a sentence I found (in a French newspaper), where someone was claiming that “… an old variable explains 85% of the change in a new variable. So we can talk about causality” and I tried to explain that it was just stupid: if we consider the regression of the temperature...
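The excerpt is cut off before the author's own argument, so the following is only a minimal sketch of one standard way to see why a high share of "explained" variation between two time series does not establish causality. Everything in it is hypothetical: the two series, their slopes, and the noise levels are made up, and the series are generated independently of each other apart from sharing an upward trend.

```r
# Two hypothetical series that both trend upward but are generated
# independently of each other (no causal link by construction).
set.seed(42)
time_index <- 1:100
x <- 0.5 * time_index + rnorm(100, sd = 3)
y <- 0.3 * time_index + rnorm(100, sd = 3)

# Regressing the levels: the shared trend alone yields a high R^2.
r2_levels <- summary(lm(y ~ x))$r.squared

# Regressing the first differences removes the trend; the apparent
# relationship largely disappears.
r2_diff <- summary(lm(diff(y) ~ diff(x)))$r.squared

print(c(levels = r2_levels, differences = r2_diff))
```

Comparing the two R-squared values shows that the "explained" share in the levels regression comes from the common trend, not from any link between the series. Differencing is only one common check and is not necessarily the test used in the post.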

Aligning 2D NMR Spectra Part 1

In one-dimensional \(^1\)H NMR spectroscopy, particularly biomolecular NMR, it is frequently necessary to align spectra before chemometric or metabolomics analysis. Poor alignment arises largely from pH and ionic strength induced shifts in aqueous samples. There are a number of published alignment algorithms for the one-dimensional case. In this series of posts, I’ll discuss the alignment process for the case of...

2 Months in 2 Minutes – rOpenSci News, February 2020

rOpenSci HQ: On behalf of rOpenSci, thank you to everyone who has contributed their creativity, curiosity, smarts, and time in the last year. Read our Thank You, 2019. Software Peer Review: 3 community-contributed packages passed...

Debugging: Signals and Subprocesses

February 19, 2020

This is a short story about a non-trivial bug in the processx package, and how I fixed it. It is a good showcase of some debugging tools. The bug: processx is an R package to start and manage external processes. It is used by the callr package to run code in another R session. The original bug report by Will Landau has a nice, clean reproducible...

Circular regression trees and forests

February 19, 2020

A flexible framework for probabilistic forecasting of circular data is introduced, using distributional regression trees and random forests based on the von Mises distribution. Citation Lang MN, Schlosser L, Hothorn T, Mayr GJ,...

Dynamic UI Elements in Shiny – Part 2

February 19, 2020

Continuing our effort of applying the principles of reactivity to the UI part of a ShinyApp, this blog introduces two ways of conditionally rendering UI elements in your app. Both presented solutions accomplish the same goal, once from the server part and once from the UI part of your application. The post Dynamic UI Elements in Shiny – Part 2 appeared...

Data science trainings in Berlin & Hamburg

February 19, 2020

R is one of the leading programming languages for data analysis. In April and October 2020 we will bring our popular trainings “Introduction to R” and “Machine Learning with R” to Berlin and Hamburg. Save one of the coveted places and become a data science expert with R! Berlin: Introduction to R, 21.04. – 22.04.2020

The significance of the sector on the salary in Sweden, a comparison between different occupational groups

February 18, 2020

In my last post, I found that the sector has a significant impact on the salary of engineers. Is the significance of the sector unique to engineers or are there similar correlations in other occupational groups? Statistics Sweden uses NUTS (Nomenclature des Unités Territoriales Statistiques), which is the EU’s hierarchical regional division, to specify the regions. The F-value from the ANOVA...

R, Public Health and Politics

February 18, 2020

Last week, Lancet published the paper Improving the prognosis of health care in the USA by Alison P Galvani, Alyssa S Parpia, Eric M Foster, Burton H Singer, Meagan C Fitzpatrick of CIDMA, the Center for Infectious Disease Modeling and Analysis, Yale School of Public Health. The paper, which provides a detailed analysis of the single-payer system introduced by...

ChemoSpec2D Update

I’m pleased to announce that ChemoSpec2D, a package for exploratory data analysis of 2D NMR spectra, has been updated on CRAN and is coming to a mirror near you. Barring user reports to the contrary, I feel like the package has pretty much stabil...

Spatial predictions with GAMs and rasters

February 18, 2020

One powerful use of GAMs is for interpolating to unsampled locations. We can combine GAMs with the raster package to conveniently predict from a GAM at places where we have no data. Simulate some spatial data: We...

Working with Statistics Canada Data in R, Part 4: Canadian Census Data – cancensus Package Setup

February 18, 2020

Back to Working with Statistics Canada Data in R, Part 3. Contents: What is cancensus; Setting up cancensus: Adding API Key and Cache Path to .Rprofile; Editing .Rprofile on Linux; Editing .Rprofile on Windows. What is cancensus: In the Introduction to the “Working with Statistics Canada Data in R” series, I discussed the three main types... The post Working with...

SMC on the 2019-2020 nCoV outbreak

February 18, 2020

Two weeks ago, Kucharski et al., from the CMMID nCoV working group at the London School of Hygiene and Tropical Medicine, published on medRxiv a statistical analysis via a stochastic SEIR model of the evolution of the 2019-2020 nCoV epidemic, with prediction of a peak outbreak by late February in Wuhan and a past outbreak

rWind is working again!

February 18, 2020

Yep, after several months of being down, the rWind R package is working again! I'm sorry, but I've been too busy lately with my PhD dissertation and I don't have as much time as I'd like to improve rWind 😞. The problem was due to the change of the URL to the GFS server, and it was solved thanks to a pull request in...

Dataviz Workshop at RStudio::conf

February 18, 2020

Workshop materials are available here: https://rstd.io/conf20-dataviz
Consider buying the book; it’s good: Data Visualization: A Practical Introduction.
I was delighted to have the opportunity to teach a two-day workshop on Data Visualization using ggplot2 at this year’s rstudio::conf(2020) in January. It was my first time attending the conference and it was a terrific experience. I...

Get Better: R for absolute beginners

February 18, 2020

As part of the series on development of early career researchers in the lab, we spent three sessions over three weeks learning the basics of R. In my book “The Digital Cell”, I advocate R as the main number-crunching software but the R literacy in my lab is actually quite mixed. In order to know

What and who is IT community? What does it take to be part?

February 18, 2020

This blog post is long overdue and has been rattling in my head for a long time. Simply and boldly put, community is everyone involved behind the result of your search for a particular problem. And by that I…

How is information gain calculated?

February 17, 2020

This post will explore the mathematics behind information gain. We’ll start with the base intuition behind information gain, but then explain why it has the calculation that it does. What is information gain? Information gain is a measure frequently used in decision trees to determine which variable to split the input dataset on at each The post How is...
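As a minimal sketch of the calculation this excerpt refers to, the snippet below uses the usual textbook definition: information gain is the Shannon entropy of the labels minus the size-weighted entropy of the groups produced by a split. The function names `entropy` and `information_gain` and the toy `play`/`outlook` data are hypothetical and are not taken from the post.

```r
# Shannon entropy (in bits) of a vector of class labels.
entropy <- function(labels) {
  p <- table(labels) / length(labels)
  -sum(p * log2(p))
}

# Information gain of splitting `labels` by the values of `feature`:
# parent entropy minus the size-weighted entropy of each child group.
information_gain <- function(labels, feature) {
  groups  <- split(labels, feature)
  weights <- lengths(groups) / length(labels)
  child   <- vapply(groups, entropy, numeric(1))
  entropy(labels) - sum(weights * child)
}

# Hypothetical toy data: 8 observations, one candidate split variable.
play    <- c("yes", "yes", "no", "no", "yes", "no", "yes", "yes")
outlook <- c("sun", "sun", "rain", "rain", "sun", "rain", "rain", "sun")

ig <- information_gain(play, outlook)
print(ig)  # about 0.549 bits
```

Here the "sun" group is pure (entropy 0) and the "rain" group has entropy of about 0.811 bits, so the split removes a bit more than half of the parent entropy of about 0.954 bits. A decision tree would compare this value across candidate variables and split on the largest.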

Lasso Regression (home made)

February 17, 2020

To compute Lasso regression, define the soft-thresholding function. The R function would be soft_thresholding = function(x,a){ sign(x) * pmax(abs(x)-a,0) }. To solve our optimization problem, set so that the optimization problem can be written, equivalently hence and one gets or, if we develop Again, if there are weights , the coordinate-wise update becomes The code to compute this componentwise descent...
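The equations in this excerpt did not survive, so the following is only a sketch of how the soft-thresholding function above is typically used in coordinate descent, not a reconstruction of the post's code. It assumes the unweighted objective (1/(2n)) * sum((y - X beta)^2) + lambda * sum(|beta|), which may be scaled differently from the post's. The function `lasso_cd`, its arguments, and the simulated data are hypothetical.

```r
# Soft-thresholding function, as defined in the excerpt.
soft_thresholding <- function(x, a) {
  sign(x) * pmax(abs(x) - a, 0)
}

# Coordinate descent for the assumed objective
#   (1 / (2n)) * sum((y - X %*% beta)^2) + lambda * sum(abs(beta))
lasso_cd <- function(X, y, lambda, n_iter = 200) {
  n <- nrow(X)
  beta <- rep(0, ncol(X))
  for (iter in seq_len(n_iter)) {
    for (j in seq_len(ncol(X))) {
      # Partial residual: remove the fit of every coordinate except j.
      r_j <- y - X[, -j, drop = FALSE] %*% beta[-j]
      rho <- sum(X[, j] * r_j) / n
      # One-dimensional lasso solution for coordinate j.
      beta[j] <- soft_thresholding(rho, lambda) / (sum(X[, j]^2) / n)
    }
  }
  beta
}

# Hypothetical simulated data: the second true coefficient is zero.
set.seed(1)
X <- scale(matrix(rnorm(200 * 3), ncol = 3))
y <- as.numeric(X %*% c(2, 0, -1) + rnorm(200, sd = 0.1))
y <- y - mean(y)

beta_hat <- lasso_cd(X, y, lambda = 0.1)
print(round(beta_hat, 3))
```

Each inner step solves the one-dimensional problem for a single coefficient with the others held fixed, which is where soft-thresholding appears. A fixed number of sweeps is used for brevity; a convergence check on the change in `beta` would be the usual refinement.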

Hyperparameter tuning and #TidyTuesday food consumption

February 17, 2020

Last week I published a screencast demonstrating how to use the tidymodels framework and specifically the recipes package. Today, I’m using this week’s #TidyTuesday dataset on food consumption around the world to show hyperparameter tuning! Here is the code I used in the video, for those who prefer reading instead of or in addition to video. Explore the...
