Metadata: from PostgreSQL comments to R labels

April 18, 2019

Metadata are an essential part of a robust data science workflow; they record the meaning of each variable: its units, quality, allowed range, how it is collected, when it was recorded, etc. Data without metadata are practically worthless. Here we will show how to transfer the metadata from PostgreSQL to R. In PostgreSQL...

Base Rate Fallacy – or why No One is justified to believe that Jesus rose

April 18, 2019

In this post we are talking about one of the most unintuitive results of statistics: the so-called false positive paradox, which is an example of the so-called...

Applying gradient descent – primer / refresher

April 18, 2019

Every so often a problem arises where it’s appropriate to use gradient descent, and it’s fun (and / or easier)...

Common Uncommon Notations that Confuse New R Coders

April 17, 2019

Here are a few of the more commonly used notations found in R code and documentation that confuse coders of any skill level who are new to R. Be...

A Comparative Review of the JASP Statistical Software

April 17, 2019

JASP is a free and open source statistics package that targets beginners looking to point-and-click their way through analyses. This article is one of a series of reviews which...

ANCOVA example – April 18, 2019

April 17, 2019

I recently had the need to run an ANCOVA, not a task I perform all that often and my first time using R to do so (I’ve done it...

RStudio Package Manager 1.0.8 – System Requirements

April 17, 2019

Installing R packages on Linux systems has always been a risky affair. In RStudio Package Manager 1.0.8, we’re giving administrators and R users the information they need to make installing packages...

When Standards Go Wild – Software Review for a Manuscript

Stefanie Butland, rOpenSci Community Manager. Some things are just irresistible to a community manager – PhD student Hugo Gruson’s recent tweets definitely fall into that category. I was surprised and...

Explore the landscape of R packages for automated data exploration

April 17, 2019

Do you spend a lot of time on data exploration? If yes, then you will like today’s post about AutoEDA written by Mateusz Staniak. If you ever dreamt of...

Bayes vs. the Invaders! Part Three: The Parallax View

April 17, 2019

In the previous post of this series unveiling the relationship between UFO sightings and population, we crossed the threshold of normality underpinning linear models to construct...

A Detailed Guide to Plotting Line Graphs in R using ggplot geom_line

April 16, 2019

When it comes to data visualization, it can be fun to think of all the flashy and exciting ways to display a dataset. But if you're trying to convey...

Setting up RStudio Server on a Cloud for Collaboration and Reproducibility

April 16, 2019

Roland Stevenson is a data scientist and consultant who may be reached on LinkedIn. When setting up R and RStudio Server on a cloud Linux instance, some thought should be...

Vectorizing functions in R is easy

April 16, 2019

Imagine you have a function that only takes one argument, but you would really like to work on a vector of values. A short example on how function Vectorize()...

Two interesting facts about high-dimensional random projections

April 16, 2019

John Cook recently wrote an interesting blog post on random vectors and random projections. In the post, he states two surprising facts of high-dimensional geometry and gives some intuition...

Controlling Data Layout With cdata

April 16, 2019

Here is an example of how easy it is to use cdata to re-layout your data. Tim Morris recently tweeted the following problem (corrected). Please will you take pity on...

Writing a letter to DataCamp

April 15, 2019

Since 2017 I have been an instructor for DataCamp, the VC-backed online data science education platform. What this means is that I am not an employee, but I have...

Customize Your Interactive EDA: Explore the Fuel Economy of the U.S. Car Market

Interactive EDA is nice but customized interactive EDA is even nicer. To celebrate the new CRAN version of my ‘ExPanDaR’ package I prepare a customized variant of ‘ExPanD’ to...

Even with randomization, mediation analysis can still be confounded

April 15, 2019

Randomization is super useful because it usually eliminates the risk that confounding will lead to a biased estimate of a treatment effect. However, this only goes so far. If...

The sinh-arcsinh normal distribution

April 15, 2019

This month’s issue of Significance magazine has a very nice summary article of the sinh-arcsinh normal distribution. (Unfortunately, the article seems to be behind a paywall.) This distribution was...

BayesComp 20 [full program]

April 15, 2019

The full program is now available on the conference webpage of BayesComp 20, next 7-10 Jan 2020. There are eleven invited sessions, including one j-ISBA session, and a further...

Bioconductor S4 classes for high-throughput omics data

April 15, 2019

Motivation: Multi-omics data integration and analysis. What a beast! It is one of the major challenges in the era of...

R Programmers Earn More than Python Programmers

April 14, 2019

At least globally, that is. According to the 2019 Stack Overflow Developer Survey, R users globally reported earning an average of $64k per year, $1k more than the $63k...

New package: GetBCBData

April 14, 2019

The Central Bank of Brazil (BCB) offers access to its SGS system (sistema gerenciador de series temporais) with an official API available here. With time, I find myself using more...

{attempt} 0.3.0 is now on CRAN

April 14, 2019

Last week, a new version of {attempt} was published on CRAN. This version includes some improvements in the current code base, and the addition of new functions. You can get it with...

Describe and understand Bayesian models and posteriors using bayestestR

April 14, 2019

The Bayesian framework is quickly gaining popularity among scientists, leading to the growing popularity of packages to fit Bayesian models, such as rstanarm or brms. However, extracting summary indices...

Tidying Video Game Metadata: A Case Study

April 14, 2019

This article was jointly written by Arvid J. Kingl & Viktor Konakovsky. The Battle for Wesnoth is an open-source, turn-based strategy game. The game...

Understanding Bayesian Inference with a simple example in R!

April 14, 2019

Hi there! Last summer, the Royal Botanical Garden (Madrid, Spain) hosted the first edition of MadPhylo, a workshop about Bayesian Inference in phylogeny using RevBayes. It was a pleasure for...

Piping is Method Chaining

April 14, 2019

What R users now call piping, popularized by Stefan Milton Bache and Hadley Wickham, is inline function application (this is notationally similar to, but distinct from the powerful interprocess...
