SHARPEn your portfolio

February 6, 2020

In our last post, we started building the intuition around constructing a reasonable portfolio to achieve an acceptable return. The hero of our story had built up a small nest egg and then decided to invest it equally across the three major asset classes: stocks, bonds, and real assets. For that we used three liquid ETFs (SPY, SHY, and...

Comparing Ensembl GTF and cDNA

February 6, 2020

It seems that most people think Ensembl’s GTF file and cDNA FASTA file contain the same transcripts: Watch out! @ensembl's Fasta and GTF annotation files available via https://t.co/2AhCSnL7py do not match (there are transcripts in the GTF not found in the Fasta file. Anyone else expected...

“Clearing the Confusion” series

February 6, 2020

In recent weeks, I’ve posted three tutorials with Clearing the Confusion titles, all in my regtools GitHub repo. Topics have been unbalanced classification data; k-fold cross validation; and scaling in PCA. Comments welcome!

Le Monde puzzle [#1130]

February 6, 2020

A two-player game as the current weekly Le Monde mathematical puzzle: Abishag and Caleb take turns filling in a row of N boxes by picking one, then two, then three, etc., consecutive boxes. When a player is unable to find enough consecutive empty boxes, that player has lost. Who wins when N=29? When N=30?
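The rules above can be explored with a small brute-force search. The following is a minimal sketch, assuming that the k-th move overall must fill exactly k consecutive empty boxes and that the player who cannot move loses. The function names `mover_wins` and `first_player_wins` are hypothetical and not taken from the original post.

```r
# TRUE if the player about to move (who needs k consecutive empty boxes)
# can force a win, given the lengths of the remaining runs of empty boxes.
mover_wins <- local({
  memo <- new.env(hash = TRUE)  # cache of already-solved positions
  function(runs, k) {
    # Runs shorter than k are useless from now on, because k only grows.
    runs <- sort(runs[runs >= k])
    if (length(runs) == 0) return(FALSE)  # no legal move: the mover loses
    key <- paste(k, paste(runs, collapse = ","))
    cached <- memo[[key]]
    if (!is.null(cached)) return(cached)
    result <- FALSE
    for (j in which(!duplicated(runs))) {   # try each distinct run length once
      len <- runs[j]
      for (left in 0:((len - k) %/% 2)) {   # mirror-image placements are equivalent
        right <- len - k - left
        # A move is winning if it leaves the opponent in a losing position.
        if (!mover_wins(c(runs[-j], left, right), k + 1)) {
          result <- TRUE
          break
        }
      }
      if (result) break
    }
    memo[[key]] <- result
    result
  }
})

# The first player starts on a single run of N empty boxes and must fill 1 box.
first_player_wins <- function(N) mover_wins(N, 1)
```

Calling `first_player_wins(29)` and `first_player_wins(30)` then answers the two questions under these assumptions. The search stays small because the required block length grows by one on every move, so a game on 30 boxes lasts at most seven moves.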

Prying “.R” Script Files Away from Xcode (et al) on macOS

February 6, 2020

As the maintainer of RSwitch — and developer of my own (for personal use) macOS, iOS, watchOS, iPadOS and tvOS apps — I need the full Apple Xcode install around (more R-focused macOS folk can get away with just the command-line tools being installed). As an Apple Developer who insanely runs the macOS & Xcode...

Function to download biotic interaction datasets

February 6, 2020

I work in ecology, biogeography, etc. Biotic interactions (interactions between species) and their repercussions on species distributions are my main research interest. As such, I had, at some point, to download datasets on species interactions. I wanted to be able to produce a uniform (more or less, not as much as I would like) R…

How to use bootstraplib’s Live Theme Previewer to customize Shiny apps?

February 6, 2020

One of the announcements of RStudio conf 2020 that caught my eye is a brand new package, {bootstraplib} - https://github.com/rstudio/bootstraplib/ . It’s another open-source contribution from RStudio (a PBC). {bootstraplib} basically provides tools for theming shiny and rmarkdown from R via Bootstrap (3 or 4) Sass. If you’re not aware of Bootstrap, it’s one of the most popular (open-source) CSS...

New Data Scientist Stickers

February 5, 2020

We have a new data scientist sticker! If you see Nina or John at a conference/MeetUp, please ask us for a sticker!

The simplest tidy machine learning workflow

February 5, 2020

caret is a magical package for doing machine learning in R. Look at this code for running a regularized regression: library(caret) inTrain % mutate(Sale_Price = log10(Sale_Price)) %>% select(Sale_Price, .pred) %>% rmse(Sale_Price, .pred) and here’s what I think it should look like in pseudocode: library(AmesHousing) # devtools::install_github("tidymodels/tidymodels") library(tidymodels) ames % # Split test/train initial_split(prop = .75) %>% ...

Introduction to the forecastLM package

February 5, 2020

I am pleased to announce a new R package - forecastLM. The package, as the name implies, provides applications for forecasting regular time series data with a linear regression model (based on the lm function from the stats package). It supports both ts and tsibble objects as inputs and enables simple extractions of features from the input object on...

Visualization of the Debt/GDP ratio and national debt level

February 5, 2020

I saw this graph on Twitter a few days ago: Short googling revealed that this is a relatively old graph from October 2017. On one hand, this is a really cool visualization. On the other hand, it also belongs…

Please, somebody create an ETF that buys EU CO2 emission allowances! I’d like the gamble to earn money by fighting climate change.

February 5, 2020

I would really like to buy EU CO2 allowances now and hold them for 5-20 years before selling them again. This transaction is likely to reduce total EU CO2 emissions, and I would even have the chance to make some money out of it. That one can actually reduce emissions (rather than only postpone them) just by holding allowances for...

Shiny: Load testing and horizontal scaling

February 5, 2020

“Money can’t buy you happiness, but it can buy you more EC2 Instances…” With this quote, Sean Lopp, Product Manager at RStudio, PBC, opened his “Scaling Shiny” showcase. In this showcase, he uses a load-testing approach to show how a Shiny application can be scaled for 10,000 users. RStudio’s shiny WebApp framework is an R

#TidyTuesday and tidymodels

February 4, 2020

This week I started my new job as a software engineer at RStudio, working with Max Kuhn and other folks on tidymodels. I am really excited about tidymodels because my own experience as a practicing data scientist has shown me some of the areas for growth that still exist in open source software when it comes to modeling and...

Some 2020 R Conferences

February 4, 2020

rstudio::conf kicked off the 2020 season for R conferences last week with record attendance somewhere north of twenty-one hundred. Session topics ranged from business to science, marketing to medicine and attracted R users with very varied backgrounds including DevOps professionals, data scientists, journalists, physicians, statisticians, R package developers, Shiny developers and more. Although it is true that the San...

The Fun of Building Things and the Challenge of Learning – the rOpenSci OzUnconf 2019

It was the best of times, it was the worst of times. Dickens might have meant it figuratively, but in the case of the rOpenSci OzUnconf 2019, we mean it literally. Held from 11-13 December against the backdrop of a national emergency that is still ongoing, the event drew participants from across Australia as well as New Zealand, Japan, India and...

wrapr Update: Removing Some Under-Used Functions and Classes

February 4, 2020

For the next version of the R package wrapr we are going to be removing a number of under-used functions/methods and classes. This update will likely happen in March 2020, and is the start of the wrapr 2.* series. Most of the items being removed are different abstractions for helping with function composition. We ended…

Consensus clustering in R

February 4, 2020

The logic behind the Monti consensus clustering algorithm is that in the face of resampling the ideal clusters should be stable, thus any pair of samples should either always or never cluster together. We can use this principle to infer the optimal number of clusters (K). This works by examining cluster stability from K=2 to

RStudio::conf 2020 San Francisco Recap

February 4, 2020

RStudio::conf 2020 is a wrap! What a tremendous experience. It was quite a production to send four Appsilon team members to San Francisco, California from Warsaw with nearly 100 kg of swag, but it was absolutely worthwhile. We were a proud sponsor of the event, and having a booth set up in the main lobby...

Epidemiology: How contagious is Novel Coronavirus (2019-nCoV)?

February 4, 2020

A new invisible enemy, only 30kb in size, has emerged and is on a killing spree around the world: 2019-nCoV, the Novel Coronavirus! It has already killed more people than the SARS pandemic and its outbreak has been declared a Public Health Emergency of International Concern (PHEIC) by the World Health Organization (WHO). If you…

Calculate all the CVs of all the QC Levels of all the Methods of all the Instruments at all the Sites all at once … with Sunquest LIS and dplyr

February 4, 2020

Background: As part of our lab accreditation requirements, we have to provide measurement uncertainty estimates for all tests at all hospital sites. As you might imagine, with thousands of testcodes in Sunquest LIS, getting all the coefficients of variation (CVs) represents a daunting task for the quality technologist to accomplish. As it turns out, by…
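For readers unfamiliar with the quantity involved, a coefficient of variation is the standard deviation divided by the mean, usually expressed as a percentage. The following is a minimal sketch of computing it per group. It uses base R's `aggregate()` rather than dplyr so that it runs without extra packages, and the data frame `qc`, its column names, and its values are invented for illustration and do not come from Sunquest LIS or the original post.

```r
# Hypothetical QC results: one row per measurement (made-up values).
qc <- data.frame(
  site   = c("A", "A", "A", "B", "B", "B"),
  level  = c(1, 1, 1, 1, 1, 1),
  result = c(9, 10, 11, 18, 20, 22)
)

# Coefficient of variation in percent: 100 * standard deviation / mean.
cv_percent <- function(x) 100 * sd(x) / mean(x)

# One CV per combination of site and QC level.
cvs <- aggregate(result ~ site + level, data = qc, FUN = cv_percent)
names(cvs)[names(cvs) == "result"] <- "cv"
print(cvs)
```

Adding more grouping columns (instrument, method, and so on) to the formula extends the same idea to every combination at once.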

LP and Tidy Data Principles (Part 1)

February 3, 2020

Motivation: The Life Changing Magic of Tidying Text is one of those posts I keep re-reading from time to time, and I wanted to try the analysis with songs. I shall use the lp package, a small data package I had for experimental purposes. Note: If some images appear too small on your screen you can open them in a new tab to...

Analysing an open cohort stepped-wedge clustered trial with repeated individual binary outcomes

February 3, 2020

I am currently wrestling with how to analyze data from a stepped-wedge designed cluster randomized trial. A few factors make this analysis particularly interesting. First, we want to allow for the possibility that between-period site-level correlation will decrease (or decay) over time. Second, there is possibly additional clustering at the patient level since individual outcomes will be measured repeatedly...

Some lessons from rstudio::conf

February 3, 2020

Today I’m departing a little from the problem/context/solution format of these posts to share some things I learned from last week’s rstudio::conf. When I started in R a few years ago, I never thought I would have any place at a coding conference for computer people. But thanks to some help from my lab and my…

Grid point occurrence records onto a raster

February 3, 2020

The ‘gridRecords’ function, which has just been added to the ‘fuzzySim’ package (from version 2.6 on), takes a raster stack and a set of spatial coordinates of a species’ presence (and optionally absence) records, and returns a data frame with…

Conference feelings: from newbie to sponsor

February 2, 2020

In the summer of 2008, nearly 12 years ago, I attended my first R/Bioconductor conference: BioC2008. Just last week I went to my second rstudio::conf(2020), which I greatly enjoyed. After some tweet exchanges today, I started reflecting on my journey and wanted to share my thoughts. Why I like going to conferences: I typically enjoy going to conferences, though I also...

Working with audio in R using av

The latest version of the rOpenSci av package includes some useful new tools for working with audio data. We have added functions for reading, cutting, converting, transforming, and plotting audio data in any popular audio / video format (mp3, mkv, aac, etc). The functionality can either be used by itself, or to prepare audio data for further analysis in R...

The palindrome of 02.02.2020

February 2, 2020

As of writing this blog-post, today is February 2nd, 2020. Or as I would say it, 2nd of February, 2020. There is nothing magical about it, it is just a sequence of numbers. On a boring Sunday evening, what could…

R Tip: Check What Repos You are Using

February 2, 2020

In a lot of our R writing we casually say “install from CRAN using install.packages('PKGNAME')” or “update your packages by using update.packages(ask = FALSE, checkBuilt = TRUE) (and answering ‘no’ to all questions about compiling).” We recently became aware that for some users this isn’t complete advice. The above depends on your R install pointing…
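As a quick illustration of the check in the title, base R stores the configured repositories in the `repos` option, which `getOption("repos")` returns. The sketch below is one possible way to inspect it, not necessarily the approach in the full post. The helper `uses_cran_placeholder` is a hypothetical name introduced here. It reports whether the CRAN entry is still R's `"@CRAN@"` placeholder, which means no mirror has been chosen yet.

```r
# Show which repositories install.packages() and update.packages() will use.
print(getOption("repos"))

# Hypothetical helper: TRUE if the CRAN entry is still the "@CRAN@" placeholder.
uses_cran_placeholder <- function(repos = getOption("repos")) {
  if (is.null(repos) || !("CRAN" %in% names(repos))) {
    return(FALSE)  # no CRAN entry configured at all
  }
  identical(unname(repos[["CRAN"]]), "@CRAN@")
}

uses_cran_placeholder()
```

If the printed URL is not the repository you expected, `options(repos = c(CRAN = "https://cloud.r-project.org"))` sets it for the current session.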
