#TidyTuesday and tidymodels

February 4, 2020

This week I started my new job as a software engineer at RStudio, working with Max Kuhn and other folks on tidymodels. I am really excited about tidymodels because my own experience as a practicing data scientist has shown me some of the areas for growth that still exist in open source software when it comes to modeling and...

Read more »

Some 2020 R Conferences

February 4, 2020

rstudio::conf kicked off the 2020 season for R conferences last week with record attendance somewhere north of twenty-one hundred. Session topics ranged from business to science, marketing to medicine and attracted R users with very varied backgrounds including DevOps professionals, data scientists, journalists, physicians, statisticians, R package developers, Shiny developers and more. Although it is true that the San...

Read more »

The Fun of Building Things and the Challenge of Learning – the rOpenSci OzUnconf 2019

It was the best of times, it was the worst of times. Dickens might have meant it figuratively, but in the case of the rOpenSci OzUnconf 2019, we mean it literally. Held from 11-13 December against the backdrop of a national emergency that is still ongoing, the event drew participants from across Australia as well as New Zealand, Japan, India and...

Read more »

wrapr Update: Removing Some Under-Used Functions and Classes

February 4, 2020

For the next version of the R package wrapr we are going to be removing a number of under-used functions/methods and classes. This update will likely happen in March 2020, and is the start of the wrapr 2.* series. Most of the items being removed are different abstractions for helping with function composition. We ended …

Read more »

Consensus clustering in R

February 4, 2020

The logic behind the Monti consensus clustering algorithm is that, in the face of resampling, the ideal clusters should be stable; thus any pair of samples should either always or never cluster together. We can use this principle to infer the optimal number of clusters (K). This works by examining cluster stability from K=2 to

Read more »

RStudio::conf 2020 San Francisco Recap

February 4, 2020

RStudio::conf 2020 is a wrap! What a tremendous experience. It was quite a production to send four Appsilon team members to San Francisco, California from Warsaw with nearly 100 kg of swag, but it was absolutely worthwhile. We were a proud sponsor of the event, and having a booth set up in the main lobby...

Read more »

Epidemiology: How contagious is Novel Coronavirus (2019-nCoV)?

February 4, 2020

A new invisible enemy, only 30kb in size, has emerged and is on a killing spree around the world: 2019-nCoV, the Novel Coronavirus! It has already killed more people than the SARS pandemic and its outbreak has been declared a Public Health Emergency of International Concern (PHEIC) by the World Health Organization (WHO). If you …

Read more »

Calculate all the CVs of all the QC Levels of all the Methods of all the Instruments at all the Sites all at once … with Sunquest LIS and dplyr

February 4, 2020

Background: As part of our lab accreditation requirements, we have to provide measurement uncertainty estimates for all tests at all hospital sites. As you might imagine, with thousands of test codes in Sunquest LIS, getting all the coefficients of variation (CVs) represents a daunting task for the quality technologist to accomplish. As it turns out, by …

Read more »

LP and Tidy Data Principles (Part 1)

February 3, 2020

Motivation: The Life Changing Magic of Tidying Text is one of those posts I keep re-reading from time to time, and I wanted to try the analysis with songs. I shall use the lp package, a small data package I had for experimental purposes. Note: If some images appear too small on your screen you can open them in a new tab to...

Read more »

Analysing an open cohort stepped-wedge clustered trial with repeated individual binary outcomes

February 3, 2020

I am currently wrestling with how to analyze data from a stepped-wedge designed cluster randomized trial. A few factors make this analysis particularly interesting. First, we want to allow for the possibility that between-period site-level correlation will decrease (or decay) over time. Second, there is possibly additional clustering at the patient level since individual outcomes will be measured repeatedly...

Read more »

Some lessons from rstudio::conf

February 3, 2020

Today I’m departing a little from the problem/context/solution format of these posts to share some things I learned from last week’s rstudio::conf. When I started in R a few years ago, I never thought I would have any place at a coding conference for computer people. But thanks to some help from my lab and my …

Read more »

Grid point occurrence records onto a raster

February 3, 2020

The ‘gridRecords’ function, which has just been added to the ‘fuzzySim’ package (from version 2.6 on), takes a raster stack and a set of spatial coordinates of a species’ presence (and optionally absence) records, and returns a data frame with …

Read more »

Conference feelings: from newbie to sponsor

February 2, 2020

In the summer of 2008, nearly 12 years ago, I attended my first R/Bioconductor conference: BioC2008. Just last week I went to my second rstudio::conf(2020), which I greatly enjoyed. After some tweet exchanges today, I started reflecting on my journey and wanted to share my thoughts. Why I like going to conferences: I typically enjoy going to conferences, though I also...

Read more »

Working with audio in R using av

The latest version of the rOpenSci av package includes some useful new tools for working with audio data. We have added functions for reading, cutting, converting, transforming, and plotting audio data in any popular audio / video format (mp3, mkv, aac, etc). The functionality can either be used by itself, or to prepare audio data for further analysis in R...

Read more »

The palindrome of 02.02.2020

February 2, 2020

As of writing this blog-post, today is February 2nd, 2020. Or as I would say it, 2nd of February, 2020. There is nothing magical about it, it is just a sequence of numbers. On a boring Sunday evening, what could…

Read more »

R Tip: Check What Repos You are Using

February 2, 2020

In a lot of our R writing we casually say “install from CRAN using install.packages('PKGNAME')” or “update your packages by using update.packages(ask = FALSE, checkBuilt = TRUE) (and answering ‘no’ to all questions about compiling).” We recently became aware that for some users this isn’t complete advice. The above depends on your R install pointing …

Read more »

rstudio::conf 2020 Slides on Futures

February 1, 2020

Design: Dan LaBar

Read more »

Primitive Functions List

February 1, 2020

Ever wondered which R functions are actually passed to internal C code? Well, wonder no more as it turns out there is an unexported named list within the methods package providing instructions for turning builtin and special functions into generic functions. Wrapping this list with names() gives us the list of all R functions which wrap calls to .Primitive(). names(methods:::.BasicFunsList) #...

Read more »

Get and Set List Elements with magrittr

February 1, 2020

Introduction: Did you know that the magrittr pipe, %>%, can be used for more than just data.frames and tibbles? In this blog post, we look at how we can create get and set functions for list elements. Getting List Elements: First, let’s create a simple list. z1 [...] %>%. How can we do that? Well we can pipe our list into a . which...

Read more »

A guide to encoding categorical features using R

February 1, 2020

In this article, we will look at various options for encoding categorical features. We will also present R code for each of the encoding techniques. Categorical feature encoding is an important data processing step required for using these features in many statistical modelling and machine learning algorithms. The material in the article is heavily borrowed from the post Smarter Ways...

Read more »

Monsters

February 1, 2020

Ooh, see the fire is sweepin’ / Our very street today / Burns like a red coal carpet / Mad bull lost its way (Gimme Shelter, The Rolling Stones). After following this easy tutorial, you will be able to create tiled images from a photograph. You may want to use your own portrait or some other as I did. I use geom_tile: …

Read more »

The significance of the region on the salary in Sweden, a comparison between different occupational groups

In my last post, I found that the region has a significant impact on the salary of engineers. Is the significance of the region unique to engineers or are there similar correlations in other occupational groups? Statistics Sweden use NUTS (Nomenclature des Unités Territoriales Statistiques), which is the EU’s hierarchical regional division, to specify the regions. The F-value from the Anova...

Read more »

Comparing Ensembl GTF and cDNA

January 31, 2020

It seems that most people think Ensembl’s GTF file and cDNA fasta file mean the same transcripts: Watch out! @ensembl's Fasta and GTF annotation files available via https://t.co/2AhCSnL7py do not match (there are transcripts in the GTF not found in the Fasta file. Anyone else expected...

Read more »

50+ Free DataSets for DataScience Projects

January 31, 2020

Hello All, This is just a short note to specify that the list of FREE datasets is updated for 2020. There are 50+ sites and links to the newly released Google Dataset search engine. So, have fun exploring these data repositories to master programming, create stunning visualizations and build your own unique project portfolios. Some...

Read more »

Lewis Carroll’s proposed rules for tennis tournaments by @ellis2013nz

January 31, 2020

Last week I wrote about the impact of seeding the draw in a tennis tournament. Seeding is one way to increase the chance of the top players making it to the final rounds of a single elimination tournament, leading to fairer outcomes and to a higher cha...

Read more »

rco: Make Your R Code Run Faster Today!

January 31, 2020

The rco package can optimize R code in a variety of different ways. The package implements common subexpression elimination, constant folding, constant propagation, and dead code elimination, among other very relevant code optimization strategies. Currently, rco can be downloaded as a GitHub package. The rco package functions as an RStudio Addin, be used through a shiny GUI …

Read more »

15+ Resources to Get Started with R

January 31, 2020

R is the second most sought-after language in data science behind Python, so gaining mastery of R is a prerequisite to a thriving career in the field. Whether you’re an experienced developer or a newbie considering a career move, here are some excellent resources so you can get started with R.

Read more »

Beginners guide to Bubble Map with Shiny

January 31, 2020

Map Bubble: A map bubble is a type of map chart where the bubble or circle position indicates geographical location and bubble size is used to show differences in magnitude of quantitative variables like population. We will be using the Highcharter package to show earthquake magnitude and depth. Highcharter is a versatile charting library to build interactive charts, …

Read more »

nnetsauce for R

January 30, 2020

Read more »
