colorspace @ useR! 2019

September 1, 2019

Conference presentation about the colorspace toolbox for manipulating and assessing color palettes at useR! 2019 in Toulouse: Slides, video, replication materials, and working paper. Abstract (Authors: Achim Zeileis, Jason C. F...

RSwitch 1.4.1 Released

September 1, 2019

A minor update to RSwitch has been released. Apart from some internal code reorganization, there are three user-facing changes. First, RSwitch is now notarized! That means you won’t get a notice about it being from an “unidentified developer”, nor will folks on Catalina see a warning about being unable to check the download for malware. You...

Estimating variance: should I use n or n – 1? The answer is not what you think

Estimates of population parameters based on samples are not exact: there is always some error involved. In principle, one can estimate a population parameter with any estimator, but some will be better than others. There is one particular case which was always very confusing to me (because of the multiple alternatives) and that is the estimation of the variance...
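
The n versus n – 1 question in the teaser above can be checked with a quick simulation. The following is only a minimal sketch and is not taken from the post: the normal population, the sample size, the seed, and all variable names are assumptions chosen for illustration. It shows the textbook result that dividing the sum of squared deviations by n underestimates the variance on average, while dividing by n – 1 (which is what R's var() does) does not. It says nothing about which estimator the post ultimately recommends.

```r
# Simulation sketch: compare dividing by n versus n - 1 (illustrative values only)
set.seed(1)      # arbitrary seed, used only for reproducibility
n <- 5           # small sample size, where the difference is easy to see
true_var <- 4    # variance of the assumed population N(0, sd = 2)

# For many samples, compute the sum of squared deviations from the sample mean
ss <- replicate(20000, {
  x <- rnorm(n, mean = 0, sd = sqrt(true_var))
  sum((x - mean(x))^2)
})

mean_div_n   <- mean(ss / n)        # biased: expectation is (n - 1) / n * true_var
mean_div_nm1 <- mean(ss / (n - 1))  # unbiased: same divisor as var()

print(c(divide_by_n = mean_div_n, divide_by_n_minus_1 = mean_div_nm1))
```

With these settings the first average should land near (n – 1) / n × 4 = 3.2 and the second near 4. Unbiasedness is only one criterion; an estimator can be biased and still have a smaller mean squared error.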

Use ExPanD to Create a Notebook for Your EDA

The ‘ExPanDaR’ package offers a toolbox for interactive exploratory data analysis (EDA). You can read more about it here. The ‘ExPanD’ shiny app allows you to customize your analysis to some extent but often you might want to continue and extend your analysis with additional models and visualizations that are not part of the ‘ExPanDaR’ package. Thus, I am currently...

Using Spark from R for performance with arbitrary code – Part 1 – Spark SQL translation, custom functions, and Arrow

August 31, 2019

Introduction Apache Spark is a popular open-source analytics engine for big data processing and thanks to the sparklyr and SparkR packages, the power of Spark is also available to R users. This series of articles will attempt to provide practical insights into using the sparklyr interface to gain the benefits of Apache Spark while still retaining the ability to use R...

‘There is a game I play’ – Analyzing Metacritic scores for video games

August 30, 2019

There is a game I play / try to make myself okay / try so hard to make the pieces all fit / smash it apart / just for the f**k of it (Nine Inch Nails: The Big Come Down) After this rather distressing opening by the Nine Inch Nails, let’s turn to a more uplifting topic: video games! There...

Explaining Predictions: Random Forest Post-hoc Analysis (randomForestExplainer package)

August 30, 2019

Recap: This is a continuation of the explanation of machine learning model predictions, specifically for random forest models. We can depend on the random forest package itself to explain predictions based on impurity importance or permutation importance. Today, we will explore external packages which aid in explaining random forest predictions. External packages: There are a few external packages which offer to calculate variable...

Lesser known dplyr functions

August 30, 2019

The dplyr package is an essential tool for manipulating data in R. The “Introduction to dplyr” vignette gives a good overview of the common dplyr functions (list taken from the vignette itself): filter() to select cases based on their values. arrange() to …

Seeking postdoc (or contractor) for next generation Stan language research and development

August 30, 2019

The Stan group at Columbia is looking to hire a postdoc* to work on the next generation compiler for the Stan open-source probabilistic programming language. Ideally, a candidate will bring language development experience and also have research interests in a related field such as programming languages, applied statistics, numerical analysis, or statistical computation. The language

Why R?

August 30, 2019

I was working with our copy editor on Appendix A of Practical Data Science with R, 2nd Edition; Zumel, Mount; Manning 2019, and ran into this little point (unfortunately) buried in the back of the book. In our opinion the R ecosystem is the fastest path to substantial data science, statistical, and machine learning accomplishment. …

Bigram Analysis of Democratic Debates

August 30, 2019

This tutorial will mainly focus on ggplot and bigrams, but it does gloss over clustering for a heatmap. This project started a while back, tweeting…

It is Time for CRAN to Ban Package Ads

August 30, 2019

NPM (a popular JavaScript package repository) just banned package advertisements. I feel the CRAN repository should do the same. Not all R users are fully aware of package advertisements. But they clutter up work, interfere with reproducibility, and frankly are just wrong. Here is an example which could be considered to contain advertisements: .onAttach() from ggplot2 …

Break up with Excel: Intro and Advanced R Data Science Courses at MSACL.org Salzburg Austria, September 21–23, 2019

August 30, 2019

MSACL Conference: There are two RStats Data Science courses happening in Salzburg, Austria on September 22–24, 2019 at the 6th annual MSACL Clinical Mass Spectrometry Conference. These courses are held twice annually, once in Europe and once in Palm Springs. Introductory Course: The introductory course will be taught by Dan Holmes, MD of the University …

Securing Shiny apps with AWS Cognito authentication

Background Shiny apps are a great way to share information and empower your users. Sometimes you want to make sure that only authenticated and authorized users will be able to view your shiny apps. There are a number of ways to make sure only certain users have access to your apps. For example, you can subscribe to the professional plan in...

What’s happening at EARL Conference 2019?

August 30, 2019

The Enterprise Applications of the R Language Conference (EARL) is the place to be for anyone using R in their organisation. You’ll be joined by R users from all over the data world, presenting their real-world projects and use cases, and ideas and solutions. The conference is run by Mango Solutions, as part of our commitment to the data...

rstudio::conf(2020) Diversity and international scholarships

August 29, 2019

rstudio::conf(2020L) continues our tradition of diversity scholarships, and this year we’re increasing the program size to 44 recipients. As a result of thinking about our goals, this year we have two components to the program: 38 domestic...

Time series graphics using feasts

August 29, 2019

This is the second post on the new tidyverts packages for tidy time series analysis. The previous post is here. For users migrating from the forecast package, it might be useful to see how to get similar graphics to those they are used to. The forecast package is built for ts objects, while the feasts package provides features, statistics and...

Ecosystems chapter of “evidence-based software engineering” reworked

August 29, 2019

The Ecosystems chapter of my evidence-based software engineering book has been reworked (I have given up on the idea that this second pass is also where the polishing happens; polishing still needs to happen, and there might be more material migration between chapters); download here. I have been reading books on biological ecosystems, and a

anytime 0.3.6

August 29, 2019

A fresh and very exciting release of the anytime package is arriving on CRAN right now. This is the seventeenth release, and it comes pretty much exactly one month after the preceding 0.3.5 release. anytime is a very focused package aiming to do just...

How to Build Analytics Platforms – Part 3: Customizable Workflow and Dashboards

August 29, 2019

What does a modern analytics platform need to offer companies real added value? The more information a platform represents and the more detailed the individual analytics projects are, the stricter the way in which projects must be planned and implemented often is. But what would happen if the individual project steps were based on the

Love affairs and linear differential equations

August 29, 2019

Differential equations are a powerful tool for modeling how systems change over time, but they can be a little hard to get into. Love, on the other hand, is humanity’s perennial topic; some even claim it is all you need. In this blog post — inspired by Strogatz (1988, 2015) — I will introduce linear differential equations as a...

How to Make Your CSS Systematically Awesome with SASS

August 29, 2019

tl;dr: SASS is CSS for programmers. It gives you the building blocks that you’re used to, such as variables, conditions, and loops. And it helps you organize. The bigger the project, the bigger the advantages offered by SASS. It’s a way of managing CSS styles even if you’re not very good at it.

R-Related Talks Coming to ODSC West 2019 (and a 30% discount)

August 28, 2019

Register for the ODSC West 2019 conference with a 30% discount (use the code: ODSCRBloggers). R is one of the most commonly used languages within data science, and its applications are always expanding. From the traditional use of data or predictive analysis, all the way to machine or deep learning, the uses …

How to create multiple variables with a single line of code in R

August 28, 2019

When I have a dataset with many variables and want to create a new variable for each of them, the first thing that comes to mind is to write a new line of code for each transformation (e.g., new...

Ryacas version 1.0.0 released!

It is with great pleasure that I can announce that Ryacas version 1.0.0 is now released to CRAN (https://cran.r-project.org/package=Ryacas). I wish to thank all co-authors: Rob Goedman, Gabor Grothendieck, Søren Højsgaard, Grzegorz Mazur, Ayal Pinkus. It means that you can install the package by (possibly after binaries have been built): install.packages("Ryacas") Followed by: library(Ryacas) (The source code is available at https://github.com/mikldk/ryacas/.) Now you have the yacas computer algebra...

Tidy time series data using tsibbles

August 28, 2019

There is a new suite of packages for tidy time series analysis, that integrates easily into the tidyverse way of working. We call these the tidyverts packages, and they are available at tidyverts.org. Much of the work on these packages has been done by Earo Wang and Mitchell O’Hara-Wild. The first of the packages to make it to CRAN was...

July 2019 “Top 40” R Packages

August 28, 2019

One hundred seventy-six new packages made it to CRAN in July. Here are my “Top 40” picks organized into twelve categories: Data, Data Science, Finance, Genomics, Machine Learning, Mathematics, Medicine, Statistics, Time Series, Topological Data Analysis, Utilities and Visualization. Data eia v0.3.2: Provides API access to data from the US Energy Information Administration (EIA). Use of the API requires a free...

Errors and Debugging in RStudio

August 28, 2019

Diagnosing and fixing errors in your code can be time-consuming and frustrating. There are two ways you can make your life easier. The first is knowing the tools at your disposal in RStudio to debug errors. RStudio provides a variety of tools to help you diagnose the problem at its source and come up with a solution as quick...

PostcodesioR 0.1.1 is on CRAN

August 28, 2019

Introduction The latest stable version of my UK geocoder package has finally made it to CRAN. PostcodesioR is a wrapper for postcodes.io and it provides multiple functions to work with UK geospatial data. This package is based exclusively on open data provided by Ordnance Survey and Office for National Statistics and turned into an API
