## Selecting the max value from each group, a case study: base R

September 14, 2019

Introduction: Let’s say we wish to group some data by a variable, find the row holding the maximum value of another variable within each group, and then extract that entire row. This is a fairly common task and in fact I’ve had to do this exact data exploration technique on several occasions in the last...
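A minimal base R sketch of the task described above, not necessarily the approach the full post takes. The data frame `df` and its columns `group`, `value`, and `label` are made up for illustration.

```r
# Toy data (hypothetical): one row per observation.
df <- data.frame(
  group = c("a", "a", "b", "b", "c"),
  value = c(1, 5, 3, 2, 4),
  label = c("a1", "a2", "b1", "b2", "c1"),
  stringsAsFactors = FALSE
)

# Split the rows by group, keep the row holding each group's maximum,
# then bind the per-group rows back into one data frame.
max_rows <- do.call(
  rbind,
  lapply(split(df, df$group), function(g) g[which.max(g$value), , drop = FALSE])
)
rownames(max_rows) <- NULL  # drop the group names rbind() adds as row names

max_rows
```

Note that `which.max()` returns only the first maximum, so ties within a group yield a single row.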

## Twitter “Account Analysis” in R

September 14, 2019

This past week @propublica linked to a really spiffy resource for getting an overview of a Twitter user’s profile and activity called accountanalysis. It has a beautiful interface that works as well on mobile as it does in a real browser. It also is fully interactive and supports cross-filtering (zoom in on the timeline and...

## RSwitch 1.5.0 Release Now Also Corrals RStudio Server Connections

September 14, 2019

RSwitch is a macOS menubar application that works on macOS 10.14+ and provides handy shortcuts for developing with R on macOS. Version 1.5.0 brings a reorganized menu system and the ability to manage and make connections to RStudio Server instances. Here’s a quick peek at the new setup: All books, links, and other reference resources...

## September 2019 Democratic Debates Added to {ggchicklet}

September 14, 2019

The latest round of the 2020 Democratic debates is over and the data from all the 2019 editions of the debates have been added to {ggchicklet}. The structure of the `debates2019` built-in dataset has changed a bit: `library(ggchicklet)` `library(hrbrthemes)` `library(tidyverse)` `debates2019` `## # A tibble: 641 x 7` `## elapsed timestamp speaker topic debate_date debate_group`...

## ttdo 0.0.3: New package

September 13, 2019

A new package of mine arrived on CRAN yesterday, having been uploaded a few days prior on the weekend. It extends the most excellent (and very minimal / zero depends) unit testing package tinytest by Mark van der Loo with the very clever and well-don...

## Get Funded by the R Consortium – Call for Proposals Open Now!

September 13, 2019

Strengthen the R community with your project: The R Consortium is committed to supporting the R community by funding projects that create important infrastructure and fortify long-term stability for... The post Get Funded by the R Consortium – Call for Proposals Open Now! appeared first on R Consortium.

## Tell Me a Story: How to Generate Textual Explanations for Predictive Models

September 13, 2019

TL;DR: If you are going to explain predictions for a black box model, you should combine statistical charts with natural language descriptions. This combination is more powerful than SHAP/LIME/PDP/Break Down charts alone. This summer, Adam Izdebski implemented this feature for explanations generated in R with the DALEX library. How did he do it? Find out here: …

## Tell Me a Story: How to Generate Textual Explanations for Predictive Models

September 13, 2019

By Adam Izdebski. Amazing things were created during summer internships at MI2DataLab this year. One of them is the generator of natural language descriptions for DALEX explainers developed by Adam Izdebski. Is text better than charts for explanations? P...

## NIMBLE short course at Bayes Comp 2020 conference

September 13, 2019

We’ll be giving a short course on NIMBLE on January 7, 2020 at the Bayes Comp 2020 conference being held January 7-10 in Gainesville, Florida, USA. Bayes Comp is a popular biennial ISBA-sponsored conference focused on computational methods/algorithms/technologies for Bayesian inference. The short course focuses on programming algorithms in NIMBLE and is titled: “Developing, modifying,

## Initializing an empty list

September 13, 2019

Problem: How do I initialize an empty list for use in a for-loop or function? Context: Sometimes I’m writing a for-loop (I know, I know, don’t use for-loops, but sometimes it’s just easier. I’m a little less good at apply functions than I’d like to be) and I know I’ll need to store the output …
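Two common base R ways to set up a list before a loop, sketched here as an illustration rather than as the answer the full post gives. The names `n`, `results`, and `grown` are hypothetical.

```r
n <- 5

# Pre-allocate a list of length n; every slot starts out as NULL.
results <- vector("list", length = n)

for (i in seq_len(n)) {
  results[[i]] <- i^2  # fill slot i in place
}

# When the final length is not known up front, start from an empty
# list and append one element per iteration.
grown <- list()
for (i in seq_len(3)) {
  grown[[length(grown) + 1]] <- letters[i]
}
```

Pre-allocating is usually preferable when the length is known, since it avoids growing the list on every iteration.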

## Curiosity + Data + Customer Segmentation = Goodies

September 13, 2019

TL;DR: I used a Kaggle database to show you how to separate your customers into distinct groups based on their purchase behavior. With this method, store managers can customize interactions with existing and potential customers to increase loyalty and, eventually, all of the goodies that come with consistent purchases. For the R enthusiasts out...

## #FunDataFriday – The magick package in R

September 13, 2019

The magick package allows us to super easily work with images and animations in R!

## Reproducing the kidney cancer example from BDA

September 13, 2019

This is an attempt at reproducing the analysis of Section 2.7 of Bayesian Data Analysis, 3rd edition (Gelman et al.), on kidney cancer rates in the USA in the 1980s. I have done my best to clean the data from the original. Andrew wrote a blog post to “disillusion about the reproducibility of textbook

## Getting on the meet-up bandwagon – our first meet up event

September 12, 2019

My company, Draper and Dash, has tasked me with organising a wider meet-up event for anyone who is interested in AI / ML in healthcare. This wider working group consists of people from different sectors who are all interested in how AI / ML methods can be applied in their organisations. Why did we choose...

## Regex Problem? Here’s an R package that will write Regex for you

September 12, 2019

REGEX is that thing that scares everyone almost all the time. Hence, finding an alternative is always very helpful. Here’s a nice R package that helps us do REGEX without knowing REGEX. REGEX: This is the REGEX pattern to test the validity of a URL: ^(http)(s)?(\:\/\/)(www\.)?(*)\$ A typical regular expression contains characters ( http ) and meta characters (). The...
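To make the idea of matching a URL with a regular expression concrete, here is a small base R sketch. The pattern is a deliberately simplified assumption, not the pattern from the excerpt (which appears truncated) and not the output of the package the post describes. The helper name `is_url` is hypothetical.

```r
# Simplified, illustrative pattern: "http" or "https", then "://",
# an optional "www.", and at least one non-whitespace character.
url_pattern <- "^https?://(www\\.)?[^[:space:]]+$"

# Hypothetical helper: TRUE where a string looks like an http(s) URL.
is_url <- function(x) grepl(url_pattern, x)

is_url(c("https://www.r-project.org", "ftp://example.com", "not a url"))
```

Here `http` is matched as literal characters, while `^`, `?`, `+`, and `$` are meta characters that control position and repetition.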

## Fitting ‘complex’ mixed models with ‘nlme’: Example #4

Testing for interactions in nonlinear regression: Factorial experiments are very common in agriculture and they are usually laid down to test for the significance of interactions between experimental factors. For example, genotype assessments may be performed at two different nitrogen fertilisation levels (e.g. high and low) to understand whether the ranking of genotypes depends on nutrient availability. For those of you...

## Fitting ‘complex’ mixed models with ‘nlme’: Example #3

Accounting for the experimental design in regression analyses: In this post, I am not going to talk about really complex models. However, I am going to talk about models that are often overlooked by agronomists and biologists, while they may be necessary in several circumstances, especially with field experiments. The point is that field experiments are very often laid down in...

## Athena and R … there is another way!?

September 12, 2019

Intro: Currently there are two key ways of connecting to Amazon Athena from R: the ODBC and JDBC drivers. To access the ODBC driver, R users can use the excellent odbc package supported by RStudio. To access the JDBC driver, R users can either use the RJDBC R package or the helpful wrapper package AWR.Athena, which wraps the RJDBC...

## Laguerre-Samuelson Inequality

September 12, 2019

Chebychev’s Theorem gives bounds on how spread out a probability distribution can be from the mean, in terms of the standard deviation. More precisely, if $$X$$ is a random variable with mean $$\mu$$ and standard deviation $$\sigma$$, then \[ P(|X - ...
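For reference, the standard textbook statements of the two results named here are given below. They are not necessarily written in the form the truncated excerpt goes on to use, and the symbols \(k\), \(n\), \(\bar{x}\), and \(s\) are notation introduced for this note. Chebyshev's inequality, for any \(k > 0\):

\[ P\left(|X - \mu| \ge k\sigma\right) \le \frac{1}{k^2} \]

The Laguerre-Samuelson inequality, for data points \(x_1, \dots, x_n\) with mean \(\bar{x}\) and standard deviation \(s = \sqrt{\tfrac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}\):

\[ |x_i - \bar{x}| \le s\sqrt{n - 1} \quad \text{for every } i \]

The first bounds the probability of landing far from the mean, while the second bounds how far any single observation in a finite sample can be from the sample mean.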

## Fitting ‘complex’ mixed models with ‘nlme’: Example #2

A repeated split-plot experiment with heteroscedastic errors: Let’s imagine a field experiment, where different genotypes of khorasan wheat are to be compared under different nitrogen (N) fertilisation systems. Genotypes require bigger plots, with respect to fertilisation treatments and, therefore, the most convenient choice would be to lay out the experiment as a split-plot, in a randomised complete block design. Genotypes would...

## Social Network Visualization with R

September 12, 2019

This month we are going to look at data analysis and visualization of social networks using R programming. Friendster Networks Mapping: Friendster was a yesteryear social media network, something akin to Facebook. I’ve never used it but it is one of those easily available datasets where you have a list of users and all...

## WVPlots 1.1.2 on CRAN

September 12, 2019

I have put a new release of the WVPlots package up on CRAN. This release adds palette and/or color controls to most of the plotting functions in the package. WVPlots was originally a catch-all package of ggplot2 visualizations that we at Win-Vector tended to use repeatedly, and wanted to turn into “one-liners.” A consequence of …

## Survival analysis with strata, clusters, frailties and competing risks in Finalfit

September 12, 2019

Background: In healthcare, we deal with a lot of binary outcomes. Death yes/no, disease recurrence yes/no, for instance. These outcomes are often easily analysed using binary logistic regression via finalfit(). When the time taken for the outcome to occur is important, we need a different approach. For instance, in patients with cancer, the time taken until …

## A DevOps Process for Deploying R to Production

September 12, 2019

I've been at the EARL Conference in London this week, and as always it's been inspiring to see so many examples of R being used in production at companies like Sainsbury's, BMW, Austria Post, PartnerRe, Royal Free Hospital, the BBC, the Financial Times, and many others. My own talk, A DevOps Process for Deploying R to Production, presented one...

## Use more of your data with matrix factorisation

September 12, 2019

Previously I posted on how to apply gradient descent on linear regression as an example. With that as background it’s... The post Use more of your data with matrix factorisation appeared first on Daniel Oehm | Gradient Descending.

## Gold-Mining Week 2 (2019)

September 11, 2019

Welcome to the 2019 Fantasy Football Season! Week 2 Gold Mining and Fantasy Football Projection Roundup now available. The post Gold-Mining Week 2 (2019) appeared first on Fantasy Football Analytics.

## Call for Help: Lead R/Shiny Developer

September 11, 2019

Dear Fantasy Football Analytics Community, In 2013, we at Fantasy Football Analytics released web apps to help people make better decisions in fantasy football based on the wisdom of the... The post Call for Help: Lead R/Shiny Developer appeared first on Fantasy Football Analytics.

## R/Medicine 2019 Workshops

September 11, 2019

R/Medicine 2019 kicked off on Thursday with two outstanding workshops. It was difficult to choose between the two, but fortunately both presenters developed rich sets of materials that are available online. Alison Hill delivered R Markdown for Medicine with an elegant HTML exposition masterfully created to cultivate beginners while still engaging experienced R Markdown users. In...