Wrangling Wikileaks DMs

August 5, 2018

Using R to turn raw data into browsable and reusable content.

Collecting Expressions in R

August 5, 2018

Not a full R article, but a quick note demonstrating by example the advantage of being able to collect many expressions and pack them into a single extend_se() node. This example may seem extreme or unnatural. However, we have seen that once you expose a system to enough users, you see a lot more extreme use … Continue reading Collecting...

Time to Accept It: publishing in the Journal of Statistical Software

August 5, 2018

Originally posted on The Geokook.: When I was considering submitting my paper on psd to J. Stat. Soft. (JSS), I kept noticing that the time from “Submitted” to “Accepted” was nearly two years in many cases.  I ultimately decided that was much too long of a review process, no matter what the impact factor might be (and in two…

How Do We Draw a Line?

August 5, 2018

She dreams in colour, she dreams in red, can’t find a better man (Better Man, Pearl Jam). Today I bring another experiment based on The Quick Draw! Data from Google, one of my most fortunate recent discoveries. The Quick Draw! is a web game developed by Google that can be played on a computer, … Continue reading How...

Estimating treatment effects and ICCs from (G)LMMs on the observed scale using Bayes, Part 1: lognormal models

August 5, 2018

When a multilevel model includes either a non-linear transformation (such as the log-transformation) of the response variable, or of the expectations via a GLM link-function, then the interpretation of the results will be different compared to a standard Gaussian multilevel model; specifically, the estimates will be on a transformed scale and not in the original units, and the effects...

Longitudinal heat plots

August 4, 2018

During our research on the effect of prednisone consumption during pregnancy on health outcomes of the baby (Palmsten K, Rolland M, Hebert MF, et al., Patterns of prednisone use during pregnancy in women with rheumatoid arthritis: Daily and cumulative dose. Pharmacoepidemiol Drug Saf. 2018 Apr;27(4):430-438. https://www.ncbi.nlm.nih.gov/pubmed/29488292) we developed a custom plot to visualize for each patient … Continue reading Longitudinal...

Introducing the HCmodelSets Package

August 4, 2018

By Henrique Helfer Hoeltgebaum. Introduction: I am happy to introduce the package HCmodelSets, which is now available on CRAN. This package implements the methods proposed by Cox, D.R. and Battey, H.S. (2017). In particular it performs the reduction, exploratory and … Continue reading →

Digging into mbox details: A tale of tm & reticulate

August 4, 2018

✨ I had to process a bunch of emails for a $DAYJOB task this week and my “default setting” is to use R for pretty much everything (this should come as no surprise). Treating mail as data is not an uncommon task and many R packages exist that can reach out and grab mail from... Continue reading →

RStudio:addins part 5 – Profile your code on keypress in the background, with no dependencies.

August 4, 2018

Introduction: Profiling our code is a very useful tool to determine how well the code performs on different metrics. The addin we will create in this article will let us use a keyboard shortcut to run profiling on R code selected in RStudio without blocking the session or requiring any external packages. Specifically for very simple overview use, it may be beneficial...

Video: How to run R and Python in SQL Server from a Jupyter notebook

August 3, 2018

Did you know that you can run R and Python code within a SQL Server instance? Not only does this give you a powerful server to process your data science calculations, but it makes things faster by eliminating the need to move data between client and server. Best of all, you can access this power via your favourite interface...

Stats Note: Making Sense of Open-Ended Responses with Text Analysis

August 3, 2018

Using Text Mining on Open Ended Items. Good survey design is both art and science. You have to think about how people will read and process your questions, and what sorts of responses might result from different question forms and wording. One of the big rules I follow in survey design is that you don't assess any of your...

The program for uRos2018 is online

August 3, 2018

The uRos2018 conference is aimed at professionals and academics who are involved in producing or consuming official (government) statistics. We are happy to announce that we recently posted the full program of the 6th international conference on the use of … Continue reading →

Saving ts objects as csv files

August 2, 2018

Occasionally R might not be the tool you want to use (hard to believe, but apparently that happens). Then you may need to export some data from R via a csv file. When the data is stored as a ts object, the time index can easily get lost. So I wrote a little function to make this easier, using...

World Cup prediction winners

Predicting how the different teams would fare in the FIFA World Cup has been of great interest to the general public, and it has also attracted quite some attention in the R community. The World Cup has ended and by now, everyone knows that France managed to take home the trophy that slipped through their fingers when...

How Should I Organize My R Research Projects?

August 2, 2018

While I understand the languages I need well enough, I don't know much about programming best practices. This goes from function naming to code organization, along with all the tools others created to manage projects (git, make, ctags, etc.). For short scripts and blog posts, this is fine. Even for a research paper where you're using tools rather than...

Basic Generalized Linear Modeling – Part 2: Exercises

August 1, 2018

In this exercise, we will handle an over-dispersed model using the quasi-Poisson model. Over-dispersion simply means that the variance is greater than the mean. It’s important because it leads to inflation in the models and increases the possibility of Type I errors. We will use a data-set on amphibian road …

Use domain knowledge to review prior distributions

August 1, 2018

At the Insurance Data Science conference, both Eric Novik and Paul-Christian Bürkner emphasised in their talks the value of thinking about the data generating process when building Bayesian statistical models. It is also a key step in Michael Betancourt’s Principled Bayesian Workflow. In this post, I will discuss in more detail how to set priors, and review the prior and...

The Hawkes Hand

August 1, 2018

Thanks to @LukeBornn & Tim Swartz, I had the opportunity to present the #HawkesHand🔥🏀 at #CASSIS18. I learned a lot, had a hell of a time & met some great people #RuffRuff. My Hawkes Hand slides can be viewed here: https://t.co/RVkyoNc2IY pic.twitter.com/hPfBcpDhcv (MikeJackTzen, @MKJCKTZN, August 8, 2018) Thanks to Luke Bornn & Tim…

R Generation: 25 Years of R

August 1, 2018

The August 2018 issue of Significance Magazine includes a retrospective feature on the R language. (I suggest reading the PDF version, also available for free access.) The article by Nick Thieme looks back at the 25 years since the R language was first conceived at Auckland University in 1992. It follows the history of R through the first public...

Making Jekyll Blog Suitable for R-bloggers

August 1, 2018

According to the post add your blog, adding one’s blog to R-bloggers isn’t easy at all, especially for people who use R Markdown to write posts and Jekyll to generate static web pages on GitHub. Two reasons make it difficult: The feed you submit should ONLY be about R (e.g., with R code, or directly related to the R...

Exploratory Data Analysis in R (introduction)

August 1, 2018

Exploratory data analysis (EDA) is the very first step in a data project. We will create a code template to achieve this with one function.

Understanding Titanic Dataset with H2O’s AutoML, DALEX, and lares library

August 1, 2018

If you have been studying or working with Machine Learning for at least a week, I am sure you have already played with the Titanic dataset! Today I bring some fun DALEX (Descriptive mAchine Learning EXplanations) functions to study the whole set’s response to the Survival feature and some individual explanation examples. Before we start, …

Thanks, NVIDIA

August 1, 2018

Andrew and I both received a note like this from NVIDIA: “We have reviewed your NVIDIA GPU Grant Request and are happy support your work with the donation of (1) Titan Xp to support your research.” Thanks! In case other people are interested, NVIDIA’s GPU grant program provides ways for faculty or research scientists to …

EARL Conference 2018 – the best yet!

August 1, 2018

With 6 weeks to go until the 2018 London EARL Conference, we can officially announce it will be the biggest and best yet. Launched in 2014, EARL focuses on the commercial use of R, an idea born from our experience of hosting the LondonR user group. Established in 2007, the user group has grown from 25 people meeting in...

Joint Statistical Meetings Talk

July 31, 2018

The Joint Statistical Meetings are being held in the beautiful city of Vancouver, British Columbia this year. I gave a talk on data visualization, which is a new one for me, but an area I’m quite excited about. I’ve been looking into the newer toolsets using Javascript graphics for a while now for different projects, and this talk...

Beyond Basic R – Data Munging

July 31, 2018

What we couldn’t cover: In the data cleaning portion of our Intro to R class, we cover a variety of common data manipulation tasks. Most of these were achieved using the package dplyr, including removing or retaining certain columns (select), filtering out rows by column condition (filter), creating new columns (mutate), renaming columns (rename), grouping data by categorical variables (group_by),...

MünsteR Meetup on Blog Mining: Deriving the success of blog posts from metadata and text data.

July 31, 2018

In our next MünsteR R-user group meetup on Tuesday, August 28th, 2018, Jenny Saatkamp will give a talk titled Blog Mining: Deriving the success of blog posts from metadata and text data. You can RSVP here: http://meetu.ps/e/F7zDN/w54bW/f In our next MünsteR Meetup, Jenny Saatkamp will present her Blog Mining analysis, which is based on 1,500 blog posts from the codecentric...

A package for dimensionality reduction of large data

Motivation Note: Recently, two new UMAP R packages have appeared. These new packages provide more features than umapr does and they are more actively developed. These packages are: umap, which provides the same Python wrapping function as umapr and also an R implementation, removing the need for the Python version to be installed. It is available on CRAN. uwot, which also provides...

Source code chapter added to “Evidence-based software engineering using R”

July 31, 2018

The Source Code chapter of my evidence-based software engineering book has been added to the draft PDF (download here). This chapter has suffered from coming last and there is still lots of work to be done. Almost all the source-code-related data has been plundered to fill up earlier chapters. Some data did not
