Articles by Guest Blogger

How to give money to the R project

December 11, 2018 | Guest Blogger

by Mark Niemann-Ross, an author, educator, and writer who teaches about R and Raspberry Pi at LinkedIn Learning. I spend a LOT of time at r-project.org, in particular the sections for documentation and CRAN. But I hadn’t spent much time in the other areas: R Project, R Foundation, ... [Read more...]

Report from the Enterprise Applications of the R Language conference

November 16, 2018 | Guest Blogger

by Mark Niemann-Ross. Mango Solutions presented the EARL conference in Seattle this November and I was fortunate enough to have time to attend. During the conference I took notes on the presentations, which I’ll pass along to you. Thoughts on the future of R in industry: The EARL conference ... [Read more...]

Compare outlier detection methods with the OutliersO3 package

March 8, 2018 | Guest Blogger

by Antony Unwin, University of Augsburg, Germany. There are many different methods for identifying outliers and a lot of them are available in R. But are outliers a matter of opinion? Do all methods give the same results? Articles on outlier methods use a mixture of theory and practice. Theory ... [Read more...]

DataExplorer: Fast Data Exploration With Minimum Code

February 8, 2018 | Guest Blogger

by Boxuan Cui, Data Scientist at Smarter Travel. Once upon a time, there was a joke: "In Data Science, 80% of time spent prepare data, 20% of time spent complain about need for prepare data." (Big Data Borat, @BigDataBorat, February 27, 2013) According to a Forbes article, cleaning and organizing data is the most ... [Read more...]

An introduction to seplyr

December 14, 2017 | Guest Blogger

by John Mount, Win-Vector LLC. [`seplyr`](https://winvector.github.io/seplyr/) is an [`R`](https://www.r-project.org) package that supplies improved standard evaluation interfaces for many common data wrangling tasks. The core of `seplyr` is a re-skinning of [`dplyr`](https://CRAN.R-project.org/package=dplyr)'s functionality to `seplyr` ... [Read more...]

How to make Python easier for the R user: revoscalepy

November 28, 2017 | Guest Blogger

by Siddarth Ramesh, Data Scientist, Microsoft. I’m an R programmer. To me, R has been great for data exploration, transformation, statistical modeling, and visualizations. However, there is a huge community of Data Scientists and Analysts who turn to Python for these tasks. Moreover, both R and Python experts exist ... [Read more...]

Scale up your parallel R workloads with containers and doAzureParallel

November 21, 2017 | Guest Blogger

by JS Tan (Program Manager, Microsoft). The R language is by far the most popular statistical language, and has seen massive adoption in both academia and industry. In our new data-centric economy, the models and algorithms that data scientists build in R are not just being used for research ... [Read more...]

Recap: EARL Boston 2017

November 9, 2017 | Guest Blogger

by Emmanuel Awa, Francesca Lazzeri and Jaya Mathew, data scientists at Microsoft. A few of us got to attend the EARL conference in Boston last week, which brought together a group of talented users of R from academia and industry. The conference highlighted various Enterprise Applications of R. Despite being a ... [Read more...]

Role Playing with Probabilities: The Importance of Distributions

November 2, 2017 | Guest Blogger

by Jocelyn Barker, Data Scientist at Microsoft. I have a confession to make. I am not just a statistics nerd; I am also a role-playing games geek. I have been playing Dungeons and Dragons (DnD) and its variants since high school. While playing with my friends the other day it ... [Read more...]

Estimating mean variance and mean absolute bias of a regression tree by bootstrapping using foreach and rpart packages

October 26, 2017 | Guest Blogger

by Błażej Moska, computer science student and data science intern. One of the most important things in predictive modelling is how our algorithm will cope with various datasets, both training and testing (previously unseen). This is strictly connected with the concept of bias-variance tradeoff. Roughly speaking, variance of ... [Read more...]

Calculating a fuzzy kmeans membership matrix with R and Rcpp

August 24, 2017 | Guest Blogger

by Błażej Moska, computer science student and data science intern. Suppose that we have performed K-means clustering in R and are satisfied with our results, but later we realize that it would also be useful to have a membership matrix. Of course it would be easier to ... [Read more...]

Tutorial: Deep Learning with R on Azure with Keras and CNTK

August 9, 2017 | Guest Blogger

by Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft). Microsoft's Cognitive Toolkit (better known as CNTK) is a commercial-grade and open-source framework for deep learning tasks. At present CNTK does not have a native R interface but can be accessed through Keras, a high-level API ... [Read more...]

Data Science Accelerator for Credit Risk Prediction

July 12, 2017 | Guest Blogger

by Fang Zhou, Data Scientist, and Graham Williams, Director of Data Science, both at Microsoft. Credit Risk Scoring is a classic but increasingly important operation in banking, as banks are becoming far more careful about risk when lending for mortgages, credit card payments or other commercial purposes, in an industry known for ... [Read more...]

XGBoost support added to Rattle

July 7, 2017 | Guest Blogger

by Fang Zhou, Data Scientist, and Graham Williams, Director of Data Science, both at Microsoft. Rattle (the R Analytical Tool To Learn Easily) is a popular open-source GUI for data mining using R. It presents statistical and visual summaries of data, transforms data so that it can be readily modelled, builds both ... [Read more...]

Who is the caretaker? Evidence-based probability estimation with the bnlearn package

May 26, 2017 | Guest Blogger

by Juan M. Lavista Ferres, Senior Director of Data Science at Microsoft. In what was one of the most viral episodes of 2017, political science Professor Robert E Kelly was live on BBC World News talking about the South Korean president being forced out of office when both his kids decided ... [Read more...]

AzureDSVM: a new R package for elastic use of the Azure Data Science Virtual Machine

May 19, 2017 | Guest Blogger

by Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft). The Azure Data Science Virtual Machine (DSVM) is a curated VM which provides commonly-used tools and software for data science and machine learning, pre-installed. AzureDSVM is a new R package that enables seamless interaction with the ... [Read more...]

R is for Archaeology: A report on the 2017 Society of American Archaeology meeting

April 14, 2017 | Guest Blogger

by [Ben Marwick](https://twitter.com/benmarwick/), Associate Professor of Archaeology, University of Washington and Senior Research Scientist, University of Wollongong. The [Society of American Archaeology (SAA)](http://www.saa.org/) is one of the largest professional organisations for archaeologists in the world, and just concluded its annual meeting in ... [Read more...]

Massively-parallel computations on Azure clusters with R, made easy with doAzureParallel

March 29, 2017 | Guest Blogger

by JS Tan (Program Manager, Microsoft). For users of the R language, scaling up their work to take advantage of cloud-based computing has generally been a complex undertaking. We are therefore excited to announce doAzureParallel, a lightweight R package built on Azure Batch that allows you to easily use Azure’... [Read more...]

Running your R code on Azure with mrsdeploy

March 22, 2017 | Guest Blogger

by John-Mark Agosta, data scientist manager at Microsoft. Let’s say you’ve built a model in R that is larger than you can conveniently run locally, and you want to take advantage of Azure’s resources simply to run it on a larger machine. This blog explains how to ... [Read more...]

AUC Meets the Wilcoxon-Mann-Whitney U-Statistic

March 15, 2017 | Guest Blogger

by Bob Horton, Senior Data Scientist, Microsoft. The area under an ROC curve (AUC) is commonly used in machine learning to summarize the performance of a predictive model with a single value. But you might be surprised to learn that the AUC is directly connected to the Mann-Whitney U-Statistic, which ... [Read more...]

R-bloggers

R news and tutorials contributed by hundreds of R bloggers
