Blog Archives

Calculating a fuzzy kmeans membership matrix with R and Rcpp

August 24, 2017

by Błażej Moska, computer science student and data science intern. Suppose that we have performed K-means clustering in R and are satisfied with our results, but later realize that it would also be useful to have a membership matrix. Of course, it would be easier to repeat the clustering using one of the fuzzy kmeans functions available in...

Read more »

Tutorial: Deep Learning with R on Azure with Keras and CNTK

August 9, 2017

by Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft). Microsoft's Cognitive Toolkit (better known as CNTK) is a commercial-grade, open-source framework for deep learning tasks. At present CNTK does not have a native R interface, but it can be accessed through Keras, a high-level API which wraps various deep learning backends including CNTK, TensorFlow,...

Read more »

Data Science Accelerator for Credit Risk Prediction

July 12, 2017

by Fang Zhou, Data Scientist, and Graham Williams, Director of Data Science, both at Microsoft. Credit risk scoring is a classic but increasingly important operation in banking, as banks have become far more careful about risk when lending for mortgages, credit card payments or other commercial purposes, in an industry shaped by fierce competition and the global financial crisis. With an...

Read more »

XGBoost support added to Rattle

July 7, 2017

by Fang Zhou, Data Scientist, and Graham Williams, Director of Data Science, both at Microsoft. Rattle (the R Analytical Tool To Learn Easily) is a popular open-source GUI for data mining using R. It presents statistical and visual summaries of data, transforms data so that it can be readily modelled, builds both unsupervised and supervised models from the data,...

Read more »

Who is the caretaker? Evidence-based probability estimation with the bnlearn package

May 26, 2017

by Juan M. Lavista Ferres, Senior Director of Data Science at Microsoft. In what was one of the most viral episodes of 2017, political science Professor Robert E. Kelly was live on BBC World News talking about the South Korean president being forced out of office when both his kids decided to take an easy path to fame...

Read more »

AzureDSVM: a new R package for elastic use of the Azure Data Science Virtual Machine

May 19, 2017

by Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft). The Azure Data Science Virtual Machine (DSVM) is a curated VM which provides commonly-used tools and software for data science and machine learning, pre-installed. AzureDSVM is a new R package that enables seamless interaction with the DSVM from a local R session, by providing functions...

Read more »

R is for Archaeology: A report on the 2017 Society of American Archaeology meeting

April 14, 2017

by Ben Marwick (https://twitter.com/benmarwick/), Associate Professor of Archaeology, University of Washington and Senior Research Scientist, University of Wollongong. The Society of American Archaeology (http://www.saa.org/) is one of the largest professional organisations for archaeologists in the world, and just concluded its annual meeting in Vancouver, BC at the end of March. The R language has been a part of this meeting...

Read more »

Massively-parallel computations on Azure clusters with R, made easy with doAzureParallel

March 29, 2017

by JS Tan (Program Manager, Microsoft). For users of the R language, scaling up their work to take advantage of cloud-based computing has generally been a complex undertaking. We are therefore excited to announce doAzureParallel, a lightweight R package built on Azure Batch that allows you to easily use Azure’s flexible compute resources right from your R session. The...

Read more »

Running your R code on Azure with mrsdeploy

March 22, 2017

by John-Mark Agosta, data scientist manager at Microsoft. Let’s say you’ve built a model in R that is larger than you can conveniently run locally, and you want to take advantage of Azure’s resources simply to run it on a larger machine. This blog explains how to provision and run an Azure virtual machine (VM) for this, using the...

Read more »

AUC Meets the Wilcoxon-Mann-Whitney U-Statistic

March 15, 2017

by Bob Horton, Senior Data Scientist, Microsoft. The area under an ROC curve (AUC) is commonly used in machine learning to summarize the performance of a predictive model with a single value. But you might be surprised to learn that the AUC is directly connected to the Mann-Whitney U-Statistic, which is commonly used in a robust, non-parametric alternative to...

Read more »
