Blog Archives

Publish R functions as stored procedures with the sqlrutils package

April 4, 2017

If you've created an R function (say, a routine to clean up missing values in a data set, or a function to make forecasts using a machine learning model), and you want to make it easy for DBAs to use, it's now possible to publish R functions as SQL Server 2016 stored procedures. The sqlrutils package provides...

Read more »

The Most Popular Languages for Data Scientists/Engineers

April 3, 2017

The results of the 2017 StackOverflow Survey of nearly 65,000 developers were published recently, and include lots of interesting insights about their work, lives and preferences. The results include a cross-tabulation of the most popular languages amongst the "Data Scientist/Engineer" subset, and the results were ... well, surprising: When thinking about data scientists, it certainly makes sense to see...

Read more »

Tutorial: Using R for Scalable Data Analytics

March 31, 2017

At the recent Strata conference in San Jose, several members of the Microsoft Data Science team presented the tutorial Using R for Scalable Data Analytics: Single Machines to Spark Clusters. The materials are all available online, including the presentation slides and hands-on R scripts. You can follow along with the materials at home, using the Data Science Virtual Machine...

Read more »

Learning Scrabble strategy from robots, using R

March 30, 2017

While you might think of Scrabble as that game you play with your grandparents on a rainy Sunday, some people take it very seriously. There's an international competition devoted to Scrabble, and no end of guides and strategies for competitive play. James Curley, a psychology professor at Columbia University, has used an interesting method to collect data about what...

Read more »

UK government using R to modernize reporting of official statistics

March 28, 2017

Like all governments, the UK government is responsible for producing reports of official statistics on an ongoing basis. That process has traditionally been a highly manual one: extract data from government systems, load it into a mainframe statistical analysis tool and run models and forecasts, extract the results to a spreadsheet to prepare data for presentation, and ultimately combine...

Read more »

Data science languages score highly in RedMonk rankings

March 27, 2017

RedMonk have once again updated (a little later than usual) their bi-annual programming language report with their January 2017 rankings. If you haven't come across these rankings before, they are based on GitHub contributions and StackOverflow questions related to around 40 commonly-used programming languages. The raw data (as of January 2017) is shown below; as you might guess...

Read more »

Comparing subreddits, with Latent Semantic Analysis in R

March 24, 2017

FiveThirtyEight published a fascinating article this week about the subreddits that provided support to Donald Trump during his campaign, and continue to do so today. Reddit, for those not in the know, is a popular online social community organized into thousands of discussion topics, called subreddits (the names all begin with "r/"). Most of the subreddits are a useful...

Read more »

Give a talk about an application of R at EARL

March 21, 2017

The EARL (Enterprise Applications of R) conference is one of my favourite events to go to. As the name suggests, its focus is where the rubber of the R language meets the road of solving real-world problems. Prior conferences have included presentations on how Maersk uses R to optimize...

Read more »

Alteryx integrates with Microsoft R

March 20, 2017

You can now use Alteryx Designer, the data science workflow tool from Alteryx, as a drag-and-drop interface for many of the big-data statistical modeling tools included with Microsoft R. Alteryx v11.0 includes expanded support for Microsoft SQL Server 2016, Microsoft R Server, Azure SQL Data Warehouse, and Microsoft Analytics Platform System (APS), with new workflow tools to access functionality...

Read more »

Data Science at StitchFix

March 17, 2017

If you want to see a great example of how data science can inform every stage of a business process, from product concept to operations, look no further than Stitch Fix's Algorithms Tour. Scroll down through this explainer to see how this personal styling service uses data and statistical inference to suggest clothes their customers will love, ship them...

Read more »
