Articles by David Smith

Microsoft R Server in the News

January 19, 2017 | David Smith

Since the release of Microsoft R Server 9 last month, there's been quite a bit of news in the tech press about the capabilities it provides for using R in production environments. InfoWorld's article, "Microsoft’s R tools bring data science to the masses", takes a look back at Microsoft's vision ... [Read more...]

Diversity in the R Community

January 18, 2017 | David Smith

In the follow-up to the useR! conference at Stanford last year, the Women in R Task Force took the opportunity to survey the 900-or-so participants about their backgrounds, experiences and interests. With 455 responses, the recently published results provide an interesting snapshot of the R community (or at least that subset able ... [Read more...]

Git Gud with Git and R

January 17, 2017 | David Smith

If you're doing any kind of in-depth programming in the R language (say, creating a report in R Markdown, or developing a package) you might want to consider using a version-control system. And if you collaborate with another person (or a team) on the work, it makes things infinitely easier when ... [Read more...]

The fivethirtyeight R package

January 16, 2017 | David Smith

Andrew Flowers, quantitative editor of FiveThirtyEight.com, announced at last week's RStudio conference the availability of a new R package containing data and analyses from some of their data journalism features: the fivethirtyeight package. (Andrew's talk isn't yet online, but you can see him discuss several of these stories in ... [Read more...]

Microsoft R Server tips from the Tiger Team

January 13, 2017 | David Smith

The Microsoft R Server Tiger Team helps customers around the world implement large-scale analytic solutions. Along the way, they discover useful tips and best practices, and share them on the Tiger Team blog. Here are a few recent tips from the Tiger Team on using Microsoft R Server: Gather ... [Read more...]

In case you missed it: December 2016 roundup

January 11, 2017 | David Smith

In case you missed them, here are some articles from December of particular interest to R users. Power BI now has a gallery of custom visualizations built with R. Chicago's Department of Public Health uses R to prioritize health inspections at restaurants. A beautiful map of Swiss municipalities combined with ... [Read more...]

What can we learn from StackOverflow data?

January 9, 2017 | David Smith

StackOverflow, the popular Q&A site for programmers, provides useful information to nearly 5 million programmers worldwide with its database of questions and answers, not to mention the additional comments that other programmers provide. (You might be interested in the architecture, based on SQL Server 2016, required to deliver the 8.5 billion pages Stack ... [Read more...]

Analyzing emotions in video with R

January 6, 2017 | David Smith

In the run-up to the election last year, Ben Heubl from The Economist used the Emotion API to chart the emotions portrayed by the candidates during the debates (note: auto-play video in that link). In his walkthrough of the implementation, Ben used Python to process the video files, and R ... [Read more...]

Three reasons to learn R today

January 6, 2017 | David Smith

If you're just getting started with data science, the Sharp Sight Labs blog argues that R is the best data science language to learn today. The blog post gives several detailed reasons, but the main arguments are: R is an extremely popular (arguably the most popular) data programming language, and ... [Read more...]

The biggest R stories from 2016

January 3, 2017 | David Smith

It's been another great year for the R project and the R community. Let's look at some of the highlights from 2016. The R 3.3 major release brought some significant performance improvements to R, along with a spiffy new logo. There were also two updates in 2016: R 3.3.1 and R 3.3.2. (The R 3.2 series ... [Read more...]

Power BI custom visuals, based on R

December 30, 2016 | David Smith

You've been able to include user-defined charts using R in Power BI dashboards for a while now, but a recent update to Power BI includes seven new custom charts based on R in the custom visuals gallery. You can see the new chart types by visiting the Power BI Custom ... [Read more...]

Using R to prevent food poisoning in Chicago

December 29, 2016 | David Smith

There are more than 15,000 restaurants in Chicago, but fewer than 40 inspectors tasked with making sure they comply with food-safety standards. To help prioritize the facilities targeted for inspection, the City of Chicago used R to create a model that predicts which restaurants are most likely to fail an inspection. Using ... [Read more...]

Combine choropleth data with raster maps using R

December 28, 2016 | David Smith

Switzerland is a country with lots of mountains, and several large lakes. While the political subdivisions (called municipalities) cover the high mountains and lakes, nothing much of economic interest happens in these places. (Raclette and sailing are wonderful, but don't count for our purposes.) For this reason, the Swiss Federal ... [Read more...]

The Basics of Bayesian Statistics

December 26, 2016 | David Smith

Bayesian Inference is a way of combining information from data with things we think we already know. For example, if we wanted to get an estimate of the mean height of people, we could use our prior knowledge that people are generally between 5 and 6 feet tall to inform the results ... [Read more...]

Merry ChRistmas!

December 23, 2016 | David Smith

Christmas Day is soon upon us, so here's a greeting made with R: Each frame is a Voronoi tessellation: about 1,000 points are chosen across the plane, each of which generates a polygon comprising the region closer to it than to any other selected point. This process is repeated for three designs (a ... [Read more...]

Interactive decision trees with Microsoft R

December 20, 2016 | David Smith

Even though ensembles of trees (random forests and the like) generally have better predictive power and robustness, fitting a single decision tree to data can often be very useful for: understanding the important variables in a data set; exploring unusual subsegments of the data (and the explanatory variables that define ... [Read more...]

Mixed Integer Programming in R with the ompr package

December 19, 2016 | David Smith

Numerical optimization is an important tool in the data scientist's toolbox. Many classical statistical problems boil down to finding the highest (or lowest) point on a multi-dimensional surface: the base R function optim provides many techniques for solving such maximum likelihood problems. Counterintuitively, numerical optimizations are easiest (though rarely actually ... [Read more...]

Predicting flu deaths with R

December 16, 2016 | David Smith

As Google learned, predicting the spread of influenza, even with mountains of data, is notoriously difficult. Nonetheless, bioinformatician and R user Shirin Glander has created a two-part tutorial about predicting flu deaths with R (part 2 here). The analysis is based on just 136 cases of influenza A H7N9 in China ... [Read more...]