Articles by David Smith

Give a talk about an application of R at EARL

March 21, 2017 | David Smith

The EARL (Enterprise Applications of R) conference is one of my favourite events to go to. As the name of the conference suggests, the focus of the conference is where the rubber of the R language meets the road of it being used to solve real-world problems. Prior conferences have ... [Read more...]

Alteryx integrates with Microsoft R

March 20, 2017 | David Smith

You can now use Alteryx Designer, the data science workflow tool from Alteryx, as a drag-and-drop interface for many of the big-data statistical modeling tools included with Microsoft R. Alteryx v11.0 includes expanded support for Microsoft SQL Server 2016, Microsoft R Server, Azure SQL Data Warehouse, and Microsoft Analytics Platform System (... [Read more...]

Data Science at StitchFix

March 17, 2017 | David Smith

If you want to see a great example of how data science can inform every stage of a business process, from product concept to operations, look no further than Stitch Fix's Algorithms Tour. Scroll down through this explainer to see how this personal styling service uses data and statistical inference ... [Read more...]

Book Review: Testing R Code

March 16, 2017 | David Smith

When it comes to getting things right in data science, most of the focus goes to the data and the statistical methodology used. But when a misplaced parenthesis can throw off your results entirely, ensuring correctness in your programming is just as important. A new book published by CRC Press, ... [Read more...]

Neural Networks: How they work, and how to train them in R

March 15, 2017 | David Smith

With the current focus on deep learning, neural networks are all the rage again. (Neural networks have been described for more than 60 years, but it wasn't until the power of modern computing systems became available that they have been successfully applied to tasks like image recognition.) Neural networks are ... [Read more...]

Benchmarking rxNeuralNet for OCR

March 13, 2017 | David Smith

The MicrosoftML package introduced with Microsoft R Server 9.0 added several new functions for high-performance machine learning, including rxNeuralNet. Tomaz Kastrun recently applied rxNeuralNet to the MNIST database of handwritten digits to compare its performance with two other machine learning packages, h2o and xgboost. The results are summarized in the ... [Read more...]

Updates to the Data Science Virtual Machine for Linux

March 10, 2017 | David Smith

The Data Science Virtual Machine (DSVM) is a virtual machine image on the Azure Marketplace assembled for data scientists. The goal of the DSVM is to provide a broad array of popular data-oriented tools in a single environment, and make data scientists and developers highly productive in their work. It's available ... [Read more...]

The Rise of Civilization, Visualized with R

March 9, 2017 | David Smith

This animation by geographer James Cheshire shows something at once simple and profound: the founding and growth of the cities of the world since the dawn of civilization. Dr Cheshire created the animation using R and the rworldmap package, using data from this Nature dataset. The complete R code is ... [Read more...]

In case you missed it: February 2017 roundup

March 7, 2017 | David Smith

In case you missed them, here are some articles from February of particular interest to R users. Public policy researchers use R to predict neighbourhoods in US cities subject to gentrification. The ggraph package provides a grammar-of-graphics framework for visualizing directed and undirected graphs. Facebook has open-sourced the "prophet" package ... [Read more...]

R 3.3.3 now available

March 6, 2017 | David Smith

The R core group announced today the release of R 3.3.3 (code-name: "Another Canoe"). As the wrap-up release of the R 3.3 series, this update mainly contains minor bug-fixes. (Bigger changes are planned for R 3.4.0, expected in mid-April.) Binaries for the Windows version are already up on the CRAN master site, and ... [Read more...]

Scholarships encourage diversity at useR!2017

March 1, 2017 | David Smith

While representation of women and minorities at last year's useR! conference was the highest it's ever been, there is always room for more diversity. To encourage more underrepresented individuals to attend, the useR! committee has taken several steps, including asking attendees to adhere to a supportive code of conduct and ... [Read more...]

Forecasting gentrification in city neighborhoods, with R

February 28, 2017 | David Smith

If you've lived in a big city, you're likely familiar with the impact of gentrification. For longtime residents of a neighbourhood, it can represent a decline in the culture and vibrancy of your community; for recent or prospective residents, it can represent a financial opportunity in rising home prices. For ... [Read more...]

ggraph: ggplot for graphs

February 27, 2017 | David Smith

A graph, a collection of nodes connected by edges, is just data. Whether it's a social network (where nodes are people, and edges are friend relationships), or a decision tree (where nodes are branch criteria or values, and edges decisions), the nature of the graph is easily represented in a ... [Read more...]

Preview: R Tools for Visual Studio 1.0

February 23, 2017 | David Smith

After more than a year in preview, R Tools for Visual Studio, the open-source extension to the Visual Studio IDE for R programming, is nearing its official release. RTVS Release Candidate 1 is now available for download, giving you the opportunity to try out the new features ahead of the official ... [Read more...]

The difference between R and Excel

February 22, 2017 | David Smith

If you're an Excel user (or any other spreadsheet, really), adapting to learn R can be hard. As this blog post by Gordon Shotwell explains, one of the reasons is that simple things can be harder to do in R than Excel. But it's worth persevering, because complex things can ... [Read more...]

Finding Radiohead’s most depressing song, with R

February 22, 2017 | David Smith

Radiohead is known for having some fairly maudlin songs, but of all of their tracks, which is the most depressing? Data scientist and R enthusiast Charlie Thompson ranked all of their tracks according to a "gloom index", and created the following chart of gloominess for each of the band's nine ... [Read more...]

Catterplots: Plots with cats

February 17, 2017 | David Smith

As a devotee of Tufte, I'm generally against chartjunk. Graphical elements that obscure interpretation of the data occasionally have a useful role to play, but more often than not that role is to entertain at the expense of enlightenment, or worse, to actively mislead. So it's with mixed feelings that I ... [Read more...]