Articles by David Smith

Reflections on ROpenSci Unconference 2017

May 30, 2017 | David Smith

Last week I attended the ROpenSci Unconference in Los Angeles, and it was fantastic. Now in its fourth year, the ROpenSci team brought together a talented and diverse group of about 70 R developers from around the world to work on R-related projects in an intense 2-day hackathon. Not only did ... [Read more...]

Love is all around: Popular words in pop hits

May 25, 2017 | David Smith

Data scientist Giora Simchoni recently published a fantastic analysis of the history of pop songs on the Billboard Hot 100 using the R language. Giora used the rvest package in R to scrape data from the Ultimate Music Database site for the 350,000 chart entries (and 35,000 unique songs) since 1940, and used those ... [Read more...]

Microsoft R Open 3.4.0 now available

May 24, 2017 | David Smith

Microsoft R Open (MRO), Microsoft's enhanced distribution of open source R, has been upgraded to version 3.4.0 and is now available for download for Windows, Mac, and Linux. This update upgrades the R language engine to R 3.4.0, reduces the size of the installer image, and updates the bundled packages. R 3.4.0 (upon ... [Read more...]

Create smooth animations in R with the tweenr package

May 23, 2017 | David Smith

There are several tools available in R for creating animations (movies) from statistical graphics. The animation package by Yihui Xie will create an animated GIF or video file, using a series of R charts you generate as the frames. And the gganimate package by David Robinson is an extension to ... [Read more...]

Preview of EARL San Francisco

May 22, 2017 | David Smith

The first ever EARL (Enterprise Applications of the R Language) conference in San Francisco will take place on June 5-7 (and it's not too late to register). The EARL conference series is now in its fourth year, and the prior conferences in London and Boston have each been a fantastic ... [Read more...]

R/Finance 2017 livestreaming today and tomorrow

May 19, 2017 | David Smith

If you weren't able to make it to Chicago for R/Finance, the annual conference devoted to applications of R in the financial industry, don't fret: the entire conference is being livestreamed (with thanks to the team at Microsoft). You can watch the proceedings at aka.ms/r_finance, and ... [Read more...]

An Introduction to Spatial Data Analysis and Visualization in R

May 17, 2017 | David Smith

The Consumer Data Research Centre, the UK-based organization that works with consumer-related organisations to open up their data resources, recently published a new course online: An Introduction to Spatial Data Analysis and Visualization in R. Created by James Cheshire (whose blog Spatial.ly regularly features interesting R-based data visualizations) and ... [Read more...]

R in Financial Services: Challenges and Opportunities

May 16, 2017 | David Smith

At the New York R Conference earlier this year, my colleague Lixun Zhang gave a presentation on the challenges and opportunities financial services companies encounter when using R. In the talk, he shares some lessons learned while working with a couple of international banks that have been using SAS, but ... [Read more...]

R and Python support now built in to Visual Studio 2017

May 15, 2017 | David Smith

The new Visual Studio 2017 has built-in support for programming in R and Python. For older versions of Visual Studio, support for these languages has been available via the RTVS and PTVS add-ins, but the new Data Science Workloads in Visual Studio 2017 make them available without a separate add-in. Just choose ... [Read more...]

Analyzing the home advantage in English soccer, with R

May 12, 2017 | David Smith

It's well-known that the home team has an advantage in soccer (or football, as it's called in England). But which teams have made the most of their home-field advantage over the years? Evolutionary biologist (and Liverpool fan) Joe Gallagher analyzed the percentage of points won in the UK Premier League (... [Read more...]

Analyzing data on CRAN packages

May 11, 2017 | David Smith

There's a handy new function in R 3.4.0 for anyone interested in data about CRAN packages. It's not documented, but it's pretty simple: tools::CRAN_package_db() returns a data frame with one row for every package on CRAN and 65 columns of data on those packages, as shown below. names(tools::... [Read more...]

Stack Overflow Trends

May 10, 2017 | David Smith

Developer Q&A site Stack Overflow recently introduced Stack Overflow Trends, a useful tool for tracking the growth and decline in the rate of questions asked on various topics (by their Stack Overflow tag). For example, you can see that activity around both R and Python has been increasing over ... [Read more...]

Real-time scoring with Microsoft R Server 9.1

May 4, 2017 | David Smith

Once you've built a predictive model, in many cases the next step is to operationalize the model: that is, generate predictions from the pre-trained model in real time. In this scenario, latency becomes the critical metric: new data typically become available a single row at a time, and it's important ... [Read more...]

Technical Foundations of Informatics: A modern introduction to R

May 3, 2017 | David Smith

Informatics (or Information Science) is the practice of creating, storing, finding, manipulating and sharing information. These are all tasks that the R language was designed for, and so Technical Foundations of Informatics, the online course guide for the University of Washington course of the same name, also provides an excellent ... [Read more...]

The Datasaurus Dozen

May 2, 2017 | David Smith

There's a reason why data scientists spend so much time exploring data using graphics. Relying only on data summaries like means, variances, and correlations can be dangerous, because wildly different data sets can give similar results. This is a principle that has been demonstrated in statistics classes for decades with ... [Read more...]

Using Microsoft R with Alteryx

May 1, 2017 | David Smith

Alteryx Designer, the self-service analytics workflow tool, recently added integration with Microsoft R. This allows you to train models provided by Microsoft R, and create predictions from them, without needing to write R code — you simply drag-and-drop to create a workflow. In a recent post at the Microsoft R blog, ... [Read more...]

Make pleasingly parallel R code with rxExecBy

April 28, 2017 | David Smith

Some things are easy to convert from a long-running sequential process to a system where each part runs at the same time, thus reducing the required time overall. We often call these "embarrassingly parallel" problems, but given how easy it is to reduce the time it takes to execute them ... [Read more...]

Where Europe lives, in 14 lines of R Code

April 27, 2017 | David Smith

Via Max Galka, always a great source of interesting data visualizations, we have this lovely visualization of population density in Europe in 2011, created by Henrik Lindberg: Impressively, the chart was created with just 14 lines of R code: (To recreate it yourself, download the GEOSTAT-grid-POP-1K-2011-V2-0-1.zip file ... [Read more...]