Blog Archives

Painting with Data

August 4, 2017

The accidental aRt Tumblr (mentioned here a few years ago) continues to provide a steady stream of images that wouldn't look out of place in a modern art gallery, but which are in fact data visualizations (mostly attempted in R) gone wrong. (Here's a typical recent entry.) But now, Giora Simchoni has taken this concept to the next level...

Read more »

Text categorization with deep learning, in R

August 3, 2017

Given a short review of a product, like "I couldn't put it down!", can you predict what the product is? In that case it's pretty easy — it's for a book — but this general problem of text categorization comes up in a lot of natural language analysis problems. In his talk at useR!2017 (shown below), Microsoft data scientist...

Read more »

Applications in energy, retail and shipping

August 2, 2017

The Solutions section of the Cortana Intelligence Gallery provides more than two dozen working examples of applying machine learning, data science and artificial intelligence to real-world problems. Each solution provides sample data, scripts for model training and evaluation, and reporting of predictions. You can deploy a complete stack in Azure to implement the solution with the click of a...

Read more »

A modern database interface for R

August 1, 2017

At the useR! conference last month, Jim Hester gave a talk about two packages that provide a modern database interface for R. Those packages are the odbc package (developed by Jim and other members of the RStudio team), and the DBI package (developed by Kirill Müller with support from the R Consortium). To communicate with databases, a common protocol...

Read more »

How to use H2O with R on HDInsight

July 31, 2017

H2O.ai is an open-source AI platform that provides a number of machine-learning algorithms that run on the Spark distributed computing framework. Azure HDInsight is Microsoft's fully managed Apache Hadoop platform in the cloud, which makes it easy to spin up and manage Azure clusters of any size. It's also easy to run H2O on HDInsight: H2O AI Platform is...

Read more »

Learn parallel programming in R with these exercises for "foreach"

July 28, 2017

The foreach package provides a simple looping construct for R: the foreach function, which you may be familiar with from other languages like JavaScript or C#. It's basically a function-based version of a "for" loop. But what makes foreach useful isn't iteration: it's the way it makes it easy to run those iterations in parallel, and save time on...

Read more »

The R6 Class System

July 27, 2017

R is an object-oriented language with several object-orientation systems. There's the original (and still widely used) S3 class system based on the "class" attribute. There's the somewhat stricter, signature-based S4 class system. There are reference classes (also called R5), which provide R objects with multiple references without duplicating data in memory. And now there's the R6 class system, implemented as...

Read more »

Introducing Joyplots

July 26, 2017

This is a joyplot: a series of histograms, density plots or time series for a number of data segments, all aligned to the same horizontal scale and presented with a slight overlap.

"Peak time for sports and leisure #dataviz. About time for a joyplot; might do a write-up on them. #rstats code at https://t.co/Q2AgW068Wa pic.twitter.com/SVT6pkB2hB" (Henrik Lindberg, @hnrklndbrg)...

Read more »

SQL Server 2017 release candidate now available

July 25, 2017

SQL Server 2017, the next major release of the SQL Server database, has been available as a community preview for around eight months, but now the first full-featured release candidate is available for public preview. For those looking to do data science with data in SQL Server, there are a number of new features compared to SQL Server 2016...

Read more »

Analyzing GitHub pull requests with Neural Embeddings, in R

July 24, 2017

At the useR!2017 conference earlier this month, my colleague Ali Zaidi gave a presentation on using neural embeddings to analyze GitHub pull request comments (processed using the tidy text framework). The data analysis was done using R and distributed on Spark, and the resulting neural network was trained using the Microsoft Cognitive Toolkit. You can see the slides here, and...

Read more »
