Data visualization with googleVis exercises part 9

July 30, 2017

Histogram & Calendar chart: This is part 9 of our series, and we are going to explore two interesting chart types that googleVis provides: histogram and calendar charts. Read the examples below to understand the logic of what we are going to do and then test your skills with the Related exercise sets: Data Visualization...

Matching, Optimal Transport and Statistical Tests

July 30, 2017

To explain the “optimal transport” problem, we usually start with Gaspard Monge’s “Mémoire sur la théorie des déblais et des remblais”, which poses the problem of transporting a given distribution of matter (a pile of sand, for instance) into another (an excavation, for instance). This problem is usually formulated using distributions, and we seek the “optimal” transport from one...

Scripting for data analysis (with R)

July 30, 2017

Course materials (GitHub) This was a PhD course given in the spring of 2017 at Linköping University. The course was organised by the graduate school Forum scientium and was aimed at people who might be interested in using R for data analysis. The materials developed from a part of a previous PhD course from a

Understanding Overhead Issues in Parallel Computation

July 29, 2017

In my talk at useR! earlier this month, I emphasized the fact that a major impediment to obtaining good speed from parallelizing an algorithm is systems overhead of various kinds, including: contention for memory/network; bandwidth limits (CPU/memory, CPU/network, CPU/GPU); cache coherency problems; contention for I/O ports; OS and/or R limits on number of sockets …

Memorable dataviz with the R program, talk awarded people’s choice prize

July 29, 2017

For the past two years Dr Nick Hamilton has invited me to give a talk on creating data visuals with the R program at the wonderful UQ Winterschool in Bioinformatics. This year...

Tidy Time Series Analysis, Part 3: The Rolling Correlation

In the third part in a series on Tidy Time Series Analysis, we’ll use the runCor function from TTR to investigate rolling (dynamic) correlations. We’ll again use tidyquant to investigate CRAN downloads. This time we’ll also get some help from the...
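
For readers unfamiliar with the idea, a rolling correlation is simply a correlation recomputed over a moving window. The sketch below uses base R only and a hypothetical helper named rolling_cor; it is not TTR's runCor implementation, and the simulated series stand in for the download counts mentioned above.

```r
# Rolling Pearson correlation over a trailing window of n observations.
# Hypothetical helper for illustration; not TTR's implementation.
rolling_cor <- function(x, y, n) {
  stopifnot(length(x) == length(y), n >= 2, n <= length(x))
  out <- rep(NA_real_, length(x))   # positions before a full window stay NA
  for (i in n:length(x)) {
    idx <- (i - n + 1):i            # indices of the current window
    out[i] <- cor(x[idx], y[idx])
  }
  out
}

# Simulated data (an assumption of this sketch, not real CRAN downloads).
set.seed(1)
x <- cumsum(rnorm(50))
y <- x + rnorm(50)
r <- rolling_cor(x, y, n = 10)
```

Plotting r against time shows how the relationship between the two series strengthens or weakens, which a single overall correlation would hide.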

Forecasting workshop in Perth

July 29, 2017

On 26-28 September 2017, I will be running my 3-day workshop in Perth on “Forecasting: principles and practice” based on my book of the same name. Topics to be covered include seasonality and trends, exponential smoothing, ARIMA modelling, dynamic regression and state space models, as well as forecast accuracy methods and forecast evaluation techniques such as cross-validation. Workshop participants are expected...

More documentation for Win-Vector R packages

July 29, 2017

The Win-Vector public R packages now all have new pkgdown documentation sites! (And, a thank-you to Hadley Wickham for developing the pkgdown tool.) Please check them out (hint: vtreat is our favorite). The package sites: cdata replyr seplyr sigr vtre...

Updated overbought/oversold plot function

July 29, 2017

A good six years ago I blogged about plotOBOS() which charts a moving average (from one of several available variants) along with shaded standard deviation bands. That post has a bit more background on the why/how and motivation, but as a teaser here is the resulting chart of the SP500 index (with ticker ^GSPC): The code uses a few standard...

R Markdown exercises part 1

July 29, 2017

INTRODUCTION: R Markdown is one of the most popular data science tools and is used to save and execute code and to create exceptional reports which are easily shareable. The documents that R Markdown provides are fully reproducible and support a wide variety of static and dynamic output formats. Using markdown syntax, which provides an easy way Related exercise sets: How to...

Stan Weekly Roundup, 28 July 2017

July 28, 2017

Here’s the roundup for this past week. Michael Betancourt added case studies for methodology in both Python and R, based on the work he did getting the ML meetup together: RStan workflow, PyStan workflow. Michael Betancourt, along with Mitzi Morris, Sean Talts, and Jonah Gabry, taught the women in ML workshop at Viacom in NYC...

Learn parallel programming in R with these exercises for "foreach"

July 28, 2017

The foreach package provides a simple looping construct for R: the foreach function, which you may be familiar with from other languages like JavaScript or C#. It's basically a function-based version of a "for" loop. But what makes foreach useful isn't iteration: it's the way it makes it easy to run those iterations in parallel, and save time on...

Hacking Strings with stringi

July 28, 2017

In the last set of exercises, we worked on the basic concepts of string manipulation with stringr. In this one we will go further into the string-hacking universe and learn how to use the stringi package. Note that stringi acts as a backend of stringr but has many more useful string manipulation functions than stringr and Related exercise sets: Hacking strings...

Analyzing “Wait-Delay” Settings in Common Crawl robots.txt Data with R

July 28, 2017

One of my tweets that referenced an excellent post about the ethics of web scraping garnered some interest: “Apologies for a Medium link but if you do ANY web scraping, you need to read this #rstats // Ethics in Web Scraping https://t.co/y5YxvzB8Fd” (boB Rudis, @hrbrmstr, July 26, 2017). If you load up that tweet...

simmer 3.6.3

July 28, 2017

The third update of the 3.6.x release of simmer, the Discrete-Event Simulator for R, is on CRAN. First of all and once again, I must thank Duncan Garmonsway (@nacnudus) for writing a new vignette: “The Bank Tutorial: Part II”. Among various fixes and performance improvements, this release provides a way of knowing the progress of a simulation…

Joy Division, Population Surfaces and Pioneering Electronic Cartography

July 28, 2017

There has been a resurgence of interest in data visualizations inspired by Joy Division’s Unknown Pleasures album cover. These so-called “Joy Plots” are easier to create thanks to the development of the “ggjoy” R package and also some nice code posted using D3. I produced a global population map (details here) using a similar technique in 2013 and since

Visualising Similarity: Maps vs. Graphs

July 28, 2017

The visualization of complex data sets is of essential importance in communicating your data products. Beyond pie charts, histograms, line graphs and other common forms of visual communication begins the reign of data sets that encompass too much information to be easily captured by these simple data displays. A typical context that abounds with complexity is found in the...

Joy Plot of Length Frequencies

July 27, 2017

There has been a bit of a buzz recently about so-called “joyplots.” Wilke described joyplots as “partially overlapping line plots that create the impression of a mountain range.” I would describe them as partially overlapping densit...

Looking for R at JSM

July 27, 2017

I am very much looking forward to attending JSM which begins this Sunday. And once again, I will be spending a good bit of my time hunting for new and interesting applications of R. In years gone by, this was a difficult game at JSM because R, R Package, Shiny, tidyverse and the like did not often turn up...

Social Network Analysis and Topic Modeling of codecentric’s Twitter friends and followers

July 27, 2017

I have written the following post about Social Network Analysis and Topic Modeling of codecentric’s Twitter friends and followers for codecentric’s blog: Recently, Matthias Radtke has written a very nice blog post on Topic Modeling of the c...

Reading PCAP Files with Apache Drill and the sergeant R Package

July 27, 2017

It’s no secret that I’m a fan of Apache Drill. One big strength of the platform is that it normalizes the access to diverse data sources down to ANSI SQL calls, which means that I can pull data from parquet, Hive, HBase, Kudu, CSV, JSON, MongoDB and MariaDB with the same SQL syntax. This also...

Le Monde puzzle [#1707]

July 27, 2017

A geometric Le Monde mathematical puzzle: Given a pizza of diameter 20cm, what is the way to cut it by two perpendicular lines through a point distant 5cm from the centre towards maximising the surface of two opposite slices?  Using the same point as the tip of the four slices, what is the way to

The R6 Class System

July 27, 2017

R is an object-oriented language with several object-orientation systems. There's the original (and still widely-used) S3 class system based on the "class" attribute. There's the somewhat stricter, signature-based S4 class system. There are reference classes (also called R5), which provide R objects with multiple references without duplicating data in memory. And now there's the R6 class system, implemented as...
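
To make the contrast between these systems concrete, here is a minimal sketch using only what ships with R: an S3 object and a reference class. R6 itself lives in a separate package and is not shown here. The class names account and Account, the balance field and the deposit method are all made up for illustration.

```r
library(methods)  # provides setRefClass(); attached by default in most sessions

# S3: the class is just an attribute, and method dispatch keys off it.
account <- structure(list(balance = 100), class = "account")
print.account <- function(x, ...) {
  cat("Account with balance", x$balance, "\n")
  invisible(x)
}

# S3 objects have copy semantics: changing the copy leaves the original alone.
copy_s3 <- account
copy_s3$balance <- 0

# Reference class: several names can refer to one underlying object.
Account <- setRefClass(
  "Account",
  fields = list(balance = "numeric"),
  methods = list(
    deposit = function(amount) {
      balance <<- balance + amount   # modifies the object in place
      invisible(.self)
    }
  )
)
a <- Account$new(balance = 100)
b <- a          # no copy is made; b refers to the same object as a
b$deposit(50)   # the change is visible through a as well
```

After running this, account$balance is still 100 while a$balance is 150, which is the practical difference between value semantics and reference semantics.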

Options for teaching R to beginners: a false dichotomy?

July 27, 2017

I've been reading David Robinson's excellent blog entry "Teach the tidyverse to beginners" (http://varianceexplained.org/r/teach-tidyverse), which argues that a tidyverse approach is the best way to teach beginners. He summarizes two competing curricula: 1) "Base R first": teach syntax such as $ and [[]], built in functions like ave() and tapply(), and use base graphics; 2) "Tidyverse first": start from scratch with...

Divided Differences Method of Polynomial Interpolation

July 27, 2017

Part of 6 in the series Numerical Analysis. The divided differences method is a numerical procedure for interpolating a polynomial given a set of points. Unlike Neville’s method, which is used to approximate the value of an interpolating polynomial at a given point, the divided differences method constructs the interpolating polynomial...
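
As a rough illustration of the method, the sketch below computes Newton divided-difference coefficients in base R and evaluates the resulting polynomial. The function names divided_differences and newton_eval are hypothetical and the sample points are made up; this is not the code from the linked post.

```r
# Newton divided-difference coefficients for the points (x[i], y[i]).
# On return, coef[k] holds f[x_1, ..., x_k].
divided_differences <- function(x, y) {
  n <- length(x)
  stopifnot(n == length(y), n >= 1, !anyDuplicated(x))
  coef <- y
  if (n > 1) {
    for (j in 2:n) {
      # Work from the end so lower-order values are still available.
      for (i in n:j) {
        coef[i] <- (coef[i] - coef[i - 1]) / (x[i] - x[i - j + 1])
      }
    }
  }
  coef
}

# Evaluate the Newton-form polynomial at the points t (Horner-style nesting).
newton_eval <- function(coef, x, t) {
  n <- length(coef)
  p <- rep(coef[n], length(t))
  if (n > 1) {
    for (k in (n - 1):1) {
      p <- p * (t - x[k]) + coef[k]
    }
  }
  p
}

# Made-up sample: three points taken from f(t) = t^2.
xs <- c(1, 2, 4)
ys <- xs^2
cf <- divided_differences(xs, ys)
value_at_3 <- newton_eval(cf, xs, 3)
```

Because three points determine a quadratic uniquely, the interpolant here reproduces t^2 exactly, so evaluating it between the nodes is a convenient sanity check.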

RStudio meets MilanoR – Presentations, photos and video!

July 27, 2017

Hello R-users, On June 29th we had the great pleasure of hosting RStudio in Milano at Impact Hub. It went very well, with great participation; thank you all! This post is just to give the materials to those of you who could not make it or just want to go through it again...

Quick Way of Installing all your old R libraries on a New Device

July 26, 2017

I recently bought a new laptop and began installing essential software all over again, including R of course! And I wanted all the libraries that I had installed on my previous laptop. Instead of installing libraries one by one all over again, I did the following: Step 1: Save a list of packages installed in …
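
The excerpt is cut off before the steps are spelled out, so the following is only one plausible way to do this in base R, not necessarily the author's. The file name and the use of tempdir() are placeholders; in practice the saved file would be copied from the old machine to the new one.

```r
# On the old machine: record the names of all installed packages.
# The file location is a placeholder chosen for this sketch.
pkg_file <- file.path(tempdir(), "installed_packages.rds")
old_pkgs <- rownames(installed.packages())
saveRDS(old_pkgs, pkg_file)

# On the new machine: load the list and work out what is missing.
wanted <- readRDS(pkg_file)
to_install <- setdiff(wanted, rownames(installed.packages()))

# Install only the missing packages (requires a configured CRAN mirror).
if (length(to_install) > 0) {
  install.packages(to_install)
}
```

Packages that were installed from sources other than CRAN would not be found by install.packages() and would need to be handled separately.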

How Virtual Tags have transformed SCADA data analysis

July 26, 2017

This article describes how to use Virtual tags to analyse SCADA data. Virtual tags provide context to SCADA or Historian data by combining information from various tags with metadata about these tags. The post How Virtual Tags have transformed SCADA data analysis appeared first on The Devil is in the Data.

tidyr::spread() and dplyr::rename_at() in action

I was recently confronted with a situation that required going from a long dataset to a wide dataset, but with a small twist: there were two datasets, which I had to merge into one. You might wonder what kinda crappy twist that is, right? Well, let’s take a look at the data: data1; data2 ## # A tibble: 20 x 4 ##...
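
For readers who have not met the long-to-wide operation before, here is a small base R sketch of the general idea: stack two long data sets, then spread one column into several. It uses stats::reshape() as a stand-in for tidyr::spread(), and the contents of data1 and data2 below are invented for illustration; they are not the data from the post.

```r
# Two small long-format data sets (made-up values for illustration).
data1 <- data.frame(id = c(1, 1, 2, 2),
                    variable = c("a", "b", "a", "b"),
                    value = c(10, 20, 30, 40))
data2 <- data.frame(id = c(3, 3),
                    variable = c("a", "b"),
                    value = c(50, 60))

# Stack the two long data sets, then spread `variable` into columns.
long <- rbind(data1, data2)
wide <- reshape(long, idvar = "id", timevar = "variable", direction = "wide")
```

The result has one row per id and one column per level of variable, named value.a and value.b; renaming such columns in bulk is the kind of job dplyr::rename_at() is aimed at.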
