Language Difficulty and Diversity

June 17, 2015

*For R users interested in the code rather than the post, a markdown file is available on GitHub. Thanks to Zuguang Gu and Bob Rudis for the 'circlize' and 'waffle' packages, respectively. I've been studying Arabic for about 10 months no...

licorice: plot Likert-like data

June 16, 2015

I wanted to create a nice visualization from a survey data set. I quickly stumbled upon the likert package (go check it out). I did, however, have some trouble getting it to work the way I wanted. Therefore, I made a quick implementation of my own that y...

Shiny 0.12: Interactive Plots with ggplot2

June 16, 2015

Shiny 0.12 has been released to CRAN! Compared to version 0.11.1, the major changes are: interactive plots with base graphics and ggplot2, and a switch from RJSONIO to jsonlite. For a full list of changes and bugfixes in this version, see the NEWS file. To install the new version of Shiny, run: install.packages(c("shiny", "htmlwidgets")). htmlwidgets is not

Pairwise-complete correlation considered dangerous

June 16, 2015

by B. W. Lewis. This note warns about potentially misleading results when using the use=pairwise.complete.obs and related options in R’s cor and cov functions. Pitfalls are illustrated using a very simple pathological example followed by a brief list of alternative ways to deal with missing data and some references about them. Known unknowns: R includes excellent facilities for handling...

How to place titles in lattice plots

June 16, 2015

I like the Economist theme in the latticeExtra package. It produces nice-looking charts that mimic the design of the weekly newspaper, such as in this example: For some time I wondered how I could put the title of my lattice plots into the top left corner as well (by default titles are centred). Reviewing...

North American seminars: June 2015

June 15, 2015

For the next few weeks I am travelling in North America and will be giving the following talks. 19 June: Southern California Edison, Rosemead CA. “Probabilistic forecasting of peak electricity demand”. 23 June: International Symposium on Forecasting, Riverside CA. “MEFM: An R package for long-term probabilistic forecasting of electricity demand”. 25 June: Google, Mountain View, CA. “Automatic algorithms

Comparing #rstats and #pdf15 intraday hashtag streams

June 15, 2015

This post is a lecture for IS624 Predictive Analytics, which is part of the CUNY Master’s program in Data Analytics.

Visualization and Analysis of Reddit’s "The Button" Data

June 15, 2015

Introduction: People are weird. And if there's anything that's greater collective proof of this fact than Reddit, you'd be hard-pressed to find it. I tend to put Reddit in the same bucket as companies like Google, Amazon and Netflix, where they have enoug...

A step by step (screenshots) tutorial for upgrading R on Windows

June 15, 2015

tl;dr: If you are running R on Windows, you can easily upgrade to the latest version of R using the installr package. Simply run the following code:

    # installing/loading the latest installr package:
    install.packages("installr"); library(installr) # install+load installr
    updateR() # updating R.

Running “updateR()” will detect if there is a new R version available, and …

metricsgraphics 0.8.5 is now on CRAN!

June 15, 2015

I’m super-pleased to announce that the Benevolent CRAN Overlords accepted the metricsgraphics package into CRAN over the weekend. Now, you no longer need to rely on github/devtools to use MetricsGraphics.js charts from your R scripts. If you’re not familiar with htmlwidgets, take a look at the official site for them. To make it easier to

Plotting spatial neighbors in ggmap

June 15, 2015

The R package spdep has great utilities to define spatial neighbors (e.g. dnearneigh, knearneigh, with a nice vignette to boot), but the plotting functionality is aimed at base graphics. If you’re hoping to plot spatial neighborhoods as line segments in ggplot2, or ggmap, you’ll need the neighborhood data to be stored in a data frame. So, to...

Shiny Wool Skeins

June 15, 2015

Chaos is not a pit: chaos is a ladder (Littlefinger in Game of Thrones). Some time ago I wrote this post to show how my colleague Vu Anh translated one of my experiments into Shiny, opening my eyes to an amazing new world. I am very proud to present you the first Shiny experiment entirely …

Connect R to Bloomberg with the RBlpapi package

June 15, 2015

For anyone who works with financial data and has access to a Bloomberg terminal, there is a new R package to interface to Bloomberg data services: RBlpapi. (If you had searched for an R connection to Bloomberg you wouldn’t have found this one — Bloomberg is happy to have software that connects to its public API, but not to...

Shiny App for the Wittgenstein Centre Population Projections

June 15, 2015

A few weeks ago a new version of the Wittgenstein Centre Data Explorer was launched. The data explorer is intended to disseminate the results of a recent global population projection exercise which uniquely incorporates level of education (as well as age …

Introducing passivetotal – R Package To Work With the PassiveTotal API

June 14, 2015

As a precursor to releasing Episode 18 of DDSec Podcast, we’re releasing a really basic R package to interface with the PassiveTotal API. We asked Brandon Dixon to be on the podcast to talk about his new visualization for users of PassiveTotal, which is a “threat research platform created for analysts, by analysts.” PT...

Are These Losses from The Same Distribution?

June 14, 2015

In Advanced Measurement Approaches (AMA) for Operational Risk models, the bank needs to segment operational losses into homogeneous segments known as “Unit of Measures (UoM)”, which are often defined by the combination of lines of business (LOB) and Basel II event types. However, how do we support whether the losses in one UoM are statistically

How to use SparkR within Rstudio?

June 14, 2015

Setting up Spark and SparkR is quite easy (assuming you are running v1.4): just grab one of the pre-built binaries and unzip it to a folder. There is also a shell script to start SparkR from the command line. The documentation suggests putting the following lines Sys....

Mimicking a Google Form with a Shiny app

June 14, 2015

In this post we will walk through the steps required to build a Shiny app that mimics a Google Form. It will allow users to submit responses to some input fields, save their data, and allow admins to view the submitted responses. Like many of my other posts, it may seem lengthy, but that’s only because I like...

Parallel and a new laptop

June 14, 2015

I am thinking about a new laptop. For one thing, a 1366*768 resolution just seems to get impractically small. Secondly, faster computations, more memory. Regarding CPU speed, my current laptop has a lowly Celeron 877. From what I see at my computer's activ...

The Overlap Coefficient

June 14, 2015

I'm currently working with a client who is researching the best covariance matrices for financial time series. Specifically, they are looking at what best describes 1-month out-of-sample distributions. They are not concerned with the means, just the variance. A paper on the subject from the Journal of Portfolio Management -- "A Test of Covariance Matrix Forecasting...

HTTPS for CRAN: how and why

June 13, 2015

R gained some basic support for https in version 3.2.0 (see NEWS) via the method = "libcurl" argument in base functions download.file and url. The global option download.file.method is used to make this the default. Unfortunately the implementation has a few limitations: there is no way to set request options (authentication, proxy,...

pkgKitten 0.1.3: Still creating R Packages that purr

June 13, 2015

A new release, now at version 0.1.3, of pkgKitten arrived on CRAN this morning. The main change is (optional) support of the excellent whoami package by Gabor which allows us to fill in the Author: and Maintainer: fields of the DESCRIPTION file with ...

SAS vs R? The right answer to the wrong question?

June 13, 2015

For a long time I tracked a discussion on LinkedIn that consisted of various opinions about using SAS vs R. Some people can take this very personally. Recently there was an interesting post at the DataCamp blog addressing this topic. They also prov...

Hadley Wickham’s Master R Developer Workshop – Washington DC registration is open

June 12, 2015

“Master” R in Washington DC this September! Join RStudio Chief Data Scientist Hadley Wickham at the AMA – Executive Conference Center in Arlington, VA on September 14 and 15, 2015 for this rare opportunity to learn from one of the R community’s most popular and innovative authors and package developers. It will be at least another

SparkR: Distributed data frames with Spark and R

June 12, 2015

R is now integrated with Apache Spark, the open-source cluster computing framework. The Databricks blog announced this week that yesterday's release of Spark 1.4 would include SparkR, "an R package that allows data scientists to analyze large datasets and interactively run jobs on them from the R shell". The SparkR 1.4 announcement led with the news: Spark 1.4 introduces...

ggtree with funny fonts

June 12, 2015

showtext is a neat solution for using various types of fonts in R graphs, and it makes it easy to use funny fonts. With showtext, we can draw a phylogenetic tree with different types of fonts, even with symbolic/icon fonts.

Wanted: A Perfect Scatterplot (with Marginals)

June 11, 2015

We saw this scatterplot with marginal densities the other day, in a blog post by Thomas Wiecki: The graph was produced in Python, using the seaborn package. Seaborn calls it a “jointplot;” it’s called a “scatterhist” in Matlab, apparently. The seaborn version also shows the strength of the linear relationship between the x and y …

Intraday time series analysis of the #rstats hashtag on Twitter

June 11, 2015

This post is a lecture for IS624 Predictive Analytics, which is part of the CUNY Master’s program in Data Analytics.

R man (chester) in the North…

June 11, 2015

By Andrew Vodden, Account Manager. The Manchester R user group met last Tuesday in a temporary venue next to the Manchester Art Gallery in the city centre. We had a full house and were delighted to welcome representatives from …
