# Monthly Archives: August 2015

## subview

August 30, 2015
I implemented a function, subview, in ggtree that makes it easy to embed a subplot in ggplot. An example is shown below: library(ggplot2) library(ggtree) dd

## RcppGSL 0.3.0

August 30, 2015
A new version of RcppGSL just arrived on CRAN. The RcppGSL package provides an interface from R to the GNU GSL using our Rcpp package. Following on the heels of an update last month, we updated the package (and its vignette) further. One of the key additions concerns memory management: Given that our proxy classes around the GSL vector and...

## “A 99% TVaR is generally a 99.6% VaR”

August 29, 2015
$\operatorname{VaR}_\alpha(F)=\inf\{x \in \mathbb{R}:F(x)\ge \alpha\}=F^{-1}(\alpha)$

Almost 6 years ago, I posted a brief comment on a sentence I found surprising at the time. It came from a report claiming that the expected shortfall at the 99% level corresponds quite closely to the value-at-risk at a 99.6% level, a claim inspired by a remark in the Swiss Experience report: expected shortfall on a 99% confidence level […] corresponds to approximately 99.6% to...
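As a rough check of how the two levels can line up, here is a worked example under an assumption that is not stated in the excerpt: losses are Gaussian with mean $\mu$ and standard deviation $\sigma$, and the TVaR is taken as the average of the VaR over all levels above $\alpha$. Writing $\Phi$ and $\varphi$ for the standard normal distribution function and density:

$$\operatorname{TVaR}_\alpha(F)=\frac{1}{1-\alpha}\int_\alpha^1 \operatorname{VaR}_u(F)\,du=\mu+\sigma\,\frac{\varphi\left(\Phi^{-1}(\alpha)\right)}{1-\alpha}$$

For $\alpha=0.99$ we have $\Phi^{-1}(0.99)\approx 2.326$ and $\varphi(2.326)\approx 0.0267$, so

$$\operatorname{TVaR}_{0.99}(F)\approx\mu+2.665\,\sigma$$

Matching this with $\operatorname{VaR}_\beta(F)=\mu+\sigma\,\Phi^{-1}(\beta)$ gives $\beta=\Phi(2.665)\approx 0.996$. This holds only in the Gaussian case sketched here; a heavier-tailed distribution would give a different matching level.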

## Getting started in applied statistics / datascience

August 29, 2015
Where to start? I was recently asked by a fellow manager from another organisation what direction they could give to a staff member interested in building skills in the whole “big data” thing. A search of the web shows hundreds if not thousands of sites and blog posts aimed at budding data scientists, but most of them...

## R plot: Comparison of Fairbanks, Alaska and Beijing, China air quality

August 28, 2015
Here’s an interesting R plot comparing a specific air pollution metric between Fairbanks, Alaska and Beijing, China. Right off the bat, Beijing obviously has far worse air quality, and more significantly, it is a chronic, daily problem. But it is used for comparison because we already know this is the case. In Fairbanks, while air

## Building Wordclouds in R

August 28, 2015
In this article, I will show you how to use text data to build word clouds in R. We will use a dataset containing around 200k Jeopardy questions. The dataset can be downloaded here (thanks to reddit user trexmatt for providing the dataset). We will require three packages for this: tm, SnowballC, and wordcloud. First,

## Bio7 2.3 Released!

August 28, 2015
28.08.2015 Following the useR conference 2015, with its fantastic workshops and presentations, where I also presented my software, I released a new version of Bio7. It has many improvements and new features inspired by the R conference and important for the next ImageJ conference 2015, where I will give a Bio7 workshop. For this

## Lightning strike trend prediction with GBM in R

August 28, 2015
Lightning activity is projected to increase with climate change. Lightning activity is interesting to model with stochastic gradient boosting (GBM: generalized boosted regression models/gradient boosting machine) in R. One use I have for this at SNAP is in the context of landscape fire modeling with SNAP’s ALFRESCO model. The simulations from the model can be

## A Closer Look at TAT Time Dependence

August 28, 2015
The Problem: We want to take a closer look at the time-dependence of turnaround times (TATs). In particular, we would like to see if there is a significant trend in TAT over time (improvement or deterioration), and we would like the data to inform us of slowdowns and potentially unexpected problems that occur throughout …

## Get to know Cortana Analytics: Workshop and webinars

August 28, 2015
Cortana Analytics Suite is Microsoft's cloud-based big data and advanced analytics suite. It includes a complete set of the services needed to build advanced analytics applications: data ingestion and management, data warehousing, advanced analytics, data visualization and solution frameworks. You can use Cortana Analytics to build applications using R, by incorporating services including Data Factory, HDInsights Hadoop,...