## An R Users’ Group in Davis

September 24, 2012

I’m excited to share that we’ve started a new R users’ group at UC Davis! Right now our main purpose is to run weekly 2-hour work/hack sessions where R users can get together to work through problems. More info here

## Example 10.3: Enhanced scatterplot with marginal histograms

September 24, 2012

Back in example 8.41 we showed how to make a graphic combining a scatterplot with histograms of each variable. A commenter suggested we change the R graphic to allow post-hoc plotting of, for example, lowess lines. In addition, there are further refinements to be made. In this R-only entry, we'll make the figure...

## Use GBIF and googleVis to Make Maps with Species Occurrence Data

September 24, 2012

This is a short follow-up to this posting. I will briefly show how to use the dismo and googleVis packages to plot species occurrences on an interactive Google map, like the one below (here is the R script).

## Computing kook density in R

September 24, 2012

Do you ever see strange lights in the sky? Do you wonder what really goes on in Area 51? Would you like to use your R hacking skills to get to the bottom of the whole UFO conspiracy? Of course, you would! UFO data from infochimps is the focus of a dat...

## qgraph version 1.1.0 and how to simply make a GUI using ‘rpanel’

September 24, 2012

Last week I updated the ‘qgraph’ package to version 1.1.0, now available on CRAN. Besides some internal changes (the self-loops in particular have been substantially improved), the most important change is the addition of a GUI, which can be … Continue reading →

## The fear-index: is the VIX efficient to be warned about high volatility? (Finance & Systematic Processus)

September 24, 2012

## Simple visually-weighted regression plots

September 24, 2012

There has recently been a lot of discussion of so-called “visually-weighted regression” plots. Folk hero Hadley Wickham suggests that such plots would be easy to implement with ggplot2, and so I have attempted to prove him right. The approa...

## New Zealand school performance: beyond the headlines

September 24, 2012

I like the idea of having data on school performance, not to directly rank schools—hard, to say the least, at this stage—but because we can start having a look at the factors influencing test results. I imagine the opportunity in … Continue reading →

## Variance targeting in garch estimation

September 24, 2012

What is variance targeting in garch estimation? And what is its effect? Previously: related posts are “A practical introduction to garch modeling”, “Variability of garch estimates”, and “garch estimation on impossibly long series”. The last two of these show the variability of garch estimates on simulated series where we know the right answer. In response to … Continue reading...

## Popularity indicator, with images (NFL)

September 23, 2012

It’s Friday night, there’s nothing good on TV, mmm conditions are perfect for shaggin about in R. So I’m an NFL fan, and (shameless plug) avid fan of this NFL podcast. They run their own pickem league which unless users … Continue reading →

## Universal portfolio, part 11

September 23, 2012

First, an apology: the links to the Universal Portfolio paper have stopped working. This is because the personal webpage of Thomas Cover at Stanford has been taken down, but fortunately the content has moved elsewhere. The new link is Universal ...

## Minimum Correlation Algorithm Example

September 23, 2012

Today I want to follow up on the Minimum Correlation Algorithm Paper post, show how to incorporate the Minimum Correlation Algorithm into your portfolio construction workflow, and explain why I like the Minimum Correlation Algorithm. First, let’s load the ETFs data set used in the Minimum Correlation Algorithm Paper using the Systematic

## Video: Analyzing Big Data using Oracle R Enterprise

September 23, 2012

Learn how Oracle R Enterprise is used to generate new insight and new value to business, answering not only what happened, but why ...

## Football model; plots and usage

September 23, 2012

After reading data, making a predictions display, and building a football data model, it is time to validate the model a bit more (regression plots) and put it to use. It appears that the regression plots in the car package were not ...

## Project Euler — problem 20

September 23, 2012

It’s been quite a while since my last post on Euler problems. Today a visitor posted his solution to the second problem, which encouraged me to keep solving these problems. Just for fun! 10! = 10 * 9 * … * 3 * 2 * 1 … Continue reading →
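
The excerpt stops before the solution, so the following is only a sketch and not the author's code. Project Euler problem 20 asks for the sum of the digits of 100!, a number too large for R's double precision to hold exactly. One possible approach, assumed here, stores the number as a vector of decimal digits and multiplies it digit by digit. The names `multiply_digits` and `factorial_digit_sum` are hypothetical.

```r
# Multiply a number stored as decimal digits (least significant digit first)
# by a small positive integer k, carrying as in long multiplication.
multiply_digits <- function(digits, k) {
  carry <- 0
  for (i in seq_along(digits)) {
    value <- digits[i] * k + carry
    digits[i] <- value %% 10
    carry <- value %/% 10
  }
  # Append any remaining carry as new high-order digits.
  while (carry > 0) {
    digits <- c(digits, carry %% 10)
    carry <- carry %/% 10
  }
  digits
}

# Sum of the decimal digits of n!, built up one factor at a time.
factorial_digit_sum <- function(n) {
  digits <- 1
  for (k in seq_len(n)) digits <- multiply_digits(digits, k)
  sum(digits)
}

# 10! = 3628800, so the digit sum is 3 + 6 + 2 + 8 + 8 + 0 + 0 = 27.
factorial_digit_sum(10)
```

Calling `factorial_digit_sum(100)` applies the same routine to the case the problem actually asks about.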

## The infamous apply function

September 23, 2012

For R beginners, the apply() function seems like a secret doorway into programming bliss. It seems so powerful, and yet, beyond reach. For those just starting out, examples of how to use apply() can really help with the intuition of how to h...
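
As a small illustration of the kind of example the post has in mind (this one is not taken from the post itself), `apply()` takes a matrix, a margin (1 for rows, 2 for columns), and a function to run over each slice. The matrix `m` below is made up for the purpose.

```r
# A 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

# Margin 1 applies the function to each row.
row_sums <- apply(m, 1, sum)   # 9 12

# Margin 2 applies the function to each column.
col_sums <- apply(m, 2, sum)   # 3 7 11

# Any function of a vector works, including an anonymous one.
row_ranges <- apply(m, 1, function(r) max(r) - min(r))   # 4 4
```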

## Text Analysis Tutorial on Spam Email in R

September 23, 2012

Hi everyone – I just wrote a tutorial on text analysis in R using the tm and wordcloud packages. Thought some of you here might be interested in it: text-analysis-75-925

## Spacing measures: heterogeneity in numerical distributions

Numerically-coded data sequences can exhibit a very wide range of distributional characteristics, including near-Gaussian (historically, the most popular working assumption), strongly asymmetric, light- or heavy-tailed, multi-modal, or discrete (e.g., count data).  In addition, numerically coded values can be effectively categorical, either ordered, or unordered.  A specific example that illustrates the range of distributional behavior often seen in a collection...

## Maximum likelihood estimates for multivariate distributions

September 22, 2012

Consider our loss-ALAE dataset, and, as in Frees & Valdez (1998), let us fit a parametric model in order to price a reinsurance treaty. The dataset is the following: `library(evd)`, `data(lossalae)`, `Z=lossalae`, `X=Z;Y=Z`. The first step can be to estimate marginal distributions, independently. Here, we consider lognormal distributions for both components: `Fempx=function(x) mean(X<=x)` ...
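
The excerpt is cut off after the empirical distribution function, so the sketch below is only a guess at the general idea and not the author's code. It uses simulated lognormal data as a stand-in for one margin (an assumption, so that it runs without the `evd` package), defines `Fempx` as in the excerpt, and compares it with a lognormal fitted by maximum likelihood. The names `mu_hat`, `sigma_hat`, and `x0` are hypothetical.

```r
set.seed(42)

# Simulated stand-in for one margin of the data (assumption, not lossalae).
X <- rlnorm(500, meanlog = 2, sdlog = 0.8)

# Empirical distribution function, as written in the excerpt.
Fempx <- function(x) mean(X <= x)

# Lognormal maximum likelihood estimates: mean and (biased) standard
# deviation of the log-observations.
mu_hat <- mean(log(X))
sigma_hat <- sqrt(mean((log(X) - mu_hat)^2))

# Compare the empirical and fitted distribution functions at one point.
x0 <- 10
c(empirical = Fempx(x0), fitted = plnorm(x0, mu_hat, sigma_hat))
```

If the two values differ markedly across a range of `x0`, the lognormal assumption for that margin deserves a second look.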

## Good programming practices in R

September 22, 2012

I write sloppy R scripts. It is a byproduct of working with a high-level language that allows you to quickly write functional code on the fly (see this post for a nice description of the problem in Python code) and the result of my limited formal training in computer programming. The lack of formal training

## KLEMS (1)

September 22, 2012

This post is actually a homework assignment I did. The data file contains input use, output, quantities, costs, and prices for total U.S. nondurable manufacturing for 1949-2001. The data are defined as follows: , , , , = Inputs corresponding to capital, labor, energy, materials, and purchased services, = represents total output, = respective quantity indexes, ...

## Core [still] minus one…

September 22, 2012

Another full day spent working with Jean-Michel Marin on the new edition of Bayesian Core (soon to be Bayesian Essentials with R!) and the remaining hierarchical Bayes chapter… I have reread and completed the regression and GLM chapters, sent to very friendly colleagues for a last round of comments. Now, I am essentially idle, waiting

September 22, 2012

This week, I got my hands on some agricultural trade data. Trade data are typically extremely dirty, so treat them with care when you get your hands on them. Lab-standard equipment is required. So I decided to look at how countries trade by plotting the ...

## PLS2 with "R"

September 22, 2012

I’ve been working these days on PLS2 calibrations in a chemometric software package called “Unscrambler”, with a data set called “jam”. I asked myself, “Can I develop PLS2 models with R?” I looked in the book “Introduction to Multivariate Statistical An...

## Power Analysis and the Probability of Errors

September 22, 2012

Power analysis is a very useful tool for estimating the statistical power of a study. It effectively allows a researcher to determine the sample size needed to obtain the required statistical power. Clients often ask (and rightfully so) what the sample size should be for a proposed project. Sample sizes end up being
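
As a rough illustration of the sample size question (not the post's own example), base R's `power.t.test()` solves for whichever of sample size, effect size, or power is left unspecified. The planning values below, a difference of 0.5 standard deviations, a 5% significance level, and 80% power, are assumptions chosen only for the sketch.

```r
# Hypothetical planning values for a two-sample t-test.
res <- power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)

# Required sample size per group, rounded up to a whole subject.
n_per_group <- ceiling(res$n)
n_per_group

# Power actually achieved with the rounded-up sample size.
achieved <- power.t.test(n = n_per_group, delta = 0.5, sd = 1,
                         sig.level = 0.05)$power
achieved
```

Leaving out `power` and supplying `n` instead, as in the second call, answers the reverse question of how much power a fixed budget buys.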

## Federal Register API/R Package Ideas?

September 21, 2012

The other day Critical Juncture put up an API for the Federal Register. I thought it would be great if there was a package that could use this API to download data directly into R (much like the excellent WDI package). This would make it easier to anal...

## Minimum Correlation Algorithm Paper

September 21, 2012

Over the summer I was busy collaborating with David Varadi on the Minimum Correlation Algorithm paper. Today I want to share the results of our collaboration: the Minimum Correlation Algorithm Paper, Back Test reports, and Supporting R code. The Minimum Correlation Algorithm is fast, robust, and easy to implement. Please add it to your portfolio construction toolbox and