Ordinal football

October 1, 2012
By

I've had a quick look at this article on R-bloggers. I don't think I've followed the whole exchange, but I believe they have discussed what models should/could be applied to estimate football scores (specifically, in this case they are using the Dut...

Designing real-world 3-D objects with R

October 1, 2012
By

The Maker Movement has led to the production of open-source 3-D printers and other manufacturing machines that allow hobbyists to design, create and produce real-world objects affordably. Now R user Ian Walker, in a post at the Psychological Statistics blog, shows how to use the R language to transform 3-D surfaces into real-world physical objects with a 3-D printer....

A Brief Tip on Generating Fractional Factorial Designs in R

October 1, 2012
By

A number of marketing researchers use the orthoplan procedure in SPSS to generate fractional factorial designs.  It is not surprising, then, that I received a number of questions concerning the recent article in the Journal of Statistical Software by Hideo Aizaki on “Basic Functions for Supporting an Implementation of Choice Experiments in R.”  To summarize their issues,...

Example 10.4: Multiple comparisons and confidence limits

October 1, 2012
By

A colleague is a devotee of confidence intervals. To him, CIs have the magical property that they are immune to the multiple comparison problem; in other words, he feels it's OK to look at a bunch of 95% CIs and focus on the ones that appear to exclude the null. This though...
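As a rough illustration of why unadjusted intervals are not immune to the multiple comparison problem, the sketch below computes the chance that at least one of m independent 95% intervals misses its true value. Independence, the choice m = 20, and the function name familywise_miss are assumptions made here for illustration; they are not taken from the original example.

```r
# Probability that at least one of m independent confidence intervals
# fails to cover its true parameter (assumes independence; m is hypothetical).
familywise_miss <- function(m, level = 0.95) {
  1 - level^m
}

m <- 20
familywise_miss(m)

# A Bonferroni-style adjustment raises the per-interval level so that the
# family-wise coverage is at least the nominal 95%.
adjusted_level <- 1 - (1 - 0.95) / m
adjusted_level
```

With 20 intervals, the chance of at least one miss is far above 5%, which is why scanning many CIs for the ones that exclude the null needs some adjustment.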

When Russell 2000 is Low Vol

October 1, 2012
By

Continuing in my exploration of the Russell 2000 (Russell 2000 Softail Fat Boy), I thought I would try to approach the topic with a low volatility paradox mindset.  Since 2005, beta of the Russell 2000 compared to the S&P 500 has exceeded 1.2 ...

Level fit summaries can be tricky in R

October 1, 2012
By

Model level fit summaries can be tricky in R. A quick read of model fit summary data for factor levels can be misleading. We describe the issue and demonstrate techniques for dealing with it. When modeling you often encounter what are commonly called categorical variables, which are called factors in R. Possible values of categorical variables...

Quick and Easy Subsetting

October 1, 2012
By

Public health datasets can be enormous and difficult to look at.  Often it is great to be able to only look at specific parts of the dataset, or to only run analysis on a specific part of a dataset.  There are two ways that you can subset a d...
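The excerpt is cut off before it names its two methods, so the sketch below simply shows two common base R approaches (bracket indexing and subset()), which may or may not be the ones the post goes on to describe. The patients data frame and its columns are made up for illustration.

```r
# Hypothetical example data; the column names are invented for this sketch.
patients <- data.frame(
  id  = 1:6,
  age = c(34, 61, 47, 72, 29, 55),
  sex = c("F", "M", "F", "F", "M", "M"),
  stringsAsFactors = FALSE
)

# Approach 1: logical indexing with square brackets (rows where age >= 50).
older_bracket <- patients[patients$age >= 50, ]

# Approach 2: subset(), which evaluates the condition inside the data frame
# and can also pick out columns.
older_subset <- subset(patients, age >= 50, select = c(id, age))
```

Bracket indexing is generally the safer choice inside functions, while subset() tends to read more cleanly in interactive work.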

Rcpp 0.9.14

October 1, 2012
By

Another release of Rcpp has just appeared on CRAN and was just uploaded to Debian. It addresses yet another issue we had on OS X and should hopefully put the build issues to rest. Three new (vectorized) sugar functions were added, along with some ne...

Making random, equally-sized partitions

October 1, 2012
By

Sometimes, as with cross-validation, one needs to generate k partitions, each with an equal number of observations. There are probably an infinite number of ways this could be done in R, but the Gist below illustrates one way to do it in four lines, w...
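The Gist from the post is not reproduced in this excerpt. As a stand-in, here is a minimal sketch of one possible approach: repeat the fold labels 1..k until there are n of them, then shuffle. The helper name make_folds is hypothetical.

```r
# Assign n observations to k random folds of (near-)equal size.
# This is one possible approach, not the Gist from the original post.
make_folds <- function(n, k) {
  sample(rep(seq_len(k), length.out = n))
}

set.seed(1)  # arbitrary seed so the example is repeatable
folds <- make_folds(n = 100, k = 5)
table(folds)  # 100 observations over 5 folds gives 20 per fold
```

When n is not a multiple of k, fold sizes differ by at most one observation.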

How to add a benchmark to a variance matrix

October 1, 2012
By

There is a good way and a bad way to add a benchmark to a variance matrix that will be used for optimization and similar operations.  Our examination sheds a little light on the process of variance matrix estimation in this realm. Role of benchmarks Investing Benchmarks are common in investment management.  It’s my opinion … Continue reading...

Fitting an ellipse to point data

September 30, 2012
By

Some time ago I wrote an R function to fit an ellipse to point data, using an algorithm developed by Radim Halíř and Jan Flusser in Matlab, and posted it to the r-help list. The implementation was a bit hacky, returning odd results for some data. A couple of days ago, an...

An R-based Research Notebook – Test

September 30, 2012
By

The following are from Tom Torsney-Weir’s blog (here’s the repo). The post includes a simple MathJax and R example, a more complex example with caching, another plotting example using ggplot, and an example using googleVis that builds a table with embedded links.

Working with Bipartite/Affiliation Network Data in R

September 30, 2012
By

Data can often be usefully conceptualized in terms of affiliations between people (or other key data entities). It might be useful to analyze common group membership, common purchasing decisions, or common patterns of behavior. This post introduces bipartite/affiliation network data and provides … Continue reading →

Football, an ordinal model

September 30, 2012
By

On September 19th, flo2speak remarked under a post that his/her experience is that ordinal models had better performance. That seems reason enough to try, so there we are. In examining this type of model it is found that more complex models can be...

September 30, 2012
By

Here is a little code snippet that shows how to do two things: (1) use the Google Maps API to resolve place names into lat-long coordinate pairs, and (2) plot R data frames that contain lat-long data (for example, from step 1) onto Google Maps for quick visualization using the googleVis package. The embedded map looks a little wonky here but it looks...

Quantifying student feedback using Org mode and R

September 30, 2012
By

As the term has progressed, my LSM2241 lectures are getting more consistent. I’m aiming to use 45 slides for what is officially a two hour lecture, although in reality it lasts about 90 minutes. We take a break at about 50 … Continue reading →

September 29, 2012
By

I got very excited about making a network diagram of my Facebook network using Gephi (https://gephi.org/) and submitted my first assignment for the Social Network Analysis course on https://www.coursera.org/. It's the middle of the night, so I will ...

Padding integers for use in filenames

September 29, 2012
By

If you’ve ever written code that generates a whole whack of files, you may have come across the following problem when processing them. Using a naming convention wherein files are numbered will gum up any ordering which is based on string sorting (ls, for example). What you end up with is something like this: Which
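The post's own solution is cut off in this excerpt. A minimal sketch of one common fix is to zero-pad the numbers with sprintf() so that string order matches numeric order. The file name pattern and the width of three digits are assumptions chosen for illustration.

```r
# Zero-pad integers to a fixed width so string sorting matches numeric order.
ids <- c(1L, 2L, 10L, 100L)

unpadded <- paste0("file_", ids, ".txt")
padded   <- sprintf("file_%03d.txt", ids)  # %03d: at least 3 digits, zero-filled

# String sorting puts 10 and 100 ahead of 2 when names are not padded...
sort(unpadded)

# ...whereas padded names sort in numeric order.
sort(padded)
```

The width should be chosen from the largest expected number, for example nchar(max(ids)), so that no name outgrows the padding.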

Merging Dataframes by Partly Matching String

September 29, 2012
By

The latest posting by Tony Hirst sparked my attention because I was thinking about a very similar issue recently. I was also fiddling around with agrep and adist until I realised that for this very issue matching of substrings is not as important as matching multiple words. With this different approach I quite easily matched all but 3...

Weekend Reading – Gold in October

September 28, 2012
By

I recently came across the “An early Halloween for gold traders” article by Mark Hulbert. I have discussed this type of seasonality analysis in my presentation at R/Finance this year. It is very easy to run the seasonality analysis using the Systematic Investor Toolbox. This confirms that October has been historically bad for Gold, but

Browse the in-development R sources at GitHub

September 28, 2012
By

As an open-source project, the R source code has always been available to download from the R-project website. You can find source code for the latest released version here, and for the changing-daily new version in progress (R-devel) here. But if you don't have the R sources handy, and just want to check on the contents of a file...

Second Milano R net meeting

September 28, 2012
By

The second Milano R net meeting took place on September 27. More than thirty R users joined both the presentations session and the open bar. If you attended the meeting, please leave a comment in the page of the meeting. You … Continue reading →

Photos of the second Milano R net meeting

September 28, 2012
By

Photos of the second Milano R net meeting Milano; September 27, 2012

September 28, 2012
By

In our previous post, we used a quick-and-dirty method for ordering the axes on our heatmap. It has been pointed out to me that There is a Package for That (which is my nominee for a new slogan for R — not that it needs a slogan). seriation offe...

Presentations of the second Milano R net meeting

September 28, 2012
By

Welcome presentation Andrea Spanò, Partner at Quantide (download PDF, 3.0 MB) Introduction to the next Italian BioR event at PTP Andrea Pedretti, Parco Tecnologico Padano (download PDF, 0.2 MB) Applications of technical risk assessment in Food Industry by R Carlo … Continue reading →

Reading and Text Mining a PDF-File in R

September 27, 2012
By

I just added this R script, which reads a PDF file into R and does some text mining with it, to my GitHub repo.

3-D animation of the changing Antarctic ice sheet

September 27, 2012
By

Last month we shared a visualization showing the changing extent of Arctic sea-ice. This visualization by the multinational Commission for the Conservation of Antarctic Marine Living Resources (CCAMLR) switches the view to the Southern pole and takes the visualization to a whole new level, by animating it in 3-D: The amount of sea ice in the Southern Ocean surrounding...