analyze the area resource file (arf) with r

October 1, 2012

the arf is fun to say out loud.  it's also a single county-level data table with about 6,000 variables, produced by the united states health resources and services administration (hrsa).  the file contains health information and statistics fo...

Read more »

Where in the world is R and RStudio

October 1, 2012

Using the web logs collected when users download RStudio, we’ve prepared the following two maps showing where RStudio is being used, over the whole globe and just within the continental USA. Obviously this data is somewhat biased, as it reflects the number of downloads of RStudio, rather than the number of users of R (which

Read more »

Ordinal football

October 1, 2012

I've had a quick look at this article on R-bloggers; I don't think I've followed the whole exchange, but I believe they have discussed what models should/could be applied to estimate football scores (specifically, in this case they are using the Dut...

Read more »

Designing real-world 3-D objects with R

October 1, 2012

The Maker Movement has led to the production of open-source 3-D printers and other manufacturing machines that allow hobbyists to design, create and produce real-world objects affordably. Now R user Ian Walker, in a post at the Psychological Statistics blog, shows how to use the R language to transform 3-D surfaces into real-world physical objects with a 3-D printer....

Read more »

A Brief Tip on Generating Fractional Factorial Designs in R

October 1, 2012

A number of marketing researchers use the orthoplan procedure in SPSS to generate fractional factorial designs.  It is not surprising, then, that I received a number of questions concerning the recent article in the Journal of Statistical Software by Hideo Aizaki on “Basic Functions for Supporting an Implementation of Choice Experiments in R.”  To summarize their issues,...

Read more »

Example 10.4: Multiple comparisons and confidence limits

October 1, 2012

A colleague is a devotee of confidence intervals. To him, CIs have the magical property that they are immune to the multiple comparison problem; in other words, he feels it's OK to look at a bunch of 95% CIs and focus on the ones that appear to exclude the null. This though...

Read more »
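
The arithmetic behind the multiple-comparison concern in the excerpt above can be sketched in a few lines of R. This is only an illustration, not the post's own example: the count of 20 intervals and their independence are assumptions, and Bonferroni is shown as one common adjustment among several.

```r
# Assumed for illustration: k independent confidence intervals, each at 95%
k <- 20
level <- 0.95

# Chance that at least one interval excludes a true null value
p_any_miss <- 1 - level^k

# Bonferroni-adjusted per-interval level that keeps family-wise coverage >= 95%
bonf_level <- 1 - (1 - level) / k

round(c(p_any_miss = p_any_miss, bonf_level = bonf_level), 4)
```

Under these assumptions, picking out the intervals that "appear to exclude the null" is no safer than picking out the small p-values from the same set of tests.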

When Russell 2000 is Low Vol

October 1, 2012

Continuing in my exploration of the Russell 2000 (Russell 2000 Softail Fat Boy), I thought I would try to approach the topic with a low volatility paradox mindset.  Since 2005, beta of the Russell 2000 compared to the S&P 500 has exceeded 1.2 ...

Read more »

Level fit summaries can be tricky in R

October 1, 2012

Model level fit summaries can be tricky in R. A quick read of model fit summary data for factor levels can be misleading. We describe the issue and demonstrate techniques for dealing with it. When modeling you often encounter what are commonly called categorical variables, which are called factors in R. Possible values of categorical variables...

Read more »
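
The excerpt above is cut off before it names the exact issue, so the following is only a guess at one common way factor-level summaries mislead: with R's default treatment contrasts, each coefficient is a difference from a reference level rather than a level mean. The toy data and the choice of `relevel()` are assumptions made for illustration.

```r
# Toy data (made up for illustration): a three-level factor and a response
d <- data.frame(
  group = factor(rep(c("a", "b", "c"), each = 4)),
  y     = c(1, 2, 1, 2,  5, 6, 5, 6,  5, 6, 5, 6)
)

fit <- lm(y ~ group, data = d)

# Coefficients are differences from the reference level "a", not level means
coef(fit)

# Changing the reference level changes what each row of the summary compares
d$group <- relevel(d$group, ref = "b")
fit_b <- lm(y ~ group, data = d)
coef(fit_b)
```

Levels "b" and "c" have identical means here, yet whether level "c" looks important in the summary depends entirely on which level serves as the reference.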

How does my computer know what language I am using? – An approach of statistical learning (Language, Computer Science)

October 1, 2012

Quick and Easy Subsetting

October 1, 2012

Public health datasets can be enormous and difficult to look at.  Often it is great to be able to only look at specific parts of the dataset, or to only run analysis on a specific part of a dataset.  There are two ways that you can subset a d...

Read more »
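
The excerpt above is truncated before it names its two approaches, so the sketch below simply shows two common base R ways to subset a data frame, which may or may not be the ones the post has in mind. The data frame and its column names are made up for illustration.

```r
# Toy data frame standing in for a larger public health dataset (values made up)
d <- data.frame(
  id    = 1:6,
  state = c("NY", "CA", "NY", "TX", "CA", "NY"),
  age   = c(34, 51, 27, 45, 62, 19)
)

# Logical indexing with square brackets: rows first, then columns
ny_adults <- d[d$state == "NY" & d$age >= 21, c("id", "age")]

# The same selection written with subset()
ny_adults2 <- subset(d, state == "NY" & age >= 21, select = c(id, age))

ny_adults
all.equal(ny_adults, ny_adults2)
```

Bracket indexing is the safer choice inside functions, while `subset()` tends to read more naturally in interactive work.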

Rcpp 0.9.14

October 1, 2012

Another release of Rcpp has just appeared on CRAN and was just uploaded to Debian. It addresses yet another issue we had on OS X and should hopefully put the build issues to rest. Three new (vectorized) sugar functions were added, along with some ne...

Read more »

Making random, equally-sized partitions

October 1, 2012

Sometimes, as with cross-validation, one needs to generate k partitions, each with an equal number of observations. There are probably an infinite number of ways this could be done in R, but the Gist below illustrates one way to do it in four lines, w...

Read more »
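
The Gist referred to in the excerpt above is not reproduced here. As a stand-in, this is one minimal way to build k random partitions of equal size in base R; the values of `n` and `k` are assumptions, and this is not necessarily the approach the Gist takes.

```r
set.seed(1)   # only so the sketch is reproducible
n <- 20       # number of observations (assumed)
k <- 4        # number of partitions (assumed)

# Repeat the labels 1..k out to length n, then shuffle them
fold <- sample(rep(seq_len(k), length.out = n))

# Row indices belonging to each partition
parts <- split(seq_len(n), fold)

table(fold)
```

When `n` is not a multiple of `k`, `rep(..., length.out = n)` makes the partition sizes differ by at most one.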

How to add a benchmark to a variance matrix

October 1, 2012

There is a good way and a bad way to add a benchmark to a variance matrix that will be used for optimization and similar operations.  Our examination sheds a little light on the process of variance matrix estimation in this realm. Role of benchmarks Investing Benchmarks are common in investment management.  It’s my opinion … Continue reading...

Read more »

Fitting an ellipse to point data

September 30, 2012

Some time ago I wrote an R function to fit an ellipse to point data, using an algorithm developed by Radim Halíř and Jan Flusser in Matlab, and posted it to the r-help list. The implementation was a bit hacky, returning odd results for some data. A couple of days ago, an...

Read more »

An R-based Research Notebook – Test

September 30, 2012

The following are from Tom Torsney-Weir’s blog (here’s the repo): a simple MathJax and R example, a more complex example with caching, another plotting example using ggplot, and an example using googleVis that builds a table with embedded links.

Read more »

Working with Bipartite/Affiliation Network Data in R

September 30, 2012

Data can often be usefully conceptualized in terms of affiliations between people (or other key data entities). It might be useful to analyze common group membership, common purchasing decisions, or common patterns of behavior. This post introduces bipartite/affiliation network data and provides … Continue reading →

Read more »
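
To make the idea of affiliation data in the excerpt above concrete, here is a minimal base R sketch, assuming a small made-up incidence matrix of people and groups. It is not taken from the post; it only shows the standard matrix products that turn a two-mode table into one-mode networks.

```r
# Made-up affiliation (incidence) matrix: rows are people, columns are groups
A <- matrix(
  c(1, 1, 0,
    1, 0, 1,
    0, 1, 1,
    1, 1, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("ann", "bob", "cat", "dan"), c("g1", "g2", "g3"))
)

# Person-by-person projection: entry [i, j] counts groups shared by i and j
people <- A %*% t(A)

# Group-by-group projection: entry [i, j] counts members shared by groups i and j
groups <- t(A) %*% A

people
groups
```

The diagonal of each projection holds totals (groups per person, members per group), so it is usually zeroed out before treating the matrix as an adjacency matrix.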

Football, an ordinal model

September 30, 2012

On September 19th, flo2speak remarked under a post that his/her experience is that ordinal models had better performance. That seems reason enough to try, so there we are. In examining this type of model it is found that more complex models can be...

Read more »

Plot R Data With googleVis

September 30, 2012

Here is a little code snippet that shows how to do two things: (1) use the Google Maps API to resolve place names into lat-long coordinate pairs, and (2) plot R dataframes that contain lat-long data (for example from #1) onto Google Maps for quick visualization using the googleVis package. The embedded map looks a little wonky here but it looks...

Read more »

Quantifying student feedback using Org mode and R

September 30, 2012

As the term has progressed, my LSM2241 lectures are getting more consistent. I’m aiming to use 45 slides for what is officially a two hour lecture, although in reality it lasts about 90 minutes. We take a break at about 50 … Continue reading →

Read more »

my Facebook social network

September 29, 2012

I got very excited about making a network diagram of my Facebook network using Gephi (https://gephi.org/) and submitted my first assignment for the Social Network Analysis course on https://www.coursera.org/. It's the middle of the night, so I will ...

Read more »

Padding integers for use in filenames

September 29, 2012

If you’ve ever written code that generates a whole whack of files, you may have come across the following problem when processing them. Using a naming convention wherein files are numbered will gum up any ordering which is based on string sorting (ls, for example). What you end up with is something like this: Which

Read more »
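
The sorting problem described in the excerpt above, and the usual fix, can be sketched in base R as follows. The `file` prefix, the `.txt` extension, and the padding width of 3 are assumptions for illustration; the post's own code is not shown in the excerpt.

```r
# Unpadded names sort as strings, so "file10" lands before "file2"
ids <- c(1L, 2L, 10L, 100L)
unpadded <- paste0("file", ids, ".txt")
sort(unpadded)

# Zero-pad to a fixed width of 3 digits so string order matches numeric order
padded <- sprintf("file%03d.txt", ids)
sort(padded)
```

Choosing the width from the data, for example `nchar(max(ids))`, avoids hard-coding it when the number of files is not known in advance.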

Merging Dataframes by Partly Matching String

September 29, 2012

The latest posting by Tony Hirst sparked my attention because I was thinking about a very similar issue recently. I was also fiddling around with agrep and adist until I realised that for this very issue matching of substrings is not as important as matching multiple words. With this different approach I quite easily matched all but 3...

Read more »
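
The excerpt above does not show the author's code, so the following is only a rough sketch of what matching on shared words (rather than on substrings or edit distance) could look like in base R. The name lists and the `words()` helper are hypothetical.

```r
# Made-up name lists that refer to the same entities with different wording
a <- c("University of Oxford", "Kings College London", "Imperial College")
b <- c("Oxford University", "Imperial College London", "King's College, London")

# Hypothetical helper: split a string into lower-case words, dropping punctuation
words <- function(x) {
  x <- gsub("[^a-z ]", "", tolower(x))
  strsplit(x, " +")[[1]]
}

# Count shared words for every pair: rows follow a, columns follow b
count_shared <- function(x, y) length(intersect(words(x), words(y)))
shared <- sapply(b, function(y) sapply(a, count_shared, y = y))

# For each entry of a, take the entry of b with the most words in common
best <- b[apply(shared, 1, which.max)]
data.frame(a = a, best_match = best)
```

Ties go to the first candidate under `which.max()`, so a real merge would want a tie-break rule and a minimum number of shared words before accepting a match.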

Weekend Reading – Gold in October

September 28, 2012

I recently came across the “An early Halloween for gold traders” article by Mark Hulbert. I have discussed this type of seasonality analysis in my presentation at R/Finance this year. It is very easy to run the seasonality analysis using the Systematic Investor Toolbox. This confirms that October has been historically bad for Gold, but

Read more »

Browse the in-development R sources at GitHub

September 28, 2012

As an open-source project, the R source code has always been available to download from the R-project website. You can find source code for the latest released version here, and for the changing-daily new version in progress (R-devel) here. But if you don't have the R sources handy, and just want to check on the contents of a file...

Read more »

Second Milano R net meeting

September 28, 2012

The second Milano R net meeting took place on September 27. More than thirty R users joined both the presentations session and the open bar. If you attended the meeting, please leave a comment on the meeting page. You … Continue reading →

Read more »

Photos of the second Milano R net meeting

September 28, 2012

Photos of the second Milano R net meeting Milano; September 27, 2012

Read more »

Optimal seriation for your matrices

September 28, 2012

In our previous post, we used a quick-and-dirty method for ordering the axes on our heatmap. It has been pointed out to me that There is a Package for That (which is my nominee for a new slogan for R — not that it needs a slogan). seriation offe...

Read more »

Presentations of the second Milano R net meeting

September 28, 2012

Welcome presentation: Andrea Spanò, Partner at Quantide (download PDF, 3.0 MB). Introduction to the next Italian BioR event at PTP: Andrea Pedretti, Parco Tecnologico Padano (download PDF, 0.2 MB). Applications of technical risk assessment in Food Industry by R: Carlo … Continue reading →

Read more »

Reading and Text Mining a PDF-File in R

September 27, 2012

I just added an R script to my GitHub repo that reads a PDF file into R and does some text mining with it.

Read more »