Dynamical systems: Mapping chaos with R

July 13, 2012

Chaos. Hectic, seemingly unpredictable, complex dynamics. In a word: fun. I usually stick to the warm and fuzzy world of stochasticity and probability distributions, but this post will be (almost) entirely devoid of randomness. While chaotic dynamics are entirely deterministic, their sensitivity to initial conditions can trick the observer into seeing iid. In ecology, chaotic
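
The sensitivity to initial conditions mentioned above can be illustrated with the logistic map. This is a minimal sketch in base R, not code from the post itself: the choice of the logistic map, the parameter r = 4, the starting value 0.2, and the function name logistic_map are all assumptions made for illustration.

```r
# Logistic map: x[t+1] = r * x[t] * (1 - x[t]); r = 4 lies in the chaotic regime.
# (Map, parameter, and starting values are illustrative assumptions.)
logistic_map <- function(x0, r = 4, n = 100) {
  x <- numeric(n)
  x[1] <- x0
  for (t in seq_len(n - 1)) {
    x[t + 1] <- r * x[t] * (1 - x[t])
  }
  x
}

a <- logistic_map(0.2)
b <- logistic_map(0.2 + 1e-10)  # tiny perturbation of the starting value

# No random numbers are involved, yet the two trajectories drift apart:
# the gap starts near 1e-10 and grows to order one.
gap <- abs(a - b)
max_gap <- max(gap)
```

Plotting a and b against time (for example with plot(a, type = "l") and lines(b, col = "red")) shows the two series tracking each other at first and then separating completely.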

Examples and resources on association rule mining with R

July 13, 2012

by Yanchang Zhao, RDataMining.com. The technique of association rules is widely used for retail basket analysis, as well as in other applications, to find associations between itemsets and between sets of attribute-value pairs. It can also be used for classification … Continue reading →
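
To make the idea of a rule between itemsets concrete, here is a rough base R sketch of the two usual rule measures, support and confidence. The baskets are made-up toy data, and the helper names support and confidence are hypothetical; dedicated packages do this far more efficiently on real transaction data.

```r
# Toy baskets (made-up data, for illustration only).
baskets <- list(
  c("bread", "milk"),
  c("bread", "butter"),
  c("bread", "milk", "butter"),
  c("milk"),
  c("bread", "milk")
)

# Support: share of baskets that contain every item in `items`.
support <- function(items, baskets) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Confidence of the rule lhs -> rhs: support(lhs and rhs) / support(lhs).
confidence <- function(lhs, rhs, baskets) {
  support(c(lhs, rhs), baskets) / support(lhs, baskets)
}

sup_bread_milk  <- support(c("bread", "milk"), baskets)   # 3 of the 5 baskets
conf_bread_milk <- confidence("bread", "milk", baskets)   # 3 of the 4 bread baskets
```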

R for Ecologists: Making MATLAB-like Graphs in R

July 13, 2012

I’ve decided that my blog should become a brain dump for my experience/troubles/solutions to programming in R. I’ve met many people who want to learn but don’t have the time or patience to sit down and figure it out from … Continue reading →

1-Month Reversal Strategy

July 12, 2012

Today I want to show a simple example of the 1-Month Reversal Strategy. Each month we will buy 20% of losers and short sell 20% of winners from the S&P 500 index. The losers and winners are measured by prior 1-Month returns. I will use this post to set the stage for my next post
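
The selection step described above can be sketched in a few lines of base R. The tickers and returns are made up, and the equal, dollar-neutral weighting is an assumption; the post does not say how positions are sized.

```r
# Prior 1-month returns for ten hypothetical tickers (made-up numbers).
prior_ret <- c(A = -0.08, B = 0.03, C = 0.11, D = -0.02, E = 0.06,
               F = -0.12, G = 0.01, H = 0.09, I = -0.05, J = 0.04)

n_pick <- length(prior_ret) %/% 5          # one fifth (20%) of the universe

ranked  <- sort(prior_ret)                 # ascending: worst performers first
losers  <- names(head(ranked, n_pick))     # buy these
winners <- names(tail(ranked, n_pick))     # short these

# Assumed sizing: equal weights, +1/n_pick long and -1/n_pick short.
weights <- setNames(numeric(length(prior_ret)), names(prior_ret))
weights[losers]  <-  1 / n_pick
weights[winners] <- -1 / n_pick
```

In a monthly backtest this step would be repeated at each rebalance date with that month's prior returns.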

July 12, 2012

Conrad released version 3.2.4 of Armadillo yesterday. It contains a workaround for g++ 4.7.0 and 4.7.1 which have a regression triggered by the Armadillo codebase for small fixed-sized matrices. The corresponding RcppArmadillo package 0.3.2.4 arrived ...

Napa Valley wine tasting map: interactive version

July 12, 2012

Got some great reactions to the Napa Valley wine tasting map made with the ggmap package I posted on Monday. A couple of people asked if similar maps could be made for other wine regions (like Australia's Hunter Valley, or the Walla Walla region in Washington): provided you have a list of winery addresses, tweaks to the same R...

Using discrete-event simulation to simulate hospital processes

July 12, 2012

Discrete-event simulation is a very useful tool for simulating alternative scenarios for current or future business operations. Take the following case: patients of an outpatient diabetes clinic are complaining about long waiting times, which seems to have an adverse effect on patient satisfaction and patient retention.  Read more »
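
As a rough illustration of the waiting-time question, the simplest possible case is a single first-come, first-served server. The sketch below is not the model from the linked post: the one-doctor setup and the arrival and service times are invented for illustration.

```r
# Hypothetical clinic with one doctor; times in minutes (made-up inputs).
arrival <- c(0, 5, 10, 15)    # when each patient arrives
service <- c(7, 7, 7, 7)      # how long each consultation takes

n <- length(arrival)
start  <- numeric(n)
finish <- numeric(n)

for (i in seq_len(n)) {
  # A consultation starts once the patient has arrived AND the doctor is free.
  doctor_free <- if (i == 1) 0 else finish[i - 1]
  start[i]  <- max(arrival[i], doctor_free)
  finish[i] <- start[i] + service[i]
}

wait <- start - arrival       # waiting time per patient
```

Because consultations take longer than the gap between arrivals, each patient waits longer than the one before. An alternative scenario is then just a matter of changing arrival or service and rerunning.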

GenABEL: an annoying error after the import of PLINK data format

July 12, 2012

In the previous post we saw how convenient GenABEL can be for managing genotypic/phenotypic data. We introduced the import of genotypic data from an Illumina format file: > convert.snp.illumina(inf = "gen.illu", out = "gen.raw", strand = "file") … Continue reading →

Creating Williams designs with even number of products

July 12, 2012

A Williams design is a special Latin square with the additional property of first-order carry-over balance (each product is followed equally often by each other product). In R the package crossdes can be used to create them. > williams(4) ...
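
For readers who want to see the carry-over property directly, here is a minimal base R sketch of the textbook cyclic construction for an even number of products. It is not the crossdes implementation, and the function names williams_even and carryover_counts are hypothetical.

```r
# Williams design for an even number of products n (cyclic construction).
williams_even <- function(n) {
  stopifnot(n %% 2 == 0, n >= 2)
  # First row in 0-based labels: 0, 1, n-1, 2, n-2, ...
  first <- numeric(n)
  lo <- 1
  hi <- n - 1
  for (j in seq_len(n)[-1]) {
    if (j %% 2 == 0) {
      first[j] <- lo
      lo <- lo + 1
    } else {
      first[j] <- hi
      hi <- hi - 1
    }
  }
  # Each further row adds 1 (mod n) to the row above; shift to labels 1..n.
  t(sapply(0:(n - 1), function(k) (first + k) %% n)) + 1
}

# Count how often product a (row index) is immediately followed by product b.
carryover_counts <- function(design) {
  n <- ncol(design)
  counts <- matrix(0, n, n)
  for (i in seq_len(nrow(design))) {
    for (j in seq_len(n - 1)) {
      a <- design[i, j]
      b <- design[i, j + 1]
      counts[a, b] <- counts[a, b] + 1
    }
  }
  counts
}

d  <- williams_even(4)
cc <- carryover_counts(d)   # off-diagonal entries are all equal in a balanced design
```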

July 11, 2012

Last time, I posted some R code to help quickly launch many iButton Thermochron temperature dataloggers with the same mission parameters. The R code makes use of a publicly-available command line utility released by the iButton’s manufacturer, Maxim.  Of course, Maxim also has a command line utility for downloading the data from those iButtons that

A primer on R2OpenBUGS using the simple linear regression example.

July 11, 2012

I make using OpenBUGS fun (and easier)! I've been a BUGS, WinBUGS and OpenBUGS user for some time now (20 years and counting!). The combination of R and OpenBUGS using the R2OpenBUGS package allows the user to bring together data preparation...

Rcpp is smoking fast for agent-based models in data frames

July 11, 2012

In a previous post, I discussed different approaches to speeding up some loops in data frames. In particular, R data frames provide a simple framework for representing large cohorts of agents in stochastic epidemiological models, such as those representing disease … Continue reading →

Bridget Riley exhibition in London

July 11, 2012

The other day I saw a fantastic exhibition of work by Bridget Riley. Karsten Schubert, who is Riley's main agent, has some of her most famous and influential artwork from 1960 - 1966 on display, including the seminal Moving Squares from 1961. Photo of...

Health Care Costs – Part 2, "Unhealthy Things Not Related to the Problem"

July 11, 2012

Lighting Up Way back in the day, folks believed that smoking was not only cool but also completely safe. As Marcel Danesi states in his book Of Cigarettes, High Heels, and Other Interesting Things, Second Edition: An Introduction to Semiotics ...

In case you missed it: June 2012 Roundup

July 11, 2012

In case you missed them, here are some articles from June of particular interest to R users. The FDA goes on the record that it's OK to use R for drug trials. A review of talks at the useR! 2012 conference. Using the negative binomial distribution to convert monthly fecundity into the chances of having a baby in a...

Getting numpy data into R — Take Two

July 10, 2012

A couple of days ago, I posted a short Python script to convert numpy files into a simple binary format which R can read quickly. Nice, but still needing an extra file. Shortly thereafter, I found Carl Rogers' cnpy library, which makes reading and writing numpy files from C++ a breeze, and I quickly wrapped this up into a new package...

This is *huge*: SAScii package

July 10, 2012

http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html Q: How do you make a hairless primate? Answer 1: Take a hairy primate, wait a few million years and see if Darwin was right. Answer 2: Make them work i...

Importing public data with SAS instructions into R

July 10, 2012

Many public agencies release data in fixed-width ASCII format (FWF). But with the data all packed together without separators, you need a "data dictionary" defining the column widths (and metadata about the variables) to make sense of them. Unfortunately, many agencies make such information available only as a SAS script, with the column information embedded in a PROC...
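
To show why the column widths matter, here is a small sketch using base R's read.fwf. The file contents, the layout (2, 3 and 6 characters), and the column names are invented for illustration; with real agency data the widths would come from the data dictionary rather than being typed in by hand.

```r
# A tiny fixed-width file: no separators, so the widths must be supplied.
# Hypothetical layout: state (2 chars), age (3 chars), income (6 chars).
fwf_lines <- c("CA 34 52000",
               "NY 51 61000",
               "TX 27 48000")

path <- tempfile(fileext = ".txt")
writeLines(fwf_lines, path)

survey <- read.fwf(path,
                   widths = c(2, 3, 6),
                   col.names = c("state", "age", "income"),
                   stringsAsFactors = FALSE)
unlink(path)   # clean up the temporary file
```

Getting a single width wrong shifts every later column, which is why a machine-readable dictionary is so useful for wide public-use files.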

introduction to R: learning by doing (part 2: plots)

July 10, 2012

Let's go on with the second part of learning R by doing R (you will find the first part here). As we used vectors, matrices and loops in the first part, we will concentrate on graphics in this one. But first we will need data to plot: Sometimes you will need several plots in

simulation, an ubiquitous tool

July 10, 2012

(This article was first published on Xi'an's Og » R, and kindly contributed to R-bloggers) After struggling for quite a while on that AMSI public lecture talk, and dreading its loss with the problematic Macbook, I managed to complete a first draft last night in Adelaide, downloading a final set of images from the Web...

Visualizing Graphical Models

July 10, 2012

I'm anticipating presenting research of mine based on Bayesian graphical models to an audience that might not be familiar with them. When presenting ordinary regression results, there's already the sort of statistical sniper questions along the lines o...

SAS Beats R on July 2012 TIOBE Rankings

July 10, 2012

The TIOBE Community Programming Index ranks the popularity of programming languages, but from a programming language perspective rather than as analytical software (http://www.tiobe.com). It extracts measurements from blogs, entries in Wikipedia, books on Amazon, search engine results, etc. and combines them into a single index. … Continue reading →

2nd CFP: the 10th Australasian Data Mining Conference (AusDM 2012)

July 10, 2012

The Tenth Australasian Data Mining Conference (AusDM 2012) Sydney, Australia, 5-7 December 2012 http://ausdm12.togaware.com/ The Australasian Data Mining Conference has established itself as the premier Australasian meeting for both practitioners and researchers in data mining. This year’s conference, AusDM’12, co-hosted … Continue reading →

Data Mining In Excel: Lecture Notes and Cases

July 10, 2012

by Yanchang Zhao, RDataMining.com. It is a 270-page book on data mining with Excel. It can be downloaded as a PDF file at http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.83.1393&rep=rep1&type=pdf. Below is its table of contents. - Overview of the Data Mining Process - Data Exploration … Continue reading →

Sourcing Code from GitHub

July 10, 2012

In previous posts I described how to input data stored on GitHub directly into R. You can do the same thing with source code stored on GitHub. Hadley Wickham has actually made the whole process easier by combining the getURL, textConnection, and source commands into one function: source_url. This is in his devtools...

Package JM — version 1.0-0

July 10, 2012

(by Dimitris Rizopoulos) Dear R-users, I’d like to announce the release of version 1.0-0 of package JM (already available from CRAN) for the joint modeling of longitudinal and time-to-event data using shared parameter models. These models are applicable mainly in two settings. First, when the focus is on the survival outcome and we wish to account for the effect of an...

Review: Kölner R Meeting 6 July 2012

July 10, 2012

The second Cologne R user meeting took place last Friday, 6 July 2012, at the Institute of Sociology. Thanks to Bernd Weiß, who provided the meeting room, we didn't have to worry about the infrastructure, like we did at our first gathering. Again, we ...

Project Euler — problem 13

July 9, 2012

The 13th problem in Project Euler is a big-number problem: work out the first ten digits of the sum of the following one-hundred 50-digit numbers. Obviously, there are some limits in the machine representation of numbers. In R, 2^(-1074) is the smallest … Continue reading →
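
The representation limits mentioned above are easy to probe in base R, and one way around them is to add the numbers as digit strings. The sketch below is only one possible approach, not necessarily the one taken in the post, and the helper name add_digit_strings is hypothetical.

```r
# Doubles carry a 53-bit significand, so not every integer above 2^53 is representable.
precision_lost    <- (2^53 + 1) == 2^53   # TRUE: the +1 is lost
smallest_positive <- 2^(-1074) > 0        # TRUE: smallest positive (subnormal) double
underflows        <- 2^(-1076) == 0       # TRUE: too small, rounds to zero

# Schoolbook addition on decimal strings sidesteps the precision limit.
add_digit_strings <- function(a, b) {
  x <- rev(as.integer(strsplit(a, "")[[1]]))   # least significant digit first
  y <- rev(as.integer(strsplit(b, "")[[1]]))
  n <- max(length(x), length(y))
  x <- c(x, rep(0L, n - length(x)))            # pad the shorter number with zeros
  y <- c(y, rep(0L, n - length(y)))
  out <- integer(n + 1)
  carry <- 0L
  for (i in seq_len(n)) {
    s <- x[i] + y[i] + carry
    out[i] <- s %% 10L
    carry <- s %/% 10L
  }
  out[n + 1] <- carry
  # Drop leading zeros but always keep at least one digit.
  sub("^0+(?=.)", "", paste(rev(out), collapse = ""), perl = TRUE)
}

big_sum <- add_digit_strings("99999999999999999999", "1")
```

With the one hundred numbers stored as a character vector, Reduce(add_digit_strings, numbers) would give the full sum, and substr(..., 1, 10) its first ten digits.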