R video tutorial number 1
LEARN R BY JUST WATCHING
For this week, Decision Science News has created a video tutorial on how to get started using the R Language for Statistical Computing. (The tutorial is best viewed in your browser’s full-screen mode, try pressing F11 in Windows). R is free and open source, and constantly being improved upon by countless contributors worldwide. DSN highly recommends using R.
Topics covered include:
- Downloading and installing R in Windows
- The R graphical user interface
- Viewing the graphics demo
- Vectors and basic stats
- Simple plotting
The commands in the tutorial are:
demo(graphics)
x=c(1,2,3,4,5,6,7)
y=c(10,14,20,18,16,15,10)
x+y
z=c(x,y)
sum(y)
mean(y)
sd(y)
plot(x,y)
barplot(y,col=”lightgreen”)
Got that? Now try R video tutorial 2
See also The R Graph Gallery
Can’t view flash? Download movie. If you see no image under Windows, download the TSSC Codec.

GillespieSSA on CRAN
So it’s official – my R package GillespieSSA has been posted to the official list of packages on CRAN (The Comprehensive R Archive Network). Check it out.
Now back to my manuscript. Over the last few days I have been polishing my stochastic simulation algorithm manuscript. I am aiming to send it off to some eminent (and hopefully interested) academics today – just as a peek preview. I’ll probably let it ripen over the weekend and then (barring any unexpected events) I’ll submit it on monday.
Recent changes in the R package "arm" (applied regression and multilevel modeing)
GillespieSSA 0.2-0 is released
Just finished the new version of the GillespieSSA package. The tar ball of the new version is posted on its web page (here). I also submitted it to CRAN so in (due time) it should appear on the official R package list.
What’s new in this release? Lots of bug fixes. The Binomial tau-leap method has been completely rewritten (resulting in a more stable and faster SSA method), I have implemented a novel nu-tiling algorithm for the state change matrix which substantially simplifies the definition of certain types of systems having a large number of states and/or reaction channels. The nu-tiling is a new and still experimental algorithm and is still awaiting proper documentation (i.e. being described in a manuscript) down the road. If any one wishes to take advantage of it and gets stuck feel free to contact me. I also improved the ssa.plot() function making it more efficient and flexible when displaying very large time series (i.e. large number of states and/or many time points).
What is not new? Well, I am still only providing a unix tar ball. Packaging it for Windows seems to be a time consuming and painful process so am postponing the inevitable until I have a large block of free time available (like that is ever going to happen). Otherwise volunteers taking on this task are more than welcome to pitch in.
Implementing the stochastic simulation algorithm in R
Dear blog,
no I have not forgot you or the rest of the world. Sequestered in my humble cubicle I have been hammering away day and night for the last several weeks – hammering out a manuscript. Finally after what feels like an eternity I am almost there, i.e. getting close to submitting it. I just send a draft out for an in-lab review, so hopefully it is just a matter of some final polishing before I can get it out of my hair. If curiosity got the best of you, here’s the (draft) abstract.
Abstract
The deterministic dynamics of populations in continuous time is traditionally described using coupled, first-order ordinary differential equations. While this approach is accurate for large systems it is often inadequate for small systems where key species may be present in small numbers or where key reactions occur at a low rate. The stochastic simulation algorithm (SSA) is a procedure for generating time-evolution trajectories of finite populations in continuous time and has become the standard algorithm for these type of models. This article presents a simple to use and flexible framework for implementing the SSA using the high-level statistical computing language R. Using three biological models as examples, logistic growth, Rosenzweig-MacArthur predator-prey model, and Kermack-McKendrick SIRS metapopulation model, I show how the deterministic model can be formulated as a finite-population stochastic model within the framework of SSA theory and I give examples of it’s implementation in R. The example models are run using different SSA Monte Carlo methods, one exact method (Gillespie’s Direct method), and three approximate methods (Explicit, Binomial, and Optimized tau-leap methods). Comparison of simulation results from the different methods confirm that the time-evolution trajectories are indistinguishable while the approximate methods outperform the exact method by up to four orders of magnitude.
… yes, four orders of magnitude. I could hardly believe it myself when I saw it. For those of you not familiar with theoretical mumbojumbo, it means it is on the order of 10,000 faster. Actually, even that is an under estimate. The record I have is one of the accelerated methods running 63,000 times faster than the exact method (yes, that’s for exactly the same stochastic model)…, while still giving results that are indistinguishable from the results of the exact method.
Organization and R
Many R users seem to get themselves in a bit of a mess with R files and workspaces scattered across different directories. The R files themselves also get messy and hard to follow. So here is some advice on keeping organized with R:
- Try to keep code strictly indented based on the code structure such as loops, if statements, etc. Every left brace { should be followed by an extra level of indentation which continues until the matching right brace }. You should be able to quickly identify what lines are part of a loop, or are conditioned by an if statement, simply by the levels of indentation.
- Comment copiously. You need to be able to figure out what your code does in a year’s time.
- Have a single directory for each project. Within that, keep an R workspace, an R file containing the functions you’ve written, and one or more R files containing the code to read in the data, apply the functions to the data, plot some graphs, etc.
- Don’t have multiple versions of essentially the same code. If you are doing similar things to what you’ve done before, write a function to do it and call it when required.
- Have a main.R file which does all the analysis for the paper, chapter or report. It may simply consist of source lines such as
- Every graph and table to go into your written document should be created via code. Use savepdf() or saveeps() in the monash package for graphs, and xtable() in the xtable package for LaTeX tables. (For more complicated LaTeX tables, latex() in the Hmisc package is also useful.)
source(“readdata.R”)
source(“fitmodel.R”)
Then the whole project can be run by sourcing the main file. If you find an error in your data, or you get updated or revised data, it is then a simple matter of running main.R and all the graphs and analysis will be re-created.
Organization and R
Many R users seem to get themselves in a bit of a mess with R files and workspaces scattered across different directories. The R files themselves also get messy and hard to follow. So here is some advice on keeping organized with R:
- Try to keep code strictly indented based on the code structure such as loops, if statements, etc. Every left brace { should be followed by an extra level of indentation which continues until the matching right brace }. You should be able to quickly identify what lines are part of a loop, or are conditioned by an if statement, simply by the levels of indentation.
- Comment copiously. You need to be able to figure out what your code does in a year’s time.
- Have a single directory for each project. Within that, keep an R workspace, an R file containing the functions you’ve written, and one or more R files containing the code to read in the data, apply the functions to the data, plot some graphs, etc.
- Don’t have multiple versions of essentially the same code. If you are doing similar things to what you’ve done before, write a function to do it and call it when required.
- Have a main.R file which does all the analysis for the paper, chapter or report. It may simply consist of source lines such as
- Every graph and table to go into your written document should be created via code. Use savepdf() or saveeps() in the monash package for graphs, and xtable() in the xtable package for LaTeX tables. (For more complicated LaTeX tables, latex() in the Hmisc package is also useful.)
source(”readdata.R”)
source(”fitmodel.R”)
Then the whole project can be run by sourcing the main file. If you find an error in your data, or you get updated or revised data, it is then a simple matter of running main.R and all the graphs and analysis will be re-created.
Extract objects from a list
mylist=(list(a=1,b=2,c="string1",d=list("r"=2,"z"="string2")))
for(i in 1:length(mylist)){
##first extract the object value
tempobj=mylist[[i]]
##now create a new variable with the original name of the list item
eval(parse(text=paste(names(mylist)[[i]],"= tempobj")))
}
> print(a)
[1] 1
> print(b)
[1] 2
> print(c)
[1] "string1"
> print(d)
$r
[1] 2
$z
[1] "string2"
If there is a more elegant way to do this, please let me know in the comments.

