In my last post I plotted the randu dataset to show that all its points lie on 15 parallel planes. But I was not fully satisfied with the solution and decided to show this numerically. It can be done in four steps: identifying four points lying...
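One way to check the planes numerically, which may well differ from the four-step approach the post goes on to describe, is to rely on the RANDU recurrence x[k+2] = 6 x[k+1] - 9 x[k] (mod 2^31). Assuming each row of R's built-in `randu` data frame holds three successive outputs scaled to (0, 1), the combination 9x - 6y + z should be an integer up to rounding error, and each distinct integer corresponds to one plane. The name `plane_index` is made up for this sketch.

```r
# Sketch: check that every randu point lies on a plane 9x - 6y + z = k, k integer.
# Assumption: rows of datasets::randu are successive RANDU outputs scaled to (0, 1).
data(randu)
plane_index <- with(randu, 9 * x - 6 * y + z)

# Distance to the nearest integer; only rounding error in the stored values should remain
max(abs(plane_index - round(plane_index)))

# Distinct planes hit by the 400 points
sort(unique(round(plane_index)))
```

Because x, y and z all lie in (0, 1), the combination lies strictly between -6 and 10, so at most 15 integer values are possible.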

In yesterday's webinar, "New Features in Revolution R Enterprise 5.0 to Support Scalable Data Analysis", Sue Ranney demonstrated the features of the RevoScaleR big data analysis package included with Revolution R Enterprise. In the webinar, she showed how to use the rxImport function to import big data sets from SAS, SPSS or ODBC, how to use the rxDataStep function...

I gave a talk today on doing very basic phylogenetics in R, including getting sequence data, aligning sequence data, plotting trees, doing trait evolution stuff, etc. Please comment if you have code for doing Bayesian phylogenetic inference in R. ...

The below is taken from a work in progress: The Polya urn is a heuristic associated with Dirichlet process mixtures. We present the scheme in a modified format, using balloons instead of balls, where the probability of drawing a balloon from the urn is proportional to its volume. Balloons are preferred because their volume may
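As a rough illustration of the urn scheme, here is a minimal sketch of the standard Polya urn, not the author's modified version. It assumes every balloon is added with unit volume and that a new balloon is created with probability proportional to a concentration parameter `alpha`; the function name `polya_urn` and both of these choices are assumptions made for this example.

```r
# Sketch of a standard Polya urn (assumptions: unit volumes, concentration alpha).
# Returns the cluster label drawn at each of the n steps.
polya_urn <- function(n, alpha = 1) {
  labels <- integer(n)
  volumes <- numeric(0)  # total volume currently held by each balloon colour
  for (i in seq_len(n)) {
    # Existing colours are drawn in proportion to volume; the last slot is "new colour"
    probs <- c(volumes, alpha) / (sum(volumes) + alpha)
    pick <- sample.int(length(probs), 1, prob = probs)
    if (pick > length(volumes)) {
      volumes <- c(volumes, 1)             # start a new colour
    } else {
      volumes[pick] <- volumes[pick] + 1   # inflate an existing colour
    }
    labels[i] <- pick
  }
  labels
}

set.seed(1)
table(polya_urn(50, alpha = 2))
```

Larger values of `alpha` make new colours appear more often, which is the usual reading of the concentration parameter in a Dirichlet process.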

Converting HTML to plain text usually involves stripping out the HTML tags whilst preserving the most basic of formatting. I wrote a function to do this which works as follows (code can be found on github): The above uses an XPath approach to achieve its goal. Another approach would be to use a regular expression. These
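The regular-expression route mentioned above can be sketched in base R. This is not the author's function (which is not shown in this excerpt); `strip_html` is a hypothetical helper that simply deletes anything that looks like a tag and tidies the whitespace.

```r
# Sketch: crude regex-based tag stripping (hypothetical helper, base R only).
strip_html <- function(html) {
  text <- gsub("<[^>]+>", "", html)        # drop anything shaped like <...>
  text <- gsub("[[:space:]]+", " ", text)  # collapse runs of whitespace
  trimws(text)
}

strip_html("<p>Hello <b>world</b>!</p>")
# [1] "Hello world!"
```

A regular expression like this knows nothing about document structure, so it will not decode entities, skip script or style contents, or insert breaks between block elements; that is the usual argument for a parser-based approach such as XPath.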

Of course, a few days before I leave for a much needed vacation, USA Today released their updated NCAA coaching salary database. For sports junkies, there’s an unlimited number of analyses and visualizations that can be done on the data. I took a quick break from packing to condense the data to a csv and

During the final stage of the asset allocation process we have to decide how to implement our desired allocation. In many cases we will allocate capital to mutual fund managers who will invest money according to their fund’s mandate. Usually there is no perfect relationship between asset classes and fund managers. To determine the true

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full November edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. R Training from Hadley Wickham: The R guru (and author of ggplot2, plyr and several...

I don't know, of course, because the evidence at hand is based on my experience. But, I'll leave the reader to consider whether these observations generalize. Proponents of Bayesian statistical inference argue that Bayesian credible intervals are more intuitive than the frequentist confidence intervals, because the Bayesian inference is a probability statement about a parameter.

When looking for functions whose exact name is unknown:
# Functions related to "shrinkage" methods
help.search("shrinkage")
Package sos does a great job in finding functions:
install.packages("sos")
library(sos)
shrinkageResults <- findFn("shrinkage", maxPages = 1)
shrinkageResults # This opens a webpage in your browser with the results
The table in the webpage created above has sortable columns.

Reading data into R when dealing with column types and values that need to be considered as NA. Below are code snippets to introduce a few arguments of the read.csv function in R:
# Create sample data
strVals <- do.call("c", lapply(1:1000, function(x) paste(sample(letters, sample(5:20, 1)), collapse = "")))
miscVals <- sample(c("", "999", "—-", "MISS"), 100, replace = TRUE)
numVals <- rnorm(1000)
# Scenario 1 : Pure numeric and strings
dataTemp<-data.frame(numericVals
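Since the excerpt is cut off before the read.csv calls, here is a small self-contained sketch of the two arguments most relevant to this problem, `na.strings` and `colClasses`. The inline CSV text and the column names are invented for illustration and are not the post's sample data.

```r
# Sketch: treat sentinel values as NA and fix column types while reading.
# The CSV content and column names below are made up for this example.
csv_text <- "id,score,label\n1,2.5,abc\n2,999,MISS\n3,,def\n"

dat <- read.csv(
  text = csv_text,
  na.strings = c("", "999", "MISS"),                   # values to read as NA
  colClasses = c("integer", "numeric", "character")    # one class per column
)

str(dat)
is.na(dat$score)  # FALSE TRUE TRUE
is.na(dat$label)  # FALSE TRUE FALSE
```

Note that `na.strings` applies to every column, so a sentinel such as "999" would also be turned into NA if it appeared as a legitimate value elsewhere in the file.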

A reminder that Sue Ranney will be presenting the webinar New Features in Revolution R Enterprise 5.0 (Including RevoScaleR) to support Scalable Data Analysis tomorrow (Thursday) at 11AM Pacific time. To whet your appetite, here's another video demonstration of more of the new big data analysis features, including the rxDataStep function to preprocess a data set using R functions...

This is a follow-up to the post "Power of running world records". As suggested by Andrew, plotting running world records could benefit from a change of variables. More precisely, the use of different variables sheds light on a well-known sports result provided in a 2000 Nature paper by Sandra Savaglio and Vincenzo

You can write it as write.table(r.data.frame, "excel.file.xls", sep="\t", na="", row.names=F), which I can usually open in Excel just by clicking on it. Credit: http://tolstoy.newcastle.edu.au/R/help/05/04/3388.html

Here’s something I came across by accident, an R package called fgui which has the ability to automatically create a widget just by passing it a function with parameters, e.g.: The GUI produced from the code above looks like this: I love how easy that was to do, very cool, and useful too! The package

Inspired by this tutorial, I thought it would be nice to have access to the weather forecast directly from the R command line, for example for a personalized start-up message such as the one below: Weather summary for Trieste, Friuli-Venezia Giulia: The weather in Trieste is clear. The temperature is currently 14°C (57°F). Humidity: 63%. Fortunately,...

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

The Black-Litterman Model was created by Fischer Black and Robert Litterman in 1992 to resolve shortcomings of the traditional Markowitz mean-variance asset allocation model. It addresses the following two items: Lack of diversification of portfolios on the mean-variance efficient frontier. Instability of portfolios on the mean-variance efficient frontier: small changes in the input assumptions often lead to

Once you become addicted to chess game analysis, it becomes very easy to swamp yourself with questions regarding different aspects of the game. Testing out different hypotheses, like preference of mobility versus positional advantage, requires a bit of manual chess game mining, which could potentially be analyzed using R. With the help of websites like