My talk on doing phylogenetics in R

November 18, 2011
By

I gave a talk today on doing very basic phylogenetics in R, including getting sequence data, aligning sequence data, plotting trees, doing trait evolution stuff, etc. Please comment if you have code for doing Bayesian phylogenetic inference in R.  ...

Read more »

Why balloons are better than balls (in urn schemes)

November 18, 2011
By

The below is taken from a work in progress: The Polya urn is a heuristic associated with Dirichlet process mixtures. We present the scheme in a modified format, using balloons instead of balls, where the probability of drawing a balloon from the urn is proportional to its volume. Balloons are preferred because their volume may
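The urn scheme described above can be sketched in a few lines of R. This is only an illustration under stated assumptions, not the authors' scheme: the function name `simulate_balloon_urn`, the concentration parameter `alpha`, and the rule that a drawn balloon is inflated by exactly one unit are all hypothetical choices that reproduce the standard Polya urn.

```r
set.seed(42)

# Hypothetical sketch: each balloon has a volume, and a balloon is drawn
# with probability proportional to that volume. A brand-new balloon is
# created with probability proportional to alpha.
simulate_balloon_urn <- function(n_draws, alpha = 1) {
  volumes <- numeric(0)       # one entry per balloon currently in the urn
  picks <- integer(n_draws)   # index of the balloon chosen at each draw
  for (i in seq_len(n_draws)) {
    weights <- c(volumes, alpha)  # last slot stands for "new balloon"
    k <- sample.int(length(weights), size = 1, prob = weights)
    if (k > length(volumes)) {
      volumes <- c(volumes, 1)        # assumption: new balloons start at volume 1
    } else {
      volumes[k] <- volumes[k] + 1    # assumption: inflate by one unit per draw
    }
    picks[i] <- k
  }
  list(picks = picks, volumes = volumes)
}

urn <- simulate_balloon_urn(200, alpha = 2)
print(urn$volumes)
```

With unit increments the volumes are just ball counts; the appeal of balloons is that the increment need not be an integer.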

Read more »

htmlToText(): Extracting Text from HTML via XPath

November 18, 2011
By

Converting HTML to plain text usually involves stripping out the HTML tags whilst preserving the most basic of formatting. I wrote a function to do this which works as follows (code can be found on GitHub): The above uses an XPath approach to achieve its goal. Another approach would be to use a regular expression. These
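For comparison, the regular-expression approach mentioned above might look like the following base-R sketch. It is not the author's `htmlToText()` function; the sample `html` string is made up, and a regex like this is known to be fragile (it does not handle scripts, comments, or HTML entities).

```r
# Made-up sample input for illustration only.
html <- "<html><body><h1>Title</h1><p>Some <b>bold</b> text.</p></body></html>"

# Replace every tag with a space so that adjacent words do not run together.
text <- gsub("<[^>]+>", " ", html)

# Collapse runs of whitespace and trim both ends.
text <- trimws(gsub("\\s+", " ", text))

print(text)
```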

Read more »

FBS Coaches Avg. Salary

November 18, 2011
By

Of course, a few days before I leave for a much needed vacation, USA Today released their updated NCAA coaching salary database. For sports junkies, there’s an unlimited number of analyses and visualizations that can be done on the data. I took a quick break from packing to condense the data to a csv and

Read more »

Style Analysis

November 17, 2011
By

During the final stage of the asset allocation process we have to decide how to implement our desired allocation. In many cases we will allocate capital to mutual fund managers who will invest money according to their fund’s mandate. Usually there is no perfect relationship between asset classes and fund managers. To determine the true

Read more »

Spinner Doctor

November 17, 2011
By

The setup: Dan Meyer, a (former?) math teacher with some extraordinary ideas, has a nifty concept for teaching expected values: “So one month before our formal discussion of expected value, I’d print out this image, tack a spinner to it, … Continue reading →

Read more »

Revolution Newsletter: November 2011

November 17, 2011
By

The most recent edition of the Revolution Newsletter is out. The news section is below, and you can read the full November edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. R Training from Hadley Wickham: The R guru (and author of ggplot2, plyr and several...

Read more »

GEO2R: Web App to Analyze Gene Expression in GEO Datasets Using R

November 17, 2011
By

Gene Expression Omnibus is NCBI's repository for publicly available gene expression data with thousands of datasets having over 600,000 samples with array or sequencing data. You can download data from GEO using FTP, or download and load the data direc...

Read more »

Using neural network for regression

November 17, 2011
By

Artificial neural networks are commonly thought to be used just for classification because of the relationship to logistic regression: neural networks typically use a logistic activation function and output values from 0 to 1 like logistic regression. However, the worth … Continue reading →

Read more »

Bayesian vs. Frequentist Intervals: Which are more natural to scientists?

November 17, 2011
By

I don't know, of course, because the evidence at hand is based on my experience. But, I'll leave the reader to consider whether these observations generalize. Proponents of Bayesian statistical inference argue that Bayesian credible intervals are more intuitive than the frequentist confidence intervals, because the Bayesian inference is a probability statement about a parameter.

Read more »

Finding functions in R

November 17, 2011
By

When looking for functions whose exact name is unknown: # Functions related to "shrinkage" methods help.search("shrinkage") Package sos does a great job of finding functions: install.packages("sos") library(sos) shrinkageResults <- findFn("shrinkage", maxPages = 1) shrinkageResults # This opens a webpage in your browser with the results. The table in the webpage created above has sortable columns.
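As a complement to the calls quoted above, here is a small base-R sketch that needs no extra package and no internet access. The search terms are arbitrary examples chosen for illustration, not taken from the post.

```r
# apropos() matches object names on the search path by regular expression.
csv_readers <- apropos("^read\\.csv")
print(csv_readers)

# help.search() (shorthand: ??) searches the aliases, concepts and titles of
# help pages in installed packages and returns an object of class "hsearch".
hits <- help.search("linear model", package = "stats")
print(class(hits))
```

`apropos()` only sees packages that are already attached, whereas `help.search()` covers everything installed, so the two are worth trying in that order.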

Read more »

Missing values and column types when reading data into R

November 17, 2011
By

Reading data into R when dealing with column types and values that need to be treated as NA. Below are code snippets that introduce a few arguments of the read.csv function in R: # Create sample data strVals <- do.call("c",lapply(1:1000,function(x)paste(sample(letters,sample(5:20,1)),collapse=""))) miscVals <- sample(c("","999","---","MISS"),100,replace=T) numVals <- rnorm(1000) # Scenario 1 : Pure numeric and strings dataTemp<-data.frame(numericVals
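A compact sketch of the idea, assuming the arguments in question are `na.strings` and `colClasses` (the excerpt is cut off before it names them). The inline CSV and its column names are made up for illustration.

```r
# Made-up sample data with several placeholder values that should become NA.
csv_text <- paste(
  "id,score,label",
  "1,3.5,abc",
  "2,999,MISS",
  "3,---,xyz",
  "4,,",
  sep = "\n"
)

dat <- read.csv(
  text = csv_text,
  na.strings = c("", "999", "---", "MISS"),           # strings to read as NA
  colClasses = c("integer", "numeric", "character")   # force column types
)

str(dat)
```

Without `na.strings`, the `---` entry would prevent `score` from being read as numeric at all, which is why the two arguments are usually set together.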

Read more »

Webinar Tomorrow: What’s new in Revolution R Enterprise 5.0

November 16, 2011
By

A reminder that Sue Ranney will be presenting the webinar New Features in Revolution R Enterprise 5.0 (Including RevoScaleR) to support Scalable Data Analysis tomorrow (Thursday) at 11AM Pacific time. To whet your appetite, here's another video demonstration of more of the new big data analysis features, including the rxDataStep function to preprocess a data set using R functions...

Read more »

Power-laws: choose your x and y variables carefully

November 16, 2011
By

This is a follow-up to the post "Power of running world records". As suggested by Andrew, plotting running world records could benefit from a change of variables. More exactly, the use of different variables sheds light on a well-known sports result provided in a 2000 Nature paper by Sandra Savaglio and Vincenzo

Read more »

Update on Scary Derivatives

November 16, 2011
By

After reading Bloomberg’s article, JPMorgan Chase & Co. and Goldman Sachs Group Inc., among the world’s biggest traders of credit derivatives, disclosed to shareholders that they have sold protection on more than $5 trillion of debt globally. ...

Read more »

An easy way to write a data.frame to Excel

November 16, 2011
By

You can write it as write.table(r.data.frame, "excel.file.xls", sep="\t", na="", row.names=F), which I can usually open in Excel just by clicking on it. Credit: http://tolstoy.newcastle.edu.au/R/help/05/04/3388.html
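A runnable version of the call above, as a minimal sketch: the data frame `df` and the temporary file path are made up for illustration. Note that the result is tab-delimited text with an `.xls` extension rather than a true Excel workbook, so some Excel versions may show a format warning when opening it.

```r
# Made-up example data containing missing values.
df <- data.frame(id = 1:3, value = c(1.5, NA, 3), name = c("a", "b", NA))

# Write a tab-delimited file; NA becomes an empty cell, row names are dropped.
out_file <- file.path(tempdir(), "excel_file.xls")
write.table(df, out_file, sep = "\t", na = "", row.names = FALSE)

# Read it back to confirm the layout survives a round trip.
round_trip <- read.delim(out_file, na.strings = "")
print(round_trip)
```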

Read more »

Using SyntaxHighlighter and R Brush in Blogger

November 16, 2011
By

If you're thinking it is time to give the code examples in your blog a more readable look, you may follow this path and use SyntaxHighlighter. First thing: check the SyntaxHighlighter website for the basics.

Read more »

Performance measurement is about decisions

November 16, 2011
By

The return of a hypothetical fund was 17.9% in 2010. We want to know if that is good or bad. The benchmark method: The assets in the portfolio are constituents of the S&P 500, so we can compare our fund return to the return of the index. Figure 1: 2010 returns of: the fund and … Continue reading...

Read more »

fgui: Automatically Creating Widgets for Arguments of a Function – A Quick Example

November 16, 2011
By

Here’s something I came across by accident, an R package called fgui which has the ability to automatically create a widget just by passing it a function with parameters, e.g.: The GUI produced from the code above looks like this: I love how easy that was to do, very cool, and useful too! The package

Read more »

Lambert’s W function and the generalised logarithm

November 16, 2011
By

Yesterday I ran into an equation that was a sum of an exponential and a linear term: It doesn’t take long to figure out that there is no analytical solution, and so I set out to write some crappy numerical code. After wasting some time with a fixed point iteration that did not really work,
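To illustrate the kind of problem described, here is a hedged sketch. The post's actual equation is not shown in this excerpt, so the stand-in equation x + exp(x) = c, the constant `c_val`, and the helper `lambert_w0` are all assumptions made for illustration. For this stand-in, the solution can be written as x = c - W(exp(c)), where W is the principal branch of Lambert's W function.

```r
# Hypothetical helper: principal branch of Lambert's W for z > 0,
# computed by Newton iteration on w * exp(w) = z.
lambert_w0 <- function(z, tol = 1e-12, max_iter = 100) {
  stopifnot(z > 0)
  w <- log1p(z)  # starting guess at or above the root, so Newton descends to it
  for (i in seq_len(max_iter)) {
    step <- (w * exp(w) - z) / (exp(w) * (w + 1))
    w <- w - step
    if (abs(step) < tol) break
  }
  w
}

c_val <- 3  # arbitrary constant for the stand-in equation x + exp(x) = c

# Closed form via Lambert W.
x_w <- c_val - lambert_w0(exp(c_val))

# Cross-check with a generic root finder.
x_root <- uniroot(function(x) x + exp(x) - c_val,
                  interval = c(-10, 10), tol = 1e-12)$root

print(c(lambert = x_w, uniroot = x_root))
```

For a one-off calculation `uniroot()` alone is enough; the Lambert W form is mainly useful when the solution is needed as an expression in c.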

Read more »

Weather forecast and good development practices

November 16, 2011
By

Inspired by this tutorial, I thought that it would be nice to have the possibility to have access to weather forecast directly from the R command line, for example for a personalized start-up message such as the one below: Weather summary for Trieste, Friuli-Venezia Giulia: The weather in Trieste is clear. The temperature is currently 14°C (57°F). Humidity: 63%. Fortunately,...

Read more »

PhD defense on copulas

November 15, 2011
By

(This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers)

Read more »

Black-Litterman Model

November 15, 2011
By

The Black-Litterman Model was created by Fischer Black and Robert Litterman in 1992 to resolve shortcomings of the traditional Markowitz mean-variance asset allocation model. It addresses the following two items: (1) lack of diversification of portfolios on the mean-variance efficient frontier; (2) instability of portfolios on the mean-variance efficient frontier: small changes in the input assumptions often lead to

Read more »

First attempt at Chess Data Mining

November 15, 2011
By

Once you become addicted to chess game analysis, it becomes very easy to swamp yourself with questions regarding different aspects of the game. Testing out different hypotheses, like preference of mobility versus positional advantage, requires a bit of manual chess game mining, which could potentially be analyzed using R. With the help of websites like

Read more »

Landscape figures in Sweave

November 15, 2011
By

This post is a quick follow-up to my initial article on Sweave, adding a note on how to get a plot in landscape orientation to fill the whole page, plus a little example of using BibTeX. Just to clarify my … Continue reading →

Read more »

This One’s Personal: Sanford Koufax vs. Randy Johnson…pffft

November 15, 2011
By

I couldn’t let this one go. The conclusion drawn here by this author that Randy Johnson was “the best pitcher of all time” was not something I could allow to slip through the cracks. Johnson was awesome. Incredible to watch. … Continue reading →

Read more »

Announcing Revolution R Enterprise 5.0

November 15, 2011
By

We're proud to announce the latest update to the enhanced, commercial-grade distribution of R, Revolution R Enterprise 5.0. With each new release, Revolution R Enterprise adds more capabilities to open-source R, to make R users more productive, to improve performance of R programs, to support Big Data analytics, and to provide servers and APIs for enterprise deployment. New features...

Read more »

Example 9.14: confidence intervals for logistic regression models

November 15, 2011
By

Recently a student asked about the difference between confint() and confint.default() functions, both available in the MASS library to calculate confidence intervals from logistic regression models. The following example demonstrates that they yield d...
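The two functions can be compared on simulated data, as in the sketch below. It is not the example from the post: the data, seed, and coefficients are made up. What it relies on is that `confint()` on a `glm` fit uses profile likelihood, while `confint.default()` computes Wald intervals from the estimated standard errors.

```r
set.seed(1)

# Made-up logistic regression data.
x <- rnorm(100)
y <- rbinom(100, 1, plogis(0.5 + x))
fit <- glm(y ~ x, family = binomial)

# Profile-likelihood intervals (the method dispatched for glm objects).
profile_ci <- confint(fit)

# Wald intervals: estimate +/- normal quantile * standard error.
wald_ci <- confint.default(fit)

print(profile_ci)
print(wald_ci)
```

The Wald interval is symmetric around the estimate by construction; the profile-likelihood interval need not be.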

Read more »