Blue Jay and Scrub Jay : Using rvertnet to check the distributions in R

July 30, 2012

As part of my Google Summer of Code, I am also working on another package for R called rvertnet. This package is an R wrapper for the VertNet websites. VertNet is a distributed vertebrate database network consisting of FishNet2, MaNIS, HerpNET, and ORNIS. Of these, FishNet2, HerpNET and ORNIS currently have their v2 portals serving data. rvertnet has functions now to access

Read more »

Returns with negative net asset values

July 30, 2012

How are returns calculated when net asset value goes negative? Previously, in “A tale of two returns”, we highlighted the similarities and differences of log returns versus simple returns. Positive valuation: we create, in R, an example of net asset value at four times: > nav1 <- c(1000, 900, 950, 1010) > nav1 … Continue reading...

Read more »
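
As a rough illustration of the question raised in the excerpt above, the sketch below computes simple and log returns for the `nav1` vector quoted there. The second series, `nav2`, is a hypothetical example (not from the post) in which net asset value goes negative; it only shows that the log of a negative ratio is undefined in base R, and it is not the author's proposed solution.

```r
# Net asset values from the excerpt (all positive)
nav1 <- c(1000, 900, 950, 1010)

# Ratio of each value to the previous one
ratio1 <- nav1[-1] / nav1[-length(nav1)]
simple1 <- ratio1 - 1      # simple returns
log1 <- log(ratio1)        # log returns

# Hypothetical series (an assumption for illustration) that goes negative
nav2 <- c(1000, -100, 50)
ratio2 <- nav2[-1] / nav2[-length(nav2)]
simple2 <- ratio2 - 1                    # still computable, but hard to interpret
log2 <- suppressWarnings(log(ratio2))    # log of a negative ratio gives NaN

print(simple1)
print(log1)
print(simple2)
print(log2)
```

Log returns sum to the log of the overall value ratio when all values are positive, which is one reason they are popular; that property is lost as soon as a ratio becomes negative.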

unsupervised classification of a raster in R: the layer-stack or part one.

July 29, 2012

In my last post I explained how to use QGIS to build a layer stack of a Landsat scene. Because further research and experimentation only led to frustration, I decided to stick with software I know well: R. So download the needed layers here and open up your flavoured version of

Read more »

Extracting upstream regions of a RefSeq human gene list in R using Bioconductor

July 29, 2012

Suppose that you want to do local mapping of the upstream regions of given RefSeq IDs in a particular genome in R using Bioconductor. Download the script here. In this case, you may take a look at the Bioconductor AnnotationData Packages here: http:/...

Read more »

Community Detection in Networks with R

I mainly post this visualization because I think it’s pretty. It reminds me a little of the work of the famous Dutch painter Mondrian. The complete matrix can be found here. The plot is a heatmap of an adjacency matrix generated by a weighted dir...

Read more »

ScraperWiki in R

July 29, 2012

ScraperWiki describes itself as an online tool for gathering, cleaning and analysing data from the web. It takes a programming-oriented approach: users can implement ETL processes in Python, PHP or Ruby, share these processes with the community (or pay for privacy) and schedule automated runs. The software behind the service is open source, and there is...

Read more »

Hangman in R: A learning experience

July 28, 2012

I love when people take a sophisticated tool and use it to play video games. Take R for example. I first saw someone create a game for R at talk.stats.com. My friend Dason inspired me to more efficiently waste time … Continue reading →

Read more »

My New Book: Developing, Deploying and Debugging Multi-Armed Bandit Algorithms

July 28, 2012

I’m happy to announce that I’ve started writing a new book for O’Reilly, which will focus on teaching readers how to use Multi-Armed Bandit Algorithms to build better websites. My hope is that the book can help web developers build up an intuition for the core conundrum facing anyone who wants to build a successful

Read more »

Petrol prices adjusted for inflation

July 28, 2012

Petrol prices adjusted for inflation (Perth, Western Australia) The thought for this sprung to mind when I saw petrol drop below $1.20 per litre the other day, and it made me think, I remember paying that when I got to … Continue reading →

Read more »

Hi R and Axys, I’m d3.js “Nice to Meet You” (On the iPhone)

July 27, 2012

I am still definitely in the proof of concept stage, but as I progress I get more excited about the prospects of combining d3.js with R and Axys through Bryan Lewis’ really nice R websockets package (even nicer now that he has added the daemonize fun...

Read more »

R is reported as being used by about half of all data miners in the 2011 Data Miners Survey

July 27, 2012

by Yanchang Zhao, RDataMining.com R is reported as now being used by close to half of all data miners (47%) in the 2011 Data Miners Survey by Rexer Analytics. Below is picked up from the survey highlights regarding data mining … Continue reading →

Read more »

My no loops in R hair shirt

July 27, 2012

Being professionally involved with analyzing source code, I get to work with a much larger number of programming languages than most people. There is a huge difference between knowing the intricate details of the semantics of a language and being able to program fluently in a language like a native developer. There are languages whose

Read more »

Revolution Analytics at JSM 2012

July 27, 2012

Revolution Analytics is proud to once again be a gold sponsor and Wi-Fi sponsor of the JSM 2012 conference in San Diego, the largest gathering of statisticians, biostatisticians, analysts, data miners and data scientists in the world. The conference begins on Sunday, and you'll find the Revolution Analytics team in the exhibit hall. Drop by to take a look...

Read more »

rApache 1.2.0 Released

July 27, 2012

With this release comes a minor change in behavior: for requests that have been configured with RFileEval, RFileHandler, or using the r-script handler, rApache will set the working directory to the file’s directory. For instance with a Rook deployment like this: <Location /hmisc> SetHandler r-handler ...

Read more »

ggplot2: A little twist on back-to-back bar charts

July 27, 2012

Sangyoon Lee. Background: While thinking about ways to represent incoming and outgoing flows in a business process, I thought about using export-import charts like the one shown here in the Learning R blog. However, as the author acknowledges, it is difficult to compare individual values using these charts. Regardless, I still wanted to have...

Read more »

Evolution average number of beds per hospital

July 27, 2012

The OECD collects (among a lot of other statistics) information on the number of hospitals and hospital beds per country. These two parameters combined, and their evolution over the years, could give an indication of whether the country's hospital landscape is evolving towards large medical centers or small-scale hospital settings, or whether there is no trend to detect. In the graph below...

Read more »

More on Factor Attribution to improve performance of the 1-Month Reversal Strategy

July 26, 2012

In my last post, Factor Attribution to improve performance of the 1-Month Reversal Strategy, I discussed how Factor Attribution can be used to boost performance of the 1-Month Reversal Strategy. Today I want to dig a little deeper and examine this strategy for each sector and also run a sector-neutral back-test. The initial steps to

Read more »

Linear regression by gradient descent

July 26, 2012

In Andrew Ng's Machine Learning class, the first section demonstrates gradient descent by using it on a familiar problem, that of fitting a linear function to data. Let's start off by generating some bogus data with known characteristics. Let's make y just a noisy version of x. Let's also add 3 to give the intercept term something to...

Read more »
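
The setup described in the excerpt above (y as a noisy copy of x, shifted by 3) can be sketched as follows. This is a minimal batch gradient descent on the mean squared error, not the post's own code: the seed, sample size, learning rate `alpha` and iteration count are all assumptions chosen for illustration.

```r
set.seed(42)                       # assumed seed, for reproducibility only
n <- 200                           # assumed sample size
x <- runif(n, -5, 5)
y <- x + rnorm(n) + 3              # noisy version of x, with intercept 3

X <- cbind(1, x)                   # design matrix: intercept column plus x
theta <- c(0, 0)                   # starting values for intercept and slope
alpha <- 0.05                      # assumed learning rate

for (i in seq_len(2000)) {
  residual <- X %*% theta - y                  # current prediction errors
  grad <- as.vector(t(X) %*% residual) / n     # gradient of (1/2n) * sum(residual^2)
  theta <- theta - alpha * grad                # step against the gradient
}

fit <- coef(lm(y ~ x))             # closed-form least squares for comparison
print(theta)
print(fit)
```

With these settings the iterates should agree closely with `lm()`; a learning rate that is too large for the spread of x would make the loop diverge instead.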

Big vectors coming to R

July 26, 2012

R has been available as a 64-bit application since its earliest days. But the internal representation of R's fundamental data type, the vector, has long been subject to a 32-bit limitation: the maximum number of elements is capped at 2^31 (or just over 2.1 billion) elements. Now, at 8 bytes per element that's 16 GB of data, so...

Read more »

Changing function scope in GNU R example

July 26, 2012

In my last post I discussed how to work around GNU R scoping rules using the environment function. This time let us look at a practical example using the recode function from the car package. First let us look at how...

Read more »

Monitor: Using category labels

July 26, 2012

I've recently been checking the performance of a compound feed calibration with a set of 15 samples: 3 of hen feed, 3 of pig feed, 3 of chicken feed, 3 of ovine feed and 3 of cattle feed. The idea is to check if the calibration predi...

Read more »

Plotting 95% Confidence Bands in R

July 26, 2012

I am comparing estimates from subject-specific GLMMs and population-average GEE models as part of a publication I am working on. As part of this, I want to visualize predictions of each type of model including 95% confidence bands. First I had to ma...

Read more »

R Inferno-ism: order is not rank

July 26, 2012

Do not use order when you want rank. Background: The update of “A comparison of some heuristic optimization methods” is due to the bug that Luca Scrucca spotted. Actually, it is two bugs: (1) I used order when I meant rank; (2) this somehow escaped being in The R Inferno. Problem: What I said in my … Continue reading...

Read more »
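
The distinction named in the excerpt above can be shown with a small vector (the values here are made up for illustration). `order()` returns the indices that would sort the vector, while `rank()` returns each element's position in the sorted result; the two coincide only in special cases.

```r
x <- c(30, 10, 20)        # illustrative values, not from the post

ord <- order(x)           # indices that sort x: 2 3 1
rnk <- rank(x)            # position of each element once sorted: 3 1 2

print(ord)
print(rnk)
print(x[ord])             # order is meant for indexing: 10 20 30

# Without ties, applying order twice reproduces the ranks
print(order(order(x)))
```

Using `order()` where `rank()` is intended gives a result of the right length and type, so the mistake can go unnoticed, which is presumably how it slipped through.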

Universal portfolio, part 9

July 25, 2012

Part 8 discussed the distribution of the absolute wealth of the Universal Portfolio across all possible tuples of length 2, 3 and 4. However, comparing the absolute wealth against some reference, especially against simple portfolio selection algor...

Read more »

Getting rasters into shape from R

July 25, 2012

Today I needed to convert a raster to a polygon shapefile for further processing and plotting in R. I like to keep my code together so I can easily keep track of what I’ve done, so it made sense to … Continue reading →

Read more »

Another R mention in the NYT

July 25, 2012

The R language gets a brief mention in an article in yesterday's New York Times on automated bond trading: The traders here are mostly educated in math or physics, often outside the United States, and their desks are piled high with textbooks like the “R Graphs Cookbook,” for working with obscure computer programming languages. R an obscure programming language?...

Read more »

Hierarchical Cluster Analysis (ChemoSpec) – 03

July 25, 2012

It is clear that we can discriminate between olive oil and sunflower oil, but let's see the reason for the sub-clusters in the sunflower oil. Samples sflw6da, sflw7da, sflw8da, sflw9da, sflw10da are refined sunflower oil, so they are filtered and processed, t...

Read more »

Heaviside Signal Detection Part 1: Informed non-parametric testing

July 25, 2012

Steps are frequently found in geophysical datasets, specifically time series (e.g. GPS). A common approach to estimating the size of the offset is to assume (or estimate) what the statistical structure of the noise is and estimate the size and … Continue reading →

Read more »