R meets HANA

February 2, 2012

If you read my last blog, HANA meets R, you will remember that we read data from HANA into R directly over ODBC, without having to download a .csv file. This time we're going to read data from HANA as well, but after doing some nice tricks in R, we're going to post back the...

Great Maps with ggplot2

February 2, 2012

The above map (and this one) was produced using R and g


HANA meets R

February 2, 2012

In my previous HANA and R blogs, I was forced to create .csv files from HANA and read them into R: an easy but boring procedure, especially if your R report is supposed to run on a regular basis. Having to create a .csv file every time you need to run your report is not a nice thing. After...

tenured research position with ABC skills!

February 2, 2012

I just received this announcement for the opening of a (tenured/civil servant) position at INRA, the national research institute in biostatistics, genetics, and agronomy. Position opening with profile "Approximate inference techniques in complex systems". Key activities and required skills: You will develop methodological research in the field of statistical inference for models used in environmental

Landscape Metrics with R, SDMTools, ImageJ and Bio7

February 2, 2012

01.02.2012 Landscape metrics were developed to analyze spatial patterns of landscapes (e.g. composition and spatial arrangement). In R it is possible to calculate these metrics with the “SDMTools” package. Bio7 offers an easy to use interface to R and ImageJ and can use these tools to simplify a workflow to analyze image data (e.g. vegetation


Two courses in R programming by Ken Rice and Thomas Lumley

February 2, 2012

Ken Rice and Thomas Lumley will give a course on advanced R programming in two locations this summer.
1. In Edinburgh, June 13-15 (the week before the International Conference in Quantitative Genetics). See http://www.eisg2012.org.uk/
2. In Seattle, July 23-25, as part of the Summer Institute in Statistical Genetics. See http://www.biostat.washington.edu/suminst/sisg/general
The course is about 60% lecture and 40% lab session (BYO R),...

Analytic applications are built by data scientists

February 1, 2012

Ventana Research analyst David Menninger was on the judging panel for the Applications of R in Business contest. In a post on the Ventana research blog, he offers his perspectives on the contest, noting that R, as a statistical package, includes many algorithms for predictive analytics, including regression, clustering, classification, text mining and other techniques. The contest submissions supported...


Cochran Q Test for k related samples in R

February 1, 2012

To run the Cochran Q test in R, we first need to install a package for it, since the test is not built into R. The package is RVAideMemoire, authored by Maxime Hervé. Here's how to do it. Codes: Here we are installing the package named RVAideMem...
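For readers who want to see what the test computes, here is a minimal base-R sketch of Cochran's Q statistic. It does not use the package mentioned above; the helper name `cochran_q` and the 0/1 response matrix are hypothetical and serve only as an illustration.

```r
# Cochran's Q computed by hand. Rows are blocks (subjects), columns are the
# k related treatments, and entries are 0/1 outcomes.
cochran_q <- function(m) {
  m <- as.matrix(m)
  k <- ncol(m)
  col_tot <- colSums(m)   # successes per treatment
  row_tot <- rowSums(m)   # successes per subject
  N <- sum(m)             # total number of successes
  Q <- (k - 1) * (k * sum(col_tot^2) - N^2) / (k * N - sum(row_tot^2))
  df <- k - 1
  # Under H0, Q is approximately chi-squared with k - 1 degrees of freedom.
  list(statistic = Q, df = df,
       p.value = pchisq(Q, df, lower.tail = FALSE))
}

# Hypothetical 0/1 responses of 6 subjects under 3 conditions
responses <- matrix(c(1, 1, 0,
                      1, 0, 0,
                      1, 1, 1,
                      1, 0, 0,
                      1, 1, 0,
                      0, 0, 0),
                    ncol = 3, byrow = TRUE)
cochran_q(responses)
```

The chi-squared reference distribution is a large-sample approximation, so with a toy matrix this small the p-value should be read as illustrative only.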


Transformation of Several Variables in a Dataframe

February 1, 2012

This is how I transform several columns of a data frame, e.g., from count data into binary-coded data (the same approach applies to any other conversion). count1
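One possible way to do this in base R is sketched below. The data frame, its column names, and the "positive count becomes 1" rule are assumptions made for illustration; the original post's code is not shown in this excerpt.

```r
# Hypothetical example data: one ID column and three count columns.
df <- data.frame(
  id     = 1:4,
  count1 = c(0, 3, 1, 0),
  count2 = c(2, 0, 0, 5),
  count3 = c(0, 0, 7, 1)
)

# Columns to convert; here every column whose name starts with "count".
count_cols <- grep("^count", names(df), value = TRUE)

# Recode each selected column: 1 if the count is positive, 0 otherwise.
df[count_cols] <- lapply(df[count_cols], function(x) as.integer(x > 0))

print(df)
```

Swapping the anonymous function for any other vectorized conversion (for example `log1p` or `as.factor`) transforms the same set of columns in one step.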


Vectorized R vs Rcpp

February 1, 2012

In my previous post, I tried to show that Rcpp is 1000 times faster than pure R, which generated a fuss in the comments. Being lazy, I didn’t vectorize the R code, and in the end I was comparing apples with oranges. To fix that problem, I built a new script, where I’m trying to compare
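To illustrate why vectorization matters in such a comparison, here is a small sketch that times an explicit loop against its vectorized equivalent. It deliberately leaves Rcpp out, and the sum-of-squares task and function names are hypothetical, not the benchmark from the post.

```r
# Sum of squares computed with an explicit loop...
sum_sq_loop <- function(x) {
  total <- 0
  for (xi in x) total <- total + xi^2
  total
}

# ...and with vectorized arithmetic.
sum_sq_vec <- function(x) sum(x^2)

x <- as.numeric(1:1e6)
system.time(a <- sum_sq_loop(x))
system.time(b <- sum_sq_vec(x))

# Both versions should agree up to floating-point tolerance.
all.equal(a, b)
```

Timings depend on the machine and R version, so a fair comparison with compiled code should use the vectorized version as the R baseline.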


Are Recessions Environmentally Beneficial?

February 1, 2012

Description: Total energy consumption in the United States by sector. Vertical gray lines represent periods of recession. Data: http://www.eia.gov/totalenergy/data/annual/index.cfm#consumption http://en.wikipedia.org/wiki/List_of_recessions_in_the_Uni...

MAT886 mean excess function (and reinsurance)

February 1, 2012

Tomorrow, in the course on extreme values, we will focus on applications. We will discuss reinsurance pricing. Consider a random variable and a threshold, and define the mean excess function. This function is known in life insurance as the average ...
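As a rough illustration, the mean excess function can be written e(u) = E[X - u | X > u], where X denotes the random variable and u the threshold (this notation is assumed here, since the symbols did not survive in the excerpt). The sketch below estimates it empirically; the helper `mean_excess` and the simulated exponential losses are hypothetical.

```r
# Empirical mean excess: average of (x - u) over observations exceeding u.
mean_excess <- function(x, u) {
  exceed <- x[x > u]
  if (length(exceed) == 0) return(NA_real_)
  mean(exceed - u)
}

set.seed(1)
x <- rexp(10000, rate = 0.5)   # simulated losses (hypothetical), mean 2

# Evaluate the estimate at a few thresholds.
thresholds <- c(0, 1, 2, 4)
sapply(thresholds, function(u) mean_excess(x, u))
```

For an exponential distribution the theoretical mean excess is constant in u (here 1/0.5 = 2), which makes it a convenient sanity check; heavier-tailed losses would show an increasing curve.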


"R": Looking at the Data (Gasoline) – 001

February 1, 2012

Like other software, "R" has nice tools for looking at the data before developing the calibration. Statistics for the "Y" variable (in this case octane number), such as maximum, minimum, ..., standard deviation, are important:
> library(ChemometricsWithR)
> data(gasoline)
> summary(gasoline$octane)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  83.40   85.88   87.75   87.18   88.45   89.60
> sd(gasoline$octane)
[1] 1.530078
And of course the histogram:
> hist(gasoline$octane)

Confirming SSR, SSE, and SST using matrix in R

February 1, 2012

The code below was written in our regression laboratory class. Here, we first run the data in SPSS and take the ANOVA output, where we can find the computed values of SSR, SSE, and SST. ANOVAb Model Sum of Squares df Mean Square F Sig. 1 Regress...
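A minimal sketch of the matrix computation follows. The class data set is not available in this excerpt, so the built-in `cars` data is used as a stand-in, and R's own ANOVA table replaces the SPSS output as the cross-check.

```r
# Built-in data, used here only as a stand-in for the class data set.
y <- cars$dist
X <- cbind(1, cars$speed)          # design matrix with an intercept column
n <- length(y)

# Least-squares coefficients: b = (X'X)^{-1} X'y
b <- solve(t(X) %*% X, t(X) %*% y)

# Sums of squares in matrix form
correction <- n * mean(y)^2
SST <- drop(t(y) %*% y) - correction              # total
SSR <- drop(t(b) %*% t(X) %*% y) - correction     # regression
SSE <- drop(t(y) %*% y - t(b) %*% t(X) %*% y)     # error

c(SSR = SSR, SSE = SSE, SST = SST)

# Cross-check against R's own ANOVA table
anova(lm(dist ~ speed, data = cars))
```

The "Sum Sq" column of the ANOVA table should match SSR and SSE, and the two should add up to SST.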


R Training Course in the Bay Area

February 1, 2012

An introduction to R for software developers and data analysts
Saturday March 10th, 2012, 8:30-5:00pm
EBay, 2161 North 1st Street, San Jose, California
I will be presenting a one-day professional development workshop on R programming for software developers and …

the birthday problem [X'idated]

February 1, 2012

The birthday problem (i.e. looking at the distribution of the birthdates in a group of n persons, assuming a uniform distribution of the calendar dates of those birthdates) is always a source of puzzlement! For instance, here is a recent post on Cross Validated: I have 360 friends on Facebook, and, as

RStudio Server: accessing the RStudio R IDE through your browser

February 1, 2012

I like having all my important documents and scripts in one single place. This saves me from having to synchronize them between my different workplaces, and makes backing up much less of a pain. One way of achieving this…

R is the easiest language to speak badly

February 1, 2012

I am amazed by the number of comments I received on my recent blog entry about "by", "apply" and friends. I had started my post by pointing out that R is a language. Indeed, I have come to the conclusion that it is a language with lots of irregul...

MINE: Maximal Information-based NonParametric Exploration

February 1, 2012

There was a lot of buzz in the blogosphere as well as the science community about a new family of algorithms that are able to find non-linear relationships over extremely large fields of data. What makes it particularly useful is that the measure(s) it...


New R User Group in Cambridge, UK

January 31, 2012

Yet another new local R user group has launched this month, this time in Cambridge, UK. Cambridge RUG was created by data analyst Andrew Caines to promote the use of R in the Cambridge area. The group aims to encourage people to try the R language, act as an advice centre to help people get where they want to with...

Weak Law of Large Numbers

January 31, 2012

1 Description. The weak law of large numbers is a result in probability theory, also known as Bernoulli’s
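The statement can be illustrated with a short simulation: the sample mean of independent draws gets close to the expected value as the sample size grows. The fair-coin setup, the seed, and the sample sizes below are assumptions chosen for illustration.

```r
set.seed(42)
n <- 10000
flips <- rbinom(n, size = 1, prob = 0.5)    # simulated fair-coin tosses

# Running sample mean after 1, 2, ..., n tosses
running_mean <- cumsum(flips) / seq_len(n)

# Distance from the true mean 0.5 at a few sample sizes
abs(running_mean[c(10, 100, 1000, 10000)] - 0.5)
```

A single simulated path is only suggestive; the law itself is a statement about the probability of large deviations shrinking to zero, which repeating the simulation many times would show more directly.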


quick tips: within function assignment and specific object removal

January 31, 2012

If you’re familiar with faster ways of iterating over objects, such as lapply, sapply, or apply for matrices, you might be surprised that assignments made inside the function call are saved only locally. One of my favorite lines in R comes from the …
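The scoping behaviour described here can be seen in a few lines. This is a sketch of standard R semantics, not necessarily the tip the post goes on to give; the variable names are hypothetical.

```r
total <- 0

# Assignment with <- inside the function only creates a local variable.
invisible(sapply(1:3, function(i) total <- total + i))
total_after_local <- total        # still 0

# <<- assigns in the enclosing environment, so the change persists.
invisible(sapply(1:3, function(i) total <<- total + i))
total_after_super <- total        # now 6 (0 + 1 + 2 + 3)

# rm() removes specific objects by name.
scratch_obj <- "temporary"
rm(scratch_obj)
exists("scratch_obj")             # FALSE
```

Returning values from the function and assigning the result of `sapply` is usually the cleaner design; `<<-` is best kept for cases where a side effect is genuinely wanted.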


Example: Two Sample t-Test

January 31, 2012

The recovery time (in days) is measured for 10 patients taking a new drug and for 10 different patients taking a placebo. We wish to test the hypothesis that the mean recovery time for patients taking the drug is less than that for patients taking the placebo. The...

Given a room with n people in it, what is the probability any two will have the same birthday?

January 31, 2012

Revisiting a fun puzzle I remember first encountering as an undergraduate. Nice example of creating a plot in R using ggplot2. I also plot the probability of someone in the room having the same birthday as you.
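Both probabilities mentioned here have simple closed forms, sketched below under the usual assumptions of 365 equally likely birthdays and no leap years. The function names are hypothetical, and the room of n people is assumed to include you.

```r
# P(at least two of n people share a birthday).
p_shared <- function(n) {
  if (n > 365) return(1)
  1 - prod((365 - seq_len(n) + 1) / 365)
}

# P(at least one of the other n - 1 people shares *your* birthday).
p_same_as_you <- function(n) 1 - (364 / 365)^(n - 1)

n <- 1:60
probs <- data.frame(n = n,
                    any_pair = sapply(n, p_shared),
                    same_as_you = sapply(n, p_same_as_you))

probs[probs$n %in% c(10, 23, 50), ]
```

The `probs` data frame is in a shape that can be handed directly to a plotting function such as ggplot2's to draw both curves against n.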


Example: One Sample t-Test

January 31, 2012

Using the stackloss dataset, test the hypothesis that the mean stack loss is equal to 20 versus a two-sided alternative. Solution: Codes: Output: Interpretation: With the p-value greater than the level of significance, alpha = 0.05, we la...
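Since the code and output did not survive in this excerpt, here is a minimal sketch of how such a test could be run with base R. It assumes the variable of interest is the `stack.loss` column of the built-in `stackloss` data frame.

```r
# stackloss ships with R; stack.loss is assumed to be the variable of interest.
result <- t.test(stackloss$stack.loss, mu = 20, alternative = "two.sided")
result

# Decision at the 5% level: reject H0 only if the p-value is below 0.05.
result$p.value < 0.05
```

Printing `result` shows the t statistic, degrees of freedom, p-value, and a 95% confidence interval for the mean in one block.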


Surfaces in ternary plots

January 31, 2012

In mixture experiments the variables are the proportions of the components that are mixed together, so these proportions are constrained to sum to one. When fitting regression models to data from mixture experiments we may be interested in representing the fitted model with a surface plot. The constraint on proportions means

ultimate R recursion

January 31, 2012

One of my students wrote the following code for his R exam, trying to do accept-reject simulation (of a Rayleigh distribution) and constant approximation at the same time: which I find remarkable if alas doomed to fail! I wonder if there exists a (real as opposed to fantasy) computer language where you could introduce constants


This graph makes me think Kobe is not that good, he just shoots a lot

January 31, 2012

I find it surprising that NBA commentators rarely talk about field goal percentage. Everybody knows that the more you shoot the more you score. But players that score a lot are admired without consideration of their FG%. Of course having a high FG% is ...

