R talk on regular expressions (regex)

October 6, 2011

Regular expressions are a powerful tool in any language for manipulating and searching data. For example:
> fruit <- c("apple", "banana", "pear", "pineapple")
> fruit
"apple" "banana" "pear" "pineapple"
> grep("a", fruit) # there is an ...
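
As a minimal sketch of the kind of pattern matching this excerpt introduces, the base R snippet below reuses the `fruit` vector. The patterns and every variable name other than `fruit` are illustrative assumptions, not taken from the talk itself.

```r
# Example vector from the excerpt
fruit <- c("apple", "banana", "pear", "pineapple")

# grep() returns the indices of the elements that match the pattern
idx_a <- grep("a", fruit)

# value = TRUE returns the matching elements instead of their indices;
# "$" anchors the match at the end of the string
ends_in_apple <- grep("apple$", fruit, value = TRUE)

# grepl() returns one logical per element; "^" anchors at the start
starts_with_p <- grepl("^p", fruit)

# sub() replaces the first match in each element, gsub() replaces every match
first_a_upper <- sub("a", "A", fruit)
all_a_upper <- gsub("a", "A", fruit)

print(idx_a)
print(ends_in_apple)
print(starts_with_p)
print(all_a_upper)
```

The same pattern syntax works across `grep()`, `grepl()`, `sub()` and `gsub()`, so the choice between them depends only on whether indices, logicals or modified strings are needed.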

R: Preparing balanced stimuli lists for a psychological experiment

October 6, 2011

Dividing a list of stimuli described by several statistics into subsets which are balanced according to these statistics is a common task in psychological research. For the purpose of preparing materials for an experiment which I am going to conduct ...

R Workshop

October 6, 2011

I am going to start a continuing “R Workshop” series of posts with R tips and tricks. If you have questions you’d like answered or were wondering about certain aspects, please leave them in the comments.

Do cents follow Benford’s Law?

October 5, 2011

Benford's law is an amazing thing. If you know the probability distribution that classes of "natural" numbers should have, you can detect where people might be faking data: phony tax returns, bogus scientific studies, etc.
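
For reference, Benford's law gives the expected share of leading digit d as log10(1 + 1/d). The sketch below computes those expected shares in base R and adds two small helpers for comparing them with data; the helper names `leading_digit` and `observed_share` are hypothetical and are not taken from the post, and no data from the post is reproduced here.

```r
# Expected Benford probabilities for the leading digits 1..9
benford <- log10(1 + 1 / (1:9))
names(benford) <- 1:9

# Leading non-zero digit of each value (zeros and NAs are dropped).
# Hypothetical helper, written for illustration only.
leading_digit <- function(x) {
  x <- abs(as.numeric(x[!is.na(x) & x != 0]))
  # In scientific notation the leading digit is the first character
  as.integer(substr(formatC(x, format = "e", digits = 6), 1, 1))
}

# Observed share of each leading digit, in the same order as `benford`.
# Hypothetical helper, written for illustration only.
observed_share <- function(x) {
  d <- factor(leading_digit(x), levels = 1:9)
  as.vector(table(d)) / length(d)
}

print(round(benford, 3))
print(leading_digit(c(123.4, 0.05, 9, 2011)))
```

Given a numeric vector of amounts, `observed_share()` can be set next to `benford` to see where the two distributions diverge. Note that values which round up to the next power of ten at seven significant digits would be assigned a leading 1 by this helper.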

New R-generated Video: Has StackOverflow Posting Behavior Changed Over Time?

October 5, 2011

Sparks have been flying between my favorite data analysis language and my favorite programmer's Q & A site since long ago: R flirted with StackOverflow on September 10, 2008, 5 days before StackOverflow was even open to the public. R still hesitates to leave its original suitor, the loud and lively R-help mailing list, where

Linear regression with correlated data

October 5, 2011

I started following the debate on the differential minimum wage for youth (15-19 years old) and adults in New Zealand. Eric Crampton has written a nice series of blog posts, making the data from Statistics New Zealand available. I will use …

Slides and replay for "Backtesting FINRA’s Limit Up/Down Rules" available

October 5, 2011

If you missed last week's webinar on using Revolution R and IBM Netezza to analyze the effectiveness of new rules intended to prevent another financial "Flash Crash", you can watch a replay by filling in this form. Once the replay begins, you can download the slides by clicking the "Download" button that appears below the media player. Revolution Analytics...

Hot Spot Mapping in R: Illustrating Relative Seasonal Risk

October 5, 2011

In recent months, IDV has taken steps to incorporate the powerful statistical engine, R, as a viable connection to Visual Fusion. R has a robust and growing set of libraries and a community that is constantly thumping away on improvements. ...

Calling Google Maps API from R

October 5, 2011

Hi, Related to Julyan’s previous post, I want to share an easy way to access the Google Maps API through R. And then we’ll stop talking about Google, otherwise it’ll look like we’re just looking for jobs. My problem was the following: …

New release with Batch processing

October 5, 2011

This week we rolled out a new release at cloudnumbers.com which implements two main new features. First, cloudnumbers.com now supports batch processing. Second, due to some changes in the architecture, we were able to reduce our system requirements: we no longer need as many open ports in your firewall. Please check our updated System Requirements

Modelling with R: part 3

October 5, 2011

The previous posts, part 1 and part 2, detailed the procedure to successfully import the data and transform the data so that we can extract some useful information from them. Now it's time to get our hands dirty with some predictive modelling. The dependent variable here is a binary variable taking values "0" and "1", indicating whether the customer...

Drawing maps using shapefiles and R

October 4, 2011

Sometimes a student may use a self-explanatory chart instead of a boring table to show outcomes in a research paper. Graphs are efficient at showing the broad picture of an issue and also at presenting results. In political science, you can get into this topic by reading Kastellec and Leoni (2007), for instance. I

Interactive charts with googleVis package and R

October 4, 2011

Examples at the link below illustrate interactive charts created with the googleVis package and R. http://code.google.com/p/google-motion-charts-with-r/wiki/GadgetExamples Some amazing features are: a motion chart shows the changes over time, an AnnotatedTimeLine shows a zoom-in/zoom-out view of time series, a TreeMap supports drill-down …

GEE using Stata vs. R

October 4, 2011

I am running a GEE logistic regression model for my fetal loss paper. As usual, I compare results between Stata and R and make sure they are consistent. To my surprise, the models assuming an independent correlation structure give similar results but the mo...

Introduction to PloTA library in the Systematic Investor Toolbox

October 4, 2011

PloTA (plot + ta) library in the Systematic Investor Toolbox is a simple plot interface for charting Time Series and Technical Analysis plots. I created it as an alternative to the charting functionality in the quantmod package. It is designed to mimic the default plot interface and works with xts objects. PloTA implements the following methods: plota

Bayesian Computation with R – Albert (2009)

October 4, 2011

Title: Bayesian Computation with R
Author(s): Jim Albert
Publisher/Date: Springer/2009
Statistics level: High
Programming level: Low
Overall recommendation: Recommended

Bayesian Computation with R focuses primarily on providing the reader with a basic understanding of Bayesian thinking and the relevant analytic tools included in R. It does not explore either of those areas in detail, though it does hit ...

Combining Base+Grid Graphics

October 4, 2011

R provides several frameworks for composing figures. Base graphics is the simplest, grid is more advanced, and the lattice/ggplot packages provide convenient abstractions of the grid graphics system. Multi-element figures can be readily created in base...

Simple time series plot using R : Part 2

October 4, 2011

I would like to share my experience of plotting different time series in the same plot for comparison. As an assignment I had to plot the time series of the infant mortality rate (IMR) along with the SOX emission (sulphur emission) for the past 5 decades in ...

Calculating and graphing within-subject confidence intervals for ANOVA

October 4, 2011

Psychologists are gradually coming round to the view that it is a good idea to present interval estimates alongside point estimates of statistics. The most common statistic reported in psychology research is almost certainly the mean (strictly...

Example 9.8: New stuff in SAS 9.3 – Bayesian random effects models in Proc MCMC

October 4, 2011

Rounding off our reports on major new developments in SAS 9.3, today we'll talk about proc mcmc and the random statement. Stand-alone packages for fitting very general Bayesian models using Markov chain Monte Carlo (MCMC) methods have been available for...

Tutorial on using the rworldmap package

October 4, 2011

This post, following up my previous one, attempts to explain how the geo-pie map was created. I do not know how to attach a .rflow file in this blog. What you can do is to copy the following code into Notepad …

permute: a package for generating restricted permutations

October 4, 2011

Multivariate ordination methods are commonly used in ecology to investigate patterns in species composition in space or time. Constrained ordination methods such as redundancy analysis (RDA) and canonical correspondence analysis (CCA) are effectively just multiple regressions, but we lack the parametric theory to adequately test the statistical significance of terms in the model. Other techniques likewise lack the appropriate...

How to show explained variance in a multilevel model

October 3, 2011

In this post I will show one way to display explained variance using a line chart. There is no default plot for displaying the effect of each factor on deviance of the model, so this is a tentative proposal for my dissertation. The following values were obtained using multilevel models performed in R (thanks for

Oracle’s Big Data Appliance to include R

October 3, 2011

At the Oracle OpenWorld conference in San Francisco today, Oracle announced the new Oracle Big Data Appliance, "a new engineered system that includes an open source distribution of Apache™ Hadoop™, Oracle NoSQL Database, Oracle Data Integrator Application Adapter for Hadoop, Oracle Loader for Hadoop, and an open source distribution of R." Oracle's foray into the Hadoop and NoSQL spaces...

Visualizing Climbing Ropes

October 3, 2011

Today’s market of climbing ropes offers a huge variety of models to choose from. If you don’t have a good idea of what you want, the task of selecting a rope can become a true challenge, or worse, almost a …

The four steps to publication-grade graphics in R

October 3, 2011

OLS beta VS. Robust beta

October 3, 2011

In a financial context, β is supposed to reflect the relation between a stock and the general market. A broad-based index such as the S&P 500 is often taken as a proxy for the general market. The β, without getting into too …