Integrate data and reporting on the Web with knitr

September 11, 2012

Today's guest post comes from Yihui Xie, author of the knitr package. -- ed. Hi, this is Yihui Xie, and I'm guest posting on the Revolutions blog to talk about one aspect of the knitr package: how we can integrate data analysis and reporting in R with the Web. This post includes both the work that has been done...

Big data analysis, for free, in R (or “How I learned to load, manipulate, and save data using the ff package”)

September 11, 2012

Before choosing to support the purchase of Statistica at my workplace, I came across the ff package as an option for working with really big datasets (with special attention paid to ff dataframes, or ffdf). It looked like a good …

Second Milano R net meeting

September 11, 2012

September 27, 2012, 18:00 - 21:00, Fiori Oscuri Bistrot & Bar (www.fiorioscuri.it), Via Fiori Oscuri, 3 - Milano (Zona Brera)

Connecting data to the real world – The next sexy job?

September 11, 2012

At last week's Royal Statistical Society (RSS) conference Hal Varian, Chief Economist at Google, gave a panel talk about 'Statistics at Google'. Could he get a better audience than the RSS? Hal talked about his career in academia and at Google. He remi...

Unit root, or not? Is it a big deal?

September 10, 2012

Consider a time series, generated using set.seed(1); E=rnorm(240); X=rep(NA,240); rho=0.8; X[1]=0; for(t in 2:240){X[t]=rho*X[t-1]+E[t]} The idea is to assume that an autoregressive model can be considered, but we don't know the value of the parameter. ...
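As a minimal sketch of what "not knowing the parameter" means in practice, the series from the excerpt can be simulated and the autoregressive coefficient estimated from the data. The least-squares regression of X[t] on X[t-1] without intercept, and the names rho_hat, se_hat and t_unit, are assumptions made here for illustration; the excerpt does not say which estimator the original post uses.

```r
# Simulate the AR(1) series from the excerpt: X[t] = rho * X[t-1] + E[t]
set.seed(1)
n   <- 240
rho <- 0.8
E   <- rnorm(n)
X   <- numeric(n)                      # starts at X[1] = 0
for (t in 2:n) X[t] <- rho * X[t - 1] + E[t]

# Assumption: estimate rho by least squares of X[t] on X[t-1], no intercept
fit     <- lm(X[-1] ~ 0 + X[-n])
rho_hat <- unname(coef(fit))
se_hat  <- summary(fit)$coefficients[1, "Std. Error"]

# Distance of the estimate from the unit-root value rho = 1, in standard errors.
# Under a unit root this ratio does not follow the usual Student t law, so it
# should be compared with Dickey-Fuller critical values, not t quantiles.
t_unit <- (rho_hat - 1) / se_hat
print(c(rho_hat = rho_hat, se = se_hat, t_unit = t_unit))
```

The estimate should land near the true value of 0.8 used in the simulation; how far it sits from 1, relative to its standard error, is the quantity a unit-root test looks at.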

Extending Gold time series

September 10, 2012

While back-testing trading strategies I want all assets to have a long history. Unfortunately, sometimes there is no tradeable stock or ETF with sufficient history. For example, I might use GLD as a proxy for Gold allocation, but GLD only began trading in November of 2004. We can extend the GLD’s historical returns with its

PDF and CDF for normal distributions with R

September 10, 2012

Below, we give the R code to plot the PDF and the CDF for normal distributions. We wish to get charts quite similar to the ones shown on Wikipedia (Normal Distribution). The resulting charts are shown at the bottom. Notice that …
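A minimal sketch of such charts in base R follows. It is not the code from the original post: the parameter pairs, colours and variable names are chosen here for illustration, and only dnorm, pnorm and base graphics are used.

```r
# Grid of x values and illustrative (mean, variance) pairs; assumed values
x      <- seq(-5, 5, length.out = 501)
params <- data.frame(mu = c(0, 0, 0, -2), sigma2 = c(0.2, 1, 5, 0.5))
cols   <- c("blue", "red", "orange", "darkgreen")

# One column per parameter pair; dnorm/pnorm expect the standard deviation
pdf_vals <- sapply(seq_len(nrow(params)), function(i)
  dnorm(x, mean = params$mu[i], sd = sqrt(params$sigma2[i])))
cdf_vals <- sapply(seq_len(nrow(params)), function(i)
  pnorm(x, mean = params$mu[i], sd = sqrt(params$sigma2[i])))

# Two panels side by side: densities on the left, distribution functions on the right
op <- par(mfrow = c(1, 2))
matplot(x, pdf_vals, type = "l", lty = 1, col = cols,
        xlab = "x", ylab = "density", main = "PDF")
legend("topleft", bty = "n", cex = 0.7, col = cols, lty = 1,
       legend = sprintf("mu = %g, sigma^2 = %g", params$mu, params$sigma2))
matplot(x, cdf_vals, type = "l", lty = 1, col = cols,
        xlab = "x", ylab = "probability", main = "CDF")
par(op)
```

Storing the curves as matrix columns keeps the plotting to one matplot call per panel, which is a design choice of this sketch rather than a requirement.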

igraph 0.6 issues: Changed numbering of Vertices

September 10, 2012

I have tried one of my previous scripts with an updated igraph version and I got an interesting (pretty much unexpected) error: At type_indexededgelist.c:269 : invalid (odd) length of edges vector, Invalid edge vector. The problem is that it was a well-te...

R-bloggers submission

September 10, 2012

I might be the 393rd blogger on R-bloggers.com :-) http://www.r-bloggers.com/ Wohoo!

Item Response Theory: Developing Your Intuition

September 10, 2012

Suppose that you accepted my argument from the last two posts on halo effects and bifactor models. As you might recall, I argued that when respondents complete rating scales, they predominantly rely on their generalized impression with a more minor role played by the specific features that the ratings were written to measure. Consequently, we...

Example 10.1: Read a file byte by byte

September 10, 2012

More and more makers of electronic devices use standard storage media to record data. Sometimes this is central to the device's function, as in a camera, so that the data must be easy to recover. Other times, it's effectively incidental, and the device maker may not provide easy access to the stored data....

Population health management with RevoScaleR

September 10, 2012

This guest post is by Douglas McNair MD PhD, Engineering Fellow & President, Cerner Math Inc. -- ed. RevoScaleR scaling big-data modeling performance for real-time health data analysis at Cerner The size of data sets is increasing much more rapidly than the speed of cores, of RAM, and of disk drives. This is particularly true of electronic health records...

INLA functions (continued)

September 10, 2012

I have polished up one of the two functions I've thought of implementing for INLA and it's now available in the development version of the R package. So if you've got INLA installed in your R version (see how you can do it here, if you don't), you...

Progress bars in R using winProgressBar

September 10, 2012

Using progress bars in R scripts can provide valuable timing feedback during development and additional polish to final products. winProgressBar and setWinProgressBar are the primary functions for creating progress bars in R. Progress bars, and progress indicators in general, are relatively uncommon in R programming. This makes sense, as they can add bloat and, being design elements, they generally...
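A minimal sketch of the pattern follows. winProgressBar and setWinProgressBar exist only in R for Windows, so this sketch falls back to txtProgressBar elsewhere; that fallback, the loop body and the names n, on_windows and done are assumptions added here, not part of the original post.

```r
n          <- 20                                   # number of loop iterations (illustrative)
on_windows <- .Platform$OS.type == "windows"

# Create the bar: a dialog window on Windows, a console bar elsewhere
pb <- if (on_windows) {
  winProgressBar(title = "Working", label = "0% done", min = 0, max = n)
} else {
  txtProgressBar(min = 0, max = n, style = 3)
}

done <- 0
for (i in seq_len(n)) {
  Sys.sleep(0.01)                                  # stand-in for real work
  done <- i
  if (on_windows) {
    setWinProgressBar(pb, i, label = sprintf("%d%% done", round(100 * i / n)))
  } else {
    setTxtProgressBar(pb, i)
  }
}

close(pb)                                          # remove the bar when finished
```

Updating the bar once per iteration is fine for slow steps; for very fast loops, updating every k-th iteration keeps the overhead down.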

Great Circles, Black Holes, and Community Events Part 1 of 3

September 10, 2012

About 8 years ago, I was sitting in class listening to a guest lecturer talk about how community events can be described like celestial bodies with their own gravity, where the size and importance of the event would attract more people, from farther away. Much like a black hole, where the bigger the mass of the black hole the...

Not fooled by randomness

September 10, 2012

The paper is “Not Fooled by Randomness: Using Random Portfolios to Analyze Investment Funds” by Roberto Stein. Here is an explanation of the idea of random portfolios. Favorite sentence: The real question here is whether we’re actually measuring skill, or these are still measures of performance, so influenced by extraneous factors that the existence of …

R Package Vignettes with Markdown

September 10, 2012

What is the best resource to learn an R package? Many R users know the almighty question mark ? in R. For example, type ?lm and you will see the documentation of the function lm. If you know nothing about a package, you can take a look at the HTML help...

Converting a Markdown File to PDF Using Pandoc

September 9, 2012

Working with knitr and markdown is a great way to share quick reports with colleagues, but in cases where IE8 is still the dominant browser, shipping an HTML file with embedded graphics is a non-starter. IE8 does not support the Data URI format used to...

Core minus one!

September 9, 2012

Jean-Michel Marin visited me in Paris last week and, besides taking part in Pierre’s PhD defence, we made enough progress to close two more chapters of the new edition of Bayesian Core (soon to be Bayesian Essentials with R!) This follows the good work session we had in Carnon where we also completed two chapters

How to embed a Gist in Tumblr

September 9, 2012

Here, for the sake of posterity as well as example, is an embedded Gist that describes how to embed a Gist in Tumblr: https://gist.github.com/1395926

Football predictions display

September 9, 2012

Having looked at the football data earlier, I wanted to look at predictions for new games. This consists of two parts: getting a predictive model, and predicting and displaying the predictions. I decided to do this backwards, first to make the displays. Th...

Implementing the CountSummary Procedure

In my last post, I described and demonstrated the CountSummary procedure to be included in the ExploringData package that I am in the process of developing. This procedure generates a collection of graphical data summaries for a count data sequence, based on the distplot, Ord_plot, and Ord_estimate functions from the vcd package. The distplot function generates both the Poissonness...

RInside 0.2.8

September 8, 2012

This morning version 0.2.8 of RInside arrived on the CRAN sites. RInside provides a set of convenience classes which facilitate embedding of R inside of C++ applications and programs, using the classes and functions provided by the Rcpp R and C++ in...

Using R to connect to a SQL Server and MySQL Database using MS Windows

September 8, 2012

Connecting to MySQL and Microsoft SQL Server Connecting to a MySQL database or MS SQL Server from the R environment can be extremely useful. It allows a researcher direct access to the data without having to first export it from a database and then import it from a csv file or entering it directly into

Violence along Mexico’s Southern Border and Central America

September 7, 2012

Rates for Panama and Nicaragua are from 2009, all other countries 2010. Municipalities which are part of a metro area in Mexico are shown with the metro area homicide rate. Visit the interactive map of homicides. Having just posted on violence along Mexico's northern border, I figured it's time to...

Big Issue with System Backtests

September 7, 2012

Almost always, when I see a system backtested, the backtest assumes a static portfolio with no contributions or withdrawals. This assumption only covers an extremely limited subset of my clients. Cash flows in and out of a portfolio or syst...

In praise of ProjectTemplate for reproducible research

September 7, 2012

As you might know from some of my previous posts, I’m a big fan of making my scientific work reproducible. My main reasons for being so keen on this are: 1. Reproducibility is key to science – if it can’t be reproduced then it cannot be verified (that is, the experiment can’t be tried again

Simulation metamodeling with GNU R

September 7, 2012

I am one of the organizers of the ESSA2013 conference that will take place in September 2013 in Warsaw, Poland. The conference scope is social simulation and in particular methods of statistical analysis of simulation output (metamodeling). As we have just issued the Call for Papers for the conference, I decided to post a simple example of a metamodel. Recently I had...