The police records for 2009 are out.

June 20, 2010

The 2009 homicide numbers collected by the SNSP (National System of Public Security) are finally out. You can download the data from the ICESI, a civic institution not affiliated with the government. If you remember, one of the conclusio...

Read more »

Here’s the distribution of the first million digits of the…

June 20, 2010

Here’s the distribution of the first million digits of the square root of two’s decimal expansion.

Number of digits | is:
0's |  99 818
1's |  98 926
2's | 100 442
3's | 100 191
4's | 100 031
5's | 100 059
6's |  99 885
...
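A tally like the one above can be sketched in a few lines of R. This is only an illustration, not the post's own code: the helper name `digit_counts` is made up here, and the input is just the first 19 decimals of the square root of two, hard-coded rather than computed to a million places.

```r
# Tally how often each decimal digit 0-9 occurs in a string of digits.
# `digit_counts` is a hypothetical helper name, not taken from the post.
digit_counts <- function(digit_string) {
  chars <- strsplit(digit_string, "")[[1]]
  chars <- chars[chars %in% as.character(0:9)]  # drop ".", spaces, etc.
  table(factor(chars, levels = as.character(0:9)))
}

# First 19 decimals of sqrt(2), hard-coded purely for illustration.
sqrt2_decimals <- "4142135623730950488"
counts <- digit_counts(sqrt2_decimals)
print(counts)
```

Using a factor with fixed levels keeps digits that never occur in the output with a count of zero, which matters when comparing against a uniform expectation.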

Read more »

QSPR modeling with signatures

June 20, 2010

I had to dig deep to find posts on QSAR modeling. There are quite a few on QSAR in Bioclipse, but that focuses on the descriptor calculation. In a quick scan, I could only spot two modeling posts:
The CDK/Metabolomics/Chemometrics Unconference results
Wh...

Read more »

R-INLA package

June 19, 2010

Another R package for mixed effect modeling. Looks promising.

Read more »

Estimating Probability of Drawdown

June 19, 2010

I've shown several examples of how to use LSPM's probDrawdown function as a constraint when optimizing a leverage space portfolio.  Those posts implicitly assume the probDrawdown function produces an accurate estimate of actual drawdo...

Read more »

More powerful iconv in R

June 19, 2010

The R function iconv converts between character string encodings, for example from the locale-dependent encoding to UTF-8:

> iconv("foo", to="UTF-8")
[1] "foo"

However, R has long-standing trouble with embedded null characters ('\0') in strings. Hence, if we try to convert to an encoding that permits embedded null characters, iconv will fail:

> iconv("foo", to="UTF-16")
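The excerpt stops before showing a way around the problem. One option in current versions of R, which may or may not be what the post goes on to describe, is the `toRaw` argument of `iconv`: it returns raw bytes instead of a character string, so the zero bytes never have to be stored in a string. A minimal sketch, assuming the input is UTF-8 and choosing UTF-16LE so that no byte-order mark is added:

```r
# Ask iconv for raw bytes so that embedded zero bytes are not a problem.
# UTF-16LE is chosen here (an assumption) to avoid a byte-order mark.
utf16_bytes <- iconv("foo", from = "UTF-8", to = "UTF-16LE", toRaw = TRUE)[[1]]
print(utf16_bytes)

# Round trip: iconv also accepts a list of raw vectors as input.
round_trip <- iconv(list(utf16_bytes), from = "UTF-16LE", to = "UTF-8")
print(round_trip)
```

With `toRaw = TRUE` the result is a list with one raw vector per input string, which is why the first element is extracted with `[[1]]`.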

Read more »

What I need to know…

June 19, 2010

is maps and geographical data representation in R. In case you’re curious too, here is some good study material from R-Bloggers: maps; geographical; spatial. OK, this could be a tweet rather than a post…

Read more »

ggplot2 GUI progress

June 19, 2010

(Written by Ian Fellows) Below is a link to the first of a weekly (or bi-weekly) screen-cast vlog of my progress building a GUI for the ggplot2 package.

http://neolab.stat.ucla.edu/cranstats/gsoc_vlog1.mov

Comments and suggestions are more than welcome, and can be e-mailed to me at: [email protected]

Read more »

The perfect fake

June 19, 2010

Usually when you are doing Monte Carlo testing, you want fake data that’s good, but not too good. You may want a sample taken from the Uniform distribution, but you don’t want your values to be uniformly distributed. In other words, if you were to order your sample values from lowest to highest, you don’t
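The excerpt is cut off, so the post's own construction is not visible here. As a rough sketch of the contrast it describes, the code below compares an ordinary random sample from the Uniform distribution with a "too good" sample whose sorted values are exactly evenly spaced. The function name `perfect_uniform` and the midpoint spacing are assumptions made for illustration.

```r
# An ordinary sample: drawn from Uniform(0, 1), but not evenly spread.
set.seed(1)
n <- 10
ordinary <- runif(n)

# A "too good" sample (hypothetical helper): the midpoints of n equal-width
# bins on (0, 1), returned in shuffled order.
perfect_uniform <- function(n) {
  evenly_spaced <- (seq_len(n) - 0.5) / n
  evenly_spaced[sample.int(n)]
}
perfect <- perfect_uniform(n)

# Gaps between consecutive sorted values: irregular versus constant.
print(diff(sort(ordinary)))
print(diff(sort(perfect)))
```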

Read more »

Why R doesn’t suck

June 19, 2010

I first encountered the R programming language a few years ago when I needed to make some plots. Although I’ve used it occasionally since, I always considered it a sort of “Perl for statisticians”: a useful Swiss Army knife with …

Read more »

Those dice aren’t loaded, they’re just strange

June 18, 2010

I must confess to feeling an almost obsessive fascination with intransitive games, dice, and other artifacts. The most famous intransitive game is rock, scissors, paper. Rock beats scissors.  Scissors beats paper. Paper beats rock. Everyone older than 7 seems to know this, but very few people are aware that dice can exhibit this same behavior,
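To make the claim about dice concrete, here is a small sketch using Efron's dice, a well-known intransitive set. The excerpt does not say which dice the post uses, so this choice, like the helper name `p_win`, is an assumption.

```r
# Probability that die `a` rolls strictly higher than die `b`,
# assuming fair dice (every face equally likely).
p_win <- function(a, b) mean(outer(a, b, ">"))

# Efron's dice (an assumption; not necessarily the dice from the post).
dice <- list(
  A = c(4, 4, 4, 4, 0, 0),
  B = c(3, 3, 3, 3, 3, 3),
  C = c(6, 6, 2, 2, 2, 2),
  D = c(5, 5, 5, 1, 1, 1)
)

# Each die beats the next one in the cycle A -> B -> C -> D -> A.
cycle <- c(
  A_beats_B = p_win(dice$A, dice$B),
  B_beats_C = p_win(dice$B, dice$C),
  C_beats_D = p_win(dice$C, dice$D),
  D_beats_A = p_win(dice$D, dice$A)
)
print(cycle)  # every entry is 2/3
```

`outer(a, b, ">")` builds the full 6-by-6 table of face pairings, so the mean of that logical matrix is the exact win probability rather than a simulation estimate.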

Read more »

Revolution Analytics: Startup to watch

June 18, 2010

Jack Germain of LinuxInsider interviewed Revolution CEO Norman Nie for his "Startup to Watch" column. Amongst the topics covered: the R language (Norman: "There are no statistical expressions that can not be written in R"), Revolution's recent name-change and announcement of our development roadmap, and the challenges of competing with SAS and Norman's former company, SPSS. Read the full...

Read more »

The impact of the drug war in Mexico

June 18, 2010

For the last couple of years, Mexico has been in the midst of an escalating drug war, with violent crime on the upswing in many areas. But tracking the impact quantitatively is difficult: in Mexico, about 85% of crimes go unreported, and corruption leads to inaccurate reporting in some districts. Diego Valle has taken on the task of visualizing...

Read more »

R: Command Line Calculator using Rscript

June 18, 2010

I currently use an awesome little bash trick to get a command line calculator that was posted on Lifehacker, and that I blogged about previously.

calc(){ awk "BEGIN{ print $* }" ;}

You just add this to your .bashrc file and then you can use it ...
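The excerpt ends before the Rscript version appears, so the following is only a guess at the general shape of such a calculator, not the post's code. It assumes a script saved under the hypothetical name `calc.R` and run as `Rscript calc.R "2 + 3 * 4"`.

```r
# Evaluate an arithmetic expression supplied as text, e.g. "2 + 3 * 4".
# Note: eval(parse()) runs arbitrary R code, so only feed it trusted input.
calc <- function(expr_text) {
  eval(parse(text = expr_text))
}

# When run via Rscript, treat the command-line arguments as the expression.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) > 0) {
  cat(calc(paste(args, collapse = " ")), "\n")
}
```

Compared with the awk one-liner, this gives access to all of R's functions (for example `sqrt` or `choose`) at the cost of R's slower start-up time.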

Read more »

R Commander – linear regression

June 18, 2010

We can fit various linear regression models using the R Commander GUI, which also provides various ways to examine the model diagnostics and so determine whether we need to consider a different model. The “Statistics” menu provides access to various statistical models via the “Fit models” sub-menu, including:
Linear regression – the simplest scenario with
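R Commander generates ordinary R commands behind its menus, and a linear regression fitted through the GUI corresponds to a call to `lm()`. As a rough console equivalent, using the built-in `cars` data set (chosen here only for illustration, it is not from the post):

```r
# `cars` ships with R: stopping distance (ft) against speed (mph).
fit <- lm(dist ~ speed, data = cars)
print(summary(fit))

# The standard diagnostic plots, written to a temporary PDF file.
diag_file <- file.path(tempdir(), "lm_diagnostics.pdf")
pdf(diag_file)
par(mfrow = c(2, 2))
plot(fit)
dev.off()
```

The four plots (residuals against fitted values, normal Q-Q, scale-location, and residuals against leverage) are the usual first check on whether a different model is needed.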

Read more »

Occupational Wage Comparison Plotted in R

June 17, 2010

Ever have conversations with your kids about what they are going to do with their life? Still trying to figure out what you are going to do with yours?

Best to not starve...

The chart above represents the percentage of each occupation that earn a given h...

Read more »

Do Not Log-Transform Count Data, Bitches!

June 17, 2010

OK, so, the title of this article is actually Do not log-transform count data, but, as @ascidacea mentioned, you just can’t resist adding the “bitches” to the end. Onwards. If you’re like me, when you learned experimental stats, you were taught to worship at the throne of the Normal Distribution. Always check your data and
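The excerpt breaks off before any analysis, so the sketch below is not taken from the post. It simply contrasts the log-transform approach the title warns against with a common alternative, a Poisson generalized linear model, on simulated data. All numbers (sample size, coefficients, seed) are made up for illustration.

```r
# Simulated count data; the true slope on the log scale is 1.2.
set.seed(42)
x <- runif(500)
y <- rpois(500, lambda = exp(0.5 + 1.2 * x))

# Model the counts directly with a Poisson GLM (log link).
glm_fit <- glm(y ~ x, family = poisson)

# The approach the title warns against: log-transform, then least squares.
# Adding 1 is needed because some counts are zero.
lm_fit <- lm(log(y + 1) ~ x)

print(coef(glm_fit))
print(coef(lm_fit))
```

The GLM coefficients are on the same log scale as the simulation parameters, so they can be compared with 0.5 and 1.2 directly, whereas the `log(y + 1)` fit estimates a different quantity.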

Read more »

Stack exchange for statistical analysis needs you!

June 17, 2010

The proposal to create a StackExchange site for statistical analysis is steadily moving forward. We have now completed the scoping stage which involved finding enough people willing to express an interest in the idea, and voting on some example questions to define what is allowed and what is not allowed on the site. The on-topic

Read more »

Chart the U.S. Gross National Product with the Federal Reserve API

June 17, 2010

The Federal Reserve Bank of St. Louis has an amazing amount of economic data available through its API. You need to apply for an API key, and once you have been approved you include your API key as a URL parameter to access the data.

api_key='YOUR API KE...
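As a sketch of what "API key as a URL parameter" means in practice, the code below only builds a request URL and makes no network call. The endpoint and parameter names follow the current FRED web service as I understand it and may differ from what the 2010 post used; the helper name `fred_url` and the placeholder key are assumptions.

```r
# Placeholder only; a real key must be requested from the St. Louis Fed.
api_key <- "YOUR_API_KEY"

# Build a request URL for one data series (hypothetical helper).
# Endpoint and parameter names are assumed from the current FRED API.
fred_url <- function(series_id, api_key) {
  paste0(
    "https://api.stlouisfed.org/fred/series/observations",
    "?series_id=", URLencode(series_id, reserved = TRUE),
    "&api_key=", URLencode(api_key, reserved = TRUE),
    "&file_type=json"
  )
}

# "GNP" is the series id assumed here for Gross National Product.
url <- fred_url("GNP", api_key)
print(url)
```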

Read more »

Installing Ruby on Linux as a User other than root

June 17, 2010

Ruby is best known as the language behind the Rails web application framework. However, it is a very flexible general-purpose language that can be used for tasks of direct interest to R developers (parsing files, interacting with databases, processing...

Read more »

Playing with Primes in R (Part II)

June 17, 2010

Popping Part III off the stack (where I ended up unexpectedly discovering that the primes and primlist functions are broken in the schoolmath package on CRAN), let's see what prime numbers look like when computed correctly in R. To do this, I've had to roll my own prime number generating function.

Personalizing primes in R

For what I want...
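The post's own generator is not shown in this excerpt. For readers who want something to experiment with, a basic sieve of Eratosthenes can be sketched as follows; the function name `primes_upto` is made up here and this is not claimed to be the author's method.

```r
# Sieve of Eratosthenes: all primes less than or equal to n.
# `primes_upto` is a hypothetical name, not the function from the post.
primes_upto <- function(n) {
  if (n < 2) return(integer(0))
  is_prime <- rep(TRUE, n)
  is_prime[1] <- FALSE
  i <- 2
  while (i * i <= n) {
    if (is_prime[i]) {
      # Cross out multiples of i, starting at i^2.
      is_prime[seq(i * i, n, by = i)] <- FALSE
    }
    i <- i + 1
  }
  which(is_prime)
}

print(primes_upto(30))
```

Starting the crossing-out at `i * i` is safe because any smaller multiple of `i` has a smaller prime factor and was already removed in an earlier pass.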

Read more »

Messing with R packages

June 17, 2010

This was really frustrating. I’m trying to modify a package from Matt Johnson, and although I could get the package he sent me to install flawlessly, I couldn’t un-tar it, make a change, re-tar it, and then R CMD INSTALL …

Read more »

Shrinking R’s PDF output

June 17, 2010

R is great for graphics, but I've found that the PDFs R produces when drawing large plots can be extremely large. This is especially common when using spplot() to plot a large raster. I've made a 15-page PDF full of rasters that was hundreds of MB in size. Obviously I don't need all the detail (every pixel of...

Read more »

A new Q&A website for Data-Analysis (based on StackOverFlow engine) – is waiting for you

June 17, 2010

What is the StackOverFlow Q&A website about? StackOverFlow.com (“SO” for short) is a programming Q&A site that’s free. Free to ask questions, free to answer questions, free to read. Free, and fast. For the R community, SO offers a growing database of R-related questions and answers (click the link to check them out). You might be asking yourself what’s...

Read more »

Learning R

June 17, 2010

When R is brought up among non-R users as a possibility for doing statistics, data mining, or any sort of predictive analytics, someone will invariably point out that R has a “steep learning curve”, and the response among those gathered usually includes a significant amount of head nodding. Even those who have put in heroic efforts to...

Read more »

Comparing standard R with Revolutions for performance

June 17, 2010

Following on from my previous post about improving the performance of R by linking with optimized linear algebra libraries, I thought it would be useful to try out the five benchmarks Revolution Analytics have on their Revolutionary Performance pages.

Read more »