2250 search results for "Map"

Faster R in Hadoop: rmr 1.3 now available

July 23, 2012

The RHadoop project continues the Big Data integration of R and Hadoop, with a new update to its rmr package. Version 1.3 of rmr improves the performance of map-reduce jobs for Hadoop written in R. New features include an optional vectorized API for efficient R programming when dealing with small records, and fast C implementations for serialization and deserialization from...

Modeling Trick: Impact Coding of Categorical Variables with Many Levels

July 23, 2012

One of the shortcomings of regression (both linear and logistic) is that it doesn’t handle categorical variables with a very large number of possible values (for example, postal codes). You can get around this, of course, by going to another modeling technique, such as Naive Bayes; however, you lose some of the advantages of regression...
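The teaser does not spell out the trick itself, so the following is only a minimal sketch under an assumption: that impact coding replaces each level of the categorical variable with the difference between the mean outcome for that level and the overall mean outcome, giving regression a single numeric column instead of thousands of indicator columns. It is written in Python for illustration; `impact_code`, `zip_codes`, and `outcome` are hypothetical names and toy data, not taken from the post.

```python
from collections import defaultdict


def impact_code(levels, y):
    """Map each level to mean(y | level) - mean(y). Assumed definition."""
    grand_mean = sum(y) / len(y)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level, value in zip(levels, y):
        sums[level] += value
        counts[level] += 1
    return {level: sums[level] / counts[level] - grand_mean for level in sums}


# Toy data: a many-level categorical variable and a numeric outcome.
zip_codes = ["94110", "94110", "10001", "10001", "60601"]
outcome = [10.0, 14.0, 4.0, 6.0, 11.0]

impact = impact_code(zip_codes, outcome)

# Levels never seen in training fall back to 0.0, i.e. "no known impact".
encoded = [impact.get(z, 0.0) for z in zip_codes + ["99999"]]
print(encoded)
```

The `encoded` column could then be passed to a linear or logistic model as an ordinary numeric predictor. In practice the per-level means would be estimated on data held out from the model fit to limit overfitting on rare levels.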

R for Ecologists: Phylogenies in R

July 22, 2012

I’ve only recently begun working from an evolutionary perspective, and I can’t imagine why I haven’t thought about it much before. After all, it comes up in just about everything that we do in ecology. For example, I’m currently feeding …

googleVis — where did SYTYCD dancers come from?

July 21, 2012

After watching the 20 wonderful dancers of the 9th season of So You Think You Can Dance, I presented a geomap of the states they come from. Now I am interested in the show’s history. I’d like to re-draw the …

Community Detection in Networks with R

I mainly post this visualization because I think it’s pretty. It reminds me a little of the work by the famous Dutch painter Mondrian. The complete matrix can be found here. The plot is a heatmap of an adjacency matrix generated by a weighted dir...

Coke vs Soda vs Pop : Linguistic trends analyzed with Twitter and R

July 19, 2012

Growing up in Australia, for me a carbonated drink like Pepsi or Fanta or lemonade was always just a "soft drink". (Also, 'lemonade' in Australia was something different to 'lemonade' in the US; it's something close to 7-Up.) So when I moved to Seattle, it was surprising to me that all such things were called "pop". And then I...

Health Care Costs – Part 3, "Why You Are Paying More"

July 19, 2012

Malpractice - A Booming Industry? Perhaps authors Frank Sloan, Randall Bovbjerg and Penny Githens capture it best in their book Insuring Medical Malpractice: "If aging Doctor Kildare were to return to medical practice today, having been...

Factor Attribution to improve performance of the 1-Month Reversal Strategy

July 16, 2012

Today I want to show how to use Factor Attribution to boost the performance of the 1-Month Reversal Strategy. The paper "Short-Term Residual Reversal" by D. Blitz, J. Huij, S. Lansdorp, and M. Verbeek (2011) presents the idea and discusses the results as applied to the US stock market since 1929. To improve 1-Month Reversal Strategy performance, the authors...

Example 9.38: dynamite plots, revisited

July 16, 2012

"Dynamite plot" is a somewhat pejorative term for a graphical display where the height of a bar indicates the mean, and the vertical line on top of it represents the standard deviation (or standard error). These displays are commonly found in many scientific disciplines, as a way of communicating group differences in means. Many...
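As a rough illustration of what such a display encodes, the sketch below computes the two numbers a dynamite plot draws for each group: the bar height (the group mean) and the whisker length (here the sample standard deviation). It is written in Python for illustration; `dynamite_summary` and the toy `groups` data are hypothetical and not taken from the post.

```python
from statistics import mean, stdev


def dynamite_summary(groups):
    """Return {group: (bar_height, whisker)} using the mean and sample SD."""
    return {name: (mean(values), stdev(values)) for name, values in groups.items()}


# Toy measurements for two groups.
groups = {
    "control": [2.0, 4.0, 6.0],
    "treated": [5.0, 7.0, 9.0],
}

summary = dynamite_summary(groups)
for name, (bar_height, whisker) in summary.items():
    print(f"{name}: bar height = {bar_height:.2f}, whisker = {whisker:.2f}")
```

Everything else about the data (sample size, skew, outliers) is discarded once it has been reduced to these two numbers per group.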

Project Euler — problem 14

July 16, 2012

It’s Monday today! It’s a work day! And I’ve already worked on the computer for two hours. Time for a break, which is the 14th problem of Project Euler. The following iterative sequence is defined for the set of positive integers: n → n/2 (n …
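The teaser is cut off before the full rule is stated, so the sketch below rests on an assumption: that the sequence is the usual Collatz rule (halve an even n, map an odd n to 3n + 1, stop at 1) and that the task is to find the starting number below some limit that produces the longest chain. It is written in Python for illustration; `collatz_length` and `longest_chain_start` are hypothetical helper names.

```python
def collatz_length(n, cache):
    """Number of terms in the chain from n down to 1, memoised in cache."""
    path = []
    while n not in cache:
        path.append(n)
        n = n // 2 if n % 2 == 0 else 3 * n + 1
    length = cache[n]
    # Walk back along the path, recording the chain length of every value seen.
    for value in reversed(path):
        length += 1
        cache[value] = length
    return length


def longest_chain_start(limit):
    """Return (start, length) for the longest chain with 1 <= start < limit."""
    cache = {1: 1}
    best_start, best_length = 1, 1
    for start in range(1, limit):
        length = collatz_length(start, cache)
        if length > best_length:
            best_start, best_length = start, length
    return best_start, best_length


print(collatz_length(13, {1: 1}))  # 13 -> 40 -> 20 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
print(longest_chain_start(10))
```

The cache means each intermediate value is resolved only once, which keeps the search fast for larger limits such as one million.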
