Twitter Math Puzzle and Solution

July 7, 2011

Yesterday I posted a very simple math puzzle to Twitter that I found in Jonathan Baron’s book, Thinking and Deciding. The puzzle is the following: Show that every number of the form ABC,ABC is divisible by 13. The puzzle comes up in Baron’s book as an example of an “insight problem” in which one goes
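
One way to see why the claim holds (a short sketch, not necessarily the argument given in Baron's book): write ABC for the three-digit block, so that repeating it is the same as multiplying by 1001.

```latex
\begin{align*}
\overline{ABC{,}ABC} &= 1000 \cdot \overline{ABC} + \overline{ABC} \\
                     &= 1001 \cdot \overline{ABC} \\
                     &= 7 \cdot 11 \cdot 13 \cdot \overline{ABC},
\qquad \text{where } \overline{ABC} = 100A + 10B + C .
\end{align*}
```

Since 13 divides 1001, it divides every number of this form (and so do 7 and 11).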


R: calculations involving months

July 7, 2011

Ask anyone how much time has elapsed since September last year and they’ll probably start counting on their fingers: “October, November…” and tell you “just over 9 months.” So, when faced as I was today with a data frame (named dates) like this: How to add a 7th column, with the number of months between


Things I would tell a budding bioinformatician to learn.

July 7, 2011

I recently read Ewan Birney's blog post, which I found echoed a lot of my own thoughts about the use of statistics in computational biology. I thought I would compile my own similar list but for bioinformatics / computational biology in general. I have not been in the field as long as Ewan and I certainly still...


Descriptive statistics, causal inference, and story time

July 7, 2011

Dave Backus points me to this review by anthropologist Mike McGovern of two books by economist Paul Collier on the politics of economic development in Africa. My first reaction was that this was interesting but non-statistical so I’d have to either post it on the sister blog or wait until the 30 days of statistics


Necessity to Explain CDS with A Regime Switching Model

Examining the determinants of credit default swap (CDS) spreads is a hot topic. CDS spreads have displayed significant regime-switching behaviour since the outbreak of the credit crisis, which can be seen from the old graph in the post Credit Default Spread a...


Call for a Special Topic on Grid and Cloud Computing Methods in Biomedical Research

Today, the AG Statistical Computing released the “Call for a Special Topic on Grid and Cloud Computing” in the journal “Methods of Information in Medicine”. We are inviting submissions for a special topic of Methods of Information in Medicine on “Grid and Cloud Computing Methods in Biomedical Research”. This special topic call originates from a


Use R!

July 7, 2011

In short: R is a free intuitive programming language that is used by practitioners in a plethora of academic disciplines. Therefore, it is on the cutting edge, and expanding rapidly. It creates stunning visuals, works seamlessly together with LaTeX, has really good online documentation and the community is unparalleled. A week...


Rcpp 0.9.5

A maintenance release version 0.9.5 of Rcpp is now on CRAN and in Debian. This release comprises a number of minor fixes, extensions as well as small additions to the documentation and examples which have accumulated since the last release in Apr...


Men with Hats

July 6, 2011

Suppose N people (and their hats) attend a party (in the 1950s). For fun, the guests mix their hats in a pile at the center of the room, and each person picks a hat uniformly at random. What is the probability that nobody ends up with their own hat? E...
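
The question can be explored numerically. The sketch below is an illustration rather than the original post's solution: it assumes the standard inclusion-exclusion formula for derangements, P(N) = sum over k from 0 to N of (-1)^k / k!, and checks it against brute-force enumeration for small N. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import exp, factorial


def derangement_probability(n: int) -> Fraction:
    """Exact probability that none of n guests gets their own hat,
    via inclusion-exclusion: sum_{k=0}^{n} (-1)^k / k!."""
    terms = (Fraction((-1) ** k, factorial(k)) for k in range(n + 1))
    return sum(terms, Fraction(0))


def brute_force_probability(n: int) -> Fraction:
    """Same probability by enumerating all n! hat assignments (small n only)."""
    perms = list(permutations(range(n)))
    # Count assignments in which no guest receives the hat with their own index.
    no_match = sum(
        all(hat != guest for guest, hat in enumerate(p)) for p in perms
    )
    return Fraction(no_match, len(perms))


if __name__ == "__main__":
    for n in range(1, 8):
        exact = derangement_probability(n)
        assert exact == brute_force_probability(n)
        print(n, exact, float(exact))
    print("1/e =", exp(-1))
```

The exact values settle quickly: the probability approaches 1/e as N grows, so the answer barely depends on the size of the party.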


rasterVis

The raster package defines classes and methods for spatial raster data access and manipulation. The new rasterVis package complements raster by providing a set of methods for enhanced visualization and interaction. It is now on CRAN. Several examples can ...


How Marketo uses Revolution R Enterprise

July 6, 2011

Marketo, a leading marketing automation company, relies on data analysis to implement the features in its hosted application that help companies get the most out of their marketing dollar. We've just published a case study about how Marketo uses Revolution R Enterprise and the R language to analyze the massive data sets generated by their customers: “I use it...


Importing google news data to R

July 6, 2011

I've been playing around lately with the stock market data available from Google Finance, through quantmod in R. Here's a function I've written (which depends on the R Data Science Toolkit), to pull news stories related to a stock from Google, parse t...


Early stopping and penalized likelihood

July 6, 2011

Maximum likelihood gives the best fit to the training data but in general overfits, yielding overly-noisy parameter estimates that don't perform so well when predicting new data. A popular solution to this overfitting problem takes advantage of the iterative nature of most maximum likelihood algorithms by stopping early. In general, an iterative optimization algorithm goes from a...


Yet another way to use R in Excel for .NET programmer

July 6, 2011

Last night I wrote an article titled "Another way to use R in Excel for .NET programmer". The approach in that article requires an IDE to write the C# program. On the other hand, Excel-DNA gives us an easier way to create an XLL. Let me show you one ...


Artificial intelligence in trading: k-means clustering

July 6, 2011

There are many flavors of artificial intelligence (AI); here I want to show a practical example of cluster analysis. It is very applicable in finance. For example, one of the stylized facts of volatility is that it moves in clusters, meaning that today’s volatility is likely to be similar to yesterday’s volatility. To gauge these moves you


Google Correlate Certainly Does Not Imply Causation

July 6, 2011

I recently heard about a new tool called Google Correlate that helps one find Google search patterns that correspond to (i.e. correlate with) real-world trends. For those that don't get it yet, the tool allows one to type in a search term and the tool ...


Color reduction of an image – and Warholize?

July 5, 2011

There seem to be several methods out there for reducing the colors in an image. I became interested in this after pondering how this is done in the excellent freeware program IrfanView. Unfortunately, their method is not described anywhere that I coul...


ARMA Models for Trading, Part VI

July 5, 2011

All posts in this series were combined into a single, extended tutorial and posted on my new blog. In the fourth posting in this series, we saw the performance comparison between the ARMA strategy and buy-and-hold over the last approximately 10 years. Over the last few weeks (it does take time, believe me) I back-tested


Even faster linear model fits with R using RcppEigen

Linear regression models are a major component of every applied researcher's toolbox. Obtaining results more quickly is therefore of central importance, particularly when many such models have to be fit. Common examples in this context are Monte Carl...


A Quantstrat to Build On Part 6

July 5, 2011

THIS IS NOT INVESTMENT ADVICE.  ACTING ON THIS MAY LOSE LOTS OF MONEY. In A Quantstrat to Build on Part 5, I promised some performance reporting on quantstrat portfolios, but then in REIT Momentum in Quantstrat, I discovered it is not nearly as ea...


New R User Group in Argentina

July 5, 2011

A new local R user group has formed in Buenos Aires, Argentina, under the succinct name ".aR". They're currently putting together the agenda for their first meeting, and are looking for speakers with expertise in the BioConductor project and Finance. If you'd like to join the group, check out the Spanish-language website for .aR, or follow @ar_usergroup on Twitter....


Sentiment Analysis for Airlines via Twitter

July 5, 2011

Last weekend here in the States was the 4th of July long weekend, one of the busier air travel periods of the year. As anyone who flies in the States knows, with air travel often comes frustration, and in this social media age many express their frustration on Twitter: The image above comes from a tutorial on text mining...


Example 9.1: Scatterplots with binning for large datasets

July 5, 2011

Scatterplots can get very hard to interpret when displaying large datasets, as points inevitably overplot and can't be individually discerned. A number of approaches have been crafted to help with this problem. One approach uses binning. This approa...


Another way to use R in Excel for .NET programmer

July 5, 2011

As you know, RExcel gives us a way to combine R with Excel. But it is a bother to install some COM components, and it may feel less like programming than Excel manipulation! If you are a .NET programmer, there is another way to call R from Excel. I would like to ...


Different goals, different looks: Infovis and the Chris Rock effect

July 5, 2011

Seth writes: Here’s my candidate for bad graphic of the year: I studied it and learned nothing. I have no idea how they assigned colors to locations. I already knew that there were more within-city calls than calls to individual distant locations — for example that there are more SF-SF calls than SF-LA calls.


In 4 Steps your Application (including R) is running on a Cloud Computing Cluster

Today, cloud computing is used in many application areas from academic research to industry. Commercial cloud providers such as Amazon Web Services (AWS) advertise simple and fast access to cloud computing resources. Posts on various blogs prove that you can get your application running in the cloud, but it will cost you more than 15


Bounded target support

July 4, 2011

Here is an interesting question from Tomàs that echoes a lot of related emails: I’m turning to you for advice. I’m facing a problem where the parameter space is bounded, e.g. all parameters have to be positive. If in MCMC I use a normal distribution as the proposal distribution, then at some iterations I get negative proposals. So my


slides from my R tutorial on Twitter text mining #rstats

July 4, 2011

Update: An expanded version of this tutorial will appear in the new Elsevier book Practical Text Mining and Statistical Analysis for Non-structured Text Data Applications by Gary Miner et al., which is now available for pre-order from Amazon. In conjunction with the book, I have cleaned up the tutorial code and published it on GitHub.


R.NET

July 4, 2011

The R.NET project provides a mechanism for communicating with R from a .NET application. This appears to be a promising way to create simple interfaces to some of the functionality of R. Some examples of using R.NET can be found here and here.
