## Software Signals

January 7, 2013
This blog post by Sean Taylor generated quite a stir. He discussed the signals one sends by using certain software packages and seems to think that R users are more competent. The reactions ranged from amusement to bashing. In defense of hard to learn statistical tools, i.e. #rstats prsm.tc/gyTBRK <- pretty funny 'who uses what I encourage you...

## Market predictions for year 2013

January 7, 2013
Calibrations of 2013 predictions for 18 equity indices — plus some publicly available predictions. Orientation The distributions are an attempt to see the variability if there were no market-driving news for the whole year. Another way of thinking: mentally moving the distribution to center on a prediction gives a sense of the variability of results … Continue reading...

## Using the Rcpp sugar function clamp

January 7, 2013
Since the 0.10.* release series, Rcpp contains a new sugar function clamp which can be used to limit vectors to both a minimum and maximum value. This recent StackOverflow question permitted clamp to shine. We retake some of the answers, including the ...
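
For readers who want to see what clamping does before reading the post, here is a minimal sketch in base R. It is not the Rcpp sugar function itself, only an equivalent built from `pmax` and `pmin`; the vector `x` and the bounds `lo` and `hi` are made-up values for illustration.

```r
# Illustrative input and bounds (assumptions, not taken from the post)
x  <- c(-3, 0.5, 2, 10)
lo <- 0
hi <- 5

# Raise everything below lo up to lo, then cap everything above hi at hi
clamped <- pmin(pmax(x, lo), hi)
print(clamped)
```

Values already inside the interval pass through unchanged; only the out-of-range entries are moved to the nearest bound.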

## Internal Consistency Reliability in R with Lambda4

January 6, 2013
For the last year I have been developing a package “Lambda4” to improve internal consistency reliability estimation. In the package’s conception my primary concern centered on H.G. Osburn’s maximized lambda4 estimator. Despite a very thorough search I could not find a stats package that could utilize Osburn’s method. I wanted to learn R and so I jumped in and...

## Demonstrating Confidence Intervals with Shiny

January 6, 2013
For the introductory statistics student, confidence intervals can seem a daunting concept to grasp. Quite simply put, it is an interval within which we have a certain measure of confidence that the population parameter falls. The 95% confidence level is the most common value chosen in my academic circle. Nevertheless, many others may be viable as well as long as...

## http://cran.r-project.org/web/packages/Lambda4/index.html

January 6, 2013
http://cran.r-project.org/web/packages/Lambda4/index.html: Our own JackStat (Tyler) published his first package in R.

## Batch forecasting in R

January 6, 2013
I sometimes get asked about forecasting many time series automatically. Here is a recent email, for example: I have looked but cannot find any info on generating forecasts on multiple data sets in sequence. I have been using analysis services for sql server to generate fitted time series but it is too much of a black box (or I...

## Search and replace: Are you tired of nested `ifelse`?

January 6, 2013
It happens all the time: you have a vector of fruits and you want to replace all bananas with apples, all oranges with pineapples, and leave all the other fruits as-is, or maybe change them all to figs. The usual solution? A big old nested `ifelse`: ...
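
The excerpt stops before the post's own solution, so the following is only a sketch of one common alternative, not necessarily the author's. It contrasts the nested `ifelse` with a named lookup vector; the vector `fruits` and the name `lookup` are illustrative assumptions.

```r
# Example data (an assumption for illustration)
fruits <- c("banana", "orange", "kiwi", "banana")

# The usual approach: one nesting level per replacement rule
nested <- ifelse(fruits == "banana", "apple",
          ifelse(fruits == "orange", "pineapple", fruits))

# Alternative sketch: keep the rules in a named vector, old value = new value
lookup <- c(banana = "apple", orange = "pineapple")

# Replace only the entries that have a rule; leave the rest as-is
hits <- fruits %in% names(lookup)
replaced <- fruits
replaced[hits] <- unname(lookup[fruits[hits]])

print(replaced)
```

Adding a new rule then means adding one element to `lookup` rather than another level of nesting.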

## Demonstrate your R code with an interactive, embeddable Javascript widget

January 6, 2013
Let visitors execute and play with simple R examples right on your web page, thanks to a web service and an embeddable widget provided by the Sage project.

## 2012 Summary and 2013 Plans

January 6, 2013
2012 was a very important year for me. It was my first full year of trading only pure quantitative strategies. It was a very successful year as well, despite the fact that the S&P 500 returned 16% (including dividends) – a tough to beat benchmark. The strategy I use on the SPY, for which I

## Bayesian Classification with Gaussian Process

January 6, 2013
Despite the prowess of the support vector machine, it is not specifically designed to extract features relevant to the prediction. For example, in network intrusion detection, we need to learn relevant network statistics for the network defense. In consu...

## More Principal Components Fun

January 6, 2013
Today, I want to continue with the Principal Components theme and show how the Principal Component Analysis can be used to build portfolios that are not correlated to the market. Most of the content for this post is based on the excellent article, “Using PCA for spread trading” by Jev Kuznetsov. Let’s start by loading

## PLS Path Modeling with R: A Comprehensive Tutorial by Gaston Sanchez

January 6, 2013
Gaston Sanchez has just published an online pdf of his new book PLS Path Modeling with R. I have been using Gaston's plspm R package for a couple of years to analyze marketing data. I started when I needed to test a path model in wh...

## Querying an SQLite database from R

January 6, 2013
You have an SQLite database, perhaps as part of some replication materials, and you want to query it from R. You might want to be able to say: results <- runsql("select * from mytable order by date") and get the results back as an R object. Here's a function to do it. In the following,
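
The post's own function is cut off in this excerpt. As a rough idea of what such a helper could look like, here is a sketch using the DBI and RSQLite packages. The `dbname` argument, the temporary database file, and the contents of `mytable` are assumptions made for this example and are not taken from the post.

```r
library(DBI)

# Hypothetical helper: open the database, run one query, always disconnect
runsql <- function(sql, dbname) {
  con <- dbConnect(RSQLite::SQLite(), dbname)
  on.exit(dbDisconnect(con))
  dbGetQuery(con, sql)
}

# Build a small throwaway database so the example is self-contained
db <- tempfile(fileext = ".sqlite")
con <- dbConnect(RSQLite::SQLite(), db)
dbWriteTable(con, "mytable",
             data.frame(date = c("2013-01-06", "2013-01-05"),
                        value = c(2, 1)))
dbDisconnect(con)

# The query from the excerpt, returned as a data frame
results <- runsql("select * from mytable order by date", db)
print(results)
```

Using `on.exit` means the connection is closed even if the query fails.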

## What Are Your Favorite Methodology and Statistics Blogs?

January 6, 2013
I recently searched for a list of the "top statistics blogs" or the "top methodology blogs" and I couldn't find a recent compilation. This contrasts with visualization blogs, which are relatively easy to find (e.g. top visualization blogs). I've decided to initiate the provision of this public good, but would like to draw on others'

January 6, 2013
Update 31 January: I've folded source_GitHubData into the repmis package. See this post. Update 7 January 2012: I updated the internal workings of source_GitHubData so that it now relies on httr rather than RCurl. Also it is more directly descended ...

## Sequential testing in a triangle test setting

January 6, 2013
It is well known that the binomial test never has an error of exactly 5%. You aim for at most 5%, calculate the number correct to get there and end up with an error of e.g. 2%. This is a shame but there is no solution. However, it is also an opportunity; the...
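
The gap between the nominal and the attained error can be checked directly. The sketch below assumes a triangle test with guessing probability 1/3 and a panel of 30 assessors; the panel size is an arbitrary choice for illustration and does not come from the post.

```r
n     <- 30    # number of assessors (assumed for illustration)
p0    <- 1/3   # chance of a correct answer by guessing in a triangle test
alpha <- 0.05  # nominal error we aim not to exceed

# Smallest number of correct answers whose upper tail is at most alpha
crit <- qbinom(1 - alpha, n, p0) + 1

# Attained error: P(X >= crit) under guessing
attained <- 1 - pbinom(crit - 1, n, p0)

# Requiring one fewer correct answer would overshoot alpha
too_liberal <- 1 - pbinom(crit - 2, n, p0)

print(c(critical = crit, attained = attained, one_fewer = too_liberal))
```

Because the number of correct answers is discrete, the attained error jumps from above 5% to below it between two adjacent critical values and does not land on 5% exactly.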

## tolower() – error catching unmappable characters

January 6, 2013
The tolower() function returns an error where it can’t map to the Unicode character set of the input data – a common occurrence when analysing social media data with emoticons. Emoticons are those symbols that are commonly used on mobile phones but aren’t always recognised on all platforms. For example, when converting tweets to @delta
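
The excerpt is cut off before the post's fix. One generic way to keep a batch conversion from stopping at the first problematic string is to wrap `tolower()` in `tryCatch`, as sketched below. The helper name `safe_tolower` and the choice to return `NA` on failure are assumptions for this example.

```r
# Hypothetical wrapper: lower-case each element, returning NA where tolower() errors
safe_tolower <- function(x) {
  vapply(x, function(s) {
    tryCatch(tolower(s), error = function(e) NA_character_)
  }, character(1), USE.NAMES = FALSE)
}

print(safe_tolower(c("Hello", "WORLD")))
```

Whether a given string triggers the error depends on its encoding and on the session locale, so the entries that come back as `NA` are the ones to inspect or re-encode.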

## Performance Benchmark of Running Sum Functions

January 6, 2013
First, let us consider a running sum function in pure R. To get started, I looked at the source code of the TTR package to see the algorithm used in runSum. The runSum function uses a Fortran routine to compute the running/rolling sum of a vector. The ...
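
As a point of reference for what a running sum computes, here is a small pure R sketch based on differences of cumulative sums. It is not the Fortran routine used by TTR and not necessarily the version benchmarked in the post; the function name `run_sum` is an assumption, and the sketch assumes `n` is between 1 and `length(x)`.

```r
# Running sum over a window of n observations (assumes 1 <= n <= length(x)).
# The first n - 1 positions have no complete window and are set to NA.
run_sum <- function(x, n) {
  cs <- cumsum(x)
  # Sum of the window ending at i is cs[i] - cs[i - n], taking cs[0] as 0
  windows <- cs[n:length(x)] - c(0, head(cs, -n))
  c(rep(NA_real_, n - 1), windows)
}

print(run_sum(1:5, 2))
```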

## Using the Rcpp Timer

January 6, 2013
Since the 0.10.2 release, Rcpp contains an internal class Timer which can be used for fine-grained benchmarking. Romain motivated Timer in a post to the mailing list where Timer is used to measure the different components of the costs of random number...

## The statistics software signal

January 5, 2013
Tyler Cowen links to a post by Sean Taylor, who writes the following about users of R: You are willing to invest in learning something difficult. You do not care about aesthetics, only availability of packages and getting results quickly. To me, R is easy and SAS is difficult. I once worked with some students...

## R/Finance 2013 Call for Papers

January 5, 2013
It’s that time of year again – we’ve just posted our Call for Papers for the R/Finance 2013 conference, which focuses on applied finance using R. This is our fifth annual conference, again organized by a group of R package authors and community contributors and hosted by the International Center for Futures and Derivatives (ICFD)

## Monotonic deshrinking in weighted averaging models

January 5, 2013
Weighted averaging regression and calibration is the most widely used method for developing a palaeolimnological transfer function. Such models are used to reconstruct properties of the past lake environment such as pH, total phosphorus, and water temperature with, it has … Continue reading →

## Infinite generators in R

January 5, 2013
This is the first in a series of posts about creating simulations in R. As a foundational discussion, I first look … Continue reading »

## What’s that “pre- and post-multiply” stuff?

January 5, 2013
Often in SEM scripts you will see matrices being pre- and post-multiplied by some other matrix. For instance, this figures in scripts computing the genetic correlation between variables. How does pre- and post-multiplying a variance...
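
A familiar instance of this pattern is standardizing a covariance matrix into a correlation matrix: pre- and post-multiplying by a diagonal matrix of inverse standard deviations. The sketch below uses a made-up 2 x 2 covariance matrix `V` and checks the result against base R's `cov2cor`; it illustrates the general idea and is not taken from the post's scripts.

```r
# Illustrative covariance matrix (an assumption for this example)
V <- matrix(c(4, 2,
              2, 9), nrow = 2, byrow = TRUE)

# Diagonal matrix holding 1 / standard deviation of each variable
S <- diag(1 / sqrt(diag(V)))

# Pre-multiplying rescales the rows, post-multiplying rescales the columns
R <- S %*% V %*% S

print(R)
print(all.equal(R, cov2cor(V)))
```

Each element of `V` ends up divided by the product of the two relevant standard deviations, which is the definition of a correlation.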

## National identification number: Finland part 3

January 5, 2013
Last part of our short series about the Finnish social security number (Fssn). You can check part 1 here, and part 2 here. In this last post we are interested in generating random Fssn's. This has no real world applications. It is just a coding excercis...