Trading Strategy 1: What goes up, goes up…

June 26, 2013

As I said earlier, my main task at my internship is to hunt for profitable strategies. As you can imagine, strategies can range from the exceedingly simple and easy to implement, to the crazily complex. Let’s start out with one …

Looking out for volatility

June 26, 2013

Let’s do an easy experiment: let’s calculate the 25-day rolling volatility of the S&P 500 from 2007 onwards.
1. Get the data: getSymbols('SPY', from='2007/01/01')
2. Run the volatility function from the TTR package (which comes along with quantmod): vol = volatility(SPY, n=25, N=252, calc='close') # n=25 means we want 25 …
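To make the idea of a close-to-close rolling volatility concrete, here is a minimal base-R sketch. It runs on simulated prices rather than downloaded SPY data, and the helper name `rolling_vol` is hypothetical. TTR's `volatility()` may use slightly different window conventions, so treat this as an approximation of the idea and not a reimplementation.

```r
# Simulated daily closing prices (assumption: a stand-in for SPY closes)
set.seed(42)
n_days <- 300
close <- 100 * exp(cumsum(rnorm(n_days, mean = 0, sd = 0.01)))

# Daily log returns
r <- diff(log(close))

# Hypothetical helper: annualized standard deviation of the last n returns.
# n is the window length in days, N the number of trading days per year.
rolling_vol <- function(returns, n = 25, N = 252) {
  out <- rep(NA_real_, length(returns))
  if (length(returns) < n) return(out)
  for (i in n:length(returns)) {
    out[i] <- sqrt(N) * sd(returns[(i - n + 1):i])
  }
  out
}

vol <- rolling_vol(r, n = 25, N = 252)
tail(vol, 3)
```

The first `n - 1` entries are `NA` because a full window is not yet available. Multiplying by `sqrt(N)` converts the daily figure to an annualized one.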

Using R: Two plots of principal component analysis

June 26, 2013

PCA is a very common method for exploration and reduction of high-dimensional data. It works by making linear combinations of the variables that are orthogonal, and is thus a way to change basis to better see patterns in data. You either do spectral decomposition of the correlation matrix or singular value decomposition of the data
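The two routes mentioned above can be checked against each other with a short base-R sketch. The data are simulated and the variable names are illustrative assumptions, not taken from the original post.

```r
# Simulated data: 50 observations of 4 variables, two of them correlated
set.seed(1)
n <- 50
X <- matrix(rnorm(n * 4), nrow = n, ncol = 4)
X[, 2] <- X[, 1] + 0.5 * X[, 2]
colnames(X) <- paste0("v", 1:4)

# Route 1: spectral (eigen) decomposition of the correlation matrix
eig <- eigen(cor(X))

# Route 2: singular value decomposition of the centered, scaled data
Z <- scale(X)
sv <- svd(Z)
var_from_svd <- sv$d^2 / (n - 1)  # squared singular values give the variances

# prcomp() with scaling should agree with both routes
pc <- prcomp(X, center = TRUE, scale. = TRUE)

round(cbind(eigen = eig$values, svd = var_from_svd, prcomp = pc$sdev^2), 4)
```

The component variances agree across the three columns. The loading vectors (`eig$vectors` and `sv$v`) agree too, except that the sign of each column is arbitrary.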

Technical(and not technical) strategy testing

June 25, 2013

I got "hooked" on R's OOP approach, in particular reference classes. After my last little project on option scenario analysis, I restructured my messy technical strategy testing code. To begin, I would like to explain why I did this when the nice "blotter" and "quantstrat" packages already exist. First of all, "quantstrat" is faster than blotter, which...

Natural Language Processing Tutorial

June 25, 2013

Introduction This will serve as an introduction to natural language processing. I adapted it from slides for a recent talk at Boston Python. We will go from tokenization to feature extraction to creating a model using a machine learning algorithm. The goal is to provide a reasonable baseline on top of which more complex natural language processing can be done, and...

My Talk at Boston Python

June 25, 2013

I just gave a talk at Boston Python about natural language processing in general, and edX ease and discern in particular. You can find the presentation source here, and the web version of it here. There is a video of it here. Nelle Varoquaux and Michael ...

rClr: low level access to .NET from R

June 25, 2013

rClr is a package to access arbitrary .NET code seamlessly. The "CLR" acronym part of the package name stands for Common Language Runtime. C# and R being languages I regularly use, I have felt the need for better interoperability between these for a fe...

Split violin plots

June 25, 2013

(This article was first published on Ecology in silico, and kindly contributed to R-bloggers) Violin plots are useful for comparing distributions. When data are grouped by a factor with two levels (e.g. males and females), you can split the violins in half to see the difference between groups. Consider a 2 x 2 factorial experiment: treatments A and B...

Sample size calculations equivalent to Stata functions

June 25, 2013

A comprehensive guide to time series plotting in R

June 25, 2013

As R has evolved over the past 20 years, its capabilities have improved in every area. The visual display of time series is no exception, as the folks from Timely Portfolio note: Through both quiet iteration and significant revolutions, the volunteers of R have made analyzing and charting time series pleasant. R began with the basics, a simple...

Getting started with R

June 25, 2013

I wanted to avoid advanced topics in this post and focus on some “blocking and tackling” with R in an effort to get novices started.  This is some of the basic code I found useful when I began using R just over 6 weeks ago. Reading in data from a .csv file is a breeze with this command. > data =...

The Dream 8 Challenges

June 25, 2013

The 8th iteration of the DREAM Challenges is underway. DREAM is something like the Kaggle of computational biology with an open science bent. Participating teams apply machine learning and statistical modeling methods to biological problems, competing to achieve the best predictive accuracy. This year's three challenges focus on reverse engineering cancer, toxicology and the kinetics of...

Three Ways to Run Bayesian Models in R

June 25, 2013

There are different ways of specifying and running Bayesian models from within R. Here I will compare three different methods: two that rely on an external program and one that relies only on R. I won’t go into much detail about the differences in syntax; the idea is more to give a gist about how the different modeling languages...

Exploratory Data Analysis: 2 Ways of Plotting Empirical Cumulative Distribution Functions in R

Introduction Continuing my recent series on exploratory data analysis (EDA), and following up on the last post on the conceptual foundations of empirical cumulative distribution functions (CDFs), this post shows how to plot them in R.  (Previous posts in this series on EDA include descriptive statistics, box plots, kernel density estimation, and violin plots.) I

Predicting spatial locations using point processes

June 25, 2013

I’ve uploaded a draft tutorial on some aspects of prediction using point processes. I wrote it using R-Markdown, so there’s bits of R code for readers to play with. It’s hosted on Rpubs, which turns out to be a great deal more convenient than WordPress for that sort of thing.

-omics in 2013

June 24, 2013

Just how many (bad) -omics are there anyway? Let’s find out. 1. Get the raw data It would be nice if we could search PubMed for titles containing all -omics: However, we cannot since leading wildcards don’t work in PubMed search. So let’s just grab all articles from 2013: and save them in a format

Visualising Crime Hotspots in England and Wales using {ggmap}

June 24, 2013

Two weeks ago, I was looking for ways to make pretty maps for my own research project. A quick search led me to some very informative blog posts by Kim Gilbert, David Smith and Max Marchi. Eventually, I Google'd the excellent crime weather map exa...

Comparing the speed of pqR with R-2.15.0 and R-3.0.1

June 24, 2013

As part of developing pqR, I wrote a suite of speed tests for R. Some of these tests were used to show how pqR speeds up simple real programs in my post announcing pqR, and to show the speed-up obtained with helper threads in pqR on systems with multiple processor cores. However, most tests in

Exploratory Data Analysis: Conceptual Foundations of Empirical Cumulative Distribution Functions

Introduction Continuing my recent series on exploratory data analysis (EDA), this post focuses on the conceptual foundations of empirical cumulative distribution functions (CDFs); in a separate post, I will show how to plot them in R.  (Previous posts in this series include descriptive statistics, box plots, kernel density estimation, and violin plots.) To give you
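As a small illustration of the concept (not taken from the original post), the empirical CDF at a point x is the proportion of observations less than or equal to x. The sketch below compares a hand-written version, under the hypothetical name `ecdf_manual`, with base R's `ecdf()` on a made-up sample.

```r
# Made-up sample, for illustration only
x <- c(2.1, 3.5, 3.5, 4.0, 5.2)

# Hypothetical helper: proportion of observations <= each query point
ecdf_manual <- function(sample, q) {
  sapply(q, function(v) mean(sample <= v))
}

# Base R returns the empirical CDF as a step function
Fn <- ecdf(x)

q <- c(1, 3.5, 4.5, 6)
ecdf_manual(x, q)  # 0.0 0.6 0.8 1.0
Fn(q)              # same values
```

Note that the tied value 3.5 makes the function jump by 2/5 at that point, since two of the five observations sit there.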


Merging Data — SAS, R, and Python

June 24, 2013

On analyticbridge, the question was posed about moving an inner join from Excel (which was taking many minutes via VLOOKUP()) to some other package.  The question asked what types of performance can be expected in other systems.  Of the list ...
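For readers unfamiliar with the operation, an inner join keeps only the rows whose key appears in both tables. A minimal sketch in base R follows; the data frames and column names are invented for illustration and are not from the original post.

```r
# Two small made-up tables sharing the key column "id"
orders <- data.frame(id = c(1, 2, 3, 4), amount = c(10, 20, 30, 40))
customers <- data.frame(id = c(2, 3, 5), name = c("b", "c", "e"),
                        stringsAsFactors = FALSE)

# merge() performs an inner join by default (all = FALSE)
inner <- merge(orders, customers, by = "id")
inner
```

Only ids 2 and 3 survive, because they are the only keys present in both tables. Setting `all.x = TRUE` would instead keep every row of `orders`, which is closer to what a VLOOKUP column does.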

Rcpp 0.10.4

June 24, 2013

A new version of Rcpp is now on the CRAN network for GNU R; binaries for Debian have been uploaded as well. This release brings a fairly large number of fixes and improvements across a number of Rcpp features; see below for the detailed list. We a...

A beer recommendation system made with R

June 24, 2013

If you know a beer you like and want some recommendations for a style of beer to try, check out the yhat Beer Recommender: This neat little app is the product of a recommendation system built using the R language by the folks behind the yhat blog. It's based on about 1.5 million beer reviews from the Beer Advocate....

My Stat Bytes talk, with slides and code

June 24, 2013

On Thursday of last week I gave a short informal talk to Stat Bytes, the CMU Statistics department's twice a month computing seminar. Quick tricks for faster R code: Profiling to Parallelism Abstract: I will present a grab bag of tricks to speed up your R code. Topics will include: installing an optimized BLAS, how

Opel Corsa Diesel Usage

June 24, 2013

I wanted to extend my car weight distribution calculation of June 16 from the year 2000 alone to the years 2000 to 2013. Unfortunately, come Sunday afternoon the code seemed too slow and I did not even have the beginning of a post. So, I went on to another calculation I w...
