October 2013

The joy of data analysis

October 24, 2013 | Patrick Burns

Music and snow. Poke my eyes out: Perhaps your immediate response is: “I’d rather poke my eyes out with a burning stick than do data analysis.” There’s a completely different reaction from a lot of people who have experienced data analysis. Music: It’s not entirely clear why ... [Read more...]

ISO Popularity on Flickr Explore

October 24, 2013 | klr

Not finance, but I figured there might be some out there interested in the pictures from Flickr’s Explore.  In addition to amazing photography, there is an abundance of information.  In the short post below, I use R with rCharts, slidify, and Rflickr to take a look at the distribution ... [Read more...]

Fetch Twitter data using R

October 24, 2013 | suresh kumar Gorakala

This short post will explain how you can fetch Twitter data using the twitteR and streamR packages available in R. To connect to the Twitter API, we need to go through an authentication process known as OAuth, explained in my previous post. Twitter data can be fetched from Twitter in two ways: ... [Read more...]

The Basics of Encoding Categorical Data for Predictive Models

October 23, 2013 | Max Kuhn

Thomas Yokota asked a very straightforward question about encodings for categorical predictors: "Is it bad to feed it non-numerical data such as factors?" As usual, I will try to make my answer as complex as possible. (I've heard the old wives' tale that Eskimos have 180 different words in their language ... [Read more...]

Customize your R session with .Rprofile

October 23, 2013 | David Smith

The .Rprofile file is a great way to customize your R session every time you start it up. You can use it to change R's defaults, define handy command-line functions, automatically load your favourite packages — anything you like! The Getting Genetics Blog has a nice example .Rprofile file to give ... [Read more...]
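
As a rough illustration of the three uses mentioned above (changing defaults, defining helper functions, loading packages), a minimal .Rprofile might look like the sketch below. It is not the example file from the linked post; the chosen options, the helper name hh, and the package loaded are all assumptions made for illustration.

```r
# Hypothetical ~/.Rprofile; every setting here is illustrative only.

# Change some of R's defaults.
options(
  repos  = c(CRAN = "https://cloud.r-project.org"),  # default CRAN mirror
  digits = 4                                          # print fewer significant digits
)

# Define a handy command-line helper (hypothetical name `hh`):
# show the top-left corner of a data frame or matrix.
hh <- function(x, n = 5) {
  x[seq_len(min(n, nrow(x))), seq_len(min(n, ncol(x))), drop = FALSE]
}

# Attach a favourite package in interactive sessions only, so that
# scripts run non-interactively do not silently depend on this file.
if (interactive()) {
  suppressMessages(library(parallel))
}
```

Keeping package loading behind interactive() is a design choice: code that has to be reproducible on another machine should not rely on anything set up in a personal startup file.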

Update for Backtesting Asset Allocation Portfolios post

October 23, 2013 | systematicinvestor

It has been over a year since my original post, Backtesting Asset Allocation portfolios. I have expanded the functionality of the Systematic Investor Toolbox both in terms of optimization functions and helper back-test functions during this period. Today, I want to update the Backtesting Asset Allocation portfolios post and showcase new ... [Read more...]

Numerical computation of quantiles

October 23, 2013 | thiagogm

Recently I had to define an R function that would compute the p-th quantile of a continuous random variable based on a user-defined density function. Since the main objective is to have a general function that computes the quantiles for any user-defined density function, it needs to be done numerically. Problem ... [Read more...]
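
One common way to do this numerically, and not necessarily the approach taken in the post, is to integrate the density to get the CDF and then solve F(x) = p with a root finder. The sketch below uses base R's integrate and uniroot; the function name quantile_from_density and its arguments are hypothetical, and it assumes the density is vectorised and has essentially all of its mass inside [lower, upper].

```r
# Sketch: quantile of a continuous random variable from its density.
# Assumptions: `dens` is vectorised and integrates to (about) 1 over
# [lower, upper]. Names and arguments are hypothetical.
quantile_from_density <- function(p, dens, lower, upper) {
  stopifnot(p > 0, p < 1, lower < upper)

  # CDF obtained by numerically integrating the density from `lower`.
  cdf <- function(x) {
    if (x <= lower) return(0)
    integrate(dens, lower = lower, upper = x)$value
  }

  # The quantile is the root of F(x) - p on [lower, upper].
  uniroot(function(x) cdf(x) - p,
          lower = lower, upper = upper, tol = 1e-9)$root
}

# Usage: compare with a distribution whose quantile function is known.
q_num <- quantile_from_density(0.975, dnorm, -10, 10)
q_ref <- qnorm(0.975)
```

Here q_num should agree closely with q_ref. For heavy-tailed or very concentrated densities the integration bounds need more care, since integrate can miss a narrow peak inside a wide interval.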

Overfitted Backtests

October 23, 2013 | klr

It has been a while since I discussed testing for overfitting in backtests.  Since then, Marcos López de Prado and coauthors have done some very thoughtful work (see the bottom), and they even started a blog.  Their newest paper builds on discoveries they made in their earlier work, and ... [Read more...]

Global Migration Flow Table Estimates

October 23, 2013 | gjabel

A few months ago, Demographic Research published my paper on estimating global migration flow tables. In the paper I developed a method to estimate international migrant flows, for which there is limited comparable data, to match changes in migrant stock data, ... [Read more...]

New R package: scholar

October 23, 2013 | James Keirstead

My new R package, scholar, has just been posted on CRAN. The scholar package provides functions to extract citation data from Google Scholar. In addition to retrieving basic information about a single scholar, the package also allows you to compare multiple scholars and predict future h-index values. There’s a ... [Read more...]

Time series plots in R

October 23, 2013 | Gavin L. Simpson

I recently coauthored a couple of papers on trends in environmental data (Curtis & Simpson in press; Monteith et al. in press), which we estimated using GAMs. Both papers included plots like the one shown below wherein we show the estimated trend and associated point-wise 95% confidence interval, plus some other markings. ... [Read more...]

GLM, non-linearity and heteroscedasticity

October 22, 2013 | arthur charpentier

Last week in the non-life insurance course, we saw the theory of Generalized Linear Models, emphasizing the two important components: the link function (which is actually the key component in predictive modeling) and the distribution, or the variance function. Just to illustrate, consider my favorite dataset: lin.mod = lm(dist~... [Read more...]
