An Image Crossfader Function

November 10, 2011

A spin-off from another project, the jpgfader function (it can be viewed in funny use HERE).


In case you missed it: October Roundup

November 10, 2011

In case you missed them, here are some articles from October of particular interest to R users. The creator of the ggplot2 package, Hadley Wickham, shares details on some forthcoming big-data graphics functions (based on research sponsored by Revolution Analytics). A list of several dozen free data sources that can easily be imported into R. Bob Muenchen gave a...


Code optimization, an Rcpp solution

November 10, 2011

Tony Breyal revived an old code optimization problem in this blog post, so I figured it was time for an Rcpp-based solution. This solution moves Henrik Bengtsson's idea (which was the basis of attempt 10) down to C++. The idea was to call sprintf fewer times than the other solutions to generate the strings...


Expected Salary by Major

November 10, 2011

In this recent editorial about the Occupy Wall Street movement, Richard Kim profiles a protestor who, despite having a master’s degree, can’t find a job. This particular protestor quit his job as a school teacher three years ago and took out a $...


Facebook Graph API Explorer with R

November 10, 2011

I wanted to play around with the Facebook Graph API using the Graph API Explorer page as a coding exercise. This facility allows one to use the API with a temporary authorisation token. Now, I don’t know how to make an R package for the proper API where you have to register for an API key and


Bridge and Torch problem in R

November 10, 2011

A couple of months ago I came across the bridge and torch problem at a careers fair in Oxford. A young tech company called QuBit used it as a brain teaser challenge for would-be software engineers to solve before submitting …


Diagram for a Bernoulli process (using R)

November 10, 2011

A Bernoulli process is a sequence of Bernoulli trials (the realization of n binary random variables), taking two values (0/1, Heads/Tails, Boy/Girl, etc…). It is often used in teaching introductory probability/statistics classes about the binomial distribution. When visualizing a Bernoulli process, it is common to use a binary tree diagram in order to show the...

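
As a rough illustration of the definition in the excerpt above, a Bernoulli process can be simulated in base R. This is only a sketch: the success probability `p`, the number of trials `n` and the seed are arbitrary choices made here, not values from the original post.

```r
# Minimal sketch: simulate a Bernoulli process in base R.
# p, n and the seed are illustrative assumptions, not values from the post.
set.seed(42)
p <- 0.5   # probability of "success" (e.g. Heads) on each trial
n <- 10    # number of Bernoulli trials

# Each trial is a draw from Binomial(size = 1, prob = p), i.e. 0 or 1.
trials <- rbinom(n, size = 1, prob = p)

# The number of successes in n trials follows a Binomial(n, p) distribution.
successes <- sum(trials)

# Probability of exactly k successes for k = 0..n, i.e. the leaves of the
# binary tree grouped by their number of successes.
pmf <- dbinom(0:n, size = n, prob = p)

print(trials)
print(successes)
print(round(pmf, 4))
```

Each path through the binary tree corresponds to one possible value of `trials`, and `pmf` adds up the probabilities of all paths that share the same number of successes.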

Managing a Local R Repository

November 10, 2011

I will be teaching a workshop on R and LaTeX at NEAIR in just under a month. One of the issues I will encounter is a lack of Internet access. I also work with restricted data from NCES which requires the computer to be secured including no network a...


R 101: The Subset Function

November 9, 2011

The subset function is available in base R and can be used to return subsets of a vector, matrix, or data frame which meet a particular condition. In my three years of using R, I have repeatedly used the subset() function and believe that it is the most useful tool for selecting elements of a

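
For readers who have not used it, here is a minimal sketch of `subset()` on a data frame and on a vector. The data frame `scores` and its columns are made up for this example and do not come from the original post.

```r
# Minimal sketch of subset() on made-up data (illustrative only).
scores <- data.frame(
  name  = c("Ann", "Bob", "Cid", "Dee"),
  group = c("a", "b", "a", "b"),
  score = c(82, 67, 91, 74),
  stringsAsFactors = FALSE
)

# Keep the rows where the condition holds; `select` picks the columns to keep.
high <- subset(scores, score > 70 & group == "a", select = c(name, score))

# subset() also works on plain vectors.
x <- c(3, 8, 1, 12, 5)
big <- subset(x, x > 4)

print(high)
print(big)
```

Column names can be used unquoted inside the condition and in `select`, which is what makes `subset()` convenient for interactive work.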

Geometric Efficient Frontier

November 9, 2011

What is important for an investor? The rate of return is at the top of the list. Does the expected rate of return shown on the mean-variance efficient frontier paint the full picture? If an investor’s investment horizon is longer than one period, for example 5 years, then the true measure of portfolio performance is Geometric


Suggest some R tasks for high-schoolers

November 9, 2011

Many high-schoolers are now using R in class, and to help even more students get exposure to R (while improving R itself), Virgilio Gómez-Rubio is seeking suggestions for projects for the next Google Code-In: An application has been put forward for R to participate in Google Code-in. This is a Google contest to introduce pre-university students (age 13-18) to...


Add Transparency to JPEG – Yes, We Can!

November 9, 2011

...Just read your JPEG and add an alpha channel manually, then assign values for transparency. Of course, for printing you need to use a device that accepts alpha. See how it's done HERE.

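
The idea in the excerpt above can be sketched in base R as follows. This is not the code from the linked post: a random array stands in for the decoded JPEG so that the sketch runs without any package or image file, and the variable names and the choice of which pixels to make transparent are assumptions made for illustration.

```r
# Stand-in for a decoded JPEG: height x width x 3 (RGB), values in [0, 1].
# A reader such as jpeg::readJPEG() is assumed to return an array of this
# shape; random data is used here so the sketch needs no package or file.
set.seed(1)
h <- 4
w <- 6
rgb_img <- array(runif(h * w * 3), dim = c(h, w, 3))

# Add a fourth (alpha) channel: 1 = opaque, 0 = fully transparent.
rgba_img <- array(1, dim = c(h, w, 4))
rgba_img[, , 1:3] <- rgb_img

# Assign transparency values, e.g. make the left half semi-transparent.
rgba_img[, seq_len(w %/% 2), 4] <- 0.5

# Drawing needs a device that supports alpha (e.g. png() or pdf()):
# plot.new(); rasterImage(rgba_img, 0, 0, 1, 1)
print(dim(rgba_img))
```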

Getting Started With Twitter Analysis in R

November 9, 2011

Earlier today, I saw a post via the aggregating R-Bloggers service on Using Text Mining to Find Out What @RDataMining Tweets are About. The post provides a walkthrough of how to grab tweets into an R session using the twitteR library, and then do some text mining on them. I’ve been meaning to


R-Function GScholarScraper to Webscrape Google Scholar Search Result

November 9, 2011

Based on my previous post on Web Scraping, I coded and uploaded the function "GScholarScraper" HERE for testing! The function will pull all (!) results, processing pages in chunks of 100 results/titles, and return a file with all titles, links, etc. It w...


CloudStat: Learn & Do R on the Cloud

November 8, 2011

CloudStat is a platform to learn and do R on the Cloud. With CloudStat, there is no more downloading, installation, updating or maintenance. CloudStat flattens the R language learning curve and also supports collaboration. And it...


project euler – problem 49

November 8, 2011

The arithmetic sequence, 1487, 4817, 8147, in which each of the terms increases by 3330, is unusual in two ways: (i) each of the three terms are prime, and, (ii) each of the 4-digit numbers are permutations of one another. There are no arithmetic sequences made up of three 1-, 2-, or 3-digit primes,...

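
The two properties described above (three primes in arithmetic progression whose digits are permutations of one another) can be checked by brute force. The following is a minimal sketch in base R, not necessarily the approach taken in the original post; the helper `digit_key` and the list `found` are names introduced here for illustration.

```r
# Sieve of Eratosthenes up to 9999, then keep only the 4-digit primes.
limit <- 9999
is_prime <- rep(TRUE, limit)
is_prime[1] <- FALSE
for (i in 2:floor(sqrt(limit))) {
  if (is_prime[i]) is_prime[seq(i * i, limit, by = i)] <- FALSE
}
primes <- which(is_prime)
primes <- primes[primes >= 1000]

# Canonical key: the digits of a number in sorted order, so that numbers
# that are permutations of one another share the same key.
digit_key <- function(n) {
  paste(sort(strsplit(as.character(n), "")[[1]]), collapse = "")
}
keys <- vapply(primes, digit_key, character(1))

# Within each group of permuted primes, look for a < b < third with equal
# gaps, i.e. third = 2 * b - a, where third is also in the group.
found <- list()
for (group in split(primes, keys)) {
  if (length(group) < 3) next
  for (i in seq_len(length(group) - 1)) {
    for (j in (i + 1):length(group)) {
      a <- group[i]
      b <- group[j]
      third <- 2 * b - a
      if (third %in% group) found[[length(found) + 1]] <- c(a, b, third)
    }
  }
}
print(found)
```

Grouping by sorted digits first keeps the search small: only primes that already share the same digits are compared, so the arithmetic-progression check runs over a handful of candidates per group.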

Lending Club – naive data analysis

November 8, 2011

Dataspora recently analyzed Lending Club’s data in a geographical way using the data distributed by the site. Lending Club is an online financial community that brings together creditworthy borrowers and savvy investors so that both can benefit financially. We replace the high cost and complexity of bank lending with a faster, smarter way to borrow


Web Scraping Google Scholar: Part 2 (Complete Success)

November 8, 2011

This is a follow-up to a post I uploaded earlier today about web scraping data off Google Scholar. In that post I was frustrated because I’m not smart enough to use xpathSApply to get the kind of results I wanted. However, fast-forward to the evening: whilst having dinner with a friend, as a passing remark,


What the frack? Does hydraulic fracturing lead to increased earthquakes?

November 8, 2011

Earthquakes are normal occurrences along the boundaries of major plate margins, such as along the San Andreas fault system of California, and are less common within plate interiors. Try telling that, however, to the citizens of Oklahoma who...


Three free books on R for Statistics

November 8, 2011

Avril Coghlan, a lecturer at University College Cork in Ireland, has written and made available for free three books ideal for students or practitioners new to R who want to use it for multivariate analysis, time series analysis or biomedical statistics. Each book begins with practical advice for installing and using R in general, before diving into their specialized...


Error Handling in Lyx & Sweave: using Quantmod (and R, of course)

November 8, 2011

I do reports for clients with LyX and Sweave. It took me an extremely long time to get them working, but now that they’re working I can do more in an hour and thus charge more per hour. (Which is, like, the point.) If you’re not familiar, here’s ...


Using Text Mining to Find Out What @RDataMining Tweets are About

November 8, 2011

This post shows an example of text mining Twitter data with the R packages twitteR, tm and wordcloud. Package twitteR provides access to Twitter data, tm provides functions for text mining, and wordcloud visualizes the result with a word cloud. …


readGrads – An R package to read and manipulate grads data

November 8, 2011

I created an R package to read grads data. As far as I know, there is no dedicated package to read grads data. The package is still quite new; any remarks on the documentation or code are more than welcome.…


Setting up AWS Cluster to use snow in R

November 8, 2011

I wanted to set up an AWS cluster to take a shot at a Kaggle contest, the DunnHumby Challenge (http://www.kaggle.com/c/dunnhumbychallenge). For this, I found StarCluster to be of great help. It allows you to set up AWS nodes in a few lines of code and does much more (choosing AMIs and cluster configurations)


Web Scraping Google Scholar (Partial Success)

November 8, 2011

I wanted to scrape the information returned by a Google Scholar web search into an R data frame as a quick XPath exercise. The following will successfully extract the ‘title’, ‘url’, ‘publication’ and ‘description’. If any of these fields are not available, as in the case of a citation, the corresponding cell in the data


Bridge and Torch problem in R

November 8, 2011

A couple of months ago I came across the bridge and torch problem at a careers fair in Oxford. A young tech company called QuBit used it as a brain teaser challenge for would-be software engineers to solve before submitting …


Example 9.13: Negative binomial regression with proc mcmc

November 8, 2011

In practice, data that derive from counts rarely seem to be fit well by a Poisson model; one more flexible alternative is a negative binomial model. In this SAS-only entry, we discuss how proc mcmc can be used for estimation. An overview of support f...


Blankety Blank

November 8, 2011

The erstwhile big 4 all blanked their opponents last Saturday, and a poster on the Guardian wondered when this had last happened. It’s a pretty simple procedure in SQL using a subquery, but in the spirit of learning R, I thought I would tackle the problem in that language, with the
