Thoughts on SPSS and R Integration

March 10, 2012

As part of considering SPSS as a platform for modeling, I wanted to test SPSS’ integration with R. What I found is that getting SPSS to work with R isn’t embarrassingly obvious. What’s worse, I found it quite difficult to...

Read more »

Slides from today’s Big Data Step-by-Step Tutorials: Infrastructure series and Intro to R+Hadoop with RHadoop’s rmr

March 10, 2012

Slides from the Boston Predictive Analytics Big Data Workshop tutorials:
Big Data Step-by-Step: Infrastructure 1/3: Local VM
Big Data Step-by-Step: Infrastructure 2/3: Running R and RStudio on EC2
Big Data Step-by-Step: Infrastructure 3/3: Taking it to the cloud... easily... with Whirr
Big Data Step-by-Step: Using R & Hadoop (with RHadoop's rmr package)

Read more »

"Fear of floating exchange rate" or "fear of losing international reserves".

March 10, 2012

We were recently required to do an assignment for the International Finance course where we had to investigate the policy that the emerging economies adopt towards holding international reserves. A recent research paper at the NBER by Joshua Aizen...

Read more »

Detour in taste wordclouds

March 10, 2012

I read Mining Twitter for consumer attitudes towards hotels in my feed of R-bloggers. That reminded me that I intended to look at generating wordclouds for salt and MSG at some point. Salt, or sodium, is linked to hypertension, which is linked...

Read more »

German train monitor provides access to train delay data

March 10, 2012

The German newspaper Süddeutsche Zeitung (SZ) worked together with OpenDataCity to create an online train monitor of the German network: Zugmonitor. This is another great example of the new form of data journalism. The project provides access to data o...

Read more »

Recovering Marginal Effects and Standard Errors of Interactions Terms Pt. II: Implement and Visualize

March 9, 2012

In the last post I presented a function for recovering marginal effects of interaction terms. Here we implement the function with simulated data and plot the results using ggplot2. #---Simulate Data and Fit a linear model with an...

Read more »
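As a rough illustration of the idea in this teaser, here is a minimal base-R sketch, assuming a linear model of the form y ~ x * z. The helper name `marginal_effect` and the simulated coefficients are hypothetical and are not taken from the original post. The conditional effect of x is b_x + b_xz * z, and its standard error comes from the usual variance formula for a linear combination of two coefficients.

```r
# Sketch only: simulated data, not the data or function from the original post.
set.seed(42)
n <- 500
x <- rnorm(n)
z <- rnorm(n)
y <- 1 + 2 * x + 0.5 * z + 1.5 * x * z + rnorm(n)

fit <- lm(y ~ x * z)
b <- coef(fit)   # named coefficient vector
V <- vcov(fit)   # coefficient covariance matrix

# Hypothetical helper: marginal effect of x at given values of z,
# with SE = sqrt(Var(b_x) + z^2 * Var(b_xz) + 2 * z * Cov(b_x, b_xz)).
marginal_effect <- function(z_val) {
  est <- b["x"] + b["x:z"] * z_val
  se <- sqrt(V["x", "x"] + z_val^2 * V["x:z", "x:z"] + 2 * z_val * V["x", "x:z"])
  data.frame(z = z_val, effect = unname(est), se = unname(se))
}

me <- marginal_effect(seq(-2, 2, by = 0.5))
me$lower <- me$effect - 1.96 * me$se   # approximate 95% interval
me$upper <- me$effect + 1.96 * me$se
print(me)
```

The resulting data frame has one row per value of z, which is a convenient shape to hand to a plotting function such as ggplot2's.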

find | xargs … Like a Boss

March 9, 2012

*Edit March 12* Be sure to look at the comments, especially the commentary on Hacker News - you can supercharge the find|xargs idea by using find|parallel instead.

Do you ever discover a trick to do something better, faster, or easier, and wish you c...

Read more »

Two-minute tutorials for R beginners

March 9, 2012

R user Anthony Damico has created "Twotorials": a series of two-minute tutorials for newcomers to R. Topics include how to download and install R, how to do simple arithmetic in R, how to work with data tables in R, and many others. The tutorials are especially useful for users of R on Windows, with video demonstrations using the Windows...

Read more »

NIT: Fatty acids study in R – Part 005

March 9, 2012

There are several algorithms to run a PLS regression (I recommend consulting the books “Introduction to Multivariate Statistical Analysis in Chemometrics” by Kurt Varmuza & Peter Filzmoser and “Chemometrics with R” by Ron Wehrens). We are going to use ...

Read more »

Mining Twitter for consumer attitudes towards hotels

March 9, 2012

A couple of months back I read Jeffrey Breen’s presentation on mining Twitter for consumer attitudes towards airlines, so I was curious how it would look if I estimated the sentiment toward major hotels. So here it is:

# load twitter library
> library(twitteR)
# search for all the hilton tweets
> hilton.tweets=searchTwitter('@hilton',n=1500)
> length(hilton.tweets)

Read more »

Stats 101 resources

March 9, 2012

A few friends have asked for self-study resources on learning (or brushing up on) basic statistics. I plan to keep updating this post as I find more good suggestions. Of course the ideal case is to have a good teacher …

Read more »

Big-data Naive Bayes and Classification Trees with R and Netezza

March 8, 2012

The IBM Netezza analytics appliances combine high-capacity storage for Big Data with a massively-parallel processing platform for high-performance computing. With the addition of Revolution R Enterprise for IBM Netezza, you can use the power of the R language to build predictive models on Big Data. In the demonstration below, Revolution Analytics' Derek Norton analyzes loan approval data stored on...

Read more »

Montreal R workshop: Plyr, reshape and other data manipulation goodies

March 8, 2012

March 12, 2012, 14h-16h
N4/17 Stewart Biology Building, McGill University
Étienne Low-Decarie, McGill University

This workshop is organized by the BGSA and is free of charge (!), but space is limited. Register early to ensure your spot!

From Étienne: Ever want to split your data according to factors, apply a function on each part and

Read more »

Experience on using R to build prediction models in business applications

March 8, 2012

By Yanchang Zhao, RDataMining.com. Building prediction/classification models is one of the most widely seen data mining tasks in business applications. To share experience on building prediction models with R, I have started a discussion in the RDataMining group on LinkedIn with the …

Read more »

Benford’s Law after converting count data to be in base 5

March 8, 2012

Firstly, I know nothing about election fraud – this isn’t a serious post. But, I do like to do some simple coding. Ben Goldacre posted on using Benford’s Law to look for evidence of Russian election fraud. Then Richie Cotton did the same, but using R. Commenters on both sites suggested that as the data

Read more »
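To make the base-5 idea concrete, here is a small sketch of how leading digits in an arbitrary base could be extracted and compared with Benford's expected frequencies, log_b(1 + 1/d). The function names `leading_digit` and `benford_expected` are hypothetical, the counts are simulated rather than election data, and this is not the code from the original post.

```r
# Sketch only: simulated counts, hypothetical helper names.

# Leading digit of positive integers in a given base, using integer
# division to avoid floating-point trouble with log() at exact powers.
leading_digit <- function(n, base = 5) {
  stopifnot(all(n >= 1))
  while (any(n >= base)) {
    big <- n >= base
    n[big] <- n[big] %/% base
  }
  n
}

# Benford's expected proportion for leading digits 1..(base - 1).
benford_expected <- function(base = 5) {
  d <- 1:(base - 1)
  log(1 + 1 / d, base = base)
}

set.seed(1)
counts <- floor(10^runif(5000, 0, 5))   # log-uniform counts, all >= 1
digits <- factor(leading_digit(counts, base = 5), levels = 1:4)
observed <- as.numeric(table(digits)) / length(counts)

print(round(rbind(observed = observed, expected = benford_expected(5)), 3))
```

For example, 24 is written 44 in base 5, so its leading digit is 4, while 25 is written 100 and has leading digit 1.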

A plot of my citations in Google Scholar vs. Web of Science

March 8, 2012

There has been some discussion about whether Google Scholar or one of the proprietary software companies' numbers are better for citation counts. I personally think Google Scholar is better for a number of reasons: Higher numbers, but consistently/a...

Read more »

Early-March flotsam

March 8, 2012

It has been a strange last ten days since we unexpectedly entered grant writing mode. I was looking forward to working on this issue near the end of the year, but a likely change in funding agency priorities requires applying …

Read more »

How to create a data frame from text submitted in a textarea with FastRWeb

March 8, 2012

In this article, I show you how to create a data.frame from text submitted in a textarea field with FastRWeb.

Requirements: FastRWeb installed; knowledge of webforms; ?read.table; experience in HTML5.

Submit: This example needs two scripts. The first one contains the webform. I wrote a FastRWeb script in order to work in /var/FastRWeb/web.R/. This

Read more »
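The parsing step can be sketched in base R alone. The example below assumes the textarea content has already arrived in R as a single character string; the variable `submitted` and its contents are made up for illustration, and no FastRWeb calls are shown because the original scripts are not reproduced in this excerpt.

```r
# Sketch only: `submitted` stands in for whatever string the webform delivers.
submitted <- "name,score\r\nalice,90\r\nbob,85\r\n"

# Browsers typically send textarea line breaks as CRLF, so normalise them first.
clean <- gsub("\r\n", "\n", submitted, fixed = TRUE)

# read.table() can parse a string directly through its `text` argument.
df <- read.table(text = clean, header = TRUE, sep = ",",
                 stringsAsFactors = FALSE)
str(df)
```

Setting `sep` and `header` explicitly keeps the result predictable; in a real form those choices would depend on what users are asked to paste.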

Labelling panels in R graphics

March 8, 2012

Labelling a graphics panel in R is easy, right? Sure it is, just use text and define the coordinates.
text(x=5, y=10, "a")
But is there an easy way to get it in the same place all the time, even if you have different axis lengths (e.g. 0 to 5 on the x-axis but 0 to 100

Read more »
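One possible approach, sketched here under the assumption of linear axes, is to read the plotting region's extent from par("usr") and place the label at a fixed fraction of it. The helper `panel_label` is a hypothetical name and is not necessarily the solution the original post arrives at.

```r
# Sketch only: hypothetical helper, assumes linear (not log) axes.
panel_label <- function(label, fx = 0.05, fy = 0.95) {
  usr <- par("usr")                        # c(x1, x2, y1, y2) in user coordinates
  x <- usr[1] + fx * (usr[2] - usr[1])     # fraction fx across the x-range
  y <- usr[3] + fy * (usr[4] - usr[3])     # fraction fy up the y-range
  text(x, y, label, font = 2)
  invisible(c(x = x, y = y))
}

op <- par(mfrow = c(1, 2))
plot(0:5, 0:5)
panel_label("a")
plot(0:100, 0:100)
pos <- panel_label("b")
par(op)
```

Both labels land in the top-left corner of their panels even though the axis ranges differ, because the position is expressed relative to the plotting region rather than in data units.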

NIT: Fatty acids study in R – Part 004

March 7, 2012

It is clear that MSC does not remove the entire scatter in the raw spectra, so some of the information is hidden by the scatter. Improvement of the sample presentation will help to remove the scatter. We know that the first loading is much related to th...

Read more »

Japanese Trade and the Yen

March 7, 2012

I have had the pleasure over the last couple of weeks to help plan the CFA Society of Alabama 2012 Dinner featuring Jim Rogers and Barron’s Senior Editor Jack Willoughby. The event was fantastic, and I would like to publicly thank Jim Rogers an...

Read more »

How to Import SPSS Data into R

March 7, 2012

This video tutorial demonstrates how to import data into R that is currently in SPSS format. The video also shows how to use a few basic commands on datasets once they are imported into R. The steps in this video apply whether you are using a Mac or a PC/Windows machine. See more videos on www.statsmakemecry.com.

How Not To Draw a Probability Distribution

March 7, 2012

If I google for “probability distribution” I find the following extremely bad picture: It’s bad because it conflates ideas and oversimplifies how variable probability distributions can generally be. Most distributions are not unimodal. Most dist...

Read more »

Philadelphia Schools

March 7, 2012

I'm on spring break, and yesterday I took some time to check off some items on my to-do list, namely: Start getting acquainted with all the new features of ggplot2. Get a handle on dealing with geographic data in R. I've done some furtive geographic...

Read more »

Setting Up and Customizing R

March 7, 2012

For the longest time I resisted customizing R for my particular environment. My philosophy has been that each R script for each separate analysis I do should be self contained such that I can rerun the script from top to bottom on any machine and get the same results. This being said, I have now

Read more »

Strike Zone Changes?

March 7, 2012

It's been a while since I have posted here. I have been swamped with some papers I am trying to get out, finishing up the dissertation, and interviews (faculty ones in addition to others). I should have some big news in the next couple of weeks regar...

Read more »