find | xargs … Like a Boss

March 9, 2012

*Edit March 12* Be sure to look at the comments, especially the commentary on Hacker News: you can supercharge the find|xargs idea by using find|parallel instead. --- Do you ever discover a trick to do something better, faster, or easier, and wish you c...

Read more »

Two-minute tutorials for R beginners

March 9, 2012

R user Anthony Damico has created "Twotorials": a series of two-minute tutorials for newcomers to R. Topics include how to download and install R, how to do simple arithmetic in R, how to work with data tables in R, and many others. The tutorials are especially useful for users of R on Windows, with video demonstrations using the Windows...

Read more »

NIT: Fatty acids study in R – Part 005

March 9, 2012

There are several algorithms to run a PLS regression (I recommend consulting the books “Introduction to Multivariate Statistical Analysis in Chemometrics” by Kurt Varmuza and Peter Filzmoser and “Chemometrics with R” by Ron Wehrens). We are going to use ...

Read more »

Mining Twitter for consumer attitudes towards hotels

March 9, 2012

A couple of months back I read Jeffrey Breen’s presentation on mining Twitter for consumer attitudes towards airlines, so I was curious how it would look if I estimated the sentiment toward major hotels. So here it is:

# load twitter library
library(twitteR)
# search for all the hilton tweets
hilton.tweets = searchTwitter('@hilton', n=1500)
length(hilton.tweets)

Read more »

Stats 101 resources

March 9, 2012

A few friends have asked for self-study resources on learning (or brushing up on) basic statistics. I plan to keep updating this post as I find more good suggestions. Of course the ideal case is to have a good teacher …

Read more »

Big-data Naive Bayes and Classification Trees with R and Netezza

March 8, 2012

The IBM Netezza analytics appliances combine high-capacity storage for Big Data with a massively-parallel processing platform for high-performance computing. With the addition of Revolution R Enterprise for IBM Netezza, you can use the power of the R language to build predictive models on Big Data. In the demonstration below, Revolution Analytics' Derek Norton analyzes loan approval data stored on...

Read more »

Montreal R workshop: Plyr, reshape and other data manipulation goodies

March 8, 2012

March 12, 2012, 14h-16h
N4/17 Stewart Biology Building, McGill University
Étienne Low-Decarie, McGill University

This workshop is organized by the BGSA and is free of charge (!), but space is limited. Register early to ensure your spot!

From Étienne: Ever want to split your data according to factors, apply a function on each part and

Read more »

Experience on using R to build prediction models in business applications

March 8, 2012

By Yanchang Zhao, RDataMining.com

Building prediction/classification models is one of the most widely seen data mining tasks in business applications. To share experience on building prediction models with R, I have started a discussion in the RDataMining group on LinkedIn with the …

Read more »

Benford’s Law after converting count data to be in base 5

March 8, 2012

Firstly, I know nothing about election fraud – this isn’t a serious post. But, I do like to do some simple coding. Ben Goldacre posted on using Benford’s Law to look for evidence of Russian election fraud. Then Richie Cotton did the same, but using R. Commenters on both sites suggested that as the data

Read more »

A plot of my citations in Google Scholar vs. Web of Science

March 8, 2012

There has been some discussion about whether Google Scholar's numbers or those of one of the proprietary software companies are better for citation counts. I personally think Google Scholar is better for a number of reasons: Higher numbers, but consistently/a...

Read more »

Early-March flotsam

March 8, 2012

It has been a strange last ten days since we unexpectedly entered grant-writing mode. I was looking forward to working on this issue near the end of the year, but a likely change in funding agency priorities requires applying …

Read more »

How to create a data frame from text submitted in a textarea with FastRWeb

March 8, 2012

In this article, I show you how to create a data.frame from text submitted in a textarea field with FastRWeb.

Requirements
- FastRWeb installed
- Knowledge of webforms
- ?read.table
- Experience in HTML5

Submit

This example needs two scripts. The first one contains the webform. I wrote a FastRWeb script in order to work in /var/FastRWeb/web.R/. This

Read more »

Labelling panels in R graphics

March 8, 2012

Labelling a graphics panel in R is easy, right? Sure it is, just use text and define the coordinates: text(x=5, y=10, "a"). But is there an easy way to get it in the same place all the time, even if you have different axis lengths (e.g. 0 to 5 on the x-axis but 0 to 100

Read more »

NIT: Fatty acids study in R – Part 004

March 7, 2012

It is clear that MSC does not remove the entire scatter in the raw spectra, so some of the information is hidden by the scatter. Improvement of the sample presentation will help to remove the scatter. We know that the first loading is much related to th...

Read more »

Japanese Trade and the Yen

March 7, 2012

I have had the pleasure over the last couple of weeks to help plan the CFA Society of Alabama 2012 Dinner featuring Jim Rogers and Barron’s Senior Editor Jack Willoughby. The event was fantastic, and I would like to publicly thank Jim Rogers an...

Read more »

How to Import SPSS Data into R

March 7, 2012

This video tutorial demonstrates how to import data into R that is currently in SPSS format. The video also shows how to use a few basic commands on datasets once they are imported into R. The steps in this video apply whether you are using a Mac or a PC/Windows machine. See more videos on www.statsmakemecry.com.

How Not To Draw a Probability Distribution

March 7, 2012

If I google for “probability distribution” I find the following extremely bad picture: It’s bad because it conflates ideas and oversimplifies how variable probability distributions can generally be. Most distributions are not unimodal. Most dist...

Read more »

Philadelphia Schools

March 7, 2012

I'm on spring break, and yesterday I took some time to check off some items on my to-do list, namely:
- Start getting acquainted with all the new features of ggplot2.
- Get a handle on dealing with geographic data in R.

I've done some furtive geographic...

Read more »

Setting Up and Customizing R

March 7, 2012

For the longest time I resisted customizing R for my particular environment. My philosophy has been that each R script for each separate analysis I do should be self-contained, such that I can rerun the script from top to bottom on any machine and get the same results. That being said, I have now

Read more »

Strike Zone Changes?

March 7, 2012

It's been a while since I have posted here. I have been swamped with some papers I am trying to get out, finishing up the dissertation, and interviews (faculty ones in addition to others). I should have some big news in the next couple of weeks regar...

Read more »

Why an inverse-Wishart prior may not be such a good idea

March 7, 2012

While playing around with Bayesian methods for random effects models, it occurred to me that inverse-Wishart priors can really bite you in the bum. Inverse-Wishart priors are popular priors over covariance functions. People like them because they are conjugate to a Gaussian likelihood, i.e., if you have data with each : so that the

Read more »

ThinkStats … in R :: Example 1.3

March 7, 2012

With 1.2 under our belts, we go now to the example in section 1.3 which was designed to show us how to partition a larger set of data into subsets for analysis. In this case, we’re going to jump to example 1.3.2 to determine the number of live births. While the Python loop is easy

Read more »

RcppArmadillo 0.2.36

March 6, 2012

RcppArmadillo release 0.2.36 is now on CRAN. It contains just the changes from the new Armadillo release 2.4.4. The NEWS entry below summarises the changes.

0.2.36 2012-03-05
  o Upgraded to Armadillo release 2.4.4
    * fixes for q...

Read more »

lembarrasduchoix asked: thank you for the introduction to…

March 6, 2012

lembarrasduchoix asked: thank you for the introduction to Newcomb’s paradox! Could you do a post on your favorite paradoxes? The decision theory paradoxes I’m familiar with are: Ellsberg Paradox: Theorists encode both situations with unknown...

Read more »

Frustration

March 6, 2012

Google has failed me. Cannot get RMySQL to install on my laptop. Looks like I am going to need a different method to get data from MySQL into R. If anyone has pointers, I'm all ears. Windows 7 x64, R 2.13.1

Read more »

Using R to Visualize Information Flows on Wikipedia Talk Pages

March 6, 2012

Wikipedia talk pages allow editors to discuss the evolving content on related Wikipedia articles. Sometimes the topic of a page is controversial and the talk page threads can become heated with different posts invoking a wide range of values in the kinds of appeals they use in their arguments. For example, in one thread you

Read more »

Russian elections

March 6, 2012

Just a few words about the Russian election. I read this entry http://www.badscience.net/2012/03/is-there-statistical-evidence-of-fraud-in-the-russian-election-data/ and thought I would look for myself. To me, it seems the data is not good enough ...

Read more »

Java based GUI for R

March 6, 2012

JGR is a pretty nice Java-based GUI for R. The primary reason I like it is that it is truly cross-platform and will work the same on any operating system. Added benefits are that some packages like rJava and others tend to break on Mac OS X, but...

Read more »