Experience on using R to build prediction models in business applications

March 8, 2012

By Yanchang Zhao, RDataMining.com. Building prediction/classification models is one of the most common data mining tasks in business applications. To share experience on building prediction models with R, I have started a discussion at the RDataMining group on LinkedIn with the …

Read more »

Benford’s Law after converting count data to be in base 5

March 8, 2012

Firstly, I know nothing about election fraud – this isn’t a serious post. But, I do like to do some simple coding. Ben Goldacre posted on using Benford’s Law to look for evidence of Russian election fraud. Then Richie Cotton did the same, but using R. Commenters on both sites suggested that as the data

Read more »
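
The full post is not reproduced here, so the following is only a minimal sketch of the idea named in the title: extract the leading digit of each count as written in base 5 and compare the observed frequencies with Benford's expected proportions, log_b(1 + 1/d) for d = 1, ..., b - 1. The helper names `leading_digit` and `benford_expected` and the simulated counts are assumptions for illustration, not the author's code or data.

```r
# Leading digit of positive integers when written in the given base.
leading_digit <- function(x, base = 10) {
  x <- x[x > 0]
  # Repeatedly drop the last digit until a single digit remains.
  repeat {
    big <- x >= base
    if (!any(big)) break
    x[big] <- x[big] %/% base
  }
  x
}

# Benford's expected proportion for each possible leading digit in a base.
benford_expected <- function(base = 10) {
  d <- seq_len(base - 1)
  p <- log(1 + 1 / d, base = base)
  names(p) <- d
  p
}

# Simulated counts standing in for real data (an assumption).
set.seed(1)
counts <- round(rlnorm(1000, meanlog = 6, sdlog = 2))

digits   <- leading_digit(counts, base = 5)
observed <- table(factor(digits, levels = 1:4)) / length(digits)
expected <- benford_expected(5)

# Side-by-side comparison of observed and expected proportions.
round(rbind(observed = as.numeric(observed), expected = expected), 3)
```

Integer division is used instead of logarithms to find the leading digit, which avoids floating-point trouble at exact powers of the base.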

A plot of my citations in Google Scholar vs. Web of Science

March 8, 2012

There has been some discussion about whether Google Scholar's numbers or those of one of the proprietary software companies are better for citation counts. I personally think Google Scholar is better for a number of reasons: Higher numbers, but consistently/a...

Read more »

Early-March flotsam

March 8, 2012

It has been a strange last ten days since we unexpectedly entered grant writing mode. I was looking forward to working on this issue near the end of the year, but a likely change in funding agency priorities requires applying …

Read more »

How to create a data frame from text submitted in a textarea with FastRWeb

March 8, 2012

In this article, I show you how to create a data.frame from text submitted in a textarea field with FastRWeb. Requirements: FastRWeb installed; knowledge of webforms; ?read.table; experience in HTML5. Submit: This example needs two scripts. The first one contains the webform. I wrote a FastRWeb script in order to work in /var/FastRWeb/web.R/. This

Read more »
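
The two scripts from the post are not shown in this excerpt. As a rough sketch of the core step only, base R's `read.table()` can parse a character string through its `text` argument, which is enough to turn the raw contents of a textarea into a data.frame. The helper name `parse_textarea` and the sample input are hypothetical, and the FastRWeb wiring that would pass the form field to this function is left out.

```r
# Hypothetical helper: turn raw textarea text into a data.frame.
parse_textarea <- function(txt, sep = "", header = TRUE) {
  # Form submissions usually carry CRLF line breaks; normalise them to LF.
  txt <- gsub("\r\n?", "\n", txt)
  read.table(text = txt, sep = sep, header = header,
             stringsAsFactors = FALSE)
}

# Sample text as it might arrive from a form (an assumption).
submitted <- "name score\r\nann 3.5\r\nbob 4.0\r\n"
df <- parse_textarea(submitted)
str(df)
```

With `sep = ""` any run of whitespace separates fields; pass `sep = ","` if users paste comma-separated text instead.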

Labelling panels in R graphics

March 8, 2012

Labelling a graphics panel in R is easy, right? Sure it is, just use text and define the coordinates: text(x=5, y=10, "a"). But is there an easy way to get it in the same place all the time, even if you have different axis lengths (e.g. 0 to 5 on the x-axis but 0 to 100

Read more »
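
The excerpt stops before the author's answer, so what follows is one possible approach rather than the post's own solution: read the plot region limits from `par("usr")` and place the label at a fixed fraction of that region, so the position does not depend on the axis ranges. The helper name `panel_label` and its `fx`/`fy` parameters are assumptions, and the sketch assumes linear (not log) axes.

```r
# Hypothetical helper: draw a label at a fixed fraction of the plot region.
# fx and fy are fractions (0 to 1) of the region's width and height.
panel_label <- function(label, fx = 0.05, fy = 0.95, ...) {
  usr <- par("usr")  # c(x1, x2, y1, y2) in user coordinates (linear axes)
  x <- usr[1] + fx * (usr[2] - usr[1])
  y <- usr[3] + fy * (usr[4] - usr[3])
  text(x, y, label, ...)
  invisible(c(x = x, y = y))
}

pdf(NULL)  # null device so the example runs anywhere; drop for on-screen plots
op <- par(mfrow = c(1, 2))
plot(0:5, 0:5)
pos_a <- panel_label("a")
plot(0:5, seq(0, 100, by = 20))
pos_b <- panel_label("b")
par(op)
dev.off()
```

Both labels land in the top-left corner of their panels even though the y-axes run to 5 and 100 respectively.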

NIT: Fatty acids study in R – Part 004

March 7, 2012

It is clear that MSC does not remove the entire scatter in the raw spectra, so some of the information is hidden by the scatter. Improvement of the sample presentation will help to remove the scatter. We know that the first loading is much related to th...

Read more »

Japanese Trade and the Yen

March 7, 2012

I have had the pleasure over the last couple of weeks to help plan the CFA Society of Alabama 2012 Dinner featuring Jim Rogers and Barron’s Senior Editor Jack Willoughby. The event was fantastic, and I would like to publicly thank Jim Rogers an...

Read more »

How to Import SPSS Data into R

March 7, 2012

This video tutorial demonstrates how to import data into R that is currently in SPSS format. The video also shows how to use a few basic commands on datasets once they are imported into R. The steps in this video apply whether you are using a Mac or a PC/Windows machine. See more videos on www.statsmakemecry.com.

How Not To Draw a Probability Distribution

March 7, 2012

If I google for “probability distribution” I find the following extremely bad picture: It’s bad because it conflates ideas and oversimplifies how variable probability distributions can generally be. Most distributions are not unimodal. Most dist...

Read more »

Philadelphia Schools

March 7, 2012

I'm on spring break, and yesterday I took some time to check off some items on my to-do list, namely: Start getting acquainted with all the new features of ggplot2. Get a handle on dealing with geographic data in R. I've done some furtive geographic...

Read more »

Setting Up and Customizing R

March 7, 2012

For the longest time I resisted customizing R for my particular environment. My philosophy has been that each R script for each separate analysis I do should be self-contained, so that I can rerun the script from top to bottom on any machine and get the same results. This being said, I have now

Read more »

Strike Zone Changes?

March 7, 2012

It's been a while since I have posted here. I have been swamped with some papers I am trying to get out, finishing up the dissertation, and interviews (faculty ones in addition to others). I should have some big news in the next couple of weeks regar...

Read more »

Why an inverse-Wishart prior may not be such a good idea

March 7, 2012

While playing around with Bayesian methods for random effects models, it occurred to me that inverse-Wishart priors can really bite you in the bum. Inverse-Wishart priors are popular priors over covariance functions. People like them because they are conjugate to a Gaussian likelihood, i.e., if you have data with each : so that the

Read more »

ThinkStats … in R :: Example 1.3

March 7, 2012

With 1.2 under our belts, we go now to the example in section 1.3 which was designed to show us how to partition a larger set of data into subsets for analysis. In this case, we’re going to jump to example 1.3.2 to determine the number of live births. While the Python loop is easy

Read more »
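
The post's own code is cut off above, so here is only a small sketch of the general technique it describes: partitioning a data frame by a condition and counting the rows, with no explicit loop. The toy data frame `preg`, its column names, and the coding of `outcome` (1 taken to mean a live birth) are assumptions made for illustration, not the survey data used in the post.

```r
# Toy stand-in for the pregnancy records.
# Assumption: outcome == 1 marks a live birth.
preg <- data.frame(
  caseid  = 1:8,
  outcome = c(1, 1, 2, 1, 4, 1, 3, 1)
)

# Partition: keep only the rows that satisfy the condition.
live   <- preg[preg$outcome == 1, ]
n_live <- nrow(live)

# Same count without building the subset.
n_live2 <- sum(preg$outcome == 1)

# Counts for every outcome code at once.
table(preg$outcome)
```

Logical indexing does the work of the loop: the comparison is evaluated for every row at once and the result selects the matching rows.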

RcppArmadillo 0.2.36

March 6, 2012

RcppArmadillo release 0.2.36 is now on CRAN. It contains just the changes from the new Armadillo release 2.4.4. The NEWS entry below summarises the changes. 0.2.36 2012-03-05 o Upgraded to Armadillo release 2.4.4 * fixes for q...

Read more »

lembarrasduchoix asked: thank you for the introduction to…

March 6, 2012

lembarrasduchoix asked: thank you for the introduction to Newcomb’s paradox! Could you do a post on your favorite paradoxes? The decision theory paradoxes I’m familiar with are: Ellsberg Paradox: Theorists encode both situations with unknown...

Read more »

Frustration

March 6, 2012

Google has failed me. Cannot get RMySQL to install on my laptop. Looks like I am going to need a different method to get data from MySQL into R. If anyone has pointers, I'm all ears. Windows 7 x64, R 2.13.1

Read more »

Using R to Visualize Information Flows on Wikipedia Talk Pages

March 6, 2012

Wikipedia talk pages allow editors to discuss the evolving content on related Wikipedia articles. Sometimes the topic of a page is controversial and the talk page threads can become heated with different posts invoking a wide range of values in the kinds of appeals they use in their arguments. For example, in one thread you

Read more »

Russian elections

March 6, 2012

Just a few words about the Russian election. I read this entry http://www.badscience.net/2012/03/is-there-statistical-evidence-of-fraud-in-the-russian-election-data/ and thought to look for myself. For me it seems the data is not good enough ...

Read more »

Java based GUI for R

March 6, 2012

JGR is a pretty nice Java-based GUI for R. The primary reason I like it is that it is truly cross-platform and works the same on any operating system. Added benefits are that some packages like rJava and others tend to break on Mac OSX, but...

Read more »

Big Data Analytics to Revolutionize Services

March 6, 2012

Revolution Analytics' CEO Dave Rich was interviewed by Wikibon's David Vellante and SiliconAngle's John Furrier at the Strata 2012 conference last week. Given Dave's background at Accenture Analytics, the conversation naturally turned to the impact of predictive analytics and R on business services. (See the video after the jump, below.) Bret Latmore of SiliconANGLE provides highlights of the interview, including...

Read more »

Ggplot2 Notes

March 6, 2012

During the time I have used R, the base graphics package has met my needs. I have been aware of ggplot2 but found learning it a bit of a struggle, so I have pretty much ignored it until now. Most …

Read more »

Missing my Statsy Goodness? Check out #SciFund!

March 6, 2012

I know, I know, I have been kinda lame about posting here lately. But that’s because my posting muscle has been focused on the new analyses for what makes a successful #SciFund proposal. I’ve been posting them at the #SciFund blog under the Analysis tag, so check it out. There’s some fun stats, and

Read more »

Pathway Analysis for High-Throughput Genomics Studies

March 6, 2012

I get a lot of requests in the core about running a "pathway analysis." Someone ran a handful of gene expression arrays, or better yet, ran an RNA-seq experiment (with replicates!). These, and many other kinds of high-throughput assays (GWAS, ChIP-seq,...

Read more »

R101

March 6, 2012

I’m preparing “R101,” an introductory workshop on the statistical software R. Perhaps other beginners might find some use in the following summary and resources. (See also the post on resources for teaching yourself introductory statistics.) Do you have obligatory screenshots …

Read more »

Screencast: The Making of 17018d5488

March 6, 2012

The following screencast demonstrates the use of R, the quantmod R package and bash to process SPX data from 1950. An explanation on how to access a git repository that includes the plots and the R console history is also provided. This screencast prod...

Read more »

Multiple Factor Model – Building 130/30 Index

March 5, 2012

Nico brought to my attention the paper 130/30: The New Long-Only (2008) by A. Lo and P. Patel in his comment on the Multiple Factor Model – Building CSFB Factors post. This paper presents a very detailed step-by-step guide to building a 130/30 Index using average CSFB Factors as the alpha model and MSCI Barra

Read more »