Gaussian Processes with RStan

August 19, 2013

Previously I looked at how to simulate Gaussian processes in R, following the methods in Rasmussen and Williams. But now that Andrew Gelman et al. (of

Read more »

Question and Answer: Generating Binary and Discrete Response Data

August 19, 2013

I was recently contacted by a reader with two very specific questions, and I thought that this would be a good topic to respond to publicly. He would like to simulate his data: I have firm level data and the model is discrete choice with the main expla...

Read more »

Text Mining with R – Comparing Word Counts in two Text Documents

August 19, 2013

Here's what I came up with to compare word counts in two pieces of text. If you've got any ideas, I'd love to learn about alternatives! ## a function that compares word counts in two texts wordcount ...

Read more »

Revolution Newsletter: August 2013

August 19, 2013

The most recent edition of the Revolution Newsletter is now available. In case you missed it, the news section is below, and you can read the full August edition (with highlights from this blog and community events) online. You can subscribe to the Revolution Newsletter to get it monthly via email. What is R? Has anyone ever asked you,...

Read more »

R vs Python Speed Comparison for Bootstrapping

August 19, 2013

I’m interested in Python a lot, mostly because it appears to be wickedly fast. The downside is that I don’t know it nearly as well as R, so any speed gain in computation time is more than offset by Google …

Read more »

The Bayesian Counterpart of Pearson’s Correlation Test

August 19, 2013

Except for maybe the t test, a contender for the title “most used and abused statistical test” is Pearson’s correlation test. Whenever someone wants to check if two variables relate somehow it is a safe bet (at least in psychology) that the first thing to be tested is the strength of a Pearson’s correlation. Only if that doesn’t...

Read more »

Is the Tax Code the longest Title?

August 19, 2013

Last week, I shared that Dan Katz and I had finally published a draft of our paper, Measuring the Complexity of the Law: The U.S. Code. We’d previewed this research on Computational Legal Studies years ago. Since then, we’ve received great…

Read more »

Slides from Rcpp talk in Chicago

August 19, 2013

A couple of days ago, I gave a talk to the Chicago R Users Group which is run ever-so-smoothly by Paul Teetor and Chase Carpenter. The talk provided a brief introduction to Rcpp for R and C++ integration. Slides are now up on my talks / presentation...

Read more »

#21 – Find significant relationships in data with a CoCo Matrix

August 19, 2013

The CoCo Matrix (correlation coefficient matrix) is a script for R that takes a table headed with multiple variables and calculates the correlation coefficients between each of the variables, determines which are statistically significant, and represents them visually in a grid-plot. I created the CoCo Matrix to cross correlate a table with a large number of

Read more »

Fitting psychometric functions using STAN

August 19, 2013

STAN is a new system for Bayesian inference, similar to BUGS and JAGS. I’ve played with it a bit and it’s quite promising; it really has the potential to make MCMC less of a pain (on simple models). I’ve written a short introduction to fitting psychometric functions using STAN and R, in case that’s useful

Read more »

Slides from Rcpp talk in Chicago

August 18, 2013

A couple of days ago, I gave a talk to the Chicago R Users Group which is run ever-so-smoothly by Paul Teetor and Chase Carpenter. The talk provided a brief introduction to Rcpp for R and C++ integration. Slides are now up on my talks / presentation...

Read more »

Endogenous Spatial Lags for the Linear Regression Model

August 18, 2013

Over the past number of years, I have noted that spatial econometric methods have been gaining popularity. This is a welcome trend in my opinion, as the spatial structure of data is something that should be explicitly included in the empirical modelling procedure. Omitting spatial effects assumes that the location co-ordinates for observations are unrelated

Read more »

Fitting a Model by Maximum Likelihood

August 18, 2013

Maximum-Likelihood Estimation (MLE) is a statistical technique for estimating model parameters. It basically sets out to answer the question: what model parameters are most likely to characterise a given set of data? First you need to select a model for the data. And the model must have one or more (unknown) parameters. As the name

Read more »

Exercise in REML/Mixed model

August 18, 2013

I want to build a bit more experience in REML, so I decided to redo some of the SAS examples in R. This post describes the results of example 59.1 (page 5001, SAS(R)/STAT User guide 12.3). Following the list from freshbiostats I will analyze ...

Read more »

Clarifying vague interactions

August 18, 2013

For some reason, authors occasionally present linear model results with vague or unintelligible interaction effects. One way to be vague when presenting interaction effects is to provide only a table of model coefficients, including no information on the range of covariate values observed, and no plots to aid in interpretation. Here’s an example: Suppose you have discovered a statistically significant...

Read more »

Mapping Australian electoral divisions with ggplot2

August 18, 2013

I’ve seen some creative visualisations of issues surrounding the Australian election recently, though not as many maps as I expected. ‘ggplot2’ is the go-to package for plotting in R so I thought I’d see if I could plot the Australian electoral divisions with ggplot2. By using the Australian Electoral Commission’s GIS mapping coordinates and mutilating

Read more »

Negative Payments in Local Spending Data

August 17, 2013

In anticipation of a new R library from School of Data data diva @mihi_tr that will wrap the OpenSpending API and provide access to OpenSpending.org data directly from within R, I thought I’d start doodling around some ideas raised in Identifying Pieces in the Spending Data Jigsaw. In particular, common payment values, repayments/refunds and “balanced

Read more »

Update to Fantasy Football Draft Optimizer shiny app

August 17, 2013

By popular demand, I updated the Fantasy Football Draft Optimizer shiny app with two changes: The app now takes into account how many teams are in your league when estimating ...

Read more »

Working with climate data from the web in R

August 17, 2013

I recently attended ScienceOnline Climate, a conference in Washington, D.C. at AAAS. You may have heard of the ScienceOnline annual meeting in North Carolina - this was one of their topical meetings focused on Climate Change. I moderated a session on working with data from the web in R, focusing on climate data. Search Twitter for...

Read more »

Accuracy versus F score: Machine Learning for the RNA Polymerases

August 16, 2013

Hello, today I'm going to show you the difference between two common performance measures (useful not only for machine learning, but in every scientific field). Until now, I have seen accuracy values reported more often than F scores in...

Read more »

Using Heatmaps to Uncover the Individual-Level Structure of Brand Perceptions

August 16, 2013

Heatmaps, when the rows and columns are appropriately ordered, provide insight into the data structure at the individual level. In an earlier post I showed a cluster heatmap with dendrograms for both the rows and the columns. In addition, I...

Read more »

Foodborne Chicago finds dodgy restaurants with tweets, and R

August 16, 2013

If, like me, you've ever had a sandwich from a dubious deli and then been laid up for days afterwards, you know that food poisoning is no trifling matter. In the past, local authorities would only ever learn of such public health issues if they were reported by the victim (or the victim's doctor). But that...

Read more »

Equivocal Zones

August 16, 2013

In Chapter 11, equivocal zones were briefly discussed. The idea is that some classification errors are close to the probability boundary (i.e. 50% for two-class outcomes). If this is the case, we can create a zone where the samples are predicted as "equivocal" or "indeterminate" instead of one of the class levels. This only works if the...

Read more »

Programming style guidelines: R and MATLAB

August 16, 2013

A summary of programming style conventions in R and MATLAB.

Read more »

RcppArmadillo 0.3.910.0

August 15, 2013

A new minor release 3.910.0 of Armadillo came out a few days ago. A new RcppArmadillo release 0.3.910.0 was provided right away, and after a brief back-and-forth with CRAN (mostly having to do with the non-standard vignette corresponding to our CSD...

Read more »

Creating a Quick Report with knitr, xtable, R Markdown, Pandoc (and some OpenBLAS Benchmark Results)

August 15, 2013

To cut a long story short, I always wanted to write professional-looking documents (technical reports and potentially my thesis) with R code. No more copy and paste. No more Microsoft Word. At the same time, I don't feel comfortable with LaTeX. Somehow I found a workaround with knitr, xtable, R Markdown...

Read more »

sapply is my new friend!

August 15, 2013

I’ve written previously about how the apply function is a major workhorse in many of my work projects. What I didn’t know is how handy the sapply function can be! There are a couple of cases so far where I’ve …

Read more »

Census Atlas Japan

August 15, 2013

The 2011 Census Open Atlas project has been put on hold recently as various other research projects have intervened – more on these soon. However, over the summer Chris Brunsdon and I took a research trip to Ritsumeikan University (Japan) where we visited Keiji Yano and Tomoki Nakaya. As part of this trip I began developing a census atlas for

Read more »
