The Rcpp Gallery and my Seinfeld Streak

February 3, 2013

A good three weeks ago, we introduced the Rcpp Gallery. While this is a joint effort by several of us on the Rcpp team, the backend was conceived and implemented entirely by JJ, who also bootstrapped it with some first content, drawing on posts by Ha...

Checking validation statistics (Monitor function 030220139)

February 3, 2013

(This article was first published on NIR-Quimiometria, and kindly contributed to R-bloggers)

Clustering using dynamic tree cut

February 3, 2013

Summary: Two methods for hierarchical clustering are introduced: (i) dynamic tree cut; and (ii) dynamic hybrid cut. Dynamic tree cut is a top-down algorithm that relies solely on the dendrogram. The algorithm implements an adaptive, iterative process of cluster decomposition …

Arc Diagrams in R: Les Miserables

February 3, 2013

In this post we will talk about the R package “arcdiagram” for plotting pretty arc diagrams. An arc diagram is a graphical display to visualize graphs or networks in a one-dimensional layout. The main idea is to display nodes along a single axis, while representing the edges or connections …

XLConnect 0.2-4

February 3, 2013

Mirai Solutions GmbH (http://www.mirai-solutions.com) is very pleased to announce the release of XLConnect 0.2-4, which is available from CRAN. This newest release comes with a number of new features, including the ability to read cached cell values. There is a new …

For descriptive statistics, values below LLOQ set to …

February 3, 2013

That is what I read the other day. For calculation of descriptive statistics, values below the LLOQ (lower limit of quantification) were set to … Then I wondered, wasn't there a trick in JAGS to incorporate the presence of missing data while es...

A package for agricultural statistic: FAOSTAT

February 3, 2013

After 8 years of using R, today I finally became a contributor to the community and released my first package, FAOSTAT. The package is designed to provide users with direct access to the FAOSTAT database via R and to support the...

Scatterplot with marginal boxplots

February 3, 2013

Using R and ggplot2 to draw a scatterplot with the two marginal boxplots. Drawing a scatterplot with the marginal boxplots (or marginal histograms or marginal density plots) has always been a bit tricky (well, for me anyway). The approach I take here is, first, to draw the three separate plots using ggplot2: the scatterplot; the horizontal boxplot to appear in the...

Maize trade Part II: Comparison and analysis

February 3, 2013

My last post about the maize network, although interesting, was not very informative. What we are going to do today is contrast the maize network with the wine trade network. The reason we chose wine will become clear after the...

Visualising 2012 NFL Quarterback performance with R heat maps

February 3, 2013

With only 24 hours remaining in the 2012 NFL season, this is a good time to review how the league's QBs performed during the regular season using performance data from KFFL and the heat mapping capabilities of R.

#scale data to mean=0, sd=1 and convert to matrix
QBscaled <- as.matrix(scale(QB2012))
#create heatmap and don't reorder
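The scaling step in this teaser can be tried on toy data. The sketch below is only an illustration: the data frame `qb` and its columns are hypothetical stand-ins for `QB2012`, and the `heatmap()` arguments are one plausible way to "not reorder", not necessarily what the original post used.

```r
# Hypothetical stand-in for QB2012: rows are quarterbacks, columns are stats.
set.seed(1)
qb <- data.frame(
  yards = rnorm(5, mean = 4000, sd = 500),
  tds   = rnorm(5, mean = 25, sd = 6),
  ints  = rnorm(5, mean = 12, sd = 4),
  row.names = paste0("QB", 1:5)
)

# scale() centres each column to mean 0 and rescales it to sd 1.
qb_scaled <- as.matrix(scale(qb))

# Rowv = NA and Colv = NA switch off the dendrogram reordering;
# scale = "none" avoids standardising the matrix a second time.
draw_heatmap <- function(m) {
  heatmap(m, Rowv = NA, Colv = NA, scale = "none")
}
```

Calling `draw_heatmap(qb_scaled)` draws the plot with rows and columns in their original order.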

InstallOldPackages: a repmis command for installing old R package versions

February 3, 2013

A big problem in reproducible research is that software changes. The code you used to do a piece of research may depend on a specific version of software that has since been changed. This is an annoying problem in R because install.packages only installs the most recent version of a package. It can be tedious to collect the old...

Comparing individual team run production

February 3, 2013

Or, The 2010 Mariners: How Bad Were They? In earlier posts, I used the statistical software R to plot the trends in league average run scoring since 1901. This was the first step to answering other questions I had on my mind: How poor was the offensive performance of the 2010 Seattle Mariners? Are they showing any signs...

data.table or data.frame?

February 2, 2013

I spent a portion of today trying to convince a colleague that there are times when the data.table package is faster than traditional methods in R. It took a few of the tests below to prove the point. Generate a data.frame of characters and numbers for easy plotting.

df <- data.frame(letters = as.character(sample(letters, 1e+08, replace = TRUE)), ...

A random walk ? What else ?

February 2, 2013

Consider the following time series. What does it look like? I know, this is a stupid game, but I keep using it in my time series courses. It does look like a random walk, doesn’t it? If we use the Phillips-Perron test, yes, it does:

> PP.test(x)
Phillips-Perron Unit Root Test
data: x
Dickey-Fuller = -2.2421, Truncation lag parameter = 6,...
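The series from the post is not reproduced here, so the following is only a sketch on simulated data. It builds a genuine random walk (an assumption for illustration, with a hypothetical seed and length) and runs the same `PP.test()` from the stats package. The null hypothesis of the test is a unit root, so a large p-value means the random-walk hypothesis cannot be rejected.

```r
set.seed(42)

# A random walk is the cumulative sum of independent shocks.
x <- cumsum(rnorm(500))

# Phillips-Perron unit root test; returns an object of class "htest".
pp <- PP.test(x)

# Test statistic and p-value, extracted from the result.
pp_stat <- unname(pp$statistic)
pp_pval <- pp$p.value
```

Printing `pp` gives output in the same layout as the console excerpt above.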

R scripts for analyzing survey data

February 2, 2013

Another site pops up with open code for analyzing public survey data: http://www.asdfree.com/ It will be interesting to see whether this gets used by the general public--given the growing trend of data journalism and so forth--versus academics. It is...

A slightly different introduction to R, part III

February 2, 2013

I think you’ve noticed by now that a normal interactive R session is quite messy. If you don’t believe me, try playing around for a while and then give the history() command, which will show you the commands you’ve typed. If you’re anything like me, a lot of them are malformed attempts that generated some

qdap 0.2.0 released

February 2, 2013

This is the first CRAN release of qdap (qdap 0.2.0). qdap (Quantitative Discourse Analysis Package) is an R package designed to assist in quantitative discourse analysis. The package stands as a bridge between qualitative transcripts of dialogue and …

RcppExamples 0.1.6

February 1, 2013

A pure maintenance release 0.1.6 of RcppExamples was made two weeks ago, and never announced. We merely moved the NEWS.Rd file into the proper location in the inst/ directory, and, while we were at it, mentioned the new Rcpp Gallery in the DESCRIPTION fi...

digest 0.6.2

February 1, 2013

digest version 0.6.2 came out a few days ago as an almost immediate follow-up to release 0.6.1. We used paste0() in a few places, and this is only available with newer versions of R. To not introduce a somewhat unnecessary dependency, we reverted thi...

Bootstrap Confidence Intervals

February 1, 2013

Here is an example of nonparametric bootstrapping. It’s a powerful technique that is similar to the jackknife. With the bootstrap, however, the approach resamples the observed data with replacement. It is typically less efficient than a parametric approach when the parametric model is correct, but it gets the job done. This can be used in a variety of situations ranging from variance estimation to model selection. John
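As a rough illustration of the idea, and not the code from the original post, here is a minimal percentile bootstrap for the mean in base R. The sample `x`, the number of replicates `B`, and the 95% level are all assumptions chosen for the sketch.

```r
set.seed(123)

# Hypothetical skewed sample of 50 observations.
x <- rexp(50, rate = 1)

# Number of bootstrap replicates (an arbitrary choice for this sketch).
B <- 2000

# Resample the data with replacement B times and record the mean each time.
boot_means <- replicate(B, mean(sample(x, replace = TRUE)))

# Percentile interval: the 2.5% and 97.5% quantiles of the bootstrap means.
ci <- quantile(boot_means, probs = c(0.025, 0.975))

# Bootstrap estimate of the standard error of the mean.
boot_se <- sd(boot_means)
```

The same pattern works for other statistics by swapping `mean` for, say, `median`. The percentile interval is the simplest variant, and other bootstrap intervals exist.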

Visualizing MLB Hall of Fame votes with R

February 1, 2013

Carlos Scheidegger and Kenny Shirley created this visualization of votes for the Major League Baseball hall of fame: They describe the chart as follows: The main figure above is a plot of BBWAA Hall of Fame voting by year for all 1,070 players who have appeared on the ballot since Hall of Fame voting began in 1936. The circular...

General Navigation in R

February 1, 2013

So you’ve finally managed to install the pesky environment but have no idea what you are looking at when you open the program. This tutorial is for you. (Again, here is a version with screenshot pictures). When you open R, it might look different than the screenshots in the picture version of the tutorial. This

Yen and JGBs Short-Term vs Long Term

February 1, 2013

I have read some articles arguing that the recent move in the Japanese Yen is overdone.  However, considering the short-term without regard to the long-term context is naïve and potentially dangerous.  Although I do not have significant proo...

Overdispersion with different exposures

February 1, 2013

In actuarial science, and insurance ratemaking, taking into account the exposure can be a nightmare (in datasets, some clients have been here for a few years – we call that exposure – while others have been here for a few months, or weeks). Somehow, simple results become more complicated to compute just because we have to take into account...
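One common way to account for exposure, sketched here on simulated data, is a Poisson regression with `log(exposure)` as an offset. This is only an assumption about the setting: the teaser does not show the post's own computations, and the variable names, coefficients, and sample size below are hypothetical.

```r
set.seed(2013)
n <- 1000

# Hypothetical exposure: years each client has been observed.
exposure <- runif(n, min = 0.1, max = 3)

# Hypothetical binary rating factor.
x <- rbinom(n, size = 1, prob = 0.5)

# Yearly claim frequency depends on x; expected counts scale with exposure.
lambda <- exp(-2 + 0.5 * x)
y <- rpois(n, lambda * exposure)

# The offset fixes the coefficient of log(exposure) at 1, so the model
# describes the claim rate per unit of exposure rather than the raw count.
fit <- glm(y ~ x + offset(log(exposure)), family = poisson)
est <- coef(fit)
```

The estimates in `est` can be compared with the simulated values of -2 and 0.5. Checking for overdispersion would be a further step that this sketch does not cover.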

Bayesian model choice for the Poisson model

February 1, 2013

Following Arthur Charpentier’s example, I am going to try to post occasionally on material covered during my courses, in the hope that it might be useful to my students, but also to others. In the second practical of the Bayesian Case Studies course, we looked at Bayesian model choice and basic Monte Carlo methods, looking

#13 Mapping in R: Representing geospatial data together with ggplot

February 1, 2013

I have been trawling around for a while now trying to find a simple and understandable way of representing geospatial data in R, whilst retaining the ability to manipulate the visualisation in ggplot. After much searching I came across some articles which got me to a working product only after a lot of ball ache.

"I don’t wanna grow up": Age / value relationships for football players

February 1, 2013

Let's get back to the age-value relationship from my last post. I did some more plotting to see for which position this inverted U-shaped relationship is strongest. Please note that I use a data frame called eu.players throughout this post, which holds downloaded football player information from transfermarkt.de. But first, let us get back to the original graph.

Converting a dataset from wide to long

February 1, 2013

I recently had to convert a dataset that I was working with from a wide format to a long format for my analysis.  I struggled with this a bit, but finally found the right sources and the right package to do it, so I thought I'd share my practical ...
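The teaser does not say which package the author settled on, so the sketch below uses base R's `reshape()` as one possible route. The data frame `wide` and its column names are hypothetical.

```r
# Hypothetical wide data: one row per subject, one column per year.
wide <- data.frame(
  id         = 1:3,
  score_2011 = c(10, 20, 30),
  score_2012 = c(11, 21, 31)
)

# Stack the two score columns into a single "score" column,
# with a new "year" column recording which column each value came from.
long <- reshape(
  wide,
  direction = "long",
  varying   = c("score_2011", "score_2012"),
  v.names   = "score",
  timevar   = "year",
  times     = c(2011, 2012),
  idvar     = "id"
)
```

The result has one row per subject and year, that is, six rows with the columns `id`, `year` and `score`.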

Show me the pdf already

February 1, 2013

You’ve got a pdf file and you’d like to view it with whatever the system viewer is. As usual, that requires something special for Windows and something general for the rest of us. Here goes…

openPDF <- function(f) {
  os <- .Platform$OS.type
  if (os=="windows")
    shell.exec(normalizePath(f))
  else {
    pdf <- getOption("pdfviewer", default='')
    if (nchar(pdf)==0)
      stop("The 'pdfviewer'
