Random sudokus [test]

May 17, 2010

Robin Ryder pointed out to me that 3 is indeed the absolute minimum one could observe because of the block constraint (good grief, but of course!). Since the distribution of the series of 3 digits is independent over blocks, the theoretical distribution under uniformity can easily be simulated: #uniform distribution on the block diagonal
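A minimal Python sketch of that kind of simulation follows. It assumes (the teaser does not say so explicitly) that the statistic of interest is the number of distinct digits on the main diagonal of the grid, and that each of the three diagonal blocks contributes 3 distinct digits drawn uniformly and independently. The function name and parameters are hypothetical, not the original post's code.

```python
import random
from collections import Counter


def simulate_distinct_diagonal_digits(n_draws=100_000, seed=0):
    """Estimate the distribution of the number of distinct digits on the
    main diagonal, assuming the three diagonal blocks are independent
    (assumption taken from the teaser, not verified here)."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_draws):
        diagonal = []
        for _ in range(3):
            # Within one 3x3 block the three diagonal digits must differ.
            diagonal.extend(rng.sample(range(1, 10), 3))
        counts[len(set(diagonal))] += 1
    # Possible values run from 3 (the block-constraint minimum) to 9.
    return {k: counts[k] / n_draws for k in range(3, 10)}


if __name__ == "__main__":
    for k, freq in simulate_distinct_diagonal_digits().items():
        print(k, round(freq, 4))
```

Because every block already supplies 3 distinct digits, no draw can produce fewer than 3 distinct values, which matches the minimum mentioned above.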

Read more »

Rcpp 0.8.0

May 17, 2010

Romain and I are happy to announce the release of Rcpp version 0.8.0. It has been uploaded to CRAN. A Debian upload is delayed until the now-required inline package is accepted into Debian. The source package is also available from here. This release ...

Read more »

Lambda Distribution

May 17, 2010

(This article was first published on Rmetrics blogs, and kindly contributed to R-bloggers)

Read more »

Winning the first game in a baseball series: a harbinger, or not?

May 17, 2010

For those not familiar with major-league baseball in the US (and despite living here for more than 10 years, I still include myself in that category), the games are usually played in series: team A visits the home of team B, and the two teams play two or more games against each other on successive days. It's common wisdom...

Read more »

Example 7.37: calculation of Hotelling’s T^2

May 17, 2010

Hotelling's T^2 is a multivariate statistic used to compare two groups, where multiple outcomes are observed for each subject. Here we demonstrate how to calculate Hotelling's T^2 using R and SAS, and test the code using a simulation study then apply ...
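As a rough illustration of the statistic itself (not the R or SAS code from the post), here is a self-contained Python sketch of the standard two-sample Hotelling's T^2 with a pooled covariance matrix. The function names are hypothetical, the linear solve is a plain Gaussian elimination, and the pooled covariance matrix is assumed to be nonsingular.

```python
def mean_vector(rows):
    """Column means of a list of equal-length observation rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter_matrix(rows, mean):
    """Sum of outer products of deviations from the mean."""
    p = len(mean)
    s = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - m for x, m in zip(row, mean)]
        for i in range(p):
            for j in range(p):
                s[i][j] += d[i] * d[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hotelling_t2(group1, group2):
    """Return (T^2, F) for two groups of multivariate observations.
    Under the usual assumptions F follows F(p, n1 + n2 - p - 1)."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = mean_vector(group1), mean_vector(group2)
    s1, s2 = scatter_matrix(group1, m1), scatter_matrix(group2, m2)
    p = len(m1)
    pooled = [[(s1[i][j] + s2[i][j]) / (n1 + n2 - 2) for j in range(p)]
              for i in range(p)]
    diff = [a - b for a, b in zip(m1, m2)]
    w = solve(pooled, diff)  # pooled^{-1} (mean1 - mean2)
    t2 = (n1 * n2 / (n1 + n2)) * sum(d * wi for d, wi in zip(diff, w))
    f_stat = t2 * (n1 + n2 - p - 1) / (p * (n1 + n2 - 2))
    return t2, f_stat
```

With a single outcome per subject, T^2 reduces to the square of the pooled two-sample t statistic, which gives a quick sanity check.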

Read more »

Index of the R-Sessions

May 17, 2010

The R-Sessions are a series of blog entries on using R. A large part consists of an R-manual I once wrote. Other posts include some tricks I found out, as well as entries detailing functions and packages I wrote for ...

Read more »

Hitting the Big Data Ceiling in R

May 16, 2010

As a true R fan, I like to believe that R can do anything, no matter how big, how small or how complicated: there is some way to do it in R. I decided to approach my large, sparse matrix problem with this attitude. But here I sit a broken man. There is no “native” big data support built into...

Read more »

Graphing using R

May 16, 2010

Long-time readers of the Stubborn Mule will know that charts are a regular feature here. Almost all of these charts were produced using the R statistical software package which, in my view, produces far superior results to the most commonly used graphing tool: Excel. As a community service to help rid the world of horrible

Read more »

Random sudokus

May 16, 2010

After thinking about random sudokus for a few more weeks, I eventually came to read the paper by Newton and DeSalvo about the entropy of sudoku matrices. As written earlier, if we consider (as Newton and DeSalvo do) a uniform distribution where the sudokus are drawn uniformly over the set of all sudokus, the entropy of

Read more »

A 34 Minute Video on Using R to Analyse Winter Olympic Medal Data

May 16, 2010

In this post I present a 34-minute video on using R. The video is based on an analysis of 1924 to 2006 Winter Olympic Medals that I presented previously in text form. The video aims to show what an interactive session in R might look like using ...

Read more »

Emulating Internet Traffic in Load Tests

May 15, 2010

One of the recurring questions in the GCaP class last week was: How can we make web-application load tests more representative of real Internet traffic? The sticking point is that conventional load-test simulators like LoadRunner, JMeter, and httperf, ...

Read more »

Typo in Bayesian Core [again]

May 15, 2010

Reza Seirafi from Virginia Tech sent me the following email about Bayesian Core, which alas is pointing out a real typo in the reversible jump acceptance probability for the mixture model: With respect to the expression provided on page 178 for the acceptance probability of the split move, I was wondering if the omission of

Read more »

Linear regression models with robust parameter estimation

May 15, 2010

There are situations in regression modelling where robust methods could be considered to handle unusual observations that do not follow the general trend of the data set. There are various packages in R that provide robust statistical methods which are summarised on the CRAN Robust Task View. As an example of using robust statistical estimation in

Read more »

A small customization of ESS

May 14, 2010

JD Long (at Cerebral Mastication) posted a question on Twitter about an artifact in ESS, where typing “_” gets you “<-”. This is because in the early days of S+, “_” was an allowed assignment operator, and ESS was developed in that era. Later, it was disallowed in favor of “<-” and “=”, so ESS

Read more »

Because it’s Friday: Optical Illusion

May 14, 2010

See more of the best illusions of 2010 at the link below. Best Illusion of the Year Contest: Top finalists in the 2010 contest

Read more »

New R User Group in Boston

May 14, 2010

There's another new R User Group, this time in Boston: the New England R User Group. Their first meeting will be on Tuesday, May 25. Get all the info by joining the Google Group at the link below. Google Groups: New England R User Group

Read more »

Introducing IBrokers (and Jeff Ryan)

May 13, 2010

Josh had kindly invited me to post on FOSS Trading around the time when he first came up with the idea for the blog. Fast forward a year and I am finally taking him up on his offer. I'll start by highlighting that while all the software in this post is indeed free (true to FOSS), an account with...

Read more »

In case you missed it: April Roundup

May 13, 2010

In case you missed them, here are some articles from last month of particular interest to R users. We announced the availability of Revolution R Community 3.2 (based on R 2.10.1), now 100% open source, and including a new doMC package for parallel computing on Windows. We announced that Revolution R Enterprise is now available free of charge to...

Read more »

Introduction to using R in research

May 13, 2010

I was recently asked to give a talk to our graduate school annual conference. I offered several titles and the one they picked was Using R in research. I'm not sure if this was a good idea or not. The graduate school covers PhD students across three ar...

Read more »

Using R, LaTeX, and Sweave for Reproducible Research: Handouts, Templates, & Other Resources

May 13, 2010

Several readers emailed me or left a comment on my previous announcement of Frank Harrell's workshop on using Sweave for reproducible research asking if we could record the seminar. Unfortunately we couldn't record audio or video, but take a look a...

Read more »

Is it possible to get a causal smoothed filter?

May 12, 2010

Although I haven't been all that much of a fan of moving average based methods, I've observed some discussions and made some attempts to determine if it's possible to get an actual smoothed filter with a causal model. Anyone who's worked on financial ...
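To make the term "causal" concrete, here is a small Python sketch of a trailing moving average, where each output depends only on the current and earlier observations. This is a generic illustration under that definition, not the method discussed in the post; the function name and window handling are assumptions.

```python
def causal_moving_average(values, window):
    """Trailing moving average: output i uses values[max(0, i-window+1) .. i],
    so no future observation ever enters the smoothed value."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the observation that just left the window.
            running -= values[i - window]
        # Early outputs average over the shorter available history.
        smoothed.append(running / min(i + 1, window))
    return smoothed
```

The price of causality is lag: a trailing average of width w trails a centered one by roughly (w - 1) / 2 observations, which is the trade-off behind the question in the title.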

Read more »

pimax(mcsm)

May 12, 2010

The function pimax from our package mcsm is used to reproduce Figure 5.11 of our book Introducing Monte Carlo Methods with R. (The name comes from using the Pima Indian R benchmark as the reference dataset.) I got this email from Josué: I ran the ‘pimax’ example from the mcsm manual, and it gave

Read more »

Manual variable selection using the dropterm function

May 12, 2010

When fitting a multiple linear regression model to data, a natural question is whether the model can be simplified by excluding variables from it. There are automatic procedures for undertaking these tests, but some people prefer to follow a more manual approach to variable selection rather than pressing a button and taking what comes

Read more »

Revolution Analytics and R in the news

May 12, 2010

It was quite the media frenzy for Revolution and R last week. In conjunction with our relaunch as Revolution Analytics, we spoke to more than a dozen journalists and analysts to explain why we think R is at the center of a perfect storm for predictive analytics: with routine collection of large data sets, data analysis is now a...

Read more »

Reflections on consulting part 5 – what languages and tools to learn?

May 12, 2010

What languages and tools should you learn as a math/stat consultant? To jump to the answer: Excel/VBA, SQL, R, Java, and Python. Spreadsheets have many problems with verifiability and scalability, so why Excel? Excel is: useful for prototyping ideas quickly, either for your own use or to show to other team members; well-known and understood

Read more »

What Social Network Analysis software do you use?

May 12, 2010

See the poll here by Gabriel Rossman at Code and Culture. I voted for R and ‘igraph’. If you use R you are getting access to all the other wonderful things that come with R. Using a specialized package like Pajek or UCINET requires constantly going back and forth between network software and some other

Read more »