...say you have a multivariate dataset and a two-way factorial design - you do a PERMANOVA and the aov-table (adonis is using ANOVA or "sum"-contrasts) tells you there is an interaction - how to proceed when you want to go deeper into the ana...

The Open Governance Index is a new measure developed by VisionMobile that rates open-source projects on their governance process. The index has four facets, described thoroughly in the "Open Governance Index" publication, and briefly below. Access: these criteria assess the availability of source code, a permissive license, developer support mechanisms, a roadmap, and openness

We have been consistently impressed by and enjoyed the wealth of R wisdom available on the R-bloggers aggregation site. Therefore Win-Vector LLC is granting the right to reformat and redistribute (with attribution and link) our blog's R content on the R-bloggers site and feeds. We hope to see our R content shared through this network.

When dealing with transaction cost analysis, a stock’s volume is assumed to be stable or foreseeable. However, the picture is different when we are dealing with an illiquid stock. It is relatively easy to forecast the volume of a liquid stock, because trading volume has high autocorrelation – the volumes at t and t+1 are correlated. For…
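The autocorrelation claim in the excerpt above can be made concrete with a small sketch. This is only an illustration in plain Python, not code from the post: the function name and both volume series are hypothetical, made-up numbers chosen to contrast a smoothly evolving series with an erratic one.

```python
def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation of a sequence of numbers."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(series) / n
    # Denominator: total sum of squared deviations from the mean.
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant")
    # Numerator: products of the deviations at t and t+1.
    numer = sum(
        (series[t] - mean) * (series[t + 1] - mean) for t in range(n - 1)
    )
    return numer / denom


# Hypothetical daily volumes (invented for illustration only).
smooth_volume = [100, 105, 110, 108, 112, 118, 121, 119, 125, 130]
erratic_volume = [100, 5, 120, 0, 90, 10, 130, 2, 95, 8]

print(lag1_autocorrelation(smooth_volume))   # positive: t predicts t+1
print(lag1_autocorrelation(erratic_volume))  # negative: no such persistence
```

A clearly positive value means that today's volume carries information about tomorrow's, which is what makes a naive forecast workable for the smooth series and unreliable for the erratic one.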

Wikimania 2011 came to a close yesterday. For those of you unfamiliar with Wikimania it may be described as UseR for Wikipedia, Wikimedia and MediaWiki all rolled into one. The conference brings together staff, volunteer editors, volunteer developers and users of MediaWiki projects. Of specific interest to R Bloggers readers may be the sessions on…

Introduction. Effect estimation is an important task in modern research. An example is the identification of risk factors for disease and the qualification of medical treatments. Usually, researchers are interested in estimating the global, common effect. Since actual effects tend to differ across populations, estimates based on a sample of a particular population seldom generalize well.

Usability. I am not an expert in Human-Computer Interaction (HCI) at all. Worse, I make the crappiest looking interfaces, typically. So, that's said. Usability. Wikipedia writes that "Usability is the ease of use and learnability of a ...

My last two posts have been about mixture models, with examples to illustrate what they are and how they can be useful. Further discussion and more examples can be found in Chapter 10 of Exploring Data in Engineering, the Sciences, and Medicine. One important topic I haven’t covered is how to fit mixture models to datasets like the Old Faithful geyser...

Programmers should definitely know how to use R. I don’t mean they should switch from their current language to R, but they should think of R as a handy tool during development. Again and again I find myself working with Java code like the following.

I got a paper (unavailable online) to referee about testing for the order (i.e. the number of components) of a normal mixture. Although this is an easily spelled problem, namely estimate $k$ in $\sum_{i=1}^{k} p_i\,\mathcal{N}(\mu_i,\sigma_i^2)$, I came to the conclusion that it is a kind of ill-posed problem. Without a clear definition of what a component is,

Revolution Analytics is hosting several hands-on R training classes over the next few months, with in-person instruction from two leading package authors and experts from the R community. Diethelm Würtz from ETH Zurich will give a two-day master class on Portfolio Selection and Optimization in Practice. Prof Würtz leads the Rmetrics project, and will provide in-depth instruction on using...

Several years ago Gerd Gigerenzer wrote: “Statistical rituals largely eliminate statistical thinking in the social sciences. Rituals are indispensable for identification with social groups, but they should be the subject rather than the procedure of science. Statistical rituals largely eliminate …

Ever have a regression model where the coefficients don't make sense? I've been trying to predict electricity and gas consumption from daily activity schedules but a simple linear regression kept saying that demands should go down the more an activity is performed. Fortunately I found the nnls package and show here how you can use it to...
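To illustrate the idea behind non-negative least squares, here is a minimal sketch. It is not the nnls package and does not reproduce its implementation: it is plain Python using projected gradient descent, swapped in because it fits in a few lines. The function name and the toy data are hypothetical.

```python
def nnls_projected_gradient(A, b, steps=5000):
    """Minimise ||A x - b||^2 subject to x >= 0 (projected gradient descent)."""
    m, n = len(A), len(A[0])
    # Precompute the Gram matrix A^T A and the vector A^T b.
    AtA = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)]
           for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
    # trace(A^T A) bounds its largest eigenvalue, so 1/trace is a safe step.
    step = 1.0 / sum(AtA[i][i] for i in range(n))
    x = [0.0] * n
    for _ in range(steps):
        grad = [sum(AtA[i][j] * x[j] for j in range(n)) - Atb[i]
                for i in range(n)]
        # Gradient step, then clip negatives to zero (projection onto x >= 0).
        x = [max(0.0, x[i] - step * grad[i]) for i in range(n)]
    return x


# Hypothetical toy problem: ordinary least squares gives (2, -1) here,
# so the second coefficient is negative without the constraint.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [2.0, -1.0, 1.0]
coefficients = nnls_projected_gradient(A, b)
print(coefficients)
```

With the constraint in place the offending coefficient is pinned at zero and the remaining one is refitted, which is the behaviour one wants when a negative coefficient has no physical meaning, as with energy demand per activity.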

While my time at the 2011 Joint Statistical Meetings was short--I unfortunately missed some presentations I would have liked to attend--it was a great experience. The collection of academics and professionals is very different from the other con...

In recent years many R packages have been developed to enable image analysis in R. As an alternative, combining R with powerful image analysis software like ImageJ offers many advanced image analysis interfaces and algorithms not yet available in R. Bio7 integrates both applications in a Rich Client Platform based on Eclipse

Here are the 14 slides I used during my talk at the Joint Statistical Meetings 2011: shotwell-jsm-2011.pdf. I'm trying hard to minimize the text in my presentation slides. But this usually requires that I practice more. Hence, you will know which talks I have practiced thoroughly by the amount of text in the slides.

Together with Revolution Analytics, I will be offering two more one-day classes on the Rcpp package for seamless integration of R and C++. The format will follow the workshop Romain and I gave during the tutorial day preceding this year's R/Financ...

A reader asked a question about data from Environment Canada. He wanted to know if that data could somehow be integrated into the RGhcnV3 package. That turned out to be a bit more challenging than I expected. In short order I’d found a couple of other people who had done something similar. DrJ of course was

At the JSM 2011 conference in Miami earlier this week, we conducted an informal poll of attendees on their attitudes with respect to Big Data, statistical software, and data science. JSM is the largest gathering of statisticians in North America, and attendees were invited to complete a survey after logging into the Wi-Fi network. Of the 190 respondents to...

Of course, since we all know Jon Skeet does have various powers, I will move on to unanswered questions, such as whether a user's reputation makes them receive more upvotes for answers. I’ve seen this theory mentioned in multiple places (see any of the comments to Jon Skeet’s answer that are along the lines of “If this was

We have an internal image that floated around work several years ago that details network utilization of TCP over a wide variety of configurations. It is a heatmap created in MATLAB that is just sweet, sweet eye candy. We actually hung it on the outside of a cube for a short while and people couldn't help but stop and...