Following the course, in order to define association measures (from Kruskal (1958)) or concordance measures (from Scarsini (1984)), define a concordance function as follows: let be a random pair with copula , and with copula . Then define the so-...

This post will examine the Heber Valley Railroad, a small-town tourist attraction, using event gravitational pull. Using the information from part 1, it looks at the two factors associated with an event's gravity: the number of participants and the distance they traveled. The number of participants can be shown using bar charts, histograms, and summary tables. The distance traveled...

Let's get that started ... Fun stuff with R coming soon! Disclaimer beforehand: The analyses I'll present are not meant to be taken too seriously in a scientific way. I just wanna show what you can do with R as a programming language, basic statistics and...

Previously: This book and the associated R package were introduced before. Executive Summary: A very nice, and enlightening, discussion of a wide range of topics. Principles: The Introduction to the book sets out 5 principles. This is probably the most important part of the book. The principles are: We don’t know much in …

In response to an update to ggplot2 (now version 0.9.2) I had to make some minor changes to our package TripleR. The CRAN maintainers also asked to “… Please also fix other issues that may be apparent in checks with a current R-devel.” Now, how can this be done? Here’s my workflow on Mac OS

One-Way ANOVA: Analysis of variance is a tool used for a variety of purposes. Applications range from a common one-way ANOVA, to experimental blocking, to more complex nested designs. This first ANOVA example provides the necessary tools to analyze data using this technique. This example will show a basic one-way ANOVA. I will save the
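To make the idea concrete, here is a minimal sketch of a basic one-way ANOVA in base R. It is not the example from the post itself: the `PlantGrowth` data set (which ships with R) and the variable names `fit`, `anova_table`, `f_value` and `p_value` are assumptions chosen for illustration.

```r
# PlantGrowth ships with R: dried plant weights for a control group
# and two treatment groups (a stand-in data set, not the post's data).
data(PlantGrowth)

# Fit a one-way ANOVA: does mean weight differ between the groups?
fit <- aov(weight ~ group, data = PlantGrowth)

# summary() returns a list; its first element is the ANOVA table
anova_table <- summary(fit)[[1]]
print(anova_table)

# Pull out the F statistic and p-value for the group effect (first row)
f_value <- anova_table[["F value"]][1]
p_value <- anova_table[["Pr(>F)"]][1]
```

Saving the fitted model in an object, as above, makes it easy to reuse later, for example in `TukeyHSD(fit)` for pairwise comparisons.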

An update to the wordcloud package (2.2) has been released to CRAN. It includes a number of improvements to the basic wordcloud. Notably, you may now pass it text and Corpus objects directly, as in:

#install.packages(c("wordcloud","tm"),repos="http://cran.r-project.org")
library(wordcloud)
library(tm)
wordcloud("May our children and our children's children to a thousand generations, continue to enjoy the

Hi Internet! I’m Preeya, and I will be your guide in this blog’s quantitative quest for knowledge. To get started, let’s talk about pricing. Part of the Kickstarter process is figuring out how much a hypothetical product will cost once it’s on the market. But how accurately can that be calculated without actually going through …

Today's guest post comes from Yihui Xie, author of the knitr package — ed. Hi, this is Yihui Xie, and I'm guest posting on the Revolutions blog to talk about one aspect of the knitr package: how we can integrate data analysis and reporting in R with the Web. This post includes both the work that has been done...

While back-testing trading strategies I want all assets to have a long history. Unfortunately, sometimes there is no tradeable stock or ETF with sufficient history. For example, I might use GLD as a proxy for Gold allocation, but GLD only began trading in November of 2004. We can extend GLD’s historical returns with its
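The general idea of extending a short return history can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper `extend_returns()` is hypothetical, the monthly return numbers are made up (they are not real GLD or gold data), and the post may well splice the series differently.

```r
# Hypothetical monthly returns, for illustration only (not real market data).
# 'proxy' has the longer history; 'etf' only starts in 2004-11.
proxy <- c("2004-08" = 0.010, "2004-09" = -0.020, "2004-10" = 0.015,
           "2004-11" = 0.030, "2004-12" = -0.010)
etf   <- c("2004-11" = 0.029, "2004-12" = -0.011)

# Hypothetical helper: use proxy returns for every period before the
# asset's first observation, then the asset's own returns afterwards.
# Assumes names are "YYYY-MM" strings, which sort chronologically.
extend_returns <- function(asset, proxy) {
  first_date <- min(names(asset))
  early <- names(proxy)[names(proxy) < first_date]
  combined <- c(proxy[early], asset)
  combined[order(names(combined))]
}

extended <- extend_returns(etf, proxy)
print(extended)
```

Splicing returns rather than prices avoids having to rescale price levels at the join date.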

I have tried one of my previous scripts with an updated igraph version and I got an interesting (pretty much unexpected) error: "At type_indexededgelist.c:269 : invalid (odd) length of edges vector, Invalid edge vector". The problem is that it was a well-te...

Suppose that you accepted my argument from the last two posts on halo effects and bifactor models. As you might recall, I argued that when respondents complete rating scales, they predominantly rely on their generalized impression with a more minor role played by the specific features that the ratings were written to measure. Consequently, we...

More and more makers of electronic devices use standard storage media to record data. Sometimes this is central to the device's function, as in a camera, so that the data must be easy to recover. Other times, it's effectively incidental, and the device maker may not provide easy access to the stored data....

This guest post is by Douglas McNair MD PhD, Engineering Fellow & President, Cerner Math Inc. -- ed. RevoScaleR scaling big-data modeling performance for real-time health data analysis at Cerner The size of data sets is increasing much more rapidly than the speed of cores, of RAM, and of disk drives. This is particularly true of electronic health records...

Using progress bars in R scripts can provide valuable timing feedback during development and additional polish to final products. winProgressBar and setWinProgressBar are the primary functions for creating progress bars in R on Windows. Progress bars, and progress indicators in general, are relatively uncommon in R programming. This makes sense, as they can add bloat and, being...
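As a rough illustration of the pattern, here is a minimal sketch of a progress bar wrapped around a loop. Note the swap: `winProgressBar` is only available on Windows, so this sketch uses `txtProgressBar` and `setTxtProgressBar` from the utils package instead, which follow the same create, update, close pattern on any platform. The loop body and the names `n`, `pb` and `total` are assumptions for illustration.

```r
# Cross-platform stand-in for winProgressBar: a text progress bar.
n <- 20
pb <- txtProgressBar(min = 0, max = n, style = 3)  # style 3 shows a percentage

total <- 0
for (i in seq_len(n)) {
  total <- total + i          # placeholder for the real work in each iteration
  setTxtProgressBar(pb, i)    # move the bar to the current iteration
}

close(pb)                     # finish the bar and end the output line
```

On Windows, the same loop should work with `winProgressBar()` and `setWinProgressBar()` in place of the text versions.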

About 8 years ago, I was sitting in class listening to a guest lecturer talk about how community events can be described like celestial bodies with their own gravity, where the size and importance of the event would attract more people, from farther away. Much like a black hole, where the bigger the mass of the black hole the...

The paper is “Not Fooled by Randomness: Using Random Portfolios to Analyze Investment Funds” by Roberto Stein. Here is an explanation of the idea of random portfolios. Favorite sentence: The real question here is whether we’re actually measuring skill, or these are still measures of performance, so influenced by extraneous factors that the existence of …

What is the best resource to learn an R package? Many R users know the almighty question mark ? in R. For example, type ?lm and you will see the documentation of the function lm. If you know nothing about a package, you can take a look at the HTML help...
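For readers who have not used the help system much, here is a small sketch of the calls mentioned above. The object names `lm_help` and `stats_index` are assumptions for illustration; `help()` is the function behind the `?` shorthand.

```r
# ?lm is shorthand for help("lm"); storing the result instead of
# printing it lets us look at the help object without opening a viewer.
lm_help <- help("lm")

# Package-level overview: description and index of documented topics
stats_index <- help(package = "stats")

# Both objects display their help content when printed, e.g. by typing
# lm_help or stats_index at the console.
class(lm_help)
class(stats_index)
```

In an interactive session you would normally just type `?lm` or `help(package = "stats")` directly.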

Working with knitr and markdown is a great way to share quick reports with colleagues, but in cases where IE8 is still the dominant browser, shipping an HTML file with embedded graphics is a non-starter. IE8 does not support the Data URI format used to...

Jean-Michel Marin visited me in Paris last week and, besides taking part in Pierre’s PhD defence, we made enough progress to close two more chapters of the new edition of Bayesian Core (soon to be Bayesian Essentials with R!). This follows the good work session we had in Carnon where we also completed two chapters

In my last post, I described and demonstrated the CountSummary procedure to be included in the ExploringData package that I am in the process of developing. This procedure generates a collection of graphical data summaries for a count data sequence, based on the distplot, Ord_plot, and Ord_estimate functions from the vcd package. The distplot function generates both the Poissonness...