335 search results for "boxplot"

Basketball Data Part III – BMI: Does it Matter?

June 11, 2014

For those of you who are just joining us, please refer back to the previous two posts on scraping XML data and length of NBA career by position. The next idea I wanted to explore was whether BMI had any …

Box plot, Fisher’s style

June 4, 2014

In a recent issue of Significance, I discovered an interesting – and amusing – figure, about some box & beard plot, in Dr Fisher’s casebook: Beard the statistician in his den. In French, the box plot (introduced by John Tukey, not George Box, as discussed in a previous post) is popular under the name boîte à moustaches (box with a...

Basketball Data Part II – Length of Career by Position

June 2, 2014

In the previous post, I showed how easy it is to use R to scrape XML tables from websites (I used the XML package to scrape some basic basketball data). In this post, I’ll explore the idea that NBA career …

A quick look at FX realized vol

May 31, 2014

Much has been said about the decline in volatility. At the moment I am very active in FX spot trading and as a generalization do better the more vol there is. I wanted to see how things stood on the crosses I am most active in, namely EUR/USD, GBP/USD and USD/JPY. I took hourly data from FxPro (not my broker, nor...

Automated determination of distribution groupings – A StackOverflow collaboration

May 18, 2014

For those of you not familiar with StackOverflow (SO), it's a coder's help forum on the StackExchange website. It's one of the best resources for R-coding tips that I know of, due entirely to the community of users that routinely give expert advice (as...

Vectorizing IPv4 Address Conversions – Part 2

May 17, 2014

The previous post looked at using the Vectorize() function to, well, vectorize, our Rcpp IPv4 functions. While this is a completely acceptable practice, we can perform the vectorization 100% in Rcpp/C++. We’ve included both the original Rcpp IPv4 functions and the new Rcpp-vectorized functions together to show the minimal differences between them:

    #include <Rcpp.h>
    #include <boost/asio/ip/address_v4.hpp>
    using namespace Rcpp;
    using namespace boost::asio::ip;
    // Rcpp/C++ vectorized routines
    // [[Rcpp::export]]
    NumericVector rcpp_rinet_pton (CharacterVector...

Dining in San Francisco – Let R Guide You

May 6, 2014

I’m frequently asked by newcomers to R to provide an easy-to-follow, generic set of instructions on how to download data, transform it, aggregate it, make graphs, and write it all up for publication in a high-impact journal – all by the end of the day! While such a request is somewhat

Comrades Marathon: Negative Splits and Cheating

May 6, 2014

With this year’s Comrades Marathon just less than a month away, I was reminded of a story from earlier in the year. Mark Dowdeswell, a statistician at Wits University, found evidence of cheating by some middle- and back-of-the-pack Comrades runners. He identified a group of 20 athletes who had suspicious negative splits:

There is no “Too Big” Data, is there?

April 23, 2014

A few years ago, a former classmate came back to me with a simple problem. He was working for some insurance company (and still is, don’t worry, chatting with me is not yet a reason for dismissal), and his problem was that their dataset was too large to run (standard) code to get a regression and some predictions. My...
