Some doctors receive more malpractice reports than others. Just how unequal is the distribution of malpractice reports?
The post 1.5 percent of doctors, a quarter of malpratice reports appeared first on Decision Science News.
# The SIR (susceptible, infected, and recovered) model is a common and useful tool in epidemiological modelling.
# In this post and in future posts I hope to explore how this basic model can be enriched by including different population groups or disease vectors.
# Simulation Population Parameters:
# Proportion Susceptible
Sp = .9
# Proportion...
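The excerpt above stops after the first parameter, so the following is only a minimal sketch of a discrete-time SIR update, not the original post's code. Only `Sp = .9` comes from the excerpt; the function name `sir_sketch`, the initial infected share `Ip`, the transmission rate `beta`, the recovery rate `gamma`, and the number of `days` are assumptions chosen for illustration.

```r
# Minimal discrete-time SIR sketch (illustrative assumptions throughout,
# except Sp = 0.9, which is the value given in the excerpt).
sir_sketch <- function(Sp = 0.9, Ip = 0.1, beta = 0.3, gamma = 0.1, days = 100) {
  sus <- numeric(days + 1)  # proportion susceptible
  inf <- numeric(days + 1)  # proportion infected
  rec <- numeric(days + 1)  # proportion recovered
  sus[1] <- Sp
  inf[1] <- Ip
  rec[1] <- max(0, 1 - Sp - Ip)  # guard against tiny negative rounding error
  for (t in seq_len(days)) {
    new_inf <- beta * sus[t] * inf[t]  # new infections in this step
    new_rec <- gamma * inf[t]          # new recoveries in this step
    sus[t + 1] <- sus[t] - new_inf
    inf[t + 1] <- inf[t] + new_inf - new_rec
    rec[t + 1] <- rec[t] + new_rec
  }
  data.frame(day = 0:days, S = sus, I = inf, R = rec)
}

out <- sir_sketch()
```

Each step moves a share of the susceptible group into the infected group and a share of the infected group into the recovered group, so the three proportions always sum to the same total.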
Recently I wrote a blogpost showing the implementation of a simple bubble sort algorithm in pure R code. The downside of that implementation was that it was awfully slow. And by slow, I mean really slow, as in “a 100…
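For readers who have not seen the earlier post, a bubble sort in pure R might look roughly like the sketch below. This is not the author's implementation; the name `bubble_sort_sketch` is hypothetical, and the code is only meant to show why an interpreted double loop gets slow on long vectors.

```r
# Hypothetical pure-R bubble sort, for illustration only.
bubble_sort_sketch <- function(x) {
  n <- length(x)
  if (n < 2) return(x)
  repeat {
    swapped <- FALSE
    for (i in seq_len(n - 1)) {
      if (x[i] > x[i + 1]) {
        # swap neighbouring elements that are out of order
        tmp <- x[i]
        x[i] <- x[i + 1]
        x[i + 1] <- tmp
        swapped <- TRUE
      }
    }
    if (!swapped) break  # a full pass without swaps means x is sorted
  }
  x
}

sorted <- bubble_sort_sketch(c(5, 2, 9, 1, 3))
```

The number of comparisons grows quadratically with the length of the input, and every comparison runs through the R interpreter, which is the usual reason such loops are moved to compiled code.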
This post is minimalistic. Consider this: Now let's have a look at what's inside x: But is it really true? Here you go. A colleague of mine was once ruined by this for an entire day before we realized what was…
I recently updated my plots of the data analysis tools used in academia in my ongoing article, The Popularity of Data Analysis Software. I repeat those here and update my previous forecast of data analysis software usage. Learning to use … Continue reading →
I've been looking at this article for a new tree-based method. It uses other classification methods (e.g. LDA) to find a single variable use in the split and builds a tree in that manner. The subtleties of the model are: The model does not prune but ...
Summary: Two new measures of tree topology are introduced: temporal clustering (TC), and staircase-ness. Several other existing statistics are also implemented for the purpose of comparison: Aldous's graphical test and likelihood test to decide if a tree fits the Yule … Continue reading →
In this blogpost, I want to dive deeper into the explanation of the relationship between Frequency and Recency of Visits with the Conversion Rate and Average Order Value. I have used the RGA package for data extraction and Dr. Hadley Wickham’s ggplot2 package to achieve the visualizations. Here’s the data aggregation script : #transactions dataframe
Over the last year I worked with two colleagues of mine on the subject of inflation and claims inflation in particular. I didn't expect it to be such a challenging topic, but we ended up with more questions than answers. The key question and biggest ch...
Conrad rolled up a new Armadillo release 3.820 (following two minor fix releases in the 0.3.810 series of which we packaged the one that was relevant for us). This new version is now out in a release 0.3.820 of RcppArmadillo which is already on CRAN a...
Guy Freeman writes: I thought you’d all like to know that Stan was used and referenced in a peer-reviewed Rapid Communications paper on influenza. Thank you for this excellent modelling language and sampler, which made it possible to carry out this work quickly! I haven’t actually read the paper, but I’m happy to see Stan
The post Stan!...
Here is the cover of the Japanese translation of our Introducing Monte Carlo methods with R book. A few years after the French translation. It actually appeared last year in August but I was not informed of this till a few weeks ago. The publisher is Maruzen, with an associated webpage if you want to 
My last post discussed a technique for integrating functions in R using a Monte Carlo or randomization approach. The mc.int function (available here) estimated the area underneath a curve by multiplying the proportion of random points below the curve by the total area covered by points within the interval: The estimated integration (bottom plot) is 
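The hit-or-miss idea described above can be sketched in a few lines of R. This is not the `mc.int` function from the post; `mc_area_sketch` is a hypothetical name, and the sketch assumes the function is non-negative on the interval and that its maximum can be approximated on a grid.

```r
# Hypothetical hit-or-miss Monte Carlo integration sketch.
# Assumes f(x) >= 0 on [lower, upper].
mc_area_sketch <- function(f, lower, upper, n = 1e5) {
  grid <- seq(lower, upper, length.out = 1001)
  top <- max(f(grid))            # approximate height of the bounding box
  x <- runif(n, lower, upper)    # random points inside the box
  y <- runif(n, 0, top)
  below <- mean(y <= f(x))       # proportion of points under the curve
  below * (upper - lower) * top  # proportion times total box area
}

set.seed(1)
est <- mc_area_sketch(function(x) x^2, 0, 1)
```

For `x^2` on the unit interval the exact area is 1/3, so the estimate should land close to that value, with an error that shrinks roughly with the square root of `n`.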
In case you missed them, here are some articles from April of particular interest to R users: A critique of a SAS whitepaper comparing the performance of SAS, R and Mahout. A video presentation from statistician Tess Nesbitt at UpStream, who uses GAM survival models in R for marketing attribution analysis. The April edition of the Revolution Analytics newsletter....
Over the past few days, I have been introduced to a few new-to-me R packages, via some comments from the Shiny guys and the R-bloggers site. This seems a rather haphazard way of acquiring knowledge and I cannot be alone in thinking that this is not the most productive way to become aware of new/better
Stack Exchange is a series of question-and-answer sites, including Stack Overflow for programming and Cross Validated for statistics. I was introduced to these sites at a short talk by Barry Rowlingson at the 2011 UseR! meeting, “Why R-help must die!“ These sites have a lot of advantages over R-help: The format is easier to read, 
I was recently asked by a client to create a large number of “proof of concept” visualizations that illustrated the power of R for compiling and analyzing disparate datasets. The client was specifically interested in automated analyses of global data. A little research led me to the WDI package. The WDI package is a tool
The post Global...
How to have a better chance of a good outcome. Making mistakes There’s been a lot of talk recently about data analysis problems with spreadsheets. If you’ve not stuck your head out of your cave lately, then you can catch some of the discussion by doing an internet search for: Reinhart Rogoff There are several
The post Living...
Update: I am aware the table of contents is not being displayed in bullet form as I intended. The web template I'm using seems to be buggy. It also seems to think this page is in Indonesian...Working on it!
Table of Contents:
A couple of days ago, I had an opportunity to give a guest lecture on our Rcpp package for R and C++ integration. This was in CMSC 12300 Computer Science with Applications-3 in the Department of Computer Science at University of Chicago. The course i...
Preparing and reshaping data is the ever continuing task of a data analyst. Luckily we have many tools for it. The default tool in R would be reshape(), although this is so user friendly that a reshape package has been added too. I try to use reshape()...
In my courses on R, I usually show how to insert a picture as a background for a graph. But it is also possible to see the picture as an object, and to insert it in a graph anywhere we like, as explained on the awesome blog http://rsnippets.blogspot.ca/…. (in a post published in January 2012). I wanted...
When I first saw a graphic made from Yihui’s animation package (Xie, 2012) I was amazed at the magic and thought “I could never do that”. Passage of time… One night I found myself bored and as usual avoiding work. … Continue reading →
With Stéphane Tufféry, we were working this week on a chapter of a book, entitled Statistical Learning in Actuarial Science. The chapter should be based on R functions, and we wanted to reproduce some outputs he previously obtained with SAS. The good thing is that even complex functions (logistic regression, regression trees, etc) produce the same kind of outputs....
As noted in paragraph 18.4.1 of the book Veterinary Epidemiologic Research, logistic regression is widely used for binary data, with the estimates reported as odds ratios (OR). While it’s appropriate for case-control studies, risk ratios (RR) are preferred for cohort studies as RR provides estimates of probabilities directly. Moreover, it is often forgotten the assumption 
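To make the difference between the two measures concrete, here is a small worked example with made-up cohort counts. The numbers and variable names are hypothetical and are not taken from the book or the post.

```r
# Hypothetical 2x2 cohort table (made-up counts, for illustration only).
exposed_cases      <- 30
exposed_healthy    <- 70
unexposed_cases    <- 10
unexposed_healthy  <- 90

# Risk = cases / everyone in the group
risk_exposed   <- exposed_cases / (exposed_cases + exposed_healthy)
risk_unexposed <- unexposed_cases / (unexposed_cases + unexposed_healthy)
risk_ratio <- risk_exposed / risk_unexposed

# Odds = cases / non-cases in the group
odds_exposed   <- exposed_cases / exposed_healthy
odds_unexposed <- unexposed_cases / unexposed_healthy
odds_ratio <- odds_exposed / odds_unexposed
```

With these counts the risk ratio is 3 while the odds ratio is 27/7, about 3.86, which shows how the OR overstates the RR when the outcome is not rare.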
(This article was first published on Econometrics by Simulation, and kindly contributed to R-bloggers)
# I am interested in how small bits of individualized instructions can create collective action.
# In this simulation I will give a single instruction to each individual in the swarm.
# Choose another individual who is not too close, then accelerate towards that individual.
# I also...