(This article was first published on **Decision Science News » R**, and kindly contributed to R-bloggers)

WITH DATA LIKE THESE, WHO CAN SAY?

Decision Science News is no stranger to misleading infographics in free New York newspapers. We could upgrade to real newspapers, but we find that playing “spot the infographic flaw” really makes the time fly on the subway.

We saw the above graphic in a paper called Metro. In New York City, restaurants are graded by health inspectors and receive an “A”, “B”, or “C” rating (any lower than C and they are shut down). This graphic was supposed to inform us about the percentage of restaurants with As, by borough and citywide. Can you spot the goof?

You might be curious how the weighted average of 73.3%, 62.8%, 63.6%, 61.4%, and 66.2% could be 69% (shown in the red box), given that 73.3%, the only figure above 69%, gets the smallest weight in the average.

Ignoring the top row in the table, a simple calculation from the remaining numbers gives 63.6% as the percentage of restaurants with As. But which stat is correct? Perhaps the top row is correct and some other numbers in the table are wrong.
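For the curious, here is a sketch of that calculation, using the per-borough counts of graded restaurants and the stated percentages as they are entered in the R code at the end of this article (Bronx, Brooklyn, Manhattan, Queens, Staten Island, in that order). It assumes those figures were transcribed correctly from the graphic:

```latex
\[
\frac{0.662(2204) + 0.628(5235) + 0.636(9086) + 0.614(5030) + 0.733(899)}
     {2204 + 5235 + 9086 + 5030 + 899}
\approx \frac{14273}{22454}
\approx 63.6\%
\]
```

Staten Island, the only borough above 69%, accounts for just 899 of the 22,454 graded restaurants (about 4%), so it cannot pull the citywide figure anywhere near 69% on its own.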

Amazingly, the same day, AM New York, yet another free paper, ran more or less the same story, but with different numbers. Based on those, 68% of restaurants had As. Disappointingly, all their by-borough percentages failed to line up with those from Metro (see R code at end of this article).

Decision Science News then tried to cut out the middleman and hit the New York City Department of Health and Mental Hygiene Website. Pie charts are much maligned, but when it comes to the topic of food safety, why not? If it were up to us, we would have drawn in a crust and whipped cream, but then our taste in charts is controversial.

Taken from http://www.nyc.gov/html/doh/downloads/pdf/rii/restaurant-grading-1-year-report.pdf

So, we now have four candidate figures, 69%, 63.6%, 68% and 69%, which are in no way independent, but do suggest the answer is “just shy of 70%”.

Another interesting tidbit in the health department’s report is that the restaurant grades may be effective at changing restaurants’ behavior. At first inspection, 39% of restaurants got As, 34% got Bs, and 27% got Cs. From page 3:

Among those scoring in the B range on initial inspection, nearly 40% improved to earn an A on reinspection. Of restaurants that scored in the C range on their initial inspection, 72% improved enough to earn an A or B on re-inspection.

There you have it!

R CODE FOR R NERDS

################

#Metro data

graded=c(22454,2204,5235,9086,5030,899)

asMetrostated=c(.69,.662,.628,.636,.614,.733)

asMetrocount=round(asMetrostated*graded)

metro=data.frame(graded,asMetrostated,asMetrocount)

row.names(metro)=c("citywide","bronx","brooklyn","manhattan","queens","statenIsland")

metro

sprintf("Total graded: %d", sum(metro[2:6,1]))

sprintf("Total As: %d", sum(metro[2:6,3]))

sprintf("Percent As: %.2f", (sum(metro[2:6,3]) / sum(metro[2:6,1]) * 100))

################

#AM New York data

statenIsland=c(644,73,20,82)

queens=c(3009,601,152,806)

brooklyn=c(3197,619,152,774)

bronx=c(1394,260,54,332)

manhattan=c(5792,1006,256,1314)

am=data.frame(bronx,brooklyn,manhattan,queens,statenIsland)

am[5,]=apply(am[1:4,],2,sum)

am[6,] = am[1,]/am[5,]

row.names(am)=c("A","B","C","GradePending","total","As")

round(am,2)

sprintf("Percent As: %.2f", sum(am[1,]) / sum(am[5,]) * 100)
