Articles by mike

repoRter.nih: a convenient R interface to the NIH RePORTER Project API

March 13, 2022 | mike

Introduction: The US National Institutes of Health (NIH) received funding of approximately $42 billion in fiscal year 2022; $31 billion (72%) of this was awarded by the NIH in the form of research grant funding to hospitals, medical colleges, non-profits, businesses, and other organizations based in the U.S. and abroad.[https://nexus.od....
[Read more...]

Bayesian Model Based Optimization in R

January 30, 2021 | mike

Model-based optimization (MBO) is a smart approach to tuning the hyperparameters of machine learning algorithms with less CPU time and manual effort than standard grid search approaches. The core idea behind MBO is to directly evaluate fewer points within a hyperparameter space, and to instead use a “surrogate model” which ... [Read more...]

GAMs and scams: Part 1

January 11, 2021 | mike

People who do statistical modeling for insurance applications usually know their way around a GLM pretty well. In pricing applications, GLMs can produce a reasonable model to serve as the basis of a rating plan, but in my experience they are usually followed by a round of “selections” – a process ...
[Read more...]

Color: The Cinderella of dataviz

March 13, 2009 | mike

“Avoiding catastrophe becomes the first principle in bringing color to information: Above all, do no harm.”  — Envisioning Information, Edward Tufte, Graphics Press, 1990    Color is one of the most abused and neglected tools in data visualization. It is abused when we make poor color choices; it is neglected when we rely ... [Read more...]

People who love scatter plots & connecting dots

February 20, 2009 | mike

We hosted the first Dataviz Salon SF on Tuesday night, with lightning talks by boredom cop Shane Booth, dataviz wiz Lee Byron, computational journalist Brad Stenger, data wrangler Pete Skomoroch, and any/all data enthusiast Brendan O’Connor. I was going to blog all about it, but Tom Carden of ... [Read more...]

What I’ll be presenting at O’Reilly Money Tech 2009

October 21, 2008 | mike

(April 2009 Update: Unfortunately, the Money Tech Conference was indefinitely postponed, but fortunately I will be presenting a version of this talk in July at OSCON 2009.) I’ve been invited to speak at O’Reilly’s Money Tech conference this coming February 4-6th in New York City and thought I’... [Read more...]

How do you measure a major league slugger?

September 1, 2008 | mike

I gave a talk last month at SAP Labs in Palo Alto, along with Jim Porzak of ResponSys, introducing the R Statistical Language to a Business Intelligence interest group. The goal was to highlight how open source tools, like R, can be used to build predictive models. The example I ... [Read more...]
