(This article was first published on **R – JAGS News**, and kindly contributed to R-bloggers)

There is an opportunity at IARC for a postdoctoral statistician to work on a challenging problem with a big impact.

One of the important tasks of the International Agency for Research on Cancer (IARC) is to collate and publish global cancer statistics. GLOBOCAN (part of the Global Cancer Observatory) is a periodic publication of IARC that gives estimates of cancer incidence and mortality in 184 countries worldwide. The currently published estimates are for the year 2012, and estimates for 2016 are in preparation.

Up until now, GLOBOCAN has reported only point estimates of cancer incidence and mortality. But standards for global health statistics are evolving. The GATHER statement (Guidelines for Accurate and Transparent Health Estimates Reporting) requires “a quantitative measure of the uncertainty of the estimates”, so we now face the challenge of supplementing the point estimates with uncertainty intervals.

GLOBOCAN estimates are widely used by cancer epidemiologists and can be the starting point for further epidemiological research. I use them in my own analyses (see, for example, the articles on the global burden of cancers attributable to infections and the worldwide burden of cancer attributable to HPV), which is why I have been working with the Cancer Surveillance section at IARC to help push this forward. We now have funding for a postdoc to join the team working on this issue.

### The challenge

From the point of view of statistical methodology, this is not a trivial problem. GLOBOCAN does not use a single overarching statistical model, but a variety of different methods based on the availability and quality of data on cancer incidence and mortality in each country. For example, the decision tree below shows how GLOBOCAN chooses the estimation method for cancer incidence.

So how do we add uncertainty intervals? We have some ideas but are open to original input. Our first approach is based on the observation that every branching point in the decision tree represents a choice between a “better” and a “worse” method. We will start by reanalyzing the GLOBOCAN data over all possible paths in the decision tree (based on available data) to estimate the errors accumulated at each branching point. Then we will use these error estimates to quantify the uncertainty for countries where only the worse path is possible.
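As a rough illustration of the first approach described above, here is a minimal Python sketch for a single branching point. It is not the GLOBOCAN methodology: the paired estimates are invented numbers, and the function names (`log_ratio_errors`, `quantile`, `uncertainty_interval`), the log-ratio error measure, and the use of empirical quantiles are all assumptions made for the example.

```python
import math

# Hypothetical toy data for countries where BOTH methods can be applied.
# Each pair is (estimate from the "better" method, estimate from the "worse" method).
# These numbers are invented for illustration; they are not GLOBOCAN data.
paired_estimates = [
    (120.0, 131.0),
    (85.0, 78.0),
    (200.0, 230.0),
    (150.0, 140.0),
    (60.0, 66.0),
    (95.0, 99.0),
]


def log_ratio_errors(pairs):
    """Error of the worse method relative to the better one, on the log scale."""
    return sorted(math.log(worse / better) for better, worse in pairs)


def quantile(sorted_values, p):
    """Linear-interpolation quantile of an already sorted list, with 0 <= p <= 1."""
    pos = p * (len(sorted_values) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def uncertainty_interval(worse_estimate, errors, level=0.9):
    """Interval for a country where only the worse method is available.

    Assumes (hypothetically) that the errors observed where both methods
    exist carry over to this country. Since worse = better * exp(error),
    the better value is recovered as worse * exp(-error).
    """
    tail = (1 - level) / 2
    e_low = quantile(errors, tail)
    e_high = quantile(errors, 1 - tail)
    return worse_estimate * math.exp(-e_high), worse_estimate * math.exp(-e_low)


errors = log_ratio_errors(paired_estimates)
lower, upper = uncertainty_interval(100.0, errors, level=0.9)
print(f"90% uncertainty interval: ({lower:.1f}, {upper:.1f})")
```

In a real analysis the errors would accumulate over several branching points and would probably depend on cancer site, sex, and region, so this single-branch version only conveys the basic idea.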

The research will be guided by two principles:

- The methods used will be computationally intensive, but they have to be feasible and reproducible (we will not be using MCMC for this one).
- The statistical theory can be as sophisticated as you like, but the methods have to be sold to a general audience of cancer epidemiologists. One of the strengths of GLOBOCAN is the transparency of its methods. The uncertainty intervals have to play to this strength.

Funding is currently available up to the end of 2018. I expect the scope of the project to go beyond this date and will look to secure further funding, but I cannot give guarantees at this time. If you are interested, please contact me with any inquiries (plummerm AT iarc DOT fr).
