The question is: can we automate scientific discovery, and what might an interface to such a tool look like.
I’ve been experimenting with automating simple and complex data analysis and report generation tasks for biological data and mostly using R and LATEX. You can see some of my progress and challenges encountered in the presentation below. My ultimate goal is to push forward my current state of fill in the generic template to some kind of interactive and dynamic document generator.
While thinking of a fun way of to present the idea of human-guided data analysis and report generation, I thought of the idea for creating a simple choose your adventure story. I decided to adapt the visualization below into an interactive adventure in R which culminates in the writing of your life story using the magic of knitr.
You can download the story generator, AdventureR, and try it out for yourself. Or take a quick look at some of the possible adventures. Be forewarned some of the story endings are not for the squeamish.
On a practical level, I am in the early prototyping phase of what might a similar application look like for metabolomic data. Currently I am struggling to get beyond the linear workflow and to something more interactive, dynamic and adaptive. Nonetheless its is a beauty to behold when at the click of a button an analysis and report can be generated which would otherwise take many hours to days to manually implement. The challenge is to turn the fill in the template style into an adaptive and robust interface which can quickly guide a human domain expert through the modes of information encoded in the data.
Currently my prototype tools require well defined input templates and flow the data hierarchically down a tree of tasks (e.g. statistics, visualization, functional analysis, clustering), each coming together to generate a mapped network. Right now things are very linear and mostly fill in the template, but still very usefully mimics a set of tasks a bioinformatician might perform. My goal is to adapt the LATEX based text generator into a GUI driven markdown or html based reporting application. With the ultimate goal of increasing the speed and interactivity of the data analysis and interpretation process.