John Kay muses on interpreting statistical data:

Always ask of such data “**what is the question to which this number is the answer?**”. “*Earnings before interest, tax, depreciation and amortisation on a like-for-like basis before allowance for exceptional restructuring costs*” is the answer to the question “*what is the highest profit number we can present without attracting flat disbelief?*”.

And on the pitfalls of powerful data analysis tools:

When the data seem to point to an unexpected finding, always consider the possibility that the problem is a feature of the data, rather than a feature of the world. […] It is now easy to import data into a computer program without thought. The unwarranted precision of the projected growth in rail traffic – a 96 per cent increase, rather than a doubling – is a clue that the number was generated by a computer, not a skilled interpreter of evidence.

Statistics are only as valid as the sources from which they are drawn and the abilities of those who use them. When I discover something surprising in data, the most common explanation is that I made a mistake.


To **leave a comment** for the author, please follow the link and comment on their blog: **CYBAEA Data**.
