Statistical Breakdown.

The preface to Matt Briggs's book is worth the price of the book alone, for these two quotes:

Bayes is not what you think. Hypothesis testing should immediately and forever be tossed into the scrapheap of intellectual history and certainly never taught to the vulnerable.

The love of theory is the root of all evil

An example is in this week's JAMA Psychiatry. It conflates two problems: for a test to work it has to clearly identify a difference between normal and pathological, and that difference, which can be measured in many ways and abstracted as an effect size, has to have enough post-test predictive value to be of clinical use [1]. But neuroimaging does not work this way.
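
What "post-test predictive value" means in practice is easiest to see with Bayes in odds form. Here is a minimal sketch in Python; the 80% sensitivity, 90% specificity and 5% base rate are made-up illustrative numbers, not figures from the paper.

```python
def post_test_probability(pre_test_prob, sensitivity, specificity):
    """Probability of the condition after a positive test result,
    via the positive likelihood ratio (Bayes' theorem in odds form)."""
    lr_positive = sensitivity / (1 - specificity)   # LR+ = sens / (1 - spec)
    pre_odds = pre_test_prob / (1 - pre_test_prob)  # probability -> odds
    post_odds = pre_odds * lr_positive              # update the odds
    return post_odds / (1 + post_odds)              # odds -> probability

# Illustrative numbers only: 80% sensitivity, 90% specificity, 5% base rate.
print(post_test_probability(0.05, sensitivity=0.80, specificity=0.90))  # ~0.30
```

Even a test that separates the groups cleanly leaves the post-test probability modest when the base rate is low, which is the clinical-use problem.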

So the paper considers how to fix this, summarized in this diagram:

[Figure caption, from the paper:] A, Brain regions with true signal in one patient. Voxels in blue were assigned a true, moderate effect size of Cohen d = 0.5. Below, signal plus simulated noise; independent noise was added for each patient. Right panel: Results from a group t test (N = 30). Left, the true effect size. Right, post hoc effect sizes from significant voxels (P < .001 uncorrected). As expected, the estimated effect size of every significant voxel was higher than the true effect size. B, Expected effect size inflation for the maximal effect size across a family of tests. This bias, shown here using Monte Carlo simulation (Gaussian noise, 10,000 samples), increases as a function of the log number of tests performed and is approximated by the extreme value distribution (EV1). Effect size inflation increases as both the number of tests increases and the sample size decreases. C, Machine learning can maximize valid effect size estimates. The effect size for the difference between viewing negative and neutral images for an amygdala region of interest from the SPM Anatomy Toolbox version 2.2c (d = 0.66) and for a whole-brain multivariate signature optimized to predict negative emotion (d = 3.31). The test participants (N = 61) were an independent sample not used to train the signature, so the effect size estimates are unbiased.

There are a number of problems.

  1. The brain shown is only a mathematical model. It would be more honest, though probably distressing to the reader, to use a cube instead.
  2. I hate effect sizes because I do not know what they are measuring. They are extracted from measures that try to ascertain a clinical state; I know the benefits and risks of those measures, but not what the abstracted effect size tells me.
  3. Trying to model how to make the uncertain certain is a fallacy
  4. You cannot have hypotheses without a theory, and in psychiatry the theories that work relate to glia, synapses and networks more than to anatomy.
  5. But millions of dollars have been poured into neuroimaging, despite a prediction (which should have been published) that it would not work.

This is foolishness. Briggs is correct: you do your empirical testing once you have induced the problem and worked out a way to predict effects, not gather the data and let the statistical equivalent of an organ grinder produce a precious significant result, which is probably random noise.
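
To make the "probably random noise" point concrete, here is a minimal Monte Carlo sketch of the inflation the figure caption describes. This is not the authors' code: the voxel count, the number of simulated studies, and the use of a one-sample t test per voxel are my assumptions; only the true d = 0.5, N = 30 and the P < .001 uncorrected threshold echo the caption.

```python
import numpy as np
from scipy.stats import t as t_dist

rng = np.random.default_rng(0)

# Assumptions echoing the figure: true d = 0.5, N = 30 patients, many voxels
# tested at P < .001 uncorrected, averaged over a number of simulated studies.
true_d, n, n_voxels, n_studies, alpha = 0.5, 30, 1000, 200, 0.001
t_crit = t_dist.ppf(1 - alpha / 2, df=n - 1)  # two-sided critical value

post_hoc = []
for _ in range(n_studies):
    # Every voxel carries the same true signal (d = 0.5) plus independent noise.
    data = rng.normal(loc=true_d, scale=1.0, size=(n_voxels, n))
    d_hat = data.mean(axis=1) / data.std(axis=1, ddof=1)  # observed Cohen's d per voxel
    t_stat = d_hat * np.sqrt(n)                           # one-sample t statistic
    survivors = d_hat[np.abs(t_stat) > t_crit]            # voxels passing the threshold
    if survivors.size:
        post_hoc.append(survivors.mean())

print(f"true d = {true_d:.2f}; "
      f"mean post hoc d among significant voxels = {np.mean(post_hoc):.2f}")
```

On these assumptions every voxel that survives the threshold reports an effect size above the true 0.5, because passing the threshold requires it; selecting the significant voxels and then quoting their effect sizes is exactly the circularity the caption illustrates.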

_______
1. I have just submitted an abstract that uses Bayes' Theorem. Mea culpa. Mea maxima culpa.