The mythical Myers-Briggs scale.

Over at TC, derailment is the norm. The most recent thread moved onto the Myers-Briggs test. Now this is a test with good face validity. The questions seem to make sense, and the published descriptions also fit.

Now, Wikipedia notes that the test is trademarked and copyrighted, but it is out there, and geeks take it. But is it reliable?

Now, since I can access a library… I fired up Ovid and ran the string “Myers-Briggs” AND (reliability OR validity OR psychometrics) across EMBASE, PsycINFO and PubMed, along with the Cochrane DARE… and got 79 hits. When I added “systematic review” I got two hits, both off topic.

The best paper I can find is old. Really old. It is by Carlson: the citation is Journal of Personality Assessment, Aug 1985, Vol. 49, Issue 4, p. 356.

Carlson notes that the original reliability studies, reported in the manual, found split-half reliability coefficients (Pearson r) of around 0.80, which is acceptable, but alternate data (prior to 1975) gave a range of 0.66 to 0.92. The more recent and shorter forms of the test had test-retest reliabilities of 0.48 to 0.84 for continuous scores. The more recent studies find that all scales change except for extroversion/introversion. The more recent data (i.e. early 1980s) were poor. To quote Carlson:

… the percentage of subjects who retained their specific dichotomous type preferences across all four scales was only 47%. In other words, a subject who had, say an ESTJ preference on first testing had only a 50:50 chance of maintaining the identical preference on every one of the four subscales upon retesting.
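
A quick back-of-envelope check on that 47% figure. Assuming (my assumption, not Carlson's) that the four scales flip independently and are all equally stable, the per-scale retention rate is simply the fourth root of the overall rate. A minimal sketch in Python:

```python
# Back-of-envelope: per-scale retention implied by the overall figure.
# Assumption (not from Carlson): the four scales are independent
# and equally stable on retest.
overall_retention = 0.47  # share keeping the same four-letter type on retest
n_scales = 4

# If each scale is retained with probability p, all four are retained
# with probability p ** 4, so p is the fourth root of the overall rate.
per_scale_retention = overall_retention ** (1 / n_scales)

print(f"Implied per-scale retention: {per_scale_retention:.3f}")
```

Under those assumptions this works out to about 0.83 per scale, so each scale taken alone can look fairly stable while the full four-letter type is closer to a coin toss.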

So, although each scale may look reliable on the split-half figures, this is challenged by the test-retest data: the scales are not reliable enough for a type to remain stable. Well, the paper continues, still using r, and notes that the correlation with the Eysenck extraversion factor is 0.74.
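
To see how a respectable-looking r can still produce unstable types, here is a rough sketch. It rests on assumptions that are mine, not Carlson's: test and retest scores on a scale are bivariate normal, the type cut-off sits at the mean, every scale has the same retest correlation, and the four scales are independent. Under bivariate normality, the chance that two scores with correlation r land on the same side of the mean is 1/2 + arcsin(r)/π. The function names are made up for illustration.

```python
import math


def same_side_probability(r: float) -> float:
    """Chance that test and retest scores fall on the same side of the mean.

    Assumes bivariate normal scores with correlation r and a cut-off at the mean.
    """
    return 0.5 + math.asin(r) / math.pi


def full_type_retention(r: float, n_scales: int = 4) -> float:
    """Chance of keeping the same letter on every scale.

    Assumes independent scales that all share the same retest correlation r.
    """
    return same_side_probability(r) ** n_scales


# The retest correlations below are the ones mentioned in the post.
for r in (0.48, 0.80, 0.84):
    print(
        f"r = {r:.2f}: same letter on one scale {same_side_probability(r):.2f}, "
        f"same four-letter type {full_type_retention(r):.2f}"
    )
```

With r = 0.80 this gives roughly 80% per scale and about 40% for all four letters, which is not far from the 47% Carlson quotes. It is only a model, but it shows that dichotomising a continuous score throws away a lot of stability.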

At this point I am going to snark. The trouble with correlation coefficients is that there is no accounting for chance agreement. I prefer to see Cohen’s kappa here, and those numbers are lower. The current papers overstate the agreement.
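
For anyone who has not met kappa: it takes the observed agreement and subtracts the agreement you would expect by chance from the marginal frequencies alone. A minimal sketch follows. The `cohen_kappa` helper is my own, and the test/retest data are invented purely to show the arithmetic; they are not from Carlson or any other study.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Cohen's kappa for two equal-length sequences of category labels."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(first)

    # Observed agreement: share of subjects given the same label both times.
    observed = sum(a == b for a, b in zip(first, second)) / n

    # Chance agreement: product of the marginal proportions, summed over labels.
    first_counts = Counter(first)
    second_counts = Counter(second)
    expected = sum(first_counts[c] * second_counts[c] for c in first_counts) / n**2

    if expected == 1.0:
        raise ValueError("kappa is undefined when only one category is ever used")
    return (observed - expected) / (1 - expected)


# Invented example: 100 people typed E or I on test and again on retest.
test = ["E"] * 40 + ["I"] * 35 + ["E"] * 15 + ["I"] * 10
retest = ["E"] * 40 + ["I"] * 35 + ["I"] * 15 + ["E"] * 10

raw_agreement = sum(a == b for a, b in zip(test, retest)) / len(test)
kappa = cohen_kappa(test, retest)
print(f"raw agreement = {raw_agreement:.2f}, kappa = {kappa:.2f}")
```

In this made-up case 75% of people keep their letter, but half of that agreement would be expected by chance, so kappa comes out at 0.50.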

The data on validity may not be there: this is an old paper, and there is no more recent review I can find easily, nor a relevant systematic review. But we may be chasing a chimera.

And here I am turning to a literature that I know: personality disorders. The test-retest reliability and the inter-rater reliability here are bad, in part because people change. Our personalities shift, beyond some temperamental hard wiring which is probably about internal or external control, risk tolerance, and novelty seeking (which is a paraphrase of Cloninger). A large part of what we do in therapy is buy time to allow people to grow out of their temperaments. There may be no such thing as a fixed personality. And, if that is the case, the Jungian typology is merely a myth, pleasant but of no earthly use.