A new study published in JAMA Network Open finds that diagnostic interviews, the standard method for diagnosing mental health conditions, vary significantly in reliability depending on the disorder. Although these interviews are often considered the gold standard, they fall short of being a definitive benchmark with excellent validity and reliability, according to lead author Laura Duncan, a psychiatry professor at McMaster University.
The study reviewed test-retest reliability data from February 2024 to September 2025, using Cohen's kappa coefficient, which measures agreement beyond what chance alone would produce, to gauge consistency. Substance use disorders, particularly opioid use disorder, showed the highest reliability, largely because their criteria are behavior-based. For example, patients can more accurately report how many drinks they had in a week than how many days they felt sad or anxious.
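To make the kappa statistic concrete, here is a minimal Python sketch of how it could be computed for a test-retest comparison. The function name and the ten diagnoses are invented purely for illustration and are not data from the study; the sketch assumes each patient receives a simple yes/no diagnosis (1 or 0) at two separate interviews.

```python
from collections import Counter


def cohens_kappa(first, second):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    if len(first) != len(second) or not first:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)

    # Observed agreement: share of patients given the same label both times.
    observed = sum(a == b for a, b in zip(first, second)) / n

    # Expected agreement by chance, from each interview's label frequencies.
    counts_first = Counter(first)
    counts_second = Counter(second)
    expected = sum(
        counts_first[label] * counts_second[label] for label in counts_first
    ) / (n * n)

    if expected == 1.0:
        # Both interviews used a single identical label; agreement is total.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical diagnoses for ten patients (1 = disorder present, 0 = absent).
interview_1 = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
interview_2 = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

print(round(cohens_kappa(interview_1, interview_2), 3))  # prints 0.583
```

In this made-up case the two interviews agree on 8 of 10 patients, but chance alone would be expected to produce agreement on 52 percent of them, so kappa comes out near 0.58 rather than 0.8. A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance.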
Dr. Michael First, a psychiatrist at Columbia University and author of the Structured Clinical Interview for DSM-5, criticized the study for not differentiating between fully structured and semi-structured interviews. Fully structured interviews yield more consistent results but lack flexibility, while semi-structured interviews allow clinicians to ask follow-up questions for greater accuracy but may produce more variable results. Duncan acknowledged the limitation, noting that the data needed to compare interview formats are often unavailable.
Despite decades of hope for objective laboratory tests for mental conditions, none have materialized. Duncan suggested a future shift toward viewing symptoms on a spectrum rather than sorting them into strict diagnostic categories.