Psychiatry Outcomes Benchmarks: PHQ-9 Response and Remission Rates

Published benchmarks for PHQ-9 response and remission across therapy, medication management, and TMS — with real-world data from the OutcomesAI platform.

Why These Numbers Matter

Behavioral health is moving toward value-based care. Payers, health systems, and accreditation bodies increasingly want practices to demonstrate outcomes — not just document visits. But most practices don’t know what “good” looks like.

This page publishes the benchmarks that matter most: PHQ-9 response and remission rates across the three primary treatment modalities in outpatient psychiatry — therapy, medication management, and TMS. Where we have data from our own platform, we include it. Where we don’t, we cite the best available published evidence.

Use these numbers to set expectations with patients, benchmark your own program, and prepare for value-based contract conversations.

How to Read These Numbers

Response is defined as a ≥50% reduction in PHQ-9 score from baseline. It means a patient experienced meaningful clinical improvement.

Remission is defined as a PHQ-9 score below 5. It means a patient’s symptoms are minimal or absent, the highest threshold of treatment success.

These are the same definitions used by NCQA’s HEDIS Depression Remission or Response (DRR-E) measure, CMS MIPS Measure #370, and the majority of published clinical literature. They are the standard your payers use.
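The two definitions above reduce to a couple of comparisons. The following is a minimal sketch, not a reference implementation of any measure specification; the function names and the example scores are hypothetical, and treating a zero baseline as "no response" is an assumption made here because a percentage reduction is undefined in that case.

```python
def is_response(baseline: int, endpoint: int) -> bool:
    """Response: PHQ-9 fell by at least 50% from baseline."""
    if baseline <= 0:
        return False  # assumption: no percentage reduction without a positive baseline
    # Integer form of (baseline - endpoint) / baseline >= 0.5, avoiding float rounding.
    return 2 * (baseline - endpoint) >= baseline


def is_remission(endpoint: int) -> bool:
    """Remission: endpoint PHQ-9 score below 5."""
    return endpoint < 5


# Hypothetical (baseline, endpoint) pairs, for illustration only.
for baseline, endpoint in [(18, 8), (16, 4), (6, 4)]:
    print(baseline, endpoint, is_response(baseline, endpoint), is_remission(endpoint))
```

With these made-up pairs, 18 to 8 counts as response without remission, 16 to 4 as both, and 6 to 4 as remission without response, so the two criteria are worth tracking separately.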

Benchmark Summary

Treatment Modality	Response Rate	Remission Rate	Population	Source
Psychotherapy (CBT / outpatient)	~46%	~22%	Real-world outpatient, 5,554 episodes	Coley et al., 2021 (Kaiser Permanente)
Medication Management (first-line)	~47%	~28–33%	Outpatient MDD, first adequate trial	STAR*D trial, Rush et al., 2006
TMS — published benchmark	50–60%	30–40%	Treatment-resistant depression	Published RCT meta-analyses
TMS — real-world consensus	Up to 83% improvement	>50%	Broad real-world settings	Trapp et al., Clinical Neurophysiology, 2025
TMS — OutcomesAI platform	71%	32%	4,000+ patients, 5 years	OutcomesAI, 2026

By Modality

Psychotherapy

Published benchmark: ~46% response / ~22% remission

Real-world psychotherapy outcomes are lower than clinical trial figures, largely because trial populations are more carefully selected. A large study of 5,554 outpatient psychotherapy episodes across two integrated health systems (Kaiser Permanente Colorado and Washington) found a response rate of 46% and a remission rate of 22% using PHQ-9 criteria, with follow-up measured at 14–180 days after treatment initiation.

Source: Coley RY et al. Predicting outcomes of psychotherapy for depression with electronic health record data. Journal of Affective Disorders Reports. 2021.

CBT-specific trials report higher remission rates under controlled conditions — one well-designed study reported 48% remission at 18 weeks for in-person CBT. The gap between controlled trial figures and the real-world 22% remission rate reflects the difference between protocol-adherent research and routine clinical practice.

Source: Lewis CC et al. Applying machine learning to identify predictors of CBT response. Cognitive Therapy and Research. 2015.

What this means for your practice: A response rate below 40% or remission rate below 18% in your therapy program warrants a closer look at session frequency, treatment fidelity, and patient engagement. A response rate above 46% exceeds the published real-world benchmark.
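As a rough illustration, the thresholds in the paragraph above can be written as a small check. This is only a sketch: the function name and the three labels are hypothetical, and the cut-offs are simply the 40%, 18%, and 46% figures quoted above, expressed as fractions.

```python
def check_therapy_program(response_rate: float, remission_rate: float) -> str:
    """Compare a program's rates (fractions, e.g. 0.42) with the thresholds in the text."""
    if response_rate < 0.40 or remission_rate < 0.18:
        return "review"  # below the levels the text says warrant a closer look
    if response_rate > 0.46:
        return "above benchmark"  # exceeds the published real-world response rate
    return "near benchmark"


# Hypothetical program rates, for illustration only.
print(check_therapy_program(0.38, 0.21))
print(check_therapy_program(0.49, 0.24))
```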

Medication Management

Published benchmark: ~47% response / ~28–33% remission (first adequate trial)

The STAR*D trial, a large-scale, real-world antidepressant effectiveness study of 4,041 outpatients with major depressive disorder, found a Level 1 (first antidepressant trial, with citalopram) response rate of approximately 47% and a remission rate of 28–33%, depending on the symptom scale used (28% by HAM-D, 33% by QIDS-SR). Because STAR*D measured outcomes on these scales rather than the PHQ-9, treat the figures as an approximate comparison point for PHQ-9-based rates. Response and remission rates both declined with each subsequent treatment trial, underscoring that outcomes worsen as treatment resistance deepens.

Source: Rush AJ et al. Acute and Longer-Term Outcomes in Depressed Outpatients: STAR*D. American Journal of Psychiatry. 2006. Note: A 2023 reanalysis (Pigott et al., BMJ Open) identified methodological concerns with the original STAR*D reporting; the figures cited here reflect the original Level 1 published data, which remain the most widely referenced benchmark in clinical practice.

For patients who have already failed one or more antidepressant trials, response and remission rates in subsequent trials are substantially lower — making the identification of treatment-resistant patients a key clinical and commercial priority.

What this means for your practice: Systematic PHQ-9 tracking makes it possible to identify early, before they disengage from care, the patients who are not responding to medication, and to start a conversation about alternative interventions such as TMS.
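One way to picture this kind of early identification is a simple flag over each patient's PHQ-9 history. The sketch below is hypothetical throughout: the function name, the data layout, and the cut-off values passed in the example are placeholders chosen for illustration, not clinical guidance and not figures from the studies cited on this page.

```python
def flag_for_review(scores_by_patient: dict[str, list[int]],
                    min_reduction: float,
                    min_followups: int) -> list[str]:
    """Return patients whose latest PHQ-9 shows less than min_reduction from baseline.

    Each list holds scores in chronological order, baseline first.
    """
    flagged = []
    for patient_id, scores in scores_by_patient.items():
        if not scores:
            continue
        baseline, followups = scores[0], scores[1:]
        if baseline <= 0 or len(followups) < min_followups:
            continue  # not enough data to judge yet
        reduction = (baseline - followups[-1]) / baseline
        if reduction < min_reduction:
            flagged.append(patient_id)
    return flagged


# Hypothetical histories and placeholder cut-offs (25% reduction, 2 follow-ups).
histories = {"p1": [20, 19, 18], "p2": [18, 12, 8], "p3": [15, 14]}
print(flag_for_review(histories, min_reduction=0.25, min_followups=2))
```

In this made-up data only "p1" is flagged: "p2" has improved well past the placeholder cut-off, and "p3" does not yet have enough follow-up scores to evaluate.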

TMS (Transcranial Magnetic Stimulation)

Published benchmark: 50–60% response / 30–40% remission (standard protocols, treatment-resistant population)

Real-world consensus: Up to 83% improvement, >50% remission (broader population)

TMS benchmarks vary significantly with patient selection. Clinical trials, which specifically enroll treatment-resistant patients, report more conservative figures. Real-world studies, which include a broader range of severity and treatment history, consistently report higher rates.

A 2025 consensus review of 2,396 studies, endorsed by the National Network of Depression Centers, the Clinical TMS Society, and the International Federation of Clinical Neurophysiology, confirmed that in real-world settings, up to 83% of patients show meaningful improvement and more than half achieve full remission.

Source: Trapp NT et al. Consensus review and considerations on TMS to treat depression. Clinical Neurophysiology. 2025;170:206–233.

OutcomesAI platform data: 71% response / 32% remission

Across 4,000+ patients treated at a large multi-site TMS practice over five years, using PHQ-9 scores collected through the OutcomesAI platform as part of routine clinical workflow:

71% achieved response (≥50% PHQ-9 reduction) by end of treatment course
32% achieved remission (PHQ-9 < 5) by end of treatment course
Median response occurred at session 30 of a standard 30–36 session course

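The 71% response rate exceeds published RCT benchmarks for standard rTMS protocols in treatment-resistant populations (50–60%) and is in line with real-world consensus data (up to 83% improvement). The 32% remission rate falls within the published RCT range (30–40%) and below the >50% figure reported in the real-world consensus review. These results reflect outcomes under consistent clinical protocols and systematic measurement, not controlled trial conditions.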
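Cohort figures of this kind (response rate, remission rate, and the median session at which response occurs) could be derived from per-session scores roughly as follows. This is a sketch under stated assumptions, not a description of how the OutcomesAI platform computes them: the data layout and function names are hypothetical, end-of-course status is taken from the last recorded score, and "session of response" is assumed to mean the first follow-up session at which the ≥50% threshold is met.

```python
from statistics import median


def responded(baseline: int, score: int) -> bool:
    """True when score is at least a 50% reduction from a positive baseline."""
    return baseline > 0 and 2 * (baseline - score) >= baseline


def cohort_summary(courses: dict[str, list[tuple[int, int]]]) -> dict:
    """courses: patient id -> [(session_number, phq9), ...], baseline reading first."""
    if not courses:
        raise ValueError("at least one treatment course is required")
    responders, remitters, first_sessions = 0, 0, []
    for readings in courses.values():
        baseline, final = readings[0][1], readings[-1][1]
        if final < 5:
            remitters += 1
        if responded(baseline, final):
            responders += 1
            # Earliest follow-up session at which the response threshold was met.
            first_sessions.append(
                next(s for s, score in readings[1:] if responded(baseline, score))
            )
    n = len(courses)
    return {
        "response_rate": responders / n,
        "remission_rate": remitters / n,
        "median_response_session": median(first_sessions) if first_sessions else None,
    }


# Hypothetical courses, for illustration only.
example = {
    "a": [(0, 20), (10, 15), (20, 9), (30, 6)],
    "b": [(0, 16), (10, 12), (20, 10), (30, 4)],
    "c": [(0, 18), (10, 17), (20, 15), (30, 14)],
    "d": [(0, 22), (10, 10), (20, 8), (30, 7)],
}
print(cohort_summary(example))
```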

Why Most Practices Can’t Report These Numbers

The benchmarks above require one thing most practices don’t have: consistent, longitudinal PHQ-9 data collected at the right intervals across their entire patient population.

The most commonly reported failure modes:

PHQ-9 collected at intake only — no endpoint score means no response or remission calculation
Inconsistent collection — some providers use it, others don’t, making population-level aggregation impossible
No connection between appointment records and scores — PHQ-9 data exists but can’t be linked to treatment episodes
Manual processes — scores entered in notes or on paper, never structured in the EHR
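The first and third failure modes above come down to whether each treatment episode can be paired with a baseline and an endpoint score. A minimal sketch of that linkage, assuming scores are available as (date, PHQ-9) pairs: the function name and data layout are hypothetical, the 14–180 day follow-up window is borrowed from the Coley et al. study cited earlier, and taking the most recent score on or before the episode start as baseline is an assumption.

```python
from datetime import date, timedelta


def episode_scores(episode_start: date,
                   scores: list[tuple[date, int]],
                   min_days: int = 14,
                   max_days: int = 180):
    """Return (baseline, endpoint) PHQ-9 scores for one episode, or None if either is missing."""
    before = [pair for pair in scores if pair[0] <= episode_start]
    window = [
        pair for pair in scores
        if timedelta(days=min_days) <= pair[0] - episode_start <= timedelta(days=max_days)
    ]
    if not before or not window:
        return None  # intake-only or unlinked data: no rate can be calculated
    baseline = max(before, key=lambda pair: pair[0])[1]  # latest score at or before start
    endpoint = max(window, key=lambda pair: pair[0])[1]  # latest score inside the window
    return baseline, endpoint


# Hypothetical episode and scores, for illustration only.
start = date(2025, 1, 6)
history = [(date(2025, 1, 6), 17), (date(2025, 1, 13), 15),
           (date(2025, 3, 3), 8), (date(2025, 9, 1), 5)]
print(episode_scores(start, history))
print(episode_scores(start, history[:1]))
```

In this made-up history the score from 13 January falls before the window and the one from 1 September after it, so the episode is summarized as baseline 17 and endpoint 8. With the intake score alone, the function returns None.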

PHQ-9 response — defined as a ≥50% reduction from baseline — is considered the preferred metric for comparing depression treatment outcomes because it does not favor higher or lower baseline symptom severity, indicates clinically meaningful improvement, and is straightforward to calculate and audit.

The measure is simple. The infrastructure to collect it consistently is not — unless it’s built into the workflow from the start.

How OutcomesAI Makes This Possible

OutcomesAI integrates directly with your EHR to automate PHQ-9 (and GAD-7, PCL-5, C-SSRS) collection as part of your existing clinical workflow. Scores are structured, timestamped, and linked to treatment episodes — so response and remission calculations happen automatically.

The result is the same kind of longitudinal outcomes data published on this page — generated by your practice, about your patients, available for every payer conversation, quality review, and contract negotiation you have.

Sources

Coley RY et al. Predicting outcomes of psychotherapy for depression with electronic health record data. Journal of Affective Disorders Reports. 2021.
Lewis CC et al. Applying machine learning to identify predictors of CBT response. Cognitive Therapy and Research. 2015.
Rush AJ, Trivedi MH, Wisniewski SR, et al. Acute and Longer-Term Outcomes in Depressed Outpatients: STAR*D. American Journal of Psychiatry. 2006.
Pigott HE et al. What are the treatment remission, response and extent of improvement rates after up to four trials of antidepressant therapies? BMJ Open. 2023.
Trapp NT et al. Consensus review and considerations on TMS to treat depression. Clinical Neurophysiology. 2025;170:206–233.
NCQA. Depression Remission or Response for Adolescents and Adults (DRR-E). 2025.
OutcomesAI. TMS Outcomes in Real-World Clinical Practice. 2026.

This page is updated annually. Benchmarks reflect published literature and OutcomesAI platform data as of 2026. All patient data is de-identified in accordance with HIPAA Safe Harbor standards.