Surgery still outperforms GLP‑1 drugs over longer follow-up
A meta-analysis comparing bariatric surgery with GLP‑1 drugs found the gap widened as follow-up got longer. The signal is strong, but still confounded by non-random treatment selection and real-world access differences.
People talk about obesity treatment like it’s a clean contest: shots vs surgery. But in real life, the “winner” is often decided upstream by access, coverage, and who can tolerate what for long enough.
A new systematic review and meta-analysis tried to do the obvious comparison anyway: across published studies, how do glucagon-like peptide‑1 (GLP‑1) receptor agonists stack up against bariatric surgery at different time horizons?
The access story behind the headline
In 2026, it’s hard to read anything about GLP‑1s without running into the machinery around them: insurers, PBMs (pharmacy benefit managers), shortages, pilot programs, prior authorization, step therapy. That same machinery shapes whether a patient ever gets to “choose” meds versus surgery.
Intermediaries matter a great deal. Regulators are still litigating the basics of access and pricing in older categories like insulin, with PBM behavior in the crosshairs (for example, STAT’s reporting on PBM insulin access allegations). Policy is also experimenting with new models for complex care (see STAT on the CMS Aspire model). That matters here because durable weight loss is not just biology; it’s a long-term relationship with a system.
What the study actually did
This paper is a systematic review and meta-analysis of 15 studies (20,594 participants) comparing bariatric surgery with GLP‑1 receptor agonists (PubMed | DOI).
Important: this is not a randomized, head-to-head trial. These comparisons are vulnerable to a boring but powerful problem: the people who get surgery are often different from the people who get medication (baseline BMI, comorbidities, readiness for intensive follow-up, ability to get surgery approved, etc.). Pooling results doesn’t magically erase that.
What the authors did that’s useful is split outcomes by time window:
- around 6 months
- up to 1 year
- beyond 1 year
That framing matches the lived question: what holds up after the initial honeymoon period?
What they found (directionally)
The headline result is not subtle: the longer the follow-up, the more surgery pulls ahead.
On average across the included studies (MD is the pooled mean difference; negative values favor surgery):
- Weight change
  - ~6 months: no statistically clear difference (MD -12.19 kg)
  - ≤1 year: favored surgery (MD -16.97 kg)
  - >1 year: favored surgery (MD -19.78 kg)
- BMI change (favored surgery across windows)
  - ~6 months: MD -6.77 kg/m²
  - ≤1 year: MD -5.10 kg/m²
  - >1 year: MD -6.61 kg/m²
- Blood sugar measures
  - beyond 1 year: larger HbA1c reduction with surgery (HbA1c is a 2–3 month average blood sugar; MD -1.69%)
They also report substantial heterogeneity, which is a polite way of saying: these are not uniform “apples to apples” comparisons. Different drugs and doses. Different surgery types. Different baseline patient profiles. Different follow-up intensity.
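If “pooled mean difference” and “substantial heterogeneity” feel abstract, here is a minimal Python sketch of one common way those numbers are produced: DerSimonian-Laird random-effects pooling, plus the I² statistic. This is not the authors’ code, and the summary above doesn’t say which model the paper used. The function name and the three study values are made up purely for illustration; they are not data from the paper.

```python
import math

def pool_random_effects(effects, variances):
    """Pool per-study mean differences with a DerSimonian-Laird random-effects model.

    effects   -- per-study mean differences (e.g. kg, surgery minus GLP-1)
    variances -- per-study sampling variances (standard error squared)
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance (tau^2), truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variation attributed to between-study differences.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each study's variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)
    return {"pooled": pooled, "ci": ci, "tau2": tau2, "i2": i2}

# HYPOTHETICAL inputs: three invented studies, not values from the paper.
example_effects = [-8.0, -15.0, -25.0]   # mean differences in kg
example_variances = [4.0, 4.0, 4.0]      # each study has SE = 2 kg
result = pool_random_effects(example_effects, example_variances)
print(result)
```

In this toy example the studies disagree far more than their own standard errors would predict, so I² comes out high and the confidence interval around the pooled estimate widens. That is the practical meaning of “substantial heterogeneity”: the pooled number is an average over studies that are not measuring quite the same thing.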
The more interesting question: what are we comparing, exactly?
If you treat this paper as a literal scoreboard, you’ll miss the point.
Bariatric surgery is a one-time intervention with an ecosystem of pre-op selection, peri-op risk, and post-op follow-up. GLP‑1 therapy is a chronic intervention with discontinuation, side effects, coverage churn, and supply issues. The most meaningful outcomes depend on:
- who stays on therapy and for how long
- who can access surgery and maintain follow-up
- what happens when people stop and restart
In other words, the comparison is partly about treatment durability, but also about implementation durability.
What this adds (and what it doesn’t)
What it adds: a clear picture that when you look across the literature, the gap tends to widen with longer follow-up.
What it doesn’t add: clean causal evidence. We still can’t say how much of the difference is biology vs selection and system effects.
What to watch next
For a version of this question that actually settles arguments, you’d want:
- better head-to-head evidence (randomized, or at least carefully matched cohorts)
- longer follow-up on newer agents and combination regimens
- endpoints people care about: cardiovascular events, kidney outcomes, quality of life, adverse events