Penn AI scans 400K Reddit posts for hidden Ozempic side effects

A Penn team used LLMs on 400,000+ Reddit posts about semaglutide, surfacing menstrual, temperature, and fatigue patterns not in clinical reports.

May 29, 2026


A University of Pennsylvania team published a study in Nature Health on May 24, 2026 that used large language models to analyze more than 400,000 Reddit posts from approximately 70,000 users discussing semaglutide and related GLP-1 drugs over more than five years. The AI-mediated analysis surfaced several symptom patterns that don't appear prominently in formal clinical trial reports: menstrual irregularities (reported by nearly 4% of the users who described side effects), temperature-related symptoms (chills, hot flashes), and unexplained fatigue, which ranked as the second most common patient-reported symptom.

The study doesn't establish causation. What it does show is that the side-effect patterns patients report from real-world use differ materially from the side-effect profile captured in pharmaceutical trials, which makes this a methodological story as much as a drug-safety story.

What happened

The Penn team, led by senior author Sharath Chandra Guntuku with co-authors Neil Sehgal, Lyle Ungar, and Jena Shaw Tronieri, used GPT and Gemini large language models to systematically extract and standardize symptom descriptions from Reddit discussions. The methodological backbone was matching freeform patient language to MedDRA (Medical Dictionary for Regulatory Activities) terms, the standardized symptom vocabulary used in clinical pharmacovigilance.
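To make the normalization step concrete, here is a minimal sketch of mapping freeform phrases to standardized labels. It is not the study's method: the team used LLMs for this step, while the sketch uses a small hand-written lookup. The lookup table, its labels, and the function name `normalize_symptoms` are all hypothetical, and the labels are illustrative rather than taken from the licensed MedDRA dictionary.

```python
import re

# Hypothetical phrase -> standardized label lookup (illustrative only;
# these labels are NOT copied from the MedDRA dictionary).
PHRASE_TO_TERM = {
    "so tired": "Fatigue",
    "exhausted": "Fatigue",
    "no energy": "Fatigue",
    "freezing": "Feeling cold",
    "chills": "Chills",
    "hot flash": "Hot flush",
    "period is late": "Menstruation irregular",
    "missed my period": "Menstruation irregular",
    "nausea": "Nausea",
    "nauseous": "Nausea",
}


def normalize_symptoms(post: str) -> set[str]:
    """Return the standardized labels whose trigger phrases occur in the post."""
    # Lowercase and collapse whitespace so phrase matching is less brittle.
    text = re.sub(r"\s+", " ", post.lower())
    return {term for phrase, term in PHRASE_TO_TERM.items() if phrase in text}


if __name__ == "__main__":
    print(normalize_symptoms("Week 3: I'm SO tired and always freezing."))
```

A fixed lookup like this misses paraphrases, negation ("no nausea at all") and context, which is presumably why an LLM-based extraction is attractive for messy forum text.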

Headline findings:

  • 44% of users in the dataset mentioned at least one side effect
  • Gastrointestinal symptoms were the most common (consistent with trials)
  • Unexplained fatigue was the second most common symptom, though it is less prominent in trial reporting
  • Menstrual irregularities appeared in roughly 4% of side-effect-reporting users
  • Temperature dysregulation (chills, hot flashes, fever-like sensations) appeared as a coherent symptom cluster
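For a sense of scale, the percentages above can be turned into rough head counts. The sketch below is a back-of-envelope calculation, not a figure from the study: it assumes the rounded numbers quoted in this article (about 70,000 users, 44%, roughly 4%), and the function name `implied_counts` is hypothetical.

```python
# Rounded inputs as quoted in this article; the true study counts may differ.
TOTAL_USERS = 70_000          # "approximately 70,000 users"
SHARE_ANY_SIDE_EFFECT = 0.44  # share of users mentioning at least one side effect
SHARE_MENSTRUAL = 0.04        # "roughly 4%" of side-effect-reporting users


def implied_counts(total: int, share_any: float, share_subset: float):
    """Return (side-effect reporters, subset count, subset share of all users)."""
    reporters = round(total * share_any)
    subset = round(reporters * share_subset)
    return reporters, subset, subset / total


if __name__ == "__main__":
    reporters, menstrual, share = implied_counts(
        TOTAL_USERS, SHARE_ANY_SIDE_EFFECT, SHARE_MENSTRUAL
    )
    print(f"{reporters} reporters, {menstrual} menstrual reports, {share:.1%} of all users")
```

Under these assumptions, roughly 30,800 users mentioned a side effect and about 1,232 of them mentioned menstrual irregularities, or about 1.8% of all users in the dataset. The denominator matters here: the 4% figure is relative to users who reported side effects, not to everyone in the dataset.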

The authors emphasize that the study doesn't prove the medications caused these symptoms: Reddit users self-select and self-attribute their symptoms, and there is no control group. The contribution is identifying patterns that warrant formal clinical investigation rather than dismissal as unrelated.

Why it matters

For the strength-peptide and GLP-1 audience, two implications:

The clinical trial side-effect profile is incomplete. The standard sources of side-effect information (FDA labels, Phase 3 trials) are populated by adverse events that meet specific reporting thresholds in controlled populations. Real-world use produces a broader signal that takes years to surface through standard post-marketing surveillance. AI-mediated social-media mining is a much faster way to surface that signal.

Women on GLP-1s deserve more attention. The menstrual-irregularity signal is one of the clearer female-specific patterns and aligns with anecdotal community reports. As GLP-1s expand into broader populations, their interactions with female reproductive health remain underexplored.

For the broader frame on GLP-1s in the strength-peptide community, see "semaglutide + peptide stack: protecting lean mass on GLP-1" and "tirzepatide vs semaglutide for body composition".

The temperature-dysregulation signal is also worth noting. Anecdotally, some users have reported feeling cold on GLP-1s; the Reddit-mining work elevates this from anecdote to an identified pattern worth investigating mechanistically.

What to watch

A few downstream questions this kind of study sets up:

  • Replication. Whether independent teams can replicate the symptom-pattern findings with different LLM approaches or different datasets
  • Mechanistic investigation. Whether the menstrual and temperature findings have plausible GLP-1-mediated mechanisms that warrant prospective study
  • FDA response. Whether the agency updates label warnings or pharmacovigilance practices in response to AI-mediated signal detection
  • Broader peptide application. This methodology could be applied to BPC-157, TB-500, and other strength peptides where formal trial data is thin. The Reddit and forum data on these compounds is substantial; LLM-mediated analysis could surface side-effect patterns that haven't been formally documented

For the related discussion on peptide evidence quality, see "the Croatian BPC-157 problem" and "Frontiers Aging peptide review".
