Watchful Waiting in Pediatric Sleep Apnea: What the Data Really Shows About Behavioral and Cognitive Outcomes

jrotenberg3
Feb 14

A critical look at the evidence that challenges conventional interpretations

If you've read about the landmark CHAT study on pediatric obstructive sleep apnea, you might have come away thinking: "Surgery didn't improve cognitive outcomes, so maybe we're doing too many tonsillectomies." That interpretation has gained traction in some circles, but it misses a crucial part of the story.

As a pediatric neurologist and sleep specialist who regularly counsels families facing this decision, I've noticed a troubling gap between how these studies are summarized and what they actually found. Let me walk you through what the data really shows—and why the "watchful waiting is fine" narrative deserves a closer look.

The Study That Changed the Conversation

The Childhood Adenotonsillectomy Trial (CHAT) was a game-changer: 464 children with sleep apnea, randomly assigned to either early surgery or watchful waiting, followed for seven months with rigorous outcome measures. Published in the New England Journal of Medicine in 2013, it remains one of the most cited studies in this field.

Here's what everyone remembers: The primary outcome—formal neuropsychological testing of attention and executive function—showed no significant difference between surgery and watchful waiting.

And here's what too many people forget: Every single behavioral measure told a completely different story.

The Behavioral Endpoints Nobody Talks About

Yes, behavioral endpoints were evaluated—extensively. In fact, they were pre-specified secondary outcomes, measured with validated instruments and analyzed rigorously. Here's what they found:

Parent Reports (Conners' Rating Scale and BRIEF)

Children in the watchful waiting group showed significantly worse:

Restlessness and impulsivity
Emotional lability
Executive function in daily life (BRIEF scale)

Effect sizes were moderate to large—meaning clinically meaningful differences.

Teacher Reports

Classroom behavior was significantly worse in the watchful waiting group. These weren't just parents noticing differences at home; teachers who didn't know which group a child was in saw the same problems at school.

Quality of Life

Multiple validated measures (PedsQL, OSA-18) showed significantly worse quality of life in children who didn't have surgery. Think about that: kids were sleeping worse, functioning worse, and feeling worse about their daily lives.

Symptoms and Sleepiness

The Pediatric Sleep Questionnaire and Epworth Sleepiness Scale both showed more symptoms and greater daytime sleepiness in the watchful waiting group.

Treatment Failures

All nine treatment failures occurred in the watchful waiting group. These weren't just statistical blips—these were children who deteriorated to the point where intervention became necessary, experiencing school behavioral problems, worsening sleep quality, morning headaches, and even hypertension.

So Why Didn't Formal Testing Show Differences?

This is where it gets interesting. The disconnect between formal neuropsychological testing and real-world behavioral measures tells us something important about what we're measuring—and what we're missing.

The Ceiling Effect Problem

The children in CHAT had baseline neuropsychological scores near the population mean (around 100). They weren't cognitively impaired to begin with, so there was little deficit for surgery to correct. A child who already scores at the average is unlikely to gain 20 points.

The Ecological Validity Gap

Formal neuropsychological testing happens in a quiet room with a psychometrist, one-on-one attention, and structured tasks. Real life happens in noisy classrooms, at 3 PM when kids are tired, with multiple competing demands and limited external structure.

Parent and teacher rating scales capture this real-world executive functioning in a way that formal testing doesn't. When a teacher says a child is restless, impulsive, and struggling to pay attention in class, that matters—even if the child can perform well on a structured attention task in a testing room.

The Follow-Up Duration Issue

Seven months may not be long enough to detect subtle cognitive changes that accumulate over time. Some studies have shown cognitive decline over longer follow-up periods after adenotonsillectomy, suggesting ongoing monitoring is important. The developing brain is particularly vulnerable to sleep disruption, and effects may be cumulative.

What About the Pulmonary Safety Argument?

Here's where I'll agree with the watchful waiting advocates: From a purely respiratory standpoint, observation can be safe in selected cases.

The CHAT data:

79% of surgical patients normalized their sleep study
46% of watchful waiting patients also normalized their sleep study

That 46% spontaneous improvement rate is real and meaningful. It suggests that for some kids with mild to moderate sleep apnea, the condition resolves on its own as they grow.

But "pulmonary safe" doesn't mean "consequence-free." While polysomnography normalized in nearly half of observed children, behavior, quality of life, and daily functioning in the watchful waiting group remained worse than in the surgery group throughout that observation period.
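To make the two normalization rates concrete, here is a short Python sketch that turns them into an absolute difference and a rough "number needed to treat." It uses only the 79% and 46% figures quoted above. The function name `absolute_benefit` is my own, and the calculation ignores confidence intervals and subgroup differences, so treat it as back-of-the-envelope arithmetic and not as a result reported by the trial.

```python
def absolute_benefit(rate_treated: float, rate_observed: float) -> tuple[float, float]:
    """Return (absolute difference, approximate number needed to treat).

    Both inputs are proportions between 0 and 1. This is illustrative
    arithmetic only; it does not account for uncertainty in either rate.
    """
    diff = rate_treated - rate_observed
    if diff <= 0:
        raise ValueError("treated rate must exceed observed rate")
    return diff, 1.0 / diff


# Normalization rates quoted above: 79% with surgery, 46% with watchful waiting.
diff, nnt = absolute_benefit(0.79, 0.46)
print(f"Absolute difference: {diff * 100:.0f} percentage points")  # 33 percentage points
print(f"Approximate number needed to treat: {nnt:.1f}")            # 3.0
```

Read this way, roughly one additional child in three has a normalized sleep study because of surgery, which is a statement about the sleep study only and not about behavior or quality of life.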

The Severe OSA Story Is Different—and Alarming

For children with severe sleep apnea, the evidence is more concerning. A 2006 study by Halbower and colleagues used magnetic resonance spectroscopy to look at brain chemistry in children with severe OSA. What they found should give us pause:

Decreased neuronal metabolite ratios in the hippocampus and frontal cortex (suggesting neuronal injury)
Significant deficits in IQ compared to matched controls
Impaired executive function

These weren't children with mild snoring. But it raises the question: At what point does OSA cross from "safe to observe" to "causing brain injury"? And can we reliably identify that threshold?

The mechanism likely involves intermittent hypoxia and sleep fragmentation affecting brain development, particularly in regions critical for learning and executive control.

What This Means for Clinical Practice

I don't think the answer is "operate on everyone who snores." But I also don't think it's "watchful waiting is fine because the neuropsych testing was negative."

Watchful Waiting May Be Reasonable When:

OSA is truly mild (apnea-hypopnea index, or AHI, below 5 events per hour) without significant desaturation
The child has no behavioral or school problems
Quality of life isn't impaired
Close follow-up is feasible
The family understands they're accepting ongoing functional impairment during the observation period

Early Intervention Is Favored When:

Moderate to severe OSA
Significant behavioral problems or school difficulties
Quality of life impairment
Obesity (lower spontaneous resolution rates)
Risk factors for progression

The Honest Conversation

When I counsel families, I tell them what the data actually shows: "Your child's sleep study might improve on its own—about half do. But during that waiting period, the research suggests their behavior, sleep quality, and daily functioning will likely be worse than if we treated now. Formal testing might not show a difference, but the real-world impact is measurable."

The 2023 Update: PATS Study

A more recent trial (the Pediatric Adenotonsillectomy Trial for Snoring, published in JAMA in 2023) confirmed the CHAT findings and added new concerns:

Again, no difference in executive function or attention
Again, significantly worse behavioral problems and quality of life with watchful waiting
New finding: Greater decline in blood pressure with surgery
Worrisome finding: 13.2% of watchful waiting children had worsening sleep apnea over 12 months (vs. only 1.3% of those who had surgery)

That last point deserves emphasis. We're not just asking "Will they get better on their own?" We should also ask "Might they get worse?"
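For readers who want to see how far apart those two progression rates are, the Python sketch below computes the absolute risk difference and the risk ratio from the 13.2% and 1.3% figures above. The function name `compare_risks` and the dictionary keys are my own labels, and the numbers are simple point estimates without confidence intervals, so this is an illustration of the arithmetic and not a reanalysis of the trial.

```python
def compare_risks(risk_observed: float, risk_treated: float) -> dict[str, float]:
    """Compare two event rates given as proportions between 0 and 1.

    Returns the absolute risk difference, the risk ratio, and the number of
    extra events expected per 100 children in the observed group.
    """
    if risk_treated <= 0:
        raise ValueError("risk_treated must be positive to compute a ratio")
    risk_difference = risk_observed - risk_treated
    return {
        "risk_difference": risk_difference,
        "risk_ratio": risk_observed / risk_treated,
        "extra_per_100": risk_difference * 100,
    }


# Worsening sleep apnea over 12 months: 13.2% watchful waiting vs. 1.3% surgery.
result = compare_risks(0.132, 0.013)
print(f"Extra children who worsen per 100 observed: {result['extra_per_100']:.1f}")  # 11.9
print(f"Risk ratio: {result['risk_ratio']:.1f}")                                     # 10.2
```

In plain terms, about 12 more children out of every 100 worsened under observation than after surgery, a roughly tenfold difference in relative terms.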

The Bottom Line

The narrative that "surgery doesn't improve cognitive outcomes" is technically true but functionally misleading. It conflates one specific type of measurement (formal neuropsychological testing) with the totality of cognitive and behavioral functioning.

What the data actually shows is more nuanced:

Watchful waiting is relatively safe from a pulmonary standpoint in mild-moderate OSA (46% spontaneous improvement)
Watchful waiting carries ongoing behavioral and quality-of-life costs that are measurable, clinically significant, and consistent across multiple studies
Formal neuropsychological testing may be insensitive to the real-world functional deficits that parents and teachers observe daily
For severe OSA, there's evidence of actual neuronal injury, making watchful waiting inappropriate

A Better Framework

Instead of "to operate or not to operate," maybe we need a more sophisticated framework:

Mild OSA + no functional impairment + reliable follow-up = Watchful waiting is reasonable
Mild OSA + behavioral problems or quality of life impairment = Benefits of surgery likely outweigh risks
Moderate-severe OSA = Surgery is first-line treatment
Severe OSA with significant desaturation = Urgent intervention needed

And critically: Watchful waiting doesn't mean "ignore it." It means active surveillance with regular reassessment of symptoms, behavior, school performance, and quality of life—not just waiting for the next sleep study.
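The four lines of the framework above can be written out as explicit rules. The Python sketch below is only a restatement of those lines for clarity, not a clinical decision tool. The names `Severity` and `framework_category` are hypothetical, the inputs are assumed to have been judged by a clinician already, and the final fallback (mild OSA, no impairment, but no reliable follow-up) is my own placeholder for a case the framework does not address.

```python
from enum import Enum


class Severity(Enum):
    MILD = "mild"
    MODERATE = "moderate"
    SEVERE = "severe"


def framework_category(
    severity: Severity,
    functional_impairment: bool,
    reliable_follow_up: bool,
    significant_desaturation: bool = False,
) -> str:
    """Map a child's findings to one of the framework's four categories."""
    # Most urgent rule first: severe OSA with significant desaturation.
    if severity is Severity.SEVERE and significant_desaturation:
        return "Urgent intervention needed"
    # Any other moderate or severe OSA.
    if severity in (Severity.MODERATE, Severity.SEVERE):
        return "Surgery is first-line treatment"
    # From here on, OSA is mild.
    if functional_impairment:
        return "Benefits of surgery likely outweigh risks"
    if reliable_follow_up:
        return "Watchful waiting is reasonable"
    # Assumption: the framework does not cover this combination.
    return "Not covered by the framework; needs individual judgment"


print(framework_category(Severity.MILD, functional_impairment=False, reliable_follow_up=True))
print(framework_category(Severity.MILD, functional_impairment=True, reliable_follow_up=True))
```

Checking the most urgent rule first keeps a child with severe OSA and desaturation from being sorted into the less urgent "first-line surgery" category.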

The Research We Still Need

The studies we have are good, but they leave important questions unanswered:

What happens over longer follow-up periods? (Most studies follow children for only 6 to 12 months)
Can we identify which children will improve spontaneously vs. deteriorate?
Are there subtler cognitive effects that emerge over years rather than months?
How do we better measure the real-world functional impact of OSA in ways that correlate with what parents and teachers observe?

Until we have those answers, I'll keep having nuanced conversations with families—acknowledging both the safety of observation in select cases and the real behavioral costs during that observation period.

Because "the neuropsych testing was negative" shouldn't be the end of the discussion. It should be the beginning of a more thoughtful one.

References:

Marcus CL, et al. A randomized trial of adenotonsillectomy for childhood sleep apnea. N Engl J Med. 2013;368(25):2366-2376.
Redline S, et al. Adenotonsillectomy for Snoring and Mild Sleep Apnea in Children: A Randomized Clinical Trial. JAMA. 2023;330(21):2084-2095.
Halbower AC, et al. Childhood obstructive sleep apnea associates with neuropsychological deficits and neuronal brain injury. PLoS Med. 2006;3(8):e301.

Dr. Joshua Rotenberg is a board-certified pediatric neurologist and sleep specialist practicing in Houston, Texas. The views expressed here are his own and do not constitute medical advice. Medical opinions based on the literature may change. Individual treatment decisions should be made in consultation with a qualified healthcare professional.

Houston Specialty Clinic

Pediatric Neurology, Pediatric Pulmonology, and Sleep Medicine

Houston, Sugar Land, and Telehealth