Context rot

Context rot is the degradation in a language model’s answer quality as its context window fills with long, noisy, or irrelevant content, causing it to lose track of the information that actually matters.

Even models with very large context windows do not use every token equally well. As context grows, relevant evidence gets buried among distractors, attention spreads thin, and the model can miss, ignore, or confuse the spans that hold the answer. In practice, stuffing more text into the window does not reliably improve results and can make them worse.

A related phenomenon is the "lost in the middle" effect, where content placed in the center of a long context is recalled less reliably than content at the edges. Together, these effects mean that longer is not always better; what matters is the signal density inside the window.

Context compression is a direct countermeasure. By removing redundant and off-topic tokens before the model reads them, compression raises the signal-to-noise ratio of the context, which can keep answer quality stable or improve it even though fewer tokens are sent.
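
To make the idea concrete, here is a minimal sketch of query-aware filtering, assuming the context arrives as a list of passages. It drops exact duplicates and passages that share no content words with the query. The function names, the stop-word list, and the word-overlap heuristic are all hypothetical illustrations, not Compresr's actual method; a production compressor would likely use a learned relevance model rather than lexical overlap.

```python
import re

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "is", "and", "on", "for"}


def content_words(text):
    """Return the set of lowercase word tokens in text, minus stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def compress_context(query, passages, min_overlap=1):
    """Keep passages sharing at least min_overlap content words with the query.

    Exact duplicates (ignoring case and surrounding whitespace) are dropped,
    and the original passage order is preserved.
    """
    query_terms = content_words(query)
    seen = set()
    kept = []
    for passage in passages:
        key = passage.strip().lower()
        if key in seen:
            continue  # redundant: an identical passage was already considered
        seen.add(key)
        if len(query_terms & content_words(passage)) >= min_overlap:
            kept.append(passage)  # on-topic: shares vocabulary with the query
    return kept


query = "How long is the refund window?"
passages = [
    "The refund window is 30 days from delivery.",
    "Our office dog is named Biscuit.",
    "The refund window is 30 days from delivery.",
    "Shipping is free on orders over $50.",
]
compressed = compress_context(query, passages)
print(compressed)
```

In this toy input, only the first passage survives: the duplicate and the two off-topic passages are removed, so the model would read one passage instead of four while still seeing the span that holds the answer.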

This is why Compresr’s light, query-specific compression can match or beat full-context accuracy: trimming distractors mitigates context rot rather than merely saving tokens.

Related terms