---
title: Prompt compression
url: https://compresr.ai/glossary/prompt-compression
description: Prompt compression is a form of context compression that reduces the number of tokens in the prompt sent to a language model while keeping the instructions and content the model needs to respond correctly.
---

# Prompt compression

> Prompt compression is a form of context compression that reduces the number of tokens in the prompt sent to a language model while keeping the instructions and content the model needs to respond correctly.

> Human-readable page: https://compresr.ai/glossary/prompt-compression

## Definition
Prompt compression targets the assembled prompt: system instructions, few-shot examples, retrieved passages, and the user message. Methods include token-level pruning (dropping low-information words), abstractive rewriting (restating the same content more concisely), and learned models that score and select spans.
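
To make token-level pruning concrete, here is a minimal sketch that drops words from a fixed list. The list `LOW_INFO_WORDS` and the function `prune_low_information_words` are hypothetical names chosen for illustration, and whitespace-separated words stand in for model tokens; production compressors typically score tokens with a learned model rather than a static list.

```python
# Hypothetical list of low-information words, for illustration only.
LOW_INFO_WORDS = {
    "a", "an", "the", "of", "to", "is", "are", "that",
    "please", "very", "really", "just",
}


def prune_low_information_words(text: str) -> str:
    """Drop words found in LOW_INFO_WORDS and keep the rest in order."""
    kept = [
        word
        for word in text.split()
        # Ignore case and trailing punctuation when matching against the list.
        if word.lower().strip(".,;:!?") not in LOW_INFO_WORDS
    ]
    return " ".join(kept)


def word_count(text: str) -> int:
    """Count whitespace-separated words, a rough proxy for token count."""
    return len(text.split())


prompt = "Please summarize the main findings of the report that is attached."
compressed = prune_low_information_words(prompt)

print(compressed)
print(word_count(prompt), "->", word_count(compressed))
```

This pruning never looks at the task, so it can also remove words that matter for a particular request.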

The term is often used interchangeably with context compression, but it emphasizes the prompt as the unit of work. In practice, a single compressor can handle both: a long retrieved context and the surrounding prompt are compressed together so the final request is shorter end to end.

Quality depends heavily on how compression decides what to keep. Generic, query-blind pruning can remove tokens that turn out to be answer-bearing. Query-specific approaches condition the decision on what is being asked, which protects the spans that matter for the current task.
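
The difference between query-blind and query-specific selection can be sketched with a toy example. The scoring below uses plain word overlap with the query, which is an assumption made for brevity and not a description of any particular product's method; `select_for_query` and the sample passages are hypothetical.

```python
import re


def _terms(text: str) -> set[str]:
    """Lowercase alphanumeric terms in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_for_query(passages: list[str], query: str, keep: int) -> list[str]:
    """Keep the `keep` passages sharing the most terms with the query.

    Ties are broken by original position, and the survivors are
    returned in their original order.
    """
    query_terms = _terms(query)
    scored = [(len(_terms(p) & query_terms), i) for i, p in enumerate(passages)]
    top = sorted(scored, key=lambda s: (-s[0], s[1]))[:keep]
    return [passages[i] for i in sorted(i for _, i in top)]


passages = [
    "The company was founded in 2015.",
    "Its headquarters moved twice in the first three years.",
    "The refund window for annual plans is 30 days.",
]
query = "What is the refund window for annual plans?"

# Query-blind baseline: keep the first passage regardless of the question.
query_blind = passages[:1]

# Query-specific: keep the passage that overlaps most with the question.
kept = select_for_query(passages, query, keep=1)

print(query_blind)
print(kept)
```

With a budget of one passage, the query-blind baseline keeps the founding date and loses the answer, while the query-conditioned selection keeps the refund sentence.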

Compresr performs query-specific prompt and context compression: the query guides which tokens survive, so the shortened prompt still carries the evidence the model needs to answer.

## Related machine surfaces
- [Context compression (/glossary/context-compression.md)](/glossary/context-compression.md)
- [Query-specific compression (/glossary/query-specific-compression.md)](/glossary/query-specific-compression.md)
- [Token (LLM) (/glossary/token.md)](/glossary/token.md)
- [Compression ratio (/glossary/compression-ratio.md)](/glossary/compression-ratio.md)
- [/glossary.md: all terms](/glossary.md)

## Provenance
Compresr Inc. is a Y Combinator W26 company built by four EPFL-trained founders in San Francisco, California, and Europe (Switzerland).
Contact: [compresr.ai/contact](https://compresr.ai/contact).
