---
title: Compresr: LLM context-compression API
url: https://compresr.ai/
description: Compresr is an LLM context-compression API: send long context plus the query you want answered, get back a shorter context that keeps the answer-bearing spans and drops the rest.
---

# Compresr is an LLM context-compression API that shortens long context to the spans that answer your query.

> Compresr is an LLM context-compression API: send long context plus the query you want answered, get back a shorter context that keeps the answer-bearing spans and drops the rest.

> Human-readable page: https://compresr.ai/

## What it is
- Public model: `latte_v1` (query-specific; `query` is required).
- Interfaces: Python SDK, TypeScript SDK, or hosted HTTP API (api.compresr.ai).
- Input: long context + the query you want answered. Output: a shorter context that keeps the answer-bearing spans.
- Pricing: $0.10 / 1M tokens (public model `latte_v1`). $10 in free credits at signup, no card required. On-prem: custom volume pricing, runs in your VPC.
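
The pricing line above reduces to simple arithmetic. The sketch below is a rough estimate, not an official calculator: it assumes billing is linear in the number of tokens processed (the source does not say whether input or output tokens are counted), and the function names are hypothetical.

```python
from decimal import Decimal

# Figures taken from the pricing bullet above (public model latte_v1).
PRICE_PER_MILLION_TOKENS_USD = Decimal("0.10")
FREE_CREDITS_USD = Decimal("10")


def compression_cost_usd(tokens: int) -> Decimal:
    """Estimated cost of compressing `tokens` tokens.

    Assumption: billing is linear in billed tokens, with no minimum charge.
    """
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return PRICE_PER_MILLION_TOKENS_USD * tokens / 1_000_000


def tokens_covered_by_credits(credits_usd: Decimal = FREE_CREDITS_USD) -> int:
    """Number of tokens a given credit balance pays for at the public rate."""
    return int(credits_usd / PRICE_PER_MILLION_TOKENS_USD * 1_000_000)


if __name__ == "__main__":
    print(compression_cost_usd(200_000))   # cost of one 200k-token context
    print(tokens_covered_by_credits())     # tokens covered by signup credits
```

Under these assumptions, a 200,000-token context costs $0.02 to compress, and the $10 signup credit covers 100 million tokens.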

## What it is not
- It is not prompt caching, KV-cache compression, a long-context model, a vector database, or a reranker.
- It composes with all of them: it sits one layer before the model call and shrinks what you send.

## Headline benchmarks
On public long-document QA, light (~2x) compression matches or beats full-context accuracy: FinanceBench 73%→77%, QMSum 55.9%→59.6%. Full tables and methodology: [/benchmarks.md](/benchmarks.md).

## Accuracy vs. ratio: two separate claims
- The accuracy win lives at LIGHT (~2x) compression: FinanceBench 73%→77%, QMSum 55.9%→59.6% (single-shot long-document QA, not RAG, 2026-04).
- High ratios (~10x / ~90% reduction) are a COST + LATENCY claim only. Past ~2x, accuracy falls ~2pp per doubling of the ratio and can drop below the full-context baseline (e.g. FinanceBench at ~8.9x scores 65%, below its 73% full-context baseline).
- These are never combined into a single "90% reduction at full accuracy" claim.
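
The bullets above quote compression both as a ratio (~2x, ~10x) and as a percentage reduction (~90%). The conversion is sketched below, assuming "ratio" means original tokens divided by compressed tokens, which is consistent with the source pairing ~10x with ~90% reduction. The helper names are hypothetical.

```python
def reduction_from_ratio(ratio: float) -> float:
    """Fraction of tokens removed at a given compression ratio.

    Assumption: ratio = original_tokens / compressed_tokens, so ratio >= 1.
    """
    if ratio < 1:
        raise ValueError("ratio must be >= 1")
    return 1 - 1 / ratio


def ratio_from_reduction(reduction: float) -> float:
    """Compression ratio implied by removing `reduction` (0 <= r < 1) of the tokens."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return 1 / (1 - reduction)


def compressed_tokens(original_tokens: int, ratio: float) -> int:
    """Approximate size of the compressed context, rounded to whole tokens."""
    if ratio < 1:
        raise ValueError("ratio must be >= 1")
    return round(original_tokens / ratio)


if __name__ == "__main__":
    for r in (2, 8.9, 10):
        print(f"{r}x -> {reduction_from_ratio(r):.1%} reduction")
```

Light (~2x) compression removes about half the tokens, while ~10x removes about 90%. The two regimes sit far apart on the same scale, which is why the accuracy claim and the cost claim are stated separately.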

## Related machine surfaces
- [/benchmarks.md: accuracy & ratio results](/benchmarks.md)
- [/pricing.md: pricing](/pricing.md)
- [/docs/quick-start.md: first call](/docs/quick-start.md)
- [/machine: entity overview](/machine)
- [/llms.txt: machine site map](/llms.txt)

## Provenance
Compresr Inc. is a Y Combinator W26 company built by four EPFL-trained founders, with teams in San Francisco, California, and Europe (Switzerland).
Contact: [compresr.ai/contact](https://compresr.ai/contact).
