Compress Context. Cut Cost. Improve Performance.

🎯 Boost LLM pipelines with context compression

Reduce context to what matters, at two levels:

  • Coarse-grained. Pass a query + a list of chunks -> get relevant ones.
  • Fine-grained. Query + your context -> token-level compression.
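As a minimal sketch of what those two levels mean, the toy functions below use word overlap as a stand-in relevance signal; the function names and the scoring logic are illustrative assumptions, not the actual API (which would use a model, not word matching):

```python
def _norm(text):
    """Lowercase a string and strip trailing punctuation, returning a word set."""
    return {w.strip(".,?%!").lower() for w in text.split()}

def coarse_filter(query, chunks, keep=5):
    """Coarse-grained: rank chunks by relevance to the query, keep the top-k.

    Toy scoring: shared-word count with the query (a real system would
    score chunks with a model instead).
    """
    scored = sorted(chunks, key=lambda c: len(_norm(query) & _norm(c)), reverse=True)
    return scored[:keep]

def fine_compress(query, context):
    """Fine-grained: drop tokens (here, whole words) unrelated to the query.

    Toy policy: keep only words that also appear in the query.
    """
    qwords = _norm(query)
    return " ".join(w for w in context.split() if w.strip(".,?%!").lower() in qwords)
```

For example, `coarse_filter` applied to a question about Q3 revenue keeps the revenue chunk and discards an unrelated one, and `fine_compress` then shrinks the surviving text further at the word level.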

🤖 Make your agents context-efficient

Use any agent with our gateway for cost, latency, and accuracy benefits.

  • Compress conversation history, tool outputs, lists of tools.
  • Works with: Claude Code, OpenClaw, Codex, and more!
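To make "compress conversation history and tool outputs" concrete, here is a minimal sketch of one policy a gateway could apply in transit; the truncation rule and the `compress_history` name are assumptions for illustration, not the product's documented behavior:

```python
def compress_history(messages, tool_limit=200):
    """Cap oversized tool outputs in a chat history, leaving other turns intact.

    `messages` is a list of {"role": ..., "content": ...} dicts, oldest first.
    Tool outputs longer than `tool_limit` characters are truncated with a marker
    before the history is forwarded to the model.
    """
    out = []
    for msg in messages:
        content = msg["content"]
        if msg["role"] == "tool" and len(content) > tool_limit:
            content = content[:tool_limit] + " …[truncated by gateway]"
        out.append({**msg, "content": content})
    return out
```

Because the agent only sees the compressed history, this kind of policy can be applied at the gateway without changing the agent's own code.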

Up to 200x compression without quality loss


An SEC filing analysis

                 Baseline (GPT-5.2)   latte_v1 API + GPT-5.2
  Compression    —                    10x
  Context        ~106K tokens         ~10.5K tokens
  Accuracy       72.3%                74.5%
  Savings        —                    76% cheaper

FinanceBench · 141 questions over 79 SEC filings · Full filings up to 230K tokens long

Ready to compress?

Start compressing today. Cut token costs and build smarter AI applications.
