compresr.ai machine view / /index.md

title: Compresr: context compression for LLMs
url: https://compresr.ai/
description: Your prompts carry far more tokens than the model actually reads. Compresr drops the rest: up to 90% fewer tokens, so you cut cost and latency.

#Stop paying for context your model doesn’t need.

Your prompts carry far more tokens than the model actually reads. Compresr drops the rest: up to 90% fewer tokens, so you cut cost and latency. At light compression it matches or beats full-context accuracy on public benchmarks.

Human-readable page: https://compresr.ai/

##See it on your own file in 60 seconds

-Works in Claude Code, Cursor, or any agent harness.

##Stop overpaying.

-If you’re paying full price for your tokens, you’re leaving real money on the table.

-Feed us the query and the context. We return only the tokens that actually move the answer. You pay less, the LLM responds faster, and answers get sharper.

-Question-aware: we compress for the task.

-At light ~2× compression, accuracy matches or beats full context.

-SDK or on-prem. Your call.
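To make "question-aware" concrete, here is a deliberately naive sketch of the idea: score each sentence of the context against the query and keep only the best-scoring ones. This is not Compresr's algorithm or API. The function `naive_query_filter`, its `keep_ratio` parameter, and the word-overlap scoring are all illustrative assumptions, and a real compressor would work at token level with a learned model.

```python
import re


def naive_query_filter(query: str, context: str, keep_ratio: float = 0.5) -> str:
    """Toy question-aware filter (hypothetical, NOT the Compresr method).

    Keeps the sentences that share the most distinct words with the query,
    preserving their original order.
    """
    query_words = set(re.findall(r"\w+", query.lower()))
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", context) if s.strip()]
    if not sentences:
        return ""

    # Score each sentence by how many distinct query words it contains.
    scored = [
        (len(query_words & set(re.findall(r"\w+", s.lower()))), i)
        for i, s in enumerate(sentences)
    ]

    # Always keep at least one sentence.
    keep_count = max(1, round(len(sentences) * keep_ratio))
    # Highest score first; ties are broken by original position.
    top = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:keep_count]
    kept_indices = sorted(i for _, i in top)
    return " ".join(sentences[i] for i in kept_indices)
```

Given the query "When is the invoice due?" and a four-sentence context where only the first sentence mentions the due date, calling the function with `keep_ratio=0.25` returns just that sentence. The same query with a different question would keep different sentences, which is the point of compressing for the task.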

##Get started in minutes

-Drop-in addition to your current context management workflow.

-Get your API key. Create one from your console.

-Install the SDK. The official SDK works with Python 3.8+ or Node.js 18+.

-Compress with a query (latte_v2). Question-specific compression: keeps the tokens relevant to the user query. The latte_v2 model is required when passing a query.

##Two ways to deploy

-Pick the one that fits your stack.

-Drop-in SDK. One API key. Install, grab a key, compress any prompt or document before it hits your LLM. Pay per million tokens, no surprise bills.

-$10 in free credits on sign-up, no credit card required

-TypeScript & Python clients

-Question-aware compression

-Transparent per-million-token pricing

-On-prem. Runs inside your VPC, so your data never leaves your network. We deploy Compresr to your infrastructure, tune it for your workload, and support you directly.

-Private deployment in your cloud or data center

-Custom throughput & latency SLAs

-Tailored to your business needs

-Dedicated support
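With per-million-token pricing, a rough savings estimate is simple arithmetic. The sketch below is an assumption-laden back-of-the-envelope calculator, not a Compresr tool: the function name `estimate_savings`, every parameter, and the prices you pass in are hypothetical, and the optional compression fee is modeled as a flat per-million-token charge on the uncompressed input.

```python
def estimate_savings(
    tokens_per_request: int,
    requests: int,
    llm_price_per_million: float,
    reduction: float,
    compression_price_per_million: float = 0.0,
) -> dict:
    """Estimate input-token cost before and after compression.

    reduction is the fraction of tokens removed (0.5 is roughly 2x compression).
    All prices are placeholders supplied by the caller, not published rates.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")

    original_tokens = tokens_per_request * requests
    compressed_tokens = original_tokens * (1.0 - reduction)

    original_cost = original_tokens / 1_000_000 * llm_price_per_million
    compressed_cost = compressed_tokens / 1_000_000 * llm_price_per_million
    # Assumed: the compressor bills on the tokens you send it, before compression.
    compression_fee = original_tokens / 1_000_000 * compression_price_per_million

    return {
        "original_cost": original_cost,
        "compressed_cost": compressed_cost,
        "compression_fee": compression_fee,
        "net_savings": original_cost - compressed_cost - compression_fee,
    }
```

For example, 1,000 requests of 20,000 input tokens each at a placeholder LLM price of 3.00 per million tokens cost 60.00 uncompressed. At a 0.5 reduction the LLM cost drops to 30.00, and any compression fee comes out of the remaining 30.00 in savings.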

##Related machine surfaces

-/docs/quick-start.md: first call

-/contact.md: talk to us

-/llms.txt: machine site map