# Compresr

> Compresr is an LLM context-compression API: send long context plus the query you want answered, get back a shorter context that keeps the answer-bearing spans and drops the rest. Public model: latte_v1 (query-specific). Python SDK, TypeScript SDK, or hosted HTTP API.

Every page below links to its Markdown twin (`.md`), the plain-text surface an LLM or answer engine should read. The human-readable pages live at the same path without the `.md` suffix.

## Docs

- [Introduction](https://compresr.ai/docs/introduction.md): What Compresr is and where it fits in an LLM stack.
- [Quick Start](https://compresr.ai/docs/quick-start.md): Make your first compression call in minutes.
- [Authentication](https://compresr.ai/docs/authentication.md): API keys and how to authenticate requests.
- [Python SDK](https://compresr.ai/docs/sdks/python.md): Install and compress with the Python client.
- [TypeScript SDK](https://compresr.ai/docs/sdks/typescript.md): Install and compress with the TypeScript client.
- [cURL](https://compresr.ai/docs/sdks/curl.md): Call the hosted HTTP API directly with cURL.
- [Models](https://compresr.ai/docs/api-reference/models.md): The public model latte_v1 and its parameters.
- [Conventions](https://compresr.ai/docs/api-reference/conventions.md): Request/response shapes and shared conventions.
- [Rate Limits](https://compresr.ai/docs/api-reference/rate-limits.md): Per-tier request and token limits.
- [Errors](https://compresr.ai/docs/api-reference/errors.md): Error codes and how to handle them.

## Product

- [Pricing](https://compresr.ai/pricing.md): $0.10 / 1M tokens; $10 free credits at signup, no card. On-prem is custom volume pricing in your VPC.
- [Security](https://compresr.ai/security.md): Data handling, retention, deletion, and on-prem/VPC deployment.
- [Changelog](https://compresr.ai/changelog.md): Product and API updates over time.
- [Benchmarks](https://compresr.ai/benchmarks.md): Accuracy and compression-ratio results on long-document QA.
- [About](https://compresr.ai/about.md): Compresr Inc., Y Combinator W26, four EPFL-trained founders in San Francisco, California and Europe (Switzerland).

## Compare

- [vs. LLMLingua](https://compresr.ai/compare/llmlingua.md): Hosted, query-aware, on-prem-ready alternative to LLMLingua research code.
- [vs. Prompt Caching](https://compresr.ai/compare/prompt-caching.md): Why compression and caching are complementary, not rivals.
- [Prompt compression tools compared](https://compresr.ai/compare/prompt-compression-tools.md): LLMLingua-2, LongLLMLingua, scaledown, Token Company, and Compresr.

## Glossary

- [Glossary](https://compresr.ai/glossary.md): LLM context-compression vocabulary, defined for humans and machines.

## Benchmarks

- FinanceBench (n=128, 2026-04): light ~2x compression 73%->77% accuracy; QMSum (n=272, 2026-04): 55.9%->59.6%. Past ~2x is a cost/latency play. Single-shot long-document QA, not RAG. See [Benchmarks](https://compresr.ai/benchmarks.md).

## Agents

- [agents.md](https://compresr.ai/agents.md): Agent skill file; curl it to install and use the Compresr SDK end to end.

## Optional

- [Machine overview](https://compresr.ai/machine): The structured entity card an answer engine can cite for "what is Compresr".
- [Full docs corpus](https://compresr.ai/llms-full.txt): Every doc concatenated as Markdown in one fetch.