# Quick start

> Human-readable page: https://compresr.ai/docs/quick-start

By the end of this page you will have made your first compressed request and seen the token savings.

1. **Install the SDK**
   Install the client for your language. The cURL path needs no install,
   since cURL ships with most operating systems.

2. **Get an API key**
   Create a key in the [dashboard](/dashboard/api-keys), copy
   its `cmp_...` value, and export it as
   `COMPRESR_API_KEY`. See
   [Authentication](/docs/authentication) for the full key
   model and security guidance.

3. **Send a compression request**
   Pass the long `context` you would otherwise send to your
   LLM, plus the `query` you want it to answer. Always set
   `compression_model_name: "latte_v1"`.

   **Python**

   ```python
   import os
   from compresr import CompressionClient

   client = CompressionClient(api_key=os.environ["COMPRESR_API_KEY"])

   result = client.compress(
       context=(
           "The James Webb Space Telescope (JWST) is a space telescope designed "
           "primarily to conduct infrared astronomy. As the most powerful telescope "
           "ever launched, its 6.5-metre primary mirror is composed of 18 gold-coated "
           "hexagonal beryllium segments.\n\n"
           "It orbits the Sun near the Sun-Earth L2 Lagrange point, about "
           "1.5 million kilometres from Earth.\n\n"
           "A tennis-court-sized sunshield keeps the instruments below 50 K so the "
           "infrared sensors can detect faint heat signatures from distant galaxies.\n\n"
           "The observatory launched on December 25, 2021 aboard an Ariane 5 rocket "
           "from the Guiana Space Centre.\n\n"
           "It is operated jointly by NASA, ESA, and the Canadian Space Agency, with primary scientific operations conducted from the Space Telescope Science Institute in Baltimore."
       ),
       query="What is the diameter of JWST's primary mirror?",
       compression_model_name="latte_v1",
       target_compression_ratio=4,
   )

   print(result.data.compressed_context)
   print(
       f"{result.data.original_tokens} -> {result.data.compressed_tokens} tokens "
       f"({result.data.original_tokens / result.data.compressed_tokens:.2f}x)"
   )
   ```

   **TypeScript**

   ```typescript
   import { CompressionClient } from '@compresr/sdk';

   const client = new CompressionClient({
     apiKey: process.env.COMPRESR_API_KEY!,
   });

   const result = await client.compress({
     context:
       "The James Webb Space Telescope (JWST) is a space telescope designed " +
       "primarily to conduct infrared astronomy. As the most powerful telescope " +
       "ever launched, its 6.5-metre primary mirror is composed of 18 gold-coated " +
       "hexagonal beryllium segments.\n\n" +
       "It orbits the Sun near the Sun-Earth L2 Lagrange point, about " +
       "1.5 million kilometres from Earth.\n\n" +
       "A tennis-court-sized sunshield keeps the instruments below 50 K so the " +
       "infrared sensors can detect faint heat signatures from distant galaxies.\n\n" +
       "The observatory launched on December 25, 2021 aboard an Ariane 5 rocket " +
       "from the Guiana Space Centre.\n\n" +
       "It is operated jointly by NASA, ESA, and the Canadian Space Agency, with primary scientific operations conducted from the Space Telescope Science Institute in Baltimore.",
     query: "What is the diameter of JWST's primary mirror?",
     compressionModelName: 'latte_v1',
     targetCompressionRatio: 4,
   });

   const { original_tokens, compressed_tokens, compressed_context } = result.data;
   console.log(compressed_context);
   console.log(
     `${original_tokens} -> ${compressed_tokens} tokens ` +
       `(${(original_tokens / compressed_tokens).toFixed(2)}x)`,
   );
   ```

   **cURL**

   ```bash
   curl -X POST https://api.compresr.ai/api/compress/question-specific/ \
     -H "X-API-Key: $COMPRESR_API_KEY" \
     -H "Content-Type: application/json" \
     -d '{
       "context": "The James Webb Space Telescope (JWST) is a space telescope designed primarily to conduct infrared astronomy. As the most powerful telescope ever launched, its 6.5-metre primary mirror is composed of 18 gold-coated hexagonal beryllium segments.\n\nIt orbits the Sun near the Sun-Earth L2 Lagrange point, about 1.5 million kilometres from Earth.\n\nA tennis-court-sized sunshield keeps the instruments below 50 K so the infrared sensors can detect faint heat signatures from distant galaxies.\n\nThe observatory launched on December 25, 2021 aboard an Ariane 5 rocket from the Guiana Space Centre.\n\nIt is operated jointly by NASA, ESA, and the Canadian Space Agency, with primary scientific operations conducted from the Space Telescope Science Institute in Baltimore.",
       "query": "What is the diameter of JWST'"'"'s primary mirror?",
       "compression_model_name": "latte_v1",
       "target_compression_ratio": 4
     }'
   ```

4. **Inspect the result**
   The response contains the compressed text plus token-savings stats.
   This is the actual response from the live API for the call above
   (`duration_ms` varies from run to run because it depends on server load):

   **Python**

   ```text
   The James Webb Space Telescope (JWST) is a space telescope designed primarily to conduct infrared astronomy. As the most powerful telescope ever launched, its 6.5-metre primary mirror is composed of 18 gold-coated hexagonal beryllium segments.[106 tokens dropped]
   160 -> 58 tokens (2.76x)
   ```

   **TypeScript**

   ```text
   The James Webb Space Telescope (JWST) is a space telescope designed primarily to conduct infrared astronomy. As the most powerful telescope ever launched, its 6.5-metre primary mirror is composed of 18 gold-coated hexagonal beryllium segments.[106 tokens dropped]
   160 -> 58 tokens (2.76x)
   ```

   **cURL**

   ```json
   {
     "success": true,
     "message": null,
     "data": {
       "original_tokens": 160,
       "compressed_tokens": 58,
       "target_compression_ratio": 4.0,
       "actual_compression_ratio": 0.6375,
       "tokens_saved": 102,
       "duration_ms": 68,
       "compressed_context": "The James Webb Space Telescope (JWST) is a space telescope designed primarily to conduct infrared astronomy. As the most powerful telescope ever launched, its 6.5-metre primary mirror is composed of 18 gold-coated hexagonal beryllium segments.[106 tokens dropped]"
     }
   }
   ```

   - `compressed_context`: the shortened text. Forward this to your LLM
     exactly as you would the original input. `[N tokens dropped]` markers
     show where spans were cut; pass `disable_placeholders=true` if you want
     a clean concatenation without them.

   - `actual_compression_ratio`: fraction of input tokens _removed_ (here
     `0.6375`, about 64% removed). It is **not** an Nx factor.

   - `target_compression_ratio`: the value you asked for, echoed back.
     `0–1` = removal strength; `>1` = Nx factor (max `200`).

   - `tokens_saved`: `original_tokens` − `compressed_tokens`.

   - `duration_ms`: server-side compression time. Network round-trip is on
     top of this.
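
The two ratio conventions are easy to mix up, so here is a minimal sketch in plain Python (no SDK calls) that reproduces the stats above and converts between the fraction-removed form and the Nx form. The helper names `fraction_removed`, `nx_factor`, and `describe_target` are hypothetical, and treating a target of exactly `1` as a removal strength is an assumption, since this page does not say which convention applies there.

```python
def fraction_removed(original_tokens: int, compressed_tokens: int) -> float:
    """Fraction of input tokens removed (the actual_compression_ratio convention)."""
    return (original_tokens - compressed_tokens) / original_tokens


def nx_factor(original_tokens: int, compressed_tokens: int) -> float:
    """Nx factor: how many times smaller the compressed text is."""
    return original_tokens / compressed_tokens


def describe_target(target: float) -> str:
    """Interpret target_compression_ratio: 0-1 is removal strength, >1 is an Nx factor."""
    if target < 0 or target > 200:
        raise ValueError("target must be between 0 and 200")
    if target > 1:
        return f"{target:g}x factor"
    # Assumption: a value of exactly 1 falls on the removal-strength side.
    return f"removal strength {target:g}"


original, compressed = 160, 58  # values from the response above
removed = fraction_removed(original, compressed)
factor = nx_factor(original, compressed)

print(f"tokens_saved     = {original - compressed}")
print(f"fraction removed = {removed:.4f}")
print(f"Nx factor        = {factor:.2f}x")
# The two conventions are related by N = 1 / (1 - fraction_removed).
print(f"cross-check      = {1 / (1 - removed):.2f}x")
print(describe_target(4))
print(describe_target(0.5))
```

The cross-check line shows why `0.6375` and `2.76x` describe the same result: removing about 64% of the tokens leaves about 36%, and 1 / 0.3625 is roughly 2.76.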

## What just happened?

`latte_v1` scored every paragraph in your context against the `query`, kept the ones that answer it, and dropped the rest. The paragraph with the mirror-diameter sentence stayed; the L2 orbit, sunshield, launch, and operator paragraphs did not. You sent ~64% fewer input tokens to your downstream model without losing the answer.
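
For rough intuition about query-aware paragraph selection, the sketch below scores each paragraph by word overlap with the query and keeps the best one. It is an illustrative stand-in only, not how `latte_v1` works internally (this page does not describe the model); the `words`, `score`, and `select_paragraphs` functions and the overlap heuristic are all hypothetical.

```python
import re


def words(text: str) -> set[str]:
    """Lowercased set of alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(paragraph: str, query: str) -> float:
    """Fraction of query words that also appear in the paragraph."""
    q = words(query)
    return len(q & words(paragraph)) / len(q) if q else 0.0


def select_paragraphs(context: str, query: str, keep: int = 1) -> list[str]:
    """Keep the `keep` highest-scoring paragraphs, preserving document order."""
    paragraphs = [p for p in context.split("\n\n") if p.strip()]
    ranked = sorted(
        range(len(paragraphs)),
        key=lambda i: score(paragraphs[i], query),
        reverse=True,
    )
    return [paragraphs[i] for i in sorted(ranked[:keep])]


# Shortened version of the JWST context used on this page.
context = (
    "Its 6.5-metre primary mirror is composed of 18 gold-coated "
    "hexagonal beryllium segments.\n\n"
    "It orbits the Sun near the Sun-Earth L2 Lagrange point.\n\n"
    "The observatory launched on December 25, 2021 aboard an Ariane 5 rocket."
)
query = "What is the diameter of JWST's primary mirror?"

for paragraph in select_paragraphs(context, query):
    print(paragraph)
```

With this toy heuristic the mirror paragraph wins because it shares "primary" and "mirror" with the query, while the other two paragraphs match only on "the".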

Want a cleaner output without the `[N tokens dropped]` placeholders? Pass `disable_placeholders=True`. For finer-grained, sentence-level cuts inside a paragraph, pass `coarse=False`. See the [models reference](/docs/api-reference/models) for the full parameter list.
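
If you already hold a response that contains placeholders, you can also handle them client-side. The sketch below is a hypothetical helper, not part of the SDK; it assumes every marker matches the exact `[N tokens dropped]` form shown in the response above.

```python
import re

# Assumption: markers always look like "[<digits> tokens dropped]".
PLACEHOLDER = re.compile(r"\[(\d+) tokens dropped\]")


def strip_placeholders(compressed_context: str) -> tuple[str, int]:
    """Return the text without markers and the total token count they report."""
    dropped = sum(int(n) for n in PLACEHOLDER.findall(compressed_context))
    return PLACEHOLDER.sub("", compressed_context), dropped


sample = "composed of 18 gold-coated hexagonal beryllium segments.[106 tokens dropped]"
text, dropped = strip_placeholders(sample)
print(text)
print(dropped)
```

Stripping after the fact does not change what the API billed or returned; it only tidies the string before you forward it.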

## Next steps

- [Python SDK](/docs/sdks/python): full method reference, async variants, streaming, batching.
- [TypeScript SDK](/docs/sdks/typescript): same surface, camelCase params.
- [cURL / HTTP](/docs/sdks/curl): raw REST reference.
- [Models](/docs/api-reference/models): tune `target_compression_ratio` and other latte-only options.
- [Agent client](/docs/sdks/python#6-agent-client): drop-in for `anthropic.Anthropic()` / `openai.OpenAI()` with automatic tool-output compression.
- [Web search](/docs/guides/web-search): add Tavily or Brave to your agent loop in one line.
- [LangChain integration](/docs/framework-integration/langchain): first-party middleware for tool outputs, history, and outbound prompts, plus a `BaseDocumentCompressor` for RAG.
- [LangGraph integration](/docs/framework-integration/langgraph): state-graph node, lossy checkpoint serializer, store wrapper, and multi-agent handoff tool.
- [LlamaIndex integration](/docs/framework-integration/llamaindex): query-engine postprocessor, tool wrapper, and Memory API block.
- [LiteLLM integration](/docs/framework-integration/litellm): drop the `compresr` guardrail into the proxy and compress tool messages across every provider transparently.