API reference
POST /compress/question-specific
Compress context while preserving tokens relevant to a query. The primary Compresr endpoint.
POST /api/compress/question-specific/
Authentication: API key (X-API-Key header)
Supply a long context and a query; the model keeps the parts of the context that matter for that query and drops the rest. This is the headline Compresr endpoint.
Request body
context (string, required)
  The text to compress. Pass null or an empty string to get an empty result back with no billing.

query (string, required)
  The question or topic to preserve relevance for. Cannot be empty.

compression_model_name ("latte_v1", required)
  Compression model name. Currently only latte_v1 is exposed publicly.

target_compression_ratio (number, optional; default: see Models)
  Compression strength. See /docs/api-reference/models for the canonical value semantics.

coarse (boolean, optional; default: true)
  Paragraph-level compression. Faster and cheaper, less precise than token-level.

heuristic_chunking (boolean, optional; default: false)
  Pre-chunk the context with structure-aware heuristics before compression.

disable_placeholders (boolean, optional; default: false)
  Strip placeholder tokens from the compressed output.
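
As a rough illustration of the request body described above, here is a minimal Python sketch that assembles the JSON payload and wraps it in a POST request. The field names, the latte_v1 model name, the endpoint path and the X-API-Key header come from this page; the base URL, the helper names build_payload and build_request, and the local empty-query check are assumptions made for the example.

```python
import json
import urllib.request

# Hypothetical host: substitute the base URL for your own Compresr account.
BASE_URL = "https://api.example.com"
ENDPOINT = "/api/compress/question-specific/"


def build_payload(context, query, target_compression_ratio=None,
                  coarse=None, heuristic_chunking=None,
                  disable_placeholders=None):
    """Assemble the JSON body; optional fields are left out unless set."""
    if not query:
        # The reference states that query cannot be empty, so fail early.
        raise ValueError("query must not be empty")
    payload = {
        "context": context,
        "query": query,
        "compression_model_name": "latte_v1",
    }
    optional = {
        "target_compression_ratio": target_compression_ratio,
        "coarse": coarse,
        "heuristic_chunking": heuristic_chunking,
        "disable_placeholders": disable_placeholders,
    }
    # Omitted fields fall back to the server-side defaults listed above.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


def build_request(payload, api_key):
    """Wrap the payload in a POST request carrying the X-API-Key header."""
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )
```

The request object can then be sent with urllib.request.urlopen(request) or translated to whichever HTTP client you already use. Leaving optional fields out of the body, rather than sending explicit defaults, is a design choice of this sketch.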
Response
success (boolean)
  true on success.

data (object)
  original_context (string)
    The input context, echoed back.
  compressed_context (string)
    The compressed output you forward to your LLM.
  original_tokens (integer)
    Token count of the input context.
  compressed_tokens (integer)
    Token count after compression.
  tokens_saved (integer)
    original_tokens − compressed_tokens.
  target_compression_ratio (number | null)
    The ratio you requested, if any.
  actual_compression_ratio (number)
    The ratio actually achieved (0 to 1).
  duration_ms (integer)
    Server-side processing time in milliseconds.
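
The response fields above could be read into a typed structure along these lines. This is a sketch only: the CompressionResult class and parse_response helper are hypothetical names, and it assumes the token and ratio fields are nested under data, as the layout of this reference suggests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CompressionResult:
    compressed_context: str
    original_tokens: int
    compressed_tokens: int
    tokens_saved: int
    target_compression_ratio: Optional[float]  # null when no ratio was requested
    actual_compression_ratio: float
    duration_ms: int


def parse_response(body: dict) -> CompressionResult:
    """Turn a decoded JSON response into a CompressionResult."""
    if not body.get("success"):
        raise RuntimeError("response did not report success")
    data = body["data"]  # assumption: all result fields live under "data"
    result = CompressionResult(
        compressed_context=data["compressed_context"],
        original_tokens=data["original_tokens"],
        compressed_tokens=data["compressed_tokens"],
        tokens_saved=data["tokens_saved"],
        target_compression_ratio=data.get("target_compression_ratio"),
        actual_compression_ratio=data["actual_compression_ratio"],
        duration_ms=data["duration_ms"],
    )
    # Documented identity: tokens_saved = original_tokens - compressed_tokens.
    if result.tokens_saved != result.original_tokens - result.compressed_tokens:
        raise ValueError("token counts are inconsistent")
    return result
```

The consistency check on tokens_saved is optional; it is included here because the identity is stated in the field description and makes a cheap sanity test.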
Status codes
200 OK: Compression succeeded.
400 Bad Request: Malformed JSON body.
401 Unauthorized: Missing or invalid X-API-Key.
422 Unprocessable Entity: A field failed validation (e.g. empty query, target_compression_ratio > 200).
429 Too Many Requests: Rate limit hit for your tier.
500 Internal Server Error: Upstream compression service error.
503 Service Unavailable: Upstream compression service error.
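
One way a client might act on the status codes above is sketched below. Treating 429, 500 and 503 as retryable, and the exponential backoff schedule, are assumptions of this example rather than documented Compresr behavior; classify_status and backoff_delays are hypothetical helper names.

```python
# Assumption: rate limits and upstream errors are transient and worth retrying.
RETRYABLE = {429, 500, 503}

# Errors that need a change to the request before trying again.
CLIENT_ERRORS = {
    400: "malformed JSON body",
    401: "missing or invalid X-API-Key",
    422: "a field failed validation",
}


def classify_status(status: int) -> str:
    """Map an HTTP status code to a coarse client action."""
    if status == 200:
        return "ok"
    if status in RETRYABLE:
        return "retry"
    if status in CLIENT_ERRORS:
        return "fix_request"
    return "unexpected"


def backoff_delays(attempts: int, base: float = 0.5) -> list:
    """Exponential delays in seconds: base, 2*base, 4*base, ..."""
    return [base * (2 ** i) for i in range(attempts)]
```

A caller would sleep for each delay in turn while classify_status keeps returning "retry", and surface the CLIENT_ERRORS message immediately for "fix_request".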