API reference

POST /pricing/estimate-cost

Estimate the processing cost of compressing a given input-token volume at a target ratio.

POST /api/pricing/estimate-cost (Public)

Estimate processing cost for a given input-token volume and target compression ratio.

Public, unauthenticated endpoint used by the pricing calculator. Given a compression model, a number of input tokens, and a target ratio, it returns the estimated processing cost and how many tokens would survive compression.

Request body

compression_model (string, optional)

Default: service default

Compression model to estimate against (e.g. "latte_v2").

input_tokens (integer, required)

Number of input tokens to compress. Must be greater than 0.

compression_ratio (number, optional)

Default: service default

Target compression ratio, strictly between 0 and 1 (both endpoints excluded). For example, 0.3 keeps roughly 30% of the input tokens.
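
As a minimal sketch of how a client might assemble this request body, the helper below mirrors the documented constraints and omits optional fields so the service defaults apply. The host in BASE_URL, the helper name build_estimate_request, and the JSON content type are assumptions made for illustration; they are not confirmed by this reference.

```python
import json
import urllib.request
from typing import Optional

# Hypothetical host for illustration; substitute the real one for your deployment.
BASE_URL = "https://example.com"


def build_estimate_request(
    input_tokens: int,
    compression_ratio: Optional[float] = None,
    compression_model: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/pricing/estimate-cost."""
    # Mirror the documented constraints so obvious mistakes fail before the call.
    if input_tokens <= 0:
        raise ValueError("input_tokens must be greater than 0")
    if compression_ratio is not None and not 0 < compression_ratio < 1:
        raise ValueError("compression_ratio must be strictly between 0 and 1")

    # Optional fields are left out so the service applies its own defaults.
    payload = {"input_tokens": input_tokens}
    if compression_ratio is not None:
        payload["compression_ratio"] = compression_ratio
    if compression_model is not None:
        payload["compression_model"] = compression_model

    return urllib.request.Request(
        BASE_URL + "/api/pricing/estimate-cost",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the endpoint accepts a JSON-encoded body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it. Since the endpoint is described as public and unauthenticated, the sketch sets no authorization header.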

Response

success (boolean)
data (object)
- input_tokens (integer)
  Echoed input volume.
- compressed_tokens (integer)
  Estimated token count after compression.
- tokens_saved (integer)
  input_tokens − compressed_tokens.
- compression_ratio (number)
  Echoed target ratio (0 to 1).
- processing_cost_usd (number)
  Estimated cost in USD for the compression step alone.
- model (string)
  Model used for the estimate.
- input_price_per_1m (number)
  Input price per 1M tokens at the time of the estimate.
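
The response fields above could be consumed along the following lines. This is a sketch under stated assumptions: the CostEstimate class and parse_estimate helper are hypothetical names, the sample values are invented for illustration and are not real service output, and the only invariant checked is the documented one (tokens_saved equals input_tokens minus compressed_tokens).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostEstimate:
    """Typed view of the documented `data` object (hypothetical helper class)."""
    input_tokens: int
    compressed_tokens: int
    tokens_saved: int
    compression_ratio: float
    processing_cost_usd: float
    model: str
    input_price_per_1m: float


def parse_estimate(body: dict) -> CostEstimate:
    """Convert a decoded response body into a CostEstimate."""
    if not body.get("success"):
        raise ValueError("response did not report success")
    # Raises TypeError if `data` has missing or unexpected fields.
    estimate = CostEstimate(**body["data"])
    # Documented relationship: tokens_saved = input_tokens - compressed_tokens.
    if estimate.tokens_saved != estimate.input_tokens - estimate.compressed_tokens:
        raise ValueError("tokens_saved is inconsistent with the token counts")
    return estimate


# Illustrative values only, invented for this sketch.
sample_body = {
    "success": True,
    "data": {
        "input_tokens": 100000,
        "compressed_tokens": 30000,
        "tokens_saved": 70000,
        "compression_ratio": 0.3,
        "processing_cost_usd": 0.05,
        "model": "latte_v2",
        "input_price_per_1m": 0.5,
    },
}

estimate = parse_estimate(sample_body)
print(estimate.tokens_saved)
```

The sketch deliberately does not recompute processing_cost_usd, because this reference does not state how the cost is derived from the price and token counts.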

Status codes

200 OK
Estimate returned.
400 Bad Request
Unknown compression_model.
422 Unprocessable Entity
Field validation failure (e.g. input_tokens <= 0, compression_ratio not strictly between 0 and 1).
500 Internal Server Error
Internal pricing-service failure.
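
One way a client might act on these status codes is sketched below. The meanings come from the list above; the helper name describe_status and the choice to treat only 500 as worth retrying are assumptions of this sketch, not behavior stated by the service.

```python
from typing import Tuple

# Meanings follow the status-code list; the retry flag is this sketch's assumption.
STATUS_MEANINGS = {
    200: ("Estimate returned.", False),
    400: ("Unknown compression_model.", False),
    422: ("Field validation failure.", False),
    500: ("Internal pricing-service failure.", True),
}


def describe_status(status: int) -> Tuple[str, bool]:
    """Return (meaning, worth_retrying) for a status code from this endpoint."""
    return STATUS_MEANINGS.get(status, ("Undocumented status code.", False))


meaning, retry = describe_status(422)
print(meaning, retry)
```

A 400 or 422 points at the request itself (an unknown model or an invalid field), so the sketch flags those as not worth retrying without changing the input.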
