API reference
POST /pricing/estimate-cost
Estimate the processing cost of compressing a given input-token volume at a target ratio.
POST /api/pricing/estimate-cost (Public)
Public, unauthenticated endpoint used by the pricing calculator. Given a compression model, a number of input tokens, and a target ratio, it returns the estimated processing cost and how many tokens would survive compression.
Request body
compression_model (string, optional, default: service default): Compression model to estimate against (e.g. "latte_v1").
input_tokens (integer, required): Number of input tokens to compress. Must be greater than 0.
compression_ratio (number, optional, default: service default): Target compression ratio between 0 and 1 (exclusive). 0.3 means keep ~30% of tokens.
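The constraints above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the service: the helper name build_estimate_payload is hypothetical, and it assumes that optional fields are simply omitted when the caller wants the service default.

```python
import json
from typing import Optional


def build_estimate_payload(
    input_tokens: int,
    compression_ratio: Optional[float] = None,
    compression_model: Optional[str] = None,
) -> dict:
    """Build a request body, applying the documented constraints client-side.

    Hypothetical helper: the name and the "omit to get the default"
    behaviour are assumptions, not something the API reference states.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(input_tokens, bool) or not isinstance(input_tokens, int):
        raise TypeError("input_tokens must be an integer")
    if input_tokens <= 0:
        raise ValueError("input_tokens must be greater than 0")

    payload = {"input_tokens": input_tokens}

    if compression_ratio is not None:
        # The documented range is exclusive at both ends.
        if not 0 < compression_ratio < 1:
            raise ValueError("compression_ratio must be strictly between 0 and 1")
        payload["compression_ratio"] = compression_ratio

    if compression_model is not None:
        payload["compression_model"] = compression_model

    return payload


if __name__ == "__main__":
    print(json.dumps(build_estimate_payload(1_000_000, 0.3, "latte_v1")))
```

Validating locally avoids a round trip that would otherwise end in a 422, but the service remains the authority on what it accepts.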
Response
success (boolean)
data (object)
  input_tokens (integer): Echoed input volume.
  compressed_tokens (integer): Estimated token count after compression.
  tokens_saved (integer): input_tokens − compressed_tokens.
  compression_ratio (number): Echoed target ratio (0 to 1).
  processing_cost_usd (number): Estimated cost in USD for the compression step alone.
  model (string): Model used for the estimate.
  input_price_per_1m (number): Input price per 1M tokens at the time of the estimate.
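The relationships between these fields can be sanity-checked on the client. The sketch below is illustrative only: the function name check_estimate is hypothetical, the sample values are made up rather than real service output, and the tolerance on the ratio is an assumption because the reference does not say how compressed_tokens is rounded.

```python
def check_estimate(data: dict, ratio_tolerance: float = 0.01) -> list:
    """Return a list of inconsistencies found in an estimate's data object."""
    problems = []

    # Documented identity: tokens_saved = input_tokens - compressed_tokens.
    expected_saved = data["input_tokens"] - data["compressed_tokens"]
    if data["tokens_saved"] != expected_saved:
        problems.append(
            f"tokens_saved is {data['tokens_saved']}, expected {expected_saved}"
        )

    # Assumption: the kept fraction should sit close to the echoed ratio.
    # The rounding rule is not documented, hence the tolerance.
    observed_ratio = data["compressed_tokens"] / data["input_tokens"]
    if abs(observed_ratio - data["compression_ratio"]) > ratio_tolerance:
        problems.append(
            f"observed ratio {observed_ratio:.4f} differs from "
            f"echoed ratio {data['compression_ratio']}"
        )

    return problems


# Illustrative values only, not real service output.
sample = {
    "input_tokens": 1_000_000,
    "compressed_tokens": 300_000,
    "tokens_saved": 700_000,
    "compression_ratio": 0.3,
}

if __name__ == "__main__":
    print(check_estimate(sample))
```

An empty list means the fields agree with each other; it says nothing about whether the cost figures are correct.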
Status codes
200 OK: Estimate returned.
400 Bad Request: Unknown compression_model.
422 Unprocessable Entity: Field validation failure (e.g. input_tokens <= 0, ratio outside 0 to 1).
500 Internal Server Error: Internal pricing-service failure.
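A client might call the endpoint and translate these status codes as sketched below, using only the Python standard library. Several things here are assumptions rather than documented facts: the host in BASE_URL is a placeholder, the request body is assumed to be JSON sent with a Content-Type of application/json, and the names estimate_cost and describe_status are hypothetical.

```python
import json
import urllib.error
import urllib.request

# Assumption: placeholder host. Only the path comes from this reference.
BASE_URL = "https://api.example.com"


def describe_status(status: int) -> str:
    """Map a documented status code to a short explanation."""
    meanings = {
        200: "estimate returned",
        400: "unknown compression_model",
        422: "field validation failure",
        500: "internal pricing-service failure",
    }
    return meanings.get(status, "undocumented status")


def estimate_cost(payload: dict, base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """POST the payload and return the parsed JSON body.

    Raises RuntimeError when the service answers with an HTTP error status.
    """
    request = urllib.request.Request(
        base_url + "/api/pricing/estimate-cost",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        raise RuntimeError(f"HTTP {err.code}: {describe_status(err.code)}") from err
```

A call would look like estimate_cost({"input_tokens": 1000000, "compression_ratio": 0.3}). A 400 or 422 points at the request itself and is not worth retrying unchanged, whereas a 500 may be.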