Skip to content
Compresr docs

Guides

Web search

Add Tavily or Brave web search to your agent loop with automatic compression.

WebSearchTool is Compresr's drop-in search tool for agent loops. It returns a real LangChain BaseTool; pass it to any of the agent client facades and the tool output flows through CompresrToolMiddleware automatically — search results get compressed before they re-enter the model context, with no extra wiring.

Two backends ship out of the box:

ProviderStrengthsBring your own key
Tavily (default)LLM-tuned results, strong on recency and citations, allowed_domains / blocked_domains filtering, max_results 1–10TAVILY_API_KEY
BraveGeneralist web index, independent crawler, no domain filtering on the free planBRAVE_API_KEY

Provider-native server search tools (Anthropic's web_search_20250305, OpenAI's web_search_preview, Gemini's google_search) execute server-side and return opaque or encrypted content that Compresr cannot read or compress. The model sees the search content but the SDK doesn't — there's nothing to compress because we can't intercept the payload.

WebSearchTool deliberately calls the search API client-side. The plaintext results reach the SDK first, pass through CompresrToolMiddleware, and arrive at the model in compressed form. This is the only path on which Compresr's compression actually fires for web search results.

Quick start — Tavily

python

That's the whole integration. The model decides when to call tavily_search; the SDK fires Tavily, hands the raw response to CompresrToolMiddleware, gets back a compressed version, and threads that back into the model context.

Construction options

Both backends accept a small, deliberately-narrow option set. Anything provider-specific (e.g. Tavily search_depth) is forwarded through the extra / **extra escape hatch.

Tavily

python

Brave

python

No domain filtering on Brave

Brave's HTTP search API doesn't expose allowed_domains / blocked_domains. If you need domain filtering today, use Tavily — passing those args to .brave(...) raises a clear error rather than silently ignoring them.

Constructor form

The classmethod factories above are the recommended path — they're discoverable in autocomplete and statically typed. The constructor form below is equivalent and useful when the provider is a runtime variable:

python

CompresrToolMiddleware runs on every tool return inside the agent loop:

  1. The model emits a tool call (e.g. tavily_search({ query: "..." })).
  2. The SDK invokes Tavily and gets back JSON with results: [{ url, title, content, raw_content, score }, ...].
  3. The middleware checks the serialized string length. If it's at or below compression.min_tokens (default 200), it's forwarded untouched. Above the threshold, the middleware calls client.compress(...) with the result body as context and the user's last message as query.
  4. The compressed body replaces the original in the agent state. The model never sees the uncompressed search results.

Two practical implications:

  • Tune compression.min_tokens to your search backend. Tavily returns long raw_content per hit — easy to exceed 200 tokens. Brave returns shorter snippets — you may want to lower min_tokens to e.g. 100 to capture them.
  • The query used for compression is the user's intent, not the model's tool-call query string. This keeps the compression query-aware against what the user actually asked, even when the model rewords the search.

Errors & failure modes

ScenarioBehaviourMitigation
Search-provider 401 / 403The provider's error is raised inside the agent loop and surfaces as CompresrError("Agent execution failed: …")Rotate the search-provider key.
Search-provider rate limitSame — provider error wrapped by the SDK.Backoff / smaller max_results.
Compresr backend down while the middleware firesReturns the original (uncompressed) search result by default (on_error="passthrough" on the policy).Set compression={"on_error": "raise"} if you want to fail loudly instead.
No tools fired (model answered without searching)No compression call. The middleware only runs on tool returns.Encourage the model to search via the user prompt or a system instruction.

Next steps

  • Agent client — the full facade surface (Anthropic shape, OpenAI shape, native).
  • LangChain integration — lower-level CompresrToolMiddleware, wrap_tool_with_compression, and CompresrExtractor for custom retrieval pipelines.
  • LangGraph integration — drop-in compression node for StateGraph-style agents.