Guides
Web search
Add Tavily or Brave web search to your agent loop with automatic compression.
WebSearchTool is Compresr's drop-in search tool for agent loops. It returns a real LangChain BaseTool; pass it to any of the agent client facades and the tool output flows through CompresrToolMiddleware automatically — search results get compressed before they re-enter the model context, with no extra wiring.
Two backends ship out of the box:
| Provider | Strengths | Bring your own key |
|---|---|---|
| Tavily (default) | LLM-tuned results, strong on recency and citations, allowed_domains / blocked_domains filtering, max_results 1–10 | TAVILY_API_KEY |
| Brave | Generalist web index, independent crawler, no domain filtering on the free plan | BRAVE_API_KEY |
Why not Anthropic / OpenAI / Gemini server search?
Provider-native server search tools (Anthropic's web_search_20250305, OpenAI's web_search_preview, Gemini's google_search) execute server-side and return opaque or encrypted content that Compresr cannot read or compress. The model sees the search content but the SDK doesn't — there's nothing to compress because we can't intercept the payload.
WebSearchTool deliberately calls the search API client-side. The plaintext results reach the SDK first, pass through CompresrToolMiddleware, and arrive at the model in compressed form. This is the only path on which Compresr's compression actually fires for web search results.
Quick start — Tavily
That's the whole integration. The model decides when to call tavily_search; the SDK fires Tavily, hands the raw response to CompresrToolMiddleware, gets back a compressed version, and threads that back into the model context.
Construction options
Both backends accept a small, deliberately-narrow option set. Anything provider-specific (e.g. Tavily search_depth) is forwarded through the extra / **extra escape hatch.
Tavily
Brave
No domain filtering on Brave
Brave's HTTP search API doesn't expose allowed_domains / blocked_domains. If you need domain filtering today, use Tavily — passing those args to .brave(...) raises a clear error rather than silently ignoring them.
Constructor form
The classmethod factories above are the recommended path — they're discoverable in autocomplete and statically typed. The constructor form below is equivalent and useful when the provider is a runtime variable:
How compression interacts with web search
CompresrToolMiddleware runs on every tool return inside the agent loop:
- The model emits a tool call (e.g.
tavily_search({ query: "..." })). - The SDK invokes Tavily and gets back JSON with
results: [{ url, title, content, raw_content, score }, ...]. - The middleware checks the serialized string length. If it's at or below
compression.min_tokens(default200), it's forwarded untouched. Above the threshold, the middleware callsclient.compress(...)with the result body ascontextand the user's last message asquery. - The compressed body replaces the original in the agent state. The model never sees the uncompressed search results.
Two practical implications:
- Tune
compression.min_tokensto your search backend. Tavily returns longraw_contentper hit — easy to exceed200tokens. Brave returns shorter snippets — you may want to lowermin_tokensto e.g.100to capture them. - The
queryused for compression is the user's intent, not the model's tool-call query string. This keeps the compression query-aware against what the user actually asked, even when the model rewords the search.
Errors & failure modes
| Scenario | Behaviour | Mitigation |
|---|---|---|
| Search-provider 401 / 403 | The provider's error is raised inside the agent loop and surfaces as CompresrError("Agent execution failed: …") | Rotate the search-provider key. |
| Search-provider rate limit | Same — provider error wrapped by the SDK. | Backoff / smaller max_results. |
| Compresr backend down while the middleware fires | Returns the original (uncompressed) search result by default (on_error="passthrough" on the policy). | Set compression={"on_error": "raise"} if you want to fail loudly instead. |
| No tools fired (model answered without searching) | No compression call. The middleware only runs on tool returns. | Encourage the model to search via the user prompt or a system instruction. |
Next steps
- Agent client — the full facade surface (Anthropic shape, OpenAI shape, native).
- LangChain integration — lower-level
CompresrToolMiddleware,wrap_tool_with_compression, andCompresrExtractorfor custom retrieval pipelines. - LangGraph integration — drop-in compression node for
StateGraph-style agents.