Web search

Add Tavily, Brave, or Amazon Bedrock AgentCore web search to your agent loop with automatic compression.

WebSearchTool is Compresr's drop-in search tool for agent loops. It returns a real LangChain BaseTool. Pass it to any of the agent client facades and the tool output flows through CompresrToolMiddleware automatically: search results are compressed before they re-enter the model context, with no extra wiring.

Three backends ship out of the box:

Provider	Strengths	Bring your own key
Tavily (default)	LLM-tuned results, strong on recency and citations, `allowed_domains` / `blocked_domains` filtering, `max_results` forwarded verbatim	`TAVILY_API_KEY`
Brave	Generalist web index, independent crawler, native domain filtering uses Goggles (out of scope for v1)	`BRAVE_SEARCH_API_KEY` (fallback: `BRAVE_API_KEY`)
AgentCore	Amazon Bedrock web search over MCP + Cognito OAuth, `max_results` clamped 1..25	Cognito client credentials (see AgentCore below)

Why not Anthropic / OpenAI / Gemini server search?

Provider-native server search tools (Anthropic's web_search_20250305, OpenAI's web_search_preview, Gemini's google_search) execute server-side and return opaque or encrypted content that Compresr cannot read or compress. The model sees the search content but the SDK doesn't; there's nothing to compress because we can't intercept the payload.

WebSearchTool deliberately calls the search API client-side. The plaintext results reach the SDK first, pass through CompresrToolMiddleware, and arrive at the model in compressed form. This is the only path on which Compresr's compression actually fires for web search results.

Quick start: Tavily

python

That's the whole integration. The model decides when to call tavily_search (or brave_search / agentcore_web_search for the other providers); the SDK fires the underlying API, hands the raw response to CompresrToolMiddleware, gets back a compressed version, and threads that back into the model context.

Construction options

All three backends accept a small, deliberately narrow option set. Anything provider-specific (e.g. Tavily search_depth) is forwarded through the extra / **extra escape hatch. max_results is forwarded verbatim to Tavily and Brave (their own upper limits apply); AgentCore clamps it to 1..25.

Tavily

python

TypeScript Tavily env-var fallback

The TypeScript WebSearchTool.tavily({...}) factory does not read TAVILY_API_KEY from the environment; pass apiKey explicitly. The Python factory and the string form of the client.research(...) facade both fall back to the environment variable.

Brave

python

No native domain filtering on Brave

Brave doesn't expose native allowed_domains / blocked_domains — filtering flows through Goggles, which is out of scope for v1. Python emits a UserWarning and proceeds; TypeScript silently ignores the kwargs (they aren't declared on BraveOptions). Route through Tavily if you need real domain filtering.

TypeScript Brave + Claude tool_use

WebSearchTool.brave in TypeScript returns the upstream BraveSearch directly. LangChain.js's BraveSearch ships without an args_schema, so Anthropic (Claude) tool_use calls can fail with missing 1 required positional argument: 'query'. Workaround: wrap it in your own tool() factory that declares a query: z.string() schema, or use Tavily or AgentCore. Python's WebSearchTool.brave already wraps Brave with an explicit schema, so this is a TypeScript-only pitfall.

AgentCore

Amazon Bedrock AgentCore web search over an MCP streamable-HTTP session, authenticated with a Cognito OAuth client-credentials handshake. Runtime deps are optional — install the extra: pip install compresr[agentcore] (Python) or npm install @modelcontextprotocol/sdk (TypeScript).

python

Every config field resolves from the explicit argument first, then a two-step env-var fallback:

Field	Primary env var	Fallback env var
`gateway_url` / `gatewayUrl`	`AGENTCORE_GATEWAY_MCP_URL`	`GATEWAY_MCP_URL`
`cognito_token_url` / `cognitoTokenUrl`	`AGENTCORE_COGNITO_TOKEN_URL`	`COGNITO_TOKEN_URL`
`client_id` / `clientId`	`AGENTCORE_COGNITO_CLIENT_ID`	`COGNITO_CLIENT_ID`
`client_secret` / `clientSecret`	`AGENTCORE_COGNITO_CLIENT_SECRET`	`COGNITO_CLIENT_SECRET`
`scope`	`AGENTCORE_COGNITO_SCOPE`	`COGNITO_SCOPE`
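
The lookup order in the table can be sketched in a few lines of Python. This is an illustration of the precedence only, not the SDK's implementation: `resolve_field` is a hypothetical helper name, and treating an empty string as unset is an assumption. Only the environment variable names come from the table above.

```python
import os
from typing import Mapping, Optional

# (primary, fallback) env-var names, copied from the table above.
ENV_VARS = {
    "gateway_url": ("AGENTCORE_GATEWAY_MCP_URL", "GATEWAY_MCP_URL"),
    "cognito_token_url": ("AGENTCORE_COGNITO_TOKEN_URL", "COGNITO_TOKEN_URL"),
    "client_id": ("AGENTCORE_COGNITO_CLIENT_ID", "COGNITO_CLIENT_ID"),
    "client_secret": ("AGENTCORE_COGNITO_CLIENT_SECRET", "COGNITO_CLIENT_SECRET"),
    "scope": ("AGENTCORE_COGNITO_SCOPE", "COGNITO_SCOPE"),
}


def resolve_field(
    name: str,
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Hypothetical helper: explicit argument, then primary env var, then fallback."""
    env = os.environ if env is None else env
    if explicit:  # assumption: an empty string counts as "not provided"
        return explicit
    primary, fallback = ENV_VARS[name]
    return env.get(primary) or env.get(fallback)
```

Passing `env` explicitly keeps the sketch easy to test; a caller that omits it reads from the real process environment.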

Runtime behaviour:

HTTPS-only URLs (TypeScript). gatewayUrl and cognitoTokenUrl are validated to start with https://; a plaintext URL throws CompresrError('invalid_config') before any credential leaves the process.
max_results clamp. TypeScript enforces 1..25 at build time. Python passes the value through to the underlying client (which applies the same bound).
Bearer-token cache. The Cognito token is minted once and cached on the shared client; a 401 from the gateway triggers exactly one automatic re-mint before the call is retried.
Timeouts and response cap. Cognito requests time out at 30s, tool calls at 30s, and responses are hard-capped at 1 MB.
allowed_domains / blocked_domains. Accepted for signature parity with Tavily/Brave. Python emits a UserWarning and proceeds; TypeScript silently ignores them (fields declared but unread). Use Tavily for real domain filtering.
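
Two of the behaviours above, the max_results clamp and the single re-mint on a 401, can be illustrated with a stand-alone sketch. The names `clamp_max_results` and `CachedTokenCaller` are hypothetical, and the `send` callback returning a `(status, body)` tuple is an assumption made to keep the example free of network code; the SDK's actual client is not shown here.

```python
from typing import Callable, Optional, Tuple


def clamp_max_results(value: int, low: int = 1, high: int = 25) -> int:
    """Clamp max_results into the 1..25 range described above."""
    return max(low, min(high, value))


class CachedTokenCaller:
    """Illustrative only: mint a bearer token once, re-mint exactly once on a 401."""

    def __init__(self, mint_token: Callable[[], str]) -> None:
        self._mint_token = mint_token
        self._token: Optional[str] = None

    def call(self, send: Callable[[str], Tuple[int, str]]) -> Tuple[int, str]:
        # `send` takes a bearer token and returns (status_code, body).
        if self._token is None:
            self._token = self._mint_token()
        status, body = send(self._token)
        if status == 401:
            # The cached token was rejected: refresh once and retry once.
            self._token = self._mint_token()
            status, body = send(self._token)
        return status, body
```

A second 401 after the retry is returned to the caller rather than retried again, which mirrors the "exactly one automatic re-mint" wording.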

Constructor form

The classmethod factories above are the recommended path; they're discoverable in autocomplete and statically typed. The constructor form below is equivalent and useful when the provider is a runtime variable:

python

How compression interacts with web search

CompresrToolMiddleware runs on every tool return inside the agent loop:

The model emits a tool call (e.g. tavily_search({ query: "..." })).
The SDK invokes the provider and normalises the response via _flatten_search_results / flattenSearchResults — the raw JSON is reshaped into blank-line-separated plain-text blocks of title\nurl\ncontent. This is the shape latte_v1 can actually compress; JSON input is a no-op for compression.
The middleware measures the serialized result against compression.min_tokens (default 200). At or below the threshold, the result is forwarded untouched. Above it, the middleware calls client.compress(...) with the result body as context and the user's last message as query.
The compressed body replaces the original in the agent state. The model never sees the uncompressed search results.
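
The flattening and threshold steps above can be illustrated with a small stand-alone sketch. The function names, the `results` / `title` / `url` / `content` keys, and the whitespace-based token estimate are assumptions for illustration; this is not the SDK's `_flatten_search_results` or its token counter.

```python
from typing import Any, Dict, List


def flatten_results(raw: Dict[str, Any]) -> str:
    """Reshape a JSON-like search response into title/url/content text blocks."""
    blocks: List[str] = []
    for hit in raw.get("results", []):
        lines = [str(hit.get(key, "")).strip() for key in ("title", "url", "content")]
        blocks.append("\n".join(line for line in lines if line))
    # Blank line between blocks, matching the shape described above.
    return "\n\n".join(block for block in blocks if block)


def should_compress(text: str, min_tokens: int = 200) -> bool:
    """At or below the threshold the text is forwarded untouched."""
    approx_tokens = len(text.split())  # crude stand-in for a real tokenizer
    return approx_tokens > min_tokens
```

The comparison is strict, so a result of exactly `min_tokens` is passed through uncompressed, in line with the "at or below" wording.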

Two practical implications:

Tune compression.min_tokens to your search backend. Tavily returns long content per hit, so its results easily exceed 200 tokens. Brave returns shorter snippets, so you may want to lower min_tokens (to 100, for example) to capture them.
The query used for compression is the user's last message, not the model's tool-call query string. Compression therefore stays anchored to what the user actually asked, even when the model rewords the search.

Errors & failure modes

Scenario	Behaviour	Mitigation
Missing peer dep (`langchain-tavily` / `langchain-community` / `@modelcontextprotocol/sdk`)	Python raises `ImportError` naming the extra. TypeScript raises `CompresrError` code `missing_peer_dependency`.	Install the extra: `pip install compresr[agents-tavily|agents-brave|agentcore]` or `npm install @langchain/tavily @langchain/community @modelcontextprotocol/sdk`.
Missing API key or config	Python raises `ValueError` naming the arg and env-var fallback. TypeScript raises `CompresrError` with code `missing_api_key` (Brave), `missing_config` (AgentCore), or `invalid_config` (AgentCore non-HTTPS URL), plus `invalid_provider` for the constructor form.	Pass the arg or set the env var.
Search-provider 401 / 403 / rate limit at call time	Python: the provider's exception propagates inside the agent loop. TypeScript: raised as `CompresrError` — for AgentCore specifically, codes `agentcore_auth_error`, `agentcore_no_tool`, `agentcore_bad_response`, `agentcore_tool_error`.	Rotate the key or back off / lower `max_results`.
Compresr backend down while the middleware fires	Returns the original (uncompressed) search result by default (`on_error="passthrough"` on the policy).	Set `compression={"on_error": "raise"}` if you want to fail loudly instead.
No tools fired (model answered without searching)	No compression call. The middleware only runs on tool returns.	Encourage the model to search via the user prompt or a `system` instruction.
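
The difference between the two `on_error` policies in the "Compresr backend down" row can be sketched as follows. `compress_tool_output` is a hypothetical wrapper written for this illustration, and the `compress` callback stands in for whatever call the middleware makes; only the policy names `"passthrough"` and `"raise"` come from the table.

```python
from typing import Callable


def compress_tool_output(
    text: str,
    compress: Callable[[str], str],
    on_error: str = "passthrough",
) -> str:
    """Illustrative wrapper: fall back to the original text if compression fails."""
    if on_error not in ("passthrough", "raise"):
        raise ValueError(f"unknown on_error policy: {on_error!r}")
    try:
        return compress(text)
    except Exception:
        if on_error == "raise":
            raise
        # Default policy: hand the uncompressed result to the model.
        return text
```

With the default policy an outage costs context tokens rather than the whole agent run; `"raise"` trades that resilience for visibility.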

Research facade

For research-style pipelines that combine web search, snippet compression, and citation extraction in one call, use client.research.run(question, search="tavily" | "brave" | <tool>, ...) instead of wiring WebSearchTool into an agent loop by hand. The string form of search= currently supports tavily and brave only; pass a pre-built WebSearchTool.agentcore(...) if you want AgentCore under the facade. See the SDK reference pages for the full signature.

Next steps

Agent client: the full facade surface (Anthropic shape, OpenAI shape, native).
LangChain integration: lower-level CompresrToolMiddleware, wrap_tool_with_compression, and CompresrExtractor for custom retrieval pipelines.
LangGraph integration: drop-in compression node for StateGraph-style agents.