tavily
Web search and content extraction using Tavily Search/Extract/Research APIs (Bearer auth). Use when you need web results (general/news/finance), date/topic/domain filtering, source citations, or want Tavily instead of built-in web_search/Firecrawl. Requires TAVILY_API_KEY.
Packaged view
This page reorganizes the original catalog entry to put fit, installability, and workflow context first. The original raw source follows below.
Install command
npx @skill-hub/cli install openclaw-skills-tavily-web
Repository
Skill path: skills/doahc/tavily-web
Best for
Primary workflow: Research & Ops.
Technical facets: Full Stack, Tech Writer, Security.
Target audience: everyone.
License: Unknown.
Original source
Catalog source: SkillHub Club.
Repository owner: openclaw.
This is a mirrored public skill entry; review the repository before installing it into production workflows.
What it helps with
- Install tavily into Claude Code, Codex CLI, Gemini CLI, or OpenCode workflows
- Review https://github.com/openclaw/skills before adding tavily to shared team environments
- Use tavily for web search, extraction, and research steps in development workflows
Works across
Favorites: 0.
Sub-skills: 0.
Aggregator: No.
Original source / Raw SKILL.md
---
name: tavily
description: Web search and content extraction using Tavily Search/Extract/Research APIs (Bearer auth). Use when you need web results (general/news/finance), date/topic/domain filtering, source citations, or want Tavily instead of built-in web_search/Firecrawl. Requires TAVILY_API_KEY.
version: 1.0.0
compatibility: Requires env var TAVILY_API_KEY
requires_env: [TAVILY_API_KEY]
primary_credential: TAVILY_API_KEY
outbound_hosts: ["api.tavily.com"]
metadata:
hermes:
tags: [Web, Search, Tavily, Research, API, Citations]
requires_env: [TAVILY_API_KEY]
outbound_hosts: ["api.tavily.com"]
openclaw:
requires:
env: ["TAVILY_API_KEY"]
---
# Tavily (Web Search / Extract / Research)
## Prereqs
- Ensure `TAVILY_API_KEY` is set in the Hermes environment (commonly `~/.hermes/.env`).
- Do not hardcode or paste API keys into chat logs. See `references/bp-api-key-management.md`.
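Reading the key from a dotenv-style file can be sketched as follows (a sketch only; the plain `KEY=value` format is an assumption about `~/.hermes/.env`, and real deployments may use a proper dotenv loader or secrets manager):

```python
def load_env(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    env: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env

# Example .env content; never commit or paste real keys.
env = load_env('# credentials\nTAVILY_API_KEY="tvly-example"\n')
print(sorted(env))
```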
## Security Notes
- The bundled CLI (`scripts/tavily.py`) reads only `TAVILY_API_KEY` from the environment and only sends requests to `https://api.tavily.com`.
- Prefer `search` then `extract` over `include_raw_content` on `search` to keep outputs small and reduce accidental data exposure.
## Quick Reference
Use the terminal tool to run the bundled CLI script (prints JSON).
`SKILL_DIR` is the directory containing this `SKILL.md` file.
```bash
# Search (general)
python3 SKILL_DIR/scripts/tavily.py search --query "latest OpenAI API changes" --max-results 5
# Search (news) with recency filter
python3 SKILL_DIR/scripts/tavily.py search --query "latest OpenAI API changes" --topic news --time-range week --max-results 5
# High-precision search (more cost/latency)
python3 SKILL_DIR/scripts/tavily.py search --query "OpenAI API rate limits March 2026" --search-depth advanced --chunks-per-source 3 --max-results 5
# Search + answer (still cite URLs from results)
python3 SKILL_DIR/scripts/tavily.py search --query "What is X?" --include-answer basic --max-results 5
# Extract (targeted chunks; prefer this over include_raw_content on search)
python3 SKILL_DIR/scripts/tavily.py extract --url "https://example.com" --query "pricing" --chunks-per-source 3 --format markdown
# Research (creates task + polls until complete)
python3 SKILL_DIR/scripts/tavily.py research --input "Summarize the EU AI Act enforcement timeline. Provide numbered citations." --model auto --citation-format numbered --max-wait-seconds 180
```
Use returned `results[].url` fields as citations/sources in your final answer.
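The CLI's JSON output can be post-processed into a citation list. A minimal sketch (the `results`, `url`, `title`, and `score` fields come from the Search API response; the helper name and the score threshold are illustrative assumptions):

```python
import json

def citations(search_json: str, min_score: float = 0.5) -> list[str]:
    """Return 'title - url' lines for results above a relevance threshold."""
    data = json.loads(search_json)
    return [
        f"{r['title']} - {r['url']}"
        for r in data.get("results", [])
        if r.get("score", 0.0) >= min_score
    ]

# Shape mirrors a /search response; the values are made up.
sample = json.dumps({"results": [
    {"title": "Britannica", "url": "https://www.britannica.com/facts/Lionel-Messi", "score": 0.81},
    {"title": "Low-signal page", "url": "https://example.com/x", "score": 0.12},
]})
print(citations(sample))
```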
## No-Script Option (curl)
Use Tavily directly via curl (same endpoints, no bundled script):
```bash
curl -s "https://api.tavily.com/search" \
-H "Authorization: Bearer $TAVILY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"query":"latest OpenAI API changes","topic":"news","time_range":"week","max_results":5}'
curl -s "https://api.tavily.com/extract" \
-H "Authorization: Bearer $TAVILY_API_KEY" \
-H "Content-Type: application/json" \
-d '{"urls":"https://example.com","query":"pricing","chunks_per_source":3,"extract_depth":"basic","format":"markdown"}'
```
## Procedure
1. Turn the user request into a focused search query (keep it short, ideally under ~400 chars). Split multi-part questions into 2-4 sub-queries.
2. Choose `topic`:
- `general` for most searches
- `news` for current events (prefer also setting `time_range` or date range)
- `finance` for market/finance content
3. Choose `search_depth`:
- Start with `basic` (1 credit) unless you need higher precision.
- Use `advanced` (2 credits) for high-precision queries; use `chunks_per_source` to control snippet volume.
4. Keep `max_results` small (default 5) and filter by `score` + domain trust.
5. For primary text, run `extract` on 1-3 top URLs:
- Pass `--query ... --chunks-per-source N` to `extract` to avoid dumping full pages into context.
6. For synthesis across multiple subtopics with citations, run `research` and poll until `status=completed`.
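The parameter choices in steps 1-4 can be sketched as a small payload builder (illustrative only; the heuristics for picking `topic` and `search_depth` are this sketch's assumptions, while the field names match the `/search` schema):

```python
def build_search_payload(query: str, *, current_events: bool = False,
                         high_precision: bool = False, max_results: int = 5) -> dict:
    """Translate the procedure's choices into a /search request body."""
    payload = {
        "query": query[:400],  # step 1: keep the query short
        "topic": "news" if current_events else "general",  # step 2
        # step 3: basic costs 1 credit; advanced costs 2
        "search_depth": "advanced" if high_precision else "basic",
        "max_results": max_results,  # step 4: keep result counts small
    }
    if current_events:
        payload["time_range"] = "week"  # prefer a recency filter with news
    if high_precision:
        payload["chunks_per_source"] = 3  # advanced-only snippet control
    return payload

print(build_search_payload("latest OpenAI API changes", current_events=True))
```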
## Pitfalls
- `include_raw_content` on `search` can explode output size; prefer the two-step flow: `search` then `extract`.
- `auto_parameters` can silently pick `search_depth=advanced` (2 credits). Set `--search-depth` explicitly when you care about cost.
- `exact_match` is restrictive; wrap the phrase in quotes inside `--query` and expect fewer results.
- `country` boosting is only available for `topic=general`.
- On failures, keep the `request_id` from responses for support/debugging.
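For instance, the `exact_match` pitfall: the quoted phrase belongs inside the query string itself, and setting `search_depth` explicitly guards against `auto_parameters` upgrading it (field names follow the search schema; the query is a made-up example):

```python
import json

payload = {
    # exact_match only honors phrases wrapped in quotes *inside* the query text
    "query": '"John Smith" CEO Acme Corp',
    "exact_match": True,
    # set explicitly so auto_parameters cannot silently pick advanced (2 credits)
    "search_depth": "basic",
    "max_results": 5,
}
print(json.dumps(payload, indent=2))
```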
## Verification
- Check credits/limits: `python3 SKILL_DIR/scripts/tavily.py usage`
- Add `--include-usage` on `search`/`extract` if you want per-request usage info.
## References
- `references/search.md`
- `references/extract.md`
- `references/research.md`
- `references/research-get.md`
- `references/bp-search.md`
- `references/bp-extract.md`
- `references/bp-research.md`
- `references/bp-api-key-management.md`
- `references/usage.md`
---
## Referenced Files
> The following files are referenced in this skill and included for context.
### references/bp-api-key-management.md
```markdown
> ## Documentation Index
> Fetch the complete documentation index at: https://docs.tavily.com/llms.txt
> Use this file to discover all available pages before exploring further.
# API Key Management
> Learn how to handle API key leaks and best practices for key rotation.
## What to do if your API key leaks
If you suspect or know that your API key has been leaked (e.g., committed to a public repository, shared in a screenshot, or exposed in client-side code), **immediate action is required** to protect your account and quota.
Follow these steps immediately:
1. **Log in to your account**: Go to the [Tavily Dashboard](https://app.tavily.com).
2. **Revoke the leaked key**: Navigate to the API Keys section. Identify the compromised key and delete or revoke it immediately. This will stop any unauthorized usage.
3. **Generate a new key**: Create a new API key to replace the compromised one.
4. **Update your applications**: Replace the old key with the new one in your environment variables, secrets management systems, and application code.
If you notice any unusual activity or usage spikes associated with the leaked key before you revoked it, please contact [[email protected]](mailto:[email protected]) for assistance.
## Rotating your API keys
As a general security best practice, we recommend rotating your API keys periodically (e.g., every 90 days). This minimizes the impact if a key is ever compromised without your knowledge.
### How to rotate your keys safely
To rotate your keys without downtime:
1. **Generate a new key**: Create a new API key in the [Tavily Dashboard](https://app.tavily.com) while keeping the old one active.
2. **Update your application**: Deploy your application with the new API key.
3. **Verify functionality**: Ensure your application is working correctly with the new key.
4. **Revoke the old key**: Once you have confirmed that the new key is in use and everything is functioning as expected, delete the old API key from the dashboard.
<Note>
Never hardcode API keys in your source code. Always use environment variables or a secure secrets manager to store your credentials.
</Note>
```
### scripts/tavily.py
```python
#!/usr/bin/env python3
"""
Tavily API CLI helper for Hermes skills.
Dependency-free (Python stdlib only) and prints JSON to stdout.
Designed to be called via the Hermes terminal tool from the tavily skill.
"""
from __future__ import annotations
import argparse
import json
from os import environ as _environ
import sys
import time
import urllib.error
import urllib.request
from typing import Any, Dict, List, NoReturn, Optional
TAVILY_API_BASE_URL = "https://api.tavily.com"
def _die(msg: str, *, code: int = 2) -> NoReturn:
print(json.dumps({"success": False, "error": msg}, ensure_ascii=False, indent=2))
raise SystemExit(code)
def _headers() -> Dict[str, str]:
# Only read the single intended credential from the environment.
api_key = _environ.get("TAVILY_API_KEY")
if not api_key:
_die("TAVILY_API_KEY environment variable not set")
return {
"Authorization": f"Bearer {api_key}",
"Accept": "application/json",
"User-Agent": "hermes-skill-tavily/1.0",
}
def _try_parse_json(raw: bytes) -> Dict[str, Any]:
text = raw.decode("utf-8", errors="replace")
try:
parsed = json.loads(text)
except Exception:
return {"raw": text}
if isinstance(parsed, dict):
return parsed
return {"value": parsed}
def _request(method: str, path: str, *, json_body: Optional[dict] = None, http_timeout: float = 60.0) -> Dict[str, Any]:
# Hardcode the API host to prevent exfil via injected base URL overrides.
url = f"{TAVILY_API_BASE_URL}{path}"
headers = _headers()
data = None
if json_body is not None:
headers = dict(headers)
headers["Content-Type"] = "application/json"
data = json.dumps(json_body).encode("utf-8")
req = urllib.request.Request(url, data=data, headers=headers, method=method)
try:
with urllib.request.urlopen(req, timeout=http_timeout) as resp:
final_url = resp.geturl()
if not final_url.startswith(TAVILY_API_BASE_URL + "/"):
_die(f"Unexpected redirect: {final_url}", code=1)
raw = resp.read()
except urllib.error.HTTPError as e:
raw = e.read()
payload = _try_parse_json(raw)
_die(f"HTTP {e.code}: {payload}", code=1)
except urllib.error.URLError as e:
_die(f"Request failed: {e.reason}", code=1)
except Exception as e:
_die(f"Request failed: {type(e).__name__}: {e}", code=1)
return _try_parse_json(raw)
def _print(data: Any) -> None:
print(json.dumps(data, ensure_ascii=False, indent=2))
def _parse_csv(value: Optional[str]) -> Optional[List[str]]:
if not value:
return None
items = [x.strip() for x in value.split(",") if x.strip()]
return items or None
def _parse_bool_or_enum(value: str, *, true_values: set[str], false_values: set[str]) -> Any:
v = value.strip().lower()
if v in false_values:
return False
if v in true_values:
return True
return value
def cmd_search(args: argparse.Namespace) -> None:
payload: Dict[str, Any] = {
"query": args.query,
"max_results": args.max_results,
"search_depth": args.search_depth,
"topic": args.topic,
}
if args.chunks_per_source is not None:
payload["chunks_per_source"] = args.chunks_per_source
if args.time_range:
payload["time_range"] = args.time_range
if args.start_date:
payload["start_date"] = args.start_date
if args.end_date:
payload["end_date"] = args.end_date
include_answer = _parse_bool_or_enum(
args.include_answer,
true_values={"true", "1", "yes"},
false_values={"false", "0", "no"},
)
if include_answer is not False:
payload["include_answer"] = include_answer
include_raw_content = _parse_bool_or_enum(
args.include_raw_content,
true_values={"true", "1", "yes"},
false_values={"false", "0", "no"},
)
if include_raw_content is not False:
payload["include_raw_content"] = include_raw_content
if args.include_image_descriptions:
payload["include_images"] = True
payload["include_image_descriptions"] = True
elif args.include_images:
payload["include_images"] = True
if args.include_favicon:
payload["include_favicon"] = True
if args.country:
payload["country"] = args.country
if args.auto_parameters:
payload["auto_parameters"] = True
if args.exact_match:
payload["exact_match"] = True
if args.include_usage:
payload["include_usage"] = True
if (include_domains := _parse_csv(args.include_domains)) is not None:
payload["include_domains"] = include_domains
if (exclude_domains := _parse_csv(args.exclude_domains)) is not None:
payload["exclude_domains"] = exclude_domains
data = _request("POST", "/search", json_body=payload, http_timeout=args.http_timeout)
_print(data)
def cmd_extract(args: argparse.Namespace) -> None:
urls: List[str] = []
if args.url:
urls.extend(args.url)
if args.urls:
urls.extend([u.strip() for u in args.urls.split(",") if u.strip()])
if not urls:
_die("extract requires --url or --urls")
payload: Dict[str, Any] = {
"urls": urls if len(urls) > 1 else urls[0],
"format": args.format,
"extract_depth": args.extract_depth,
}
if args.query:
payload["query"] = args.query
if args.chunks_per_source is not None:
payload["chunks_per_source"] = args.chunks_per_source
if args.include_images:
payload["include_images"] = True
if args.include_favicon:
payload["include_favicon"] = True
if args.timeout_seconds is not None:
payload["timeout"] = float(args.timeout_seconds)
if args.include_usage:
payload["include_usage"] = True
data = _request("POST", "/extract", json_body=payload, http_timeout=args.http_timeout)
_print(data)
def cmd_research_get(args: argparse.Namespace) -> None:
data = _request("GET", f"/research/{args.request_id}", http_timeout=args.http_timeout)
_print(data)
def cmd_research(args: argparse.Namespace) -> None:
payload: Dict[str, Any] = {
"input": args.input,
"model": args.model,
"citation_format": args.citation_format,
"stream": False,
}
if args.output_schema_json:
try:
payload["output_schema"] = json.loads(args.output_schema_json)
except Exception as e:
_die(f"Failed to parse --output-schema-json: {e}")
queued = _request("POST", "/research", json_body=payload, http_timeout=args.http_timeout)
request_id = queued.get("request_id")
if args.no_wait or not request_id:
_print(queued)
return
deadline = time.monotonic() + float(args.max_wait_seconds)
last: Dict[str, Any] = queued
while time.monotonic() < deadline:
status = _request("GET", f"/research/{request_id}", http_timeout=args.http_timeout)
last = status
if status.get("status") in ("completed", "failed"):
_print(status)
return
time.sleep(float(args.poll_interval_seconds))
last = dict(last)
last["timed_out"] = True
_print(last)
def cmd_usage(args: argparse.Namespace) -> None:
data = _request("GET", "/usage", http_timeout=args.http_timeout)
_print(data)
def build_parser() -> argparse.ArgumentParser:
p = argparse.ArgumentParser(prog="tavily.py", description="Call Tavily APIs (Search/Extract/Research) and print JSON.")
p.add_argument("--http-timeout", type=float, default=60.0, help="HTTP timeout in seconds (default: 60)")
sub = p.add_subparsers(dest="cmd", required=True)
sp = sub.add_parser("search", help="POST /search")
sp.add_argument("--query", required=True, help="Search query")
sp.add_argument("--max-results", type=int, default=5, help="Max results (default: 5)")
sp.add_argument("--search-depth", choices=["basic", "advanced", "fast", "ultra-fast"], default="basic")
sp.add_argument("--chunks-per-source", type=int, choices=[1, 2, 3], default=None)
sp.add_argument("--topic", choices=["general", "news", "finance"], default="general")
sp.add_argument("--time-range", choices=["day", "week", "month", "year", "d", "w", "m", "y"], default=None)
sp.add_argument("--start-date", default=None, help="YYYY-MM-DD")
sp.add_argument("--end-date", default=None, help="YYYY-MM-DD")
sp.add_argument("--include-answer", default="false", help="false|true|basic|advanced (default: false)")
sp.add_argument("--include-raw-content", default="false", help="false|true|markdown|text (default: false)")
sp.add_argument("--include-images", action="store_true", help="Include image results")
sp.add_argument("--include-image-descriptions", action="store_true", help="Include image descriptions (implies --include-images)")
sp.add_argument("--include-favicon", action="store_true")
sp.add_argument("--include-domains", default=None, help="Comma-separated allowlist domains")
sp.add_argument("--exclude-domains", default=None, help="Comma-separated blocklist domains")
sp.add_argument("--country", default=None, help="Boost results for a country (general topic only)")
sp.add_argument("--auto-parameters", action="store_true")
sp.add_argument("--exact-match", action="store_true")
sp.add_argument("--include-usage", action="store_true")
sp.set_defaults(func=cmd_search)
ep = sub.add_parser("extract", help="POST /extract")
ep.add_argument("--url", action="append", help="URL to extract (repeatable)")
ep.add_argument("--urls", default=None, help="Comma-separated URLs to extract")
ep.add_argument("--query", default=None, help="Intent for reranking extracted chunks")
ep.add_argument("--chunks-per-source", type=int, choices=[1, 2, 3, 4, 5], default=None)
ep.add_argument("--extract-depth", choices=["basic", "advanced"], default="basic")
ep.add_argument("--format", choices=["markdown", "text"], default="markdown")
ep.add_argument("--include-images", action="store_true")
ep.add_argument("--include-favicon", action="store_true")
ep.add_argument("--timeout-seconds", type=float, default=None, help="Tavily extract timeout (1-60)")
ep.add_argument("--include-usage", action="store_true")
ep.set_defaults(func=cmd_extract)
rp = sub.add_parser("research", help="POST /research then poll GET /research/{request_id}")
rp.add_argument("--input", required=True, help="Research task/prompt")
rp.add_argument("--model", choices=["mini", "pro", "auto"], default="auto")
rp.add_argument("--citation-format", choices=["numbered", "mla", "apa", "chicago"], default="numbered")
rp.add_argument("--output-schema-json", default=None, help="JSON schema string for structured output (no file reads)")
rp.add_argument("--no-wait", action="store_true", help="Only create task, do not poll")
rp.add_argument("--max-wait-seconds", type=float, default=180.0, help="Max time to poll (default: 180)")
rp.add_argument("--poll-interval-seconds", type=float, default=2.0, help="Poll interval (default: 2)")
rp.set_defaults(func=cmd_research)
rgp = sub.add_parser("research-get", help="GET /research/{request_id}")
rgp.add_argument("--request-id", required=True)
rgp.set_defaults(func=cmd_research_get)
up = sub.add_parser("usage", help="GET /usage")
up.set_defaults(func=cmd_usage)
return p
def main(argv: List[str]) -> int:
parser = build_parser()
args = parser.parse_args(argv)
args.func(args)
return 0
if __name__ == "__main__":
raise SystemExit(main(sys.argv[1:]))
```
### references/search.md
```markdown
> ## Documentation Index
> Fetch the complete documentation index at: https://docs.tavily.com/llms.txt
> Use this file to discover all available pages before exploring further.
# Tavily Search
> Execute a search query using Tavily Search.
## OpenAPI
````yaml POST /search
openapi: 3.0.3
info:
title: Tavily Search and Extract API
description: >-
Our REST API provides seamless access to Tavily Search, a powerful search
engine for LLM agents, and Tavily Extract, an advanced web scraping solution
optimized for LLMs.
version: 1.0.0
servers:
- url: https://api.tavily.com/
security: []
tags:
- name: Search
- name: Extract
- name: Crawl
- name: Map
- name: Research
- name: Usage
paths:
/search:
post:
summary: Search for data based on a query
description: Execute a search query using Tavily Search.
requestBody:
description: Parameters for the Tavily Search request.
required: true
content:
application/json:
schema:
type: object
properties:
query:
type: string
description: The search query to execute with Tavily.
example: who is Leo Messi?
search_depth:
type: string
description: >-
Controls the latency vs. relevance tradeoff and how
`results[].content` is generated:
- `advanced`: Highest relevance with increased latency. Best
for detailed, high-precision queries. Returns multiple
semantically relevant snippets per URL (configurable via
`chunks_per_source`).
- `basic`: A balanced option for relevance and latency.
Ideal for general-purpose searches. Returns one NLP summary
per URL.
- `fast`: Prioritizes lower latency while maintaining good
relevance. Returns multiple semantically relevant snippets
per URL (configurable via `chunks_per_source`).
- `ultra-fast`: Minimizes latency above all else. Best for
time-critical use cases. Returns one NLP summary per URL.
**Cost**:
- `basic`, `fast`, `ultra-fast`: 1 API Credit
- `advanced`: 2 API Credits
See [Search Best
Practices](/documentation/best-practices/best-practices-search#search-depth)
for guidance on choosing the right search depth.
enum:
- advanced
- basic
- fast
- ultra-fast
default: basic
chunks_per_source:
type: integer
description: >-
Chunks are short content snippets (maximum 500 characters
each) pulled directly from the source. Use
`chunks_per_source` to define the maximum number of relevant
chunks returned per source and to control the `content`
length. Chunks will appear in the `content` field as:
`<chunk 1> [...] <chunk 2> [...] <chunk 3>`. Available only
when `search_depth` is `advanced`.
default: 3
minimum: 1
maximum: 3
max_results:
type: integer
example: 1
description: The maximum number of search results to return.
default: 5
minimum: 0
maximum: 20
topic:
type: string
description: >-
The category of the search.`news` is useful for retrieving
real-time updates, particularly about politics, sports, and
major current events covered by mainstream media sources.
`general` is for broader, more general-purpose searches that
may include a wide range of sources.
default: general
enum:
- general
- news
- finance
time_range:
type: string
description: >-
The time range back from the current date to filter results
based on publish date or last updated date. Useful when
looking for sources that have published or updated data.
enum:
- day
- week
- month
- year
- d
- w
- m
- 'y'
default: null
start_date:
type: string
description: >-
Will return all results after the specified start date based
on publish date or last updated date. Required to be written
in the format YYYY-MM-DD
example: '2025-02-09'
default: null
end_date:
type: string
description: >-
Will return all results before the specified end date based
on publish date or last updated date. Required to be written
in the format YYYY-MM-DD
example: '2025-12-29'
default: null
include_answer:
oneOf:
- type: boolean
- type: string
enum:
- basic
- advanced
description: >-
Include an LLM-generated answer to the provided query.
`basic` or `true` returns a quick answer. `advanced` returns
a more detailed answer.
default: false
include_raw_content:
oneOf:
- type: boolean
- type: string
enum:
- markdown
- text
description: >-
Include the cleaned and parsed HTML content of each search
result. `markdown` or `true` returns search result content
in markdown format. `text` returns the plain text from the
results and may increase latency.
default: false
include_images:
type: boolean
description: >-
Also perform an image search and include the results in the
response.
default: false
include_image_descriptions:
type: boolean
description: >-
When `include_images` is `true`, also add a descriptive text
for each image.
default: false
include_favicon:
type: boolean
description: Whether to include the favicon URL for each result.
default: false
include_domains:
type: array
description: >-
A list of domains to specifically include in the search
results. Maximum 300 domains.
items:
type: string
default: []
exclude_domains:
type: array
description: >-
A list of domains to specifically exclude from the search
results. Maximum 150 domains.
items:
type: string
default: []
country:
type: string
description: >-
Boost search results from a specific country. This will
prioritize content from the selected country in the search
results. Available only if topic is `general`.
enum:
- afghanistan
- albania
- algeria
- andorra
- angola
- argentina
- armenia
- australia
- austria
- azerbaijan
- bahamas
- bahrain
- bangladesh
- barbados
- belarus
- belgium
- belize
- benin
- bhutan
- bolivia
- bosnia and herzegovina
- botswana
- brazil
- brunei
- bulgaria
- burkina faso
- burundi
- cambodia
- cameroon
- canada
- cape verde
- central african republic
- chad
- chile
- china
- colombia
- comoros
- congo
- costa rica
- croatia
- cuba
- cyprus
- czech republic
- denmark
- djibouti
- dominican republic
- ecuador
- egypt
- el salvador
- equatorial guinea
- eritrea
- estonia
- ethiopia
- fiji
- finland
- france
- gabon
- gambia
- georgia
- germany
- ghana
- greece
- guatemala
- guinea
- haiti
- honduras
- hungary
- iceland
- india
- indonesia
- iran
- iraq
- ireland
- israel
- italy
- jamaica
- japan
- jordan
- kazakhstan
- kenya
- kuwait
- kyrgyzstan
- latvia
- lebanon
- lesotho
- liberia
- libya
- liechtenstein
- lithuania
- luxembourg
- madagascar
- malawi
- malaysia
- maldives
- mali
- malta
- mauritania
- mauritius
- mexico
- moldova
- monaco
- mongolia
- montenegro
- morocco
- mozambique
- myanmar
- namibia
- nepal
- netherlands
- new zealand
- nicaragua
- niger
- nigeria
- north korea
- north macedonia
- norway
- oman
- pakistan
- panama
- papua new guinea
- paraguay
- peru
- philippines
- poland
- portugal
- qatar
- romania
- russia
- rwanda
- saudi arabia
- senegal
- serbia
- singapore
- slovakia
- slovenia
- somalia
- south africa
- south korea
- south sudan
- spain
- sri lanka
- sudan
- sweden
- switzerland
- syria
- taiwan
- tajikistan
- tanzania
- thailand
- togo
- trinidad and tobago
- tunisia
- turkey
- turkmenistan
- uganda
- ukraine
- united arab emirates
- united kingdom
- united states
- uruguay
- uzbekistan
- venezuela
- vietnam
- yemen
- zambia
- zimbabwe
default: null
auto_parameters:
type: boolean
description: >-
When `auto_parameters` is enabled, Tavily automatically
configures search parameters based on your query's content
and intent. You can still set other parameters manually, and
your explicit values will override the automatic ones. The
parameters `include_answer`, `include_raw_content`, and
`max_results` must always be set manually, as they directly
affect response size. Note: `search_depth` may be
automatically set to advanced when it's likely to improve
results. This uses 2 API credits per request. To avoid the
extra cost, you can explicitly set `search_depth` to
`basic`.
default: false
exact_match:
type: boolean
description: >-
Ensure that only search results containing the exact quoted
phrase(s) in the query are returned, bypassing synonyms or
semantic variations. Wrap target phrases in quotes within
your query (e.g. `"John Smith" CEO Acme Corp`). Punctuation
is typically ignored inside quotes.
default: false
include_usage:
type: boolean
description: Whether to include credit usage information in the response.
default: false
required:
- query
responses:
'200':
description: Search results returned successfully
content:
application/json:
schema:
type: object
properties:
query:
type: string
description: The search query that was executed.
example: Who is Leo Messi?
answer:
type: string
description: >-
A short answer to the user's query, generated by an LLM.
Included in the response only if `include_answer` is
requested (i.e., set to `true`, `basic`, or `advanced`)
example: >-
Lionel Messi, born in 1987, is an Argentine footballer
widely regarded as one of the greatest players of his
generation. He spent the majority of his career playing
for FC Barcelona, where he won numerous domestic league
titles and UEFA Champions League titles. Messi is known
for his exceptional dribbling skills, vision, and
goal-scoring ability. He has won multiple FIFA Ballon d'Or
awards, numerous La Liga titles with Barcelona, and holds
the record for most goals scored in a calendar year. In
2014, he led Argentina to the World Cup final, and in
2015, he helped Barcelona capture another treble. Despite
turning 36 in June, Messi remains highly influential in
the sport.
images:
type: array
description: >-
List of query-related images. If
`include_image_descriptions` is true, each item will have
`url` and `description`.
example: []
items:
type: object
properties:
url:
type: string
description:
type: string
results:
type: array
description: A list of sorted search results, ranked by relevancy.
items:
type: object
properties:
title:
type: string
description: The title of the search result.
example: Lionel Messi Facts | Britannica
url:
type: string
description: The URL of the search result.
example: https://www.britannica.com/facts/Lionel-Messi
content:
type: string
description: A short description of the search result.
example: >-
Lionel Messi, an Argentine footballer, is widely
regarded as one of the greatest football players of
his generation. Born in 1987, Messi spent the
majority of his career playing for Barcelona, where
he won numerous domestic league titles and UEFA
Champions League titles. Messi is known for his
exceptional dribbling skills, vision, and goal
score:
type: number
format: float
description: The relevance score of the search result.
example: 0.81025416
raw_content:
type: string
description: >-
The cleaned and parsed HTML content of the search
result. Only if `include_raw_content` is true.
example: null
favicon:
type: string
description: The favicon URL for the result.
example: https://britannica.com/favicon.png
auto_parameters:
type: object
description: >-
A dictionary of the selected auto_parameters, only shown
when `auto_parameters` is true.
example:
topic: general
search_depth: basic
response_time:
type: number
format: float
description: Time in seconds it took to complete the request.
example: '1.67'
usage:
type: object
description: Credit usage details for the request.
example:
credits: 1
request_id:
type: string
description: >-
A unique request identifier you can share with customer
support to help resolve issues with specific requests.
example: 123e4567-e89b-12d3-a456-426614174111
required:
- query
- results
- images
- response_time
- answer
'400':
description: Bad Request - Your request is invalid.
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: >-
<400 Bad Request, (e.g Invalid topic. Must be 'general' or
'news'.)>
'401':
description: Unauthorized - Your API key is wrong or missing.
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: 'Unauthorized: missing or invalid API key.'
'429':
description: Too many requests - Rate limit exceeded
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: >-
Your request has been blocked due to excessive requests.
Please reduce rate of requests.
'432':
description: Key limit or Plan Limit exceeded
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: >-
<432 Custom Forbidden Error (e.g This request exceeds your
plan's set usage limit. Please upgrade your plan or contact
[email protected])>
'433':
description: PayGo limit exceeded
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: >-
This request exceeds the pay-as-you-go limit. You can
increase your limit on the Tavily dashboard.
'500':
description: Internal Server Error - We had a problem with our server.
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: Internal Server Error
security:
- bearerAuth: []
x-codeSamples:
- lang: python
label: Python SDK
source: |-
from tavily import TavilyClient
tavily_client = TavilyClient(api_key="tvly-YOUR_API_KEY")
response = tavily_client.search("Who is Leo Messi?")
print(response)
- lang: javascript
label: JavaScript SDK
source: |-
const { tavily } = require("@tavily/core");
const tvly = tavily({ apiKey: "tvly-YOUR_API_KEY" });
const response = await tvly.search("Who is Leo Messi?");
console.log(response);
components:
securitySchemes:
bearerAuth:
type: http
scheme: bearer
bearerFormat: JWT
description: >-
Bearer authentication header in the form Bearer <token>, where <token>
is your Tavily API key (e.g., Bearer tvly-YOUR_API_KEY).
````
```
### references/extract.md
```markdown
> ## Documentation Index
> Fetch the complete documentation index at: https://docs.tavily.com/llms.txt
> Use this file to discover all available pages before exploring further.
# Tavily Extract
> Extract web page content from one or more specified URLs using Tavily Extract.
## OpenAPI
````yaml POST /extract
openapi: 3.0.3
info:
title: Tavily Search and Extract API
description: >-
Our REST API provides seamless access to Tavily Search, a powerful search
engine for LLM agents, and Tavily Extract, an advanced web scraping solution
optimized for LLMs.
version: 1.0.0
servers:
- url: https://api.tavily.com/
security: []
tags:
- name: Search
- name: Extract
- name: Crawl
- name: Map
- name: Research
- name: Usage
paths:
/extract:
post:
summary: Retrieve raw web content from specified URLs
description: >-
Extract web page content from one or more specified URLs using Tavily
Extract.
requestBody:
description: Parameters for the Tavily Extract request.
required: true
content:
application/json:
schema:
type: object
properties:
urls:
oneOf:
- type: string
description: The URL to extract content from.
example: https://en.wikipedia.org/wiki/Artificial_intelligence
- type: array
items:
type: string
description: A list of URLs to extract content from.
example:
- https://en.wikipedia.org/wiki/Artificial_intelligence
- https://en.wikipedia.org/wiki/Machine_learning
- https://en.wikipedia.org/wiki/Data_science
query:
type: string
description: >-
User intent for reranking extracted content chunks. When
provided, chunks are reranked based on relevance to this
query.
chunks_per_source:
type: integer
description: >-
Chunks are short content snippets (maximum 500 characters
each) pulled directly from the source. Use
`chunks_per_source` to define the maximum number of relevant
chunks returned per source and to control the `raw_content`
length. Chunks will appear in the `raw_content` field as:
`<chunk 1> [...] <chunk 2> [...] <chunk 3>`. Available only
when `query` is provided. Must be between 1 and 5.
minimum: 1
maximum: 5
default: 3
extract_depth:
type: string
description: >-
The depth of the extraction process. `advanced` extraction
retrieves more data, including tables and embedded content,
with higher success but may increase latency. `basic`
extraction costs 1 credit per 5 successful URL extractions,
while `advanced` extraction costs 2 credits per 5 successful
URL extractions.
enum:
- basic
- advanced
default: basic
include_images:
type: boolean
description: >-
Include a list of images extracted from the URLs in the
response. Default is false.
default: false
include_favicon:
type: boolean
description: Whether to include the favicon URL for each result.
default: false
format:
type: string
description: >-
The format of the extracted web page content. `markdown`
returns content in markdown format. `text` returns plain
text and may increase latency.
enum:
- markdown
- text
default: markdown
timeout:
type: number
format: float
description: >-
Maximum time in seconds to wait for the URL extraction
before timing out. Must be between 1.0 and 60.0 seconds. If
not specified, default timeouts are applied based on
extract_depth: 10 seconds for basic extraction and 30
seconds for advanced extraction.
minimum: 1
maximum: 60
default: null
include_usage:
type: boolean
description: >-
Whether to include credit usage information in the response.
`NOTE:` The value may be 0 if the total number of successful URL
extractions has not yet reached 5. See our [Credits &
Pricing
documentation](https://docs.tavily.com/documentation/api-credits)
for details.
default: false
required:
- urls
responses:
'200':
description: Extraction results returned successfully
content:
application/json:
schema:
type: object
properties:
results:
type: array
description: A list of extracted content from the provided URLs.
items:
type: object
properties:
url:
type: string
description: The URL from which the content was extracted.
example: >-
https://en.wikipedia.org/wiki/Artificial_intelligence
raw_content:
type: string
description: >-
The full content extracted from the page. When
`query` is provided, contains the top-ranked chunks
joined by `[...]` separator.
example: >-
"Jump to content\nMain
menu\nSearch\nAppearance\nDonate\nCreate
account\nLog in\nPersonal tools\n Photograph
your local culture, help Wikipedia and win!\nToggle
the table of contents\nArtificial intelligence\n161
languages\nArticle\nTalk\nRead\nView source\nView
history\nTools\nFrom Wikipedia, the free
encyclopedia\n\"AI\" redirects here. For other uses,
see AI (disambiguation) and Artificial intelligence
(disambiguation).\nPart of a series on\nArtificial
intelligence (AI)\nshow\nMajor
goals\nshow\nApproaches\nshow\nApplications\nshow\nPhilosophy\nshow\nHistory\nshow\nGlossary\nvte\nArtificial
intelligence (AI), in its broadest sense, is
intelligence exhibited by machines, particularly
computer systems. It is a field of research in
computer science that develops and studies methods
and software that enable machines to perceive their
environment and use learning and intelligence to
take actions that maximize their chances of
achieving defined goals.[1] Such machines may be
called AIs.\nHigh-profile applications of AI include
advanced web search engines (e.g., Google Search);
recommendation systems (used by YouTube, Amazon, and
Netflix); virtual assistants (e.g., Google
Assistant, Siri, and Alexa); autonomous vehicles
(e.g., Waymo); generative and creative tools (e.g.,
ChatGPT and AI art); and superhuman play and
analysis in strategy games (e.g., chess and
Go)...................
images:
type: array
example: []
description: >-
This is only available if `include_images` is set to
`true`. A list of image URLs extracted from the
page.
items:
type: string
favicon:
type: string
description: The favicon URL for the result.
example: >-
https://en.wikipedia.org/static/favicon/wikipedia.ico
failed_results:
type: array
example: []
description: A list of URLs that could not be processed.
items:
type: object
properties:
url:
type: string
description: The URL that failed to be processed.
error:
type: string
description: >-
An error message describing why the URL couldn't be
processed.
response_time:
type: number
format: float
description: Time in seconds it took to complete the request.
example: 0.02
usage:
type: object
description: Credit usage details for the request.
example:
credits: 1
request_id:
type: string
description: >-
A unique request identifier you can share with customer
support to help resolve issues with specific requests.
example: 123e4567-e89b-12d3-a456-426614174111
'400':
description: Bad Request
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: <400 Bad Request, (e.g Max 20 URLs are allowed.)>
'401':
description: Unauthorized - Your API key is wrong or missing.
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: 'Unauthorized: missing or invalid API key.'
'429':
description: Too many requests - Rate limit exceeded
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: >-
Your request has been blocked due to excessive requests.
Please reduce rate of requests.
'432':
description: Key limit or Plan Limit exceeded
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: >-
<432 Custom Forbidden Error (e.g This request exceeds your
plan's set usage limit. Please upgrade your plan or contact
[email protected])>
'433':
description: PayGo limit exceeded
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: >-
This request exceeds the pay-as-you-go limit. You can
increase your limit on the Tavily dashboard.
'500':
description: Internal Server Error - We had a problem with our server.
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: Internal Server Error
security:
- bearerAuth: []
x-codeSamples:
- lang: python
label: Python SDK
source: |-
from tavily import TavilyClient
tavily_client = TavilyClient(api_key="tvly-YOUR_API_KEY")
response = tavily_client.extract("https://en.wikipedia.org/wiki/Artificial_intelligence")
print(response)
- lang: javascript
label: JavaScript SDK
source: |-
const { tavily } = require("@tavily/core");
const tvly = tavily({ apiKey: "tvly-YOUR_API_KEY" });
const response = await tvly.extract("https://en.wikipedia.org/wiki/Artificial_intelligence");
console.log(response);
components:
securitySchemes:
bearerAuth:
type: http
scheme: bearer
bearerFormat: JWT
description: >-
Bearer authentication header in the form Bearer <token>, where <token>
is your Tavily API key (e.g., Bearer tvly-YOUR_API_KEY).
````
```
### references/research.md
```markdown
> ## Documentation Index
> Fetch the complete documentation index at: https://docs.tavily.com/llms.txt
> Use this file to discover all available pages before exploring further.
# Create Research Task
> Tavily Research performs comprehensive research on a given topic by conducting multiple searches, analyzing sources, and generating a detailed research report.
## OpenAPI
````yaml POST /research
openapi: 3.0.3
info:
title: Tavily Search and Extract API
description: >-
Our REST API provides seamless access to Tavily Search, a powerful search
engine for LLM agents, and Tavily Extract, an advanced web scraping solution
optimized for LLMs.
version: 1.0.0
servers:
- url: https://api.tavily.com/
security: []
tags:
- name: Search
- name: Extract
- name: Crawl
- name: Map
- name: Research
- name: Usage
paths:
/research:
post:
summary: Initiate a research task
description: >-
Tavily Research performs comprehensive research on a given topic by
conducting multiple searches, analyzing sources, and generating a
detailed research report.
requestBody:
description: Parameters for the Tavily Research request.
required: true
content:
application/json:
schema:
type: object
properties:
input:
type: string
description: The research task or question to investigate.
example: What are the latest developments in AI?
model:
type: string
description: >-
The model used by the research agent. "mini" is optimized
for targeted, efficient research and works best for narrow
or well-scoped questions. "pro" provides comprehensive,
multi-angle research and is suited for complex topics that
span multiple subtopics or domains
enum:
- mini
- pro
- auto
default: auto
stream:
type: boolean
description: >-
Whether to stream the research results as they are
generated. When 'true', returns a Server-Sent Events (SSE)
stream. See [Streaming
documentation](/documentation/api-reference/endpoint/research-streaming)
for details.
default: false
output_schema:
type: object
description: >-
A JSON Schema object that defines the structure of the
research output. When provided, the research response will
be structured to match this schema, ensuring a predictable
and validated output shape. Must include a 'properties'
field, and may optionally include 'required' field.
default: null
properties:
properties:
type: object
description: >-
An object containing property definitions. Each key is a
property name, and each value is a property schema.
additionalProperties:
type: object
properties:
type:
type: string
enum:
- object
- string
- integer
- number
- array
description: >-
The type of the property. Must be one of: object,
string, integer, number, or array.
description:
type: string
description: A description of the property.
properties:
type: object
description: >-
Required when type is 'object'. Recursive
definition of object properties.
items:
type: object
description: >-
Required when type is 'array'. Defines the schema
for array items.
required:
- type
- description
required:
type: array
description: >-
An array of property names that are required. At least
one key from the properties object must be included.
items:
type: string
example:
properties:
company:
type: string
description: The name of the company
key_metrics:
type: array
description: List of key performance metrics
items:
type: string
financial_details:
type: object
description: Detailed financial breakdown
properties:
operating_income:
type: number
description: Operating income for the period
required:
- company
citation_format:
type: string
description: The format for citations in the research report.
enum:
- numbered
- mla
- apa
- chicago
default: numbered
required:
- input
responses:
'201':
description: Research task queued successfully (when not streaming)
content:
application/json:
schema:
type: object
properties:
request_id:
type: string
description: A unique identifier for the research task.
example: 123e4567-e89b-12d3-a456-426614174111
created_at:
type: string
description: Timestamp when the research task was created.
example: '2025-01-15T10:30:00Z'
status:
type: string
description: The current status of the research task.
example: pending
input:
type: string
description: The research task or question investigated.
example: What are the latest developments in AI?
model:
type: string
description: The model used by the research agent.
example: mini
response_time:
type: number
format: float
description: Time in seconds it took to complete the request.
example: 1.23
required:
- request_id
- created_at
- status
- input
- model
- response_time
example:
request_id: 123e4567-e89b-12d3-a456-426614174111
created_at: '2025-01-15T10:30:00Z'
status: pending
input: What are the latest developments in AI?
model: mini
response_time: 1.23
'400':
description: Bad Request - Your request is invalid.
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: 'Invalid model. Must be one of: mini, pro, auto'
'401':
description: Unauthorized - Your API key is wrong or missing.
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: 'Unauthorized: missing or invalid API key.'
'429':
description: Too many requests - Rate limit exceeded
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: >-
Your request has been blocked due to excessive requests.
Please reduce rate of requests.
'432':
description: Key limit or Plan Limit exceeded
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: >-
This request exceeds your plan's set usage limit. Please
upgrade your plan or contact [email protected]
'433':
description: PayGo limit exceeded
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: >-
This request exceeds the pay-as-you-go limit. You can
increase your limit on the Tavily dashboard.
'500':
description: Internal Server Error - We had a problem with our server.
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: Error when executing research task
security:
- bearerAuth: []
x-codeSamples:
- lang: python
label: Python SDK
source: |-
from tavily import TavilyClient
tavily_client = TavilyClient(api_key="tvly-YOUR_API_KEY")
response = tavily_client.research("What are the latest developments in AI?")
print(response)
- lang: javascript
label: JavaScript SDK
source: |-
const { tavily } = require("@tavily/core");
const tvly = tavily({ apiKey: "tvly-YOUR_API_KEY" });
const response = await tvly.research("What are the latest developments in AI?");
console.log(response);
components:
securitySchemes:
bearerAuth:
type: http
scheme: bearer
bearerFormat: JWT
description: >-
Bearer authentication header in the form Bearer <token>, where <token>
is your Tavily API key (e.g., Bearer tvly-YOUR_API_KEY).
````
```
### references/research-get.md
```markdown
> ## Documentation Index
> Fetch the complete documentation index at: https://docs.tavily.com/llms.txt
> Use this file to discover all available pages before exploring further.
# Get Research Task Status
> Retrieve the status and results of a research task using its request ID.
## OpenAPI
````yaml GET /research/{request_id}
openapi: 3.0.3
info:
title: Tavily Search and Extract API
description: >-
Our REST API provides seamless access to Tavily Search, a powerful search
engine for LLM agents, and Tavily Extract, an advanced web scraping solution
optimized for LLMs.
version: 1.0.0
servers:
- url: https://api.tavily.com/
security: []
tags:
- name: Search
- name: Extract
- name: Crawl
- name: Map
- name: Research
- name: Usage
paths:
/research/{request_id}:
get:
summary: Get research task status and results
description: Retrieve the status and results of a research task using its request ID.
parameters:
- name: request_id
in: path
required: true
description: The unique identifier of the research task.
schema:
type: string
example: 123e4567-e89b-12d3-a456-426614174111
responses:
'200':
description: Research task is completed or failed.
content:
application/json:
schema:
oneOf:
- $ref: '#/components/schemas/ResearchTaskCompleted'
- $ref: '#/components/schemas/ResearchTaskFailed'
discriminator:
propertyName: status
mapping:
completed:
$ref: '#/components/schemas/ResearchTaskCompleted'
failed:
$ref: '#/components/schemas/ResearchTaskFailed'
'202':
description: Research task is not yet completed (pending or in_progress).
content:
application/json:
schema:
type: object
properties:
request_id:
type: string
description: The unique identifier of the research task.
example: 123e4567-e89b-12d3-a456-426614174111
status:
type: string
description: Current status of the research task.
enum:
- pending
- in_progress
response_time:
type: number
format: float
description: Time in seconds it took to complete the request.
example: 1.23
required:
- request_id
- response_time
- status
example:
request_id: 123e4567-e89b-12d3-a456-426614174111
status: in_progress
response_time: 1.23
'401':
description: Unauthorized - Your API key is wrong or missing.
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: 'Unauthorized: missing or invalid API key.'
'404':
description: Research task not found
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: Research task not found
'500':
description: Internal Server Error - We had a problem with our server.
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: Error getting research status
security:
- bearerAuth: []
x-codeSamples:
- lang: python
label: Python SDK
source: |-
from tavily import TavilyClient
tavily_client = TavilyClient(api_key="tvly-YOUR_API_KEY")
response = tavily_client.get_research("123e4567-e89b-12d3-a456-426614174111")
print(response)
- lang: javascript
label: JavaScript SDK
source: |-
const { tavily } = require("@tavily/core");
const tvly = tavily({ apiKey: "tvly-YOUR_API_KEY" });
const response = await tvly.get_research("123e4567-e89b-12d3-a456-426614174111");
console.log(response);
components:
schemas:
ResearchTaskCompleted:
title: Completed
type: object
properties:
request_id:
type: string
description: The unique identifier of the research task.
example: 123e4567-e89b-12d3-a456-426614174111
created_at:
type: string
description: Timestamp when the research task was created.
example: '2025-01-15T10:30:00Z'
status:
type: string
description: The current status of the research task.
enum:
- completed
content:
oneOf:
- type: string
- type: object
description: >-
The research report content. Can be a string or a structured object
if output_schema was provided.
sources:
type: array
description: List of sources used in the research.
items:
type: object
properties:
title:
type: string
description: Title or name of the source.
example: Latest AI Developments
url:
type: string
format: uri
description: URL of the source.
example: https://example.com/ai-news
favicon:
type: string
format: uri
description: URL to the source's favicon.
example: https://example.com/favicon.ico
response_time:
type: number
format: float
description: Time in seconds it took to complete the request.
example: 1.23
required:
- request_id
- created_at
- status
- content
- sources
- response_time
example:
request_id: 123e4567-e89b-12d3-a456-426614174111
created_at: '2025-01-15T10:30:00Z'
status: completed
content: >-
Research Report: Latest Developments in AI
## Executive Summary
Artificial Intelligence has seen significant advancements in recent
months, with major breakthroughs in large language models, multimodal
AI systems, and real-world applications...
sources:
- title: Latest AI Developments
url: https://example.com/ai-news
favicon: https://example.com/favicon.ico
- title: AI Research Breakthroughs
url: https://example.com/ai-research
favicon: https://example.com/favicon.ico
response_time: 1.23
ResearchTaskFailed:
title: Failed
type: object
properties:
request_id:
type: string
description: The unique identifier of the research task.
example: 123e4567-e89b-12d3-a456-426614174111
status:
type: string
description: The current status of the research task.
enum:
- failed
response_time:
type: number
format: float
description: Time in seconds it took to complete the request.
example: 1.23
required:
- request_id
- status
- response_time
example:
request_id: 123e4567-e89b-12d3-a456-426614174111
status: failed
securitySchemes:
bearerAuth:
type: http
scheme: bearer
bearerFormat: JWT
description: >-
Bearer authentication header in the form Bearer <token>, where <token>
is your Tavily API key (e.g., Bearer tvly-YOUR_API_KEY).
````
```
### references/bp-search.md
```markdown
> ## Documentation Index
> Fetch the complete documentation index at: https://docs.tavily.com/llms.txt
> Use this file to discover all available pages before exploring further.
# Best Practices for Search
> Learn how to optimize your queries, refine search filters, and leverage advanced parameters for better performance.
## Query Optimization
### Keep your query under 400 characters
Keep queries concise and under **400 characters**. Think of each query as what an agent would type into a search engine, not a long-form prompt.
### Break complex queries into sub-queries
For complex or multi-topic queries, send separate focused requests:
```json theme={null}
// Instead of one massive query, break it down:
{ "query": "Competitors of company ABC." }
{ "query": "Financial performance of company ABC." }
{ "query": "Recent developments of company ABC." }
```
## Search Depth
The `search_depth` parameter controls the tradeoff between latency and relevance:
<Expandable title="Latency vs relevance chart">
<img src="https://mintcdn.com/tavilyai/-85Rr9EfVqo8fXvO/images/search-depth.png?fit=max&auto=format&n=-85Rr9EfVqo8fXvO&q=85&s=c57f2074dda171a1e3e9f96afbec8f10" alt="Latency vs Relevance by Search Depth" width="874" height="874" />
*This chart is a heuristic and is not to scale.*
</Expandable>
| Depth | Latency | Relevance | Content Type |
| ------------ | ------- | --------- | ------------ |
| `ultra-fast` | Lowest | Lower | Content |
| `fast` | Low | Good | Chunks |
| `basic` | Medium | High | Content |
| `advanced` | Higher | Highest | Chunks |
### Content types
| Type | Description |
| ----------- | --------------------------------------------------------- |
| **Content** | NLP-based summary of the page, providing general context |
| **Chunks** | Short snippets reranked by relevance to your search query |
Use **chunks** when you need highly targeted information aligned with your query. Use **content** when a general page summary is sufficient.
### Fast + Ultra-Fast
| Depth | When to use |
| ------------ | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `ultra-fast` | When latency is absolutely crucial. Delivers near-instant results, prioritizing speed over relevance. Ideal for real-time applications where response time is critical. |
| `fast` | When latency is more important than relevance, but you want results in reranked chunks format. Good for applications that need quick, targeted snippets. |
| `basic` | A solid balance between relevance and latency. Best for general-purpose searches where you need quality results without the overhead of advanced processing. |
| `advanced` | When you need the highest relevance and are willing to trade off latency. Best for queries seeking specific, detailed information. |
### Using `search_depth=advanced`
Best for queries seeking specific information:
```json theme={null}
{
"query": "How many countries use Monday.com?",
"search_depth": "advanced",
"chunks_per_source": 3,
"include_raw_content": true
}
```
## Filtering Results
### By date
| Parameter | Description |
| ------------------------- | ------------------------------------------------------- |
| `time_range` | Filter by relative time: `day`, `week`, `month`, `year` |
| `start_date` / `end_date` | Filter by specific date range (format: `YYYY-MM-DD`) |
```json theme={null}
{ "query": "latest ML trends", "time_range": "month" }
{ "query": "AI news", "start_date": "2025-01-01", "end_date": "2025-02-01" }
```
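When date windows are computed at runtime rather than hard-coded, a small helper keeps the `YYYY-MM-DD` format consistent. This is a sketch; the helper name is illustrative, not part of the API:
```python theme={null}
from datetime import date, timedelta

def last_n_days_payload(query, n):
    """Build a search payload restricted to the last `n` days.

    `start_date` / `end_date` must use the `YYYY-MM-DD` format shown above;
    `date.isoformat()` produces exactly that.
    """
    end = date.today()
    start = end - timedelta(days=n)
    return {
        "query": query,
        "start_date": start.isoformat(),
        "end_date": end.isoformat(),
    }
```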
### By topic
Use `topic` to filter by content type. Set to `news` for news sources (includes `published_date` metadata):
```json theme={null}
{ "query": "What happened today in NY?", "topic": "news" }
```
### By domain
| Parameter | Description |
| ----------------- | ------------------------------------- |
| `include_domains` | Limit to specific domains |
| `exclude_domains` | Filter out specific domains |
| `country` | Boost results from a specific country |
```json theme={null}
// Restrict to LinkedIn profiles
{ "query": "CEO background at Google", "include_domains": ["linkedin.com/in"] }
// Exclude irrelevant domains
{ "query": "US economy trends", "exclude_domains": ["espn.com", "vogue.com"] }
// Boost results from a country
{ "query": "tech startup funding", "country": "united states" }
// Wildcard: limit to .com, exclude specific site
{ "query": "AI news", "include_domains": ["*.com"], "exclude_domains": ["example.com"] }
```
<Note>Keep domain lists short and relevant for best results.</Note>
## Response Content
### `max_results`
Limits the number of results returned (default: `5`). Setting it too high may surface lower-quality results.
### `include_raw_content`
Returns full extracted page content. For comprehensive extraction, consider a two-step process:
1. Search to retrieve relevant URLs
2. Use [Extract API](/documentation/best-practices/best-practices-extract#2-two-step-process-search-then-extract) to get content
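The two steps above can be composed into one helper. This is an illustrative sketch, not an official SDK method; `client` stands for any object exposing Tavily-style `search` and `extract` calls, such as `TavilyClient` from the `tavily` package:
```python theme={null}
def search_then_extract(client, query, max_urls=3):
    """Step 1: search for candidate URLs. Step 2: extract their full content."""
    search_resp = client.search(query, max_results=max_urls)
    urls = [r["url"] for r in search_resp["results"]]
    if not urls:
        return []
    # Passing the query again reranks extracted chunks toward the original intent
    return client.extract(urls=urls, query=query)["results"]
```
Accepting the client as a parameter also keeps the flow easy to test with a stub before pointing it at the live API.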
### `auto_parameters`
Tavily automatically configures parameters based on query intent. Your explicit values override automatic ones.
```json theme={null}
{
"query": "impact of AI in education policy",
"auto_parameters": true,
"search_depth": "basic" // Override to control cost
}
```
<Note>
`auto_parameters` may set `search_depth` to `advanced` (2 credits). Set it
manually to control cost.
</Note>
## Exact Match
Use `exact_match` only when searching for a specific name or phrase that must appear verbatim in the source content. Wrap the phrase in quotes within your query:
```json theme={null}
{
"query": "\"John Smith\" CEO Acme Corp",
"exact_match": true
}
```
Because this narrows retrieval, it may return fewer results or empty result fields when no exact matches are found. Best suited for:
* **Due diligence** — finding information on a specific person or entity
* **Data enrichment** — retrieving details about a known company or individual
* **Legal/compliance** — locating exact names or phrases in public records
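Quoting the phrase correctly is the easy part to get wrong; a small builder (a hypothetical helper, not an SDK function) makes the escaping explicit:
```python theme={null}
def exact_match_payload(phrase, context=""):
    """Build a search payload where `phrase` must appear verbatim in sources."""
    query = f'"{phrase}" {context}'.strip()
    return {"query": query, "exact_match": True}
```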
## Async & Performance
Use async calls for concurrent requests:
```python theme={null}
import asyncio
from tavily import AsyncTavilyClient
tavily_client = AsyncTavilyClient("tvly-YOUR_API_KEY")
async def fetch_and_gather():
queries = ["latest AI trends", "future of quantum computing"]
responses = await asyncio.gather(
*(tavily_client.search(q) for q in queries),
return_exceptions=True
)
for response in responses:
if isinstance(response, Exception):
print(f"Failed: {response}")
else:
print(response)
asyncio.run(fetch_and_gather())
```
## Post-Processing
### Using metadata
Leverage response metadata to refine results:
| Field | Use case |
| ------------- | ---------------------------------- |
| `score` | Filter/rank by relevance score |
| `title` | Keyword filtering on headlines |
| `content` | Quick relevance check |
| `raw_content` | Deep analysis and regex extraction |
### Score-based filtering
The `score` indicates relevance between query and content. Higher is better, but the ideal threshold depends on your use case.
```python theme={null}
# Filter results with score > 0.7
filtered = [r for r in results if r['score'] > 0.7]
```
### Regex extraction
Extract structured data from `raw_content`:
```python theme={null}
import re
# Extract location
text = "Company: Tavily, Location: New York"
match = re.search(r"Location: (\w+)", text)
location = match.group(1) if match else None # "New York"
# Extract all emails
text = "Contact: [email protected], [email protected]"
emails = re.findall(r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}", text)
```
```
### references/bp-extract.md
```markdown
> ## Documentation Index
> Fetch the complete documentation index at: https://docs.tavily.com/llms.txt
> Use this file to discover all available pages before exploring further.
# Best Practices for Extract
> Learn how to optimize content extraction, choose the right approach, and configure parameters for better performance.
## Extract Parameters
### Query
Use query to rerank extracted content chunks based on relevance:
```python
await tavily_client.extract(
urls=["https://example.com/article"],
query="machine learning applications in healthcare"
)
```
**When to use query:**
* To extract only relevant portions of long documents
* When you need focused content instead of full page extraction
* For targeted information retrieval from specific URLs
> When `query` is provided, chunks are reranked based on relevance to the query.
### Chunks Per Source
Control the amount of content returned per URL to prevent context window explosion:
```python
await tavily_client.extract(
urls=["https://example.com/article"],
query="machine learning applications in healthcare",
chunks_per_source=3
)
```
**Key benefits:**
* Returns only relevant content snippets (max 500 characters each) instead of full page content
* Prevents context window from exploding
* Chunks appear in `raw_content` as: `<chunk 1> [...] <chunk 2> [...] <chunk 3>`
* Must be between 1 and 5 chunks per source
> `chunks_per_source` is only available when `query` is provided.
**Example with multiple URLs:**
```python
await tavily_client.extract(
urls=[
"https://example.com/ml-healthcare",
"https://example.com/ai-diagnostics",
"https://example.com/medical-ai"
],
query="AI diagnostic tools accuracy",
chunks_per_source=2
)
```
This returns the 2 most relevant chunks from each URL, giving you focused, relevant content without overwhelming your context window.
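If you need the individual snippets rather than the joined string, you can split `raw_content` on the literal `[...]` separator described above. A minimal sketch, assuming that delimiter format:

```python
def split_chunks(raw_content):
    # Chunked raw_content joins snippets with a literal "[...]" separator;
    # split it back into a list of individual chunks.
    return [part.strip() for part in raw_content.split("[...]") if part.strip()]
```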
## Extraction Approaches
### Search with include\_raw\_content
Enable include\_raw\_content=true in Search API calls to retrieve both search results and extracted content simultaneously.
```python
response = await tavily_client.search(
query="AI healthcare applications",
include_raw_content=True,
max_results=5
)
```
**When to use:**
* Quick prototyping
* Simple queries where search results are likely relevant
* Single API call convenience
### Direct Extract API
Use the Extract API when you want control over which specific URLs to extract from.
```python
await tavily_client.extract(
urls=["https://example.com/article1", "https://example.com/article2"],
query="machine learning applications",
chunks_per_source=3
)
```
**When to use:**
* You already have specific URLs to extract from
* You want to filter or curate URLs before extraction
* You need targeted extraction with query and chunks\_per\_source
**Key difference:** The main distinction is control: with Extract you choose exactly which URLs to extract from, while Search with `include_raw_content` extracts from all search results.
## Extract Depth
The `extract_depth` parameter controls extraction comprehensiveness:
| Depth | Use case |
| ----------------- | --------------------------------------------- |
| `basic` (default) | Simple text extraction, faster processing |
| `advanced` | Complex pages, tables, structured data, media |
### Using `extract_depth=advanced`
Best for content requiring detailed extraction:
```python
await tavily_client.extract(
    urls=["https://example.com/complex-page"],
    extract_depth="advanced"
)
```
**When to use advanced:**
* Dynamic content or JavaScript-rendered pages
* Tables and structured information
* Embedded media and rich content
* Higher extraction success rates needed
<Note>
`extract_depth=advanced` provides better accuracy but increases latency and
cost. Use `basic` for simple content.
</Note>
## Advanced Filtering Strategies
Beyond query-based filtering, consider these approaches for curating URLs before extraction:
| Strategy | When to use |
| ------------ | ---------------------------------------------- |
| Re-ranking | Use dedicated re-ranking models for precision |
| LLM-based | Let an LLM assess relevance before extraction |
| Clustering | Group similar documents, extract from clusters |
| Domain-based | Filter by trusted domains before extracting |
| Score-based | Filter search results by relevance score |
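As a sketch of the domain-based strategy (the allow-list is illustrative), keep only results whose host matches a trusted domain or one of its subdomains:

```python
from urllib.parse import urlparse

TRUSTED = {"nih.gov", "who.int", "nature.com"}  # illustrative allow-list

def filter_by_domain(results, trusted=TRUSTED):
    kept = []
    for r in results:
        host = urlparse(r["url"]).netloc.lower()
        # Match the registered domain itself or any subdomain of it.
        if any(host == d or host.endswith("." + d) for d in trusted):
            kept.append(r)
    return kept
```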
### Example: Score-based filtering
```python
import asyncio
from tavily import AsyncTavilyClient
tavily_client = AsyncTavilyClient(api_key="tvly-YOUR_API_KEY")
async def filtered_extraction():
# Search first
response = await tavily_client.search(
query="AI healthcare applications",
search_depth="advanced",
max_results=20
)
# Filter by relevance score (>0.5)
relevant_urls = [
result['url'] for result in response.get('results', [])
if result.get('score', 0) > 0.5
]
# Extract from filtered URLs with targeted query
extracted_data = await tavily_client.extract(
urls=relevant_urls,
query="machine learning diagnostic tools",
chunks_per_source=3,
extract_depth="advanced"
)
return extracted_data
asyncio.run(filtered_extraction())
```
## Integration with Search
### Optimal workflow
* **Search** to discover relevant URLs
* **Filter** by relevance score, domain, or content snippet
* **Re-rank** if needed using specialized models
* **Extract** from top-ranked sources with query and chunks\_per\_source
* **Validate** extracted content quality
* **Process** for your RAG or AI application
### Example end-to-end pipeline
```python
async def content_pipeline(topic):
    # 1. Search with sub-queries (generate_subqueries is a hypothetical
    #    helper returning a list of search-kwargs dicts, e.g. {"query": ...})
    queries = generate_subqueries(topic)
responses = await asyncio.gather(
*[tavily_client.search(**q) for q in queries]
)
# 2. Filter and aggregate
urls = []
for response in responses:
urls.extend([
r['url'] for r in response['results']
if r['score'] > 0.5
])
# 3. Deduplicate
urls = list(set(urls))[:20] # Top 20 unique URLs
# 4. Extract with error handling
extracted = await asyncio.gather(
*(tavily_client.extract(url, extract_depth="advanced") for url in urls),
return_exceptions=True
)
# 5. Filter successful extractions
return [e for e in extracted if not isinstance(e, Exception)]
```
## Summary
1. **Use query and chunks\_per\_source** for targeted, focused extraction
2. **Choose Extract API** when you need control over which URLs to extract from
3. **Filter URLs** before extraction using scores, re-ranking, or domain trust
4. **Choose appropriate extract\_depth** based on content complexity
5. **Process URLs concurrently** with async operations for better performance
6. **Implement error handling** to manage failed extractions gracefully
7. **Validate extracted content** before downstream processing
8. **Optimize costs** by extracting only necessary content with chunks\_per\_source
> Start with query and chunks\_per\_source for targeted extraction. Filter URLs strategically, extract with appropriate depth, and handle errors gracefully for production-ready pipelines.
```
### references/bp-research.md
```markdown
> ## Documentation Index
> Fetch the complete documentation index at: https://docs.tavily.com/llms.txt
> Use this file to discover all available pages before exploring further.
# Best Practices for Research
> Learn how to write effective prompts, choose the right model, and configure output formats for better research results.
## Prompting
Define a **clear goal** with all **details** and **direction**.
* **Be specific when you can.** If you already know important details, include them.<br />
(e.g., target market or industry, key competitors, customer segments, geography, or constraints)
* **Only stay open-ended if you don't know details and want discovery.** If you're exploring broadly, make that explicit (e.g., "tell me about the most impactful AI innovations in healthcare in 2025").
* **Avoid contradictions.** Don't include conflicting information, constraints, or goals in your prompt.
* **Share what's already known.** Include prior assumptions, existing decisions, or baseline knowledge—so the research doesn't repeat what you already have.
* **Keep the prompt clean and directed.** Use a clear task statement + essential context + desired output format. Avoid messy background dumps.
### Example Queries
```text
"Research the company ____ and its 2026 outlook. Provide a brief
overview of the company, its products, services, and market position."
```
```text
"Conduct a competitive analysis of ____ in 2026. Identify their main competitors,
compare market positioning, and analyze key differentiators."
```
```text
"We're evaluating Notion as a potential partner. We already know they primarily
serve SMB and mid-market teams, expanded their AI features significantly in 2025,
and most often compete with Confluence and ClickUp. Research Notion's 2026 outlook,
including market position, growth risks, and where a partnership could be most
valuable. Include citations."
```
## Model
| Model | Best For |
| ------ | -------------------------------------------------------------------- |
| `pro` | Comprehensive, multi-agent research for complex, multi-domain topics |
| `mini` | Targeted, efficient research for narrow or well-scoped questions |
| `auto` | When you're unsure how complex research will be |
### Pro
Provides comprehensive, multi-agent research suited for complex topics that span multiple subtopics or domains. Use when you want deeper analysis, more thorough reports, or maximum accuracy.
```json
{
"input": "Analyze the competitive landscape for ____ in the SMB market, including key competitors, positioning, pricing models, customer segments, recent product moves, and where ____ has defensible advantages or risks over the next 2–3 years.",
"model": "pro"
}
```
### Mini
Optimized for targeted, efficient research. Works best for narrow or well-scoped questions where you still benefit from agentic searching and synthesis, but don't need extensive depth.
```json
{
"input": "What are the top 5 competitors to ____ in the SMB market, and how do they differentiate?",
"model": "mini"
}
```
## Structured Output vs. Report
* **Structured Output** - Best for data enrichment, pipelines, or powering UIs with specific fields.
* **Report** — Best for reading, sharing, or displaying verbatim (e.g., chat interfaces, briefs, newsletters).
### Formatting Your Schema
* **Write clear field descriptions.** In 1–3 sentences, say exactly what the field should contain and what to look for. This makes it easier for our models to interpret what you're looking for.
* **Match the structure you actually need.** Use the right types (arrays, objects, enums) instead of packing multiple values into one string (e.g., `competitors: string[]`, not `"A, B, C"`).
* **Avoid duplicate or overlapping fields.** Keep each field unique and specific; contradictions or redundancy can confuse our models.
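A sketch of a schema following these guidelines, written as a JSON-Schema-style dict; the field names are placeholders, and the exact request shape should be checked against the Research API reference:

```python
# Illustrative output schema: clear descriptions, typed arrays instead of
# packed strings, and no overlapping fields.
competitor_schema = {
    "type": "object",
    "properties": {
        "company": {
            "type": "string",
            "description": "Official name of the company being researched.",
        },
        "competitors": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Main competitors, one company name per entry.",
        },
        "summary": {
            "type": "string",
            "description": "Two to three sentence synthesis of market position.",
        },
    },
    "required": ["company", "competitors", "summary"],
}
```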
## Streaming vs. Polling
<CardGroup cols={2}>
<Card title="Streaming" icon="wave-pulse" href="https://github.com/tavily-ai/tavily-cookbook/blob/main/cookbooks/research/streaming.ipynb">
Best for user interfaces where you want real-time updates.
</Card>
<Card title="Polling" icon="rotate" href="https://github.com/tavily-ai/tavily-cookbook/blob/main/cookbooks/research/polling.ipynb">
Best for background processes where you check status periodically.
</Card>
</CardGroup>
<Tip>
See streaming in action with the [live demo](https://chat-research.tavily.com/).
</Tip>
```
### references/usage.md
```markdown
> ## Documentation Index
> Fetch the complete documentation index at: https://docs.tavily.com/llms.txt
> Use this file to discover all available pages before exploring further.
# Usage
> Get API key and account usage details
## OpenAPI
````yaml GET /usage
openapi: 3.0.3
info:
title: Tavily Search and Extract API
description: >-
Our REST API provides seamless access to Tavily Search, a powerful search
engine for LLM agents, and Tavily Extract, an advanced web scraping solution
optimized for LLMs.
version: 1.0.0
servers:
- url: https://api.tavily.com/
security: []
tags:
- name: Search
- name: Extract
- name: Crawl
- name: Map
- name: Research
- name: Usage
paths:
/usage:
get:
summary: Get API key and account usage details
description: Get API key and account usage details
responses:
'200':
description: Usage details returned successfully
content:
application/json:
schema:
type: object
properties:
key:
type: object
properties:
usage:
type: integer
description: >-
Total credits used for this API key during the current
billing cycle
example: 150
limit:
type: integer
description: Usage limit for the API key. Returns null if unlimited
example: 1000
search_usage:
type: integer
description: >-
Search endpoint credits used for this API key during
the current billing cycle
example: 100
extract_usage:
type: integer
description: >-
Extract endpoint credits used for this API key during
the current billing cycle
example: 25
crawl_usage:
type: integer
description: >-
Crawl endpoint credits used for this API key during
the current billing cycle
example: 15
map_usage:
type: integer
description: >-
Map endpoint credits used for this API key during the
current billing cycle
example: 7
research_usage:
type: integer
description: >-
Research endpoint credits used for this API key during
the current billing cycle
example: 3
account:
type: object
description: Account plan and usage information
properties:
current_plan:
type: string
description: The current subscription plan name
example: Bootstrap
plan_usage:
type: integer
description: >-
Total credits used for this plan during the current
billing cycle
example: 500
plan_limit:
type: integer
description: Usage limit for the current plan
example: 15000
paygo_usage:
type: integer
description: Current pay-as-you-go usage count
example: 25
paygo_limit:
type: integer
description: Pay-as-you-go usage limit
example: 100
search_usage:
type: integer
description: >-
Search endpoint credits used for this plan during the
current billing cycle
example: 350
extract_usage:
type: integer
description: >-
Extract endpoint credits used for this plan during the
current billing cycle
example: 75
crawl_usage:
type: integer
description: >-
Crawl endpoint credits used for this plan during the
current billing cycle
example: 50
map_usage:
type: integer
description: >-
Map endpoint credits used for this plan during the
current billing cycle
example: 15
research_usage:
type: integer
description: >-
Research endpoint credits used for this plan during
the current billing cycle
example: 10
'401':
description: Unauthorized - Your API key is wrong or missing.
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: 'Unauthorized: missing or invalid API key.'
'429':
description: Too Many Requests
content:
application/json:
schema:
type: object
properties:
detail:
type: object
properties:
error:
type: string
example:
detail:
error: >-
Your request has been blocked due to excessive requests.
Please reduce the rate of requests
security:
- bearerAuth: []
x-codeSamples: []
components:
securitySchemes:
bearerAuth:
type: http
scheme: bearer
bearerFormat: JWT
description: >-
Bearer authentication header in the form Bearer <token>, where <token>
is your Tavily API key (e.g., Bearer tvly-YOUR_API_KEY).
````
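A minimal Python sketch of calling this endpoint with Bearer auth, using only the standard library (the helper name is illustrative):

```python
import json
import os
import urllib.request

def build_usage_request(api_key):
    # GET /usage with a Bearer token, per the OpenAPI spec above.
    req = urllib.request.Request("https://api.tavily.com/usage")
    req.add_header("Authorization", f"Bearer {api_key}")
    return req

# Example (requires a valid TAVILY_API_KEY and network access):
# with urllib.request.urlopen(build_usage_request(os.environ["TAVILY_API_KEY"])) as resp:
#     usage = json.load(resp)
#     print(usage["key"]["usage"], "credits used this billing cycle")
```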
```
---
## Skill Companion Files
> Additional files collected from the skill directory layout.
### _meta.json
```json
{
"owner": "doahc",
"slug": "tavily-web",
"displayName": "Tavily",
"latest": {
"version": "0.0.3",
"publishedAt": 1772686426654,
"commit": "https://github.com/openclaw/skills/commit/8c3bd473843029920ea33086bee73d71805490a4"
},
"history": []
}
```
### references/research-streaming.md
```markdown
> ## Documentation Index
> Fetch the complete documentation index at: https://docs.tavily.com/llms.txt
> Use this file to discover all available pages before exploring further.
# Streaming
> Stream real-time research progress and results from Tavily Research API
## Overview
When using the Tavily Research API, you can stream responses in real-time by setting `stream: true` in your request. This allows you to receive research progress updates, tool calls, and final results as they're generated, providing a better user experience for long-running research tasks.
Streaming is particularly useful for:
* Displaying research progress to users in real-time
* Monitoring tool calls and search queries as they execute
* Receiving incremental updates during lengthy research operations
* Building interactive research interfaces
## Enabling Streaming
To enable streaming, set the `stream` parameter to `true` when making a request to the Research endpoint:
```json
{
"input": "What are the latest developments in AI?",
"stream": true
}
```
The API will respond with a `text/event-stream` content type, sending Server-Sent Events (SSE) as the research progresses.
## Event Structure
Each streaming event follows a consistent structure compatible with the OpenAI chat completions format:
```json
{
"id": "123e4567-e89b-12d3-a456-426614174111",
"object": "chat.completion.chunk",
"model": "mini",
"created": 1705329000,
"choices": [
{
"delta": {
// Event-specific data here
}
}
]
}
```
### Core Fields
| Field | Type | Description |
| --------- | ------- | ----------------------------------------------------- |
| `id` | string | Unique identifier for the stream event |
| `object` | string | Always `"chat.completion.chunk"` for streaming events |
| `model` | string | The research model being used (`"mini"` or `"pro"`) |
| `created` | integer | Unix timestamp when the event was created |
| `choices` | array | Array containing the delta with event details |
## Event Types
The streaming response includes different types of events in the `delta` object. Here are the main event types you'll encounter:
### 1. Tool Call Events
When the research agent performs actions like web searches, you'll receive tool call events:
```json
{
"id": "evt_002",
"object": "chat.completion.chunk",
"model": "mini",
"created": 1705329005,
"choices": [
{
"delta": {
"role": "assistant",
"tool_calls": {
"type": "tool_call",
"tool_call": [
{
"name": "WebSearch",
"id": "fc_633b5932-e66c-4523-931a-04a7b79f2578",
"arguments": "Executing 5 search queries",
"queries": ["latest AI developments 2024", "machine learning breakthroughs", "..."]
}
]
}
}
}
]
}
```
**Tool Call Delta Fields:**
| Field | Type | Description |
| --------------------- | ------ | ------------------------------------------------------------------ |
| `type` | string | Either `"tool_call"` or `"tool_response"` |
| `tool_call` | array | Details about the tool being invoked |
| `name` | string | Name of the tool (see [Tool Types](#tool-types) below) |
| `id` | string | Unique identifier for the tool call |
| `arguments` | string | Description of the action being performed |
| `queries` | array | *(WebSearch only)* The search queries being executed |
| `parent_tool_call_id` | string | *(Pro mode only)* ID of the parent tool call for nested operations |
### 2. Tool Response Events
After a tool executes, you'll receive response events with discovered sources:
```json
{
"id": "evt_003",
"object": "chat.completion.chunk",
"model": "mini",
"created": 1705329010,
"choices": [
{
"delta": {
"role": "assistant",
"tool_calls": {
"type": "tool_response",
"tool_response": [
{
"name": "WebSearch",
"id": "fc_633b5932-e66c-4523-931a-04a7b79f2578",
"arguments": "Completed executing search tool call",
"sources": [
{
"url": "https://example.com/article",
"title": "Example Article",
"favicon": "https://example.com/favicon.ico"
}
]
}
]
}
}
}
]
}
```
**Tool Response Fields:**
| Field | Type | Description |
| --------------------- | ------ | --------------------------------------------------------------- |
| `name` | string | Name of the tool that completed |
| `id` | string | Unique identifier matching the original tool call |
| `arguments` | string | Completion status message |
| `sources` | array | Sources discovered by the tool (with `url`, `title`, `favicon`) |
| `parent_tool_call_id` | string | *(Pro mode only)* ID of the parent tool call |
### 3. Content Events
The final research report is streamed as content chunks:
```json
{
"id": "evt_004",
"object": "chat.completion.chunk",
"model": "mini",
"created": 1705329015,
"choices": [
{
"delta": {
"role": "assistant",
"content": "# Research Report\n\nBased on the latest sources..."
}
}
]
}
```
**Content Field:**
* Can be a **string** (markdown-formatted report chunks) when no `output_schema` is provided
* Can be an **object** (structured data) when an `output_schema` is specified
### 4. Sources Event
After the content is streamed, a sources event is emitted containing all sources used in the research:
```json
{
"id": "evt_005",
"object": "chat.completion.chunk",
"model": "mini",
"created": 1705329020,
"choices": [
{
"delta": {
"role": "assistant",
"sources": [
{
"url": "https://example.com/article",
"title": "Example Article Title",
"favicon": "https://example.com/favicon.ico"
}
]
}
}
]
}
```
**Source Object Fields:**
| Field | Type | Description |
| --------- | ------ | ---------------------------- |
| `url` | string | The URL of the source |
| `title` | string | The title of the source page |
| `favicon` | string | URL to the source's favicon |
### 5. Done Event
Signals the completion of the streaming response:
```
event: done
```
## Tool Types
During research, you'll encounter the following tool types in streaming events:
| Tool Name | Description | Model |
| ------------------ | -------------------------------------------------------------- | -------- |
| `Planning` | Initializes the research plan based on the input query | Both |
| `Generating` | Generates the final research report from collected information | Both |
| `WebSearch` | Executes web searches to gather information | Both |
| `ResearchSubtopic` | Conducts deep research on specific subtopics | Pro only |
### Research Flow Example
A typical streaming session follows this sequence:
1. **Planning** tool\_call → Initializing research plan
2. **Planning** tool\_response → Research plan initialized
3. **WebSearch** tool\_call → Executing search queries (with `queries` array)
4. **WebSearch** tool\_response → Search completed (with `sources` array)
5. *(Pro mode)* **ResearchSubtopic** tool\_call/response cycles for deeper research
6. **Generating** tool\_call → Generating final report
7. **Generating** tool\_response → Report generated
8. **Content** events → Streamed report chunks
9. **Sources** event → Complete list of all sources used
10. **Done** event → Stream complete
## Handling Streaming Responses
### Python Example
```python
from tavily import TavilyClient
# Step 1. Instantiating your TavilyClient
tavily_client = TavilyClient(api_key="tvly-YOUR_API_KEY")
# Step 2. Creating a streaming research task
stream = tavily_client.research(
input="Research the latest developments in AI",
model="pro",
stream=True
)
for chunk in stream:
print(chunk.decode('utf-8'))
```
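If you consume the raw stream yourself rather than printing chunks, each SSE event arrives on a `data:` line and the stream ends with `event: done`. A minimal line parser, assuming that framing from the event descriptions above:

```python
import json

def parse_sse_line(line):
    # Returns ("done", None) for the terminator, ("event", payload) for a
    # JSON event, and ("other", None) for anything else (comments, blanks).
    line = line.strip()
    if line == "event: done":
        return ("done", None)
    if line.startswith("data: "):
        return ("event", json.loads(line[len("data: "):]))
    return ("other", None)
```

Route each `("event", payload)` by inspecting `payload["choices"][0]["delta"]` for `tool_calls`, `content`, or `sources` keys.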
### JavaScript Example
```javascript
const { tavily } = require("@tavily/core");
const tvly = tavily({ apiKey: "tvly-YOUR_API_KEY" });
const stream = await tvly.research("Research the latest developments in AI", {
model: "pro",
stream: true,
});
for await (const chunk of stream) {
  console.log(chunk.toString("utf-8"));
}
```
## Structured Output with Streaming
When using `output_schema` to request structured data, the `content` field will contain an object instead of a string:
```json
{
"delta": {
"role": "assistant",
"content": {
"company": "Acme Corp",
"key_metrics": ["Revenue: $1M", "Growth: 50%"],
"summary": "Company showing strong growth..."
}
}
}
```
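Whichever form the `content` deltas take, assembling the final report is the same loop: concatenate string chunks, or keep the structured object. An illustrative accumulator over already-parsed event dicts:

```python
def accumulate_content(events):
    # String chunks are concatenated in arrival order; a structured (dict)
    # payload, produced when output_schema is set, replaces the report.
    report, structured = [], None
    for event in events:
        for choice in event.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if isinstance(content, str):
                report.append(content)
            elif isinstance(content, dict):
                structured = content
    return structured if structured is not None else "".join(report)
```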
## Error Handling
If an error occurs during streaming, you may receive an error event:
```json
{
"id": "1d77bdf5-38a4-46c1-87a6-663dbc4528ec",
"object": "error",
"error": "An error occurred while streaming the research task"
}
```
Always implement proper error handling in your streaming client to gracefully handle these cases.
## Non-Streaming Alternative
If you don't need real-time updates, set `stream: false` (or omit the parameter) to receive a single complete response:
```json
{
"request_id": "123e4567-e89b-12d3-a456-426614174111",
"created_at": "2025-01-15T10:30:00Z",
"status": "pending",
"input": "What are the latest developments in AI?",
"model": "mini",
"response_time": 1.23
}
```
You can then poll the status endpoint to check when the research is complete.
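An illustrative polling loop; `get_status` stands in for your own call to the status endpoint (its path is not shown in this excerpt), returning a record like the one above:

```python
import time

def poll_until_done(get_status, request_id, interval=5.0, timeout=300.0):
    # Poll until the task leaves "pending" or the timeout elapses.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        record = get_status(request_id)
        if record.get("status") != "pending":
            return record
        time.sleep(interval)
    raise TimeoutError(f"research task {request_id} still pending")
```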
```