library(artclaude)
# Create a chat with web search enabled
chat <- claude_new(
tools = list(claude_web_search())
)
# Claude will automatically search when needed
chat$chat("What were the major AI announcements this week?")Introduction
Claude’s training data has a knowledge cutoff, meaning it lacks information about events, updates, and developments after that date. The web search tool bridges this gap, giving Claude real-time access to current information from across the internet.
This vignette provides a comprehensive exploration of web search capabilities, from basic usage to sophisticated research workflows. You’ll learn how to build applications that combine Claude’s analytical reasoning with up-to-date web information.
Understanding Web Search
What Web Search Provides
The web search tool enables Claude to:
- Access current news and events
- Find updated documentation and specifications
- Research recent developments in any field
- Verify facts with current sources
- Gather multiple perspectives on topics
- Access information published after Claude’s training cutoff
How It Works Internally
When you enable web search, Claude gains access to a server-side tool. Each search involves four steps:
1. Query Formulation: Claude determines what to search for based on the conversation
2. Search Execution: The search is performed against Anthropic's search infrastructure
3. Result Processing: Search results (titles, snippets, URLs) are returned to Claude
4. Citation Integration: Claude synthesizes the information and cites its sources
This all happens automatically—you don’t need to implement any search logic.
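For reference, claude_web_search() presumably constructs Anthropic's server-tool specification in the request body. A minimal sketch of what that payload might look like, assuming the web_search_20250305 tool type from Anthropic's API documentation:
# Hypothetical: the tool spec claude_web_search() might generate
web_search_spec <- list(
  type = "web_search_20250305",  # server-side tool version identifier
  name = "web_search",
  max_uses = 5                   # optional cap on searches per request
)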
Pricing and Availability
Web search has additional costs beyond standard API usage:
- Cost: Approximately $10 per 1,000 searches (plus token costs)
- Models: Available for Claude 3.5 Sonnet, Claude 3.5 Haiku, and newer models
- Setup: Must be enabled in your Anthropic Console
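Because search charges scale linearly, a quick estimate helps with budgeting. A back-of-envelope helper based on the $10 per 1,000 searches figure (per-token costs, which vary by model, are excluded; the function name is illustrative):
# Rough search-cost estimate; excludes per-token charges
estimate_search_cost <- function(n_requests, searches_per_request = 1,
                                 price_per_1k = 10) {
  n_requests * searches_per_request * price_per_1k / 1000
}
estimate_search_cost(500, searches_per_request = 3)  # 1,500 searches -> $15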
Basic Usage
Enabling Web Search
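To enable web search, pass claude_web_search() to the tools argument of claude_new(), as in the opening example:
chat <- claude_new(
  tools = list(claude_web_search())
)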
Once enabled, Claude decides when to search based on:
- Questions about current events
- Requests for recent information
- Topics where up-to-date accuracy is important
- Explicit requests to “search” or “look up”
Understanding Claude’s Search Behavior
Claude makes intelligent decisions about searching:
chat <- claude_new(tools = list(claude_web_search()))
# Claude WILL search - needs current information
chat$chat("What is the current price of Bitcoin?")
# Claude WILL search - recent developments
chat$chat("What new features were added to R 4.4?")
# Claude might NOT search - historical fact
chat$chat("When was the Mona Lisa painted?")
# Claude WILL search if asked explicitly
chat$chat("Search for the history of the Mona Lisa and find recent restoration news")Automatic Source Citations
Claude automatically cites sources from search results:
chat <- claude_new(tools = list(claude_web_search()))
response <- chat$chat(
"What are the latest developments in large language models? Include sources."
)
# Response will include inline citations like:
# "According to [Source Name](https://example.com), recent advances include..."Configuration Options
Limiting Search Usage
Control how many searches Claude can perform per request:
# Allow up to 3 searches per request
chat <- claude_new(
tools = list(claude_web_search(max_uses = 3))
)
# For simple queries, one search might suffice
simple_chat <- claude_new(
tools = list(claude_web_search(max_uses = 1))
)
# For comprehensive research, allow more
research_chat <- claude_new(
tools = list(claude_web_search(max_uses = 10))
)
Higher max_uses values:
- Allow more thorough research
- Increase costs
- Take longer to complete
- May find more diverse sources
Domain Restrictions
Allowing Specific Domains
Restrict searches to trusted sources:
# Technical documentation only
tech_chat <- claude_new(
tools = list(claude_web_search(
allowed_domains = c(
"cran.r-project.org",
"r-project.org",
"tidyverse.org",
"rstudio.com",
"posit.co",
"github.com"
)
))
)
tech_chat$chat("What are the latest updates to the tidyverse packages?")Use cases for domain restrictions:
- Medical applications: Limit to peer-reviewed journals and official health organizations
- Legal applications: Restrict to legal databases and official court records
- Financial applications: Allow only verified financial news sources
- Educational applications: Limit to academic and educational resources
# Academic research
academic_chat <- claude_new(
tools = list(claude_web_search(
allowed_domains = c(
"arxiv.org",
"scholar.google.com",
"pubmed.ncbi.nlm.nih.gov",
"jstor.org",
"nature.com",
"science.org"
)
))
)
academic_chat$chat("Find recent research on transformer architectures in computer vision")Blocking Specific Domains
Exclude unreliable or irrelevant sources:
# Block social media and user-generated content
professional_chat <- claude_new(
tools = list(claude_web_search(
blocked_domains = c(
"reddit.com",
"twitter.com",
"x.com",
"facebook.com",
"tiktok.com",
"quora.com",
"medium.com"
)
))
)
professional_chat$chat("Research best practices for API design")Note: You cannot use both allowed_domains and blocked_domains simultaneously—they are mutually exclusive.
Location-Based Search
Provide location context for localized results:
# New York-based user
nyc_chat <- claude_new(
tools = list(claude_web_search(
user_location = list(
country = "US",
city = "New York",
region = "NY",
timezone = "America/New_York"
)
))
)
nyc_chat$chat("What art exhibitions are opening this weekend?")
nyc_chat$chat("Find local R user groups and meetups")
# Tokyo-based user
tokyo_chat <- claude_new(
tools = list(claude_web_search(
user_location = list(
country = "JP",
city = "Tokyo",
region = "Tokyo",
timezone = "Asia/Tokyo"
)
))
)
tokyo_chat$chat("What tech conferences are happening nearby?")Location affects:
- Local business listings
- Regional news coverage
- Event and venue searches
- Location-specific content
Case Study: Building a Research Assistant
Let’s build a comprehensive research assistant that combines web search with other Claude capabilities.
Scenario
We’re building a research tool for a data science team that needs to:
- Stay current on R package updates
- Research best practices and new methodologies
- Compare different technical approaches
- Compile findings with proper citations
The Research Assistant
library(artclaude)
# Create a research-focused assistant
research_assistant <- claude_new(
sys_prompt = "You are a technical research assistant specializing in R programming, data science, and statistical methods. When researching topics:
1. Search for authoritative sources first (official docs, academic papers, respected blogs)
2. Compare multiple perspectives when there's disagreement
3. Always cite your sources with URLs
4. Distinguish between established best practices and experimental approaches
5. Note when information might be outdated
Format your responses with clear sections and bullet points for easy reading.",
tools = list(
claude_web_search(
max_uses = 5,
allowed_domains = c(
# Official R resources
"cran.r-project.org",
"r-project.org",
"tidyverse.org",
"rstudio.com",
"posit.co",
# Technical blogs and documentation
"r-bloggers.com",
"github.com",
"stackoverflow.com",
# Academic and research
"arxiv.org",
"jstatsoft.org"
)
)
),
think_effort = "medium" # Enable reasoning for better analysis
)
Research Workflows
Staying Current on Packages
research_assistant$chat(
"Research the current state of data.table vs dplyr in 2024. What are the
latest benchmarks, new features, and community recommendations?"
)
Claude will:
- Search for recent comparisons and benchmarks
- Look for official announcements from both packages
- Find community discussions and recommendations
- Synthesize findings with citations
Technical Deep Dives
research_assistant$chat(
"I need to implement time series forecasting in R. Research the current
best practices, comparing traditional methods (ARIMA, ETS) with modern
approaches (Prophet, neural networks). What does the latest research
recommend for production use?"
)
Comparative Analysis
research_assistant$chat(
"Compare the latest approaches for handling missing data in R:
- mice package
- missForest
- Amelia
- Multiple imputation with tidymodels
Focus on recent updates and current recommendations from the R community."
)
Combining Web Search with Structured Output
Extract structured research findings:
# Define the research output schema
research_schema <- ellmer::type_object(
topic = ellmer::type_string("Research topic"),
summary = ellmer::type_string("Executive summary of findings"),
sources = ellmer::type_array(
items = ellmer::type_object(
title = ellmer::type_string("Source title"),
url = ellmer::type_string("Source URL"),
key_finding = ellmer::type_string("Main takeaway from this source"),
credibility = ellmer::type_enum(
c("official", "academic", "industry", "community"),
"Type of source"
)
),
description = "List of sources consulted"
),
recommendations = ellmer::type_array(
items = ellmer::type_string(),
description = "Actionable recommendations"
),
caveats = ellmer::type_array(
items = ellmer::type_string(),
description = "Important caveats or limitations"
)
)
# Conduct research and get structured output
research_assistant$chat(
"Research the best approaches for deploying R Shiny applications in production."
)
results <- research_assistant$chat_structured(
"Summarize your research findings in structured format.",
type = research_schema
)
# Access structured data
print(results$summary)
print(results$recommendations)
Multi-Turn Research Sessions
Build up research across multiple queries:
# Start broad
research_assistant$chat(
"I'm building a recommendation system in R. Give me an overview of the
current landscape and popular approaches."
)
# Narrow down
research_assistant$chat(
"That's helpful. Focus on collaborative filtering approaches.
What are the best packages and recent advances?"
)
# Get practical
research_assistant$chat(
"Compare the recommenderlab and recosystem packages.
Which is better for a production system with millions of users?"
)
# Implementation details
research_assistant$chat(
"Find examples and tutorials for implementing recosystem at scale.
What are common pitfalls and how do teams handle cold start problems?"
)
Advanced Patterns
Combining Web Search with Custom Tools
Create powerful workflows by combining web search with your own tools:
# Custom tool to save research findings
save_finding <- claude_tool(
fn = function(topic, finding, source_url, tags) {
# In practice, save to database
timestamp <- format(Sys.time(), "%Y-%m-%d %H:%M:%S")
cat(sprintf(
"[%s] Saved finding for '%s':\n %s\n Source: %s\n Tags: %s\n\n",
timestamp, topic, finding, source_url, paste(tags, collapse = ", ")
))
sprintf("Finding saved with timestamp %s", timestamp)
},
name = "save_finding",
desc = "Save an important research finding to the knowledge base for future reference",
topic = ellmer::type_string("Research topic category"),
finding = ellmer::type_string("The key finding or insight"),
source_url = ellmer::type_string("URL of the source"),
tags = ellmer::type_array(
items = ellmer::type_string(),
description = "Tags for categorization"
)
)
# Custom tool to check existing research
check_existing <- claude_tool(
fn = function(topic) {
# In practice, query database
sprintf("No existing research found for topic: %s", topic)
},
name = "check_existing_research",
desc = "Check if we already have research on a topic before searching the web",
topic = ellmer::type_string("Topic to check")
)
# Combined research assistant
smart_researcher <- claude_new(
sys_prompt = "You are a research assistant with access to both web search and a knowledge base. Before searching the web, check if we have existing research. Save important findings for future reference.",
tools = list(
claude_web_search(max_uses = 5),
check_existing,
save_finding
)
)
smart_researcher$chat(
"Research the current best practices for R package documentation.
Save the key findings for our team."
)
Web Search with Extended Thinking
For complex research requiring synthesis:
analytical_researcher <- claude_new(
sys_prompt = "You are a senior research analyst. Conduct thorough research, critically evaluate sources, identify patterns across sources, and synthesize insights.",
tools = list(claude_web_search(max_uses = 8)),
think_effort = "high",
interleaved = TRUE # Think between searches
)
analytical_researcher$chat(
"Investigate the current debate around R vs Python for data science in 2024.
Research arguments from both sides, look at recent surveys and industry
trends, and provide a balanced, well-reasoned analysis of when to use each.
Consider performance, ecosystem, learning curve, and job market factors."
)
With interleaved = TRUE, Claude will:
- Perform initial search
- Think about what was found and what’s missing
- Search for counter-arguments or additional perspectives
- Think about patterns across sources
- Search for quantitative data to support conclusions
- Think through the synthesis
- Provide a well-reasoned final analysis
Fact-Checking Workflow
Use web search to verify claims:
fact_checker <- claude_new(
sys_prompt = "You are a fact-checker. When given a claim, search for authoritative sources to verify or refute it. Rate the accuracy and explain your reasoning.",
tools = list(
claude_web_search(
max_uses = 3,
blocked_domains = c("reddit.com", "twitter.com", "facebook.com")
)
),
temp = 0 # Consistent, factual responses
)
# Schema for fact-check results
fact_check_schema <- ellmer::type_object(
claim = ellmer::type_string("The claim being checked"),
verdict = ellmer::type_enum(
c("true", "mostly_true", "mixed", "mostly_false", "false", "unverifiable"),
"Accuracy verdict"
),
confidence = ellmer::type_number("Confidence level 0-1"),
explanation = ellmer::type_string("Detailed explanation"),
sources = ellmer::type_array(
items = ellmer::type_object(
name = ellmer::type_string("Source name"),
url = ellmer::type_string("Source URL"),
supports_claim = ellmer::type_boolean("Whether source supports the claim")
),
description = "Sources consulted"
)
)
# Check a claim
fact_checker$chat(
"Fact-check this claim: 'R is the most popular language for statistical computing
according to recent surveys.'"
)
result <- fact_checker$chat_structured(
"Provide your fact-check verdict in structured format.",
type = fact_check_schema
)
print(result$verdict)
print(result$explanation)
Best Practices
Query Formulation
Help Claude search effectively with clear requests:
chat <- claude_new(tools = list(claude_web_search()))
# Good: Specific and time-bounded
chat$chat("Find announcements about R 4.4 features released in 2024")
# Good: Explicit sources
chat$chat("Search official tidyverse documentation for the latest dplyr 2.0 changes")
# Less effective: Vague
chat$chat("Tell me about R updates")Source Quality
Consider restricting domains for quality-sensitive applications:
# High-stakes: Official sources only
medical_chat <- claude_new(
tools = list(claude_web_search(
allowed_domains = c(
"nih.gov",
"cdc.gov",
"who.int",
"pubmed.ncbi.nlm.nih.gov",
"mayoclinic.org"
)
))
)
# General research: Block low-quality sources
general_chat <- claude_new(
tools = list(claude_web_search(
blocked_domains = c(
"reddit.com",
"quora.com",
"yahoo.com"
)
))
)
Cost Management
Control search usage for budget-sensitive applications:
# Minimize searches for simple queries
economical_chat <- claude_new(
sys_prompt = "Search the web only when absolutely necessary for current information. Use your training knowledge for historical facts and established concepts.",
tools = list(claude_web_search(max_uses = 1))
)
# Allow thorough research for complex tasks
thorough_chat <- claude_new(
tools = list(claude_web_search(max_uses = 10))
)
Error Handling
Web searches can fail or return no results:
chat <- claude_new(
sys_prompt = "If web search returns no results or fails, clearly state that you couldn't find current information and offer to answer based on your training knowledge with appropriate caveats.",
tools = list(claude_web_search())
)
# Claude will gracefully handle search failures
chat$chat("What are the latest updates to an obscure internal tool?")Limitations and Considerations
What Web Search Cannot Do
- Access paywalled content: results include snippets, not full articles
- Provide real-time data: there is a delay between web updates and searchability
- Access private information: only publicly indexed content is searchable
- Guarantee accuracy: search results may contain misinformation
When NOT to Use Web Search
- Historical facts (use Claude’s training knowledge)
- Well-established technical documentation
- Simple questions with stable answers
- When costs are a primary concern
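In those cases, a plain chat created without the search tool (claude_new() with defaults) answers from training knowledge and incurs no search charges:
# No tools attached: responses draw only on training knowledge
offline_chat <- claude_new()
offline_chat$chat("Explain the difference between S3 and S4 classes in R")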
Privacy Considerations
- Search queries are processed by Anthropic’s infrastructure
- User location data (if provided) affects search results
- Consider data sensitivity when building applications
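If prompts may contain sensitive data, one mitigation is screening them before they reach the API. A minimal, hypothetical scrubber using base R (the patterns are illustrative, not a complete PII solution):
# Hypothetical pre-flight redaction; patterns are illustrative only
scrub_prompt <- function(prompt) {
  prompt <- gsub("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}",
                 "[REDACTED EMAIL]", prompt, perl = TRUE)
  gsub("\\b\\d{3}-\\d{2}-\\d{4}\\b", "[REDACTED SSN]", prompt, perl = TRUE)
}
chat <- claude_new(tools = list(claude_web_search()))
chat$chat(scrub_prompt("Search for data retention rules relevant to jane.doe@example.com"))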
Integration Patterns
Caching Results
For repeated queries, cache search results:
# Simple in-memory cache
.search_cache <- new.env()
cached_search_chat <- function(query, cache_hours = 24) {
cache_key <- digest::digest(query)
if (exists(cache_key, envir = .search_cache)) {
cached <- get(cache_key, envir = .search_cache)
if (difftime(Sys.time(), cached$time, units = "hours") < cache_hours) {
return(cached$result)
}
}
chat <- claude_new(tools = list(claude_web_search()))
result <- chat$chat(query)
assign(cache_key, list(result = result, time = Sys.time()), envir = .search_cache)
result
}
Batch Research
Process multiple research queries:
research_topics <- c(
"Latest developments in tidymodels 2024",
"New features in ggplot2 3.5",
"R Shiny deployment best practices 2024"
)
# Process sequentially (respects rate limits)
results <- lapply(research_topics, function(topic) {
chat <- claude_new(tools = list(claude_web_search(max_uses = 3)))
response <- chat$chat(sprintf("Research: %s. Summarize key points.", topic))
list(topic = topic, response = response)
})
Summary
Web search transforms Claude into a research assistant with access to current information. Key takeaways:
- Enable selectively: Use web search when current information matters
- Control scope: Use max_uses to balance thoroughness and cost
- Curate sources: Use domain restrictions for quality-sensitive applications
- Combine capabilities: Pair web search with thinking and custom tools for powerful workflows
- Cite sources: Claude automatically provides citations; use them to verify information
