Introduction

The code execution tool transforms Claude from a conversational AI into a computational engine capable of writing and running code to solve problems. Instead of just describing how to calculate something or explaining an algorithm, Claude can actually execute code and return concrete results.

This vignette provides a comprehensive exploration of code execution capabilities, covering both Bash and Python environments, integration with other Claude features, and patterns for building sophisticated computational workflows.

Understanding Code Execution

What Code Execution Provides

With code execution enabled, Claude can:

  • Perform complex mathematical calculations
  • Process and analyze data
  • Generate visualizations
  • Execute shell commands
  • Manipulate files
  • Run iterative algorithms
  • Prototype and test code solutions

The Execution Environment

Code execution happens in Anthropic’s sandboxed infrastructure:

  • Isolated: Each execution runs in a fresh, isolated environment
  • Sandboxed: Limited access to network and system resources
  • Ephemeral: State doesn’t persist between requests (except for files retained via the Files API)
  • Secure: Designed to prevent malicious code execution

Available Environments

Environment | Beta Header               | Tool Type               | Use Case
Bash        | code-execution-2025-08-25 | code_execution_20250825 | Shell commands, file operations
Python      | code-execution-2025-05-22 | code_execution_20250522 | Data analysis, visualization, computation

Enabling Code Execution

Bash Execution

library(artclaude)

# Enable Bash code execution
bash_chat <- claude_new(
  tools = list(claude_code_exec(type = "bash")),
  beta = beta_headers$BETA_CODE_EXEC_BASH
)

# Claude can now execute shell commands
bash_chat$chat("Calculate the SHA256 hash of the text 'Hello, World!'")

Python Execution

# Enable Python code execution
python_chat <- claude_new(
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON
)

# Claude can now execute Python code
python_chat$chat("Calculate the first 50 prime numbers and show the distribution of their last digits")
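For a prompt like this, the Python tool typically generates and runs something along these lines (an illustrative sketch of the generated code, not part of artclaude itself):

```python
from collections import Counter

def first_n_primes(n):
    """Return the first n prime numbers by trial division."""
    primes = []
    candidate = 2
    while len(primes) < n:
        # primes holds every prime below candidate, so trial
        # division against them is a sufficient primality check
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes

primes = first_n_primes(50)
last_digit_counts = Counter(p % 10 for p in primes)
print(primes[-1])                        # 229, the 50th prime
print(sorted(last_digit_counts.items()))
```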

Both Environments

You can enable both environments in the same session:

# Enable both Bash and Python
multi_exec_chat <- claude_new(
  tools = list(
    claude_code_exec(type = "bash"),
    claude_code_exec(type = "python")
  ),
  beta = c(beta_headers$BETA_CODE_EXEC_BASH, beta_headers$BETA_CODE_EXEC_PYTHON)
)

# Claude chooses the appropriate environment
multi_exec_chat$chat("Use bash to list files, then use Python to analyze the output")

Bash Execution Deep Dive

Basic Command Execution

bash_chat <- claude_new(
  tools = list(claude_code_exec(type = "bash")),
  beta = beta_headers$BETA_CODE_EXEC_BASH
)

# Simple commands
bash_chat$chat("What's the current date and time in UTC?")

# Text processing
bash_chat$chat("Generate 100 random numbers between 1-1000 and find the median")

# File operations
bash_chat$chat("Create a CSV file with 10 sample records of names and ages")

Use Cases for Bash

Data Processing Pipelines

bash_chat$chat(
  "Create a data pipeline that:
   1. Generates a sample log file with 1000 entries
   2. Extracts error messages
   3. Counts occurrences by error type
   4. Outputs a summary report"
)

System Information

bash_chat$chat("Analyze the environment: show available disk space, memory, and CPU info")

Text Manipulation

bash_chat$chat(
  "I have this CSV data:
   name,age,city
   Alice,30,NYC
   Bob,25,LA
   Carol,35,Chicago

   Use awk to calculate the average age"
)

Bash Limitations

The Bash environment has restrictions for security:

  • Limited network access
  • Cannot install system packages
  • Ephemeral filesystem (resets between requests)
  • Resource limits on CPU and memory

Python Execution Deep Dive

The Python Environment

Python execution provides access to a rich scientific computing environment:

Pre-installed Libraries:

  • Data manipulation: pandas, numpy
  • Visualization: matplotlib, seaborn, plotly
  • Scientific computing: scipy, scikit-learn
  • Statistical analysis: statsmodels
  • Machine learning: Various ML packages

Basic Python Execution

python_chat <- claude_new(
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON
)

# Data analysis
python_chat$chat(
  "Generate a dataset of 500 random sales transactions with:
   - date (random dates in 2024)
   - product_id (1-20)
   - quantity (1-10)
   - price (10-500)

   Then analyze: total revenue by month, top products by revenue, and the
   correlation between quantity and price."
)
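Inside the sandbox, Claude would typically express this with pandas. A sketch of the kind of code it generates (column names follow the prompt; the seed and distributions are illustrative assumptions):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)  # seed for reproducibility
n = 500
sales = pd.DataFrame({
    "date": pd.to_datetime("2024-01-01")
            + pd.to_timedelta(rng.integers(0, 365, n), unit="D"),
    "product_id": rng.integers(1, 21, n),
    "quantity": rng.integers(1, 11, n),
    "price": rng.uniform(10, 500, n).round(2),
})
sales["revenue"] = sales["quantity"] * sales["price"]

# Total revenue by month, top products, quantity-price correlation
monthly = sales.groupby(sales["date"].dt.to_period("M"))["revenue"].sum()
top_products = sales.groupby("product_id")["revenue"].sum().nlargest(5)
corr = sales["quantity"].corr(sales["price"])

print(monthly)
print(top_products)
print(f"quantity-price correlation: {corr:.3f}")
```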

Data Visualization

python_chat$chat(
  "Create visualizations for the sales data:
   1. A line chart of monthly revenue trends
   2. A bar chart of top 5 products
   3. A scatter plot of quantity vs price with a trend line

   Use a clean, professional style with proper labels."
)

Claude will generate the visualizations and return them as part of the response.

Statistical Analysis

python_chat$chat(
  "Perform a comprehensive statistical analysis:
   1. Generate two groups of data (control vs treatment, n=100 each)
   2. Test for normality in each group
   3. Perform an appropriate statistical test to compare means
   4. Calculate effect size (Cohen's d)
   5. Create a violin plot comparing the distributions
   6. Summarize findings with confidence intervals"
)
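The statistical core of such a request usually reduces to a few scipy calls plus a hand-rolled effect size. A hedged sketch of what Claude's generated analysis might contain (group means and the seed are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
control = rng.normal(loc=50, scale=10, size=100)
treatment = rng.normal(loc=55, scale=10, size=100)

# Normality check per group (Shapiro-Wilk), then Welch's t-test
_, p_norm_c = stats.shapiro(control)
_, p_norm_t = stats.shapiro(treatment)
t_stat, p_value = stats.ttest_ind(control, treatment, equal_var=False)

# Cohen's d using the pooled standard deviation
pooled_sd = np.sqrt((control.var(ddof=1) + treatment.var(ddof=1)) / 2)
cohens_d = (treatment.mean() - control.mean()) / pooled_sd

print(f"t = {t_stat:.2f}, p = {p_value:.4f}, d = {cohens_d:.2f}")
```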

Machine Learning

python_chat$chat(
  "Build a classification model:
   1. Create a synthetic dataset for binary classification (1000 samples, 10 features)
   2. Split into train/test sets
   3. Train a Random Forest classifier
   4. Evaluate with confusion matrix, precision, recall, F1
   5. Plot ROC curve and calculate AUC
   6. Show feature importance"
)
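Since scikit-learn is pre-installed, steps 1–6 map onto a short sklearn pipeline. A sketch of what the generated model-building code might look like (hyperparameters here are illustrative defaults):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification: 1000 samples, 10 features
X, y = make_classification(n_samples=1000, n_features=10,
                           n_informative=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

probs = model.predict_proba(X_test)[:, 1]
preds = model.predict(X_test)
auc = roc_auc_score(y_test, probs)
f1 = f1_score(y_test, preds)

print(f"AUC = {auc:.3f}, F1 = {f1:.3f}")
print("Most important feature index:", model.feature_importances_.argmax())
```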

Case Study: Building a Data Analysis Assistant

Let’s build a comprehensive data analysis assistant that combines code execution with Claude’s reasoning capabilities.

Scenario

We’re building an assistant that can:

  1. Accept data descriptions or generate sample data
  2. Perform exploratory data analysis
  3. Create visualizations
  4. Run statistical tests
  5. Build predictive models
  6. Explain findings in plain language

The Data Analyst Assistant

library(artclaude)

data_analyst <- claude_new(
  sys_prompt = "You are an expert data analyst. When given data analysis tasks:

1. First understand the data structure and research question
2. Write clean, well-documented Python code
3. Perform thorough exploratory analysis before modeling
4. Choose appropriate statistical methods with justification
5. Create clear, publication-quality visualizations
6. Explain findings in plain language with caveats
7. Suggest follow-up analyses when relevant

Always show your work and code. Explain your analytical choices.",
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON,
  think_effort = "medium" # Enable reasoning for analytical decisions
)

Exploratory Data Analysis

data_analyst$chat(
  "I have a dataset of employee performance reviews with columns:
   - employee_id
   - department (Engineering, Sales, Marketing, HR)
   - years_experience
   - performance_score (1-5)
   - salary
   - promoted_this_year (True/False)

   Generate a realistic dataset of 500 employees and perform a complete
   exploratory data analysis. I want to understand what factors are
   associated with promotion decisions."
)

Hypothesis Testing

data_analyst$chat(
  "Based on the employee data, test these hypotheses:
   1. Is there a significant difference in promotion rates across departments?
   2. Does years of experience correlate with performance scores?
   3. Is there a salary premium for promoted employees, controlling for department?

   Use appropriate statistical tests and report effect sizes."
)

Predictive Modeling

data_analyst$chat(
  "Build a model to predict promotion likelihood:
   1. Use logistic regression as a baseline
   2. Try a Random Forest for comparison
   3. Handle class imbalance appropriately
   4. Use cross-validation for robust estimates
   5. Identify the most important predictors
   6. Discuss any fairness concerns with the model"
)

Getting Structured Results

# Define schema for analysis results
analysis_schema <- ellmer::type_object(
  research_question = ellmer::type_string("The question being analyzed"),
  methodology = ellmer::type_string("Statistical approach used"),
  key_findings = ellmer::type_array(
    items = ellmer::type_object(
      finding = ellmer::type_string("Description of finding"),
      statistical_support = ellmer::type_string("Statistical evidence"),
      confidence = ellmer::type_enum(
        c("high", "medium", "low"),
        "Confidence in finding"
      )
    ),
    description = "List of key findings"
  ),
  limitations = ellmer::type_array(
    items = ellmer::type_string(),
    description = "Limitations and caveats"
  ),
  recommendations = ellmer::type_array(
    items = ellmer::type_string(),
    description = "Recommendations for action or further analysis"
  )
)

# Get structured analysis summary
results <- data_analyst$chat_structured(
  "Summarize your complete analysis in structured format.",
  type = analysis_schema
)

print(results$key_findings)

Combining Code Execution with Other Features

Code Execution + Extended Thinking

For complex computational problems:

computational_solver <- claude_new(
  sys_prompt = "You are a computational problem solver. Think through problems carefully before writing code. Consider edge cases and efficiency.",
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON,
  think_effort = "high",
  interleaved = TRUE
)

computational_solver$chat(
  "Implement an efficient algorithm to find the longest palindromic substring
   in a string. Compare the naive O(n³) approach with the O(n²) dynamic
   programming approach. Benchmark both on strings of varying lengths
   (100, 1000, 5000 characters) and visualize the performance difference."
)
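The two algorithms Claude would benchmark look roughly like this (a sketch of the generated implementations, without the timing and plotting scaffolding):

```python
def longest_palindrome_naive(s):
    """O(n^3): check every substring, longest first."""
    n = len(s)
    for length in range(n, 0, -1):
        for i in range(n - length + 1):
            sub = s[i:i + length]
            if sub == sub[::-1]:
                return sub
    return ""

def longest_palindrome_dp(s):
    """O(n^2): dynamic programming over substring lengths."""
    n = len(s)
    if n == 0:
        return ""
    is_pal = [[False] * n for _ in range(n)]
    start, best_len = 0, 1
    for i in range(n):
        is_pal[i][i] = True
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # s[i..j] is a palindrome if its ends match and the
            # interior (if any) is already known to be one
            if s[i] == s[j] and (length == 2 or is_pal[i + 1][j - 1]):
                is_pal[i][j] = True
                start, best_len = i, length
    return s[start:start + best_len]

print(longest_palindrome_dp("babad"))
```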

With interleaved = TRUE, a typical trajectory is:

  1. Claude thinks about the problem structure
  2. Implements the naive solution
  3. Thinks about optimization opportunities
  4. Implements the DP solution
  5. Thinks about benchmarking methodology
  6. Runs benchmarks
  7. Thinks about results interpretation
  8. Creates visualization and explains findings

Code Execution + Web Search

Research-informed analysis:

research_analyst <- claude_new(
  sys_prompt = "You are a research analyst who combines literature research with data analysis. First research current best practices, then implement and analyze.",
  tools = list(
    claude_web_search(max_uses = 3),
    claude_code_exec(type = "python")
  ),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON
)

research_analyst$chat(
  "Research the current best practices for time series anomaly detection,
   then implement the most recommended approach on a sample dataset.
   Compare it against a simple baseline (e.g., Z-score method)."
)

Code Execution + Custom Tools

Combine execution with your own tools:

# Custom tool to fetch data
fetch_data <- claude_tool(
  fn = function(dataset_name) {
    # Simulate fetching from data warehouse
    if (dataset_name == "sales_2024") {
      n <- 1000
      data.frame(
        date = seq.Date(as.Date("2024-01-01"), by = "day", length.out = n),
        revenue = cumsum(rnorm(n, 5000, 1000)),
        customers = sample(100:500, n, replace = TRUE),
        region = sample(c("North", "South", "East", "West"), n, replace = TRUE)
      )
    } else {
      NULL
    }
  },
  name = "fetch_data",
  desc = "Fetch a dataset from the data warehouse by name",
  dataset_name = ellmer::type_string("Name of dataset to fetch")
)

# Custom tool to save results
save_results <- claude_tool(
  fn = function(analysis_name, summary) {
    cat(sprintf("Saved analysis '%s': %s\n", analysis_name, summary))
    "Results saved successfully"
  },
  name = "save_results",
  desc = "Save analysis results to the results database",
  analysis_name = ellmer::type_string("Name for the analysis"),
  summary = ellmer::type_string("Summary of key findings")
)

# Combined analyst
integrated_analyst <- claude_new(
  sys_prompt = "You are a data analyst with access to the company's data warehouse. Fetch data using fetch_data, analyze with Python, and save important findings.",
  tools = list(
    fetch_data,
    save_results,
    claude_code_exec(type = "python")
  ),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON
)

integrated_analyst$chat(
  "Fetch the sales_2024 dataset, analyze revenue trends by region,
   identify any anomalies, and save your findings."
)

Working with Files

Files API Integration

For working with uploaded files:

# Enable both code execution and Files API
files_analyst <- claude_new(
  tools = list(claude_code_exec(type = "python")),
  beta = c(
    beta_headers$BETA_CODE_EXEC_PYTHON,
    beta_headers$BETA_FILES_API
  )
)

# Process an uploaded file
files_analyst$chat(
  ellmer::content_file("sales_data.csv"),
  "Analyze this sales data: calculate summary statistics, identify trends,
   and create visualizations. Flag any data quality issues."
)

Processing Multiple Files

files_analyst$chat(
  ellmer::content_file("q1_sales.csv"),
  ellmer::content_file("q2_sales.csv"),
  "Compare Q1 and Q2 sales data. Calculate quarter-over-quarter growth,
   identify which products improved and which declined."
)

Advanced Patterns

Iterative Problem Solving

Let Claude refine solutions through multiple attempts:

iterative_solver <- claude_new(
  sys_prompt = "You are a meticulous problem solver. After implementing a solution, verify it with test cases. If issues are found, debug and fix them.",
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON,
  think_effort = "high"
)

iterative_solver$chat(
  "Implement a function to validate email addresses according to RFC 5321.
   Test it with edge cases including:
   - Normal emails
   - Subdomains
   - Plus addressing
   - Quoted strings
   - International characters
   - Invalid formats

   Fix any bugs you discover through testing."
)
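Claude's first attempt at the validator usually starts from a regex and then tightens it as test cases fail. A simplified sketch of where that iteration might land (full RFC 5321 compliance needs a real parser; this pattern and its accepted/rejected cases are illustrative choices):

```python
import re

# Simplified pattern: covers common cases, deliberately rejects
# dotless domains; quoted-string and internationalized local parts
# would need further iterations
EMAIL_RE = re.compile(
    r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")

def is_valid_email(addr):
    return bool(EMAIL_RE.match(addr))

cases = {
    "alice@example.com": True,        # normal address
    "bob@mail.example.co.uk": True,   # subdomains
    "carol+news@example.com": True,   # plus addressing
    "no-at-sign.example.com": False,  # missing @
    "dave@localhost": False,          # no dot in domain (rejected here)
}
for addr, expected in cases.items():
    assert is_valid_email(addr) == expected, addr
print("all cases pass")
```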

Multi-Step Pipelines

Complex analyses spanning multiple code executions:

pipeline_analyst <- claude_new(
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON
)

# Step 1: Data generation
pipeline_analyst$chat(
  "Generate a dataset simulating an A/B test with:
   - 10,000 users per variant
   - Conversion rates of 2.1% (control) vs 2.5% (treatment)
   - Add realistic noise and some missing data"
)

# Step 2: Data cleaning
pipeline_analyst$chat(
  "Clean the dataset: handle missing values, check for duplicates,
   validate data types, and document any issues found."
)

# Step 3: Analysis
pipeline_analyst$chat(
  "Perform a rigorous A/B test analysis:
   - Calculate conversion rates with confidence intervals
   - Run appropriate statistical test
   - Check for novelty effects (time-based analysis)
   - Segment by any available dimensions"
)
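The statistical heart of step 3 is a two-proportion z-test. A sketch of the comparison Claude would run on the generated counts (the 210/250 conversion counts below are illustrative, matching the simulated rates):

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Normal CDF via erf: Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# 10,000 users per variant, 2.1% vs 2.5% observed conversion
z, p = two_proportion_ztest(210, 10_000, 250, 10_000)
print(f"z = {z:.2f}, p = {p:.3f}")  # borderline at alpha = 0.05
```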

# Step 4: Reporting
pipeline_analyst$chat(
  "Create a visual report with:
   - Summary metrics
   - Conversion rate comparison chart
   - Time series of cumulative conversion
   - Recommendations with confidence level"
)

Error Recovery

Handle code execution errors gracefully:

robust_executor <- claude_new(
  sys_prompt = "You are a robust code executor. If code fails, analyze the error, fix the issue, and retry. Explain what went wrong and how you fixed it.",
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON
)

robust_executor$chat(
  "Try to calculate the inverse of this matrix: [[1, 2], [2, 4]]
   (Note: this matrix is singular)
   Handle any errors appropriately and explain what's happening."
)
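What happens in the sandbox is that numpy raises a LinAlgError, which Claude then catches and explains. A sketch of the recovery pattern, using the Moore-Penrose pseudoinverse as the fallback:

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0]])  # rank 1, hence singular

try:
    inv = np.linalg.inv(A)
except np.linalg.LinAlgError as err:
    print(f"Inversion failed: {err}")
    # Fall back to the pseudoinverse, which exists for any matrix
    pinv = np.linalg.pinv(A)
    print(pinv)
```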

Best Practices

Code Quality

# Encourage clean code in system prompt
clean_coder <- claude_new(
  sys_prompt = "Write production-quality Python code:
   - Use descriptive variable names
   - Add docstrings and comments
   - Handle edge cases
   - Include error handling
   - Follow PEP 8 style guidelines",
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON
)

Reproducibility

reproducible_analyst <- claude_new(
  sys_prompt = "Ensure reproducibility:
   - Set random seeds
   - Document package versions used
   - Make code self-contained
   - Include all necessary imports",
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON
)

reproducible_analyst$chat(
  "Create a reproducible analysis of bootstrap confidence intervals.
   Make sure anyone could run this code and get the same results."
)
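A reproducible version of that bootstrap hinges on a fixed seed and self-contained code. One way Claude might write it (the exponential data and replicate count are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2024)  # fixed seed => identical results per run
sample = rng.exponential(scale=2.0, size=200)

# Percentile bootstrap: resample with replacement, collect means
boot_means = np.array([
    rng.choice(sample, size=sample.size, replace=True).mean()
    for _ in range(5000)
])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean = {sample.mean():.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```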

Security Considerations

  • Code executes in Anthropic’s sandboxed environment
  • Network access is restricted
  • Cannot access your local system
  • File operations are ephemeral (unless using Files API)
  • Resource limits prevent runaway computations

Limitations

Environment Constraints

  • Package availability: Only pre-installed packages; cannot pip install
  • Memory limits: Large datasets may cause memory errors
  • Time limits: Long-running computations may timeout
  • Network: Limited/no external network access
  • State: Environment resets between requests

What Code Execution Cannot Do

  • Access your local filesystem
  • Make external API calls
  • Install additional packages
  • Run indefinitely
  • Access GPU acceleration (typically)

When NOT to Use Code Execution

  • Simple calculations Claude can do mentally
  • When reviewing the exact code matters more than obtaining its results
  • Highly sensitive computations requiring auditing
  • Tasks requiring packages not in the environment

Troubleshooting

Common Issues

“Module not found” errors

# Check what's available
checker <- claude_new(
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON
)

checker$chat("List all available Python packages in this environment")
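In response, Claude's Python tool typically queries package metadata directly. A sketch of that check, written defensively so missing packages don't raise (the package list is illustrative):

```python
import importlib.metadata as md

def package_report(pkgs):
    """Return {package: version or None} without raising."""
    report = {}
    for pkg in pkgs:
        try:
            report[pkg] = md.version(pkg)
        except md.PackageNotFoundError:
            report[pkg] = None
    return report

print(package_report(["pandas", "numpy", "scipy", "matplotlib"]))
```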

Memory errors

# Use chunked processing for large data
memory_efficient <- claude_new(
  sys_prompt = "When processing large data, use chunked/streaming approaches to avoid memory errors.",
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON
)

Timeout issues

# Optimize for speed
fast_coder <- claude_new(
  sys_prompt = "Write efficient code. Use vectorized operations over loops. Profile if needed.",
  tools = list(claude_code_exec(type = "python")),
  beta = beta_headers$BETA_CODE_EXEC_PYTHON
)

Summary

Code execution elevates Claude from describing solutions to implementing them. Key takeaways:

  1. Choose the right environment: Bash for shell tasks, Python for data science
  2. Enable with beta headers: Required for both execution types
  3. Combine with thinking: Use think_effort for complex computational problems
  4. Leverage the ecosystem: Python environment includes major data science packages
  5. Handle limitations: Understand environment constraints and plan accordingly
  6. Ensure reproducibility: Set seeds, document versions, write clean code

Further Reading