Browser Use Integration

Overview

Browser Use is an open-source AI browser automation framework that works with any LLM. Unlike cloud-dependent solutions, it can be self-hosted, so usage with local models is not capped by a provider.

Key Advantages:

Open Source: No API rate limits or vendor lock-in
Any LLM: Claude, GPT-4, Ollama (local), and more
Self-Hosted: Run on your infrastructure
3-5x Faster: Optimized for browser tasks

Quick Start (10 Minutes)

1. Install Browser Use

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

# Install Browser Use
pip install browser-use

# Install LLM provider (choose one)
pip install langchain-anthropic  # For Claude
pip install langchain-openai     # For GPT-4
pip install langchain-ollama     # For local models

2. Configure API Key

# For Claude
export ANTHROPIC_API_KEY=your_key_here

# For OpenAI
export OPENAI_API_KEY=your_key_here

# For Ollama (no key needed, just run Ollama locally)
ollama serve

3. Write First Agent

# agent.py
import asyncio
from browser_use import Agent
from langchain_anthropic import ChatAnthropic

async def main():
    agent = Agent(
        task="Go to google.com and search for 'Browser Use AI automation'",
        llm=ChatAnthropic(model="claude-sonnet-4-20250514"),
    )

    result = await agent.run()
    print(result)

asyncio.run(main())

4. Run

python agent.py

LLM Configuration

Claude (Recommended)

import os
from langchain_anthropic import ChatAnthropic

# Claude Sonnet (best balance)
llm = ChatAnthropic(
    model="claude-sonnet-4-20250514",
    api_key=os.environ.get("ANTHROPIC_API_KEY"),
)

# Claude Opus (highest quality)
llm = ChatAnthropic(model="claude-opus-4-20250514")

# Claude Haiku (fastest, cheapest)
llm = ChatAnthropic(model="claude-3-5-haiku-20241022")

OpenAI

import os
from langchain_openai import ChatOpenAI

# GPT-4o
llm = ChatOpenAI(
    model="gpt-4o",
    api_key=os.environ.get("OPENAI_API_KEY"),
)

# GPT-4 Turbo
llm = ChatOpenAI(model="gpt-4-turbo-preview")

Ollama (Free, Local)

# First, install and run Ollama
ollama serve

# Pull a model
ollama pull llama3.2

from langchain_ollama import ChatOllama

# Local Llama 3.2
llm = ChatOllama(
    model="llama3.2",
    base_url="http://localhost:11434",
)

# Local Mistral
llm = ChatOllama(model="mistral")

# Local Code Llama
llm = ChatOllama(model="codellama")

Cost Comparison

| LLM | Cost per 1M tokens | Best For |
|-----|-------------------|----------|
| Claude Haiku | ~$0.25 | Simple tasks |
| Claude Sonnet | ~$3.00 | Complex tasks |
| GPT-4o | ~$5.00 | General use |
| Ollama | Free | Unlimited local |
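To turn the table into a rough budget, the per-token prices can be multiplied by an expected token count. The sketch below copies the approximate prices from the table; the workload figures (200 runs at about 50,000 tokens each) are hypothetical and only illustrate the arithmetic.

```python
# Approximate prices per 1M tokens, copied from the cost comparison table.
PRICE_PER_MILLION_USD = {
    "Claude Haiku": 0.25,
    "Claude Sonnet": 3.00,
    "GPT-4o": 5.00,
    "Ollama": 0.0,
}


def estimate_cost(model: str, tokens: int) -> float:
    """Return the approximate USD cost of processing `tokens` tokens."""
    return PRICE_PER_MILLION_USD[model] * tokens / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200 agent runs at roughly 50,000 tokens each.
    total_tokens = 200 * 50_000
    for model in PRICE_PER_MILLION_USD:
        print(f"{model}: ${estimate_cost(model, total_tokens):.2f}")
```

Actual token usage per run depends on the pages visited and the number of agent steps, so measure a few real runs before relying on an estimate like this.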

Agent Patterns

Simple Task

agent = Agent(
    task="Search for 'Python tutorials' on YouTube and get the top 5 video titles",
    llm=llm,
)
result = await agent.run()

Multi-Step Task

agent = Agent(
    task="""
    1. Go to amazon.com
    2. Search for 'wireless mouse'
    3. Filter by 4+ star rating
    4. Extract the top 5 products with name, price, and rating
    5. Return as JSON
    """,
    llm=llm,
)
result = await agent.run()

Task with Extraction Schema

from pydantic import BaseModel
from typing import List

class Product(BaseModel):
    name: str
    price: float
    rating: float
    url: str

class ProductList(BaseModel):
    products: List[Product]

agent = Agent(
    task="Find the top 5 laptops on BestBuy under $1000",
    llm=llm,
    output_schema=ProductList,  # Structured output
)
result = await agent.run()
# result.products is List[Product]

With Custom Browser Settings

from browser_use import Agent, Browser

browser = Browser(
    headless=False,  # Show browser
    proxy="http://proxy.example.com:8080",  # Use proxy
)

agent = Agent(
    task="Navigate to example.com",
    llm=llm,
    browser=browser,
)

Error Handling

import asyncio
from browser_use import Agent

async def run_with_retry(task: str, max_retries: int = 3):
    for attempt in range(max_retries):
        try:
            agent = Agent(task=task, llm=llm)
            result = await agent.run()
            return result
        except Exception as e:
            # Caught broadly on purpose; narrow this to the exception types
            # that your installed browser-use version actually raises.
            print(f"Attempt {attempt + 1} failed: {e}")
            if attempt == max_retries - 1:
                raise
            await asyncio.sleep(2 ** attempt)  # Exponential backoff

# Usage (inside an async function)
result = await run_with_retry("Search Google for 'AI news'")

Timeout Handling

import asyncio

async def run_with_timeout(task: str, timeout: int = 60):
    agent = Agent(task=task, llm=llm)
    try:
        result = await asyncio.wait_for(agent.run(), timeout=timeout)
        return result
    except asyncio.TimeoutError:
        print(f"Task timed out after {timeout}s")
        return None
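Retries and timeouts are often needed together. The sketch below is one way to combine them, not part of Browser Use itself: `run_resilient` and its parameters are hypothetical names, and it accepts any zero-argument callable that returns an awaitable, so it can be exercised without a browser.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def run_resilient(
    make_run: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    timeout: float = 60.0,
    base_delay: float = 1.0,
) -> T:
    """Run make_run() with a per-attempt timeout and exponential backoff."""
    last_error: Optional[BaseException] = None
    for attempt in range(max_retries):
        try:
            # A fresh awaitable is created for every attempt.
            return await asyncio.wait_for(make_run(), timeout=timeout)
        except Exception as exc:  # also covers asyncio.TimeoutError
            last_error = exc
            if attempt < max_retries - 1:
                # Wait base_delay, 2*base_delay, 4*base_delay, ...
                await asyncio.sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"all {max_retries} attempts failed") from last_error
```

With the names used in this document, a call might look like `await run_resilient(lambda: Agent(task=task, llm=llm).run(), timeout=120)`. Creating a new `Agent` per attempt mirrors the retry example above.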

Self-Hosting

Docker Setup

# Dockerfile
FROM python:3.11-slim

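# Install Chrome (apt-key is deprecated, so the signing key goes into a keyring file)
RUN apt-get update && apt-get install -y \
    wget gnupg \
    && wget -q -O - https://dl.google.com/linux/linux_signing_key.pub \
       | gpg --dearmor -o /usr/share/keyrings/google-chrome.gpg \
    && echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome.gpg] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google-chrome.list \
    && apt-get update \
    && apt-get install -y google-chrome-stable \
    && rm -rf /var/lib/apt/lists/*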

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

COPY . .
CMD ["python", "agent.py"]

# requirements.txt
browser-use
langchain-anthropic
langchain-ollama

Docker Compose with Ollama

# docker-compose.yml
version: '3.8'
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama-data:/root/.ollama
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]  # If GPU available

  browser-agent:
    build: .
    environment:
      - OLLAMA_HOST=http://ollama:11434
    depends_on:
      - ollama

volumes:
  ollama-data:
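Inside the `browser-agent` container, `localhost:11434` points at the agent container itself, not at the `ollama` service. A minimal sketch of reading the `OLLAMA_HOST` variable set in the compose file and falling back to the local default; the helper name `ollama_base_url` is hypothetical.

```python
import os

DEFAULT_OLLAMA_URL = "http://localhost:11434"


def ollama_base_url(env=None) -> str:
    """Return the Ollama URL from OLLAMA_HOST, falling back to localhost."""
    env = os.environ if env is None else env
    url = env.get("OLLAMA_HOST", "").strip()
    # Drop a trailing slash so the result is consistent either way.
    return url.rstrip("/") if url else DEFAULT_OLLAMA_URL
```

The result can then be passed to the client shown earlier, for example `ChatOllama(model="llama3.2", base_url=ollama_base_url())`, so the same `agent.py` works both on a workstation and under Docker Compose.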

Run

# Build and run
docker-compose up -d

# View logs
docker-compose logs -f browser-agent

Use Cases

1. Web Scraping

agent = Agent(
    task="""
    Go to news.ycombinator.com
    Extract the top 30 stories with: title, points, comments, and URL
    Return as JSON array
    """,
    llm=llm,
)
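When a task asks for "JSON array" output, the model may wrap the array in explanatory text or a code fence. The sketch below assumes the agent's final answer is already available as a plain string (how to obtain that string depends on the installed Browser Use version) and pulls out the first JSON array it can decode. The function name is hypothetical.

```python
import json
import re
from typing import Any, List


def extract_json_array(text: str) -> List[Any]:
    """Return the first JSON array found in `text`, or raise ValueError."""
    decoder = json.JSONDecoder()
    # Try to decode starting at every "[" until one parses as a list.
    for match in re.finditer(r"\[", text):
        try:
            value, _ = decoder.raw_decode(text, match.start())
        except json.JSONDecodeError:
            continue
        if isinstance(value, list):
            return value
    raise ValueError("no JSON array found in text")
```

This returns the first array that decodes, so bracketed text such as `[1]` earlier in the answer would be picked up first. For stricter guarantees, a structured output schema is the more robust route.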

2. Form Automation

agent = Agent(
    task="""
    Go to example.com/contact
    Fill the form:
    - Name: John Doe
    - Email: john.doe@example.com
    - Message: I'm interested in your services
    Submit the form
    """,
    llm=llm,
)

3. Price Monitoring

agent = Agent(
    task="""
    Check the price of 'Sony WH-1000XM5' on:
    1. Amazon
    2. BestBuy
    3. Walmart
    Return prices from each site
    """,
    llm=llm,
)
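Prices returned by an agent usually arrive as display strings. A small post-processing sketch, assuming the per-site results have already been collected into a dictionary of strings; the function names and the sample values in the usage note are hypothetical.

```python
import re
from typing import Dict, Optional, Tuple


def parse_price(text: str) -> Optional[float]:
    """Extract the first dollar amount such as '$1,299.99' from text."""
    match = re.search(r"\$\s*(\d[\d,]*(?:\.\d{1,2})?)", text)
    if match is None:
        return None
    return float(match.group(1).replace(",", ""))


def cheapest(listings: Dict[str, str]) -> Optional[Tuple[str, float]]:
    """Return (site, price) for the lowest parseable price, or None."""
    parsed = {site: parse_price(text) for site, text in listings.items()}
    priced = {site: price for site, price in parsed.items() if price is not None}
    if not priced:
        return None
    site = min(priced, key=priced.get)
    return site, priced[site]
```

For example, `cheapest({"Amazon": "$348.00", "BestBuy": "$349.99", "Walmart": "unavailable"})` yields `("Amazon", 348.0)`. Sites whose text contains no dollar amount are skipped.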

4. Competitor Research

agent = Agent(
    task="""
    Visit competitor.com
    Extract:
    - Pricing tiers
    - Feature list
    - Customer testimonials
    Format as structured report
    """,
    llm=llm,
)

5. Data Entry

# Batch process data entry
data_entries = [
    {"name": "Product A", "price": 99.99},
    {"name": "Product B", "price": 149.99},
]

for entry in data_entries:
    agent = Agent(
        task=f"""
        Go to admin.example.com/products/new
        Add product: {entry['name']} with price ${entry['price']}
        Save and confirm
        """,
        llm=llm,
    )
    await agent.run()

Best Practices

1. Be Specific

# BAD - vague
agent = Agent(task="Find products", llm=llm)

# GOOD - specific
agent = Agent(
    task="Go to amazon.com, search for 'mechanical keyboard', filter by 4+ stars, extract top 5 with name and price",
    llm=llm,
)

2. Use Structured Output

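from typing import List
from pydantic import BaseModel

class SearchResult(BaseModel):
    title: str
    url: str
    snippet: str

# The task asks for five results, so wrap the item model in a list container
class SearchResults(BaseModel):
    results: List[SearchResult]

agent = Agent(
    task="Search Google for 'AI news' and get top 5 results",
    llm=llm,
    output_schema=SearchResults,  # Type-safe output
)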

3. Handle Authentication

# Option 1: Include credentials in task
agent = Agent(
    task="""
    Go to app.example.com/login
    Login with email 'user@example.com' and password 'secure123'
    Navigate to dashboard
    """,
    llm=llm,
)

# Option 2: Use cookies/session (more secure)
from browser_use import Agent, Browser

browser = Browser()
await browser.load_cookies("session_cookies.json")
agent = Agent(task="...", llm=llm, browser=browser)
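If credentials do have to appear in the task text, they can at least be kept out of source control by reading them from the environment. This is a minimal sketch; the variable names `APP_LOGIN_EMAIL` and `APP_LOGIN_PASSWORD` and the helper `build_login_task` are hypothetical.

```python
import os


def build_login_task(login_url: str, env=None) -> str:
    """Build a login task string from credentials stored in the environment."""
    env = os.environ if env is None else env
    try:
        email = env["APP_LOGIN_EMAIL"]
        password = env["APP_LOGIN_PASSWORD"]
    except KeyError as missing:
        # Fail early with a clear message instead of sending a broken task.
        raise RuntimeError(
            f"missing environment variable: {missing.args[0]}"
        ) from None
    return (
        f"Go to {login_url}\n"
        f"Login with email '{email}' and password '{password}'\n"
        "Navigate to dashboard"
    )
```

The password is still part of the prompt sent to the LLM, so this only removes it from the code. The cookie or session approach avoids exposing it to the model at all.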

4. Rate Limiting

import asyncio

async def run_with_rate_limit(tasks: list, rate_per_minute: int = 10):
    delay = 60 / rate_per_minute
    results = []

    for task in tasks:
        agent = Agent(task=task, llm=llm)
        result = await agent.run()
        results.append(result)
        await asyncio.sleep(delay)

    return results
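The loop above sleeps for a fixed delay after every task, so the time a task takes is added on top of the delay and the effective rate ends up below the target. A possible variant paces task start times instead. `run_paced` is a hypothetical name, and it takes zero-argument callables returning awaitables so it can be tried without a browser.

```python
import asyncio
import time
from typing import Awaitable, Callable, List, TypeVar

T = TypeVar("T")


async def run_paced(
    jobs: List[Callable[[], Awaitable[T]]],
    rate_per_minute: float = 10,
) -> List[T]:
    """Run jobs sequentially, starting at most rate_per_minute per minute."""
    interval = 60 / rate_per_minute
    results: List[T] = []
    next_start = time.monotonic()
    for job in jobs:
        # Sleep only for the part of the interval the last job did not use.
        wait = next_start - time.monotonic()
        if wait > 0:
            await asyncio.sleep(wait)
        next_start = time.monotonic() + interval
        results.append(await job())
    return results
```

With the names from this document, the jobs could be built as `[lambda t=t: Agent(task=t, llm=llm).run() for t in tasks]`. There is no trailing sleep after the final job.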

Comparison: Browser Use vs Stagehand

| Feature | Browser Use | Stagehand |
|---------|-------------|-----------|
| Language | Python | TypeScript |
| Self-Hosted | Yes | Yes |
| Local LLM | Yes (Ollama) | Limited |
| Speed | 3-5x optimized | 44% faster (v3) |
| Best For | Python scraping | TypeScript testing |
| Learning Curve | Easy | Medium |

When to use Browser Use:

Python projects
Need local LLM (Ollama)
Web scraping focus
Cost optimization (free with Ollama)

When to use Stagehand:

TypeScript/Node.js projects
Testing focus
Claude integration priority
Self-healing tests

References

references/browser-use-setup.md - Complete installation guide
references/llm-configuration.md - LLM setup for all providers

Browser Use gives you AI browser automation with full control - self-host with any LLM, no rate limits, no vendor lock-in.
