browser-use

Status: Active
GitHub · Python · MIT license

Description

browser-use enables browser automation for AI agents: it lets LLMs read page content and carry out multi-step web interactions.

Key Features

  • LLM-driven browser automation - let LLMs understand page content and perform clicks, typing, and navigation
  • Multi-model support - built-in ChatBrowserUse plus Gemini, Claude, GPT and other model backends
  • CLI commands - open/state/click/type/screenshot commands for fast browser control from the terminal
  • Custom tool extension - register custom Python functions as callable tools for the agent
  • Cloud stealth browsers - optional Browser Use Cloud for proxy rotation and captcha solving
  • Template scaffolding - generate ready-to-run agent templates via uvx browser-use init
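
The custom tool extension idea above can be illustrated with a small, library-independent sketch. It shows the general pattern of registering plain Python functions under a name and description so that an agent loop can look them up and call them. ToolRegistry, its action decorator, and parse_price are hypothetical names invented for this illustration; they are not browser-use's actual API, so check the project's documentation for the real registration mechanism.

```python
import inspect
from typing import Any, Callable, Dict


class ToolRegistry:
    """Hypothetical registry for illustration only; not part of browser-use."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}
        self._descriptions: Dict[str, str] = {}

    def action(self, description: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Decorator that registers a function under its own name."""
        def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[func.__name__] = func
            self._descriptions[func.__name__] = description
            return func  # the function stays usable as a normal function
        return decorator

    def describe(self) -> Dict[str, str]:
        """Map each tool name to its description plus signature, e.g. for a model prompt."""
        return {
            name: f"{self._descriptions[name]} {inspect.signature(func)}"
            for name, func in self._tools.items()
        }

    def call(self, name: str, **kwargs: Any) -> Any:
        """Dispatch a call by tool name, as an agent loop might do."""
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name](**kwargs)


registry = ToolRegistry()


@registry.action("Convert a price string such as '$1,299.00' to a float")
def parse_price(text: str) -> float:
    # Strip the currency symbol and thousands separators before parsing.
    return float(text.replace("$", "").replace(",", ""))


# An agent that decided to use the tool would dispatch by name with keyword arguments.
result = registry.call("parse_price", text="$1,299.00")
```

Keeping the description next to the function means the text shown to the model and the code that runs cannot drift apart, which is the main reason decorator-style registration is a common choice for this kind of extension point.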

Use Cases

💡 Automate repetitive web tasks like job application forms and grocery ordering
💡 Build web scraping agents that navigate pages and extract target information
💡 Add browser operation capabilities to AI coding assistants like Claude Code
💡 Run end-to-end testing and UI verification for web applications at scale
💡 Create personal assistants for complex tasks like PC part comparison and flight search

Quick Start

# Install (requires Python >= 3.11)
uv init && uv add browser-use && uv sync

# Write your agent
# agent.py
from browser_use import Agent, Browser, ChatBrowserUse
import asyncio

async def main():
    agent = Agent(
        task="Find the number of stars of the browser-use repo",
        llm=ChatBrowserUse(),
        browser=Browser(),
    )
    await agent.run()

asyncio.run(main())
