Browser Agent

mcp · chrome-devtools · browser-debug +2

MCP server providing Chrome DevTools capabilities to coding agents, enabling web debugging, performance analysis, and DOM manipulation automation.

Scrapling

59.2k · Python

An adaptive web scraping framework that intelligently handles anti-bot measures, from single requests to full-scale crawls, designed for AI agent data collection.

browser · python · tools +2

AgenticSeek

26.4k · Python

browser-agent · coding-agent · local-ai +3

Fully local Manus AI alternative that autonomously browses the web, writes code, and interacts via voice, with no API costs.

Vision Agents

7.9k · Python

Open Vision Agents by Stream. Build voice and vision agents quickly with any model or video provider, using Stream's edge network for ultra-low latency realtime interactions.

voice · agent · python +3

ScaleCUA

1.1k · Python

Open-source computer-use agents that operate across platforms, including Windows, macOS, Ubuntu, and Android. ICLR 2026 Oral paper project.

browser · python · agent +1

Agent Reach

20.9k · Python

Give your AI agent eyes to see the entire internet. Read and search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu with one CLI and zero API fees.

Dev Browser

6.2k · TypeScript

A Claude Skill that gives your AI coding agent the ability to use a web browser for browser automation.

browser · agent · coding +2

Mote

83 · TypeScript

browser-automation · agent-framework · web-interaction +1

A lightweight AI browser automation agent framework providing a clean API for building web interaction automation tools.

TuriX CUA

3.0k · Python

Open-source Computer-Use-Agent that automates GUI interactions through natural language instructions, enabling intelligent desktop automation.

browser · agent · python +1

Vibium

2.8k · Go

browser-automation · web-agent · go +1

Browser automation tool for AI agents and humans, providing high-performance web interaction capabilities built in Go.

PyWinAssistant

1.3k · Python

The first open-source Artificial Narrow Intelligence generalist agent that fully operates GUIs using only natural language. Uses Visualization-of-Thought and Chain-of-Thought reasoning for spatial perception and HID simulation.

browser · agent · python +2

GitNexus

41.2k · TypeScript

The Zero-Server Code Intelligence Engine — a client-side knowledge graph creator running entirely in your browser with a built-in Graph RAG Agent for code exploration.

browser · agent · rag +2

Actionbook

1.5k · Rust

Let your AI agent use your browser. Actionbook makes browser automation actually work through natural language instructions.

rust · browser · agent +2

AgentGateway

3.0k · Rust

Next generation agentic proxy for AI agents and MCP servers. Provides unified traffic management, routing, and security control.

rust · mcp · agent +3

Page Agent

18.2k · TypeScript

browser · agent · typescript +2

Page Agent is a JavaScript in-page GUI agent by Alibaba that controls web interfaces with natural language, enabling automated form filling, page navigation, and element interaction.

Anchor Browser

400 · TypeScript

browser-infra · automation · agent-runtime

A browser runtime and control platform for AI agents, providing programmatic access to web sessions, page interactions, and automation workflows.

Bright Data MCP

2.4k · JavaScript

mcp · web-scraping · data-extraction +2

Powerful MCP server providing all-in-one public web access for AI agents with web scraping and structured data extraction.

Browser Use Agent SDK

685 · Python

browser-automation · agent-sdk · web-interaction

An agent SDK from the browser-use team: a toolkit that lets developers quickly build browser automation agents that interact with the web.

browser-harness

14.3k · Python

browser · agent · automation +2

Self-healing harness that enables LLMs to complete any task.

browser-use

96.8k · Python

browser · agent · automation +1

browser-use enables browser automation for agents, allowing LLMs to understand pages and perform complex web interactions.

Vibetest Use

796 · Python

qa-testing · browser-use · mcp +2

An MCP tool that uses Browser-Use agents for automated, browser-based QA testing.

Browser Use Web UI

16.0k · Python

A web interface for running AI agents in the browser, providing a visual experience for browser automation operations.

Workflow Use

4.0k · Python

browser-agent · workflow · automation

An automation workflow project in the browser-use ecosystem that enables AI agents to operate browsers and complete multi-step web tasks.

Browserable

1.2k · JavaScript

browser-automation · self-hosted · docker +3

Browserable is a self-hostable browser automation tool purpose-built for AI agents. It provides secure Docker-based browser environments with a JavaScript SDK, and reports 90.4% accuracy on the WebVoyager benchmark for autonomous web navigation.

MCP Server Browserbase

3.4k · TypeScript

mcp-server · browser-automation · cloud-browser +1

Browserbase MCP server allows LLMs to control a browser with Browserbase and Stagehand, providing cloud-based browser automation capabilities for AI agents including web interaction, data scraping, and automated testing.

Open Operator

1.9k · TypeScript

browser · agent · typescript +2

An open-source template for building web agents with Stagehand on Browserbase, providing serverless browser automation for AI agents to safely execute web tasks in the cloud.

Stagehand

22.9k · TypeScript

browser-agent · sdk · web-automation +2

The SDK for browser agents by Browserbase. Provides act, extract, and observe primitives for AI agents to naturally browse and interact with web pages.

Browserless

13.3k · TypeScript

typescript · browser · tools +2

Deploy headless browsers in Docker. Run on cloud or bring your own infrastructure. Provides powerful web automation and rendering capabilities for AI agents. Free for non-commercial uses.

BrowserMCP

6.6k · TypeScript

mcp · browser-automation · browser-extension +2

BrowserMCP is a browser extension-based MCP server that allows AI applications like Claude and Cursor to directly control and automate your browser.

BrowserOS

11.2k · TypeScript

browser · agent · typescript +3

The open-source Agentic browser that transforms your browser into an AI-powered operating system. Alternative to ChatGPT Atlas, Perplexity Comet, and Dia.

BrowserWing

1.3k · Go

browser-agent · mcp · browser-automation +2

BrowserWing turns browser actions into MCP commands or Claude Skills, allowing AI agents to control browsers efficiently and reliably with reduced dependency on heavy LLM interactions.

UI-TARS Desktop

35.9k · TypeScript

multimodal-agent · gui-automation · computer-use +2

ByteDance's open-source multimodal AI agent stack connecting cutting-edge AI models with agent infrastructure for GUI automation and computer control.

SmolVM

572 · Python

sandbox · code-execution · browser-use +2

Open-source AI sandbox infrastructure for code execution, browser use, and AI agent runtimes.

Open Computer Use

725 · TypeScript

typescript · agent · browser +2

A computer-use agent reporting a state-of-the-art 82% on OSWorld-Verified; fully open-source, safe, auditable, and production-ready for desktop automation.

Windows MCP

5.8k · Python

mcp · windows · desktop-automation +2

Windows MCP is an MCP server for the Windows desktop, providing AI agents with computer-use capabilities for desktop automation and system operations.

Dendrite Python SDK

310 · Python

browser-agent · python-sdk · web-extraction

A Python SDK for AI browser automation that enables models to locate elements, perform web actions, and extract structured data from web pages.

DO Browser

2.8k · TypeScript

DO Browser is a browser-task agent tool focused on page understanding, action planning, and automation, serving as a lighter alternative to browser-use or Stagehand.

browser · automation · web +1

Deep Research

19.0k · TypeScript

deep-research · web-scraping · ai-research +2

AI-powered research assistant that performs iterative deep research on any topic by combining search engines, web scraping, and LLMs.

Open Computer Use

2.1k · Python

computer-use · sandbox · e2b +1

AI computer use powered by open source LLMs and E2B Desktop Sandbox.

BB Browser

5.6k · TypeScript

An MCP server and CLI that turns the browser into an API, allowing AI agents to control Chrome with existing login sessions for web operations, data scraping, and automation tasks without re-authentication.

mcp · tools · browser +2

MCP Playwright

5.5k · TypeScript

mcp · playwright · browser-automation +2

Playwright Model Context Protocol server for automating browsers and APIs in Claude Desktop, Cline, Cursor IDE, and other AI coding tools.

Firecrawl

127.8k · TypeScript

web-scraping · search-engine · markdown +2

Firecrawl is a web scraping and search engine designed for AI agents, converting any webpage into structured Markdown data with search, scrape, and clean capabilities for building web-data-powered AI applications.

Firecrawl Web Agent

1.1k · TypeScript

web-scraping · data-extraction · browser-agent +2

Open-source web data agent optimized for structured web research, capable of autonomously browsing websites and extracting structured data.

Cappuccino

44 · Python

web-agent · browser-automation · benchmark

A research project exploring how models understand web interfaces, decompose action steps, and complete complex online tasks through browser agent capabilities.

PPT Master

23.8k · Python

AI-powered PPT generation tool that creates natively editable PPTX from any document, producing real PowerPoint shapes instead of images.

HyperAgent

1.4k · TypeScript

browser-automation · playwright · ai-agent +3

HyperAgent is a Playwright-based AI browser automation framework offering high-level APIs like page.ai(), page.perform(), and page.extract(). It features built-in MCP client support and action caching, enabling AI agents to browse, interact, and extract data using natural language.

Camofox Browser

6.2k · JavaScript

anti-detection · browser-automation · camoufox +3

Camofox Browser is a headless browser automation server powered by Camoufox, a Firefox fork with C++-level fingerprint spoofing. It bypasses Google, Cloudflare, and most bot detection, providing token-efficient accessibility snapshots and stable element references for AI agents.

LaVague

6.4k · Python

browser · web-agent · large-action-model +2

LaVague is a Large Action Model (LAM) framework for developing AI web agents, combining RAG techniques for natural-language-driven browser automation.

Lightpanda Browser

30.8k · Zig

headless-browser · automation · web-agent

A lightweight browser runtime designed for automation and scraping scenarios, offering lower overhead than traditional browsers for headless tasks.

Index

2.3k · Python

The SOTA open-source browser agent for autonomously performing complex tasks on the web with natural language-driven web automation.

python · browser · agent +2

Auto Browser

535 · Python

browser-agent · mcp · human-in-the-loop +2

An MCP-native browser agent that gives AI systems a real browser for web tasks while keeping a human in the loop.

Magnitude

4.1k · TypeScript

vision-first · browser-automation · web-agent +2

An open-source, vision-first browser agent that drives web automation through visual understanding, supporting complex web interaction tasks for QA testing and workflow automation.

Terminator

1.5k · Rust

desktop-automation · windows · accessibility +2

Playwright-style automation for the Windows desktop, enabling AI agents to control desktop applications through natural language.

UFO

8.8k · Python

gui-agent · windows · automation +1

UFO is a Windows GUI automation agent by Microsoft that understands screen interfaces and executes complex OS tasks through natural language commands.

Webwright

4.9k · Python

browser · agent · automation +2

A simple SWE-style browser agent framework that achieves SOTA results on long-horizon web tasks.

Fara

5.4k · Python

browser-agent · web-automation · research

Microsoft's open-source browser and web task agent that uses large models to understand pages, plan actions, and complete real web workflows.

Magentic-UI

9.9k · Python

A research prototype of a human-centered web agent from Microsoft Research, emphasizing human-in-the-loop interaction for collaborative web browsing and data collection tasks.

browser · agent · python +1

Windows Agent Arena

863 · Python

benchmark · computer-use · windows +1

Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking of multi-modal AI agents.

WebQA Agent

215 · Python

browser-agent · web-testing · qa +2

An autonomous web browser QA agent that evaluates performance, functionality, and user experience through GUI or CLI workflows.

Mobile Use

2.6k · Python

Framework enabling AI agents to use real Android and iOS apps just like a human, supporting autonomous operation and interaction with mobile interfaces.

NanoBrowser

13.1k · TypeScript

browser · automation · chrome-extension +2

NanoBrowser is an open-source Chrome extension for AI-powered multi-agent browser automation, supporting web task workflows with your own LLM API key.

Notte

2.0k · Python

browser · web-agent · automation +1

Notte is a framework for building web agents and deploying serverless browser automation functions, providing reliable browser infrastructure and web-aware agent capabilities.

OpenBrowser

9.5k · TypeScript

browser-agent · web-automation · playwright +2

AI-powered autonomous web browsing framework that enables agents to click, type, navigate, and extract data like a human, with support for OpenAI, Anthropic, and Google models.

OpenAdapt

1.6k · Python

computer-use · automation · desktop +1

OpenAdapt is an open-source agent tool for desktop automation and computer-use scenarios, capturing user interactions, replaying tasks, and enabling GUI automation workflows.

OpenAI CUA Sample App

1.7k · TypeScript

Official sample application for OpenAI Computer Using Agent (CUA). Learn how to use CUA via the API on multiple computer environments.

agent · browser · openai +2

Mind2Web

999 · Jupyter Notebook

The first LLM-based web agent and benchmark for generalist web agents, providing datasets, evaluation frameworks and baseline methods for building agents that operate on real websites.

web-agent · benchmark · llm +2

SeeAct

845 · Python

web-agent · multimodal · llm +2

A system for generalist web agents that autonomously carry out tasks on any given website, leveraging large multimodal models like GPT-4V.

Oxylabs AI Studio

2.9k · Python

web-scraping · browser-agent · ai-scraper +3

Oxylabs AI Studio Python SDK provides an all-in-one AI-powered web scraping toolkit integrating an AI scraper, crawler, browser agent, search engine, and sitemap tool for structured data extraction driven by natural language instructions.

Oxylabs Browser Agent

1.2k · Unknown

An advanced browser AI tool developed by Oxylabs AI Studio that automates real user browsing tasks using natural language instructions.

browser · agent · tools +1

Chrome CDP Skill

3.1k · JavaScript

Give AI agents access to your live Chrome session. Works out of the box, connects to tabs you already have open.

chrome · browser · cdp +2

Autotab

1.0k · Python

Open-source framework for building browser agents for real-world tasks, learning from user demonstrations to automate web interactions.

python · browser · agent +2

Surf

53 · Python

chat-ui · browser-agent · memory +2

A self-hosted AI chat platform with a web UI and terminal CLI, supporting any model, web search, browser-agent automation, persistent memory, and analytics.

Rebrowser Patches

1.4k · JavaScript

playwright · browser-automation · anti-detection

Anti-detection patches for Playwright and browser automation scenarios, helping automated browsers appear more like real user sessions.

Playwriter

3.6k · HTML

mcp · playwright · browser-automation +2

Chrome extension & CLI to let agents control your browser. Runs Playwright snippets in a stateful sandbox. Available as CLI or MCP.

Bananalyzer

328 · Python

agent-evaluation · web-tasks · benchmark +2

Open source AI Agent evaluation framework for web tasks to measure and compare AI agent performance on web operations.

AgentLab

585 · Python

web-agent · benchmark · evaluation +2

An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and reproducibility.

Awesome GUI Agent

1.2k · Unknown

browser · agent · evaluation +1

A curated list of papers and resources for multi-modal Graphical User Interface agents, systematically covering computer use, mobile interaction and more.

Computer Use OOTB

1.9k · Python

computer-use · gui-agent · desktop +1

Out-of-the-box (OOTB) GUI Agent for Windows and macOS.

ShowUI

1.8k · Python

ShowUI is an open-source, end-to-end Vision-Language-Action model for GUI agents and computer use, capable of understanding screenshots and executing precise interface interactions.

browser · agent · llm +2

Agent S

11.7k · Python

computer-use · gui-agent · automation +1

Open-source agentic framework that uses computers like a human, capable of completing complex GUI tasks with autonomous learning and experience accumulation.

Skyvern

21.8k · Python

Skyvern is an agent platform for browser task automation, using page understanding and action planning to complete complex web workflows such as forms and back-office tasks.

browser · automation · web +1

Browser Use Steel

180 · Python

browser-use · cloud-browser · automation

A project combining browser-use agent control with Steel's cloud browser infrastructure for scalable web automation.

Steel Browser

7.1k · TypeScript

browser-automation · browser-sandbox · anti-detection +3

Steel Browser is an open-source browser sandbox purpose-built for AI agents and applications. It provides a full browser API with session management, proxy integration, and built-in anti-detection, enabling web automation without infrastructure headaches.

Computer Agent

644 · Rust

computer-use · desktop · rust +1

Desktop app to control your computer with AI using your terminal, browser, mouse & keyboard.

AppAgent

6.8k · Python

multimodal · smartphone · gui-agent +3

AppAgent is an LLM-based multimodal agent framework designed to operate smartphone apps like a human, supporting touch interaction and autonomous exploration.

Hercules

1.0k · Python

testing-agent · browser-testing · e2e-testing +3

The first open-source testing agent that enables UI, API, security, accessibility, and visual validations without writing code or maintaining tests.

OpenAgent

5.1k · Go

personal-assistant · browser-agent · computer-use +4

Next-generation personal AI assistant powered by LLM, RAG and agent loops, supporting computer-use, browser-use and coding agent capabilities.

AgentQL

1.4k · Python

web-scraping · browser-automation · playwright +2

A suite of tools for connecting AI to the web with a query language and Playwright integrations for precise, scalable web element interaction and data extraction.

CUA

17.5k · HTML

computer-use · desktop-automation · sandbox +2

CUA provides open-source infrastructure for Computer-Use Agents, including sandboxes, SDKs, and benchmarks to train and evaluate AI agents that control full desktops (macOS, Linux, Windows).

Agent Browser

35.0k · Rust

browser-automation · cli · rust +1

An open-source browser automation CLI for AI agents by Vercel, built with Rust for high performance and programmability.

WebArena

1.5k · Python

benchmark · web-agent · evaluation +3

WebArena is a realistic benchmark environment for evaluating autonomous web agents. It provides Gym-like interactive website simulations covering e-commerce, forums, CMS, and more, enabling end-to-end task evaluation as a standard framework for web agent research.

Midscene.js

13.6k · TypeScript

browser-automation · ui-testing · vision +3

AI-powered, vision-driven UI automation that lets you describe actions in natural language instead of writing selectors, supporting browser and mobile platforms.

OpenBrowse

58 · TypeScript

browser-agent · macos · automation +1

A macOS browser agent that completes web tasks through autonomous execution, chat-based clarification, and resumable local workflows.

OpenCUA

775 · Python

Open Foundations for Computer-Use Agents. Provides datasets, benchmarks, and foundation models for training and evaluating AI agents that control desktop environments.

python · agent · browser +2

autoMate

3.9k · Python

computer-use · desktop-automation · rpa +2

An AI-driven local automation assistant in the style of Manus: a computer-use agent that carries out tasks on your computer autonomously from natural-language instructions.

Open-AutoGLM

25.4k · Python