AgentList

WebArena

Stale
GitHub Python Apache-2.0

Description

WebArena is a realistic benchmark environment for evaluating autonomous web agents. It provides Gym-like interactive simulations of websites, including e-commerce stores, forums, and content management systems (CMS), so that agents can be evaluated end to end on complete tasks. It serves as a standard framework for web agent research.

Tags

benchmark web-agent evaluation e2e-testing research python

Categories

🌐 Browser Agent

Project Metrics

Stars 1.5k
Forks 241
Watchers 1.5k
Issues 95
Created July 24, 2023
Last commit November 26, 2025

Deployment

Local

Related Projects

Mind2Web

999 · Jupyter Notebook
Stale

The first dataset and benchmark for generalist LLM-based web agents, providing datasets, evaluation frameworks, and baseline methods for building agents that operate on real websites.

web-agent benchmark llm +2

AgentLab

585 · Python
Normal

An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and reproducibility.

web-agent benchmark evaluation +2

Cappuccino

44 · Python
Stale

A research project exploring how models understand web interfaces, decompose action steps, and complete complex online tasks through browser agent capabilities.

web-agent browser-automation benchmark

LaVague

6.4k · Python
Stale

LaVague is a Large Action Model (LAM) framework for developing AI web agents. It uses retrieval-augmented generation (RAG) techniques to drive browser automation from natural-language instructions.

browser web-agent large-action-model +2