AgentList

Windows Agent Arena

Normal
GitHub Python MIT

Description

Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking multi-modal AI agents.

Tags

benchmark computer-use windows python

Categories

📊 Observability 🌐 Browser Agent

Project Metrics

Stars 863
Forks 94
Watchers 863
Issues 35
Created July 29, 2024
Last commit April 13, 2026

Deployment

Local

Related Projects

Bananalyzer

328 · Python
Stale

An open-source evaluation framework that measures and compares AI agent performance on web tasks.

agent-evaluation web-tasks benchmark +2

WebQA Agent

215 · Python
Active

An autonomous web browser QA agent that evaluates performance, functionality, and user experience through GUI or CLI workflows.

browser-agent web-testing qa +2

LM Evaluation Harness

12.8k · Python
Active

A framework by EleutherAI for few-shot evaluation of language models. It provides standardized evaluation pipelines, supports hundreds of benchmark tasks, and is widely adopted in the community as a core LLM evaluation tool.

llm-evaluation benchmark evaluation-framework +2

Windows MCP

5.8k · Python
Active

Windows MCP is an MCP server for the Windows desktop, providing AI agents with computer-use capabilities for desktop automation and system operations.

mcp windows desktop-automation +2
© 2026 AgentList. All rights reserved.