AgentList

SWE-bench

Normal
GitHub · Python · MIT

Description

SWE-bench is a benchmark for evaluating language models on real-world GitHub issue resolution. It draws genuine problems from popular Python repositories and has become a core standard for measuring AI coding agent capabilities.

Tags

evaluation, python, coding, agent, testing

Categories

💻 Coding Agent

Project Metrics

Stars 5.1k
Forks 878
Watchers 5.1k
Issues 106
Created October 4, 2023
Last commit April 1, 2026

Deployment

Local

Related Projects

Augment SWE-bench Agent

873 · Python
Stale

Augment SWE-bench Agent is billed as the top-ranked open-source implementation for SWE-bench Verified. It demonstrates how to build high-performance software engineering agents that automatically resolve GitHub issues.

coding, python, agent +2

AutoCodeRover

3.1k · Python
Stale

AutoCodeRover is a project-structure-aware autonomous software engineering agent that performs automated program repair and issue resolution by understanding the overall codebase architecture.

coding, python, agent +2

Micro Agent

4.3k · TypeScript
Stale

An AI agent that writes actually useful code for you by writing tests first, then generating code to pass them.

typescript, coding, agent +2

DeepCode

15.8k · Python
Active

DeepCode is an open agentic coding platform supporting Paper2Code, Text2Web, and Text2Backend, leveraging agent technology for automated software development workflows.

coding, python, llm +2

© 2026 AgentList. All rights reserved.
