AgentList

SWE-bench

Active
GitHub Python MIT

Description

SWE-bench is a benchmark for evaluating language models on real-world GitHub issue resolution. Its tasks are drawn from genuine issues in popular Python repositories, and it has become a core standard for measuring the capabilities of AI coding agents.

Tags

evaluation python coding agent testing

Categories

💻 Coding Agent

Project Metrics

Stars 4.7k
Forks 829
Watchers 4.7k
Issues 104
Created October 4, 2023
Last commit April 1, 2026

Deployment

Local

Related Projects

Augment SWE-bench Agent

867 · Python
Stale

Augment SWE-bench Agent is presented as the top-ranked open-source implementation on SWE-bench Verified. It demonstrates how to build a high-performance software engineering agent that resolves GitHub issues automatically.

coding python agent +2

AutoCodeRover

3.1k · Python
Stale

AutoCodeRover is an autonomous software engineering agent that is aware of project structure. It performs automated program repair and issue resolution by drawing on an understanding of the overall codebase architecture.

coding python agent +2

DeepCode

15.2k · Python
Active

DeepCode is an open agentic coding platform supporting Paper2Code, Text2Web, and Text2Backend, leveraging agent technology for automated software development workflows.

coding python llm +2

Kodezi Chronos

5.0k · Java
Stale

Kodezi Chronos is a debugging-first language model achieving state-of-the-art performance on SWE-bench, capable of autonomously handling software debugging and code repair tasks.

coding java llm +2

© 2026 AgentList. All rights reserved.
