SWE-Lancer Benchmark
Status: inactive

Overview
SWE-Lancer is an OpenAI benchmark dataset evaluating frontier language models on freelance software engineering tasks, covering real scenarios from simple bug fixes to complex feature development.
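A benchmark of this kind boils down to a loop that asks a model to solve each task and grades the result. The sketch below is purely illustrative and assumes hypothetical names (`Task`, `solve`, `passes_tests`); it is not the actual SWE-Lancer harness or API.

```python
# Hypothetical sketch of a pass/fail benchmark evaluation loop.
# None of these names come from the real SWE-Lancer codebase.
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    description: str

def solve(task: Task) -> str:
    # Placeholder: a real harness would query a language model
    # for a code patch addressing the task description.
    return f"patch for {task.task_id}"

def passes_tests(task: Task, patch: str) -> bool:
    # Placeholder grader: a real harness would apply the patch
    # to the repository and run end-to-end tests.
    return bool(patch)

tasks = [
    Task("bug-001", "fix a crash on startup"),
    Task("feat-002", "add an export button"),
]
resolved = sum(passes_tests(t, solve(t)) for t in tasks)
print(f"resolved {resolved}/{len(tasks)} tasks")
```

In a real harness the grading step dominates: each task ships with its own test environment, and the model's patch either makes the tests pass or it does not.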
Augment SWE-bench Agent is the top-ranked open-source implementation on SWE-bench Verified, demonstrating how to build a high-performance software engineering agent that automatically resolves GitHub issues.
AutoCodeRover is a project structure-aware autonomous software engineer agent that achieves automated program repair and issue resolution by understanding the overall codebase architecture.
DeepCode is an open agentic coding platform supporting Paper2Code, Text2Web, and Text2Backend, leveraging agent technology for automated software development workflows.
Kodezi Chronos is a debugging-first language model that reports state-of-the-art performance on SWE-bench and is designed to autonomously handle software debugging and code repair tasks.