SWE-Lancer Benchmark

Stale

GitHub · Unknown · No License

Description

SWE-Lancer is an OpenAI benchmark dataset that evaluates frontier language models on real freelance software engineering tasks, ranging from simple bug fixes to complex feature development.

Related Projects

Augment SWE-bench Agent

873 · Python

Stale

Augment SWE-bench Agent is presented as the top-ranked open-source implementation on SWE-bench Verified, demonstrating how to build a high-performance software engineering agent that automatically resolves GitHub issues.

coding · python · agent +2

AutoCodeRover

3.1k · Python

Stale

AutoCodeRover is an autonomous software engineering agent that is aware of project structure. It performs automated program repair and issue resolution by understanding the overall codebase architecture.

coding · python · agent +2

Micro Agent

4.3k · TypeScript

Stale

An AI agent that writes actually useful code for you by writing tests first, then generating code to pass them.

typescript · coding · agent +2

DeepCode

15.8k · Python

Active

DeepCode is an open agentic coding platform supporting Paper2Code, Text2Web, and Text2Backend, leveraging agent technology for automated software development workflows.

coding · python · llm +2
