AutoResearch

GitHub Python No License

Description

An AI research agent by Andrej Karpathy that autonomously runs nanochat training experiments on a single GPU.

Key Features

  • Autonomous experiment loop - An AI agent modifies the training code, runs a 5-minute experiment, evaluates the result, and keeps or discards the change
  • Single-file agent editing - The agent modifies only train.py (model, optimizer, training loop), while humans edit the instructions in program.md
  • Fixed time budget - Each experiment runs for a fixed 5 minutes of wall-clock time, which keeps val_bpb (validation bits per byte) comparable across architectural changes
  • Nanochat-based training - A simplified single-GPU GPT training setup using Muon + AdamW optimizers; self-contained, with minimal dependencies
  • Markdown-driven research programming - Agent behavior is defined in program.md, so iterating on research strategy works like programming a research org
  • Extensible to multiple agents - Starts from a single-agent baseline; the design supports adding more agents to accelerate research
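
The keep-or-discard loop and the fixed time budget described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the project's implementation: run_experiment, propose_change, and research_loop are hypothetical names, the "training run" only waits out its budget, and the score is a made-up function of a single learning rate standing in for val_bpb.

```python
import random
import time


def run_experiment(config, budget_s):
    """Hypothetical stand-in for one training run.

    Loops until the wall-clock budget expires, then returns a simulated
    val_bpb (lower is better). A real run would train and evaluate a model.
    """
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        time.sleep(0.001)  # stand-in for one training step
    # Toy score with a made-up optimum at lr = 0.02 (assumption).
    return 1.0 + abs(config["lr"] - 0.02)


def propose_change(config, rng):
    """Hypothetical stand-in for the agent's edit: perturb one setting."""
    candidate = dict(config)
    candidate["lr"] = config["lr"] * rng.choice([0.5, 0.8, 1.25, 2.0])
    return candidate


def research_loop(config, n_experiments, budget_s, seed=0):
    """Run experiments under equal budgets, keeping only improvements."""
    rng = random.Random(seed)
    best_bpb = run_experiment(config, budget_s)  # baseline run
    history = [best_bpb]
    for _ in range(n_experiments):
        candidate = propose_change(config, rng)
        bpb = run_experiment(candidate, budget_s)
        if bpb < best_bpb:
            config, best_bpb = candidate, bpb  # keep the change
        # otherwise the candidate is discarded and config is unchanged
        history.append(best_bpb)
    return config, best_bpb, history
```

Because every candidate gets the same wall-clock budget, the scores are compared on equal footing, and the best score in the history never gets worse from one experiment to the next.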

Use Cases

💡 Run LLM training experiments overnight autonomously and review results in the morning
💡 Automate hyperparameter search for GPT architectures to find optimal configs for specific GPUs
💡 Research AI agent capabilities for autonomous scientific discovery by iterating on program.md
💡 Rapidly prototype and validate LLM training ideas in a single-GPU environment
💡 Learn LLM training workflows, model architecture, and optimizer mechanics hands-on

Quick Start

# Install uv package manager
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install dependencies
uv sync

# Download data and train tokenizer (one-time, ~2 min)
uv run prepare.py

# Manually run a single training experiment (~5 min)
uv run train.py

# Start autonomous research: point Claude/Codex at this repo and prompt "look at program.md and let's kick off a new experiment"
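
For scripting the manual run from Python, the sketch below launches a command with a timeout and reads the last reported metric from its output. It rests on assumptions: the log line format (`val_bpb: <number>` or `val_bpb=<number>`) is hypothetical and may not match what train.py actually prints, and parse_val_bpb and run_and_score are illustrative names, not part of the project.

```python
import re
import subprocess
import sys

# Assumed log format, e.g. "val_bpb: 1.234" or "val_bpb=1.234" (hypothetical).
VAL_BPB_RE = re.compile(r"val_bpb\s*[:=]\s*([0-9]*\.?[0-9]+)")


def parse_val_bpb(log_text):
    """Return the last val_bpb value found in the text, or None."""
    matches = VAL_BPB_RE.findall(log_text)
    return float(matches[-1]) if matches else None


def run_and_score(cmd, timeout_s):
    """Run a command; return its last val_bpb, or None on failure/timeout."""
    try:
        proc = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        return None
    if proc.returncode != 0:
        return None
    return parse_val_bpb(proc.stdout)


if __name__ == "__main__":
    # Fake "training script" so the sketch runs without a GPU or the repo.
    fake = [sys.executable, "-c", "print('val_bpb: 0.5')"]
    print(run_and_score(fake, timeout_s=30))
```

In the repo itself, the call would presumably look like run_and_score(["uv", "run", "train.py"], timeout_s=600), with the timeout set comfortably above the 5-minute training budget.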

Related Projects