Megatron-LM

Active
GitHub Python NOASSERTION

Description

NVIDIA's open-source, GPU-optimized library for training transformer models at scale. It provides tensor, pipeline, sequence, and context parallelism along with mixed-precision training (including FP8 and FP4), and targets LLM training runs up to the trillion-parameter range.

Key Features

  • Tensor parallelism (TP): Shard the weights of individual transformer layers across GPUs to reduce per-GPU memory
  • Pipeline parallelism (PP): Split the model's layers into sequential stages placed on different GPUs, for models too large for a single device
  • Context parallelism (CP): Split long input sequences across GPUs along the sequence dimension, for training at context lengths up to the million-token range
  • Mixed-precision training: FP16, BF16, FP8, and FP4 low-precision support
  • Mixture of Experts (MoE): Native support for DeepSeek-V3, Mixtral, and other MoE architectures
  • Megatron Bridge: Bidirectional checkpoint conversion between Megatron and Hugging Face formats
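
To make the tensor-parallelism idea above concrete, here is a minimal sketch of column-wise sharding for a single linear layer. It runs in plain Python on the CPU and only simulates the ranks. The helper names (`matmul`, `split_columns`, `column_parallel_linear`) are hypothetical and are not part of Megatron-LM's API, which works on GPU tensors with collective communication.

```python
# Toy illustration of column-wise tensor parallelism (pure Python, no GPUs).
# All function names are hypothetical, not Megatron-LM APIs.

def matmul(x, w):
    """Multiply a (rows x k) matrix by a (k x cols) matrix, both as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, tp_size):
    """Split the columns of w into tp_size equal, contiguous shards."""
    cols = len(w[0])
    if cols % tp_size != 0:
        raise ValueError("column count must be divisible by tp_size")
    width = cols // tp_size
    return [[row[r * width:(r + 1) * width] for row in w] for r in range(tp_size)]

def column_parallel_linear(x, w, tp_size):
    """Each simulated rank multiplies x by its own shard of w."""
    partials = [matmul(x, shard) for shard in split_columns(w, tp_size)]
    # Concatenate per-rank outputs along the column axis
    # (on real hardware this step would be an all-gather).
    return [sum((p[i] for p in partials), []) for i in range(len(x))]

x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]

full = matmul(x, w)                                # unsharded reference
sharded = column_parallel_linear(x, w, tp_size=2)  # two simulated ranks
print(sharded == full)
```

Each simulated rank stores only half of the weight columns, which is where the per-GPU memory saving comes from. The concatenated result matches the unsharded product.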

Use Cases

💡 Large language model pretraining: Train trillion-parameter models across thousands of GPUs
💡 MoE model training: Efficiently train DeepSeek-V3, Mixtral, and other mixture-of-experts models
💡 Ultra-long context training: Pretrain on sequences at the million-token scale
💡 Custom training frameworks: Build tailored training systems from Megatron Core modular components
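
When planning a run across many GPUs, it helps to see how the parallelism degrees combine. The sketch below assumes the usual convention that the data-parallel size is the total GPU count divided by the product of the tensor-, pipeline-, and context-parallel sizes. The function name and the example numbers are hypothetical, and expert parallelism for MoE models is deliberately left out.

```python
# Back-of-the-envelope sizing for a multi-GPU run.
# Assumption: world_size = TP * PP * CP * DP. Example numbers are made up.

def data_parallel_size(world_size, tp=1, pp=1, cp=1):
    """Return the number of data-parallel replicas for a given GPU count."""
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by TP*PP*CP={model_parallel}"
        )
    return world_size // model_parallel

# 1024 GPUs with TP=8, PP=4, CP=2 leave 1024 / 64 = 16 data-parallel replicas.
dp = data_parallel_size(1024, tp=8, pp=4, cp=2)
print(dp)
```

A GPU count that is not divisible by the product of the parallel sizes raises an error rather than rounding silently, since such a layout cannot be mapped onto the available devices.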

Quick Start

# Install Megatron Core
uv pip install megatron-core

# Or install from source
git clone https://github.com/NVIDIA/Megatron-LM.git
cd Megatron-LM && uv pip install -e .

# Run the GPT pretraining script from the repository root, where
# pretrain_gpt.py lives (multi-GPU runs are typically launched with torchrun)
python pretrain_gpt.py ...

# See docs
# https://docs.nvidia.com/megatron-core/developer-guide/latest/get-started/quickstart.html

Related Projects