Weights & Biases

Active

Description

Weights & Biases (W&B) is a platform for experiment tracking, visualization, and collaboration on ML and LLM applications. It covers agent training and evaluation, hyperparameter management, and model registry workflows.

Key Features

Experiment tracking: Automatically logs hyperparameters, metrics, system resource usage, and code versions, with side-by-side run comparison
W&B Models: Model artifact registry with versioning and promotion to production
W&B Weave: LLM and agent tracing with prompt evaluation, conversation replay, and quality scoring
Sweeps hyperparameter search: Built-in Bayesian, grid, and random search for finding good hyperparameter combinations at scale
Team collaboration: Comments and access control on shared experiment reports and dashboards
Reports and dashboards: Drag-and-drop authoring of publishable experiment reports with embedded charts and interactive components
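
As a rough illustration of the Sweeps feature listed above, the sketch below defines a Bayesian search and runs a handful of trials. The project name, the parameter ranges, and the toy loss formula are assumptions made for the example, not values from W&B. The wandb import is done inside the functions so the configuration can be inspected without the package installed.

```python
from typing import Any, Dict


def build_sweep_config() -> Dict[str, Any]:
    """Return a Bayesian sweep definition in the W&B sweep-config format."""
    return {
        "method": "bayes",  # "grid" and "random" are the other built-in methods
        "metric": {"name": "loss", "goal": "minimize"},
        "parameters": {
            # Ranges below are illustrative assumptions, not recommendations.
            "lr": {"distribution": "log_uniform_values", "min": 1e-5, "max": 1e-2},
            "batch_size": {"values": [16, 32, 64]},
        },
    }


def train() -> None:
    """One sweep trial: read the sampled config and log a placeholder metric."""
    import wandb  # lazy import: only needed when a trial actually runs

    with wandb.init() as run:
        lr = run.config["lr"]
        batch_size = run.config["batch_size"]
        # Hypothetical stand-in for a real training loop.
        loss = (lr - 1e-3) ** 2 + 1.0 / batch_size
        run.log({"loss": loss})


def run_sweep(project: str = "agent-eval", trials: int = 10) -> None:
    """Register the sweep with W&B and run `trials` trials in this process."""
    import wandb

    sweep_id = wandb.sweep(sweep=build_sweep_config(), project=project)
    wandb.agent(sweep_id, function=train, count=trials)
```

Calling run_sweep() after wandb login starts the search. The metric name in the sweep config must match a key passed to run.log, otherwise the Bayesian optimizer has nothing to optimize.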

Use Cases

💡 Track agent training and fine-tuning experiments, comparing different models and hyperparameter combinations

💡 Use Weave to record LLM call traces, debug agent decision chains, and evaluate output quality

💡 Manage prompt engineering experiments for agents with prompt versioning and evaluation scores

💡 Share experiment reports and dashboards across teams to standardize agent R&D workflows

💡 Register trained models in W&B Artifacts and publish them to production inference services
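
The model-registration use case above might look roughly like the following sketch, which writes a toy checkpoint and uploads it as a versioned artifact. The artifact name "agent-policy", the "candidate" alias, the job type, and the JSON checkpoint format are all hypothetical choices for illustration. How an artifact reaches a production inference service depends on your deployment setup and is not shown.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def save_checkpoint(weights: dict, directory: Optional[Path] = None) -> Path:
    """Write a toy checkpoint as JSON and return its path.

    A real project would save framework-specific weights instead.
    """
    target_dir = Path(directory) if directory is not None else Path(tempfile.mkdtemp())
    path = target_dir / "model.json"
    path.write_text(json.dumps(weights))
    return path


def log_checkpoint(path: Path, project: str = "agent-eval") -> None:
    """Upload the checkpoint file as a versioned W&B artifact of type 'model'."""
    import wandb  # lazy import: only needed when uploading

    with wandb.init(project=project, job_type="register-model") as run:
        # "agent-policy" is a hypothetical artifact name.
        artifact = wandb.Artifact(name="agent-policy", type="model")
        artifact.add_file(str(path))
        # The alias is an assumed convention for marking a release candidate.
        run.log_artifact(artifact, aliases=["candidate"])
```

Logging the same artifact name again creates a new version rather than overwriting the old one, so earlier checkpoints stay retrievable.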

Quick Start

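Install the client and authenticate (shell):

pip install wandb
wandb login

Log a run (Python):

import wandb

# Start a run; config records the hyperparameters for later comparison.
wandb.init(project='agent-eval', config={'lr': 0.001, 'model': 'claude-sonnet-4-6'})
for step in range(100):
    # Placeholder metrics; replace with values from your training or eval loop.
    wandb.log({'loss': 1.0 / (step + 1), 'accuracy': 0.9 + 0.001 * step})
wandb.finish()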


Related Projects

Blaxel AI SDK

Helicone

Plano

AxonHub