Deepchecks

Status: Stale
Source: GitHub | Language: Python | License: NOASSERTION

Description

Testing and monitoring platform for ML and LLM applications — unit tests for AI.

Key Features

  • ML testing — Auto-check data drift, label leakage, and model performance pre/post training
  • LLM evals — Built-in checks for hallucination, bias, and toxicity
  • CI-friendly — Wire into pytest in a few lines
  • Visualization — HTML reports make check results intuitive
  • OSS and self-host — Data stays local; suitable for sensitive industries
  • Extensible — Custom Checks and Suites for business needs
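To make the data-drift feature above concrete, here is a minimal standard-library sketch of one common drift statistic, the Population Stability Index (PSI). It is not Deepchecks' own implementation. The function name, the bin count, and the smoothing constant are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
) -> float:
    """Illustrative PSI between two numeric samples, binned on the reference range."""
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 when every reference value is identical.
    width = (hi - lo) / bins or 1.0

    def bin_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        # Small smoothing term keeps log() defined for empty bins.
        eps = 1e-6
        total = len(values) + eps * bins
        return [(c + eps) / total for c in counts]

    ref = bin_fractions(reference)
    cur = bin_fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


if __name__ == "__main__":
    train = [i / 100 for i in range(100)]
    prod = [0.5 + i / 200 for i in range(100)]  # shifted toward the upper half
    print("PSI:", population_stability_index(train, prod))
```

A widely used rule of thumb treats PSI above roughly 0.25 as a significant shift, but a suitable threshold depends on the feature and the sample size.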

Use Cases

💡 Establish regression tests for ML teams before model rollout.
💡 Auto-check LLM outputs for hallucination and toxicity.
💡 Run data-drift checks in CI to prevent model degradation.
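The regression-test use case above can be sketched as a plain pytest-style gate that fails the build when a candidate model scores worse than a stored baseline. This is a hypothetical illustration that does not call the Deepchecks API. The helper name, the metric names, and the tolerance are assumptions, and it treats higher metric values as better.

```python
from typing import Dict, List


def find_regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.01,
) -> List[str]:
    """Return metric names where the candidate is missing or worse than baseline.

    Assumes higher is better for every metric; `tolerance` absorbs small noise.
    """
    regressions = []
    for name, base_value in baseline.items():
        new_value = candidate.get(name)
        if new_value is None or new_value < base_value - tolerance:
            regressions.append(name)
    return regressions


def test_candidate_model_does_not_regress():
    # Illustrative numbers only; in CI these would come from stored evaluation runs.
    baseline = {"accuracy": 0.91, "f1": 0.88}
    candidate = {"accuracy": 0.92, "f1": 0.875}
    assert find_regressions(baseline, candidate) == []
```

Running `pytest` on a file containing this test makes the CI job fail as soon as any tracked metric drops by more than the tolerance.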

Quick Start

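# Install (run in a shell)
pip install deepchecks

# LLM eval example (Python)
# The import path and the `production_samples` argument are shown as listed here;
# confirm both against the documentation for your installed deepchecks version.
from deepchecks.llm.checks import Toxicity

# Score a small batch of sample texts for toxicity.
result = Toxicity().run(
    production_samples={'text': ['I hate this product']},
)

# Render the check result (as an HTML report in notebook environments).
result.show()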

Related Projects