TruLens

Status: Active
Source: GitHub | Language: Python | License: MIT

Description

TruLens is an open-source tool for evaluating and tracking LLM applications. It provides specialized evaluations for retrieval-augmented generation (RAG) applications, including context relevance, groundedness, and answer relevance.

Key Features

  • OpenTelemetry-based tracing with structured OTEL spans
  • 7 agentic evaluators: consistency, efficiency, plan adherence, quality, tool selection, tool calling, tool quality
  • Batch and inline evaluation with configurable workers
  • MCP tool call instrumentation for latency and output tracking
  • RAG Triad evaluation: context relevance, groundedness, answer relevance
  • Multi-provider support: OpenAI, Anthropic, Google, Bedrock, Snowflake, HuggingFace
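To make the RAG Triad concrete, here is a minimal sketch of what the three scores relate to one another. It is not TruLens code: TruLens computes these metrics with feedback functions backed by a provider, whereas this sketch substitutes a crude token-overlap score. The names overlap and rag_triad are hypothetical helpers introduced only for illustration.

```python
import re


def _tokens(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(source, target):
    """Fraction of target's tokens that also appear in source (0.0 if target is empty).

    Assumption: token overlap is only a toy stand-in for an LLM-based judgment.
    """
    target_tokens = _tokens(target)
    if not target_tokens:
        return 0.0
    return len(target_tokens & _tokens(source)) / len(target_tokens)


def rag_triad(question, context, answer):
    """Return the three RAG Triad scores, each in the range [0, 1]."""
    return {
        # Does the retrieved context address the question?
        "context_relevance": overlap(context, question),
        # Is the answer supported by the retrieved context?
        "groundedness": overlap(context, answer),
        # Does the answer address the question?
        "answer_relevance": overlap(answer, question),
    }


scores = rag_triad(
    question="What license does TruLens use?",
    context="TruLens is released under the MIT license.",
    answer="TruLens uses the MIT license.",
)
print(scores)
```

Each score pairs two of the three artifacts in a RAG call (question, context, answer), which is why a low value on one axis points to a specific failure: poor retrieval, an unsupported answer, or an off-topic answer.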

Use Cases

💡 Systematically evaluating LLM application quality during development
💡 Monitoring RAG pipeline performance with the RAG Triad metrics
💡 Instrumenting agentic workflows for failure mode detection
💡 Running batch evaluations on datasets to compare model versions
💡 Integrating observability into existing OpenTelemetry infrastructure

Quick Start

Install the core package with pip install trulens-core, then add a provider package such as pip install trulens-providers-openai (or the package for your provider). Import the instrument decorator, wrap your RAG functions with @instrument, define feedback functions, and run evaluations from the dashboard or the Python API.
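The idea of wrapping RAG functions with a decorator can be sketched in plain Python. The example below does not import TruLens and does not reproduce its API: toy_instrument, SPANS, and ToyRAG are hypothetical names, and the decorator only records each call's name, latency, and output in memory. Consult the TruLens documentation for the real import path and behavior of @instrument.

```python
import functools
import time

SPANS = []  # hypothetical in-memory store; TruLens itself emits OTEL spans


def toy_instrument(func):
    """Stand-in decorator (not TruLens's): record name, latency, and output per call."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = func(*args, **kwargs)
        SPANS.append(
            {
                "name": func.__name__,
                "latency_s": time.perf_counter() - start,
                "output": result,
            }
        )
        return result

    return wrapper


class ToyRAG:
    """A tiny retrieval + generation pipeline with every step wrapped."""

    DOCS = [
        "Groundedness checks that the answer is supported by the context.",
        "Context relevance checks that retrieved text matches the question.",
    ]

    @toy_instrument
    def retrieve(self, query):
        # Keep documents that share at least one whitespace-separated word with the query.
        words = set(query.lower().split())
        return [d for d in self.DOCS if words & set(d.lower().split())]

    @toy_instrument
    def generate(self, query, context):
        # Placeholder for an LLM call: echo the first retrieved document.
        return context[0] if context else "No relevant context found."

    @toy_instrument
    def query(self, query):
        return self.generate(query, self.retrieve(query))


rag = ToyRAG()
answer = rag.query("explain groundedness")
for span in SPANS:
    print(span["name"], round(span["latency_s"], 6))
```

Because the outer query call finishes last, its record is appended after those of retrieve and generate. Per-step records like these are the kind of input that evaluations such as the RAG Triad can then score.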
