Weave
A toolkit by Weights & Biases for developing AI-powered applications, providing LLM call tracing, management of evaluation experiments, and versioning from prototype to production.
Argilla is a collaboration platform for AI engineers and domain experts to build high-quality datasets, collect human feedback, and evaluate models.
OpenTelemetry instrumentation for AI observability, providing standardized tracing, metrics collection, and span definitions for LLM inference processes to help developers monitor and debug AI agent systems.
A library by Hugging Face for easily evaluating machine learning models and datasets, providing a wide range of metrics and evaluation methods.
Meta's set of tools for assessing and improving the security and safety of large language models, including safety benchmarks, prompt injection detection, and output auditing.