Argilla
Argilla is a collaboration platform for AI engineers and domain experts to build high-quality datasets, collect human feedback, and evaluate models.
A library by Hugging Face for easily evaluating machine learning models and datasets, providing a wide range of metrics and evaluation methods.
A toolkit by Weights & Biases for developing AI-powered applications, providing LLM call tracing, evaluation experiment management, and versioning from prototype to production.
An automatic prompt optimization framework by Salesforce AI Research that leverages LLMs to search for and refine prompts for improved model performance.
A comprehensive benchmark to evaluate LLMs as agents (ICLR 2024), covering operating systems, databases, knowledge graphs, digital card games and more.