AgentList

Hugging Face Evaluate

Active
GitHub · Python · Apache-2.0

Description

A library by Hugging Face for easily evaluating machine learning models and datasets, providing a wide range of metrics and evaluation methods.
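The library's core workflow is loading a named metric and calling `compute` on predictions and references. Below is a minimal sketch of what the `accuracy` metric returns; the pure-Python function mirrors the metric's output format so it runs without the library installed, and the commented lines show the assumed standard `evaluate.load(...)` API.

```python
# Sketch of what Hugging Face Evaluate's "accuracy" metric computes,
# in pure Python so it runs without the library installed.
#
# With the library (assumed standard API):
#   import evaluate
#   metric = evaluate.load("accuracy")
#   metric.compute(predictions=[0, 1, 1, 0], references=[0, 1, 0, 0])

def accuracy(predictions, references):
    """Fraction of predictions that exactly match their references."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return {"accuracy": correct / len(references)}

print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))  # {'accuracy': 0.75}
```

Metrics in Evaluate return a dict keyed by metric name, which makes it easy to merge several metrics' results into one report.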

Tags

evaluation · llm · python · huggingface · framework

Categories

📊 Observability

Project Metrics

Stars 2.4k
Forks 313
Watchers 2.4k
Issues 270
Created March 30, 2022
Last commit April 17, 2026

Deployment

Local

Related Projects

Argilla

4.9k · Python
Active

Argilla is a collaboration platform for AI engineers and domain experts to build high-quality datasets, collect human feedback, and evaluate models.

evaluation · data-processing · llm +2

Weave

1.1k · Python
Active

A toolkit by Weights & Biases for developing AI-powered applications, providing LLM call tracing, evaluation experiment management, and versioning from prototype to production.

observability · evaluation · llm +2

Promptomatix

948 · Python
Stale

An automatic prompt optimization framework by Salesforce AI Research that leverages LLMs to search for and refine prompts for improved model performance.

prompt-engineering · evaluation · llm +1

AgentBench

3.3k · Python
Normal

A comprehensive benchmark for evaluating LLMs as agents (ICLR 2024), covering operating systems, databases, knowledge graphs, digital card games, and more.

evaluation · python · agent +1
© 2026 AgentList. All rights reserved.

Made with ❤ for the open source community