Bananalyzer
Open-source AI agent evaluation framework for web tasks, used to measure and compare agent performance on web operations.
EleutherAI's framework for few-shot evaluation of language models. It provides standardized evaluation pipelines supporting hundreds of benchmark tasks and is widely adopted as a core LLM evaluation tool in the community.
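As a rough illustration of the kind of standardized pipeline this provides, here is a minimal sketch, assuming the entry refers to EleutherAI's lm-evaluation-harness (the `lm_eval` package); the model checkpoint, task, and few-shot count below are illustrative assumptions, not values from this entry.

```python
# Minimal sketch: evaluate a Hugging Face checkpoint on one benchmark task
# via the harness's Python API. Checkpoint, task, and shot count are
# placeholder choices for illustration.
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                                      # Hugging Face backend
    model_args="pretrained=EleutherAI/pythia-160m",  # any HF checkpoint
    tasks=["hellaswag"],                             # one of the many supported tasks
    num_fewshot=5,                                   # few-shot evaluation
)
print(results["results"]["hellaswag"])               # per-task metrics
```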
A CNCF Sandbox SRE Agent that automatically analyzes infrastructure logs and metrics to assist with incident diagnosis and system operations.
An open-source AI training tracking and visualization tool with a modern design. It supports PyTorch, Transformers, and more, and can be used to monitor and evaluate AI agent training processes.
Interactive sandboxes for AI agent evaluations and reinforcement learning on third-party APIs like Slack, LinkedIn, and more.