mistral.rs
Fast, flexible LLM inference engine built in Rust — supports multiple model architectures and quantization schemes for high-performance local LLM deployment.
An end-to-end RL training framework by NVIDIA for orchestrating tools and agentic workflows. Optimizes multi-step agent decision-making and tool-use policies.
A flexible framework for experimenting with heterogeneous LLM inference and fine-tuning optimizations — run large language models efficiently on consumer hardware with kernel-level optimizations.
A high-performance LLM inference API and Chat UI that integrates DeepSeek R1's chain-of-thought reasoning traces with Anthropic Claude models.
Open Vision Agents by Stream. Build voice and vision agents quickly with any model or video provider, using Stream's edge network for ultra-low-latency realtime interactions.