Speech-to-Speech
ActiveDescription
Build local voice agents with open-source models. An end-to-end speech-to-speech pipeline from HuggingFace for fully local voice AI agent deployment.
Build local voice agents with open-source models. An end-to-end speech-to-speech pipeline from HuggingFace for fully local voice AI agent deployment.
Conversational voice AI agents platform for building natural language phone interactions with multilingual speech synthesis and real-time dialogue management.
Open Vision Agents by Stream. Build voice and vision agents quickly with any model or video provider, using Stream's edge network for ultra-low latency realtime interactions.
An end-to-end RL training framework by NVIDIA for orchestrating tools and agentic workflows. Optimizes multi-step agent decision-making and tool-use policies.
A Python library by Google for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization, designed for data annotation and knowledge extraction workflows.