ShowUI
ActiveDescription
ShowUI is an open-source, end-to-end Vision-Language-Action model for GUI agents and computer use, capable of understanding screenshots and executing precise interface interactions.
ShowUI is an open-source, end-to-end Vision-Language-Action model for GUI agents and computer use, capable of understanding screenshots and executing precise interface interactions.
The first open-source Artificial Narrow Intelligence generalist agent that fully operates GUIs using only natural language. Uses Visualization-of-Thought and Chain-of-Thought reasoning for spatial perception and HID simulation.
The open-source Agentic browser that transforms your browser into an AI-powered operating system. Alternative to ChatGPT Atlas, Perplexity Comet, and Dia.
An adaptive web scraping framework that intelligently handles anti-bot measures, from single requests to full-scale crawls, designed for AI agent data collection.
Open Vision Agents by Stream. Build voice and vision agents quickly with any model or video provider, using Stream's edge network for ultra-low latency realtime interactions.