SeeAct

Stale

GitHub · Python · License: NOASSERTION

Description

A system for generalist web agents that autonomously carry out tasks on any given website, leveraging large multimodal models like GPT-4V.

Related Projects

Mind2Web

988 · Jupyter Notebook

Stale

The first LLM-based web agent and benchmark for generalist web agents, providing datasets, evaluation frameworks and baseline methods for building agents that operate on real websites.

web-agent · benchmark · llm +2

AppAgent

6.7k · Python

Stale

AppAgent is an LLM-based multimodal agent framework designed to operate smartphone apps like a human, supporting touch interaction and autonomous exploration.

multimodal · smartphone · gui-agent +3

AgenticSeek

26.3k · Python

Active

A fully local Manus AI alternative that autonomously browses the web, writes code, and interacts via voice, with no API costs.

browser-agent · coding-agent · local-ai +3

Vision Agents

7.8k · Python

Active

Open Vision Agents by Stream. Build voice and vision agents quickly with any model or video provider, using Stream's edge network for ultra-low-latency real-time interactions.

voice · agent · python +3
