Welcome to Cua
Cua is an open-source framework for building, deploying, and evaluating Computer-Use Agents: AI systems that autonomously interact with computer interfaces by interpreting visual elements and executing actions. Cua provides SDKs that integrate with 100+ vision-language models (VLMs) and support everything from simple task automation to complex multi-step workflows across Windows, Linux, and macOS environments.

What is a Computer-Use Agent?
Computer-Use Agents (CUAs) are AI systems that autonomously interact with computer interfaces through visual understanding and action execution. They work in a loop: capture a screenshot, pass it to a vision-language model (VLM), and let the model choose the next action (such as clicking, typing, or scrolling), repeating until the task is complete.
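The loop described above can be sketched in a few lines of Python. This is an illustrative outline only, not the Cua API: `Action`, `run_agent_loop`, and the three callables are hypothetical names, and the "model" here is a scripted stand-in rather than a real VLM.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """Hypothetical action record; not a Cua type."""
    kind: str                  # "click", "type", "scroll", or "done"
    x: Optional[int] = None
    y: Optional[int] = None
    text: Optional[str] = None


def run_agent_loop(
    task: str,
    capture_screenshot: Callable[[], bytes],
    choose_action: Callable[[str, bytes, List[Action]], Action],
    execute: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Screenshot -> model -> action, repeated until the model reports "done"."""
    history: List[Action] = []
    for _ in range(max_steps):          # hard cap so a confused model cannot loop forever
        screenshot = capture_screenshot()
        action = choose_action(task, screenshot, history)
        if action.kind == "done":
            break
        execute(action)
        history.append(action)
    return history


# Stand-ins so the sketch runs without a real sandbox or model.
def fake_screenshot() -> bytes:
    return b"fake-png-bytes"


def scripted_model(task: str, screenshot: bytes, history: List[Action]) -> Action:
    script = [
        Action("click", x=100, y=200),
        Action("type", text="hello"),
        Action("done"),
    ]
    return script[len(history)]


executed: List[Action] = []
history = run_agent_loop(
    "Type hello into the search box", fake_screenshot, scripted_model, executed.append
)
```

In a real setup, `capture_screenshot` and `execute` would be backed by a sandbox and `choose_action` by a VLM call; the step cap is a design choice assumed here, not something the text above specifies.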
What is a Computer-Use Sandbox?
Computer-Use Sandboxes are isolated, controlled environments where AI agents can safely interact with computer interfaces. They give agents a secure execution space in which to perform actions (such as clicking, typing, and running code), test automation workflows, and learn from interactions without affecting production systems.
Key Features
With the Computer SDK, you can:
- Automate Windows, Linux, and macOS sandboxes with a consistent, pyautogui-like API
- Create & manage sandboxes locally or using Cua Cloud
With the Agent SDK, you can:
- Run computer-use models with a consistent schema
- Benchmark on OSWorld-Verified, SheetBench-V2, and ScreenSpot
- Combine UI grounding models with any LLM using composed agents
- Use 100+ models via API or local inference (Claude, GPT-4, Gemini, Ollama, MLX)
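As a rough illustration of the composed-agent idea, the sketch below pairs a planner (any LLM that decides what to do) with a grounding model (which locates the described element on screen). All names here (`PlannedStep`, `GroundedAction`, `compose`, and the stub functions) are hypothetical and do not reflect the Agent SDK's actual interface.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class PlannedStep:
    verb: str      # e.g. "click"
    target: str    # natural-language description of the UI element


@dataclass
class GroundedAction:
    verb: str
    x: int
    y: int


# Hypothetical signatures: both models see the current screenshot.
Planner = Callable[[str, bytes], PlannedStep]
Grounder = Callable[[str, bytes], Tuple[int, int]]


def compose(planner: Planner, grounder: Grounder) -> Callable[[str, bytes], GroundedAction]:
    """Build one agent step from a planning model and a grounding model."""
    def composed(task: str, screenshot: bytes) -> GroundedAction:
        step = planner(task, screenshot)                # LLM decides what to do
        x, y = grounder(step.target, screenshot)        # grounding model finds where
        return GroundedAction(step.verb, x, y)
    return composed


# Stubs standing in for real model calls.
def stub_planner(task: str, screenshot: bytes) -> PlannedStep:
    return PlannedStep(verb="click", target="the Submit button")


def stub_grounder(description: str, screenshot: bytes) -> Tuple[int, int]:
    known = {"the Submit button": (640, 480)}
    return known[description]


agent = compose(stub_planner, stub_grounder)
action = agent("Submit the form", b"fake-png-bytes")
```

Separating the two roles means the planner can be swapped for a different LLM without touching the grounding step, which is the property the feature list points at.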
Get Started
Follow the Quickstart guide for step-by-step setup with Python or TypeScript.
Check out our tutorials, examples, and notebooks to start building with Cua today.