Environment Setup
Configure the desktop environment for GUI automation.
VM Mode (Recommended)
Start Docker Container
docker run -d \
  --name osworld-vm \
  -p 55000:5000 \
  -p 5901:5900 \
  xlangai/osworld:latest
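Before switching to the web UI, it can help to confirm that the two published host ports are accepting connections. The following is a minimal sketch, not part of the project: it assumes the container runs on the same machine (127.0.0.1), and the helper name `port_open` is hypothetical. The port numbers are the host-side ports from the command above.

```python
import socket

# Host-side ports from the `docker run` command above (55000 -> 5000, 5901 -> 5900).
HOST_PORTS = (55000, 5901)


def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumption: the container is running on this machine.
    for port in HOST_PORTS:
        state = "open" if port_open("127.0.0.1", port) else "closed"
        print(f"127.0.0.1:{port} is {state}")
```

An open port only shows that something is listening. It does not prove that the service inside the container has finished starting, so the “✅ Running” status in the web UI remains the authoritative signal.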
Web UI Operations
- Navigate to “🤖 GUI-Agent” tab
- Click “🚀 Start VM”
- Wait for “✅ Running” status
Local Mode
macOS Permissions
System Settings → Privacy & Security → Accessibility
- Add Terminal/Python
System Settings → Privacy & Security → Screen Recording
- Add Terminal/Python
Web UI Operations
- Select “Local System (Local)”
- Configure model and API Key
- Start task execution
Verification
# Test screenshot (prints the captured image size)
python -c "import pyautogui; print(pyautogui.screenshot().size)"
# Check Ollama (if using local models)
curl http://localhost:11434/api/tags
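The Ollama check can also be done from Python, which makes it easier to see which local models are installed. This is a rough sketch using only the standard library. It assumes the `/api/tags` response is a JSON object with a `models` list whose entries carry a `name` field. The function names `model_names` and `fetch_model_names` are illustrative, not part of any project API.

```python
import json
import urllib.request

# Same endpoint as the curl command above.
OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"


def model_names(payload: dict) -> list:
    """Extract model names from a decoded /api/tags response.

    Assumes the shape {"models": [{"name": "..."}, ...]}; entries
    without a "name" key are skipped.
    """
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def fetch_model_names(url: str = OLLAMA_TAGS_URL, timeout: float = 5.0) -> list:
    """Query the Ollama server and return the names of installed models."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return model_names(json.load(resp))


if __name__ == "__main__":
    try:
        names = fetch_model_names()
    except (OSError, ValueError) as exc:
        # OSError covers connection failures; ValueError covers invalid JSON.
        print(f"Ollama check failed: {exc}")
    else:
        if names:
            print("\n".join(names))
        else:
            print("Ollama is running, but no models are installed.")
```

An empty list means the server is reachable but has no models pulled yet. A connection error usually means Ollama is not running or is listening on a different port.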