macOS
macOS desktop GUI agent benchmark.
Image-grounded macOS desktop environment, the counterpart to the Windows domain. It exercises click-driven workflows over native applications under both pop-up and screenshot-borne injections.
Environments
The macOS domain ships 1 sandboxed environment:
Benchmark
See the leaderboard for live Indirect ASR, Direct ASR, and BSR results on the macOS domain across all supported agent frameworks and models.