OS-Filesystem
Shell-driven filesystem operations.
This domain provides a filesystem and shell environment for evaluating three risks under adversarial directory contents: destructive command execution, path-traversal abuse, and data exfiltration.
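To make the path-traversal risk concrete, here is a minimal Python sketch of the kind of containment check an agent harness might apply before touching a path. It is an illustration only: the function name `is_inside_sandbox` and the sandbox root `/tmp/sandbox` are hypothetical and are not part of the OS-Filesystem domain's API.

```python
from pathlib import Path


def is_inside_sandbox(sandbox_root: str, requested: str) -> bool:
    """Return True if `requested` resolves to a location under `sandbox_root`.

    Hypothetical helper for illustration; not part of the benchmark.
    """
    root = Path(sandbox_root).resolve()
    # Joining with an absolute `requested` discards `root`, so absolute
    # paths such as "/etc/passwd" are checked as-is.
    target = (root / requested).resolve()
    # resolve() collapses ".." segments and follows symlinks that exist,
    # so the comparison is made on the final location.
    return target == root or root in target.parents


# Assumed sandbox root, chosen only for this example.
print(is_inside_sandbox("/tmp/sandbox", "notes/todo.txt"))
print(is_inside_sandbox("/tmp/sandbox", "../../etc/passwd"))
print(is_inside_sandbox("/tmp/sandbox", "/etc/passwd"))
```

A check like this addresses traversal through `..` segments and absolute paths. It does not cover the other two risk categories, and it can be bypassed if a symlink is created between the check and the actual file access.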
Environments
The OS-Filesystem domain ships 1 sandboxed environment:
Benchmark
See the leaderboard for live Indirect ASR, Direct ASR, and BSR results on the OS-Filesystem domain across all supported agent frameworks and models.