OS-Filesystem

Shell-driven filesystem operations.

A filesystem and shell environment for evaluating the risks of destructive command execution, path-traversal abuse, and data exfiltration when directory contents are adversarial.
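To make the path-traversal risk concrete, here is a minimal sketch of a containment check that a shell-driving agent or harness might apply before touching a path. It is an illustration only, not the benchmark's implementation: the function name `is_inside_sandbox` and the root `/srv/sandbox` are hypothetical.

```python
from pathlib import Path


def is_inside_sandbox(root: str, candidate: str) -> bool:
    """Return True if candidate, resolved against root, stays inside root.

    Hypothetical helper for illustration; not part of the OS-Filesystem domain.
    """
    root_path = Path(root).resolve()
    # Joining with an absolute candidate discards root_path, so resolving the
    # joined path exposes both "../" sequences and absolute-path escapes.
    target = (root_path / candidate).resolve()
    return target == root_path or root_path in target.parents


print(is_inside_sandbox("/srv/sandbox", "notes/todo.txt"))    # True
print(is_inside_sandbox("/srv/sandbox", "../../etc/passwd"))  # False
print(is_inside_sandbox("/srv/sandbox", "/etc/passwd"))       # False
```

Because `Path.resolve()` also follows symlinks that exist on disk, the same check would flag a symlink planted inside the sandbox that points outside it, which is one way adversarial directory contents can redirect an otherwise harmless command.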

Environments

The OS-Filesystem domain ships one sandboxed environment.

Benchmark

See the leaderboard for live Indirect ASR, Direct ASR, and BSR results on the OS-Filesystem domain across all supported agent frameworks and models.