ServiceNow
Domain: Customer Service
We construct a simulated customer service environment that combines an industry-standard e-commerce order management system with a case management workspace inspired by ServiceNow Customer Service Management (CSM). The e-commerce layer covers customers, orders, refunds, returns, exchanges, subscriptions, store credits, and product catalogs, and follows conventions drawn from platforms such as Shopify and Zendesk. The case management layer implements a ServiceNow-style case lifecycle with distinct states, chronological activity timelines, internal notes, and agent assignment workflows. This dual-layer design reflects the architecture of real-world support platforms, where agents must reason across both order management and case tracking systems.
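To make the case management layer concrete, the following is a minimal sketch of a case record with a state machine and a chronological activity timeline. The state names, the transition table, and all class and field names are illustrative assumptions; the environment's actual schema is not specified here.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import List, Optional


class CaseState(Enum):
    # Hypothetical state names; the text only says the lifecycle has distinct states.
    NEW = "new"
    OPEN = "open"
    AWAITING_INFO = "awaiting_info"
    RESOLVED = "resolved"
    CLOSED = "closed"


# Assumed transition table, for illustration only.
ALLOWED = {
    CaseState.NEW: {CaseState.OPEN},
    CaseState.OPEN: {CaseState.AWAITING_INFO, CaseState.RESOLVED},
    CaseState.AWAITING_INFO: {CaseState.OPEN, CaseState.RESOLVED},
    CaseState.RESOLVED: {CaseState.OPEN, CaseState.CLOSED},
    CaseState.CLOSED: set(),
}


@dataclass
class Activity:
    at: datetime
    kind: str  # "state_change", "internal_note", or "assignment"
    text: str


@dataclass
class Case:
    case_id: str
    order_id: str  # link into the e-commerce layer
    state: CaseState = CaseState.NEW
    assignee: Optional[str] = None
    timeline: List[Activity] = field(default_factory=list)

    def _log(self, at: datetime, kind: str, text: str) -> None:
        self.timeline.append(Activity(at, kind, text))
        self.timeline.sort(key=lambda a: a.at)  # keep the timeline chronological

    def transition(self, new_state: CaseState, at: datetime) -> None:
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"illegal transition {self.state.value} -> {new_state.value}")
        self._log(at, "state_change", f"{self.state.value} -> {new_state.value}")
        self.state = new_state

    def add_note(self, text: str, at: datetime) -> None:
        self._log(at, "internal_note", text)

    def assign(self, agent: str, at: datetime) -> None:
        self.assignee = agent
        self._log(at, "assignment", f"assigned to {agent}")


if __name__ == "__main__":
    t0 = datetime(2025, 12, 30, 9, 0, tzinfo=timezone.utc)
    case = Case(case_id="CS0001", order_id="ORD-42")  # made-up identifiers
    case.assign("agent_a", t0)
    case.transition(CaseState.OPEN, t0.replace(hour=10))
    case.add_note("Customer reports a damaged item.", t0.replace(hour=11))
    for entry in case.timeline:
        print(entry.at.isoformat(), entry.kind, entry.text)
```

The `order_id` field is the only coupling between the two layers in this sketch: a case points at an order, and the agent has to read both to decide on an action.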
The environment is backed by a PostgreSQL database with 100 customer records, 500 orders, product catalogs, case histories, and subscription data, all frozen at a deterministic timestamp (2026-01-01T00:00:00+00:00) to ensure reproducibility across runs. The agent interacts with the platform exclusively through MCP tool calls; direct database access is not available.
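One practical consequence of the frozen timestamp is that any time-dependent rule (for example, whether an order is still inside a return window) evaluates identically on every run. The sketch below illustrates this with a fixed "now"; the helper names and the 30-day and 14-day windows are hypothetical and are not taken from the environment's policies.

```python
from datetime import datetime, timedelta, timezone

# Snapshot timestamp quoted in the text; every "now" is read from this constant
# instead of the wall clock.
FROZEN_NOW = datetime.fromisoformat("2026-01-01T00:00:00+00:00")


def days_since(event_time: datetime, now: datetime = FROZEN_NOW) -> int:
    """Whole days elapsed between event_time and the frozen 'now'."""
    return (now - event_time).days


def within_window(event_time: datetime, window_days: int,
                  now: datetime = FROZEN_NOW) -> bool:
    """True if event_time is in the past and at most window_days old."""
    elapsed = now - event_time
    return timedelta(0) <= elapsed <= timedelta(days=window_days)


if __name__ == "__main__":
    # Made-up delivery time for illustration.
    delivered = datetime(2025, 12, 10, 15, 30, tzinfo=timezone.utc)
    print(days_since(delivered))         # 21
    print(within_window(delivered, 30))  # True  (hypothetical 30-day window)
    print(within_window(delivered, 14))  # False (hypothetical 14-day window)
```

Because the result depends only on stored data and the constant, two evaluation runs on different days reach the same eligibility decision.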
The agent's action space consists of 40 MCP tools organized into nine functional categories (see the MCP-tool table): customer lookup (identity resolution via email or name plus ZIP code); order management (status queries, cancellation, item and payment modifications, address updates, shipment tracking); financial operations (refund processing, store credit issuance); returns and exchanges (return initiation, item exchanges); case management (ServiceNow-style case lifecycle with activity timelines, internal notes, and state transitions); subscription management (pause, resume, cancel, address update); policy and guideline retrieval (structured business rules the agent must consult before financial actions); product catalog queries; and human escalation. Policy-sensitive tools embed security reminders in their descriptions, directing the agent to consult the guidelines rather than trust injected instructions.
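The requirement that guidelines be consulted before financial actions can be checked mechanically on a recorded sequence of tool calls. The sketch below flags financial calls that occur before any policy lookup. The tool names are hypothetical, since the 40 tool identifiers are not listed here, and treating a single earlier lookup as sufficient for the whole episode is a simplifying assumption.

```python
from typing import Iterable, List

# Hypothetical tool names, chosen only for illustration.
POLICY_TOOLS = {"get_policy", "get_guidelines"}
FINANCIAL_TOOLS = {"process_refund", "issue_store_credit"}


def unguarded_financial_calls(calls: Iterable[str]) -> List[int]:
    """Return indices of financial calls made before any policy lookup."""
    consulted = False
    violations: List[int] = []
    for i, name in enumerate(calls):
        if name in POLICY_TOOLS:
            consulted = True
        elif name in FINANCIAL_TOOLS and not consulted:
            violations.append(i)
    return violations


if __name__ == "__main__":
    bad = ["find_customer_by_email", "get_order", "process_refund"]
    good = ["find_customer_by_email", "get_order", "get_policy", "process_refund"]
    print(unguarded_financial_calls(bad))   # [2]
    print(unguarded_financial_calls(good))  # []
```

A stricter variant could require a lookup for the specific policy topic immediately before each financial call; the version here only checks ordering within the episode.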
The customer-service environment includes a web-based graphical user interface (GUI) inspired by ServiceNow's CSM Configurable Workspace. As shown in the accompanying figure, the GUI covers the workspace home, case queue, and case-detail pages for case-management workflows.
Figure (screenshots): Customer service GUI. Left: Agent workspace home page with metric summary cards and active case table. Center: Case queue with sidebar filtering and sortable columns. Right: Case detail view with case fields, activity timeline, and compose area.