Background sub-agent silently hangs at total_turns=0 when model="gpt-5.5" #3547

@ravisha22

Describe the bug

Calling task(agent_type="general-purpose", mode="background", model="gpt-5.5", ...) from a parent agent reports a successful dispatch (Agent started in background with agent_id: ...), but the sub-agent then sits at status: running, total_turns: 0 indefinitely. No completion notification, no error surfaced, no timeout. Observed for >70 minutes in a single instance before the parent gave up; reproduces across multiple sessions over the last ≥17 days (CLI versions 1.0.49, 1.0.52, 1.0.55).

How I encountered this

I built a custom Copilot CLI extension ("multi-agent-convergence", installed under ~/.copilot/extensions/) that orchestrates a research-grade adversarial debate between two persistent background sub-agents from different model families — GPT-5.5 and Claude Opus 4.7 — across at least three rounds. The "different model families" constraint is core to the protocol's value, which is why both model= overrides are passed explicitly to task(...).

I was setting up a live demo recording of the harness when this hit. The parent agent issued two background task calls in the same response: one with model="gpt-5.5" (converge-A) and one with model="claude-opus-4.7" (converge-B). The Claude sub-agent returned a clean Round 1 position in 35 seconds. The GPT-5.5 sub-agent stayed at total_turns: 0 for the entire 6+ minute window I waited before aborting the recording. It was still in that state ~70 minutes later when I came back, and I never received a completion notification.

Cross-referencing the local session store (session_refs / past turns), the symptom — me asking the parent agent some variant of "what's happening?" / "are you still running?" while a background sub-agent silently hangs — appears in sessions dating back to 2026-04-24, then again on 2026-05-11 and now multiple times in late May. So this isn't a one-off; it's a consistent enough failure mode that it broke a demo recording today and has been observable for weeks. Filing it as a real bug rather than a transient flake.

A peer sub-agent dispatched in the same tool.execution_start batch with model="claude-opus-4.7" completes normally in ~35 s, isolating the failure to the gpt-5.5 sub-agent dispatch path. The Copilot CLI itself appears to already route default general-purpose sub-agents around this path via the ExP flag copilot_cli_gpt_5_4_for_subagents: true; explicit model="gpt-5.5" opts back into the broken path.

Affected version

GitHub Copilot CLI 1.0.55-7

Also reproduced on 1.0.49 and 1.0.52 (same dispatch trace).

Steps to reproduce the behavior

  1. Start the Copilot CLI on Windows: copilot.
  2. In the parent session (any model — observed with claude-opus-4.7-xhigh), have the assistant issue:
    task(
      agent_type="general-purpose",
      mode="background",
      model="gpt-5.5",
      name="repro",
      prompt="Reply with the single word: OK"
    )
    
  3. (Optional, to prove harness health) In the same response, issue a second task(...) with model="claude-opus-4.7" and any prompt.
  4. Wait. Call read_agent(agent_id="repro") periodically.

Observed output:

Agent is still running. agent_id: repro, agent_type: general-purpose,
status: running, description: ..., elapsed: 4095s, total_turns: 0,
model: gpt-5.5.

The Claude peer (step 3) returns its result in ~35 s. The gpt-5.5 agent never advances past total_turns: 0.
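Until the CLI surfaces this condition itself, a parent-side check can flag it from the status text alone. The following is a minimal sketch, assuming the read_agent output keeps the "key: value, key: value" shape shown above; parse_status, looks_stalled, and the 120-second threshold are hypothetical names and values chosen for illustration, not part of the Copilot CLI.

```python
import re

# Matches "key: value" pairs; a value runs until the next comma or end of text.
_FIELD = re.compile(r"(\w+):\s*([^,]+?)\s*(?:,|$)")


def parse_status(text: str) -> dict[str, str]:
    """Flatten the (possibly line-wrapped) status text and return its fields."""
    flat = " ".join(text.split())
    # Drop the leading sentence ("Agent is still running.") when present.
    _, _, rest = flat.partition(". ")
    body = (rest or flat).removesuffix(".")
    return dict(_FIELD.findall(body))


def looks_stalled(fields: dict[str, str], min_elapsed_s: int = 120) -> bool:
    """True when the agent reports 'running' with zero turns past the threshold.

    ASSUMPTION: 120 s is an arbitrary cut-off taken from the upper end of the
    60-120 s window suggested in this report; tune it to the workload.
    """
    elapsed = int(fields.get("elapsed", "0s").rstrip("s"))
    return (
        fields.get("status") == "running"
        and fields.get("total_turns") == "0"
        and elapsed >= min_elapsed_s
    )
```

Feeding the observed output above into parse_status and then looks_stalled would mark the gpt-5.5 agent as stalled, while a healthy agent that has completed at least one turn would not be flagged.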

Smoking-gun evidence (from ~/.copilot/logs/process-<pid>.log for the parent session):

23:19:11.252 [INFO]  Task tool invoked ... model: gpt-5.5, mode: background
23:19:11.253 [DEBUG] validateAndResolveModel: "gpt-5.5" resolved to "gpt-5.5"
23:19:11.253 [DEBUG] enforceMultiplierGuard: result="gpt-5.5"
23:19:11.253 [DEBUG] Task tool dispatch: effectiveModel="gpt-5.5"
23:19:11.255 [INFO]  Started background agent with id: repro
23:19:11.358 [DEBUG] General-purpose agent: modelOverride="gpt-5.5", resolvedModel="gpt-5.5"
23:19:11.359 [INFO]  General-purpose agent invoked with prompt: ...
23:19:11.373 [INFO]  General-purpose agent using tools: powershell, ...
<<< no further log entries for this agent_id, ever >>>

After General-purpose agent using tools: ... at 23:19:11.373, no further log line mentions this agent_id anywhere on disk. No assistant_usage telemetry event is emitted for the gpt-5.5 call (every healthy turn produces one with provider_call_id, api_call_id, input_tokens, etc.). Get-NetTCPConnection -OwningProcess <pid> shows no long-lived socket to the model provider that could correspond to an in-flight request — every Established connection on the parent process is younger than the dispatch timestamp.
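The "last log line" check can be scripted so a maintainer does not have to eyeball the file. This is a rough sketch, assuming every log entry follows the "HH:MM:SS.mmm [LEVEL] message" layout seen in the excerpt; last_mention and Entry are hypothetical helpers, and the search string can be the agent_id or any marker text such as "General-purpose agent".

```python
import re
from typing import Iterable, NamedTuple, Optional

# Layout inferred from the excerpt: time-of-day, bracketed level, free-form message.
_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3})\s+\[(\w+)\]\s+(.*)$")


class Entry(NamedTuple):
    lineno: int
    timestamp: str
    level: str
    message: str


def last_mention(lines: Iterable[str], needle: str) -> Optional[Entry]:
    """Return the last well-formed log entry whose message contains `needle`.

    Lines that do not match the assumed layout are skipped, not treated as errors.
    """
    last = None
    for lineno, raw in enumerate(lines, start=1):
        match = _LINE.match(raw.strip())
        if match and needle in match.group(3):
            last = Entry(lineno, *match.groups())
    return last
```

Passing an open file handle for the process log as `lines` works because the function only iterates; comparing the returned timestamp with the current time gives the length of the silence.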

Expected behavior

Either:

  1. The sub-agent makes the model API call, streams a response, and completes normally; or
  2. A client-side timeout / dispatch error fires within a bounded window (e.g. 60–120 s) and surfaces via the normal completion notification path with a clear error message.

Silently sitting at total_turns: 0 forever — with no error, no timeout, and no log entry past General-purpose agent using tools: ... — is not acceptable behavior. The bookkeeping says "running" but no work is in flight, so neither the user nor a parent agent has any signal to retry, cancel, or escalate.
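Option 2 amounts to wrapping the first model call in a deadline. The sketch below illustrates the idea with Python's asyncio; it is not the CLI's code (the CLI runs on Node), and run_with_first_turn_deadline, DispatchTimeout, and the 90-second default are hypothetical names and values.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class DispatchTimeout(Exception):
    """Raised when a sub-agent produces no first turn within the deadline."""


async def run_with_first_turn_deadline(
    start_turn: Callable[[], Awaitable[T]],
    deadline_s: float = 90.0,  # ASSUMPTION: midpoint of the 60-120 s window above
) -> T:
    """Await the first turn, converting a silent hang into an explicit error."""
    try:
        return await asyncio.wait_for(start_turn(), timeout=deadline_s)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the pending call at this point.
        raise DispatchTimeout(
            f"sub-agent made no progress within {deadline_s:.0f}s"
        ) from None
```

With a guard of this kind, the parent would receive an error through the normal completion path instead of an agent that reports "running" forever.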

Additional context

Operating system: Windows 11, build 10.0.26200 (x86_64 / AMD64)
Terminal: Windows Terminal (WT_SESSION present)
Shell: PowerShell 7.6.2 (ConsoleHost)
Node version (CLI runtime): v24.15.0
Copilot plan: enterprise; is_staff: true
Session model: claude-opus-4.7-xhigh
Sub-agent target model: gpt-5.5

Relevant feature flags (from telemetry exp_context_fetch):

copilot_cli_gpt_5_4_for_subagents: true
copilot_cli_websocket_responses: true
copilot_cli_subagent_parallelism_prompts: false
WEBSOCKET_RESPONSES: false      (env override, conflicts with ExP)
SESSION_BASED_SUBAGENTS: false

The conflict between ExP-assigned copilot_cli_websocket_responses=true and env WEBSOCKET_RESPONSES=false may or may not be related but is worth investigating as part of the response-routing layer.
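A conflict of this kind can be detected mechanically. The following is a small sketch, assuming that an ExP flag named copilot_cli_<name> corresponds to an environment variable <NAME>; that mapping is inferred from the single pair above and may not hold for other flags, and find_conflicts and env_name_for are hypothetical helpers.

```python
_TRUTHY = {"1", "true", "yes", "on"}


def env_name_for(flag: str, prefix: str = "copilot_cli_") -> str:
    """Guess the env-var name for an ExP flag.

    ASSUMPTION: strip the prefix and upper-case the rest, as in
    copilot_cli_websocket_responses -> WEBSOCKET_RESPONSES.
    """
    return flag.removeprefix(prefix).upper()


def find_conflicts(
    exp_flags: dict[str, bool], env: dict[str, str]
) -> list[tuple[str, bool, bool]]:
    """Return (flag, exp_value, env_value) for every flag the env contradicts."""
    conflicts = []
    for flag, exp_value in exp_flags.items():
        raw = env.get(env_name_for(flag))
        if raw is None:
            continue  # no override set for this flag
        env_value = raw.strip().lower() in _TRUTHY
        if env_value != exp_value:
            conflicts.append((flag, exp_value, env_value))
    return conflicts
```

Calling find_conflicts with the ExP assignments listed above and dict(os.environ) would report copilot_cli_websocket_responses as the one contradicted flag in this session.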

Hypothesis on root cause: The bug lives between the last log line for the sub-agent (General-purpose agent using tools: ...) and the first HTTPS request to the model provider, most likely in the gpt-5.5 model-client / provider-routing branch. Because that branch never reaches the network and no watchdog/timeout is registered, the task stays in running indefinitely. The existence of copilot_cli_gpt_5_4_for_subagents=true suggests the product team already routes sub-agents away from this path by default.

Reproducer trick for a maintainer:

  • Always co-spawn an Anthropic sub-agent in the same parent response — proves the harness, WebSocket, hooks, and permission service are healthy and isolates the failure to the gpt-5.5 path.
  • Repro rate has been ~100% in the sessions I've observed (on ≥4 days in the last 17), but I haven't done a scripted N-of-M run.

Workarounds users can apply today:

  1. Don't pass model="gpt-5.5" to background sub-agents; use the default (which appears to resolve to gpt-5.4 via the existing flag) or pass model="gpt-4.1" explicitly.
  2. If gpt-5.5 must be used, set mode="sync" so failures surface to the parent.
  3. Always co-spawn a non-gpt-5.5 peer so a hang doesn't block the whole task.

Operational pain caused:

  • Hung agents are not stoppable via the task tool surface from inside the same session; they survive until process exit or manual /tasks termination.
  • The parent's "working" status indicator stays on as long as any sub-agent is running. This confuses users ("what is happening?", "are you still running?"), and it is the recurring symptom that triggered this investigation across several of my own past sessions.

Diagnostic doc with full evidence chain:
A complete diagnostic report (logs, TCP inventory, telemetry, dispatch trace) was prepared during investigation and is available locally at ~/.copilot/session-state/<session-id>/files/diagnostics-gpt5.5-subagent-hang.md. Happy to attach on request or paste into a comment.
