Problem
Model selection in the api-proxy can fail in multiple ways:
- Requested model not in cachedModels: upstream returns 400/404
- Alias resolves to a glob with no matches, e.g., gpt-5.5* when only gpt-5.4 exists
- Family fallback exhausted: gpt-5.<minor> → gpt-5 alias, but that alias also has no match
- Provider temporarily unavailable for a specific model: the model exists in the cache but upstream rejects it
- Model removed between cache refreshes: stale cachedModels list
In all these cases, the proxy currently either forwards the request unchanged (resulting in upstream errors) or returns null (no rewrite). The agent then retries in a loop, wasting tokens and time.
Proposed Solution: Middle-Power Fallback Policy
When all other selection criteria fail, select the middle-power model from the available models for the target provider. "Middle-power" is defined as the median model by capability tier from the cached model list.
Selection Algorithm
1. Filter cachedModels to the same provider and model family (if determinable)
2. Sort by capability tier (e.g., opus > sonnet > haiku; gpt-5.x > gpt-4.x > gpt-3.5)
3. Select the median entry (round down if even count)
4. If family filtering yields 0 results, fall back to all models for that provider
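The four steps above could be sketched roughly as follows. This is a minimal illustration, not the proxy's actual code: the record shape ({ id, provider, family, tier }) and the name selectMiddlePower are assumptions, and "round down" is read here as taking the lower index of the two middle entries in the strongest-first sort, which the proposal does not spell out.

```javascript
// Hypothetical record shape: { id: string, provider: string, family: string, tier: number }.
function selectMiddlePower(cachedModels, provider, family) {
  const sameProvider = cachedModels.filter((m) => m.provider === provider);

  // Step 1: narrow to the family when one is known.
  let candidates = family
    ? sameProvider.filter((m) => m.family === family)
    : sameProvider;

  // Step 4: an empty family match falls back to every model of the provider.
  if (candidates.length === 0) candidates = sameProvider;
  if (candidates.length === 0) return null; // nothing cached for this provider

  // Step 2: strongest tier first; ties are broken by id so the result is stable.
  const sorted = [...candidates].sort(
    (a, b) => b.tier - a.tier || (a.id < b.id ? -1 : a.id > b.id ? 1 : 0)
  );

  // Step 3: median entry; for an even count this takes the lower index (assumption).
  return sorted[Math.floor((sorted.length - 1) / 2)];
}
```

With three cached models at tiers 5, 4 and 3, this returns the tier-4 entry. With two entries it returns the first (stronger) one under the rounding assumption stated above.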
Capability Tier Ordering
Define a simple tier map:
- Anthropic: opus (5) > sonnet (4) > haiku (3)
- OpenAI/Copilot: gpt-5.x (5) > gpt-4.x (4) > gpt-3.5 (3)
- Unknown: sort lexicographically, pick median
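One way to encode this tier map is a list of patterns per provider. The sketch below works under stated assumptions: the provider keys, the regular expressions, and the names tierOf and sortByTier are illustrative rather than existing proxy code, and real model ids may need different patterns.

```javascript
// Illustrative rules; the patterns are assumptions about how model ids look.
const OPENAI_RULES = [
  { pattern: /^gpt-5/, tier: 5 },
  { pattern: /^gpt-4/, tier: 4 },
  { pattern: /^gpt-3\.5/, tier: 3 },
];

const TIER_RULES = new Map([
  ["anthropic", [
    { pattern: /opus/, tier: 5 },
    { pattern: /sonnet/, tier: 4 },
    { pattern: /haiku/, tier: 3 },
  ]],
  ["openai", OPENAI_RULES],
  ["copilot", OPENAI_RULES], // the proposal groups OpenAI and Copilot together
]);

// Returns the tier for a recognised model id, or null when no rule matches.
function tierOf(provider, modelId) {
  const rules = TIER_RULES.get(provider) || [];
  const hit = rules.find((r) => r.pattern.test(modelId));
  return hit ? hit.tier : null;
}

// Sorts ids strongest-first. Unknown providers are sorted lexicographically;
// ids that match no rule of a known provider are treated as tier 0 and sort last.
function sortByTier(provider, modelIds) {
  const ids = [...modelIds];
  if (!TIER_RULES.has(provider)) return ids.sort();
  return ids.sort((a, b) => {
    const diff = (tierOf(provider, b) ?? 0) - (tierOf(provider, a) ?? 0);
    return diff || (a < b ? -1 : a > b ? 1 : 0);
  });
}
```

Keeping the rules in data rather than in branching code means a new tier can be added without touching the sort itself.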
Logging Requirements
When the middle-power fallback activates, emit a clearly distinguishable structured log event:
{
  "level": "warn",
  "event": "model_fallback_activated",
  "provider": "copilot",
  "original_model": "gpt-5.5",
  "fallback_model": "gpt-4.1",
  "reason": "no_alias_match_and_not_in_available_models",
  "available_models_count": 24,
  "selection_method": "middle_power_median"
}
Key logging points:
- model_fallback_activated (warn): the fallback was used, with full context
- model_fallback_candidates (debug): the sorted tier list considered
- model_fallback_skipped (info): fallback was available but not needed (normal resolution succeeded)
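A sketch of how the activation event could be built and emitted as a single JSON line. The field names follow the example event in this section; the function names and the choice of stderr as the default sink are assumptions.

```javascript
// Builds the warn-level event; field names mirror the example event.
function buildFallbackEvent({ provider, originalModel, fallbackModel, reason, availableCount }) {
  return {
    level: "warn",
    event: "model_fallback_activated",
    provider,
    original_model: originalModel,
    fallback_model: fallbackModel,
    reason,
    available_models_count: availableCount,
    selection_method: "middle_power_median",
  };
}

// Writes one JSON object per line so log tooling can filter on the `event` field.
// The default writer targets stderr (assumption); pass another writer to redirect.
function emit(event, write = (line) => process.stderr.write(line)) {
  write(JSON.stringify(event) + "\n");
}
```

Passing the writer as a parameter keeps the emitter easy to test and leaves the choice of log destination to the caller.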
Configuration
The fallback should be:
- Enabled by default: this is a safety net, not a feature toggle
- Configurable via stdin config (not env var), e.g., model_fallback.enabled: true, model_fallback.strategy: "middle_power"
- Overridable per alias: an alias definition could set fallback: false to disable the fallback for that specific alias
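The precedence described above (on by default, a global setting from the stdin config, a per-alias opt-out) might be resolved like this. The overall config shape, a model_fallback object plus an aliases map keyed by alias name, is an assumption; only the key names enabled, strategy, and fallback come from the proposal.

```javascript
const DEFAULTS = { enabled: true, strategy: "middle_power" };

// Merges the parsed config over the defaults, so a missing section keeps the fallback on.
function fallbackSettings(config) {
  const section = (config && config.model_fallback) || {};
  return { ...DEFAULTS, ...section };
}

// An alias opts out with `fallback: false`; anything else defers to the global flag.
function isFallbackEnabled(config, aliasName) {
  const aliases = (config && config.aliases) || {};
  const alias = aliases[aliasName];
  if (alias && alias.fallback === false) return false;
  return fallbackSettings(config).enabled;
}
```

Checking for an explicit false, rather than any falsy value, means an alias that omits the key inherits the global setting.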
Integration Points
- containers/api-proxy/model-resolver.js: add fallback logic after line ~170 (where current resolution returns null)
- containers/api-proxy/model-discovery.js: expose tier-sorted model list
- containers/api-proxy/server.js: wire fallback config from stdin
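The hook in the resolver could look roughly like the wrapper below. Every name here (resolveWithFallback, the resolve and pickFallback callbacks, the returned shape) is hypothetical; the actual signatures in model-resolver.js are not shown in this proposal.

```javascript
// `resolve` stands in for the existing resolution, which returns null on failure.
// `pickFallback` stands in for the middle-power selection.
function resolveWithFallback(requestedModel, resolve, pickFallback, log = () => {}) {
  const resolved = resolve(requestedModel);
  if (resolved !== null && resolved !== undefined) {
    // Normal resolution succeeded, so the fallback is not needed.
    log({ level: "info", event: "model_fallback_skipped", original_model: requestedModel });
    return { model: resolved, usedFallback: false };
  }

  const fallback = pickFallback(requestedModel);
  if (fallback === null || fallback === undefined) {
    return { model: null, usedFallback: false }; // nothing to fall back to
  }

  log({
    level: "warn",
    event: "model_fallback_activated",
    original_model: requestedModel,
    fallback_model: fallback,
  });
  return { model: fallback, usedFallback: true };
}
```

Returning usedFallback alongside the model lets the caller decide whether to surface the substitution to the agent.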
Acceptance Criteria
- model_fallback_activated warn-level log is emitted with full context
- /reflect endpoint includes model_fallback.enabled and model_fallback.strategy in its output
Related