### Summary

Our scheduled `Integration Data Updater` agentic workflow on microsoft/aspire.dev is being blocked by the hard-coded 100-file cap in the `create_pull_request` safe-output. This workflow legitimately needs to land ~220 files in a single PR because every weekday it regenerates per-package API reference data + integration metadata used by our Astro template-driven docs site.

We'd like the default lifted (or made trivially configurable per workflow) so this kind of generated-content workflow can run unattended without being silently turned off.

### Failing run

- Workflow: `Integration Data Updater` (Docs edit #63)
- Run: https://github.com/microsoft/aspire.dev/actions/runs/25938482819/job/76252411313
- gh-aw version: `v0.67.2` (via `github/gh-aw-actions/setup@v0.67.2`)

Error from the step summary:

    Base branch for microsoft/aspire.dev: main
    Warning: Pull request limit exceeded: E003: Cannot create pull request with more than 100 files (received 220)
    Error: ✗ Message 1 (create_pull_request) failed: E003: Cannot create pull request with more than 100 files (received 220)
    Warning: ⚠️ Code push operation 'create_pull_request' failed — remaining safe outputs will be cancelled

The check that throws this is in `actions/setup/js/create_pull_request.cjs` (`enforcePullRequestLimits` / `MAX_FILES = 100`).

### Why 220 files is expected (and not a bug on our side)

The workflow runs `pnpm update:all` every weekday and refreshes the data that the API reference / integrations pages render from:

- `src/frontend/src/data/aspire-integrations.json`: NuGet metadata for every Aspire integration package
- `src/frontend/src/data/github-stats.json`: repo stars/description/license for each integration's GitHub repo
- `src/frontend/src/data/pkgs/*.json`: currently 147 per-package API schemas (one JSON file per integration package), and growing every release as more Aspire integrations ship
- Additional generated derivatives (Twoslash caches, schemas, etc.) used by Astro template rendering at build time

A typical run therefore changes somewhere between 150 and 250 files. These are generated, tested, and required for our docs to build; they can't reasonably be split across multiple PRs because:

- The files are produced atomically by a single script (`pnpm update:all`) and intentionally regenerated together so the rendered docs stay internally consistent (e.g. a package's row in `aspire-integrations.json` matches its per-package schema in `pkgs/`).
- The workflow is fully automated (`schedule` + `workflow_dispatch`) with no human in the loop to herd N partial PRs.
- Splitting would produce N flaky, partially-consistent PRs that fail review and visual diff checks.

The workflow source is here for context: `.github/workflows/update-integration-data.md`.

### Impact

Today the workflow's terminal step reports `Code push operation 'create_pull_request' failed — remaining safe outputs will be cancelled` and quietly drops the PR. The agent did the right thing, the data is correct, and there's no recovery path: the next scheduled run will hit the same wall, so our integration data effectively stops updating until someone notices the failing schedule and intervenes manually.

This is the same shape of problem discussed in #28471, but from the opposite direction. That issue was about the count being wrong (counting the full branch diff). Our case is the count being right: 220 unique files really is what we need to land, and the default cap rejects a legitimate use case.

### Requests

In rough order of preference:

1. Raise the default for `MAX_FILES` to something that accommodates generated-content / docs / data workflows out of the box (e.g. 500 or 1000). 100 is fine as a guardrail against runaway agents, but it's restrictive enough that any non-trivial data refresh trips it.
2. Make the existing override discoverable and stable. I can see `max-patch-files` was added under `safe-outputs` (per the changeset in `.changeset/patch-create-pr-max-files-config.md`). If that's the long-term answer, please:
   - […] `create-pull-request` safe-output docs with an example for "regenerated data file" workflows.
   - […] `v0.67.2`; the changeset doesn't appear in the v0.67.2 release notes I checked).
   - […] `max-patch-size` already goes up to 10240 KB per #28471, "create_pull_request 100-file limit counts full branch diff, not per-push diff").
3. Soft-fail by default. When the limit is exceeded, downgrade from "error + cancel all remaining safe outputs" to a clear warning + `fallback-as-issue` so the agent's work isn't lost. Right now a single tripped guardrail effectively disables the entire scheduled workflow with no artifact to recover from.

Even just (1) would unblock us. (2) and (3) would prevent the same papercut for the next team that automates a generated-content refresh.

Happy to help test a fix against our real workflow if useful. It runs on a daily schedule and reliably reproduces the limit hit.
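
To make requests (1) and (3) a bit more concrete, here is a rough sketch of the behaviour we have in mind. It is not the actual gh-aw implementation: the `checkFileLimit` function, its `maxFiles` and `onExceed` options, and the returned `action` values are all hypothetical names made up for illustration. Only the `E003` message text is taken from the failing run's log.

```javascript
"use strict";

// Hypothetical sketch only; this is NOT the code in create_pull_request.cjs.
// 100 matches the default reported by the E003 error in our failing run.
const DEFAULT_MAX_FILES = 100;

/**
 * Decide what to do with a patch given the number of changed files.
 *
 * @param {string[]} changedFiles  paths touched by the patch
 * @param {object}   [options]
 * @param {number}   [options.maxFiles]  per-workflow override (assumed name)
 * @param {string}   [options.onExceed]  "error" (current behaviour) or
 *                                       "fallback-as-issue" (requested soft-fail)
 */
function checkFileLimit(changedFiles, options = {}) {
  const maxFiles = options.maxFiles ?? DEFAULT_MAX_FILES;
  const onExceed = options.onExceed ?? "error";
  const count = changedFiles.length;

  // Within the limit: proceed with the pull request as usual.
  if (count <= maxFiles) {
    return { action: "create_pull_request", count, maxFiles };
  }

  const message =
    `E003: Cannot create pull request with more than ${maxFiles} files ` +
    `(received ${count})`;

  // Requested soft-fail: surface a warning and keep the agent's work
  // by opening an issue instead of cancelling everything.
  if (onExceed === "fallback-as-issue") {
    return { action: "create_issue", count, maxFiles, warning: message };
  }

  // Current behaviour: hard error, which cancels the remaining safe outputs.
  throw new Error(message);
}

// Example: a 220-file data refresh (file names are placeholders).
const exampleFiles = Array.from({ length: 220 }, (_, i) => `pkgs/pkg-${i}.json`);
console.log(checkFileLimit(exampleFiles, { maxFiles: 500 }).action);
console.log(checkFileLimit(exampleFiles, { onExceed: "fallback-as-issue" }).action);
```

With a raised or overridden limit the 220-file patch goes through as a normal PR; with the soft-fail mode it would degrade to an issue carrying the warning, rather than disappearing.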