Raise / unblock default 100-file cap in create_pull_request safe-output for generated-content workflows

## Summary

Our scheduled `Integration Data Updater` agentic workflow on [microsoft/aspire.dev](https://github.com/microsoft/aspire.dev) is being blocked by the hard-coded 100-file cap in the `create_pull_request` safe-output. This workflow legitimately needs to land ~220 files in a single PR: every weekday it regenerates the per-package API reference data and integration metadata used by our Astro template-driven docs site.

We'd like the default lifted (or made trivially configurable per workflow) so this kind of generated-content workflow can run unattended without being silently turned off.

## Failing run

- Workflow: `Integration Data Updater` (#63)
- Run: https://github.com/microsoft/aspire.dev/actions/runs/25938482819/job/76252411313
- gh-aw version: `v0.67.2` (via `github/gh-aw-actions/setup@v0.67.2`)
- Error from the step summary:

  ```
  Base branch for microsoft/aspire.dev: main
  Warning: Pull request limit exceeded: E003: Cannot create pull request with more than 100 files (received 220)
  Error: ✗ Message 1 (create_pull_request) failed: E003: Cannot create pull request with more than 100 files (received 220)
  Warning: ⚠️ Code push operation 'create_pull_request' failed — remaining safe outputs will be cancelled
  ```

The check that throws this is in [`actions/setup/js/create_pull_request.cjs`](https://github.com/github/gh-aw/blob/main/actions/setup/js/create_pull_request.cjs) (`enforcePullRequestLimits` / `MAX_FILES = 100`).
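For readers who have not opened that file, the guard can be pictured roughly as below. This is a minimal sketch of the behavior seen in the log, not the actual gh-aw source: `checkFileLimit`, `DEFAULT_MAX_FILES`, and the `maxFiles` parameter are hypothetical names, and the de-duplication step is an assumption.

```javascript
// Hypothetical sketch; NOT the real enforcePullRequestLimits implementation.
// DEFAULT_MAX_FILES mirrors the 100-file cap reported in the log above.
const DEFAULT_MAX_FILES = 100;

/**
 * Throws an E003-style error when the number of unique changed files
 * exceeds the cap; otherwise returns the unique file count.
 * The maxFiles parameter is an assumed per-workflow override.
 */
function checkFileLimit(changedFiles, maxFiles = DEFAULT_MAX_FILES) {
  const uniqueCount = new Set(changedFiles).size; // count each path once
  if (uniqueCount > maxFiles) {
    throw new Error(
      `E003: Cannot create pull request with more than ${maxFiles} files (received ${uniqueCount})`
    );
  }
  return uniqueCount;
}
```

With a default like this, a 220-file change set throws, while the same change set passes once the cap is raised (for example `checkFileLimit(files, 500)`).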

## Why 220 files is expected (and not a bug on our side)

The workflow runs `pnpm update:all` every weekday and refreshes the data that the API reference and integrations pages render from:

- `src/frontend/src/data/aspire-integrations.json` — NuGet metadata for every Aspire integration package
- `src/frontend/src/data/github-stats.json` — repo stars/description/license for each integration's GitHub repo
- `src/frontend/src/data/pkgs/*.json` — currently **147** per-package API schemas (one JSON file per integration package), and growing every release as more Aspire integrations ship
- Additional generated derivatives (Twoslash caches, schemas, etc.) used by Astro template rendering at build time

So a typical run lands somewhere in the range of 150–250 changed files. These are **generated, tested, and required for our docs to build**; they can't reasonably be split across multiple PRs because:

1. The files are produced atomically by a single script (`pnpm update:all`) and intentionally regenerated together so the rendered docs stay internally consistent (e.g. a package's row in `aspire-integrations.json` matches its per-package schema in `pkgs/`).
2. The workflow is fully automated (schedule + `workflow_dispatch`) with no human in the loop to herd N partial PRs.
3. Splitting would produce N flaky, partially-consistent PRs that fail review and visual diff checks.

The workflow source is here for context: [`.github/workflows/update-integration-data.md`](https://github.com/microsoft/aspire.dev/blob/main/.github/workflows/update-integration-data.md).

## Impact

Today the workflow's terminal step reports `Code push operation 'create_pull_request' failed — remaining safe outputs will be cancelled` and quietly drops the PR. The agent did the right thing, the data is correct, and there's no recovery path: the next scheduled run will hit the same wall, so our integration data effectively stops updating until someone notices the failing schedule and intervenes manually.

This is the same shape of problem discussed in #28471, but from the opposite direction — that issue was about the count being **wrong** (counting the full branch diff). Our case is the count being **right**: 220 unique files really is what we need to land, and the default cap rejects a legitimate use case.

## Requests

In rough order of preference:

1. **Raise the default** for `MAX_FILES` to something that accommodates generated-content / docs / data workflows out of the box (e.g. 500 or 1000). 100 is fine as a guardrail against runaway agents, but it's restrictive enough that any non-trivial data refresh trips it.
2. **Make the existing override discoverable and stable.** I can see `max-patch-files` was added under `safe-outputs` (per the changeset in [`.changeset/patch-create-pr-max-files-config.md`](https://github.com/github/gh-aw/blob/main/.changeset/patch-create-pr-max-files-config.md)). If that's the long-term answer, please:
   - Call it out in the [`create-pull-request` safe-output docs](https://githubnext.github.io/gh-aw/reference/safe-outputs/) with an example for "regenerated data file" workflows.
   - Confirm which gh-aw version it first shipped in (we're on `v0.67.2`; the changeset doesn't appear in the v0.67.2 release notes I checked).
   - Allow values up to at least a few thousand (matching how `max-patch-size` already goes up to 10240 KB per #28471).
3. **Soft-fail by default.** When the limit is exceeded, downgrade from "error + cancel all remaining safe outputs" to a clear warning + `fallback-as-issue` so the agent's work isn't lost. Right now a single tripped guardrail effectively disables the entire scheduled workflow with no artifact to recover from.
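To make request (3) concrete, here is a rough sketch of the decision we have in mind. It assumes nothing about gh-aw internals: `planPullRequest`, the returned `action` strings, and the `fallbackAsIssue` option are all hypothetical and only illustrate "warn and fall back" versus "error and cancel".

```javascript
// Hypothetical sketch of the requested soft-fail behavior; names are illustrative only.
function planPullRequest(changedFiles, { maxFiles = 100, fallbackAsIssue = true } = {}) {
  const count = new Set(changedFiles).size; // unique changed paths

  if (count <= maxFiles) {
    // Under the cap: proceed exactly as today.
    return { action: "create_pull_request", count, warnings: [] };
  }

  const message = `File limit exceeded: ${count} files (limit ${maxFiles})`;
  if (fallbackAsIssue) {
    // Over the cap: keep the agent's work visible by opening an issue
    // and surfacing a warning, instead of cancelling the remaining outputs.
    return { action: "create_issue", count, warnings: [message] };
  }

  // Strict mode: keep the current hard-failure behavior.
  throw new Error(message);
}
```

The point of the design is that the hard failure becomes opt-in: a scheduled run that trips the guardrail still leaves an artifact someone can act on.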

Even just (1) would unblock us. (2) and (3) would prevent the same papercut for the next team that automates a generated-content refresh.

Happy to help test a fix against our real workflow if useful. It runs every weekday and reliably reproduces the limit hit.

