Make z.preprocess defer optionality to inner schema #5929

Merged
colinhacks merged 8 commits into main from zod-preprocess-codec on May 1, 2026

@colinhacks
Owner

Fixes #5917.

z.preprocess(fn, schema) desugared to pipe(transform(fn), schema), so the resulting pipe inherited optin from ZodTransform (undefined) instead of from the inner schema. When the inner schema was itself optional and the preprocess sat as an object property, the object compiler took the !isOptionalIn branch and synthesized a nonoptional issue for missing keys — making the position of .optional() change presence semantics:

z.object({ x: z.preprocess(v => v, z.number()).optional() }).safeParse({}).success;  // true
z.object({ x: z.preprocess(v => v, z.number().optional()) }).safeParse({}).success;  // false on main, true here

Introduce ZodPreprocess as a structural subtype of ZodPipe (same def.type === "pipe", instanceof ZodPipe still true). It pins def.in to a permissive z.unknown() and re-installs the four lazy metadata props so optin, optout, values, and propValues all defer to def.out (the inner schema). Forward direction skips the no-op def.in.run() and embeds the user fn as the codec's decode; backward throws, matching today's behavior.

The two JSON-schema sites that special-cased the old pipe(transform, ...) shape (pipeProcessor and isTransforming) now also detect preprocess via _zod.traits.
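The presence bug described above can be reproduced in a small toy model. This is a sketch under stated assumptions, not zod's implementation: `ToySchema`, `toyObjectAccepts`, `legacyPreprocess`, and `fixedPreprocess` are hypothetical names, and the only behavior taken from the description is that the legacy pipe read `optin` from the transform while the fixed form reads it from the inner schema.

```typescript
// Toy model only: these types and helpers are hypothetical and are not zod's internals.
type Optin = "optional" | undefined;

interface ToySchema {
  optin: Optin;                    // may the enclosing object omit this key?
  parse(input: unknown): unknown;  // throws on invalid input
}

const toyNumber: ToySchema = {
  optin: undefined,
  parse(input) {
    if (typeof input !== "number") throw new Error("expected number");
    return input;
  },
};

function toyOptional(innerSchema: ToySchema): ToySchema {
  return {
    optin: "optional",
    parse: (input) => (input === undefined ? undefined : innerSchema.parse(input)),
  };
}

function toyTransform(fn: (v: unknown) => unknown): ToySchema {
  return { optin: undefined, parse: fn }; // a transform declares no optin
}

// Legacy desugaring: the pipe reads optin from its input side (the transform).
function legacyPreprocess(fn: (v: unknown) => unknown, out: ToySchema): ToySchema {
  const input = toyTransform(fn);
  return { optin: input.optin, parse: (v) => out.parse(input.parse(v)) };
}

// Fixed form: same runtime flow, but optin defers to the inner schema.
function fixedPreprocess(fn: (v: unknown) => unknown, out: ToySchema): ToySchema {
  const input = toyTransform(fn);
  return { optin: out.optin, parse: (v) => out.parse(input.parse(v)) };
}

// Object parser: a missing key is an error unless the property is input-optional.
function toyObjectAccepts(shape: Record<string, ToySchema>, data: Record<string, unknown>): boolean {
  for (const [key, schema] of Object.entries(shape)) {
    if (!(key in data)) {
      if (schema.optin !== "optional") return false; // stands in for the "nonoptional" issue
      continue;
    }
    try {
      schema.parse(data[key]);
    } catch {
      return false;
    }
  }
  return true;
}

const identity = (v: unknown) => v;
const outerOptional = toyObjectAccepts({ x: toyOptional(legacyPreprocess(identity, toyNumber)) }, {});
const innerOptionalLegacy = toyObjectAccepts({ x: legacyPreprocess(identity, toyOptional(toyNumber)) }, {});
const innerOptionalFixed = toyObjectAccepts({ x: fixedPreprocess(identity, toyOptional(toyNumber)) }, {});
```

In this model `outerOptional` and `innerOptionalFixed` are `true` while `innerOptionalLegacy` is `false`, mirroring the two `safeParse` lines in the description.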

`z.preprocess(fn, schema)` desugared to `pipe(transform(fn), schema)`,
so the resulting pipe's `optin` was inherited from `ZodTransform`
(undefined) instead of from the inner schema. When the inner schema was
itself optional and the preprocess sat as an object property, the
object compiler took the `!isOptionalIn` branch and synthesized a
`nonoptional` issue for missing keys — making the position of
`.optional()` change presence semantics.

Introduce `ZodPreprocess` as a structural subtype of `ZodPipe` (same
`def.type === "pipe"`, `instanceof ZodPipe` still true). It pins
`def.in` to a permissive `z.unknown()` and re-installs the four lazy
metadata props so `optin`, `optout`, `values`, and `propValues` all
defer to `def.out` (the inner schema). Backward direction throws,
matching today's behavior.

The two JSON-schema sites that special-cased the old
`pipe(transform, ...)` shape (`pipeProcessor` and `isTransforming`)
now also detect preprocess via `_zod.traits`.
@pullfrog
Contributor

pullfrog Bot commented May 1, 2026

TL;DR: z.preprocess(fn, schema) previously desugared to a plain pipe(transform, schema), causing the resulting pipe to inherit optionality from the transform (always non-optional) rather than the inner schema. This broke cases where the inner schema was .optional() inside an object. The fix introduces a dedicated ZodPreprocess subtype of ZodPipe that defers presence semantics (optin/optout) to the inner schema while reusing a real ZodTransform as def.in, keeping runtime behavior identical to the legacy form.

Key changes

  • Introduce $ZodPreprocess core schema — a structural subtype of $ZodPipe<$ZodTransform, B> that overrides only the lazy presence props (optin/optout) to delegate to def.out; runtime parse is inherited from $ZodPipe unchanged, and def.in is a real ZodTransform (not a synthetic z.unknown())
  • Add classic ZodPreprocess wrapper — mirrors $ZodPreprocess in the classic layer by composing ZodPipe.init + $ZodPreprocess.init
  • Update z.preprocess() factory — returns a ZodPreprocess instance with def.in = transform(fn) instead of desugaring to pipe(transform(...), schema)
  • Simplify JSON Schema detection: pipeProcessor uses a single def.in._zod.traits.has("$ZodTransform") check (covers both legacy pipe(transform, ...) and z.preprocess); isTransforming only special-cases $ZodCodec via traits since preprocess's real ZodTransform input is naturally discovered by recursion
  • Document $ZodPreprocess in inheritance hierarchy — updates core.mdx to list $ZodPreprocess as a child of $ZodPipe alongside $ZodCodec
  • Add regression, assignability, and type-level tests — verifies optional propagation, structural subtype relationship, values/propValues non-propagation, discriminated-union behavior, record-key acceptance, codec example stripping on input JSON schema, ZodPreprocess assignability to ZodPipe/$ZodPipe, and presence-field inference at the type level

Summary | 9 files | 8 commits | base: main ← zod-preprocess-codec


Presence vs. input-set delegation

Before: z.preprocess(fn, schema.optional()) inside an object produced a nonoptional error for missing keys because optionality was read from ZodTransform (always undefined). Additionally, deferring values/propValues to the inner schema would corrupt discriminated-union disc maps and record-key enumerators.
After: ZodPreprocess forwards only optin/optout (presence — whether the container can omit the slot) to the inner schema. values and propValues stay inherited from def.in (the ZodTransform), which has neither — matching the legacy pipe(transform, ...) form exactly.

The core constructor calls $ZodPipe.init first (inheriting pipe behavior and def.type === "pipe" identity), then overrides presence via util.defineLazy. The runtime parse function is not overridden — def.in is a real ZodTransform that carries the user-supplied fn, so the inherited pipe parse naturally runs the transform then pipes into def.out.
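The "init the pipe first, then override the lazy presence props" pattern can be sketched as follows. The helper `defineLazySketch` is a hypothetical stand-in written for this example (it is not claimed to match zod's `util.defineLazy`), and the `Internals` shape is an assumption that keeps only the two presence fields.

```typescript
// Hypothetical stand-in for a lazy-property helper; not zod's actual util.defineLazy.
function defineLazySketch<T extends object, K extends keyof T>(target: T, key: K, getter: () => T[K]): void {
  Object.defineProperty(target, key, {
    configurable: true,
    enumerable: true,
    get() {
      const value = getter();
      // Cache the result by replacing the accessor with a plain data property.
      Object.defineProperty(target, key, { value, configurable: true, enumerable: true, writable: true });
      return value;
    },
  });
}

interface Internals {
  optin: "optional" | undefined;
  optout: "optional" | undefined;
}

let getterCalls = 0;
const innerInternals: Internals = { optin: "optional", optout: "optional" };
const transformInternals: Internals = { optin: undefined, optout: undefined };

// "Pipe init": optin initially comes from the input side, optout from the output side.
const pipeInternals: Internals = { optin: transformInternals.optin, optout: innerInternals.optout };

// "Preprocess init": override presence so both fields defer to the inner schema.
defineLazySketch(pipeInternals, "optin", () => {
  getterCalls++;
  return innerInternals.optin;
});
defineLazySketch(pipeInternals, "optout", () => innerInternals.optout);

const firstRead = pipeInternals.optin;
const secondRead = pipeInternals.optin;
```

Both reads yield `"optional"` and the getter runs once, since the first access replaces the accessor with a cached value. Nothing about parsing is touched, which is the point of the metadata-only design.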

Why separate presence from values? Presence semantics (optin/optout) describe whether the outer container can omit this slot — preprocess is transparent to that, so z.preprocess(fn, schema.optional()) and z.preprocess(fn, schema).optional() should be equivalent. values and propValues, by contrast, are claims about the input set the schema accepts. Deferring them to B would let the discriminated-union disc map short-circuit on B's literal set before the preprocess fn ever runs, and the record-key enumerator would restrict accepted keys to B's enum. Both stay undefined (inherited from the transform), matching the legacy form.

packages/zod/src/v4/core/schemas.ts · packages/zod/src/v4/classic/schemas.ts · packages/zod/src/v4/classic/tests/preprocess.test.ts


JSON Schema awareness via traits

Before: pipeProcessor detected transforms via the string check def.in._zod.def.type === "transform", and isTransforming had no knowledge of preprocess or codec schemas — allowing output-side metadata (examples, defaults) to leak into input JSON schemas for these types.
After: pipeProcessor uses def.in._zod.traits.has("$ZodTransform") — a single traits-based check that covers both legacy pipe(transform, ...) and the new z.preprocess form (whose def.in is a real ZodTransform). isTransforming only needs to special-case $ZodCodec via traits since preprocess's real ZodTransform input is naturally discovered by the recursive walk.
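A minimal sketch of the traits-based check, assuming a simplified node shape: `ToyNode` and `inputSideOf` are hypothetical, and only the trait names and the rule "if the input is a transform, describe the input side using the output schema" come from the discussion.

```typescript
// Simplified stand-ins for schema internals; the shapes are assumptions for illustration.
interface ToyNode {
  type: string;         // analogous to def.type
  traits: Set<string>;  // analogous to _zod.traits
  in?: ToyNode;
  out?: ToyNode;
}

const numberNode: ToyNode = { type: "number", traits: new Set(["$ZodType", "$ZodNumber"]) };
const stringNode: ToyNode = { type: "string", traits: new Set(["$ZodType", "$ZodString"]) };
const transformNode: ToyNode = { type: "transform", traits: new Set(["$ZodType", "$ZodTransform"]) };

// Legacy pipe(transform, schema) and z.preprocess both carry a real transform as their input.
const legacyPipe: ToyNode = {
  type: "pipe",
  traits: new Set(["$ZodType", "$ZodPipe"]),
  in: transformNode,
  out: stringNode,
};
const preprocessNode: ToyNode = {
  type: "pipe",
  traits: new Set(["$ZodType", "$ZodPipe", "$ZodPreprocess"]),
  in: transformNode,
  out: stringNode,
};
// An ordinary pipe between two validating schemas.
const validatingPipe: ToyNode = {
  type: "pipe",
  traits: new Set(["$ZodType", "$ZodPipe"]),
  in: numberNode,
  out: stringNode,
};

// For an input-side JSON schema, pick the side that actually constrains the input.
function inputSideOf(pipe: ToyNode): ToyNode | undefined {
  const inputIsTransform = pipe.in?.traits.has("$ZodTransform") ?? false;
  return inputIsTransform ? pipe.out : pipe.in;
}
```

One check covers both `legacyPipe` and `preprocessNode` because their `in` nodes are the same kind of schema, while `validatingPipe` still reports its own input side.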

packages/zod/src/v4/core/json-schema-processors.ts · packages/zod/src/v4/core/to-json-schema.ts · packages/zod/src/v4/classic/tests/to-json-schema.test.ts


Documentation and type-level assignability tests

Before: No compile-time verification that ZodPreprocess structurally satisfies ZodPipe, and the core.mdx inheritance docs did not mention the new type.
After: preprocess-types.test.ts uses expectTypeOf to assert assignability to ZodPipe<$ZodTransform, B>/$ZodPipe<$ZodTransform, B>, correct narrowing of presence fields for optional vs. required inner schemas, and unchanged output/input inference. assignability.test.ts adds satisfies assertions for both core and classic layers. core.mdx lists $ZodPreprocess under $ZodPipe in both the type union and the tree diagram.

packages/zod/src/v4/classic/tests/preprocess-types.test.ts · packages/zod/src/v4/classic/tests/assignability.test.ts · packages/docs/content/packages/core.mdx

Pullfrog  | View workflow run | via Pullfrog | Using Claude Opus𝕏

Contributor

@pullfrog pullfrog Bot left a comment

Reviewed — no issues found.

`values` and `propValues` describe the *input* set a schema accepts.
Preprocess opens that set to anything `fn` can map to a B-accepted
value, so deferring these to B was unsound:

- discriminated unions use `propValues` as a fast-path disc map; with
  B's propValues exposed, an option claims to match only B's literal
  inputs and silently routes wrong (e.g. preprocess(toUpperCase,
  z.literal("A")) would fail to match input "a"). Now reverts to
  throwing at construction, matching pre-PR behavior.
- `z.record()` uses `values` to enumerate expected keys; with B's
  values exposed it short-circuits before the preprocess fn ever runs
  on input keys.

Presence semantics (`optin`/`optout`) describe whether the *outer*
container can omit this slot; preprocess is transparent to those, so
they continue to defer to B (the original #5917 fix).
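The discriminated-union hazard can be illustrated with a toy router. Everything here is hypothetical (`ToyOption`, `buildDiscMap`, `route`); it only models the idea that a fast-path map keyed on advertised literal values is consulted before any preprocess function runs.

```typescript
// Toy discriminated-union router; hypothetical, not zod's implementation.
interface ToyOption {
  label: string;
  propValues?: Set<unknown>;     // literal inputs this option claims to accept, if known
  parse(input: string): string;  // throws on mismatch
}

// Stands in for preprocess(toUpperCase, z.literal("A")).
const upperLiteralA = (input: string): string => {
  const upper = input.toUpperCase();
  if (upper !== "A") throw new Error('expected "A"');
  return upper;
};

// Unsound: exposes the inner literal's values as if they were the accepted inputs.
const leakyOption: ToyOption = { label: "leaky", propValues: new Set(["A"]), parse: upperLiteralA };
// Sound: advertises no propValues, so no fast path can be built for it.
const opaqueOption: ToyOption = { label: "opaque", parse: upperLiteralA };

function buildDiscMap(options: ToyOption[]): Map<unknown, ToyOption> {
  const map = new Map<unknown, ToyOption>();
  for (const option of options) {
    const values = option.propValues;
    if (!values) throw new Error(`no discriminator values for option "${option.label}"`);
    values.forEach((value) => map.set(value, option));
  }
  return map;
}

function route(map: Map<unknown, ToyOption>, input: string): string | undefined {
  const option = map.get(input); // fast path: raw input is looked up before any fn runs
  return option ? option.parse(input) : undefined;
}

const leakyMap = buildDiscMap([leakyOption]);
const routedUpper = route(leakyMap, "A");
const routedLower = route(leakyMap, "a"); // upperLiteralA("a") would succeed, but it is never called

let constructionError: string | undefined;
try {
  buildDiscMap([opaqueOption]);
} catch (error) {
  constructionError = (error as Error).message;
}
```

With the leaky option, `"a"` finds no entry in the map and is silently unrouted even though the option would have accepted it. With the opaque option the map cannot be built at all, which corresponds to the "throwing at construction" behavior the commit message restores.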
Contributor

@pullfrog pullfrog Bot left a comment

Important

Minor comment/code mismatch introduced by the follow-up commit.

  • Stopped deferring values/propValues from B in the core $ZodPreprocess init — only optin/optout are re-installed now
  • Added three tests covering discriminated-union, record-key, and direct values/propValues non-propagation
  • Updated the block comment and interface to remove the values/propValues type overrides

Comment thread packages/zod/src/v4/classic/schemas.ts Outdated
}

// ZodPreprocess
// Subtype of ZodPipe whose `optin`/`optout`/`values`/`propValues` defer to B
Contributor

This comment still mentions values/propValues deferring to B, but the second commit deliberately stopped doing that. Should read something like:

// Subtype of ZodPipe whose `optin`/`optout` defer to B

Document the invariant that ZodPreprocess<B> is structurally
assignable to ZodPipe<$ZodType, B>, and that the optin/optout
override surfaces B's declared type (narrowed for schemas like
ZodOptional that hardcode it, open `"optional" | undefined` for
schemas like ZodString that inherit it).
Contributor

@pullfrog pullfrog Bot left a comment

No new issues. Reviewed the following changes:

  • Added type-level tests verifying ZodPreprocess<B> assignability to ZodPipe, optin/optout narrowing, and output/input inference

Prior review feedback:

  • Comment on line 2365 of classic/schemas.ts still mentions values/propValues deferring to B — not addressed

`pipeProcessor` was mixing detection styles: `traits.has("$ZodPreprocess")`
for the new subtype and `def.in._zod.def.type === "transform"` for the
legacy pipe(transform, ...) form. Both check "input contributes no
validation, use B for input-side JSON schema." Converge on traits.
Contributor

@pullfrog pullfrog Bot left a comment

No new issues. Reviewed the following changes:

  • Refactored pipeProcessor to detect the legacy pipe(transform, ...) form via def.in._zod.traits.has("$ZodTransform") instead of inspecting def.type directly — consistent with how $ZodPreprocess is already detected; added an explanatory comment

Prior review feedback:

  • Comment on line 2365 of classic/schemas.ts still mentions values/propValues deferring to B — not addressed

Same diagnosis as the preprocess case: `isTransforming` recurses into
`def.in` and `def.out` looking for `def.type === "transform"`, but
`ZodCodec` embeds an implicit transform fn (its `decode`/`encode`)
while `def.in` and `def.out` are validating schemas. The recursion
finds no transform and returns false even though the codec absolutely
is transforming, causing output-side examples/defaults to leak into
the input JSON schema.

Detect via `_zod.traits.has("$ZodCodec")` alongside the preprocess
check.
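The diagnosis above can be sketched with a toy schema walk. `WalkNode` and both functions are hypothetical; the sketch assumes, as the discussion suggests, that a codec is pipe-shaped with validating schemas on both sides and is identifiable only through its `$ZodCodec` trait.

```typescript
// Toy schema tree; shapes are assumptions for illustration only.
interface WalkNode {
  type: string;
  traits: Set<string>;
  in?: WalkNode;
  out?: WalkNode;
}

const strNode: WalkNode = { type: "string", traits: new Set(["$ZodString"]) };
const numNode: WalkNode = { type: "number", traits: new Set(["$ZodNumber"]) };
const fnNode: WalkNode = { type: "transform", traits: new Set(["$ZodTransform"]) };

// A codec keeps validating schemas on both sides; its decode/encode functions live on the def.
const codecNode: WalkNode = { type: "pipe", traits: new Set(["$ZodPipe", "$ZodCodec"]), in: strNode, out: numNode };
// Preprocess has a real transform as its input, so plain recursion already finds it.
const preprocessWalkNode: WalkNode = {
  type: "pipe",
  traits: new Set(["$ZodPipe", "$ZodPreprocess"]),
  in: fnNode,
  out: numNode,
};

// Recursion only: finds explicit transform nodes, so it misses the codec.
function isTransformingByRecursion(node: WalkNode): boolean {
  if (node.type === "transform") return true;
  const viaIn = node.in ? isTransformingByRecursion(node.in) : false;
  const viaOut = node.out ? isTransformingByRecursion(node.out) : false;
  return viaIn || viaOut;
}

// Same walk with a trait check in front, so codecs are recognized as transforming.
function isTransformingWithTraits(node: WalkNode): boolean {
  if (node.traits.has("$ZodCodec")) return true;
  if (node.type === "transform") return true;
  const viaIn = node.in ? isTransformingWithTraits(node.in) : false;
  const viaOut = node.out ? isTransformingWithTraits(node.out) : false;
  return viaIn || viaOut;
}
```

The recursion-only walk reports the codec as non-transforming, which is the condition under which output-side examples and defaults would leak into the input JSON schema.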
Contributor

@pullfrog pullfrog Bot left a comment

No new issues. Reviewed the following changes:

  • Added $ZodCodec trait detection to isTransforming() so codecs are recognized as transforming — output-side examples/default metadata is now correctly stripped from input JSON schemas
  • Added test verifying codec example stripping behavior for both io: "input" and io: "output"

Prior review feedback:

  • Comment on line 2365 of classic/schemas.ts still mentions values/propValues deferring to B — not addressed

Surface ZodPreprocess everywhere the codebase enumerates first-party
types as a structural sibling of ZodCodec under ZodPipe:

- core.mdx: $ZodTypes union comment ("$ZodCodec and $ZodPreprocess
  extend this") and the inheritance diagram
- assignability.test.ts: explicit `satisfies` checks confirming the
  type-level relationship to $ZodPreprocess and $ZodPipe (with
  generics specified — the bare-generic form bites variance because
  inherited reverseTransform is contravariant in B)

The $ZodTypes union itself doesn't need a new entry — preprocess is
covered transitively via $ZodPipe, same as $ZodCodec — but documenting
the relationship in prose and tests makes the hierarchy discoverable.
Contributor

@pullfrog pullfrog Bot left a comment

No new issues. Reviewed the following changes:

  • Documented $ZodPreprocess in the core.mdx inheritance hierarchy (comment update and tree entry)
  • Added type-level assignability tests verifying ZodPreprocess satisfies both $ZodPipe and ZodPipe with the expected type parameters

Prior review feedback:

  • Comment on line 2365 of classic/schemas.ts still mentions values/propValues deferring to B — not addressed

The bug is purely a static-metadata problem: $ZodPipeInternals.optin
was inheriting from A (the leading transform, which has no optin)
instead of from B (the inner schema). The runtime parse path for
pipe(transform(fn), schema) was already correct.

So the simpler design is: keep the runtime structure exactly as the
legacy form (def.in is a real ZodTransform), and just override the
optin/optout lazies in a thin subtype.

What this eliminates from the previous design:

- Custom parse function in $ZodPreprocess.init (forward direction with
  embedded transform fn)
- Custom parse function in classic ZodPreprocess.init (addIssue
  injection — the inner ZodTransform's classic parse override already
  does this)
- The synthetic z.unknown() input slot
- The required `transform` field on the def
- traits.has("$ZodPreprocess") check in pipeProcessor and
  isTransforming — def.in is a real ZodTransform, so the existing
  detection paths fire correctly

The traits.has("$ZodCodec") check in isTransforming stays — codec
embeds its transform fn directly on the def (different shape) and
needs its own detection.
Contributor

@pullfrog pullfrog Bot left a comment

No new issues. Reviewed the following changes:

  • Simplified ZodPreprocess from a custom-parse-function subtype to a pure metadata-override subtype — def.in is now a real ZodTransform (via transform(fn)), and the pipe's inherited parse handles the flow
  • Removed the def.transform field and handlePreprocessResult helper since the transform logic now lives inside def.in
  • Narrowed the A type parameter from $ZodType to $ZodTransform across interfaces and the factory
  • Simplified isTransforming to drop the $ZodPreprocess trait check — recursion into def.in (a real transform) detects it naturally
  • Updated pipeProcessor to use def.in._zod.traits.has("$ZodTransform") uniformly for both preprocess and legacy pipe(transform, ...) forms
  • Updated type-level assignability tests and comments to reflect $ZodTransform as the A parameter

Prior review feedback:

  • Comment on line 2365 of classic/schemas.ts still mentions values/propValues deferring to B — addressed (comment now correctly states only optin/optout)

@colinhacks colinhacks merged commit 02c2baf into main May 1, 2026
3 of 6 checks passed
@colinhacks colinhacks deleted the zod-preprocess-codec branch May 1, 2026 19:35
colinhacks added a commit that referenced this pull request May 4, 2026
`z.object({ a: z.preprocess(fn, T) }).parse({})` worked in 4.3 (the fn
ran with `undefined`, produced a value, the inner schema validated it)
but started failing on absent keys after #5661 tightened the object
parser. Users commonly use preprocess to inject pre-parse defaults for
fields that may be missing — that pattern broke silently in 4.4.

Restore by marking $ZodPreprocess as `optin === "optional"`, telling
`$ZodObject` that absent keys are legal here. The fn then runs with
`undefined` exactly as it did in 4.3.

To preserve the long-stable behavior of `preprocess(fn, T).optional()
.parse(undefined)` returning `undefined` (true in both 4.3 and 4.4 for
multi-year compatibility), have `$ZodTransform` set the `fallback`
payload flag on every invocation. `$ZodOptional` already clobbers a
result with `fallback === true` when its input was `undefined`, so the
outer optional keeps short-circuiting to `undefined` even though the
transform now runs underneath.

  z.object({ a: z.preprocess(v => v ?? "X", z.string()) }).parse({})
  // 4.3:  { a: "X" }
  // 4.4:  FAIL  (regression)
  // now:  { a: "X" }

  z.preprocess(v => v ?? "X", z.string()).optional().parse(undefined)
  // 4.3 + 4.4:  undefined
  // now:        undefined  (preserved)

Drops the `optin` defer-to-inner from #5929, but the same outcome
holds: when inner is `.optional()`, preprocess still accepts absent
keys (`optin === "optional"` either way).
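The interaction between the `fallback` flag and an outer optional can be sketched with toy payload-passing parsers. The `Payload` shape and every function below are hypothetical simplifications; the only rules taken from the commit message are that the transform sets the flag on every invocation, and that an optional discards a flagged result when its own input was `undefined`.

```typescript
// Toy payload-passing parsers; hypothetical shapes loosely following the commit message.
interface Payload {
  value: unknown;
  fallback: boolean;
}
type Run = (payload: Payload) => Payload;

// A transform marks its result as a fallback on every invocation.
const toyTransformRun = (fn: (v: unknown) => unknown): Run =>
  (payload) => ({ value: fn(payload.value), fallback: true });

const toyStringRun: Run = (payload) => {
  if (typeof payload.value !== "string") throw new Error("expected string");
  return payload;
};

// Pipe handoff: run the left side, then the right side, carrying the flag across.
const toyPipeRun = (left: Run, right: Run): Run => (payload) => {
  const mid = left(payload);
  return right({ value: mid.value, fallback: mid.fallback });
};

// Optional: if the input was undefined and the inner result is only a fallback, yield undefined.
const toyOptionalRun = (innerRun: Run): Run => (payload) => {
  const result = innerRun({ value: payload.value, fallback: false });
  if (payload.value === undefined && result.fallback) return { value: undefined, fallback: false };
  return result;
};

// Stands in for z.preprocess(v => v ?? "X", z.string()).
const withDefault = toyPipeRun(toyTransformRun((v) => v ?? "X"), toyStringRun);

const absentKeyResult = withDefault({ value: undefined, fallback: false }).value;
const optionalResult = toyOptionalRun(withDefault)({ value: undefined, fallback: false }).value;
const optionalWithInput = toyOptionalRun(withDefault)({ value: "hi", fallback: false }).value;
```

In this model the bare preprocess turns a missing value into `"X"`, while wrapping it in the optional still yields `undefined` for an `undefined` input and passes real input through unchanged.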
colinhacks added a commit that referenced this pull request May 4, 2026
…n absent keys (#5941)

* fix(v4): propagate fallback flag through pipe boundaries

`$ZodCatch` sets a payload flag when its `catchValue` substitutes so an
outer `$ZodOptional` can clobber the recovery value with `undefined`
(per #5939). But `handlePipeResult` was building a fresh payload for the
right side of the pipe without copying the flag, so any chain like
`catch().transform()...optional()` lost it — `optional` couldn't tell
the inner had recovered, and surfaced the catch value instead of
clobbering.

Propagate the flag through pipe handoffs, alongside `value`/`issues`.
Also rename `caught` to `fallback`: a slightly broader name that
describes the consumer contract ("override me if you have a better
value when input was undefined") rather than the producer ("catch fired
me"). Internal-only; no public API surface.

* fix(v4): restore preprocess handling for absent object keys

* fix(v4): generalize optin=optional from preprocess to transform

Promotes the "user-written input handler accepts absence" signal from
$ZodPreprocess to $ZodTransform. Any schema with a transform fn at its
input boundary (preprocess, standalone z.transform) now declares
optin="optional" at runtime.

Effects:
- preprocess inherits optin="optional" via pipe.optin = transform.optin
  (same outcome as the previous commit's explicit override; preprocess
  loses both its optin and optout overrides since pipe already does the
  optout defer)
- standalone z.transform(fn) now accepts absent object keys
- z.string().transform(fn): unchanged (pipe.optin = string.optin =
  undefined; transform on the OUT side doesn't drive optin)
- z.unknown().transform(fn).pipe(A): unchanged (pipe.optin = unknown.
  optin = undefined)

The static type stays unchanged — transform's interface doesn't
declare optin, so this only sets the runtime value, mirroring the
catch pattern. Captures the "flexible inputs, strict outputs" design
principle: schemas with a user-written escape hatch (catch's recovery,
transform's fn) accept undefined at runtime even when the static type
declares the input as required.

After this, $ZodPreprocess is a near-empty marker subtype — the
constructor body is just $ZodPipe.init(inst, def), kept for type
narrowing and traits identity.

* docs(wiki): add internal reference for v4 optionality semantics

Captures the current state of optin/optout/fallback, who sets each,
who reads each, the static/runtime divergence pattern, and walked-
through cases for the gnarly interactions (catch+optional, default
vs catch vs preprocess vs transform under optional, etc.).

Also documents the "flexible inputs, strict outputs" design principle
that motivates the runtime/static optin divergence on $ZodCatch and
$ZodTransform: schemas with a user-written escape hatch accept
undefined at runtime while keeping their declared input type strict.

Internal-only doc; not published.

* docs(wiki): explain why unknown.transform.pipe stays strict

Adds the explicit contrast between z.preprocess(fn, T) (= pipe(transform,
T), accepts absent) and z.unknown().transform(fn).pipe(T) (= pipe(pipe(
unknown, transform), T), rejects absent). The two look structurally
similar but only the leading position drives optin, and z.unknown()
isn't input-optional.

Also drops the stale "prototype only" caveat from the standalone
z.transform(fn) walkthrough — the runtime optin=optional move from
preprocess to transform is now a real part of this branch, not a
prototype.
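The "only the leading position drives optin" rule from the commits above can be sketched as a toy computation. `OptNode`, `pipeOf`, and `acceptsAbsentKey` are hypothetical; the values assigned to each leaf follow the commit messages (transform is `"optional"` at runtime, `z.string()` and `z.unknown()` are not input-optional).

```typescript
// Toy optin computation; a hypothetical model of the rules listed in the commit messages.
type Presence = "optional" | undefined;
interface OptNode {
  optin: Presence;
}

const unknownSchema: OptNode = { optin: undefined };     // per the text, z.unknown() is not input-optional
const stringSchema: OptNode = { optin: undefined };
const transformSchema: OptNode = { optin: "optional" };  // runtime value after the generalization

// A pipe's optin comes from its input (leading) side only.
const pipeOf = (input: OptNode, _output: OptNode): OptNode => ({ optin: input.optin });

// z.preprocess(fn, T) is modeled as pipe(transform, T).
const preprocessLike = pipeOf(transformSchema, stringSchema);
// z.string().transform(fn) is modeled as pipe(string, transform).
const stringThenTransform = pipeOf(stringSchema, transformSchema);
// z.unknown().transform(fn).pipe(T) is modeled as pipe(pipe(unknown, transform), T).
const unknownTransformPipe = pipeOf(pipeOf(unknownSchema, transformSchema), stringSchema);

// An object parser would accept an absent key only for input-optional properties.
const acceptsAbsentKey = (node: OptNode): boolean => node.optin === "optional";
```

Here `preprocessLike` and the standalone `transformSchema` accept absent keys, while `stringThenTransform` and `unknownTransformPipe` do not, matching the "unchanged" cases in the list.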
Development

Successfully merging this pull request may close these issues.

The results vary depending on the position of the “optional” in the schema with preprocess since v4.4.0
