Skip to content

codegen: propagate bytes_fields to map<K, bytes> values (#76)#147

Open
senthil1216 wants to merge 2 commits into
anthropics:mainfrom
senthil1216:feat/bytes-fields-map-values
Open

codegen: propagate bytes_fields to map<K, bytes> values (#76)#147
senthil1216 wants to merge 2 commits into
anthropics:mainfrom
senthil1216:feat/bytes-fields-map-values

Conversation

@senthil1216
Copy link
Copy Markdown

Summary

  • Under use_bytes_type() (or use_bytes_type_in(...)), map<K, bytes> values now become bytes::Bytes instead of Vec<u8> — the lone hold-out among bytes contexts. This lets map values participate in the to_owned_from_source zero-copy slice_ref path that singular / optional / repeated / oneof bytes_fields already use.
  • One intentional carve-out: map<bytes, bytes> (only reachable via strict_utf8_mapping(true)) keeps Vec<u8> values because the existing bytes_key_bytes_val_map JSON helper is concrete HashMap<Vec<u8>, Vec<u8>>. Documented on both knobs.
  • Generated Arbitrary impls for the affected map fields go through a new __private::arbitrary_bytes_map<K> shim (K: Arbitrary + Eq + Hash — every proto map-key type qualifies). No turbofish needed at the call site.
  • Breaking for code already using use_bytes_type() on a proto containing map<K, bytes>: rewrite value construction from Vec<u8> to bytes::Bytes (b"v".to_vec()bytes::Bytes::from_static(b"v")).

Carve-out logic lives in classify_field::map_value_use_bytes (the source of truth) and is mirrored in impl_message::map_merge_arm, impl_text::map_merge_arm, and view::map_to_owned_expr with parallel "see classify_field" comments.

Closes #76 (item a). Item c (checked_add in bytes_from_source) was already done in #84; item d (string_fields shared-buffer strings) remains for a future PR. Net diff ≈ 120 lines excluding tests.

Test plan

  • Updated test_bytes_type_map_value_stays_vectest_bytes_type_map_value_uses_bytes (positive type assertion plus wire / view→owned / JSON / text agreement).
  • New test_bytes_type_map_value_to_owned_from_source_zero_copy asserts slice_ref aliasing into the source Bytes buf for map values — same property the singular/repeated/oneof tests pin.
  • Updated test_bytes_type_json_all_contexts_roundtrip + test_bytes_type_view_encode_roundtrip to construct map values as Bytes.
  • cargo test --workspace — 28 test suites pass, no failures.
  • cargo clippy --workspace --all-targets -- -D warnings — clean.
  • cargo fmt --all --check — clean.
  • cargo check -p buffa --no-default-features (host no_std) — clean.
  • task gen-wkt-types and gen-bootstrap-types produce no diff (neither set has map<K, bytes> fields).
  • Pre-commit rust-code-reviewer + rust-api-ergonomics-reviewer agents per CLAUDE.md — no Critical findings; High (CHANGELOG entry) and Medium (docstring updates) addressed in this PR.

🤖 Generated with Claude Code

Under `use_bytes_type()`, singular/optional/repeated/oneof bytes fields
already become `bytes::Bytes`; `map<K, bytes>` values were the lone
hold-out at `Vec<u8>`. Closes that asymmetry so map values participate
in the `to_owned_from_source` zero-copy `slice_ref` path. One carve-out:
`map<bytes, bytes>` (only reachable via `strict_utf8_mapping`) keeps
`Vec<u8>` to preserve the existing JSON helper's concrete signature;
documented on both knobs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown


Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution. You can sign the CLA by just posting a Pull Request Comment same as the below format.


I have read the CLA Document and I hereby sign the CLA


You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot.

Addresses review feedback on the map<K, bytes> bytes_fields change:

- Extract the "value uses Bytes" decision into a single
  `map_value_use_bytes` helper, replacing the four copy-pasted predicates
  in classify_field, the binary and text map_merge_arm decoders, and
  view::map_to_owned_expr. The helper takes Option<Type> key/value so it
  mirrors the original predicate and treats a missing entry field as
  non-bytes. A split decision would only surface as a compile error in
  the consuming crate, so a single source of truth matters.
- Drop the redundant field_uses_bytes re-check on the view map path by
  emitting bytes_from_source directly; the carve-out already implies it.
- Docs: spell out the read-side migration (Bytes has no inherent
  as_slice) and the precise carve-out trigger (strict_utf8_mapping plus a
  map key carrying features.utf8_validation = NONE; strict_utf8_mapping
  alone keeps a plain map<string, bytes> value as Bytes). Add intra-doc
  links from the no-arg use_bytes_type to use_bytes_type_in.
- Tests: carve-out coverage (a NONE-keyed map<string, bytes> keeps
  Vec<u8> values and round-trips via bytes_key_bytes_val_map), an owned
  binary-decode assertion for a bytes map value, and a by_key probe in
  the arbitrary smoke test.
@iainmcgin iainmcgin force-pushed the feat/bytes-fields-map-values branch from bc97dd1 to 2a25397 Compare May 24, 2026 04:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Follow-ups from #74: bytes_fields for map values, checked_add in bytes_from_source

2 participants