feat(py-client): add protoc-gen-py-client (Python HTTP client generator)#172
Stand up cmd/protoc-gen-py-client and internal/pyclientgen mirroring the tsclientgen layout: one generated _client.py per .proto source, stdlib-only output, dataclasses + IntEnum + Protocol-typed transport. Field rendering, JSON-mapping annotations, and RPC method bodies land in subsequent commits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ions
Replace the message scaffold with full @dataclass field rendering plus to_dict / from_dict serialization. Honors int64_encoding, enum_encoding + enum_value, bytes_encoding, timestamp_format, nullable, empty_behavior, unwrap (root + map-value), flatten + flatten_prefix, and oneof discriminator configurations.
Adds a Python type-mapping helper module and JSON encode / decode expression builders so each field collapses to one or two lines in the generated to_dict / from_dict.
Enum decoding emits a per-enum helper that accepts string (proto name or custom enum_value) or int wire forms and raises on unknown values.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
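As a rough illustration of the one-or-two-lines-per-field serialization described above, here is a minimal sketch of what a generated message could look like. The `Blob` message, its field names, and the choice of string int64 encoding plus base64 bytes encoding are assumptions made for the example, not the generator's actual output.

```python
import base64
from dataclasses import dataclass
from typing import Any


@dataclass
class Blob:
    """Hypothetical message with one int64 field and one bytes field."""

    size: int = 0
    payload: bytes = b""

    def to_dict(self) -> dict[str, Any]:
        return {
            # int64 as a JSON string avoids precision loss above 2**53.
            "size": str(self.size),
            # bytes as standard base64 text.
            "payload": base64.b64encode(self.payload).decode("ascii"),
        }

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "Blob":
        return cls(
            size=int(data.get("size", 0)),
            payload=base64.b64decode(data.get("payload", "")),
        )
```

The user-facing types stay `int` and `bytes`; only the wire form changes with the annotation.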
…archy
Flesh out the client class with real request/response handling:
- path parameter substitution via urllib.parse.quote
- query parameter encoding via urlencode(doseq=True), with proper guards for string/bool/numeric/repeated fields
- header building from default + per-call + typed service/method header options generated from sebuf.http.service_headers and method_headers annotations
- transport invocation through the injectable HttpTransport protocol with per-call timeout fallback
- response parsing using each message's generated from_dict
- content-type negotiation surface (JSON implemented, proto raises NotImplementedError until a follow-up adds binary protobuf encoding)
- SSE streaming methods detected via HttpConfig.stream and emit NotImplementedError pointing at the follow-up issue
Replace the error stub with full per-*Error-message exception classes. Each class subclasses ApiError, exposes proto fields as constructor kwargs, and ships a populate() classmethod that builds an instance from a parsed JSON dict. An _ERROR_CLASSES registry indexed by required JSON key set lets the client's _raise_for_status pick the most specific exception for a response.
Add a constants.go module with Python type names and well-known type proto-name constants to satisfy goconst, and tighten every switch with the appropriate nolint:exhaustive pragmas where the default branch is intentional.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
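The path and query handling listed above can be sketched with the two stdlib calls the commit names. The `build_url` helper, its parameters, and the exact skip rules for unset values are assumptions for illustration; the generated client inlines its own logic.

```python
from urllib.parse import quote, urlencode


def build_url(base_url, path_template, path_params, query):
    """Substitute {name} path params, then append encoded query params."""
    path = path_template
    for name, value in path_params.items():
        # safe="" also escapes "/" so a value cannot add path segments.
        path = path.replace("{" + name + "}", quote(str(value), safe=""))

    pairs = {}
    for key, value in query.items():
        if value is None or value == "" or value == []:
            continue  # assumed guard: skip unset fields
        if isinstance(value, bool):
            pairs[key] = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            pairs[key] = [str(v) for v in value]
        else:
            pairs[key] = str(value)

    # doseq=True expands a list into repeated key=value pairs.
    qs = urlencode(pairs, doseq=True)
    return base_url.rstrip("/") + path + ("?" + qs if qs else "")
```

With `doseq=True`, a repeated field such as `tag=["x", "y"]` becomes `tag=x&tag=y` rather than a stringified list.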
Generator implementation is complete on this branch (3 commits). This doc hands off the remaining test, demo, docs, and PR-opening work to the next agent. Includes file-by-file pointers to the patterns to mirror, the lint command tuned for go.mod 1.26, a note about the pre-existing openapiv3 test failure on main, and the rationale for not cherry-picking from PR #132. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…bugs
Adds 15 per-feature test protos mirroring tsclientgen's testdata (plus a
new errors.proto exercising the per-*Error exception class generation
that is unique to py-client) and a golden test harness that also runs
`python3 -c "import ast; ast.parse(...)"` on each generated file to
catch syntactic regressions a string-compare cannot.
Capturing the goldens surfaced four generator bugs, all fixed here:
- error.go: empty set literal was emitted as `{}`, which is an empty
dict — violated the registry's `set[str]` type and would have crashed
if the runtime guard ever fell through.
- message.go: empty messages emitted `pass` followed by methods, which
is valid Python but redundant and noisy. The methods alone keep the class
body non-empty.
- types.go: Timestamp WKT fields with unix-seconds / unix-millis /
date formats were typed as `int` / `str`, but encoding.go always
calls `.timestamp()` / `.strftime()`, assuming `datetime`. Aligned
on `datetime` for every timestamp_format — the format only affects
the wire encoding, not the user-facing type.
- message.go: WKT message-kind fields (Timestamp, Duration, FieldMask,
Any, Empty, Struct, scalar wrappers) routed through the scalar
to_dict path were emitted unconditionally, even though they are
always nullable in proto3. Guard them like proto3 `optional` scalars
so the encoder never receives a `None` default, which would raise AttributeError.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
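To make the Timestamp fixes above concrete, here is a minimal sketch of a message whose field is always typed `datetime` while the wire value is unix seconds, with the `None` guard in `to_dict`. The `Event` message, the `createdAt` JSON key, and the UTC decoding are assumptions for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass
class Event:
    """Hypothetical message with a Timestamp field in unix-seconds wire format."""

    name: str = ""
    created_at: Optional[datetime] = None  # user-facing type is always datetime

    def to_dict(self) -> dict[str, Any]:
        out: dict[str, Any] = {"name": self.name}
        # Guard: message-kind fields may be unset, so skip None instead of
        # calling .timestamp() on it.
        if self.created_at is not None:
            out["createdAt"] = int(self.created_at.timestamp())
        return out

    @classmethod
    def from_dict(cls, data: dict[str, Any]) -> "Event":
        raw = data.get("createdAt")
        created = None if raw is None else datetime.fromtimestamp(raw, tz=timezone.utc)
        return cls(name=data.get("name", ""), created_at=created)
```

Without the guard, `Event(name="a").to_dict()` would call `.timestamp()` on `None` and raise AttributeError.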
…tants
Covers the pure helpers that the golden tests exercise only indirectly:
- snakeCase: CamelCase → snake_case method-name conversion
- headerOptionName: HTTP header → Python kwarg, with keyword-collision
escape (X-Class → class_)
- escapePyKeyword: hard + soft Python 3.10 keywords
- formatPyStringSet: empty input emits set(), not the dict-literal {}
- stripOptional, camelToSnake, isInvalidIdentifier: small string utils
Also lifts three repeated literals into constants (pyFalse, pyEmptySet,
pyListStr) so goconst is happy and there is one place to change the
emitted Python idiom for each.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
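The real helpers are Go functions inside the generator; the following Python sketch only illustrates the behaviour the tests describe. Dropping a leading `X-` is an assumption inferred from the `X-Class → class_` example, and both function names are hypothetical Python stand-ins.

```python
import keyword


def escape_py_keyword(name: str) -> str:
    """Append "_" when name is a hard or soft Python keyword."""
    if keyword.iskeyword(name) or keyword.issoftkeyword(name):
        return name + "_"
    return name


def header_option_name(header: str) -> str:
    """Turn an HTTP header name into a keyword-safe snake_case kwarg."""
    name = header.lower()
    if name.startswith("x-"):
        name = name[2:]  # assumed: the conventional X- prefix is dropped
    return escape_py_keyword(name.replace("-", "_"))
```

Note that the soft-keyword list depends on the interpreter version running this sketch, whereas the generator targets a fixed Python 3.10 list.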
The generator was stripping the enum-name prefix and lowercasing each
variant ("PRIORITY_HIGH" -> "high"). That was an ergonomic improvement
on paper but broke cross-generator wire compatibility: the Go server
emits enums via protojson default ("PRIORITY_HIGH"), while
_encode_enum_X falls back to IntEnum.name, which the renaming had
turned into "high". A Python client talking to a Go server was always
going to misparse enum-typed fields.
Keep the proto value name verbatim ("PRIORITY_HIGH") so .name and the
wire format agree. Users write Priority.PRIORITY_HIGH which is also
PEP 8-conformant (UPPER_CASE for enum members).
Removes the now-orphaned camelToSnake/isInvalidIdentifier helpers and
their unit tests. Verified end-to-end against the python-client-demo
Go server: enums round-trip correctly across CRUD, query filtering,
and the unwrap response path.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
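A small sketch of why verbatim member names keep `.name` and the wire format in agreement. Only `PRIORITY_HIGH` comes from the commit message; the other members and the two helper names are assumptions for illustration.

```python
from enum import IntEnum


class Priority(IntEnum):
    PRIORITY_UNSPECIFIED = 0
    PRIORITY_LOW = 1
    PRIORITY_HIGH = 2


def encode_priority(value: Priority) -> str:
    # The member name is the proto value name, so it matches protojson output.
    return value.name


def decode_priority(raw) -> Priority:
    """Accept the string or int wire form; raise ValueError on unknown values."""
    if isinstance(raw, str):
        try:
            return Priority[raw]
        except KeyError:
            raise ValueError(f"unknown Priority name: {raw!r}") from None
    return Priority(raw)  # IntEnum raises ValueError for unknown ints
```

Had the member been renamed to `high`, `encode_priority` would emit `"high"`, which a server expecting `"PRIORITY_HIGH"` would not recognize.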
Mirrors examples/ts-client-demo section-by-section so a reader can compare the two client surfaces directly. Shares the proto + Go HTTP server with the TS demo (NoteService: CRUD over Notes with enums, maps, optional fields, headers, query params, validation, unwrap response, and a typed NotFoundError).
The Python client demonstrates:
- Section 1: NoteServiceClientOptions with typed kwargs for service headers (api_key, tenant_id) and a default_headers escape hatch
- Section 2: every HTTP verb (GET/POST/PUT/PATCH/DELETE) with path params, request bodies, and method-level headers via call options
- Section 3: query parameter encoding for ListNotes (status/priority/sort/limit/offset)
- Section 4: header layering (service options vs call options vs per-call headers dict, and per-call override of a service header)
- Section 5: ValidationError parsing on min_len / max_len / missing required header, with the same buf.validate rules as the TS demo
- Section 6: typed NotFoundError exception subclass (not a generic ApiError) chosen by the _ERROR_CLASSES registry from response shape
- Section 7: custom HttpTransport injection (logging middleware) and the unwrap response path (NoteList.notes flattens on the wire)
Verified end-to-end against the Go server: `make demo` runs the full suite cleanly with no failures.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
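The logging-middleware idea from Section 7 can be sketched as a wrapper around any object that satisfies the transport Protocol. The `request` signature and the `(status, headers, body)` return shape below are assumptions; the generated `HttpTransport` Protocol may differ.

```python
from typing import Mapping, Optional, Protocol


class HttpTransport(Protocol):
    # Hypothetical signature, for illustration only.
    def request(self, method: str, url: str, headers: Mapping[str, str],
                body: Optional[bytes], timeout: Optional[float]) -> tuple: ...


class LoggingTransport:
    """Wraps another transport and records each call before delegating."""

    def __init__(self, inner: HttpTransport, log: list):
        self.inner = inner
        self.log = log

    def request(self, method, url, headers, body, timeout):
        self.log.append(f"{method} {url}")
        return self.inner.request(method, url, headers, body, timeout)


class FakeTransport:
    """Stand-in for a real HTTP transport; always answers 200 with '{}'."""

    def request(self, method, url, headers, body, timeout):
        return 200, {"Content-Type": "application/json"}, b"{}"
```

Because the Protocol is duck-typed, neither wrapper nor fake needs to inherit from it; matching the method shape is enough.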
…ADME + CLAUDE.md Adds the dedicated Python client reference and wires protoc-gen-py-client into the toolkit overview in README.md and CLAUDE.md (now six plugins, not five). docs/python-generation.md covers: generator output (dataclasses, IntEnum, transport Protocol, error hierarchy, options, client class), transport injection, URL building (path + query params), header layering, ApiError/ValidationError/typed *Error exceptions, every JSON-mapping annotation (with focus on Timestamp/int64/bytes/oneof — same wire format as the Go and TS generators), Python keyword escaping, the SSE NotImplementedError stub, known limitations, and a link to examples/python-client-demo. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both planning docs were time-limited handoffs between agents working on this branch (PYTHON_CLIENT_REWRITE.md → the rewrite plan after PR #132 was closed; PY_CLIENT_HANDOFF.md → the Phase 2 testing/docs/PR handoff). The work they tracked is now landed on this branch. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codecov Report
@@ Coverage Diff @@
## main #172 +/- ##
========================================
- Coverage 4.97% 4.52% -0.45%
========================================
Files 35 47 +12
Lines 4443 5586 +1143
========================================
+ Hits 221 253 +32
- Misses 4218 5329 +1111
Partials 4 4
|
🔍 CI Pipeline Status✅ Lint: success 📊 Coverage Report: Available in checks above |
|
Bug: error classes emitted before the enums they reference
When a proto message with an Error suffix references an enum field with a default value, the generated error class is declared before the enum, causing a NameError at import time.
Reproduction: our EventError message has `code: RejectionReason` as a field. The generated output emits `class EventError(ApiError)` at line 99, but `class RejectionReason(IntEnum)` isn't declared until line 215.
class EventError(ApiError):
When an *Error message has an enum-typed field, the generated default expression (`code: Reason = Reason.X`) is evaluated at class-definition time, so the enum class must already be declared. The previous file ordering emitted writeErrors before the enum loop, raising NameError at import time for any error that referenced an enum.
Reorder the file so enums are written before writeErrors. Messages already trailed both blocks and don't need adjustment: message-typed defaults are always None, so forward references in them are safe.
Add a regression case to testdata/proto/errors.proto (EventError + RejectionReason) matching the exact shape @yashagarwal-sarwa reported on #172, and upgrade the golden test to actually execute each generated file via importlib (ast.parse only checks syntax, not runtime NameErrors). The new import check also registers the module in sys.modules so @dataclass machinery can resolve string annotations from `from __future__ import annotations`.
Reported-by: @yashagarwal-sarwa
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
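The ordering problem can be reproduced outside the generator. The sketch below is a reduced stand-in for the generated file: the enum member name and the plain `Exception` base are assumptions. It shows that `ast.parse` accepts both orderings while executing the module fails only for the broken one.

```python
import ast

BROKEN = '''
from enum import IntEnum

class EventError(Exception):
    def __init__(self, code: "RejectionReason" = RejectionReason.REJECTION_REASON_UNSPECIFIED):
        self.code = code

class RejectionReason(IntEnum):
    REJECTION_REASON_UNSPECIFIED = 0
'''

FIXED = '''
from enum import IntEnum

class RejectionReason(IntEnum):
    REJECTION_REASON_UNSPECIFIED = 0

class EventError(Exception):
    def __init__(self, code: RejectionReason = RejectionReason.REJECTION_REASON_UNSPECIFIED):
        self.code = code
'''


def import_error(source: str):
    """Run source like a module import; return the exception name or None."""
    ast.parse(source)  # syntax-only check: succeeds for both orderings
    try:
        exec(compile(source, "<generated>", "exec"), {})
    except Exception as exc:
        return type(exc).__name__
    return None
```

Quoting the annotation does not help in the broken variant, because the default value is still evaluated when the `def` statement runs inside the class body.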
|
@yashagarwal-sarwa great catch, confirmed and fixed in 0166439. Root cause was exactly what you diagnosed. The fix is a one-line reorder (enums before errors); messages already trailed both blocks, and message-typed defaults are always None. Also upgraded the golden test: every generated file is now actually executed via importlib. Verified end-to-end against the python-client-demo Go server: clean run.
|
@yashagarwal-sarwa fix is up in 0166439 (now on the branch). Mind taking another look when you have a minute? |
|
EventError.from_dict() missing: EventResult.from_dict() fails at runtime
Error classes generated from proto *Error messages get to_dict() and populate(), but NOT from_dict(). However, when EventError appears as a field on a regular message (EventResult.error), the generated EventResult.from_dict() calls EventError.from_dict(), which doesn't exist:
In EventResult.from_dict(), generated code:
kwargs["error"] = EventError.from_dict(data["error"])  # AttributeError: no from_dict
When an *Error message is embedded as a field on another message,
the parent's generated from_dict calls EventError.from_dict(...) —
but error classes only had populate() and to_dict(), so the call
raised AttributeError at runtime.
Add a from_dict classmethod on every *Error class that delegates to
populate() with neutral status/body/headers. This keeps the error
class shape interchangeable with regular messages for serialization
purposes, which is what the parent message's deserializer assumes.
Extend errors.proto with EventResult { EventError error } as a
regression case matching the exact shape @yashagarwal-sarwa reported
on #172. The import test (added in 0166439) catches the AttributeError
on the next regen attempt.
Reported-by: @yashagarwal-sarwa
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
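A minimal sketch of the delegation described above, assuming a `populate(data, status, body, headers)` signature and a single `reason` field; the generated classes may order or name these differently.

```python
class ApiError(Exception):
    def __init__(self, status: int = 0, body: str = "", headers=None):
        super().__init__(f"HTTP {status}")
        self.status = status
        self.body = body
        self.headers = headers or {}


class EventError(ApiError):
    def __init__(self, status=0, body="", headers=None, *, reason: str = ""):
        super().__init__(status, body, headers)
        self.reason = reason

    @classmethod
    def populate(cls, data: dict, status: int, body: str, headers: dict) -> "EventError":
        # Hypothetical signature: builds an instance from a parsed JSON dict.
        return cls(status, body, headers, reason=data.get("reason", ""))

    @classmethod
    def from_dict(cls, data: dict) -> "EventError":
        # No HTTP response is involved when the error is a nested field,
        # so pass neutral status/body/headers.
        return cls.populate(data, 0, "", {})
```

A parent message's `from_dict` can then call `EventError.from_dict(data["error"])` exactly as it would for any other nested message.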
…t-map unwrap
Three generator bugs surfaced by the new examples/python-encoding-demo
end-to-end round-trip against a real Go server:
1. decodeTimestampExpr for TIMESTAMP_FORMAT_DATE returned the raw
"YYYY-MM-DD" string instead of a datetime, even though the field
type is datetime. Now parses with datetime.strptime so the assigned
value matches the declared annotation.
2. The Python flatten encoder iterated nested.to_dict().items() and
prefix-tagged each key — which used JSON names (camelCase). The Go
HTTP plugin's flatten encoder uses proto names (snake_case), so the
Python side emitted `author_zipCode` while the server emits
`author_zip_code`, breaking round-trips. Rewritten to emit one wire
key per nested field using the field's proto name, with the matching
decoder reading those keys directly. Encoder + decoder now agree
with the Go server byte for byte.
3. annotations.FindUnwrapField is documented as a list-only helper, but
py-client's root-unwrap codepaths called it for messages whose unwrap
field is a map. That silently produced empty `to_dict() -> {}` /
`from_dict() -> cls()` on every root-map unwrap message. Added a
local findRootUnwrapField that doesn't filter on IsList(); kept the
shared helper unchanged so other generators stay untouched.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
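Fixes 1 and 2 can be sketched in a few lines. The `Author` message, its fields, and the `author_` prefix are taken loosely from the `author_zip_code` example; the helper names and the naive (timezone-less) datetime result are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Author:
    """Hypothetical nested message that gets flattened into its parent."""

    name: str = ""
    zip_code: str = ""


# (proto_name, attribute) pairs; proto names are snake_case on the wire.
AUTHOR_FIELDS = [("name", "name"), ("zip_code", "zip_code")]


def encode_flattened(nested: Author, prefix: str) -> dict:
    """Emit one wire key per nested field: prefix + proto name."""
    return {prefix + proto: getattr(nested, attr) for proto, attr in AUTHOR_FIELDS}


def decode_flattened(data: dict, prefix: str) -> dict:
    """Read the same prefixed proto-name keys back into constructor kwargs."""
    return {attr: data[prefix + proto]
            for proto, attr in AUTHOR_FIELDS if prefix + proto in data}


def decode_date(raw: str) -> datetime:
    """Parse a "YYYY-MM-DD" wire value so it matches the datetime annotation."""
    return datetime.strptime(raw, "%Y-%m-%d")
```

Iterating over `to_dict().items()` instead would have produced `author_zipCode`, because `to_dict` uses JSON (camelCase) names.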
Two new end-to-end examples that round-trip every protoc-gen-py-client feature (except SSE, tracked as #167) against a real Go server. Each example follows the established repo pattern: single focus, Go server + Python client + `make demo` target.
examples/python-encoding-demo (51 assertions)
Round-trips every JSON-mapping annotation: enum_value override, timestamp_format (RFC3339/UNIX_S/UNIX_MS/DATE), int64_encoding STRING+NUMBER, bytes_encoding base64+HEX, flatten+flatten_prefix, oneof_config nested + flattened variants, all three unwrap variants (root repeated, root map, map-value), Python keyword field-name escaping (`from`/`class`/`return`), and repeated query parameters. Each annotation lives on its own message because the Go HTTP plugin emits one MarshalJSON method per (message, annotation) and would produce duplicate methods otherwise.
Writing this demo surfaced three real generator bugs that were invisible to the golden tests (they pass ast.parse and import-time exec but never check wire-format compatibility with the Go side). Fixes shipped in the preceding commit.
examples/python-errors-demo (41 assertions)
Covers every error surface: ValidationError parsed from a buf.validate body, registry-based disambiguation across NotFoundError / ConflictError / RateLimitError, and an *Error embedded as a field on a regular response (the BatchCreateItemResult pattern from @yashagarwal-sarwa's #172, which exercises the FieldValidationError.from_dict alias that lives alongside populate()).
CLAUDE.md updated to list both examples in the project-structure section.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
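The registry-based disambiguation mentioned for the errors demo could work roughly as follows. The key sets, the list-of-pairs layout, and the "largest matching key set wins" rule are assumptions for illustration; the generated `_ERROR_CLASSES` may be structured differently.

```python
class ApiError(Exception):
    pass


class NotFoundError(ApiError):
    pass


class ConflictError(ApiError):
    pass


# (required JSON keys, exception class); hypothetical key sets.
_ERROR_CLASSES = [
    ({"resourceType", "resourceId"}, NotFoundError),
    ({"resourceType", "resourceId", "existingVersion"}, ConflictError),
]


def pick_error_class(body: dict) -> type:
    """Return the class whose required keys are all present in body.

    When several match, prefer the one with the most required keys;
    fall back to ApiError when none match.
    """
    best, best_size = ApiError, -1
    for required, cls in _ERROR_CLASSES:
        if required.issubset(body.keys()) and len(required) > best_size:
            best, best_size = cls, len(required)
    return best
```

Preferring the larger key set is one way to read "most specific": a body that satisfies both entries resolves to the class that demands more fields.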
|
@yashagarwal-sarwa update — pushed the from_dict fix you flagged, plus broader coverage. The new commits are:
The encoding demo specifically surfaced three issues we didn't have coverage for before:
All fixed and round-tripping clean. Mind taking another look? |
Adds `protoc-gen-py-client`, the sixth generator in the sebuf toolkit. Generates type-safe Python HTTP clients that depend only on the Python standard library (Python 3.10+); users plug in `requests` / `httpx` / `aiohttp` via a duck-typed `HttpTransport` Protocol at construction time.
What's in the box
- `cmd/protoc-gen-py-client/` + `internal/pyclientgen/`: full generator
- `@dataclass`es, `IntEnum`s, `HttpTransport` Protocol + `UrllibTransport` default, typed `*ClientOptions` / `*CallOptions`, one client class per service, and a per-`*Error`-message `ApiError` subclass hierarchy chosen at runtime from response shape via `_ERROR_CLASSES`
- `(sebuf.http.*)` JSON-mapping annotations honored: `int64_encoding`, `enum_encoding` + `enum_value`, `nullable`, `empty_behavior`, `timestamp_format`, `bytes_encoding`, `oneof_config` + `oneof_value`, `flatten` + `flatten_prefix`, `unwrap` (map-value + root)
- … `urlencode(doseq=True)`), typed service + method header kwargs, content-type negotiation surface, Python 3.10 keyword escaping (hard + soft)
- … `python3 -c "import ast; ast.parse(...)"` so syntactic regressions are caught even when the golden string-compare wouldn't fire
- `examples/python-client-demo/` mirrors `examples/ts-client-demo/` section-by-section against the same Go HTTP server; `make demo` runs the full suite end-to-end
What's deferred
Filed as follow-up issues:
- … `NotImplementedError` stub ships now; full iterator support is the follow-up)
- … `__init__.py` re-exports
Generator bugs caught during Phase 2 testing
These were latent in the implementation commits and fixed before locking the goldens:
- … `{}` (a dict) instead of `set()`
- … `pass` followed by methods on field-less messages
- `google.protobuf.Timestamp` typed as `int` / `str` for unix/date formats while the encoder always called `datetime` methods; aligned on `datetime` everywhere
- … `to_dict`
- … `Priority.high`) but the encoder emitted `IntEnum.name`, producing `"high"` instead of Go's `"PRIORITY_HIGH"`, which is wire-format-breaking. Restored verbatim proto names
Credit
This builds on test fixtures and serialization ideas from @elzalem's #132. The architecture intentionally follows the existing `tsclientgen` / `clientgen` pattern (calling `internal/annotations/` directly) rather than the `contractmodel` layer from that PR, so we could ship without blocking on the C# generator design discussion separately tracked on #131.
Test plan
- `make build`: produces `bin/protoc-gen-py-client`
- `go test ./internal/pyclientgen/...`: golden tests + helper unit tests all pass
- `golangci-lint run ./internal/pyclientgen/... ./cmd/protoc-gen-py-client/...`: 0 issues
- `cd examples/python-client-demo && make demo`: all 7 demo sections green end-to-end against the real Go server
- `docs/python-generation.md` and the new entries in `README.md` + `CLAUDE.md`
🤖 Generated with Claude Code