feat(sensing-server): adaptive person count — RollingP95 + dedup_factor runtime API #491
Merged
…or runtime API
RollingP95 adaptive normalizer (ADR-044 §5.2):
- Streaming P95 estimator (600-sample / ~30 s window) replaces fixed-scale
denominators (variance/300, motion/250, spectral/500) that saturated against
live ESP32 values, collapsing dynamic range to zero.
- Cold-start (<60 samples) falls back to legacy denominators — day-0 behaviour
is preserved.
- Three new fields on AppStateInner: p95_variance, p95_motion_band_power,
p95_spectral_power (all RollingP95::new(600, 60)).
- compute_person_score() refactored to accept &AppStateInner; all three call
sites (wifi, wifi-fallback, simulated) updated.
- 5 unit tests in rolling_p95_tests module.
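The estimator described above can be sketched roughly as follows. This is a minimal illustration, not the PR's implementation: the sort-on-read strategy, the nearest-rank percentile rule, and the `normalize` helper are assumptions made for the example. Only the name `RollingP95`, the `new(600, 60)` arguments, and the legacy denominators come from the commit message.

```rust
use std::collections::VecDeque;

/// Sliding-window P95 estimator (illustrative sketch, not the PR's code).
pub struct RollingP95 {
    window: VecDeque<f64>,
    capacity: usize,
    min_samples: usize,
}

impl RollingP95 {
    /// `capacity` is the window length; `min_samples` is the cold-start threshold.
    pub fn new(capacity: usize, min_samples: usize) -> Self {
        assert!(capacity > 0, "window capacity must be non-zero");
        Self { window: VecDeque::with_capacity(capacity), capacity, min_samples }
    }

    /// Push a sample, evicting the oldest one once the window is full.
    /// Non-finite samples are skipped (an assumption of this sketch).
    pub fn push(&mut self, x: f64) {
        if !x.is_finite() {
            return;
        }
        if self.window.len() == self.capacity {
            self.window.pop_front();
        }
        self.window.push_back(x);
    }

    /// P95 of the current window, or `None` while still in cold start.
    pub fn p95(&self) -> Option<f64> {
        let n = self.window.len();
        if n == 0 || n < self.min_samples {
            return None;
        }
        let mut sorted: Vec<f64> = self.window.iter().copied().collect();
        sorted.sort_by(|a, b| a.total_cmp(b));
        // Nearest-rank percentile: ceil(0.95 * n) as a one-based rank,
        // computed in integer arithmetic to avoid float rounding.
        let rank = (95 * n + 99) / 100;
        Some(sorted[rank - 1])
    }
}

/// Normalize `value` into [0, 1]: adaptive P95 denominator when warm,
/// legacy fixed denominator during cold start (hypothetical helper).
pub fn normalize(value: f64, est: &RollingP95, legacy_denominator: f64) -> f64 {
    let denom = est.p95().filter(|p| *p > 0.0).unwrap_or(legacy_denominator);
    (value / denom).clamp(0.0, 1.0)
}
```

With `RollingP95::new(600, 60)`, a variance of 150 normalizes to 0.5 against the legacy denominator of 300 until the 60th sample arrives; after that the denominator follows the observed distribution. Sorting on every read is O(n log n), which is acceptable for a 600-sample window but is a simplification.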
dedup_factor runtime API (ADR-044 §5.3):
- New field dedup_factor: f64 (default 3.0) on AppStateInner.
- fuse_or_fallback() gains dedup_factor param; fallback switches from max() to
sum/dedup_factor (ceiling), matching the fork's sum-based aggregation.
- RuntimeConfig struct + load/save_runtime_config() for data/config.json
persistence across restarts.
- Three new REST endpoints:
GET /api/v1/config/dedup-factor
POST /api/v1/config/dedup-factor
POST /api/v1/config/ground-truth (auto-tune from known person count)
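To make the change in fallback behaviour concrete, the sketch below contrasts the old `max()` rule with the new `ceil(sum / dedup_factor)` rule. The function names, the divisor guard, and the auto-tune formula are assumptions for illustration; the commit message states only that the fallback switched to sum/dedup_factor with a ceiling and that the ground-truth endpoint tunes from a known person count.

```rust
/// Legacy fallback: the largest per-node cluster count.
pub fn fallback_max(per_node: &[usize]) -> usize {
    per_node.iter().copied().max().unwrap_or(0)
}

/// Sum-based fallback: ceil(sum / dedup_factor).
pub fn fallback_sum(per_node: &[usize], dedup_factor: f64) -> usize {
    let sum: usize = per_node.iter().sum();
    // Guard against a zero, negative, or non-finite divisor.
    // This validation is an assumption; the real handler may differ.
    let divisor = if dedup_factor.is_finite() && dedup_factor > 0.0 {
        dedup_factor
    } else {
        1.0
    };
    (sum as f64 / divisor).ceil() as usize
}

/// Hypothetical auto-tune rule: choose the divisor that maps the observed
/// cluster sum onto the known person count.
pub fn tune_dedup_factor(observed_sum: usize, known_persons: usize) -> Option<f64> {
    if observed_sum == 0 || known_persons == 0 {
        return None;
    }
    Some(observed_sum as f64 / known_persons as f64)
}
```

With per-node counts `[2, 3, 2]` and the default factor 3.0, the legacy rule reports 3 and the sum-based rule reports ceil(7 / 3) = 3; with `[2, 3, 1]` the sum-based rule reports 2.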
Explicitly NOT included:
- lambda=5.0 (upstream keeps its 0.1 default — deployment-specific tuning)
- CC intensity threshold 0.3 and min-cluster-size 4 hardcodes
- max_cc_size filter removal
Owner
|
Hi @schwarztim — current
The PRs were all evaluated: thoughtful and well-described. Once you rebase against |
This was referenced May 17, 2026
ruvnet added a commit that referenced this pull request on May 19, 2026
…count Merge #491: feat(sensing-server): adaptive person count — RollingP95 + dedup_factor (integration on schwarztim's behalf)
ruvnet added a commit that referenced this pull request on May 19, 2026
#654) The previous table mixed status badges (✅ / ⚠️ / 🔬) and verbose "pending wiring / not yet released" caveat columns. Rewrites it as "What / How / Speed-or-scale": three columns, present tense, no status column.

Captures what actually shipped this week:

* Presence detection now points at the trained head shipped on HF (100% validation accuracy), with the phase-variance fallback reframed as a no-model option rather than a "loader pending" caveat.
* 17-keypoint pose is its own row now: cog-pose-estimation v0.0.1 binaries on GCS, 8.4 ms cold-start on Pi 5, train-your-own in 2.1 s on RTX 5080. References ADR-101 + the benchmark log.
* Multi-person counting drops the "Heuristic, not learned" framing. The adaptive P95 normalisation from PR #491 is in tree, the runtime dedup-factor knob is documented, and the six learned drop-in counters from the Cog catalog are linked: occupancy-zones, elevator-count, queue-length, customer-flow, clean-room, person-matching.
* Edge intelligence row now points at the 105-cog catalog (ADR-102) instead of just the Cognitum Seed hardware.
* Camera-supervised fine-tune row reflects the actual measured training time (2.1 s on RTX 5080 for 400 epochs) instead of the laptop estimate.
* Drops the status-legend footer (no more ✅ / ⚠️ / 🔬 column to legend). Replaces it with a pointer down to the Edge Module Catalog.

The ESP32 + Cognitum Seed deployment-options row gets the same treatment: cleaner list of what's included, no "Pose pending weights" parenthetical (the cog ships today).

Net effect: same information, present tense, positive voice. Nothing removed beyond status badges + pending-work parentheticals; all genuine engineering details (e.g. "needs ~30 s ambient calibration" for the fallback) are preserved inline.
ruvnet added a commit that referenced this pull request on May 21, 2026
Motivated by #499 (multi-node double-skeletons) which PR #491 stopped the bleeding on but didn't take to the WiFi-CSI literature's state of the art. Designs a learned counter that replaces today's slot heuristic + dedup_factor knob, reusing the primitives we've already shipped this week:

* Candle / RTX 5080 training pipeline (proven yesterday, 2.1 s for 400 epochs on pose_v1.safetensors)
* HF presence encoder as initialization (architectures compatible, unlike the pose head case)
* ruvector-mincut (Stoer-Wagner) for multi-node fusion upper-bound
* Cog packaging spec (ADR-100) + edge module registry (ADR-102)
* Paired-data pipeline (PR #641 streaming-safe align-ground-truth.js): `n_persons` labels come for free; no new data collection campaign required to bootstrap.

Architecture:

    per-node CSI [56×20] -> frozen HF encoder -> 128-dim embedding
        > count head (softmax {0..7})
        > confidence head (sigmoid)
    N nodes' distributions -> confidence-weighted log-sum
        -> Stoer-Wagner min-cut upper-bound clip
        -> { count, confidence, count_p95_low, count_p95_high, per_node_breakdown }

Compares the proposal explicitly against WiCount / DeepCount / CrossCount / HeadCount published numbers and is honest about the hardware gap (their 3x3 MIMO research NICs vs our 1x1 SISO ESP32-S3). v0.1.0 acceptance gates target >=80% within-+/-1 same-room and >=60% cross-room: modest on purpose, bounded by the same paired-data scarcity #645 documents for pose. The framework is the deliverable; the accuracy follows the data.

Includes:

* Architecture diagram in ascii
* Comparison table vs published WiFi-CSI counting SOTA
* Per-failure-mode mapping from #499 symptoms to how the learned counter addresses each
* v0.1.0 + v0.2.0 acceptance gates with measurable thresholds
* Repo layout for the new `v2/crates/cog-person-count/` crate
* Five-step migration plan from this ADR -> first GCS release

Status: Proposed. Implementation follows in the same incremental pattern ADR-101 used: scaffold-cog PR -> train+publish PR -> server-wiring PR.
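The "within-+/-1" acceptance gate mentioned above can be read as the fraction of frames whose predicted count is at most one away from the labelled count. A minimal sketch of that metric follows; the function name and the `u8` count type are assumptions for illustration, not taken from ADR-103.

```rust
/// Fraction of predictions within +/-1 of the labelled person count.
/// Returns `None` for empty or mismatched inputs (hypothetical helper).
pub fn within_one_accuracy(predicted: &[u8], actual: &[u8]) -> Option<f64> {
    if predicted.is_empty() || predicted.len() != actual.len() {
        return None;
    }
    let hits = predicted
        .iter()
        .zip(actual)
        .filter(|(p, a)| p.abs_diff(**a) <= 1)
        .count();
    Some(hits as f64 / predicted.len() as f64)
}
```

For example, predictions `[1, 2, 3, 0, 5]` against labels `[1, 3, 5, 0, 4]` give four hits out of five, i.e. 0.8, which would just meet a >=80% same-room gate.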
ruvnet added a commit that referenced this pull request on May 21, 2026
… (ADR-103) (#694) First implementation PR for ADR-103. Same incremental shape that ADR-101 used: scaffold the cog crate, ship a stub-backend release that satisfies the runtime contract + 15 tests + measured cold-start, then follow up with the trained count_v1.safetensors in a separate PR.

What ships:

* v2/crates/cog-person-count/: new workspace member.
  - Cargo.toml: candle-core/candle-nn 0.9 (cpu default, cuda feature opt-in), safetensors, ureq, sha2; same dep shape as the pose cog but minus wifi-densepose-train (this cog has no training-side consumer, so the dep tree is materially smaller → 2.36 MB binary vs the pose cog's 4.5 MB).
  - src/inference.rs: CountNet (Conv1d 56→64→128→128 encoder + count head Linear(128→64→8)+softmax + confidence head Linear(128→32→1)+sigmoid). Stub backend returns `{1-person, 0-confidence}` honestly when no safetensors present.
  - src/fusion.rs: fuse_confidence_weighted(): Bayesian product of per-node distributions with confidence-weighted log-sum, plus fuse_with_mincut_clip() hook for the v0.2.0 Stoer-Wagner upper-bound (`ruvector-mincut` dep lands when min-cut graph builder is ready). Confidences floored at 1e-3 and probs floored at 1e-9 before logs, so no NaN propagation.
  - src/publisher.rs: emits {count, confidence, count_p95_low, count_p95_high, n_nodes, probs} per ADR-103 §"Output".
  - src/main.rs: full ADR-100 four-verb CLI (version|manifest|health|run). The `run` subcommand explicitly returns "wiring pending v0.0.1" so the in-process library API is the v0.0.1-clean integration path.
  - tests/smoke.rs (8 tests) + fusion::tests (7 tests, in-lib): 15 total, all green. Cover stub-backend behaviour, wrong-shape rejection, fusion math (empty / single / agreement / high-conf override / normalisation), p95-range correctness, and min-cut clip semantics.
  - cog/{manifest.template.json, config.schema.json, README.md} + cog/artifacts/ placeholder dir.
* v2/Cargo.toml: registers the new workspace member.

Verified locally:

    cargo check -p cog-person-count --no-default-features → clean
    cargo test -p cog-person-count --no-default-features → 8/8 pass
    cargo test -p cog-person-count --lib → 7/7 pass
    cargo build -p cog-person-count --release → 2.36 MB binary
    ./cog-person-count version → "person-count 0.3.0"
    ./cog-person-count manifest → JSON skeleton
    ./cog-person-count health → backend:stub, count:1, conf:0, p95:[1,1]

Cold-start: 30 sequential `health` invocations → 53.3 ms/invocation (vs cog-pose-estimation's 76.2 ms; smaller dep tree)

cog/README.md adds:

* Security section: six-row threat table covering safetensor mmap trust, non-finite outputs, sensing fetch failures, fusion divide-by-zero / log-of-zero, min-cut degenerate cases, and stdout spoofing.
* Performance / optimization section: binary size, release profile (already opt-level=3 / lto=fat / codegen-units=1 / strip=true at workspace level), cold-start comparison table, projected warm-path latency budget.

Still pending (separate PRs, ADR-103 §"Migration"):

* Train count_v1.safetensors on the existing 1,077 paired samples with `n_persons` labels (Candle on RTX 5080, same script that produced pose_v1.safetensors yesterday).
* `run` subcommand wiring (long-running polling loop, same shape as cog-pose-estimation::runtime).
* Cross-compile + sign + GCS upload (mirror of cog-pose-estimation release pipeline).
* Server-side `csi.rs::score_to_person_count` call-site rewire to consume this cog when installed; falls back to PR #491's heuristic when not.
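The confidence-weighted log-sum fusion described for src/fusion.rs can be sketched as below. This is an illustration under stated assumptions rather than the cog's code: the `(Vec<f64>, f64)` input shape, the upper clamp at 1.0, the final softmax-style renormalisation, and the `argmax` helper are all hypothetical. The 1e-3 confidence floor and 1e-9 probability floor are the values quoted in the commit message.

```rust
const CONF_FLOOR: f64 = 1e-3; // confidence floor quoted in the commit message
const PROB_FLOOR: f64 = 1e-9; // probability floor quoted in the commit message

/// Fuse per-node count distributions into one normalized distribution.
/// Each node supplies (class probabilities, confidence in [0, 1]).
/// Returns `None` for empty input or mismatched class counts.
pub fn fuse_confidence_weighted(nodes: &[(Vec<f64>, f64)]) -> Option<Vec<f64>> {
    let classes = nodes.first()?.0.len();
    if classes == 0 || nodes.iter().any(|(p, _)| p.len() != classes) {
        return None;
    }
    let mut log_sum = vec![0.0_f64; classes];
    for (probs, conf) in nodes {
        // f64::max returns the non-NaN operand, so a NaN input becomes the floor.
        let weight = conf.max(CONF_FLOOR).min(1.0);
        for (acc, p) in log_sum.iter_mut().zip(probs) {
            *acc += weight * p.max(PROB_FLOOR).min(1.0).ln();
        }
    }
    // Exponentiate relative to the maximum for numerical stability, then normalize.
    let max = log_sum.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let exp: Vec<f64> = log_sum.iter().map(|l| (l - max).exp()).collect();
    let total: f64 = exp.iter().sum();
    Some(exp.iter().map(|e| e / total).collect())
}

/// Index of the most probable class (0 for an empty slice).
pub fn argmax(probs: &[f64]) -> usize {
    probs
        .iter()
        .enumerate()
        .fold(0, |best, (i, p)| if *p > probs[best] { i } else { best })
}
```

Weighting log-probabilities by confidence means a low-confidence node contributes a flatter factor to the product, so a confident node can override it. For instance, a node voting for one person at confidence 0.1 is outvoted by a node voting for two people at confidence 0.9.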
ruvnet added a commit that referenced this pull request on May 21, 2026
…al (#697) Phase 4 of ADR-103. Adds the long-running polling loop so the cog's fourth verb (`run`) does real work, completing the ADR-100 runtime contract end-to-end:

    cog-person-count version → "person-count 0.3.0"
    cog-person-count manifest → JSON skeleton
    cog-person-count health → loads weights + 1-shot infer + emit
    cog-person-count run --config → long-running per-frame emit ← THIS

What ships:

* src/runtime.rs (new): `run_loop` polls sensing_url every poll_ms, slides a [56, 20] CSI window, runs InferenceEngine::infer, emits publisher::person_count events. Same shape as cog-pose-estimation::runtime: fetch_frame extracts amplitudes from `snapshot.nodes[0].amplitude[]`, fails open on connect errors with a WARN log rather than crashing.
* src/lib.rs: registers the runtime module.
* src/main.rs: cmd_run now loads RunConfig from a JSON file, builds the InferenceEngine (with weights if cfg.model_path is set, otherwise auto-discover), emits a run.started event, and hands off to the Tokio multi-thread runtime's block_on(run_loop).

Single-node fusion is a no-op for N=1 today; v0.2.0 will append predictions from sibling nodes and call fusion::fuse_confidence_weighted before emit.

Verified locally:

    cargo check -p cog-person-count --no-default-features → clean
    cargo test -p cog-person-count → 15/15 pass (no regressions)
    cargo build -p cog-person-count --release → 2.36 MB unchanged
    ./cog-person-count run --config bad-config.json:
      line 1: {"event":"run.started","fields":{"cog":"person-count", "sensing_url":"http://127.0.0.1:9999/...",poll_ms:100, "model_path":"(auto-discover)"}}
      line 2: WARN sensing-server fetch failed error=Connection Failed: Connect error: actively refused
      (loop alive; exits cleanly on SIGTERM, no crash, no NaN)

Also adds a "Relationship to the in-process score_to_person_count heuristic" section to cog/README.md explaining the dual-emitter design (sensing-server keeps emitting the PR #491 slot heuristic; the cog runs out-of-process and emits person.count events from the learned model). Operators choose by installing the cog or not; no sensing-server rebuild required.

ADR-103 §"Migration" status:

    1. Land ADR + scaffold ........... done (#693, #694)
    2. Train count_v1 ................ done (#695)
    3. Cross-compile + sign + GCS .... done (#696)
    4. Server-side wiring ............ done; out-of-process design means no rewire needed; this cog is the wiring.
    5. v0.2.0 multi-room + LoRA ...... data-bound (#645)
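The "[56, 20] CSI window" that `run_loop` slides can be pictured as a ring of the 20 most recent frames, each holding 56 amplitudes. The sketch below is a guess at the shape of such a buffer, not the cog's implementation: the `CsiWindow` type, the interpretation of 56 as subcarriers and 20 as frames, and the subcarrier-major flattening are all assumptions.

```rust
use std::collections::VecDeque;

pub const SUBCARRIERS: usize = 56; // assumed meaning of the first dimension
pub const FRAMES: usize = 20; // assumed meaning of the second dimension

/// Ring buffer of the most recent CSI amplitude frames (hypothetical type).
pub struct CsiWindow {
    frames: VecDeque<[f32; SUBCARRIERS]>,
}

impl CsiWindow {
    pub fn new() -> Self {
        Self { frames: VecDeque::with_capacity(FRAMES) }
    }

    /// Append one frame, evicting the oldest when full.
    /// Returns false (and stores nothing) if the frame has the wrong length.
    pub fn push(&mut self, amplitudes: &[f32]) -> bool {
        let Ok(frame) = <[f32; SUBCARRIERS]>::try_from(amplitudes) else {
            return false;
        };
        if self.frames.len() == FRAMES {
            self.frames.pop_front();
        }
        self.frames.push_back(frame);
        true
    }

    /// Flattened [56, 20] tensor (subcarrier-major, oldest frame first),
    /// or `None` until 20 frames have been collected.
    pub fn as_tensor(&self) -> Option<Vec<f32>> {
        if self.frames.len() < FRAMES {
            return None;
        }
        let mut out = Vec::with_capacity(SUBCARRIERS * FRAMES);
        for sc in 0..SUBCARRIERS {
            for frame in &self.frames {
                out.push(frame[sc]);
            }
        }
        Some(out)
    }
}
```

Rejecting wrong-length frames at `push` keeps a malformed snapshot from ever reaching inference, in the spirit of the wrong-shape rejection tests mentioned for the scaffold PR.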
Motivation
Person counting in `v2/` uses fixed-scale feature normalization, which works well in calibrated environments but degrades when signal characteristics drift across rooms, interference levels, or hardware. Two improvements here are deployment-neutral:

1. `RollingP95` adaptive normalizer

`compute_person_score()` previously normalized features with hard-coded denominators (`variance/300`, `motion_band_power/250`, `spectral_power/500`). When live ESP32 values exceed those limits, the normalized inputs clamp to 1.0 and dynamic range collapses. `RollingP95` is a streaming P95 estimator (600-sample / ~30 s sliding window) that self-calibrates to whatever feature distribution the deployment produces. Cold-start (< 60 samples) falls back to the legacy denominators so day-0 behaviour is fully preserved.

2. `dedup_factor` runtime API

Exposes the multi-node cluster deduplication divisor via REST so deployments can tune to their environment without rebuilding. Includes an auto-tune endpoint that derives the optimal `dedup_factor` from a known person count (calibration mode). Config persists across restarts in `data/config.json`.

Explicitly NOT included

This fork also has additional ISTA `lambda` tuning (specifically `lambda=5.0`) for its local 8×8×4 babycube grid. Those values are deployment-specific and intentionally not included in this PR; they would degrade person-count quality on different room geometries. This PR keeps upstream's existing `lambda: 0.1` default.

Changes

`v2/crates/wifi-densepose-sensing-server/src/main.rs`
- `RollingP95` struct + `impl` (ADR-044 §5.2)
- `RuntimeConfig` struct + `load_runtime_config` / `save_runtime_config` (ADR-044 §5.3)
- `AppStateInner`: added `p95_variance`, `p95_motion_band_power`, `p95_spectral_power`, `dedup_factor`, `data_dir` fields
- `compute_person_score()` signature: `&FeatureInfo` → `&AppStateInner + &FeatureInfo` (adaptive denominators)
- `config_get_dedup_factor`, `config_set_dedup_factor`, `config_set_ground_truth`
- `GET/POST /api/v1/config/dedup-factor`, `POST /api/v1/config/ground-truth`
- `rolling_p95_tests` module

`v2/crates/wifi-densepose-sensing-server/src/multistatic_bridge.rs`
- `fuse_or_fallback()` gains `dedup_factor: f64` parameter; fallback switches from `max()` to `ceil(sum / dedup_factor)`

Test results

- `cargo test --workspace --no-default-features`: 1636 passed, 0 failed (includes 5 new `RollingP95` unit tests)
- `python archive/v1/data/proof/verify.py`: VERDICT: FAIL, pre-existing on `origin/main` (numpy/scipy version drift); not caused by this PR

Notes

- `dedup_factor` to match observed cluster count. Useful for calibration during install.
- `RollingP95` is a generic primitive; could be reused for other adaptive thresholds in future.
- `lambda: 0.1` in `tomography.rs` is untouched.