
feat(sensing-server): adaptive person count — RollingP95 + dedup_factor runtime API#491

Merged
ruvnet merged 1 commit into
ruvnet:mainfrom
schwarztim:pr/adaptive-person-count
May 19, 2026

@schwarztim
Contributor

Motivation

Person counting in v2/ uses fixed-scale feature normalization which works well in calibrated environments but degrades when signal characteristics drift across rooms, interference levels, or hardware. Two improvements here are deployment-neutral:

1. RollingP95 adaptive normalizer

compute_person_score() previously normalized features with hard-coded denominators (variance/300, motion_band_power/250, spectral_power/500). When live ESP32 values exceed those limits the normalized inputs clamp to 1.0 and dynamic range collapses. RollingP95 is a streaming P95 estimator (600-sample / ~30 s sliding window) that self-calibrates to whatever feature distribution the deployment produces. Cold-start (< 60 samples) falls back to the legacy denominators so day-0 behaviour is fully preserved.
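A minimal sketch of a sliding-window P95 estimator along the lines described above. This is not the PR's implementation: the field names, the nearest-rank percentile rule, the sort-on-read strategy, and the choice to ignore non-finite samples are all assumptions. The constructor takes a window size and a minimum sample count, mirroring the 600-sample window and 60-sample cold-start threshold quoted in the description.

```rust
use std::collections::VecDeque;

/// Sliding-window 95th-percentile estimator (illustrative sketch only).
pub struct RollingP95 {
    window: VecDeque<f64>,
    capacity: usize,
    min_samples: usize,
}

impl RollingP95 {
    /// `capacity` is the window length; `min_samples` is the cold-start threshold.
    pub fn new(capacity: usize, min_samples: usize) -> Self {
        let capacity = capacity.max(1);
        Self {
            window: VecDeque::with_capacity(capacity),
            capacity,
            min_samples,
        }
    }

    /// Add a sample, evicting the oldest one once the window is full.
    pub fn push(&mut self, value: f64) {
        if !value.is_finite() {
            return; // assumption: NaN/inf samples are dropped
        }
        if self.window.len() == self.capacity {
            self.window.pop_front();
        }
        self.window.push_back(value);
    }

    /// P95 of the current window, or `None` while still in cold start.
    pub fn p95(&self) -> Option<f64> {
        let n = self.window.len();
        if n == 0 || n < self.min_samples {
            return None;
        }
        let mut sorted: Vec<f64> = self.window.iter().copied().collect();
        sorted.sort_by(|a, b| a.total_cmp(b));
        // Nearest-rank percentile: 1-based rank = ceil(0.95 * n), in integer math.
        let rank = (n * 95 + 99) / 100;
        Some(sorted[rank - 1])
    }
}
```

Sorting a copy of the window on every read is O(n log n), which is cheap for a few hundred samples; a production estimator might keep an ordered structure instead.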

2. dedup_factor runtime API

Exposes the multi-node cluster deduplication divisor via REST so deployments can tune to their environment without rebuilding. Includes an auto-tune endpoint that derives the optimal dedup_factor from a known person count (calibration mode). Config persists across restarts in data/config.json.

Explicitly NOT included

This fork also has additional ISTA lambda tuning (specifically lambda=5.0) for its local 8×8×4 babycube grid. Those values are deployment-specific and intentionally not included in this PR — they would degrade person-count quality on different room geometries. This PR keeps upstream's existing lambda: 0.1 default.

Changes

  • v2/crates/wifi-densepose-sensing-server/src/main.rs

    • New RollingP95 struct + impl (ADR-044 §5.2)
    • New RuntimeConfig struct + load_runtime_config / save_runtime_config (ADR-044 §5.3)
    • AppStateInner: added p95_variance, p95_motion_band_power, p95_spectral_power, dedup_factor, data_dir fields
    • compute_person_score() signature: &FeatureInfo → &AppStateInner + &FeatureInfo (adaptive denominators)
    • All 3 call sites updated; P95 push calls added before each scoring call
    • New REST handlers: config_get_dedup_factor, config_set_dedup_factor, config_set_ground_truth
    • Routes registered: GET/POST /api/v1/config/dedup-factor, POST /api/v1/config/ground-truth
    • 5 unit tests in rolling_p95_tests module
  • v2/crates/wifi-densepose-sensing-server/src/multistatic_bridge.rs

    • fuse_or_fallback() gains dedup_factor: f64 parameter; fallback switches from max() to ceil(sum / dedup_factor)
    • Test call site updated
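The change to the fallback path can be illustrated with a small sketch. The function names and the per-node `u32` counts are hypothetical, and the guard against a non-positive divisor is an assumption; only the `max()` versus `ceil(sum / dedup_factor)` behaviour comes from the change list above.

```rust
/// Previous fallback behaviour: take the largest per-node count.
pub fn legacy_fallback_count(per_node_counts: &[u32]) -> u32 {
    per_node_counts.iter().copied().max().unwrap_or(0)
}

/// New fallback behaviour: ceil(sum / dedup_factor).
pub fn fallback_count(per_node_counts: &[u32], dedup_factor: f64) -> u32 {
    let sum: u32 = per_node_counts.iter().sum();
    // Assumption: a non-finite or non-positive factor is treated as 1.0.
    let divisor = if dedup_factor.is_finite() && dedup_factor > 0.0 {
        dedup_factor
    } else {
        1.0
    };
    (f64::from(sum) / divisor).ceil() as u32
}
```

With three nodes reporting 2, 2 and 3 clusters and the default factor of 3.0, the sum-based fallback gives ceil(7 / 3) = 3, which happens to match the old `max()` result; the two diverge once the nodes see largely disjoint sets of people.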

Test results

  • cargo test --workspace --no-default-features: 1636 passed, 0 failed (includes 5 new RollingP95 unit tests)
  • python archive/v1/data/proof/verify.py: VERDICT: FAIL — pre-existing on origin/main (numpy/scipy version drift); not caused by this PR

Notes

  • The auto-tune endpoint accepts a known person count and adjusts dedup_factor to match observed cluster count. Useful for calibration during install.
  • RollingP95 is a generic primitive — could be reused for other adaptive thresholds in future.
  • lambda: 0.1 in tomography.rs is untouched.
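One way the auto-tune step could be derived, as a hedged sketch: if the fallback count is `ceil(sum / dedup_factor)`, then choosing `dedup_factor = sum / known_count` makes the estimate land on the known count. The function name, the `Option` return, and the lower bound of 1.0 are assumptions, not the handler's actual behaviour.

```rust
/// Derive a dedup factor from the summed per-node cluster count and a
/// known (ground-truth) person count. Returns `None` when either is zero,
/// because no meaningful ratio exists in that case.
pub fn auto_tune_dedup_factor(observed_sum: u32, known_persons: u32) -> Option<f64> {
    if observed_sum == 0 || known_persons == 0 {
        return None;
    }
    let ratio = f64::from(observed_sum) / f64::from(known_persons);
    // Assumption: never let the divisor inflate the count (factor >= 1.0).
    Some(ratio.max(1.0))
}
```

For example, six summed clusters with two people known to be in the room yields a factor of 3.0.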

…or runtime API

RollingP95 adaptive normalizer (ADR-044 §5.2):
- Streaming P95 estimator (600-sample / ~30 s window) replaces fixed-scale
  denominators (variance/300, motion/250, spectral/500) that saturated against
  live ESP32 values, collapsing dynamic range to zero.
- Cold-start (<60 samples) falls back to legacy denominators — day-0 behaviour
  is preserved.
- Three new fields on AppStateInner: p95_variance, p95_motion_band_power,
  p95_spectral_power (all RollingP95::new(600, 60)).
- compute_person_score() refactored to accept &AppStateInner; all three call
  sites (wifi, wifi-fallback, simulated) updated.
- 5 unit tests in rolling_p95_tests module.
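The saturation problem and the cold-start fallback described in this list can be shown with two small helpers. Both function names are hypothetical, and the clamp to [0, 1] is inferred from the statement that inputs "clamp to 1.0"; the legacy denominator of 300 for variance is taken from the text.

```rust
/// Choose the adaptive denominator when a usable P95 exists,
/// otherwise fall back to the legacy fixed value (cold start).
pub fn pick_denominator(p95: Option<f64>, legacy: f64) -> f64 {
    match p95 {
        Some(p) if p.is_finite() && p > 0.0 => p,
        _ => legacy,
    }
}

/// Scale a raw feature into [0, 1] using the chosen denominator.
pub fn normalize(value: f64, denominator: f64) -> f64 {
    (value / denominator).clamp(0.0, 1.0)
}
```

With the legacy denominator, variances of 450 and 900 both normalize to 1.0 and become indistinguishable. If the rolling estimator reports a P95 of 1000, the same inputs map to 0.45 and 0.9, so the score can again tell them apart.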

dedup_factor runtime API (ADR-044 §5.3):
- New field dedup_factor: f64 (default 3.0) on AppStateInner.
- fuse_or_fallback() gains dedup_factor param; fallback switches from max() to
  sum/dedup_factor (ceiling), matching the fork's sum-based aggregation.
- RuntimeConfig struct + load/save_runtime_config() for data/config.json
  persistence across restarts.
- Three new REST endpoints:
    GET  /api/v1/config/dedup-factor
    POST /api/v1/config/dedup-factor
    POST /api/v1/config/ground-truth (auto-tune from known person count)

Explicitly NOT included:
- lambda=5.0 (upstream keeps its 0.1 default — deployment-specific tuning)
- CC intensity threshold 0.3 and min-cluster-size 4 hardcodes
- max_cc_size filter removal
@ruvnet
Owner

ruvnet commented May 17, 2026

Hi @schwarztim — current main has moved (mergeable=DIRTY/UNKNOWN). Key landings since this PR opened that you'll want to pull in:

The PRs were all evaluated as thoughtful and well-described. Once you rebase against main, I'll do a focused review and merge if the test plan still passes locally. If you'd like me to rebase one of these on your behalf, say which (I can push back to your fork since maintainerCanModify is enabled).

@ruvnet ruvnet merged commit 79cc2d7 into ruvnet:main May 19, 2026
7 of 17 checks passed
ruvnet added a commit that referenced this pull request May 19, 2026
…count

Merge #491: feat(sensing-server): adaptive person count — RollingP95 + dedup_factor (integration on schwarztim's behalf)
ruvnet added a commit that referenced this pull request May 19, 2026
#654)

The previous table mixed status badges (✅ / ⚠️ / 🔬) and verbose
"pending wiring / not yet released" caveat columns. Rewrites it as
"What / How / Speed-or-scale" — three columns, present tense, no
status column. Captures what actually shipped this week:

* Presence detection now points at the trained head shipped on HF
  (100% validation accuracy), with the phase-variance fallback
  reframed as a no-model option rather than a "loader pending" caveat.
* 17-keypoint pose is its own row now — cog-pose-estimation v0.0.1
  binaries on GCS, 8.4 ms cold-start on Pi 5, train-your-own in 2.1 s
  on RTX 5080. References ADR-101 + the benchmark log.
* Multi-person counting drops the "Heuristic, not learned" framing.
  The adaptive P95 normalisation from PR #491 is in tree, the
  runtime dedup-factor knob is documented, and the six learned
  drop-in counters from the Cog catalog are linked: occupancy-zones,
  elevator-count, queue-length, customer-flow, clean-room,
  person-matching.
* Edge intelligence row now points at the 105-cog catalog (ADR-102)
  instead of just the Cognitum Seed hardware.
* Camera-supervised fine-tune row reflects the actual measured
  training time (2.1 s on RTX 5080 for 400 epochs) instead of the
  laptop estimate.
* Drops the status-legend footer (no more ✅/⚠️/🔬 column to legend).
  Replaces it with a pointer down to the Edge Module Catalog.

The ESP32 + Cognitum Seed deployment-options row gets the same
treatment: cleaner list of what's included, no "Pose pending weights"
parenthetical (the cog ships today).

Net effect: same information, present tense, positive voice. Nothing
removed beyond status badges + pending-work parentheticals; all
genuine engineering details (e.g. "needs ~30 s ambient calibration"
for the fallback) are preserved inline.
ruvnet added a commit that referenced this pull request May 21, 2026
Motivated by #499 (multi-node double-skeletons) which PR #491 stopped
the bleeding on but didn't take to the WiFi-CSI literature's state of
the art. Designs a learned counter that replaces today's slot
heuristic + dedup_factor knob, reusing the primitives we've already
shipped this week:

  * Candle / RTX 5080 training pipeline (proven yesterday, 2.1 s for
    400 epochs on pose_v1.safetensors)
  * HF presence encoder as initialization (architectures compatible,
    unlike the pose head case)
  * ruvector-mincut (Stoer-Wagner) for multi-node fusion upper-bound
  * Cog packaging spec (ADR-100) + edge module registry (ADR-102)
  * Paired-data pipeline (PR #641 streaming-safe align-ground-truth.js)
    — `n_persons` labels come for free; no new data collection
    campaign required to bootstrap.

Architecture:
  per-node CSI [56×20] -> frozen HF encoder -> 128-dim embedding
                                          \
                                           > count head (softmax {0..7})
                                           > confidence head (sigmoid)
  N nodes' distributions -> confidence-weighted log-sum
                         -> Stoer-Wagner min-cut upper-bound clip
                         -> { count, confidence,
                              count_p95_low, count_p95_high,
                              per_node_breakdown }
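The diagram does not say how `count_p95_low` and `count_p95_high` are obtained from the count distribution. One plausible reading, sketched here purely as an assumption, is a central 95% interval read off the cumulative distribution over the classes {0..7}. The function name and the 2.5% / 97.5% cut points are hypothetical.

```rust
/// Central 95% interval over a discrete count distribution.
/// `probs[k]` is the probability of exactly `k` persons.
/// Returns `(low, high)` as class indices, or `None` for an empty slice.
pub fn p95_interval(probs: &[f64]) -> Option<(usize, usize)> {
    if probs.is_empty() {
        return None;
    }
    let last = probs.len() - 1;
    let mut low = None;
    let mut high = None;
    let mut cdf = 0.0;
    for (k, p) in probs.iter().enumerate() {
        cdf += p;
        if low.is_none() && cdf >= 0.025 {
            low = Some(k);
        }
        if high.is_none() && cdf >= 0.975 {
            high = Some(k);
        }
    }
    // If rounding keeps the CDF just under a cut point, use the last class.
    Some((low.unwrap_or(last), high.unwrap_or(last)))
}
```

Under this reading, a distribution with all mass on one person produces the interval (1, 1), and a uniform distribution over eight classes produces (0, 7).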

Compares the proposal explicitly against WiCount / DeepCount /
CrossCount / HeadCount published numbers and is honest about the
hardware gap (their 3x3 MIMO research NICs vs our 1x1 SISO ESP32-S3).

v0.1.0 acceptance gates target >=80% within-+/-1 same-room and
>=60% cross-room — modest on purpose; bounded by the same paired-
data scarcity #645 documents for pose. The framework is the
deliverable; the accuracy follows the data.
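The "within ±1" gate can be made concrete with a short metric sketch. The function name and slice-based signature are assumptions; the metric itself is simply the fraction of predictions whose absolute error is at most one person.

```rust
/// Fraction of predictions within one person of the ground truth.
/// Returns `None` if the slices are empty or differ in length.
pub fn within_one_accuracy(predicted: &[u32], actual: &[u32]) -> Option<f64> {
    if predicted.is_empty() || predicted.len() != actual.len() {
        return None;
    }
    let hits = predicted
        .iter()
        .zip(actual)
        .filter(|(p, a)| p.abs_diff(**a) <= 1)
        .count();
    Some(hits as f64 / predicted.len() as f64)
}
```

A run that scores 0.75 on same-room data would miss the v0.1.0 gate of 0.80 but clear the 0.60 cross-room gate.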

Includes:
  * Architecture diagram in ascii
  * Comparison table vs published WiFi-CSI counting SOTA
  * Per-failure-mode mapping from #499 symptoms to how the
    learned counter addresses each
  * v0.1.0 + v0.2.0 acceptance gates with measurable thresholds
  * Repo layout for the new `v2/crates/cog-person-count/` crate
  * Five-step migration plan from this ADR -> first GCS release

Status: Proposed. Implementation follows in the same incremental
pattern ADR-101 used: scaffold-cog PR -> train+publish PR ->
server-wiring PR.
ruvnet added a commit that referenced this pull request May 21, 2026
… (ADR-103) (#694)

First implementation PR for ADR-103. Same incremental shape that
ADR-101 used: scaffold the cog crate, ship a stub-backend release
that satisfies the runtime contract + 15 tests + measured cold-start,
then follow up with the trained count_v1.safetensors in a separate PR.

What ships:

* v2/crates/cog-person-count/ — new workspace member.
    - Cargo.toml: candle-core/candle-nn 0.9 (cpu default, cuda feature
      opt-in), safetensors, ureq, sha2 — same dep shape as the pose cog
      but minus wifi-densepose-train (this cog has no training-side
      consumer, so the dep tree is materially smaller → 2.36 MB
      binary vs the pose cog's 4.5 MB).
    - src/inference.rs: CountNet (Conv1d 56→64→128→128 encoder + count
      head Linear(128→64→8)+softmax + confidence head
      Linear(128→32→1)+sigmoid). Stub backend returns
      `{1-person, 0-confidence}` honestly when no safetensors present.
    - src/fusion.rs: fuse_confidence_weighted() — Bayesian product of
      per-node distributions with confidence-weighted log-sum, plus
      fuse_with_mincut_clip() hook for the v0.2.0 Stoer-Wagner
      upper-bound (`ruvector-mincut` dep lands when min-cut graph
      builder is ready). Confidences floored at 1e-3 and probs floored
      at 1e-9 before logs — no NaN propagation.
    - src/publisher.rs: emits {count, confidence, count_p95_low,
      count_p95_high, n_nodes, probs} per ADR-103 §"Output".
    - src/main.rs: full ADR-100 four-verb CLI (version|manifest|health
      |run). The `run` subcommand explicitly returns "wiring pending
      v0.0.1" so the in-process library API is the v0.0.1-clean
      integration path.
    - tests/smoke.rs (8 tests) + fusion::tests (7 tests, in-lib) — 15
      total, all green. Cover stub-backend behaviour, wrong-shape
      rejection, fusion math (empty / single / agreement / high-conf
      override / normalisation), p95-range correctness, and min-cut
      clip semantics.
    - cog/{manifest.template.json, config.schema.json, README.md} +
      cog/artifacts/ placeholder dir.

* v2/Cargo.toml: registers the new workspace member.
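The fusion step described for src/fusion.rs can be sketched as follows. This is an illustration under assumptions, not the crate's code: the signature, the tuple input, and the max-subtraction normalisation are hypothetical. The floors of 1e-3 for confidences and 1e-9 for probabilities are the values stated above.

```rust
pub const CONF_FLOOR: f64 = 1e-3;
pub const PROB_FLOOR: f64 = 1e-9;

/// Fuse per-node count distributions with a confidence-weighted log-sum.
/// Each entry is `(probs, confidence)`; all `probs` must share one length.
pub fn fuse_confidence_weighted(nodes: &[(Vec<f64>, f64)]) -> Option<Vec<f64>> {
    let classes = nodes.first()?.0.len();
    if classes == 0 || nodes.iter().any(|(p, _)| p.len() != classes) {
        return None;
    }
    let mut log_sum = vec![0.0_f64; classes];
    for (probs, conf) in nodes {
        // Flooring keeps weights positive and avoids ln(0); f64::max also
        // replaces a NaN input with the floor value.
        let weight = conf.max(CONF_FLOOR);
        for (acc, p) in log_sum.iter_mut().zip(probs) {
            *acc += weight * p.max(PROB_FLOOR).ln();
        }
    }
    // Exponentiate relative to the maximum for numerical stability,
    // then normalise so the fused distribution sums to 1.
    let max = log_sum.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    let exp: Vec<f64> = log_sum.iter().map(|l| (l - max).exp()).collect();
    let total: f64 = exp.iter().sum();
    Some(exp.into_iter().map(|e| e / total).collect())
}
```

Because each node's log-probabilities are scaled by its confidence, a confident node dominates a hesitant one when they disagree, which corresponds to the "high-conf override" case listed in the test coverage.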

Verified locally:

  cargo check -p cog-person-count --no-default-features    → clean
  cargo test  -p cog-person-count --no-default-features    → 8/8 pass
  cargo test  -p cog-person-count --lib                    → 7/7 pass
  cargo build -p cog-person-count --release                → 2.36 MB binary
  ./cog-person-count version                               → "person-count 0.3.0"
  ./cog-person-count manifest                              → JSON skeleton
  ./cog-person-count health                                → backend:stub,
                                                              count:1, conf:0,
                                                              p95:[1,1]
  Cold-start: 30 sequential `health` invocations → 53.3 ms/invocation
              (vs cog-pose-estimation's 76.2 ms — smaller dep tree)

cog/README.md adds:

* Security section — six-row threat table covering safetensor mmap
  trust, non-finite outputs, sensing fetch failures, fusion
  divide-by-zero / log-of-zero, min-cut degenerate cases, and stdout
  spoofing.
* Performance / optimization section — binary size, release profile
  (already opt-level=3 / lto=fat / codegen-units=1 / strip=true at
  workspace level), cold-start comparison table, projected warm-path
  latency budget.

Still pending (separate PRs, ADR-103 §"Migration"):

* Train count_v1.safetensors on the existing 1,077 paired samples
  with `n_persons` labels (Candle on RTX 5080, same script that
  produced pose_v1.safetensors yesterday).
* `run` subcommand wiring (long-running polling loop, same shape as
  cog-pose-estimation::runtime).
* Cross-compile + sign + GCS upload (mirror of cog-pose-estimation
  release pipeline).
* Server-side `csi.rs::score_to_person_count` call-site rewire to
  consume this cog when installed; falls back to PR #491's heuristic
  when not.
ruvnet added a commit that referenced this pull request May 21, 2026
…al (#697)

Phase 4 of ADR-103. Adds the long-running polling loop so the cog's
fourth verb (`run`) does real work, completing the ADR-100 runtime
contract end-to-end:

  cog-person-count version    → "person-count 0.3.0"
  cog-person-count manifest   → JSON skeleton
  cog-person-count health     → loads weights + 1-shot infer + emit
  cog-person-count run --config  → long-running per-frame emit  ← THIS

What ships:

* src/runtime.rs (new) — `run_loop` polls sensing_url every poll_ms,
  slides a [56, 20] CSI window, runs InferenceEngine::infer, emits
  publisher::person_count events. Same shape as
  cog-pose-estimation::runtime — fetch_frame extracts amplitudes
  from `snapshot.nodes[0].amplitude[]`, fails open on connect errors
  with a WARN log rather than crashing.
* src/lib.rs — registers the runtime module.
* src/main.rs — cmd_run now loads RunConfig from a JSON file, builds
  the InferenceEngine (with weights if cfg.model_path is set,
  otherwise auto-discover), emits a run.started event, and hands off
  to the Tokio multi-thread runtime's block_on(run_loop). Single-node
  fusion is a no-op for N=1 today; v0.2.0 will append predictions
  from sibling nodes and call fusion::fuse_confidence_weighted before
  emit.
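The sliding [56, 20] CSI window that `run_loop` maintains might look roughly like the sketch below. The type name, the `f32` element type, and the interpretation of the shape as 56 subcarriers by 20 time steps (flattened subcarrier-major) are assumptions; wrong-length frames are rejected rather than padded.

```rust
use std::collections::VecDeque;

pub const SUBCARRIERS: usize = 56;
pub const FRAMES: usize = 20;

/// Rolling buffer holding the most recent `FRAMES` amplitude frames.
pub struct CsiWindow {
    frames: VecDeque<[f32; SUBCARRIERS]>,
}

impl CsiWindow {
    pub fn new() -> Self {
        Self { frames: VecDeque::with_capacity(FRAMES) }
    }

    /// Append one frame; returns `false` if it has the wrong length.
    pub fn push(&mut self, amplitudes: &[f32]) -> bool {
        let Ok(frame) = <[f32; SUBCARRIERS]>::try_from(amplitudes) else {
            return false;
        };
        if self.frames.len() == FRAMES {
            self.frames.pop_front(); // slide: drop the oldest frame
        }
        self.frames.push_back(frame);
        true
    }

    /// Flattened [56, 20] tensor (oldest frame first along the time axis),
    /// or `None` until the window has filled.
    pub fn tensor(&self) -> Option<Vec<f32>> {
        if self.frames.len() < FRAMES {
            return None;
        }
        let mut out = Vec::with_capacity(SUBCARRIERS * FRAMES);
        for s in 0..SUBCARRIERS {
            for frame in &self.frames {
                out.push(frame[s]);
            }
        }
        Some(out)
    }
}
```

Returning `None` until 20 frames have arrived lets the caller skip inference during warm-up instead of feeding the model a partially filled window.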

Verified locally:

  cargo check  -p cog-person-count --no-default-features   → clean
  cargo test   -p cog-person-count                          → 15/15 pass (no regressions)
  cargo build  -p cog-person-count --release                → 2.36 MB unchanged
  ./cog-person-count run --config bad-config.json:
    line 1: {"event":"run.started","fields":{"cog":"person-count",
             "sensing_url":"http://127.0.0.1:9999/...","poll_ms":100,
             "model_path":"(auto-discover)"}}
    line 2: WARN sensing-server fetch failed
            error=Connection Failed: Connect error: actively refused
    (loop alive — exits cleanly on SIGTERM, no crash, no NaN)

Also adds a "Relationship to the in-process score_to_person_count
heuristic" section to cog/README.md explaining the dual-emitter
design (sensing-server keeps emitting the PR #491 slot heuristic;
the cog runs out-of-process and emits person.count events from the
learned model). Operators choose by installing the cog or not — no
sensing-server rebuild required.

ADR-103 §"Migration" status:
  1. Land ADR + scaffold ........... done (#693, #694)
  2. Train count_v1 ................ done (#695)
  3. Cross-compile + sign + GCS .... done (#696)
  4. Server-side wiring ............ done — out-of-process design
                                      means no rewire needed; this
                                      cog is the wiring.
  5. v0.2.0 multi-room + LoRA ...... data-bound (#645)