bernstein writes a tamper-evident, append-only audit log for every orchestrator action. each entry is HMAC-SHA256-signed per RFC 2104 (HMAC) using FIPS 180-4 SHA-256, and chained to the previous entry's hmac, so any after-the-fact edit invalidates every following hmac. this page tells an SRE or security operator how to run that surface in production: where the key lives, how to rotate it, how to verify a snapshot, how to ship to a SIEM, and what to do when a chain breaks.
short version: keep the key off the audit volume, run
bernstein audit verify from cron, fail the run if the exit code is
non-zero.
logs live under .sdd/audit/ as one JSONL file per UTC day, e.g.
.sdd/audit/2026-05-07.jsonl. each line is one event:
{
"timestamp": "2026-05-07T14:30:00.000000Z",
"event_type": "task.transition",
"actor": "orchestrator",
"resource_type": "task",
"resource_id": "TASK-001",
"details": {"from_status": "open", "to_status": "claimed"},
"prev_hmac": "0000…0000",
"hmac": "d4e5f6…"
}the hmac field is computed over the canonical JSON payload (every
field above except hmac itself) concatenated with the previous
entry's hmac. pseudocode:
payload = prev_hmac + json.dumps(entry_without_hmac, sort_keys=True)
entry.hmac = HMAC_SHA256(audit_key, payload).hexdigest()
the very first event in a chain uses a genesis prev_hmac of 64
zeros. daily rotation never resets the chain - the next file's first
entry uses the last hmac from yesterday.
source: src/bernstein/core/security/audit.py.
the tool-call approval gate runs a deterministic command/tool
classifier (src/bernstein/core/security/auto_approve.py) before a tool
call reaches the operator queue. each decision the gate acts on is
recorded as an auto_approve_decision event so an auditor can prove,
after the fact, which calls were rejected or auto-approved and which
pattern drove the verdict:
{
"event_type": "auto_approve_decision",
"actor": "backend",
"resource_type": "tool_call",
"resource_id": "Bash",
"details": {
"decision": "deny",
"tool": "Bash",
"reason": "Dangerous command detected: 'rm -rf /tmp/x'",
"matched_pattern": "\\brm\\s+-rf\\b",
"command": "rm -rf /tmp/x"
}
}precedence is deny-wins and the posture is fail-closed:
- a deny-listed command (e.g.
rm -rf,git push --force,DROP TABLE,curl ... | bash) is rejected by the production path regardless of interactive mode, and the rejection is written to the chain with its matched pattern. - a safe command is only auto-approved when the operator opts in via
approvals.smart_auto_approve: trueinbernstein.yaml. with the default (false) a safe verdict still goes to the queue. - an ambiguous (
ask) verdict, or any classifier error, never auto-approves - it falls through to human review.
source: src/bernstein/core/approval/gate.py.
the HMAC key is loaded from, in order:
- the
BERNSTEIN_AUDIT_KEY_PATHenvironment variable, if set. $XDG_STATE_HOME/bernstein/audit.keyifXDG_STATE_HOMEis set.~/.local/state/bernstein/audit.key(XDG default).
a legacy fallback at <sdd>/../config/audit-key is still read for
verification only - bernstein logs a warning when it is used. migrate
off it.
design intent: the key sits outside .sdd/audit/ so an attacker
who can write .sdd/audit/*.jsonl cannot also read or rotate the
signing key. don't undo this - never colocate the key with the log
volume, and don't bake the key into a docker image layer.
bernstein refuses to load a key file with mode looser than 0600. on
posix systems a group- or world-readable key raises
AuditKeyPermissionError and the orchestrator refuses to start. on
windows the bit-check is skipped (NTFS uses ACLs).
chmod 0600 ~/.local/state/bernstein/audit.key
ls -l ~/.local/state/bernstein/audit.key
# -rw------- 1 ops ops 64 May 7 14:30 audit.keyif no key file exists, bernstein generates one (32 bytes of
secrets.token_hex) at the resolved path on first startup, with the
parent directory created 0700 and the file 0600. that auto-bootstrap
is fine for a developer laptop. for a production deploy, materialise
the key from your secrets manager and write it yourself before the
orchestrator first starts - that way the key value is owned by your
KMS / vault, not by whichever host happened to boot first.
# pull from vault, write to the canonical path, lock it down
mkdir -p ~/.local/state/bernstein
vault kv get -field=audit_key secret/bernstein/audit \
> ~/.local/state/bernstein/audit.key
chmod 0600 ~/.local/state/bernstein/audit.keybernstein does not call out to a KMS for every entry - that would put a network hop on the audit write path. the supported pattern is:
- KMS / vault holds the canonical key.
- a sidecar (vault-agent, sops, your config-mgmt tool) projects it
into the file at
BERNSTEIN_AUDIT_KEY_PATHwith0600. - the orchestrator reads the file once at startup; rotate the projected file and restart to roll keys.
if you need full HSM-backed signing, that's an enhancement, not a shipped feature - open an issue.
the chain is keyed by the secret used to generate the hmacs that already exist on disk. swapping the key invalidates every hmac written before the swap. operationally, rotation is therefore a "close current chain, open a new one" event, not an in-place update.
procedure:
-
drain in-flight tasks or stop the orchestrator:
bernstein stop. -
snapshot the current chain for forensic retention:
bernstein audit verify --hmac-only # confirm clean before rollover bernstein audit seal # store the merkle root tar czf audit-pre-rotate-$(date -u +%F).tgz .sdd/audit/
-
archive
.sdd/audit/somewhere read-only and clear the live directory (or move the orchestrator to a fresh.sdd/). the sealed merkle root + the archived JSONL still verify with the old key - keep it alongside the archive. -
write the new key to
BERNSTEIN_AUDIT_KEY_PATH(0600). -
start the orchestrator. the next entry begins a new chain with a genesis
prev_hmac.
the practical rotation cadence is annual or on personnel/key compromise events, not weekly - the cost is a chain split, not zero. schedule it alongside SOC 2 / ISO renewals.
if you want a "soft rotation" with no chain break, the only honest answer is "we don't ship that today". the chain is intentionally keyed by a single secret to keep verification simple for auditors.
the public verb is bernstein audit verify:
bernstein audit verify # hmac chain + merkle seal
bernstein audit verify --hmac-only # hmac chain only
bernstein audit verify --merkle-onlyexit codes:
| code | meaning |
|---|---|
| 0 | chain intact (and merkle seal matches, if checked) |
| 1 | one or more verification errors |
a healthy run prints a green panel:
╭───────────────────────────────────╮
│ HMAC Chain Verification Passed │
╰───────────────────────────────────╯
a tamper hit prints a red panel and one line per mismatched entry - filename, line number, expected vs stored prefix:
╭───────────────────────────────────╮
│ HMAC Chain Verification FAILED │
╰───────────────────────────────────╯
! 2026-05-07.jsonl:42: HMAC mismatch (expected a1b2c3d4… got deadbeef…)
! 2026-05-07.jsonl:43: prev_hmac mismatch (expected a1b2c3d4… got deadbeef…)
run from cron (every 15 min is fine - the verify is read-only and a day's worth of events checks in milliseconds):
*/15 * * * * cd /srv/bernstein && bernstein audit verify --hmac-only \
>> /var/log/bernstein/audit-verify.log 2>&1 || \
/usr/local/bin/page-on-call.sh "bernstein audit verify failed"on every orchestrator start, bernstein re-verifies the last 100
entries automatically (verify_on_startup in
src/bernstein/core/security/audit_integrity.py). a key with
loose permissions raises AuditKeyPermissionError and the
orchestrator refuses to start - that's by design. clean up the
permissions, don't bypass the check.
if you only have python and the audit volume (e.g. an offline auditor laptop), call the module directly:
from pathlib import Path
from bernstein.core.security.audit_integrity import verify_audit_integrity
result = verify_audit_integrity(Path(".sdd/audit"), count=10_000)
print(result.valid, result.entries_checked, result.errors)bernstein audit seal writes a Merkle root over the daily *.jsonl
files into .sdd/audit/merkle/seal-<ISO>.json. it is a file-level
integrity layer on top of the per-line HMAC chain: it detects a
deleted, inserted, reordered, or content-tampered daily file.
what the seal binds:
- leaf per file - each file's leaf is the hash of its whole
canonical byte content, so flipping any byte of any line (not just
the last) changes the leaf and trips
TAMPEREDon verify. - domain-separated tree - leaves are hashed
H(0x00 || bytes)and internal nodesH(0x01 || left || right)(RFC-6962 style). a lone node at an odd level is promoted unchanged rather than paired with itself, so a leaf set with its last entry duplicated cannot collide with the un-duplicated set. - chain precheck -
audit sealverifies the HMAC chain before writing the root. a broken chain raises and no seal is written, so re-sealing can never launder a pre-existing tamper into a fresh root. seal a known-broken chain on purpose only via the recovery procedure below.
scheme versioning: seals carry a "scheme" field. verification
dispatches on it, so seals written before this hardening (scheme 1,
or absent) still verify under their original rule. roots produced
before and after the upgrade are not comparable - re-seal once
after upgrading, and keep any pinned pre-upgrade root labelled as the
old scheme. this is a scheme upgrade, not a tamper alert.
the query verb walks all JSONL files under .sdd/audit/ and
streams matching events back as a table - handy for "show me what
this actor did between these two timestamps":
bernstein audit query --actor orchestrator --since 2026-05-07
bernstein audit query --event-type task.transition --limit 200
bernstein audit show --limit 50 # tail of the live lognote that query does not re-verify hmacs - pair it with
bernstein audit verify if the question is "is this trustworthy",
not "what happened".
bernstein audit slice writes a deterministic JSONL subset of the
audit log between two HMAC anchors. The output is byte-stable -
each line is sort_keys-serialised - so a downstream replayer can
hash the slice directly. The HMAC chain inside the slice is
re-verified before the file is written; a structural mismatch
aborts the export.
# Slice a closed range
bernstein audit slice --from <hash> --to <hash> -o /tmp/slice.jsonl
# Head-anchored: from genesis through a known checkpoint
bernstein audit slice --to <hash> -o /tmp/head.jsonl
# Tail-anchored: from a known checkpoint through the latest entry
bernstein audit slice --from <hash> -o /tmp/tail.jsonlThe slice CLI pairs with the multi-tenant export in Multi-tenant audit slices when an auditor needs a tenant-scoped, signed bundle rather than the raw JSONL.
daily rotation happens on every log() call (the file path is
derived from today's UTC date). there is no in-process compaction;
old files are gzipped into .sdd/audit/archive/ by
AuditLog.archive(RetentionPolicy(retention_days=…)), default 90
days. invoke it from a maintenance job or a periodic timer.
a minimal logrotate snippet, if you'd rather not run the python
archiver:
/srv/bernstein/.sdd/audit/*.jsonl {
daily
rotate 90
compress
nocreate
missingok
notifempty
# do NOT use copytruncate - the orchestrator appends, truncating
# mid-write would corrupt the in-memory chain pointer
}
avoid copytruncate. the orchestrator holds an open append handle
and a python-side _prev_hmac; truncating under it produces a chain
that disagrees with itself across a process restart.
archive format guarantees: gzipped JSONL, the chain still verifies end-to-end across uncompressed + archived files as long as you feed both into the verifier.
bernstein ships in-process exporters for splunk HEC, elasticsearch,
cloudwatch logs, syslog (RFC 5424), generic webhook, and a local file
sink. they live in
src/bernstein/core/security/audit_export.py and consume the same
AuditEntry records the chain emits. configure via bernstein.yaml:
audit_export:
target: splunk # splunk | elasticsearch | cloudwatch | syslog | webhook | file
batch_size: 100
flush_interval_s: 30
splunk:
endpoint: https://splunk.example.com:8088
token: ${SPLUNK_HEC_TOKEN}
index: bernstein-audit
sourcetype: bernstein:auditwebhook payload, one batch:
[
{
"timestamp": 1715091000.123,
"event_type": "task.transition",
"actor": "orchestrator",
"resource": "task/TASK-001",
"action": "claim",
"outcome": "success",
"details": {"from_status": "open", "to_status": "claimed"},
"hmac": "d4e5f6…",
"source": "bernstein-audit"
}
]retry behaviour: failed batches retry with exponential backoff
(max_retries, retry_backoff_s) before being counted in
total_failed. the chain on disk is the source of truth - SIEM is
a mirror, not the master copy. if the SIEM drops a batch, replay
from .sdd/audit/ with bernstein audit query.
splunk:
index=bernstein-audit | stats count by event_type, actor
index=bernstein-audit event_type=task.transition outcome=failure
datadog (via webhook → log intake):
service:bernstein-audit @event_type:task.transition @outcome:failure
a real tamper looks like one or both of:
prev_hmac mismatch- someone deleted or reordered an entry.HMAC mismatch- someone edited an entry's payload after the fact.
rare benign causes:
- partial write at process kill (last line truncated). expected to be unparseable JSON, not an hmac mismatch - both verifier paths log this distinctly.
- key rotation done without archive-then-clear (see above).
- legacy log written under the old
.hmac_keyfilename, now read with the XDG-default key.
-
freeze the volume. don't
rm, don't restart the orchestrator yet. snapshot.sdd/audit/somewhere read-only. -
capture full verification output:
bernstein audit verify > /tmp/audit-verify.txt 2>&1; echo $?
-
diff the suspect line range against your SIEM mirror - if the webhook / splunk export was healthy, the SIEM has the original payload and you can identify exactly which fields were edited.
-
seal the broken chain so the corruption is itself tamper-evident forensically. a plain
audit sealrefuses a broken chain (so re-sealing can't hide a tamper) - pass--allow-broken-chainto capture the corrupted state on purpose:bernstein audit seal --allow-broken-chain
-
file an incident, preserve the snapshot, and start a fresh chain per the rotation procedure above. do not rewrite hmacs to "fix" the chain - that destroys the evidence.
-
if the orchestrator can't start because of
AuditKeyPermissionError, tighten the key permissions and try again rather than pointing at a different key - the existing log was signed with the original.
- SOC 2 audit mode quick start
- Multi-tenant audit-chain export
- DSSE / in-toto envelope - third-party-verifiable wrapper over the Article 12 bundle.
- EU AI Act Article 12 evidence pack
- Security hardening guide
- Disaster recovery runbook
- Enterprise evaluation checklist
- source:
src/bernstein/core/security/audit.py,audit_integrity.py,audit_export.py.