The Event Store & Supply Audit
The immutable ledger that makes the whole platform auditable. No silent state changes. No "trust us" on the supply. Every balance is a replay of history, and a worker checks that history against itself every 60 seconds.
Why It Matters
Every sovereign infrastructure eventually gets this question: "how do we know the operator isn't quietly printing tokens?" The honest answer has to be cryptographic, not rhetorical. The Event Store is how the protocol provides that answer. Every economic event — transfers, mints, stakes, disputes, slashings — lives in an append-only ledger. Balances are derived from replay, not stored. And a continuous self-audit compares issuance totals to circulation totals. Delta must equal zero.
This chapter covers how that works end to end: the CQRS architecture that makes it cheap, the policy layer that keeps the ledger honest, and the audit algorithm that proves supply is conserved.
CQRS at the Heart
The platform uses Command Query Responsibility Segregation (CQRS): writes go to an append-only event log, reads come from derived projections that get rebuilt from that log. No row in any balance table is ever updated in place — rows are rebuilt by replaying events.
Three benefits you get from this shape:
- Audit trail for free. You don't add auditability — it is the storage model.
- Time-travel queries. "What was agent X's balance on 2026-03-15 18:00:00Z?" is a natural question because you can replay the log up to that timestamp.
- Rebuild from scratch. A corrupted projection is fixable — drop it and replay the log. The log is the only thing that matters.
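The time-travel property can be sketched in a few lines of Python. This is a minimal illustration, not the platform's projection engine: the Event class, the payload keys (to, sender, receiver, amount), and the agent names are hypothetical, and fees and staking are left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List


@dataclass(frozen=True)
class Event:
    timestamp: datetime
    event_type: str
    data: dict


def balance_at(log: List[Event], agent: str, as_of: datetime) -> float:
    """Fold the log up to `as_of` into one agent's balance (no stored row)."""
    balance = 0.0
    for event in sorted(log, key=lambda e: e.timestamp):
        if event.timestamp > as_of:
            break
        if event.event_type == "TokensIssued" and event.data["to"] == agent:
            balance += event.data["amount"]
        elif event.event_type == "TokensTransferred":
            if event.data["sender"] == agent:
                balance -= event.data["amount"]
            if event.data["receiver"] == agent:
                balance += event.data["amount"]
    return balance


def ts(day: int, hour: int) -> datetime:
    return datetime(2026, 3, day, hour, tzinfo=timezone.utc)


LOG = [
    Event(ts(14, 9), "TokensIssued", {"to": "agent-x", "amount": 100.0}),
    Event(ts(15, 12), "TokensTransferred",
          {"sender": "agent-x", "receiver": "agent-y", "amount": 30.0}),
    Event(ts(15, 20), "TokensTransferred",
          {"sender": "agent-y", "receiver": "agent-x", "amount": 5.0}),
]

print(balance_at(LOG, "agent-x", ts(15, 18)))  # 70.0
```

Replaying the same log with a later cut-off simply folds in more events; no stored balance row is involved.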
Event Types (Partial List)
Not every event affects balances. The projection engine cares about the balance-affecting ones. A non-exhaustive taxonomy:
| Category | Event | Balance effect | Emitter |
|---|---|---|---|
| Mint | TokensIssued | +liquid (treasury) | Registry (mint proxy; TEG performs the DB mint and is gate-blocked from emitting) |
| Mint | DisputeCompensationMinted | +liquid (winner) | Registry |
| Move | TokensTransferred | −sender, +receiver | Registry |
| Move | TransactionFeeCollected | +fee_collector | Registry |
| Stake | TokensStaked | −liquid, +staked | Registry |
| Stake | TokensUnstaked | +liquid, −staked | Registry |
| Stake | RewardsDistributed / RewardsAutoCompounded | +liquid (paid) or +staked (auto-compounded) | Registry |
| Audit | BalanceCorrected | ±liquid (operator action) | Registry |
| Audit | ProjectionSnapshot | baseline reset | Registry |
| Cross-reg | CrossRegistryTransferCompleted | 0 (audit-only) | Registry |
| Lifecycle | AgentOnboarded, AgentStatusChanged | 0 | Registry |
Events are serialized JSON. The log stores id (UUID), event_type, aggregate_id (usually the agent DID or system account), aggregate_type, data (the event payload), metadata, timestamp, version, source_service (which service emitted — e.g. registry-a, teg-layer-a), and idempotency_key. Emitting services authenticate to the Event Store via mTLS (SPIRE SVID at the connection layer), not by signing each event individually.
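As a rough picture of one log row, the field list above maps onto a small Python dataclass. Only the field names come from the text; the default values, the DID format, the aggregate_type value, and the idempotency key format are assumptions made for illustration.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class StoredEvent:
    """One row of the log, mirroring the fields listed in the text."""
    event_type: str
    aggregate_id: str       # usually the agent DID or a system account
    aggregate_type: str
    data: dict              # the event payload
    source_service: str     # e.g. "registry-a"
    idempotency_key: str
    version: int = 1
    metadata: dict = field(default_factory=dict)
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)


event = StoredEvent(
    event_type="TokensTransferred",
    aggregate_id="did:example:agent-x",   # hypothetical DID
    aggregate_type="agent",               # hypothetical value
    data={"sender": "did:example:agent-x",
          "receiver": "did:example:agent-y", "amount": 100},
    source_service="registry-a",
    idempotency_key="transfer-0001",      # hypothetical key format
)
print(event.to_json())
```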
Emission Policy (f037)
The critical piece — who is allowed to emit which events? If both the registry and the TEG emit TokensTransferred, your projection double-counts. The policy layer prevents that.
Each event type has a row in event_emission_policies:
| event_type | emit_from_teg | emit_from_registry | skip_in_projection | Note |
|---|---|---|---|---|
| TokensIssued | FALSE | TRUE | FALSE | Registry mint proxy emits; TEG blocked |
| TokensTransferred | FALSE | TRUE | FALSE | |
| TokensStaked | FALSE | TRUE | FALSE | |
| TransactionFeeCollected | FALSE | TRUE | FALSE | |
| CrossRegistryTransferCompleted | FALSE | TRUE | TRUE | audit only |
| RewardsDistributed | FALSE | TRUE | TRUE | audit only |

The live event_emission_policies table holds a few dozen rows (financial + governance + lifecycle event types). The set grows as new event types land, so query it live rather than trusting a static count. The six rows above are the supply-critical ones; everything else is operational telemetry.
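When any service calls event_emitter.emit(...), the policy row for that event type decides whether the event is rejected, written and projected, or written but skipped by the projection: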
- Rejected events are logged (for forensics) but never written to the main log.
- Skip-in-projection events are written (they exist for the audit trail and WebSocket subscribers), but the projection engine ignores them for balance calculations. CrossRegistryTransferCompleted is the canonical example: it is informational, because the underlying TokensTransferred events already update balances.
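The gate's outcomes can be sketched as a small lookup. This is an assumption-laden illustration rather than the real event_emitter: the EmissionPolicy class, the decide function, the source labels "teg" and "registry", and the choice to reject event types that have no policy row are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmissionPolicy:
    emit_from_teg: bool
    emit_from_registry: bool
    skip_in_projection: bool


# Two rows copied from the policy table; the dict is a stand-in for the DB.
POLICIES = {
    "TokensTransferred": EmissionPolicy(False, True, False),
    "CrossRegistryTransferCompleted": EmissionPolicy(False, True, True),
}


def decide(event_type: str, source: str) -> str:
    """Return 'reject', 'write_and_project', or 'write_skip_projection'."""
    policy = POLICIES.get(event_type)
    if policy is None:
        return "reject"  # assumption: types without a policy row are rejected
    allowed = {
        "teg": policy.emit_from_teg,
        "registry": policy.emit_from_registry,
    }.get(source, False)
    if not allowed:
        return "reject"
    if policy.skip_in_projection:
        return "write_skip_projection"
    return "write_and_project"


print(decide("TokensTransferred", "registry"))               # write_and_project
print(decide("TokensTransferred", "teg"))                    # reject
print(decide("CrossRegistryTransferCompleted", "registry"))  # write_skip_projection
```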
The policy is mutable via governance. An admin API (PUT /api/v1/admin/event-emission-policies/{event_type}) changes the row live, with an audit event recorded — you can't silently flip the bits.
::: warn Never bypass the emission policy gate. The supply audit algorithm below assumes the policy is honored. If a TEG-side code path emits TokensTransferred (bypassing the gate), every balance will be double-counted, and the auditor will page the network within 60 seconds. :::
Projections — How Balances Are Calculated
When an endpoint asks "what's agent X's balance?", the projection engine looks at its cached snapshot row for X. The snapshot is maintained by a background worker that continuously replays new events after the last known baseline.
Key concept: baseline + delta.
balance(agent, t) = snapshot_at(baseline_t) + Σ events(agent, baseline_t..t)
On startup, or after POST /api/v1/admin/projection-snapshot, the engine:
- Pulls current balances directly from the TEG (source of truth)
- Writes them as a single ProjectionSnapshot event
- Replays only events after the snapshot to reach current state
This means:
- Projections are fast — reading a balance is an index lookup, not a full log replay
- A corrupted projection is recoverable — take a new snapshot and replay
- The snapshot itself is a ledger event — it's auditable too
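The baseline + delta idea reduces to "snapshot value plus whatever happened after it". A minimal sketch, assuming events carry a monotonically increasing sequence number and a signed per-agent amount (both hypothetical simplifications of the real event payloads):

```python
from typing import Dict, List, Tuple

Delta = Tuple[int, str, float]  # (log sequence number, agent, signed amount)


def current_balance(
    snapshot: Dict[str, float], baseline_seq: int, log: List[Delta], agent: str
) -> float:
    """Snapshot value plus only the deltas recorded after the baseline."""
    replayed = sum(
        amount for seq, who, amount in log if seq > baseline_seq and who == agent
    )
    return snapshot.get(agent, 0.0) + replayed


SNAPSHOT = {"agent-x": 50.0}   # balances captured at sequence 10
LOG = [
    (9, "agent-x", 5.0),       # already folded into the snapshot, so ignored
    (11, "agent-x", -20.0),
    (12, "agent-y", 20.0),
]

print(current_balance(SNAPSHOT, 10, LOG, "agent-x"))  # 30.0
print(current_balance(SNAPSHOT, 10, LOG, "agent-y"))  # 20.0
```

Taking a fresh snapshot amounts to replacing SNAPSHOT and advancing the baseline sequence, after which older events no longer need to be read.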
Single-Writer Balance Integrity (CQRS Transactional Outbox)
The replay model above is correct, but it had a structural weakness: balances are mutated at 40-plus call sites across roughly ten TEG files, and every one of those sites had to remember to emit the right balance event. Miss one and the projection silently drifts — the "emission gap." On top of that, two projection paths (the incremental trigger and the replay/snapshot path) could disagree, and using per-agent BalanceCorrected events as steady-state healing inflated the issued/destroyed magnitudes (it pushed one frame's issued figure to roughly twice the true supply).
The fix (now shipped) stops trying to make 40 hand-written emit calls complete. Instead it observes balances at the one place every change must pass through, the database row, and makes a single event type, LedgerBalanceChanged, the sole writer of the projection.
Why it conserves supply by construction. For one atomic TEG transaction (one Postgres txid), sum the balance deltas across all affected profiles — and because treasury and fee_collector are themselves profiles, they are in the sum. If the sum is 0 it was a transfer/stake/fee (issued/destroyed unchanged); if > 0, net new tokens entered circulation (issuance); if < 0, tokens left it (destruction). The journal records the actual moves, so this classification is correct no matter how mint, stake, fee, or bridge are coded across those forty sites. The invariant issued − destroyed + transit_net == circulating holds at the baseline and is preserved by every transaction group — Δ = 0 by construction, not by after-the-fact reconciliation.
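The classification rule is simple enough to state as code. The sketch below sums the per-profile deltas of one transaction group and labels the result; the function name and the example profiles are illustrative, and Decimal is used only to keep the arithmetic exact.

```python
from decimal import Decimal
from typing import Dict


def classify(deltas: Dict[str, Decimal]) -> str:
    """Classify one transaction group by the sum of its balance deltas."""
    net = sum(deltas.values(), Decimal("0"))
    if net > 0:
        return "issuance"
    if net < 0:
        return "destruction"
    return "neutral"  # transfer, stake, or fee: supply unchanged


transfer_with_fee = {
    "sender": Decimal("-100"),
    "receiver": Decimal("99.5"),
    "fee_collector": Decimal("0.5"),
}
mint = {"treasury": Decimal("1000")}
burn = {"agent-x": Decimal("-25")}

print(classify(transfer_with_fee), classify(mint), classify(burn))
# neutral issuance destruction
```

Because treasury and fee_collector appear in the same sum as every agent, the label depends only on what the rows actually did, not on which code path produced the change.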
Two engineering details make it safe: the outbox stamps emitted_at only after the EventStore accepts the event, and the idempotency key ledger-tx-{frame}-{txid} rides the EventStore's UNIQUE constraint — so a crash mid-cycle simply retries, exactly-once. The cutover is gated by a single projection_config.single_writer flag; when it flips true, the legacy typed balance events (TokensIssued, TokensTransferred, TokensStaked, fees, slashes, BalanceCorrected, …) are still written to the log but become no-ops in the projection, leaving LedgerBalanceChanged as the only thing that moves a balance.
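The crash-and-retry behaviour can be illustrated with an in-memory stand-in. Only the key format ledger-tx-{frame}-{txid} and the "stamp emitted_at after acceptance" rule come from the text; FakeEventStore, OutboxRow, and drain are hypothetical names, and a dict key plays the role of the UNIQUE constraint.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


class FakeEventStore:
    """In-memory stand-in; the dict key acts as the UNIQUE constraint."""

    def __init__(self) -> None:
        self.events: Dict[str, dict] = {}

    def append(self, idempotency_key: str, payload: dict) -> bool:
        # A duplicate key is treated as "already accepted", not as an error.
        self.events.setdefault(idempotency_key, payload)
        return True


@dataclass
class OutboxRow:
    frame: str
    txid: int
    payload: dict
    emitted_at: Optional[float] = None


def idempotency_key(row: OutboxRow) -> str:
    return f"ledger-tx-{row.frame}-{row.txid}"


def drain(outbox: List[OutboxRow], store: FakeEventStore, now: float) -> int:
    """Send every unstamped row; stamp emitted_at only after the store accepts."""
    sent = 0
    for row in outbox:
        if row.emitted_at is not None:
            continue
        if store.append(idempotency_key(row), row.payload):
            row.emitted_at = now
            sent += 1
    return sent


store = FakeEventStore()
row = OutboxRow(frame="a", txid=4711, payload={"deltas": {"treasury": 1000}})

# Simulated crash: the store accepted the event, but the stamp was never written.
store.append(idempotency_key(row), row.payload)

drain([row], store, now=1.0)  # the retry after restart
print(len(store.events), row.emitted_at)  # 1 1.0
```

The retry re-sends the same key, the store keeps a single copy, and only then does the row get its stamp.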
The Supply Invariant (Option 4 — Self-Consistent)
This is the heart of the chapter. Earlier designs compared ES numbers to DB numbers — and hit race conditions because the two systems update asynchronously. Option 4 fixes that by auditing the Event Store against itself.
The invariant:
tokens_issued − tokens_destroyed + transit_net == total_circulating
Where:
tokens_issued = snapshot_baseline_issued
    + Σ TokensIssued
    + Σ BalanceCorrected (positive corrections)
    + Σ DisputeCompensationMinted
total_circulating = Σ (liquid + staked) across ALL balance records in the projection
    (includes treasury, fee_collector, and every agent account)
tokens_destroyed = Σ (burned tokens) + Σ BalanceCorrected (negative corrections)
    (a frame with no burns sits at zero; burns add here, never to circulating)
transit_net = Σ signed non-supply legs (transfers, stakes, fees, cross-frame transit);
    Δ-neutral, closes the per-frame books
Fees are zero-sum for circulation: a transfer deducts 100 from sender, credits 99.5 to receiver and 0.5 to fee_collector. The sum of circulating changes is 0. Circulation only moves when tokens are minted; the destruction term tracks burns + negative balance corrections separately so a burn doesn't masquerade as a deficit.
The supply_auditor worker runs this every 60 seconds. Under sustained multi-simulator stress tests pushing tens to hundreds of transactions per second, delta stayed at 0.0 across millions of events.
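A stripped-down version of the check, assuming the four totals have already been computed from the log. The function name, the balance layout (liquid, staked), and the "BREACH" status string are assumptions for illustration; the formula is the invariant stated above.

```python
from decimal import Decimal as D
from typing import Dict, Tuple


def supply_audit(issued: D, destroyed: D, transit_net: D,
                 balances: Dict[str, Tuple[D, D]]) -> dict:
    """Check issued - destroyed + transit_net == sum of (liquid + staked)."""
    circulating = sum(
        (liquid + staked for liquid, staked in balances.values()), D("0")
    )
    delta = issued - destroyed + transit_net - circulating
    return {
        "total_circulating": circulating,
        "delta": delta,
        "status": "OK" if delta == 0 else "BREACH",
    }


balances = {
    "treasury": (D("900"), D("0")),
    "agent-a": (D("49.5"), D("50")),      # (liquid, staked)
    "fee_collector": (D("0.5"), D("0")),
}
print(supply_audit(D("1000"), D("0"), D("0"), balances)["status"])  # OK

balances["agent-a"] = (D("50.5"), D("50"))  # one token appears without a mint
print(supply_audit(D("1000"), D("0"), D("0"), balances)["status"])  # BREACH
```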
Query the Audit Directly
The EventStore endpoint is internal-only on production (not exposed via nginx). For external visibility, a public composite auditor (e.g. auditor.example.com) can surface each frame's delta in real time. Operators with internal access can hit:
GET /api/v1/projections/supply-audit (EventStore, internal)
{
  "tokens_issued": 1000000000.0,
  "tokens_destroyed": 0.0,
  "tokens_fees_collected": 12500.0,
  "net_supply": 1000000000.0,
  "total_circulating": 1000000000.0,
  "delta": 0.0,
  "status": "OK",
  "agent_count": 100
}
If delta is non-zero on any frame, the auditor pages within the same 60-second cycle.
::: info When a BREACH fires, the first action is not to try to "fix" the number. The first action is to halt new minting, capture the snapshot, and diff it against the last known-good snapshot. The invariant is not the problem; it is the alarm. :::
Snapshot Lifecycle
You can take a snapshot on demand (admin only):
POST /api/v1/admin/projection-snapshot
Authorization: Bearer <admin_jwt>
This pulls live balances from the TEG, writes a ProjectionSnapshot event, and sets it as the baseline for the projection engine. Subsequent audits anchor to this baseline and replay the events recorded since.
Set ENABLE_PROJECTION_SNAPSHOTS=true in the environment to make this available. In production it's on; in dev it can be disabled if you don't want a manual snapshot capability exposed.
WebSocket Fan-Out
Every accepted event is broadcast over WebSocket to subscribed clients. This is how UIs, monitoring dashboards, and agent programs get real-time updates without polling.
WebSocket /ws/events
Optional query param: ?event_types=TokensTransferred,ProposalTallied (comma-separated filter)
Events arrive as JSON frames over the WebSocket connection. The server replays the most recent 15 events on every reconnect (so reactors and dashboards catch up across brief disconnects). Delivery is at-least-once, so deduplication by id is the client's responsibility.
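Client-side deduplication can be as small as a set of seen ids. The sketch below handles raw JSON frames and drops any whose id was already processed; the EventDeduper class and the sample frames are hypothetical, and no WebSocket library is involved.

```python
import json
from typing import Optional, Set


class EventDeduper:
    """Drops frames whose event id has already been handled."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def accept(self, frame: str) -> Optional[dict]:
        event = json.loads(frame)
        if event["id"] in self._seen:
            return None
        self._seen.add(event["id"])
        return event


dedupe = EventDeduper()
frames = [
    '{"id": "e1", "event_type": "TokensTransferred"}',
    '{"id": "e2", "event_type": "ProposalTallied"}',
    '{"id": "e1", "event_type": "TokensTransferred"}',  # replayed after reconnect
]
handled = [e["id"] for e in map(dedupe.accept, frames) if e is not None]
print(handled)  # ['e1', 'e2']
```

A long-lived client would likely bound the set (for example with a fixed-size ring) rather than let it grow without limit.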
Querying the Log Directly
Admin-only, for forensics and audit:
GET /api/v1/events?aggregate_id=<did>&from=<ts>&to=<ts>&limit=100
Authorization: Bearer <admin_jwt>
Returns raw events in chronological order, bounded by pagination. The log is immutable: you cannot delete an event through the API. The only way to "remove" one is to add a compensating event with BalanceCorrected-like semantics, and the correction is itself logged.
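For scripted forensics, the query string is best built with a URL encoder, since DIDs and timestamps contain characters that need escaping. A small sketch using only the parameter names shown above; the helper name, the example DID, and the ISO 8601 timestamp format are assumptions.

```python
from urllib.parse import urlencode


def events_query_path(aggregate_id: str, start: str, end: str,
                      limit: int = 100) -> str:
    """Build the path and query string for the admin events endpoint."""
    query = urlencode({
        "aggregate_id": aggregate_id,
        "from": start,   # assumed ISO 8601; the source only shows <ts>
        "to": end,
        "limit": limit,
    })
    return f"/api/v1/events?{query}"


path = events_query_path(
    "did:example:agent-x", "2026-03-15T00:00:00Z", "2026-03-16T00:00:00Z"
)
print(path)
```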
Reactors — Cross-Service Reactions to Events
Some events have follow-on consequences that span more than one service: when a developer is suspended, the local registry must revoke credentials, fan webhooks out to the developer's apps, tell every peer registry to drop the agents from its blocklist, and notify the TEG layer. The original endpoint handles the local part synchronously, inside its database transaction; the rest happens out-of-band.
The reactor framework runs those cross-service follow-ons. Each reactor subscribes to one event type on the WebSocket fan-out, runs leader-elected via Redis (one process holds the lock per reactor across N uvicorn workers), and dispatches HTTP commands to the right downstream services when the event lands.
CQRS line preserved
Reactors react to events; their side effects are HTTP commands. They do not emit further events — that path leads to event-loop avalanches. If a downstream genuinely needs to emit (e.g., dispute reconciliation), the f037 emission policy gates it explicitly, and the inline endpoint does it.
Source of truth stays inline
The synchronous endpoint that triggered the event (e.g. bulk/suspend → coordinated_agent_suspend) is the source of truth. Reactor execution is best-effort propagation. If a reactor fails, the local DB state is still correct — the system retries on the next event observation, or an operator can re-fire the propagation manually. Disabling the framework with REACTOR_FRAMEWORK_ENABLED=false pauses propagation without touching inline correctness.
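The division of labour (inline endpoint owns correctness, reactor only propagates) can be sketched as follows. The Reactor class and the two command functions are hypothetical; plain callables stand in for the HTTP commands, and the reactor deliberately emits no events of its own.

```python
from typing import Callable, List


class Reactor:
    """Subscribes to one event type and runs best-effort commands for it."""

    def __init__(self, event_type: str,
                 commands: List[Callable[[dict], None]]) -> None:
        self.event_type = event_type
        self.commands = commands
        self.dispatched = 0
        self.errors = 0

    def on_event(self, event: dict) -> None:
        if event["event_type"] != self.event_type:
            return
        for command in self.commands:
            try:
                command(event)    # stands in for an HTTP call downstream
                self.dispatched += 1
            except Exception:
                self.errors += 1  # counted, never raised back to the emitter


calls: List[str] = []


def block_in_teg(event: dict) -> None:
    calls.append(f"teg-block:{event['aggregate_id']}")


def flaky_webhook(event: dict) -> None:
    raise ConnectionError("webhook endpoint is down")


reactor = Reactor("AgentSuspended", [block_in_teg, flaky_webhook])
reactor.on_event({"event_type": "AgentSuspended", "aggregate_id": "agent-x"})
reactor.on_event({"event_type": "TokensTransferred", "aggregate_id": "agent-y"})
print(calls, reactor.dispatched, reactor.errors)  # ['teg-block:agent-x'] 1 1
```

The failed webhook shows up only as an error counter; the suspension itself was already committed by the inline endpoint.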
What the reactors do
Dozens of reactor instances run on each registry (several files define multiple sub-reactors: reactor_canary_validator.py has 5 sub-classes, reactor_admin_broadcast.py / reactor_org_lifecycle_notification.py / reactor_cicd_notification.py each have 3). Rather than trust a static count, verify the live state any time via GET /api/v1/admin/reactor/status (returns the running set with dispatch counts + error counters).
Grouped by domain:
| Domain | Reactors | What they do |
|---|---|---|
| Identity / agents | AgentSuspended, AgentReinstated, AgentSlashed, AgentCardDriftFlag, DeveloperSuspended, JWSVerifyFailedThrottle | Webhook + TEG block/unblock + cross-frame fan-out on agent lifecycle. Drift-flag reactor surfaces SPIFFE↔card mismatches; JWS throttle batches verify failures. |
| Federation / peers | RegistryFederated, RegistryQuarantined, FederationLicenseSuspended, FederationLicenseReinstated, PeerRegistryDeactivated, OperatorApplicationRevoked, PeerUrlSelfHeal, FederationStateDriftDetected, ComplianceDryRunDrift, FrameFederationRevoked | TEG add-partner / remove-partner + DB cleanup + webhooks. PeerUrlSelfHeal rewrites stale peer URLs in place; ComplianceDryRunDrift emits observation-mode signals without acting. |
| Economy / treasury | BalanceSync, BalanceCorrected, SupplyInvariantBreach, BridgeTransferExpired, EventEmissionPolicyUpdated, FundingRequestCreated | Sync cached balances from TEG to Registry; alert on invariant drift (5-min throttle, breaks on delta change); email + webhook on every treasury correction. |
| Cross-frame settlement | CrossFrameInitiated, CrossFrameSettled, CrossFrameRefunded | Receive-side credit + saga reconciliation for cross-frame async transfers — polling backstop overridden to 10s for tighter SLO. |
| Canary (validators) | CanaryValidator-TokensTransferred, CanaryValidator-TransactionFeeCollected, CanaryValidator-CrossFrameInitiated, CanaryValidator-CrossFrameSettled, CanaryValidator-CrossFrameRefunded | Stamp + judge state hash per canary test (5 sub-reactors in reactor_canary_validator.py). Powers the 38-path canary scoreboard on Grafana. |
| Governance | ProposalTallied, DisputeFiled | Webhook + themed email when a proposal closes or a dispute is filed. |
| Operator lifecycle | OrgMemberInvited, OrgMemberJoined, OrgMemberRemoved | WebSocket broadcasts to invitee + org admins on each lifecycle transition (3 sub-reactors in reactor_org_lifecycle_notification.py, see Ch 20). |
| CI/CD pipelines | DeploymentTriggered, DeploymentStatusUpdated, DeploymentWebhookExhausted | Pipeline lifecycle propagation — webhook delivery on deployment state change (3 sub-reactors in reactor_cicd_notification.py). |
| Admin broadcasts | AdminBroadcastSentReceived, AdminBroadcastEditedReceived, AdminBroadcastRetractedReceived | Cross-frame broadcast propagation — admin announcements fan-out to peer frames (3 sub-reactors in reactor_admin_broadcast.py). |
| Webhooks (self) | WebhookRetryExhausted | Themed dev-email — "your endpoint is down, here is what we tried" — after the 5-attempt retry ladder exhausts. |
| Routed fiat settlement | FiatRoutedSettlementInitiated, FiatRoutedSettlementCompleted | Routed-fiat (Stripe→AVT) settlement reactors. Run outside the REACTOR_FRAMEWORK_ENABLED master gate (alongside eventstore_balance_sync) because routed-fiat operation depends on them. |
| ZKP attestations (gated off) | AttestationDueReminder, AttestationExpired, AttestationRevoked | Reminders + revocation cleanup — currently gate-blocked by ZKP_PHASE_*_ENABLED=false until those phases activate. |
Operational properties
- Idempotent. Side-effect targets (TEG block, webhook delivery, mail send) tolerate replay. The inline source of truth is the contract; reactors only propagate.
- Leader-elected via Redis. One worker per reactor holds a 30-second lock; standbys watch and take over on lease expiry. Crash-safe.
- Observable. /admin/reactor/status returns aggregate dispatch counts, error counters, last-event timestamps, and per-reactor is_leader. The AdminReactors panel (/ui#/admin/reactors) auto-refreshes the same data every 10 seconds.
- Backlog replay safe. The EventStore replays the most recent 15 events on every reconnect; each reactor keeps a 256-entry ring of seen event_ids and skips duplicates within that window. (Errors past the 15-event replay window are silently dropped, a known operational caveat.)
- Throttled where it matters. SupplyInvariantBreach defaults to a 5-minute throttle to avoid alert storms during projection-snapshot races; persistent drift fires again on the next cycle.
- Killable. REACTOR_FRAMEWORK_ENABLED=false pauses propagation entirely without affecting the inline endpoint correctness. Per-reactor toggles (REACTOR_<NAME>_ENABLED=false) let an operator pause a single reactor while keeping the rest running.
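The leader-election property (one holder per reactor, takeover on lease expiry) can be modelled without Redis. The sketch below is an in-memory stand-in with an injected clock; LeaseStore and try_acquire are hypothetical names, and letting the current holder renew through the same call is a simplification of what a real Redis lock would do.

```python
from typing import Dict, Tuple


class LeaseStore:
    """In-memory stand-in for a Redis key with a TTL."""

    def __init__(self) -> None:
        self._leases: Dict[str, Tuple[str, float]] = {}

    def try_acquire(self, reactor: str, worker: str, now: float,
                    ttl: float = 30.0) -> bool:
        holder = self._leases.get(reactor)
        if holder is not None and holder[0] != worker and holder[1] > now:
            return False  # another worker holds an unexpired lease
        self._leases[reactor] = (worker, now + ttl)  # acquire or renew
        return True


leases = LeaseStore()
print(leases.try_acquire("AgentSuspended", "worker-1", now=0.0))   # True
print(leases.try_acquire("AgentSuspended", "worker-2", now=10.0))  # False
print(leases.try_acquire("AgentSuspended", "worker-1", now=20.0))  # True
print(leases.try_acquire("AgentSuspended", "worker-2", now=51.0))  # True
```

The last call succeeds because the renewal at t=20 moved expiry to t=50; a standby that keeps polling takes over shortly after the leader stops renewing.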
::: tip The reactor framework is what lets the platform tell a developer "your agent was just suspended" within 500ms of the suspension landing, without coupling the suspension endpoint's transaction to a flaky webhook delivery. Inline does correctness; reactors do reach. :::
Admin Broadcasts — The Federation-Wide Announcement Channel
Sometimes the platform needs to tell every developer something at once. Maintenance is starting in 10 minutes. A critical advisory just dropped. The price of AVT just moved 3%. A new sovereign agent is live and accepting commissions. The Protocol ships a first-class admin-broadcast channel for exactly this — and broadcasts propagate across sovereign frames so a Frame A admin's announcement reaches developers connected to Frame B without anyone re-posting.
What gets delivered, where
A broadcast is composed once by an admin and arrives on three independent channels per recipient — best-fit per the recipient's current connection state:
The four broadcast types drive UI colour-theming + email theme + sub-headers — info (cyan, default), warning (amber), maintenance (purple), critical (red). Warning and critical also flip the email render to the sovereign red theme; info and maintenance use the neural cyan theme.
Cross-frame propagation — frame-wide reach without a relay
Where it gets interesting: broadcasts opt in to cross-frame federation. The AdminBroadcastSent event lands on the local EventStore, propagates to the peer frame via SF-4 cross-frame event projection, and materialises as a local row on the peer frame courtesy of reactor_admin_broadcast_received.
Three reactor classes back this — AdminBroadcastSentReceived, AdminBroadcastEditedReceived, AdminBroadcastRetractedReceived (all in reactor_admin_broadcast.py, all in the reactor inventory above). Dedup is keyed by (source_frame, idempotency_key) so a broadcast that flows back to its origin (via further peer relays) is filtered out cleanly.
::: info The opt-in semantics, end-to-end. A broadcast that carries an idempotency_key flows cross-frame. One that doesn't, stays local. Both POST schemas accept the field, so the admin chooses propagation at compose time. The design lock: "default audience is LOCAL; federation-wide is admin opt-in." When you want fleet-wide reach, set idempotency_key; when you want a local-only operator notice, leave it null. :::
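The opt-in rule and the (source_frame, idempotency_key) dedup can be sketched together. FrameBroadcasts is a hypothetical in-memory model, not the reactor code; in particular, recording a frame's own key at send time so that echoes are dropped is an assumption about how the origin filters its own broadcast.

```python
from typing import List, Set, Tuple


class FrameBroadcasts:
    def __init__(self, frame: str) -> None:
        self.frame = frame
        self.rows: List[dict] = []
        self._seen: Set[Tuple[str, str]] = set()

    def send(self, broadcast: dict) -> bool:
        """Store a locally composed broadcast; True means it goes cross-frame."""
        self.rows.append(broadcast)
        key = broadcast.get("idempotency_key")
        if key is None:
            return False                   # local-only notice
        self._seen.add((self.frame, key))  # remember our own, to drop echoes
        return True

    def receive(self, source_frame: str, broadcast: dict) -> bool:
        """Materialise a peer broadcast once; False means duplicate or echo."""
        key = (source_frame, broadcast["idempotency_key"])
        if key in self._seen:
            return False
        self._seen.add(key)
        self.rows.append(broadcast)
        return True


frame_a, frame_b = FrameBroadcasts("frame-a"), FrameBroadcasts("frame-b")
notice = {"title": "Maintenance in 10 minutes", "idempotency_key": "maint-1"}

print(frame_a.send(notice))                # True: propagates
print(frame_b.receive("frame-a", notice))  # True: first delivery
print(frame_b.receive("frame-a", notice))  # False: duplicate
print(frame_a.receive("frame-a", notice))  # False: echo back at the origin
print(frame_a.send({"title": "Local note", "idempotency_key": None}))  # False
```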
Targeting — who actually sees a broadcast
A single broadcast can land everywhere on the local frame, narrow to one organisation's members, or fire in the future:
All three dimensions are independent — a maintenance window can be org-scoped + scheduled + frame-wide in one POST.
Actionable + acknowledgement-required broadcasts
Two recent capabilities let broadcasts go beyond "FYI":
- Call-to-action (CTA). Set cta_url + cta_label on the POST. The UI renders a labelled button, e.g. "Approve the change" linking to /ui#/admin/proposals/42. Useful for action-required notices that should not just disappear.
- Requires acknowledgement. Set requires_ack=true. The notification cannot be auto-dismissed; the developer must explicitly click "Acknowledge" before the toast hides. The dismissal writes a developer_notification_reads row keyed by (developer_id, broadcast_id). Once acknowledged, it stops appearing on login.
Per-developer notification preferences
Developers control their own filter via GET/PATCH /developers/me/notification-prefs. They can mute by type — e.g., a developer who only cares about critical can mute info + maintenance and still receive warning + critical. Email opt-out is independent of WebSocket.
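How the two preference dimensions combine can be shown with a small filter. The preference keys muted_types and email_opt_out are hypothetical (the text names the endpoint but not its schema), and treating a muted type as silenced on every channel is an assumption.

```python
from typing import Set


def channels_for(broadcast_type: str, prefs: dict) -> Set[str]:
    """Return the channels a broadcast of this type should use for one developer."""
    if broadcast_type in prefs.get("muted_types", []):
        return set()
    channels = {"websocket"}
    if not prefs.get("email_opt_out", False):
        channels.add("email")
    return channels


prefs = {"muted_types": ["info", "maintenance"], "email_opt_out": True}
print(channels_for("info", prefs))      # set()
print(channels_for("critical", prefs))  # {'websocket'}
```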
Two dedicated workers
| Worker | Cadence | Job |
|---|---|---|
| broadcast_scheduler | 60s | Find rows with scheduled_for ≤ now() AND email_sent_at IS NULL AND is_active=true; fire WS + (optional) email. |
| broadcast_email_retry_worker | 60s | Retry failed email sends with exponential backoff; capture permanent failures in a dead-letter queue surfaced via GET /admin/broadcast-email-retry/dead-letter. |
Both are leader-elected via Redis (single execution across the registry's worker pool) and observable via the /admin/broadcast-scheduler/status and /admin/broadcast-email-retry/status endpoints.
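Both jobs reduce to short predicates. The sketch below mirrors the scheduler's selection rule from the table and shows one common shape for exponential backoff; the row layout is illustrative, and the 60-second base and one-hour cap are assumptions, since the text does not state the retry worker's actual parameters.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def due_broadcasts(rows: List[dict], now: datetime) -> List[dict]:
    """scheduled_for <= now AND email_sent_at IS NULL AND is_active."""
    return [
        row for row in rows
        if row["scheduled_for"] is not None
        and row["scheduled_for"] <= now
        and row["email_sent_at"] is None
        and row["is_active"]
    ]


def retry_delay(attempt: int, base: float = 60.0, cap: float = 3600.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based), capped."""
    return min(cap, base * 2 ** attempt)


now = datetime(2026, 3, 15, 18, 0, tzinfo=timezone.utc)
rows = [
    {"id": 1, "scheduled_for": now - timedelta(minutes=1),
     "email_sent_at": None, "is_active": True},
    {"id": 2, "scheduled_for": now + timedelta(hours=1),
     "email_sent_at": None, "is_active": True},
    {"id": 3, "scheduled_for": now - timedelta(hours=2),
     "email_sent_at": now, "is_active": True},
    {"id": 4, "scheduled_for": now - timedelta(hours=2),
     "email_sent_at": None, "is_active": False},
]
print([row["id"] for row in due_broadcasts(rows, now)])  # [1]
print([retry_delay(n) for n in range(4)])  # [60.0, 120.0, 240.0, 480.0]
```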
The broadcast endpoint surface
| Method | Path | Auth | Purpose |
|---|---|---|---|
| POST | /admin/broadcasts | admin | Compose + send (or schedule) a broadcast |
| GET | /admin/broadcasts | admin | List broadcasts (filter by include_inactive, broadcast_type, target_org_id) |
| PUT | /admin/broadcasts/{id} | admin | Edit an active broadcast; emits AdminBroadcastEdited for cross-frame sync |
| DELETE | /admin/broadcasts/{id} | admin | Soft-retract (is_active=false); emits AdminBroadcastRetracted |
| POST | /admin/broadcasts/{id}/reinstate | admin | Reverse a soft-retract |
| GET | /admin/broadcasts/overview | admin | Aggregate stats: counts by type, last-24h / last-7d, ws_connections_now, verified_developers |
| GET | /admin/broadcasts/{id}/stats | admin | Per-broadcast: ws_recipients, reads count, email outcomes |
| GET | /notifications/unread | dev JWT | Pending broadcasts the caller hasn't read |
| GET | /notifications/{broadcast_id} | dev JWT | Single broadcast detail |
| POST | /notifications/{broadcast_id}/read | dev JWT | Mark as read (writes developer_notification_reads row) |
| GET | /admin/broadcast-templates | admin | List saved compose templates |
| POST | /admin/broadcast-templates | admin | Save a new template (label, title_template, body_template) |
| DELETE | /admin/broadcast-templates/{id} | admin | Delete a template |
| GET | /developers/me/notification-prefs | dev JWT | Read caller's per-type preferences |
| PATCH | /developers/me/notification-prefs | dev JWT | Mute/unmute by type, toggle email opt-out |
| GET | /admin/broadcast-scheduler/status | admin | Worker liveness + last-run timestamp + scheduled count |
| GET | /admin/broadcast-email-retry/status | admin | Email retry worker liveness + queue depth |
| GET | /admin/broadcast-email-retry/dead-letter | admin | Permanently-failed email recipients |
| GET | /community/announcements | public | Public-facing announcement feed (limited fields, info-type only) |
::: tip Test a broadcast safely. Compose with broadcast_type=info + send_email=false. The WS push fires to anyone currently watching, but no email goes out. Retract via DELETE /admin/broadcasts/{id} immediately if it was test scaffolding. Per-broadcast stats at /admin/broadcasts/{id}/stats show whether anyone actually saw it. :::
::: info The frontend. /ui#/admin/broadcasts is the management view: compose, schedule, edit, retract, see live stats per broadcast, manage templates, view the dead-letter queue. The composer view supports preview-as-developer to see exactly what your audience will see before sending. :::
What's Next
- 🔗 02 - The Token Economy: the balances this ledger is the source of truth for
- 🔗 04 - Contracts, A2A & Disputes: where DisputeCompensationMinted comes from
- 🔗 08 - Security Architecture: how event emitters are authenticated via SPIRE
- 🔗 16 - Monitoring & Observability: dashboards and alerts on invariant health
- 🔗 19 - Compliance & Governance of the Protocol: auditor workflow and ISO 42001 mapping
- 🔗 21 - Webhooks & Integrations: how to subscribe a backend HTTP endpoint to the reactor stream described above