Skip to content

The Event Store & Supply Audit

The immutable ledger that makes the whole platform auditable. No silent state changes. No "trust us" on the supply. Every balance is a replay of history, and a worker checks that history against itself every 60 seconds.

Why It Matters

Every sovereign infrastructure eventually gets this question: "how do we know the operator isn't quietly printing tokens?" The honest answer has to be cryptographic, not rhetorical. The Event Store is how the protocol provides that answer. Every economic event — transfers, mints, stakes, disputes, slashings — lives in an append-only ledger. Balances are derived from replay, not stored. And a continuous self-audit compares issuance totals to circulation totals. Delta must equal zero.

This chapter covers how that works end to end: the CQRS architecture that makes it cheap, the policy layer that keeps the ledger honest, and the audit algorithm that proves supply is conserved.

CQRS at the Heart

The platform uses Command Query Responsibility Segregation (CQRS): writes go to an append-only event log, reads come from derived projections that get rebuilt from that log. No row in any balance table is ever updated in place — rows are rebuilt by replaying events.

Three benefits you get from this shape:

  1. Audit trail for free. You don't add auditability — it is the storage model.
  2. Time-travel queries. "What was agent X's balance on 2026-03-15 18:00:00Z?" is a natural question because you can replay the log up to that timestamp.
  3. Rebuild from scratch. A corrupted projection is fixable — drop it and replay the log. The log is the only thing that matters.

Event Types (Partial List)

Not every event affects balances. The projection engine cares about the balance-affecting ones. A non-exhaustive taxonomy:

CategoryEventBalance effectEmitter
MintTokensIssued+liquid (treasury)Registry (mint proxy; TEG performs the DB mint and is gate-blocked from emitting)
MintDisputeCompensationMinted+liquid (winner)Registry
MoveTokensTransferred−sender, +receiverRegistry
MoveTransactionFeeCollected+fee_collectorRegistry
StakeTokensStaked−liquid, +stakedRegistry
StakeTokensUnstaked+liquid, −stakedRegistry
StakeRewardsDistributed / RewardsAutoCompounded+liquid (paid) or +staked (auto-compounded)Registry
AuditBalanceCorrected±liquid (operator action)Registry
AuditProjectionSnapshotbaseline resetRegistry
Cross-regCrossRegistryTransferCompleted0 (audit-only)Registry
LifecycleAgentOnboarded, AgentStatusChanged0Registry

Events are serialized JSON. The log stores id (UUID), event_type, aggregate_id (usually the agent DID or system account), aggregate_type, data (the event payload), metadata, timestamp, version, source_service (which service emitted — e.g. registry-a, teg-layer-a), and idempotency_key. Emitting services authenticate to the Event Store via mTLS (SPIRE SVID at the connection layer), not by signing each event individually.

Emission Policy (f037)

The critical piece — who is allowed to emit which events? If both the registry and the TEG emit TokensTransferred, your projection double-counts. The policy layer prevents that.

Each event type has a row in event_emission_policies:

event_type                           emit_from_teg  emit_from_registry  skip_in_projection
---------------------------------------------------------------------------------------------
TokensIssued                         FALSE          TRUE                FALSE   ← Registry mint proxy emits; TEG blocked
TokensTransferred                    FALSE          TRUE                FALSE
TokensStaked                         FALSE          TRUE                FALSE
TransactionFeeCollected              FALSE          TRUE                FALSE
CrossRegistryTransferCompleted       FALSE          TRUE                TRUE    ← audit only
RewardsDistributed                   FALSE          TRUE                TRUE    ← audit only

The live event_emission_policies table holds a few dozen rows (financial + governance + lifecycle event types) — the set grows as new event types land, so query it live rather than trusting a static count. The six rows above are the supply-critical ones; everything else is operational telemetry.

When any service calls event_emitter.emit(...), this decision tree runs:

  • Rejected events are logged (for forensics) but never written to the main log.
  • Skip-in-projection events are written — they exist for audit trail and WebSocket subscribers — but the projection engine ignores them for balance calculations. CrossRegistryTransferCompleted is the canonical example: it's informational; the underlying TokensTransferred events already update balances.

The policy is mutable via governance. An admin API (PUT /api/v1/admin/event-emission-policies/{event_type}) changes the row live, with an audit event recorded — you can't silently flip the bits.

::: warn Never bypass the emission policy gate. The supply audit algorithm below assumes the policy is honored. If a TEG-side code path emits TokensTransferred (bypassing the gate), every balance will be double-counted, and the auditor will page the network within 60 seconds. :::

Projections — How Balances Are Calculated

When an endpoint asks "what's agent X's balance?", the projection engine looks at its cached snapshot row for X. The snapshot is maintained by a background worker that continuously replays new events after the last known baseline.

Key concept: baseline + delta.

balance(agent, t) = snapshot_at(baseline_t) + Σ events(agent, baseline_t..t)

On startup, or after POST /api/v1/admin/projection-snapshot, the engine:

  1. Pulls current balances directly from the TEG (source of truth)
  2. Writes them as a single ProjectionSnapshot event
  3. Replays only events after the snapshot to reach current state

This means:

  • Projections are fast — reading a balance is an index lookup, not a full log replay
  • A corrupted projection is recoverable — take a new snapshot and replay
  • The snapshot itself is a ledger event — it's auditable too

Single-Writer Balance Integrity (CQRS Transactional Outbox)

The replay model above is correct, but it had a structural weakness: balances are mutated at 40-plus call sites across roughly ten TEG files, and every one of those sites had to remember to emit the right balance event. Miss one and the projection silently drifts — the "emission gap." On top of that, two projection paths (the incremental trigger and the replay/snapshot path) could disagree, and using per-agent BalanceCorrected events as steady-state healing inflated the issued/destroyed magnitudes (it pushed one frame's issued figure to roughly twice the true supply).

The fix (now shipped) stops trying to make 40 hand-written emit calls complete. Instead it observes balances at the one place every change must pass through — the database row — and makes a single event type the sole writer of the projection:

Why it conserves supply by construction. For one atomic TEG transaction (one Postgres txid), sum the balance deltas across all affected profiles — and because treasury and fee_collector are themselves profiles, they are in the sum. If the sum is 0 it was a transfer/stake/fee (issued/destroyed unchanged); if > 0, net new tokens entered circulation (issuance); if < 0, tokens left it (destruction). The journal records the actual moves, so this classification is correct no matter how mint, stake, fee, or bridge are coded across those forty sites. The invariant issued − destroyed + transit_net == circulating holds at the baseline and is preserved by every transaction group — Δ = 0 by construction, not by after-the-fact reconciliation.

Two engineering details make it safe: the outbox stamps emitted_at only after the EventStore accepts the event, and the idempotency key ledger-tx-{frame}-{txid} rides the EventStore's UNIQUE constraint — so a crash mid-cycle simply retries, exactly-once. The cutover is gated by a single projection_config.single_writer flag; when it flips true, the legacy typed balance events (TokensIssued, TokensTransferred, TokensStaked, fees, slashes, BalanceCorrected, …) are still written to the log but become no-ops in the projection, leaving LedgerBalanceChanged as the only thing that moves a balance.

The Supply Invariant (Option 4 — Self-Consistent)

This is the heart of the chapter. Earlier designs compared ES numbers to DB numbers — and hit race conditions because the two systems update asynchronously. Option 4 fixes that by auditing the Event Store against itself.

The invariant:

tokens_issued − tokens_destroyed + transit_net  ==  total_circulating

Where:

tokens_issued      = snapshot_baseline_issued
                   + Σ TokensIssued
                   + Σ BalanceCorrected (positive corrections)
                   + Σ DisputeCompensationMinted

total_circulating  = Σ (liquid + staked) across ALL balance records in the projection
                    (includes treasury, fee_collector, and every agent account)

tokens_destroyed   = Σ (burned tokens) + Σ BalanceCorrected (negative corrections)
                    (a frame with no burns sits at zero; burns add here, never to circulating)

transit_net        = Σ signed non-supply legs (transfers, stakes, fees,
                    cross-frame transit) — Δ-neutral, closes the per-frame books

Fees are zero-sum for circulation: a transfer deducts 100 from sender, credits 99.5 to receiver and 0.5 to fee_collector. The sum of circulating changes is 0. Circulation only moves when tokens are minted; the destruction term tracks burns + negative balance corrections separately so a burn doesn't masquerade as a deficit.

The supply_auditor worker runs this every 60 seconds. Under sustained multi-simulator stress tests pushing tens to hundreds of transactions per second, delta stayed at 0.0 across millions of events.

Query the Audit Directly

The EventStore endpoint is internal-only on production (not exposed via nginx). For external visibility, a public composite auditor (e.g. auditor.example.com) can surface each frame's delta in real time. Operators with internal access can hit:

http
GET /api/v1/projections/supply-audit       (EventStore, internal)
json
{
  "tokens_issued": 1000000000.0,
  "tokens_destroyed": 0.0,
  "tokens_fees_collected": 12500.0,
  "net_supply": 1000000000.0,
  "total_circulating": 1000000000.0,
  "delta": 0.0,
  "status": "OK",
  "agent_count": 100
}

If delta is non-zero on any frame, the auditor pages within the same 60-second cycle.

INFO

When a BREACH fires, the first action is not to try to "fix" the number. The first action is to halt new minting, capture the snapshot, and diff it against the last known-good snapshot. The invariant is not the problem — it's the alarm.

Snapshot Lifecycle

You can take a snapshot on demand (admin only):

http
POST /api/v1/admin/projection-snapshot
Authorization: Bearer <admin_jwt>

This pulls live balances from the TEG, writes a ProjectionSnapshot event, and sets it as the baseline for the projection engine. Subsequent audits anchor to this baseline + replay events since.

Set ENABLE_PROJECTION_SNAPSHOTS=true in the environment to make this available. In production it's on; in dev it can be disabled if you don't want a manual snapshot capability exposed.

WebSocket Fan-Out

Every accepted event is broadcast over WebSocket to subscribed clients. This is how UIs, monitoring dashboards, and agent programs get real-time updates without polling.

WebSocket /ws/events
  Optional query param: ?event_types=TokensTransferred,ProposalTallied  (comma-separated filter)

Events arrive as JSON frames over the WebSocket connection. The server replays the most recent 15 events on every reconnect (so reactors and dashboards catch up across brief disconnects); deduplication by id is the client's responsibility — at-least-once delivery semantics.

Querying the Log Directly

Admin-only, for forensics and audit:

http
GET /api/v1/events?aggregate_id=<did>&from=<ts>&to=<ts>&limit=100
Authorization: Bearer <admin_jwt>

Returns raw events in chronological order. Bounded by pagination. Immutable — you cannot delete an event through the API; the only way to "remove" one is to add a compensating event with a BalanceCorrected-like semantic, and the correction is itself logged.

Reactors — Cross-Service Reactions to Events

Some events have follow-on consequences that span more than one service: when a developer is suspended, the local registry must revoke credentials and fan webhooks out to the developer's apps and tell every peer registry to drop the agents from its blocklist and notify the TEG layer. The original endpoint handles the local part synchronously and inside its database transaction; the rest happens out-of-band.

The reactor framework runs those cross-service follow-ons. Each reactor subscribes to one event type on the WebSocket fan-out, runs leader-elected via Redis (one process holds the lock per reactor across N uvicorn workers), and dispatches HTTP commands to the right downstream services when the event lands.

CQRS line preserved

Reactors react to events; their side effects are HTTP commands. They do not emit further events — that path leads to event-loop avalanches. If a downstream genuinely needs to emit (e.g., dispute reconciliation), the f037 emission policy gates it explicitly, and the inline endpoint does it.

Source of truth stays inline

The synchronous endpoint that triggered the event (e.g. bulk/suspendcoordinated_agent_suspend) is the source of truth. Reactor execution is best-effort propagation. If a reactor fails, the local DB state is still correct — the system retries on the next event observation, or an operator can re-fire the propagation manually. Disabling the framework with REACTOR_FRAMEWORK_ENABLED=false pauses propagation without touching inline correctness.

What the reactors do

Dozens of reactor instances run on each registry (several files define multiple sub-reactors: reactor_canary_validator.py has 5 sub-classes, reactor_admin_broadcast.py / reactor_org_lifecycle_notification.py / reactor_cicd_notification.py each have 3). Rather than trust a static count, verify the live state any time via GET /api/v1/admin/reactor/status (returns the running set with dispatch counts + error counters).

Grouped by domain:

DomainReactorsWhat they do
Identity / agentsAgentSuspended, AgentReinstated, AgentSlashed, AgentCardDriftFlag, DeveloperSuspended, JWSVerifyFailedThrottleWebhook + TEG block/unblock + cross-frame fan-out on agent lifecycle. Drift-flag reactor surfaces SPIFFE↔card mismatches; JWS throttle batches verify failures.
Federation / peersRegistryFederated, RegistryQuarantined, FederationLicenseSuspended, FederationLicenseReinstated, PeerRegistryDeactivated, OperatorApplicationRevoked, PeerUrlSelfHeal, FederationStateDriftDetected, ComplianceDryRunDrift, FrameFederationRevokedTEG add-partner / remove-partner + DB cleanup + webhooks. PeerUrlSelfHeal rewrites stale peer URLs in place; ComplianceDryRunDrift emits observation-mode signals without acting.
Economy / treasuryBalanceSync, BalanceCorrected, SupplyInvariantBreach, BridgeTransferExpired, EventEmissionPolicyUpdated, FundingRequestCreatedSync cached balances from TEG to Registry; alert on invariant drift (5-min throttle, breaks on delta change); email + webhook on every treasury correction.
Cross-frame settlementCrossFrameInitiated, CrossFrameSettled, CrossFrameRefundedReceive-side credit + saga reconciliation for cross-frame async transfers — polling backstop overridden to 10s for tighter SLO.
Canary (validators)CanaryValidator-TokensTransferred, CanaryValidator-TransactionFeeCollected, CanaryValidator-CrossFrameInitiated, CanaryValidator-CrossFrameSettled, CanaryValidator-CrossFrameRefundedStamp + judge state hash per canary test (5 sub-reactors in reactor_canary_validator.py). Powers the 38-path canary scoreboard on Grafana.
GovernanceProposalTallied, DisputeFiledWebhook + themed email when a proposal closes or a dispute is filed.
Operator lifecycleOrgMemberInvited, OrgMemberJoined, OrgMemberRemovedWebSocket broadcasts to invitee + org admins on each lifecycle transition (3 sub-reactors in reactor_org_lifecycle_notification.py, see Ch 20).
CI/CD pipelinesDeploymentTriggered, DeploymentStatusUpdated, DeploymentWebhookExhaustedPipeline lifecycle propagation — webhook delivery on deployment state change (3 sub-reactors in reactor_cicd_notification.py).
Admin broadcastsAdminBroadcastSentReceived, AdminBroadcastEditedReceived, AdminBroadcastRetractedReceivedCross-frame broadcast propagation — admin announcements fan-out to peer frames (3 sub-reactors in reactor_admin_broadcast.py).
Webhooks (self)WebhookRetryExhaustedThemed dev-email — "your endpoint is down, here is what we tried" — after the 5-attempt retry ladder exhausts.
Routed fiat settlementFiatRoutedSettlementInitiated, FiatRoutedSettlementCompletedRouted-fiat (Stripe→AVT) settlement reactors. Run outside the REACTOR_FRAMEWORK_ENABLED master gate (alongside eventstore_balance_sync) because routed-fiat operation depends on them.
ZKP attestations (gated off)AttestationDueReminder, AttestationExpired, AttestationRevokedReminders + revocation cleanup — currently gate-blocked by ZKP_PHASE_*_ENABLED=false until those phases activate.

Operational properties

  • Idempotent. Side-effect targets (TEG block, webhook delivery, mail send) tolerate replay. The inline source of truth is the contract; reactors only propagate.
  • Leader-elected via Redis. One worker per reactor holds a 30-second lock; standbys watch and take over on lease expiry. Crash-safe.
  • Observable. /admin/reactor/status returns aggregate dispatch counts, error counters, last-event timestamps, and per-reactor is_leader. The AdminReactors panel (/ui#/admin/reactors) auto-refreshes the same data every 10 seconds.
  • Backlog replay safe. The EventStore replays the most recent 15 events on every reconnect; each reactor keeps a 256-entry ring of seen event_ids and skips duplicates within that window. (Errors past the 15-event replay window are silently dropped — a known operational caveat.)
  • Throttled where it matters. SupplyInvariantBreach defaults to a 5-minute throttle to avoid alert storms during projection-snapshot races; persistent drift fires again on the next cycle.
  • Killable. REACTOR_FRAMEWORK_ENABLED=false pauses propagation entirely without affecting the inline endpoint correctness. Per-reactor toggles (REACTOR_<NAME>_ENABLED=false) let an operator pause a single reactor while keeping the rest running.

TIP

The reactor framework is what lets the platform tell a developer "your agent was just suspended" within 500ms of the suspension landing — without coupling the suspension endpoint's transaction to a flaky webhook delivery. Inline does correctness; reactors do reach.

Admin Broadcasts — The Federation-Wide Announcement Channel

Sometimes the platform needs to tell every developer something at once. Maintenance is starting in 10 minutes. A critical advisory just dropped. The price of AVT just moved 3%. A new sovereign agent is live and accepting commissions. The Protocol ships a first-class admin-broadcast channel for exactly this — and broadcasts propagate across sovereign frames so a Frame A admin's announcement reaches developers connected to Frame B without anyone re-posting.

What gets delivered, where

A broadcast is composed once by an admin and arrives on three independent channels per recipient — best-fit per the recipient's current connection state:

The four broadcast types drive UI colour-theming + email theme + sub-headers — info (cyan, default), warning (amber), maintenance (purple), critical (red). Warning and critical also flip the email render to the sovereign red theme; info and maintenance use the neural cyan theme.

Cross-frame propagation — frame-wide reach without a relay

Where it gets interesting: broadcasts opt-in to cross-frame federation via the AdminBroadcastSent event landing on the local EventStore, propagating to the peer frame via SF-4 cross-frame event projection, and materialising as a local row on the peer frame courtesy of reactor_admin_broadcast_received.

Three reactor classes back this — AdminBroadcastSentReceived, AdminBroadcastEditedReceived, AdminBroadcastRetractedReceived (all in reactor_admin_broadcast.py, all in the reactor inventory above). Dedup is keyed by (source_frame, idempotency_key) so a broadcast that flows back to its origin (via further peer relays) is filtered out cleanly.

INFO

The opt-in semantics, end-to-end. A broadcast that carries an idempotency_key flows cross-frame. One that doesn't, stays local. Both POST schemas accept the field — the admin chooses propagation at compose time. The design lock: "default audience is LOCAL; federation-wide is admin opt-in." When you want fleet-wide reach, set idempotency_key; when you want a local-only operator notice, leave it null.

Targeting — who actually sees a broadcast

A single broadcast can land everywhere on the local frame, narrow to one organisation's members, or fire in the future:

All three dimensions are independent — a maintenance window can be org-scoped + scheduled + frame-wide in one POST.

Actionable + acknowledgement-required broadcasts

Two recent capabilities let broadcasts go beyond "FYI":

  • Call-to-action (CTA). Set cta_url + cta_label on the POST. The UI renders a labelled button — e.g. Approve the change linking to /ui#/admin/proposals/42. Useful for action-required notices that should not just disappear.
  • Requires acknowledgement. Set requires_ack=true. The notification cannot be auto-dismissed; the developer must explicitly click "Acknowledge" before the toast hides. The dismissal writes a developer_notification_reads row keyed by (developer_id, broadcast_id). Once acknowledged, it stops appearing on login.

Per-developer notification preferences

Developers control their own filter via GET/PATCH /developers/me/notification-prefs. They can mute by type — e.g., a developer who only cares about critical can mute info + maintenance and still receive warning + critical. Email opt-out is independent of WebSocket.

Two dedicated workers

WorkerCadenceJob
broadcast_scheduler60sFind rows with scheduled_for ≤ now() AND email_sent_at IS NULL AND is_active=true; fire WS + (optional) email.
broadcast_email_retry_worker60sRetry failed email sends with exponential backoff; capture permanent failures in a dead-letter queue surfaced via GET /admin/broadcast-email-retry/dead-letter.

Both are leader-elected via Redis (single execution across the registry's worker pool) and observable via the /admin/broadcast-scheduler/status and /admin/broadcast-email-retry/status endpoints.

The broadcast endpoint surface

MethodPathAuthPurpose
POST/admin/broadcastsadminCompose + send (or schedule) a broadcast
GET/admin/broadcastsadminList broadcasts (filter by include_inactive, broadcast_type, target_org_id)
PUT/admin/broadcasts/{id}adminEdit an active broadcast — emits AdminBroadcastEdited for cross-frame sync
DELETE/admin/broadcasts/{id}adminSoft-retract (is_active=false) — emits AdminBroadcastRetracted
POST/admin/broadcasts/{id}/reinstateadminReverse a soft-retract
GET/admin/broadcasts/overviewadminAggregate stats: counts by type, last-24h / last-7d, ws_connections_now, verified_developers
GET/admin/broadcasts/{id}/statsadminPer-broadcast: ws_recipients, reads count, email outcomes
GET/notifications/unreaddev JWTPending broadcasts the caller hasn't read
GET/notifications/{broadcast_id}dev JWTSingle broadcast detail
POST/notifications/{broadcast_id}/readdev JWTMark as read (writes developer_notification_reads row)
GET/admin/broadcast-templatesadminList saved compose templates
POST/admin/broadcast-templatesadminSave a new template (label, title_template, body_template)
DELETE/admin/broadcast-templates/{id}adminDelete a template
GET/developers/me/notification-prefsdev JWTRead caller's per-type preferences
PATCH/developers/me/notification-prefsdev JWTMute/unmute by type, toggle email opt-out
GET/admin/broadcast-scheduler/statusadminWorker liveness + last-run timestamp + scheduled count
GET/admin/broadcast-email-retry/statusadminEmail retry worker liveness + queue depth
GET/admin/broadcast-email-retry/dead-letteradminPermanently-failed email recipients
GET/community/announcementspublicPublic-facing announcement feed (limited fields, info-type only)

TIP

Test a broadcast safely. Compose with broadcast_type=info + send_email=false. The WS push fires to anyone currently watching, but no email goes out. Retract via DELETE /admin/broadcasts/{id} immediately if it was test scaffolding. Per-broadcast stats at /admin/broadcasts/{id}/stats show whether anyone actually saw it.

INFO

The frontend. /ui#/admin/broadcasts is the management view — compose, schedule, edit, retract, see live stats per broadcast, manage templates, view dead-letter queue. The composer view supports preview-as-developer to see exactly what your audience will see before sending.

What's Next

Server components AGPL-v3 · client SDK Apache-2.0. If a doc and the running stack disagree, trust the stack.