06 — Execution authority & intent safety

This doc covers two layers together because they are the two halves of "act safely": L4 makes execution aware of which authority it is acting under, and L5 ensures that only a real, reasoned action ever reaches L4.

Part A — L4: the `agent_authority` execution block

A.1 The gap this closes

Before this layer, a venue's execution binding carried venue-local signer fields (Polymarket CLOB creds, Hyperliquid API wallet) but never surfaced the parent authority. Every downstream consumer implicitly assumed Mode A / provider signer. That is dangerous: if a user linked a Virtuals wallet (Mode B), execution silently used the wrong authority.

The fix: every execution binding carries a non-secret agent_authority block that tells any consumer exactly which authority is active and how to sign.

A.2 The block

jsonc

{
  "agent_authority": {
    "wallet_profile_id": "uuid|null",
    "authority_source": "provider_managed | virtuals_linked | external_imported | unknown",
    "wallet_provider":  "privy | virtuals | external | unknown",
    "wallet_address":   "0x…",
    "provider_wallet_id": "id|null",
    "virtuals_agent_id":  "id|null",
    "proof_state":      "verified | unverified | pending | not_applicable",
    "signer_status":    "ready | required | pending | unsupported | unknown",
    "supported_chains": ["base","ethereum","arbitrum","polygon","solana"],
    "execution_route":  "provider_agent_wallet | virtuals_acp_sidecar | venue_local_signer | unsupported",
    "execution_supported": true,
    "blockers": []
  }
}

Never include: private keys, API wallet secrets, bearer tokens, CLOB secret material, Privy auth tokens, ACP secrets, raw encrypted payloads, or signer key references. Include IDs only if they are already non-secret internal references.
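
One way to guard the "never include" rule is to scan the block before it leaves the backend. The sketch below is only an illustration under stated assumptions: `FORBIDDEN_KEY_FRAGMENTS` and `find_secret_like_keys` are hypothetical names, and matching on key-name fragments is a heuristic chosen here, not something the reference implementation is known to do.

```python
from typing import Any, List

# Hypothetical denylist: key-name fragments that suggest secret material.
FORBIDDEN_KEY_FRAGMENTS = ("private_key", "secret", "token", "encrypted", "key_ref")


def find_secret_like_keys(block: Any, path: str = "agent_authority") -> List[str]:
    """Return dotted paths of keys whose names look like secret material."""
    hits: List[str] = []
    if isinstance(block, dict):
        for key, value in block.items():
            child = f"{path}.{key}"
            if any(frag in key.lower() for frag in FORBIDDEN_KEY_FRAGMENTS):
                hits.append(child)
            # Recurse so nested objects are checked too.
            hits.extend(find_secret_like_keys(value, child))
    elif isinstance(block, list):
        for index, item in enumerate(block):
            hits.extend(find_secret_like_keys(item, f"{path}[{index}]"))
    return hits


safe_block = {
    "wallet_profile_id": None,
    "authority_source": "virtuals_linked",
    "wallet_provider": "virtuals",
    "wallet_address": "0x…",
    "proof_state": "verified",
    "signer_status": "ready",
    "execution_route": "virtuals_acp_sidecar",
    "execution_supported": True,
    "blockers": [],
}
# A block that accidentally carries venue secret material.
leaky_block = {**safe_block, "clob_secret": "do-not-ship"}
```

A serializer could refuse to emit any binding for which `find_secret_like_keys` returns a non-empty list. A name-based check only catches obvious leaks, so it complements an explicit allowlist of fields rather than replacing one.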

A.3 Authority → execution route mapping

Active authority	Condition	`execution_route`	`execution_supported`
Mode A (`provider`/`privy`)	venue supports provider signing	`provider_agent_wallet`	✅ where venue impl can operate it
Mode B (`virtuals`, proof verified)	venue wired for ACP sidecar	`virtuals_acp_sidecar`	✅ only for supported venues
Mode B (`virtuals`, proof verified)	venue not wired for sidecar	`unsupported`	❌ blocker: `virtuals_authority_execution_not_supported_for_venue`
Mode B (`virtuals`, unverified)	—	`unsupported`	❌ blocker: `virtuals_authority_not_verified`
any	venue uses a child venue-local signer	`venue_local_signer`	✅ if that signer is ready
none (no profile)	—	`unsupported`	❌ blocker: `agent_wallet_required`

Hard safety rules:

No silent fallback Mode B → Mode A. If Mode B execution isn't wired for a venue, block precisely — never quietly sign with the provider wallet.
Do not auto-create Mode A when the active authority is Mode B.
ACP-native venues (Virtuals, DegenClaw) require Virtuals identity + proof + signer; missing → exact blocker (virtuals_identity_required, virtuals_signer_required, owner_handoff_required).
Venue-local signer fields stay separate from the parent agent_authority. The binding shows both: agent_authority (parent) and venue_signer/venue_credentials (child).
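
The parent-authority rows of the mapping can be sketched as a pure function. This is a minimal illustration, not the reference resolver: `RouteDecision`, `resolve_route`, and the two venue flags are hypothetical names, `provider_managed` and `virtuals_linked` are assumed to correspond to Mode A and Mode B, and the venue-local signer row is left out because the source does not state its precedence.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RouteDecision:
    execution_route: str
    execution_supported: bool
    blockers: List[str] = field(default_factory=list)


def unsupported(*blockers: str) -> RouteDecision:
    return RouteDecision("unsupported", False, list(blockers))


def resolve_route(
    authority_source: Optional[str],
    proof_state: str,
    *,
    venue_supports_provider_signing: bool,
    venue_wired_for_sidecar: bool,
) -> RouteDecision:
    if authority_source is None:  # no wallet profile at all
        return unsupported("agent_wallet_required")

    if authority_source == "virtuals_linked":  # Mode B (assumed mapping)
        if proof_state != "verified":
            return unsupported("virtuals_authority_not_verified")
        if not venue_wired_for_sidecar:
            # Block precisely; never fall back to the provider wallet.
            return unsupported("virtuals_authority_execution_not_supported_for_venue")
        return RouteDecision("virtuals_acp_sidecar", True)

    if authority_source == "provider_managed":  # Mode A (assumed mapping)
        if venue_supports_provider_signing:
            return RouteDecision("provider_agent_wallet", True)
        return unsupported()  # the source names no blocker for this case

    # external_imported / unknown: not covered by the table above.
    return unsupported()
```

Note that the Mode B branch returns before the Mode A branch is ever considered, so a venue that supports provider signing still yields `unsupported` for an unwired Mode B authority. That ordering is what makes the "no silent fallback" rule structural rather than a convention.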

A.4 The execution pipeline (execution-capable, never prepare-only)

execute_or_plan(action, *, execute):
   preflight (always):
     resolve agent_authority block            ← knows Mode A vs B vs local
     resolve VenueBinding + venue-local signer
     resolve funding (L3)
     build the plan / card
   if not execute:
     return plan; SIGN NOTHING; MOVE NOTHING
   if execute:
     check binding ready
     check agent_authority.execution_supported  → else exact blocker
     check VenuePolicy (per-venue caps/categories)
     check global spend policy (daily caps, reservations)
     check funding present
     check live flags                            → else ready_but_live_locked, execution_performed=false
     if approval required:                       → return requires_approval
     create durable SpendIntent + reservation
     execute via venue adapter (route per agent_authority.execution_route)
     write ExecutionReceipt + append audit/spend events
     stream real progress to the brain/chat

ready_but_live_locked with execution_performed: false is the canonical "everything is correct but the live flag is off" state. It is not a fake prepare-only product; flip the flag (after canary) and the same path executes.
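
The gate ordering above can be sketched as a small function. Everything here is an assumption made for illustration: the context is reduced to boolean flags, `HARD_GATES` and its blocker strings are hypothetical placeholders, and the adapter is a plain callable standing in for a venue adapter. Only `ready_but_live_locked`, `requires_approval`, and `execution_performed` come from the pipeline itself.

```python
from typing import Callable, Dict

# Ordered hard gates, mirroring the checks in the pipeline.
# The blocker strings on the right are hypothetical placeholders.
HARD_GATES = [
    ("binding_ready", "binding_not_ready"),
    ("execution_supported", "authority_execution_unsupported"),
    ("venue_policy_ok", "venue_policy_denied"),
    ("spend_policy_ok", "spend_policy_denied"),
    ("funded", "funding_missing"),
]


def execute_or_plan(
    ctx: Dict[str, bool], *, execute: bool, adapter: Callable[[], str]
) -> Dict[str, object]:
    if not execute:
        # Plan only: sign nothing, move nothing.
        return {"status": "planned", "execution_performed": False}

    for flag, blocker in HARD_GATES:
        if not ctx.get(flag, False):
            return {"status": "blocked", "blocker": blocker, "execution_performed": False}

    if not ctx.get("live_enabled", False):
        return {"status": "ready_but_live_locked", "execution_performed": False}
    if ctx.get("approval_required", False):
        return {"status": "requires_approval", "execution_performed": False}

    # A durable SpendIntent + reservation would be created here, before the adapter call.
    receipt_id = adapter()
    return {"status": "executed", "receipt_id": receipt_id, "execution_performed": True}
```

With every hard gate passing and `live_enabled` false, the function returns `ready_but_live_locked` without calling the adapter; setting `live_enabled` to true sends the same context down the same path to execution.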

A.5 The economic-action ledger (durable, attributable)

Every live action must be traceable end to end:

SpendIntent (reservation + policy decision)
   → venue execution (adapter)
   → ExecutionReceipt / EconomicActionReceipt
   → audit event

Each receipt must trace back to: user → agent → AgentAuthority → VenueBinding → SpendIntent. If a receipt can't be attributed to that chain, it's a quarantine condition, not a successful action. No secrets in any of these records.
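
The attribution rule lends itself to a mechanical check. The sketch below assumes a receipt carries one identifier per link in the chain; the field names, `REQUIRED_LINKS`, and the `"attributed"` / `"quarantine"` labels are hypothetical and not taken from the reference schema.

```python
from dataclasses import dataclass
from typing import List, Optional

# One identifier per link: user → agent → AgentAuthority → VenueBinding → SpendIntent.
REQUIRED_LINKS = (
    "user_id",
    "agent_id",
    "agent_authority_id",
    "venue_binding_id",
    "spend_intent_id",
)


@dataclass(frozen=True)
class ExecutionReceipt:
    receipt_id: str
    user_id: Optional[str] = None
    agent_id: Optional[str] = None
    agent_authority_id: Optional[str] = None
    venue_binding_id: Optional[str] = None
    spend_intent_id: Optional[str] = None


def missing_links(receipt: ExecutionReceipt) -> List[str]:
    """Names of chain links that are absent or empty on the receipt."""
    return [name for name in REQUIRED_LINKS if not getattr(receipt, name)]


def classify_receipt(receipt: ExecutionReceipt) -> str:
    # An unattributable receipt is a quarantine condition, not a success.
    return "quarantine" if missing_links(receipt) else "attributed"
```

A receipt writer could run `classify_receipt` before marking an action successful, so a gap in the chain surfaces at write time instead of during a later audit.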

A.6 Consumers

The agent_authority block is not decorative — at least one real consumer uses it. In the reference implementation the funding router consumes it to resolve the source wallet (Mode A vs Mode B address), and read-only execution previews surface it so a UI can show "you'll trade as <Mode B Virtuals agent>" before anything happens.

Part B — L5: reasoning-based intent, deterministic execution

B.1 The principle

Slash commands → deterministic. Natural language → a reasoning classifier. Clear high-confidence action → deterministic gated execution. Question / negation / hypothetical → the agent answers. Ambiguity → one precise clarifier. Model failure → abstain.

Keyword/regex intent gates are the wrong primary mechanism. "Should I fund Hyperliquid?", "I don't want to bind Polymarket", "what are the risks of launching a token?" all contain the action keywords but are not actions. Pattern matching can't tell an action from a question, a negation, or a hypothetical. A reasoning step can. Reference: chat_capability_intent_service.py.
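
To make the failure concrete, the sketch below runs a naive keyword gate over those three messages and over one genuine command. The word list and the name `keyword_gate_fires` are hypothetical, chosen only for this illustration.

```python
import re

# Hypothetical action keywords; the real trigger list is not given here.
ACTION_HINTS = re.compile(r"\b(fund|bind|trade|launch|buy|sell|link)\w*\b", re.IGNORECASE)


def keyword_gate_fires(message: str) -> bool:
    """True when the message contains an action keyword. Says nothing about intent."""
    return bool(ACTION_HINTS.search(message))


non_actions = [
    "Should I fund Hyperliquid?",                   # question
    "I don't want to bind Polymarket",              # negation
    "what are the risks of launching a token?",     # hypothetical / explain
]
real_action = "Fund Hyperliquid with $50"
```

The gate fires on all four messages, so its output cannot separate the command from the question, the negation, or the hypothetical. At most it indicates that a message is worth a closer look.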

B.2 The two-stage classifier

natural-language message
   → cheap, high-recall, NON-DECIDING pre-filter   (keywords decide only: "worth a model call?")
   → ONE structured-JSON model call (classify)
   → { capability, is_action, venue, amount_usd, mode, confidence, ambiguous, clarifying_question }
   → dispatch:
        confident action + required params         → deterministic gated executor (L4)
        question / negation / hypothetical          → agent answers (no executor)
        ambiguous / missing param / low confidence   → one precise clarifying question
        model error                                  → abstain

The classifier output (reference shape):

jsonc

{
  "capability": "bind | fund | trade | credit | intel | treasury | degen | status | none",
  "is_action": true,                  // question/negation/hypothetical ⇒ false
  "venue": "polymarket | hyperliquid | … | null",
  "amount_usd": 50.0,                 // or null
  "mode": "create | link_existing | status | explain | null",  // for binding: create ⇒ Mode A, link_existing ⇒ Mode B
  "confidence": 0.0,
  "ambiguous": false,
  "clarifying_question": "string|null"
}

Crucial: no keyword maps to an outcome. Keywords only decide whether to spend one classify call. The model decides action-vs-question; the backend decides whether the action is allowed.
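
The dispatch step can be sketched as a function over the classifier output. This is a sketch under assumptions, not the reference dispatcher: `CONFIDENCE_THRESHOLD`, `REQUIRED_PARAMS`, and the returned labels are hypothetical, `None` stands in for a model error, and the order in which the branches are checked is a choice made here because the flow above does not fix one.

```python
from typing import Any, Dict, Optional

CONFIDENCE_THRESHOLD = 0.8  # hypothetical value
# Hypothetical table of parameters each capability needs before it can execute.
REQUIRED_PARAMS = {
    "fund": ("venue", "amount_usd"),
    "trade": ("venue",),
    "bind": ("venue",),
}


def dispatch(result: Optional[Dict[str, Any]]) -> str:
    if result is None:  # model error or unparseable output
        return "abstain"
    if result.get("ambiguous"):
        return "clarify"
    if not result.get("is_action") or result.get("capability") in (None, "none"):
        return "agent_answers"  # question / negation / hypothetical
    if float(result.get("confidence") or 0.0) < CONFIDENCE_THRESHOLD:
        return "clarify"
    required = REQUIRED_PARAMS.get(result["capability"], ())
    if any(result.get(name) is None for name in required):
        return "clarify"  # ask for the one missing parameter
    return "execute_gated"  # hand off to the L4 executor, which applies its own gates
```

Note that `"execute_gated"` is reachable only from a parsed model result. No branch inspects the message text, so no keyword can map to an outcome.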

B.3 `mode` distinguishes Mode A vs Mode B (and explain)

For binding, mode separates:

create → Mode A provider wallet setup,
link_existing → Mode B link the user's Virtuals agent (must not create Mode A),
status → a read,
explain → forces is_action=false (an explanation is never an action).

So "link my existing Virtuals agent" routes to Mode B, "set up my agent wallet" routes to Mode A, and "what does binding Polymarket mean?" routes to neither.
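
A minimal sketch of that routing, assuming the classifier output is a plain dict. `normalize_intent`, `bind_setup_path`, and the two path labels are hypothetical names used only for illustration.

```python
from typing import Any, Dict, Optional

# Hypothetical labels for the two setup paths.
MODE_TO_SETUP_PATH = {
    "create": "mode_a_provider_wallet",
    "link_existing": "mode_b_virtuals_link",
}


def normalize_intent(intent: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy in which mode == "explain" forces is_action to False."""
    normalized = dict(intent)
    if normalized.get("mode") == "explain":
        normalized["is_action"] = False
    return normalized


def bind_setup_path(intent: Dict[str, Any]) -> Optional[str]:
    """Setup path for a bind action, or None when nothing should be set up."""
    intent = normalize_intent(intent)
    if intent.get("capability") != "bind" or not intent.get("is_action"):
        return None
    # status is a read and has no setup path, so it maps to None as well.
    return MODE_TO_SETUP_PATH.get(intent.get("mode"))
```

Because `link_existing` has exactly one entry in the mapping, a Mode B request has no code path that yields the Mode A label, which mirrors the "must not create Mode A" constraint.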

B.4 Failure policy: strict vs graceful (by risk)

Model failure handling is risk-tiered:

Path class	On classifier failure
Read / status / explain	may fall through to the agent (graceful)
Money / action / purchase / setup / trade (fund, bind, credit, intel, treasury, degen, trade)	must not execute — abstain → ask clarifier or fall through to the agent

Design guidance (from review): the safest stance for all high-risk paths is strict abstain — model down must never silently revert to keyword execution for anything that moves money or commits a setup. Treat "abstain-graceful" as a migration convenience, not the end state.
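
The strict stance can be sketched as a tiny policy function. It assumes the caller has some hint of which capability the message was about (for example from a non-deciding pre-filter) when the model call fails; that hint, the function name, and the returned labels are all hypothetical.

```python
from typing import Optional

# Capabilities that move money or commit a setup, per the table above.
HIGH_RISK = frozenset({"fund", "bind", "credit", "intel", "treasury", "degen", "trade"})
READ_ONLY = frozenset({"status", "explain"})


def on_classifier_failure(suspected_capability: Optional[str]) -> str:
    """Decide what to do when the classify call fails. Never returns an execute outcome."""
    if suspected_capability in READ_ONLY:
        return "fall_through_to_agent"  # graceful: the agent may still answer
    # High-risk or unknown: strict abstain. Unknown is treated as high-risk on purpose.
    return "abstain"
```

The hint is used only to choose between two non-executing outcomes, so a wrong hint can make the system more cautious but cannot cause an execution.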

B.5 Why this is the security complement to A

L4 guarantees that if an action reaches the executor, it is checked against policy/funding/live-flags/authority before it signs. L5 guarantees that only a real, reasoned action reaches the executor — a poisoned input or a casual question can't masquerade as a command. Together:

prompt injection / casual question
   → L5 classifier: not a confident action → never dispatched
   → even if mis-classified, L4: policy + funding + live-flag + authority gates → blocked or approval-required

Two independent gates, neither of which is the LLM holding a key.

B.6 What L4+L5 give a dev

python

async def handle_message(oaw, user, message, action, *, confirmed, threshold):
    intent = await oaw.intent.classify(message)          # reasoning, fail-safe
    if intent.is_action and intent.confidence > threshold and intent.venue:
        # L4: agent_authority-aware, policy/funding/live-flag gated, receipted
        return await oaw.venues.execute(user, intent.venue, action, execute=confirmed)
    # Otherwise hand back to the agent brain to answer or clarify.
    return None

The dev gets safe autonomy: the agent can act from chat, but only real actions, only under policy, only with the right authority, and never with the model holding a key.
