06 — Execution authority & intent safety
Two layers in one doc because they are the two halves of "act safely": L4 makes execution know which authority it's acting under, and L5 makes sure only a real, reasoned action ever reaches L4.
Part A — L4: the agent_authority execution block
A.1 The gap this closes
Before this layer, a venue's execution binding carried venue-local signer fields (Polymarket CLOB creds, Hyperliquid API wallet) but never surfaced the parent authority. Every downstream consumer implicitly assumed Mode A / provider signer. That is dangerous: if a user linked a Virtuals wallet (Mode B), execution silently used the wrong authority.
The fix: every execution binding carries a non-secret agent_authority block that tells any consumer exactly which authority is active and how to sign.
A.2 The block
{
"agent_authority": {
"wallet_profile_id": "uuid|null",
"authority_source": "provider_managed | virtuals_linked | external_imported | unknown",
"wallet_provider": "privy | virtuals | external | unknown",
"wallet_address": "0x…",
"provider_wallet_id": "id|null",
"virtuals_agent_id": "id|null",
"proof_state": "verified | unverified | pending | not_applicable",
"signer_status": "ready | required | pending | unsupported | unknown",
"supported_chains": ["base","ethereum","arbitrum","polygon","solana"],
"execution_route": "provider_agent_wallet | virtuals_acp_sidecar | venue_local_signer | unsupported",
"execution_supported": true,
"blockers": []
}
}

Never include: private keys, API wallet secrets, bearer tokens, CLOB secret material, Privy auth tokens, ACP secrets, raw encrypted payloads, signer key references. IDs only if already non-secret internal references.
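One way to enforce the "never include" rule is an allow-list check that runs before the block is attached to a binding. The sketch below is illustrative only: `sanitize_agent_authority` is a hypothetical helper, not a function from the reference implementation, and the allowed field names are simply the keys of the block above.

```python
# Hypothetical helper (assumption): not part of the reference implementation.
# The allow-list mirrors the keys of the agent_authority block in A.2.
ALLOWED_FIELDS = frozenset({
    "wallet_profile_id", "authority_source", "wallet_provider",
    "wallet_address", "provider_wallet_id", "virtuals_agent_id",
    "proof_state", "signer_status", "supported_chains",
    "execution_route", "execution_supported", "blockers",
})


def sanitize_agent_authority(raw: dict) -> dict:
    """Return a copy of the block, raising if any key is outside the allow-list."""
    unknown = set(raw) - ALLOWED_FIELDS
    if unknown:
        # Fail closed: an unexpected key could be carrying secret material.
        raise ValueError(f"unexpected agent_authority fields: {sorted(unknown)}")
    return dict(raw)
```

An allow-list fails closed when someone adds a new field, which a deny-list of "known secret names" would not.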
A.3 Authority → execution route mapping
| Active authority | Condition | execution_route | execution_supported |
|---|---|---|---|
| Mode A (provider/privy) | venue supports provider signing | provider_agent_wallet | ✅ where venue impl can operate it |
| Mode B (virtuals, proof verified) | venue wired for ACP sidecar | virtuals_acp_sidecar | ✅ only for supported venues |
| Mode B (virtuals, proof verified) | venue not wired for sidecar | unsupported | ❌ blocker: virtuals_authority_execution_not_supported_for_venue |
| Mode B (virtuals, unverified) | — | unsupported | ❌ blocker: virtuals_authority_not_verified |
| any | venue uses a child venue-local signer | venue_local_signer | ✅ if that signer is ready |
| none (no profile) | — | unsupported | ❌ blocker: agent_wallet_required |
Hard safety rules:
- No silent fallback Mode B → Mode A. If Mode B execution isn't wired for a venue, block precisely — never quietly sign with the provider wallet.
- Do not auto-create Mode A when the active authority is Mode B.
- ACP-native venues (Virtuals, DegenClaw) require Virtuals identity + proof + signer; missing → exact blocker (virtuals_identity_required, virtuals_signer_required, owner_handoff_required).
- Venue-local signer fields stay separate from the parent agent_authority. The binding shows both: agent_authority (parent) and venue_signer / venue_credentials (child).
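The mapping table and the no-fallback rule can be sketched as a single pure function. This is a minimal illustration under stated assumptions: `resolve_route` and `RouteDecision` are hypothetical names, the check order (no profile, then venue-local signer, then Mode B, then Mode A) is one reading of the table, and the two blockers marked "hypothetical" cover cases for which the table names no blocker.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RouteDecision:
    execution_route: str
    execution_supported: bool
    blockers: List[str] = field(default_factory=list)


def blocked(reason: str) -> RouteDecision:
    return RouteDecision("unsupported", False, [reason])


def resolve_route(
    authority_source: Optional[str],
    proof_state: str = "not_applicable",
    *,
    uses_venue_local_signer: bool = False,
    venue_local_signer_ready: bool = False,
    venue_supports_provider: bool = False,
    venue_supports_acp_sidecar: bool = False,
) -> RouteDecision:
    if authority_source is None:  # no wallet profile at all
        return blocked("agent_wallet_required")
    if uses_venue_local_signer:
        if venue_local_signer_ready:
            return RouteDecision("venue_local_signer", True)
        return blocked("venue_local_signer_not_ready")  # hypothetical blocker name
    if authority_source == "virtuals_linked":  # Mode B
        if proof_state != "verified":
            return blocked("virtuals_authority_not_verified")
        if venue_supports_acp_sidecar:
            return RouteDecision("virtuals_acp_sidecar", True)
        # No silent fallback: never return provider_agent_wallet from this branch.
        return blocked("virtuals_authority_execution_not_supported_for_venue")
    if authority_source == "provider_managed" and venue_supports_provider:  # Mode A
        return RouteDecision("provider_agent_wallet", True)
    return blocked("authority_execution_not_supported_for_venue")  # hypothetical
```

Note that the Mode B branch returns before the Mode A check is ever reached, so a venue that supports provider signing still blocks a verified Virtuals authority that has no sidecar.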
A.4 The execution pipeline (execution-capable, never prepare-only)
execute_or_plan(action, *, execute):
    preflight (always):
        resolve agent_authority block ← knows Mode A vs B vs local
        resolve VenueBinding + venue-local signer
        resolve funding (L3)
        build the plan / card
    if not execute:
        return plan; SIGN NOTHING; MOVE NOTHING
    if execute:
        check binding ready
        check agent_authority.execution_supported → else exact blocker
        check VenuePolicy (per-venue caps/categories)
        check global spend policy (daily caps, reservations)
        check funding present
        check live flags → else ready_but_live_locked, execution_performed=false
        if approval required: → return requires_approval
        create durable SpendIntent + reservation
        execute via venue adapter (route per agent_authority.execution_route)
        write ExecutionReceipt + append audit/spend events
        stream real progress to the brain/chat

ready_but_live_locked with execution_performed: false is the canonical "everything is correct but the live flag is off" state. It is not a fake prepare-only product; flip the flag (after canary) and the same path executes.
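The gate ordering above can be sketched in a few lines of Python. This is a simplified illustration, not the reference pipeline: the gates are passed in as callables, SpendIntent creation and receipt writing are omitted, and the `planned`, `blocked` and `executed` status strings are assumptions (only `ready_but_live_locked` and `requires_approval` come from the text).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# A gate returns None when it passes, or a blocker string when it fails.
Gate = Callable[[], Optional[str]]


@dataclass
class Outcome:
    status: str
    execution_performed: bool
    blocker: Optional[str] = None


def execute_or_plan(
    gates: Sequence[Gate],
    *,
    execute: bool,
    live_enabled: bool,
    approval_required: bool,
    run_adapter: Callable[[], None],
) -> Outcome:
    if not execute:
        # Plan-only path: sign nothing, move nothing.
        return Outcome("planned", False)
    for gate in gates:  # binding, authority, policy, spend, funding, in order
        blocker = gate()
        if blocker is not None:
            return Outcome("blocked", False, blocker)
    if not live_enabled:
        # Everything checked out, but the live flag is off.
        return Outcome("ready_but_live_locked", False)
    if approval_required:
        return Outcome("requires_approval", False)
    run_adapter()  # the only line that touches the venue
    return Outcome("executed", True)
```

Because the live-flag check sits after every other gate, a `ready_but_live_locked` result means the action would have executed had the flag been on, which is what makes it useful as a canary signal.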
A.5 The economic-action ledger (durable, attributable)
Every live action must be traceable end to end:
SpendIntent (reservation + policy decision)
→ venue execution (adapter)
→ ExecutionReceipt / EconomicActionReceipt
→ audit event

Each receipt must trace back to: user → agent → AgentAuthority → VenueBinding → SpendIntent. If a receipt can't be attributed to that chain, it's a quarantine condition, not a successful action. No secrets in any of these records.
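The attribution rule lends itself to a mechanical check. In the sketch below, the field names in `ATTRIBUTION_CHAIN` and the `attributed` / `quarantine` labels are hypothetical; the real receipt schema may differ.

```python
from typing import List, Mapping

# Hypothetical field names (assumption), one per link in
# user → agent → AgentAuthority → VenueBinding → SpendIntent.
ATTRIBUTION_CHAIN = (
    "user_id",
    "agent_id",
    "agent_authority_id",
    "venue_binding_id",
    "spend_intent_id",
)


def missing_links(receipt: Mapping[str, object]) -> List[str]:
    """Return the chain fields that are absent or empty on this receipt."""
    return [key for key in ATTRIBUTION_CHAIN if not receipt.get(key)]


def receipt_disposition(receipt: Mapping[str, object]) -> str:
    # A receipt with any broken link is quarantined, never counted as a success.
    return "quarantine" if missing_links(receipt) else "attributed"
```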
A.6 Consumers
The agent_authority block is not decorative — at least one real consumer uses it. In the reference implementation the funding router consumes it to resolve the source wallet (Mode A vs Mode B address), and read-only execution previews surface it so a UI can show "you'll trade as <Mode B Virtuals agent>" before anything happens.
Part B — L5: reasoning-based intent, deterministic execution
B.1 The principle
Slash commands → deterministic. Natural language → a reasoning classifier. Clear high-confidence action → deterministic gated execution. Question / negation / hypothetical → the agent answers. Ambiguity → one precise clarifier. Model failure → abstain.
Keyword/regex intent gates are the wrong primary mechanism. "Should I fund Hyperliquid?", "I don't want to bind Polymarket", "what are the risks of launching a token?" all contain the action keywords but are not actions. Pattern matching can't tell an action from a question, a negation, or a hypothetical. A reasoning step can. Reference: chat_capability_intent_service.py.
B.2 The two-stage classifier
natural-language message
    → cheap, high-recall, NON-DECIDING pre-filter (keywords decide only: "worth a model call?")
    → ONE structured-JSON model call (classify)
    → { capability, is_action, venue, amount_usd, mode, confidence, ambiguous, clarifying_question }
    → dispatch:
        confident action + required params → deterministic gated executor (L4)
        question / negation / hypothetical → agent answers (no executor)
        ambiguous / missing param / low confidence → one precise clarifying question
        model error → abstain

The classifier output (reference shape):
{
"capability": "bind | fund | trade | credit | intel | treasury | degen | status | none",
"is_action": true, // question/negation/hypothetical ⇒ false
"venue": "polymarket | hyperliquid | … | null",
"amount_usd": 50.0, // or null
"mode": "create | link_existing | status | explain | null", // e.g. Virtuals Mode A vs B
"confidence": 0.0,
"ambiguous": false,
"clarifying_question": "string|null"
}

Crucial: no keyword maps to an outcome. Keywords only decide whether to spend one classify call. The model decides action-vs-question; the backend decides whether the action is allowed.
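The dispatch step can be sketched as a function over the classifier output. This is a sketch under assumptions: the threshold value, the outcome labels, and the choice to treat `venue` as the only required parameter are all illustrative, and a `None` intent stands in for "the model call failed".

```python
from typing import Any, Mapping, Optional

CONFIDENCE_THRESHOLD = 0.8  # assumed value; the source does not specify one


def dispatch(intent: Optional[Mapping[str, Any]]) -> str:
    if intent is None:
        # Model error: abstain rather than guess.
        return "abstain"
    if intent.get("ambiguous"):
        return "clarify"
    if not intent.get("is_action"):
        # Question, negation or hypothetical: no executor involved.
        return "agent_answers"
    confidence = float(intent.get("confidence") or 0.0)
    has_capability = intent.get("capability") not in (None, "none")
    if confidence < CONFIDENCE_THRESHOLD or not has_capability or not intent.get("venue"):
        return "clarify"
    # Confident action with required params: hand off to the gated executor (L4).
    return "execute_gated"
```

Only one of the five return paths reaches the executor, and no branch inspects the message text itself.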
B.3 mode distinguishes Mode A vs Mode B (and explain)
For binding, mode separates:
- create → Mode A provider wallet setup,
- link_existing → Mode B: link the user's Virtuals agent (must not create Mode A),
- status → a read,
- explain → forces is_action=false (an explanation is never an action).
So "link my existing Virtuals agent" routes to Mode B, "set up my agent wallet" routes to Mode A, and "what does binding Polymarket mean?" routes to neither.
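A small routing function makes the mode semantics concrete. The route labels below are hypothetical, and sending an unrecognized or missing mode to a clarifier is an assumption rather than documented behavior.

```python
from typing import Any, Mapping

# Hypothetical route labels (assumption) for the two binding actions.
BIND_ACTION_ROUTES = {
    "create": "mode_a_provider_setup",
    "link_existing": "mode_b_link_virtuals_agent",
}


def route_bind(intent: Mapping[str, Any]) -> str:
    mode = intent.get("mode")
    if mode == "explain":
        # An explanation is never an action, whatever is_action says.
        return "agent_answers"
    if mode == "status":
        return "read_status"
    if mode in BIND_ACTION_ROUTES:
        if not intent.get("is_action"):
            return "agent_answers"
        # link_existing can only ever map to Mode B; it never creates Mode A.
        return BIND_ACTION_ROUTES[mode]
    return "clarify"
```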
B.4 Failure policy: strict vs graceful (by risk)
Model failure handling is risk-tiered:
| Path class | On classifier failure |
|---|---|
| Read / status / explain | may fall through to the agent (graceful) |
| Money / action / purchase / setup / trade (fund, bind, credit, intel, treasury, degen, trade) | must not execute — abstain → ask clarifier or fall through to the agent |
Design guidance (from review): the safest stance for all high-risk paths is strict abstain — model down must never silently revert to keyword execution for anything that moves money or commits a setup. Treat "abstain-graceful" as a migration convenience, not the end state.
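The risk tiers can be expressed as a failure handler that is strict by default. This is a sketch with assumed names: `hint` stands for whatever non-deciding signal is available when the model call fails (for example from the pre-filter), and the set of graceful path names is illustrative.

```python
from typing import Optional

# Assumed labels for the low-risk path class (read / status / explain).
GRACEFUL_PATHS = frozenset({"read", "status", "explain"})


def on_classifier_failure(hint: Optional[str]) -> str:
    """Decide what to do when the classify call fails. Never returns an execute outcome."""
    if hint in GRACEFUL_PATHS:
        return "fall_through_to_agent"
    # fund, bind, credit, intel, treasury, degen, trade, and anything
    # unknown are treated as high risk: abstain.
    return "abstain"
```

Listing the graceful paths rather than the high-risk ones means a newly added capability is strict until someone deliberately relaxes it.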
B.5 Why this is the security complement to A
L4 guarantees that if an action reaches the executor, it is checked against policy/funding/live-flags/authority before it signs. L5 guarantees that only a real, reasoned action reaches the executor — a poisoned input or a casual question can't masquerade as a command. Together:
prompt injection / casual question
    → L5 classifier: not a confident action → never dispatched
    → even if mis-classified, L4: policy + funding + live-flag + authority gates → blocked or approval-required

Two independent gates, neither of which is the LLM holding a key.
B.6 What L4+L5 give a dev
intent = await oaw.intent.classify(message)  # reasoning, fail-safe
if intent.is_action and intent.confidence > T and intent.venue:
    # L4: agent_authority-aware, policy/funding/live-flag gated, receipted
    result = await oaw.venues.execute(user, intent.venue, action, execute=confirmed)
else:
    # hand back to the agent brain to answer or clarify
    result = None

The dev gets safe autonomy: the agent can act from chat, but only real actions, only under policy, only with the right authority, and never with the model holding a key.