FinClaw V1 Agent Orchestration Design
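Status: Accepted Initial Design / P0 Design Output
Date: 2026-05-14
Project: FinClaw
Document level: Project-level design support
Upstream documents: v1-prd.md, product-object-and-advisor-design.md, terminology-and-object-naming.md, v1-user-journey-and-interaction-flow.md, v1-product-object-and-schema-design.md, v1-evaluation-initial-plan.md
This document defines the initial draft of Agent Orchestration for FinClaw V1. It does not define engineering implementation, does not restore Action Suggestion, and does not create trade execution, production channels, real alerts or account capabilities.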
1. Orchestration Goal
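The orchestration goal of FinClaw V1 is to turn a user's natural-language question into object-based financial cognition, not to stage a multi-agent performance.
The orchestration layer is responsible for:
- identifying the user task;
- deciding whether clarification or a low-confidence output is needed;
- selecting the necessary Financial Cognition Advisors;
- invoking FinSkills;
- generating a Snapshot, Thread or Pre-Execution Checkpoint;
- writing EvidenceItem and DataQualityNote;
- preserving disagreement, risks and the execution boundary;
- handing feedback and failure samples to evaluation / trial ops.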
2. Layers
| Layer | Responsibility | Not responsible for |
|---|---|---|
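| Task Router | Identify task type, objects, risk and clarification needs | Directly generating the final financial conclusion |
| Context Builder | Load the user question, threads, snapshots, evidence and preferences | Collecting accounts, private keys or execution permissions |
| Advisor Planner | Select the necessary advisor perspectives | Invoking every advisor for show |
| Skill Runtime | Execute atomic financial cognition capabilities | Deciding product boundaries |
| Evidence / Quality Checker | Label sources, gaps, conflicts and low confidence | Fabricating sources or hiding missing evidence |
| Object Writer | Write Snapshot, Thread, Checkpoint | Writing orders, signals or execution fields |
| Boundary Guard | Check action adjacency, sensitive information and forbidden fields | Substituting disclaimers for structural boundaries |
| Feedback Adapter | Record trial-ops feedback and failure samples | Acting as production customer service or trading support |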
3. Task Routing
| User intent | Route | Required output |
|---|---|---|
| View an asset / theme / news item | Snapshot route | MarketCognitionSnapshot |
| Keep watching / save / refresh later | Thread proposal route | Snapshot + Thread proposal |
| Compare changes against last time | Thread refresh route | Refresh diff + updated Thread |
| Find risks / argue the opposing side | Risk challenge route | Counter-thesis + invalidators |
| Buy or sell / add to or reduce a position / open long or short | Pre-execution route | PreExecutionCheckpoint |
| Is the evidence sufficient | Evidence audit route | Claim inventory + EvidenceItem / DataQualityNote |
| Entering credentials / private keys | Sensitive rejection route | Rejection + masked handling record |
Router must allow a task to branch. Example: a colloquial BTC question may produce Snapshot plus Thread proposal; an action-adjacent prompt must branch to Pre-Execution Checkpoint.
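The branching rule above can be illustrated with a minimal sketch. The `Route` enum mirrors the route names in the table; the keyword cues, the `route_task` function, and the choice to let credential input short-circuit every other route are assumptions made for illustration only. A real router would presumably use a classifier rather than keyword matching.

```python
import re
from enum import Enum


class Route(Enum):
    SNAPSHOT = "snapshot"
    THREAD_PROPOSAL = "thread_proposal"
    THREAD_REFRESH = "thread_refresh"
    RISK_CHALLENGE = "risk_challenge"
    PRE_EXECUTION = "pre_execution"
    EVIDENCE_AUDIT = "evidence_audit"
    SENSITIVE_REJECTION = "sensitive_rejection"


# Hypothetical keyword cues, not part of the design.
ROUTE_CUES = {
    Route.THREAD_PROPOSAL: ("keep watching", "save", "refresh later"),
    Route.THREAD_REFRESH: ("since last time", "what changed"),
    Route.RISK_CHALLENGE: ("risk", "what could be wrong"),
    Route.PRE_EXECUTION: ("buy", "sell", "go long", "go short"),
    Route.EVIDENCE_AUDIT: ("evidence", "source"),
    Route.SENSITIVE_REJECTION: ("private key", "seed phrase", "password"),
}


def _has_cue(text: str, cue: str) -> bool:
    # Whole-word match so that "buy" does not fire on "buyer".
    return re.search(r"\b" + re.escape(cue) + r"\b", text) is not None


def route_task(question: str) -> list[Route]:
    """Return every route the question branches to, in table order."""
    text = question.lower()
    matched = [r for r, cues in ROUTE_CUES.items()
               if any(_has_cue(text, c) for c in cues)]
    # Assumption: credential input rejects the whole task.
    if Route.SENSITIVE_REJECTION in matched:
        return [Route.SENSITIVE_REJECTION]
    # A Thread proposal also requires a Snapshot; Snapshot is the default route.
    if Route.THREAD_PROPOSAL in matched or not matched:
        matched.insert(0, Route.SNAPSHOT)
    return matched
```

Returning a list rather than a single route is what lets one question produce both a Snapshot and a Thread proposal.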
4. Advisor Roles
| Advisor | Used when | Writes to |
|---|---|---|
| Event Interpretation Advisor | News, policy, announcement, event chain | Snapshot event summary, affected objects, watch questions |
| Asset Research Advisor | Asset, project, protocol or theme question | Snapshot main thesis, market context, unknowns |
| Market / Macro Advisor | Market regime or cross-market context matters | Snapshot market context, refresh conditions |
| Risk Advisor | User needs risk map or action-adjacent caution | Risk constraints, invalidators, checkpoint |
| Counter-Thesis Advisor | User asks what could be wrong or system detects one-sided thesis | Counter thesis, watch questions, invalidators |
| Pre-Execution Advisor | User uses buy / sell / add / reduce / long / short language | PreExecutionCheckpoint only |
| Source Quality Advisor | Sources are missing, stale, conflicting or user-supplied | EvidenceItem, DataQualityNote |
No advisor may output real orders, position sizes, leverage, account operations, wallet actions, automatic alerts or production channel calls.
5. FinSkills
Initial FinSkills are capability labels, not implementation commitments:
| Skill | Purpose |
|---|---|
| asset-context-summarizer | Build bounded context around an asset, project or theme |
| event-impact-reader | Separate event facts from impact inference |
| narrative-mapper | Map main and counter narratives |
| risk-controversy-mapper | Identify risks, disputes and invalidators |
| watch-question-generator | Produce watch questions and refresh conditions |
| strategy-hypothesis-framer | Convert action-adjacent questions into conditional cognition |
| source-quality-checker | Label source and data quality limitations |
| sensitive-input-classifier | Classify and reject sensitive credentials |
Skills must return structured output that can be written into product objects. Free-form skill output is not sufficient for V1 formal outputs.
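As a rough illustration of the structured-output requirement, a skill result could be modeled as a small record whose fields line up with Snapshot fields. The `SkillOutput` class, its field names and the `is_formal_output` check are hypothetical; the actual schema belongs to the schema design document.

```python
from dataclasses import dataclass, field


@dataclass
class SkillOutput:
    """Hypothetical structured result of one FinSkill call."""
    skill: str                                   # e.g. "event-impact-reader"
    facts: list[str] = field(default_factory=list)
    inferences: list[str] = field(default_factory=list)
    unknowns: list[str] = field(default_factory=list)
    free_text: str = ""                          # prose notes only


def is_formal_output(out: SkillOutput) -> bool:
    """True only if at least one structured field is populated.

    Free-form text alone does not qualify as a V1 formal output.
    """
    return bool(out.facts or out.inferences or out.unknowns)
```

For example, `SkillOutput(skill="narrative-mapper", free_text="Looks bullish")` would be rejected by this check because nothing in it can be written into a product object.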
6. Object Write Rules
| Target | Write allowed | Write forbidden |
|---|---|---|
| Snapshot | facts, inferences, risks, unknowns, watch questions, evidence, quality | order instructions, target price as instruction |
| Thread | linked snapshots, thesis, counter thesis, refresh conditions, invalidators, cognition changes | auto alerts, trade state, portfolio management |
| Checkpoint | conditional hypothesis, preconditions, risk constraints, invalidators, non-execution boundary | orders, account fields, leverage, private keys |
| EvidenceItem | source state and limitations | fabricated provenance |
| DataQualityNote | freshness, conflicts, permission limits, model inference | confidence labels that imply tradable signal |
Object Writer must reject writes that contain forbidden execution fields.
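One way to picture this rejection rule is a recursive key scan that runs before any write. The field names in `FORBIDDEN_EXECUTION_FIELDS`, the in-memory `store`, and the function names are assumptions for this sketch; the authoritative forbidden-field list is defined elsewhere.

```python
# Hypothetical field names, for illustration only.
FORBIDDEN_EXECUTION_FIELDS = {
    "order", "order_instruction", "position_size", "leverage",
    "account_id", "private_key", "auto_alert", "trade_state",
}


class ForbiddenFieldError(ValueError):
    """Raised when a payload carries an execution field."""


def find_forbidden(payload, path=""):
    """Collect dotted paths of forbidden keys in nested dicts and lists."""
    hits = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            here = f"{path}.{key}" if path else str(key)
            if key in FORBIDDEN_EXECUTION_FIELDS:
                hits.append(here)
            hits.extend(find_forbidden(value, here))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            hits.extend(find_forbidden(item, f"{path}[{index}]"))
    return hits


def write_object(store: dict, target: str, payload: dict) -> None:
    """Append payload under target, or reject it without writing anything."""
    hits = find_forbidden(payload)
    if hits:
        raise ForbiddenFieldError(f"rejected write to {target}: {hits}")
    store.setdefault(target, []).append(payload)
```

Scanning nested structures matters because a forbidden field tucked inside an otherwise valid Checkpoint would slip past a top-level check.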
7. Disagreement Handling
Advisor disagreement is expected. The system must preserve useful disagreement.
| Disagreement source | Required handling |
|---|---|
| Fact conflict | List conflicting sources and evidence state |
| Inference difference | Keep both paths and assumptions |
| Time horizon difference | Separate short-term and medium-term cognition |
| Risk preference difference | Mark as user-context dependent |
| Data quality difference | Explain source limitations |
| Execution boundary pressure | Convert to Pre-Execution Checkpoint |
The orchestrator should produce a main view, counter view and watch questions rather than forcing false certainty.
8. Boundary Guard
Boundary Guard runs before final output and before object write.
It blocks or rewrites:
- direct buy / sell / hold / short / long commands;
- target price as instruction;
- position size or leverage;
- exchange / broker operation;
- wallet / private key / seed phrase handling;
- automatic execution or production alert claims;
- unsupported high-confidence claims;
- reference experience treated as product truth.
If action-adjacent content remains, output must become Pre-Execution Checkpoint.
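The decision order described in this section might look like the sketch below: block first, then convert any remaining action-adjacent content into a Pre-Execution Checkpoint. The regular expressions, the `GuardResult` record and the `guard` function are hypothetical, and the sketch covers only the blocking path, not rewriting.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns; a real guard would need far broader coverage.
BLOCK_PATTERNS = {
    "direct_command": re.compile(r"\byou should (buy|sell|hold|short|long)\b", re.I),
    "leverage": re.compile(r"\b\d+x leverage\b", re.I),
    "credential": re.compile(r"\b(private key|seed phrase)\b", re.I),
}
ACTION_ADJACENT = re.compile(r"\b(buy|sell|go long|go short)\b", re.I)


@dataclass
class GuardResult:
    violations: list[str]
    output_type: str  # requested type, "pre_execution_checkpoint" or "blocked"


def guard(text: str, requested_type: str) -> GuardResult:
    """Check draft text before final output and before object write."""
    violations = [name for name, pattern in BLOCK_PATTERNS.items()
                  if pattern.search(text)]
    if violations:
        return GuardResult(violations, "blocked")
    if ACTION_ADJACENT.search(text):
        # Action-adjacent content remains, so the output type is converted.
        return GuardResult([], "pre_execution_checkpoint")
    return GuardResult([], requested_type)
```

The guard changes the output type instead of appending a disclaimer, in keeping with the Layers table, which rules out substituting disclaimers for structural boundaries.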
9. Sensitive Input Handling
Sensitive input classifier must label:
- ordinary preference;
- financial context;
- sensitive personal / financial information;
- credential or permission.
Credential or permission input triggers:
- masking;
- rejection;
- no save;
- no train;
- no echo;
- optional human review flag.
Financial context such as holdings or cost basis can be used for the current cognition task but cannot be saved without ProfileConsent.
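The two cases this section specifies can be written down as a small policy function. The `InputClass` enum follows the four labels above; the flag names and the `handling_for` function are hypothetical. Handling for the other two labels is not specified here, so the sketch refuses to guess.

```python
from enum import Enum


class InputClass(Enum):
    ORDINARY_PREFERENCE = "ordinary_preference"
    FINANCIAL_CONTEXT = "financial_context"
    SENSITIVE_PERSONAL = "sensitive_personal_or_financial"
    CREDENTIAL = "credential_or_permission"


def handling_for(input_class: InputClass, profile_consent: bool = False) -> dict[str, bool]:
    """Return the handling flags this section specifies for an input class."""
    if input_class is InputClass.CREDENTIAL:
        # Mask, reject, and never save, train on or echo the input.
        return {"mask": True, "reject": True,
                "save": False, "train": False, "echo": False}
    if input_class is InputClass.FINANCIAL_CONTEXT:
        # Usable for the current task; saved only with ProfileConsent.
        return {"use_now": True, "save": profile_consent}
    raise NotImplementedError("handling for this class is not specified here")
```

Note that consent has no effect on credentials: `handling_for(InputClass.CREDENTIAL, profile_consent=True)` still returns `save: False`.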
10. Evaluation Hooks
Each orchestration run should record:
- route selected;
- advisors used and why;
- skills used;
- object write targets;
- evidence and data quality notes;
- boundary guard results;
- sensitive input handling;
- missing fields;
- case mapping;
- reviewer notes.
Evaluation should fail if the system produces the right prose but cannot map the output to Snapshot, Thread or Checkpoint fields.
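A run record and the mapping check could be sketched as follows. The `RunRecord` fields cover part of the list above; the `REQUIRED_FIELDS` sets are placeholders loosely based on the write-allowed column in section 6, not the real schema.

```python
from dataclasses import dataclass, field

# Placeholder required fields per object type (assumption, not the schema).
REQUIRED_FIELDS = {
    "Snapshot": {"facts", "inferences", "risks", "unknowns", "watch_questions"},
    "Thread": {"thesis", "counter_thesis", "refresh_conditions", "invalidators"},
    "Checkpoint": {"conditional_hypothesis", "preconditions",
                   "risk_constraints", "invalidators"},
}


@dataclass
class RunRecord:
    route: str
    advisors_used: dict[str, str]          # advisor name -> reason for use
    skills_used: list[str]
    write_targets: dict[str, dict]         # object type -> structured payload
    boundary_guard_results: list[str] = field(default_factory=list)
    missing_fields: dict[str, list[str]] = field(default_factory=dict)


def evaluate_mapping(record: RunRecord) -> bool:
    """Fill record.missing_fields; pass only if every target is fully mapped."""
    record.missing_fields = {}
    for target, payload in record.write_targets.items():
        missing = sorted(REQUIRED_FIELDS.get(target, set()) - set(payload))
        if missing:
            record.missing_fields[target] = missing
    # A run with no object targets fails, however good the prose was.
    return bool(record.write_targets) and not record.missing_fields
```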
10A. Multi-Advisor Coordination
A single user task may require 2–4 advisors. The orchestrator coordinates them as follows.
10A.1 Coordination Modes
| Mode | When | Behavior |
|---|---|---|
| Sequential | Advisor B depends on Advisor A's output (e.g., Risk Advisor needs Asset Research thesis first) | Run A, feed structured output to B, merge into Snapshot |
| Parallel-then-merge | Advisors are independent on same input (e.g., Event + Market/Macro on same news) | Run in parallel, Object Writer merges non-conflicting fields; conflicts enter Disagreement Handling §7 |
| Challenge-after-draft | Counter-Thesis or Risk Advisor challenges an existing draft thesis | Run primary advisors, produce draft Snapshot, then pass draft to challenger; challenger writes counter_thesis, invalidators, watch_questions |
10A.2 Merge Rules
- Each advisor writes to its declared `thread_write_target` fields (§6). No advisor overwrites another advisor's target unless it is the designated challenge pass.
- When two advisors write the same field (e.g., both add `watch_questions`), Object Writer concatenates and deduplicates.
- When advisors produce contradictory `main_thesis` or `supporting_reasons`, the orchestrator must invoke Disagreement Handling (§7) and preserve both paths in the output.
- The final Snapshot or Thread update must trace which advisor contributed which claim via `advisor_outputs` references.
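A minimal sketch of these merge rules, assuming each advisor returns a plain dict of fields. The `merge_advisor_outputs` function and the `disagreement` key are hypothetical, and treating any two different thesis strings as contradictory is a deliberate simplification; real contradiction detection would need semantic comparison.

```python
def merge_advisor_outputs(outputs: dict[str, dict]) -> dict:
    """Merge per-advisor fields; outputs maps advisor name -> fields."""
    watch_questions, seen = [], set()
    theses = {}        # advisor -> main_thesis
    provenance = {}    # field name -> advisors that contributed to it
    for advisor, fields in outputs.items():
        for question in fields.get("watch_questions", []):
            # Deduplicate on a case- and whitespace-normalized key.
            key = " ".join(question.lower().split())
            if key not in seen:
                seen.add(key)
                watch_questions.append(question)
        if "main_thesis" in fields:
            theses[advisor] = fields["main_thesis"]
        for name in fields:
            provenance.setdefault(name, []).append(advisor)

    merged = {"watch_questions": watch_questions, "advisor_outputs": provenance}
    distinct = set(theses.values())
    if len(distinct) > 1:
        # Simplification: differing theses are kept side by side for §7.
        merged["disagreement"] = theses
    elif distinct:
        merged["main_thesis"] = distinct.pop()
    return merged
```

When theses differ, the merged result carries no single `main_thesis`, which forces downstream code to go through Disagreement Handling rather than silently picking a winner.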
10A.3 Advisor Budget
V1 limits each task to at most 5 advisor invocations per turn to bound latency and cost. If more than 5 advisors are plausible, Advisor Planner must rank by relevance and defer lower-priority advisors to follow-up turns.
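The rank-and-defer rule can be sketched in a few lines. The relevance scores are assumed to come from Advisor Planner; how they are computed is out of scope here, and `plan_advisors` is a hypothetical name.

```python
MAX_ADVISOR_INVOCATIONS = 5  # the V1 per-turn limit stated above


def plan_advisors(candidates: dict[str, float],
                  budget: int = MAX_ADVISOR_INVOCATIONS) -> tuple[list[str], list[str]]:
    """Split candidates into (run this turn, deferred to follow-up turns).

    candidates maps advisor name -> relevance score (higher is better).
    Ties are broken by name so the plan is deterministic.
    """
    ranked = sorted(candidates, key=lambda name: (-candidates[name], name))
    return ranked[:budget], ranked[budget:]
```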
10B. Degradation and Fallback Paths
When a component fails, the orchestrator must degrade gracefully rather than produce silent errors or fabricated outputs.
| Failure | Detection | Fallback | User-visible effect |
|---|---|---|---|
| FinSkill timeout or error | Skill Runtime returns error or exceeds timeout | Retry once; if still fails, mark affected claims as evidence_status: unavailable | DataQualityNote with quality_state: unavailable and impact explanation |
| Source unavailable | Source Quality Advisor or tool returns no data | Proceed with available sources; add DataQualityNote unavailable for missing source | Snapshot shows source-limited state; claims dependent on missing source marked low-confidence |
| Model low-quality output | Evidence/Quality Checker detects unsupported certainty, hallucination markers, or empty structured fields | Demote to low_confidence state; if critical fields are empty, return needs_clarification | UI shows low-confidence badge; user offered to provide context or accept bounded output |
| All advisors fail | No advisor produces usable structured output | Return a minimal acknowledgment explaining the failure, do not fabricate a Snapshot | User sees failure state with retry option |
| Boundary Guard rejects output | Forbidden execution fields detected in advisor output | Strip forbidden fields, re-run through Boundary Guard; if persistent, block output and flag for human review | Output withheld; user sees boundary enforcement message |
| Context window exceeded | Context Builder detects input exceeds model limit | Summarize older snapshots, trim low-priority evidence, preserve most recent thesis and user question | Output may miss historical nuance; DataQualityNote records context trimming |
10B.1 Degradation Principles
- Never fabricate sources or evidence to fill gaps left by failures.
- Every degradation must produce a visible DataQualityNote or UI state change.
- Degraded outputs must still pass Boundary Guard before delivery.
- If degradation affects a Pre-Execution Checkpoint, the `non_execution_statement` must note the limitation.
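The "FinSkill timeout or error" row of the fallback table might be implemented along these lines. The `run_skill_with_fallback` function, its return shape and the `affected_claims` and `impact` keys are assumptions; only `evidence_status: unavailable` and `quality_state: unavailable` come from the table.

```python
def run_skill_with_fallback(skill, claim_ids, timeout_s: float = 10.0) -> dict:
    """Call skill(timeout_s); retry once, then degrade visibly.

    skill is any callable that returns a dict and raises on failure.
    """
    last_error = None
    for _attempt in range(2):  # the first try plus exactly one retry
        try:
            return {"result": skill(timeout_s),
                    "claim_status": {},
                    "data_quality_notes": []}
        except Exception as exc:  # broad on purpose: any failure degrades
            last_error = exc

    # Both attempts failed: mark claims and emit a visible note.
    note = {
        "quality_state": "unavailable",
        "affected_claims": list(claim_ids),
        "impact": f"skill failed after one retry: {last_error}",
    }
    return {
        "result": None,
        "claim_status": {cid: {"evidence_status": "unavailable"} for cid in claim_ids},
        "data_quality_notes": [note],
    }
```

The failure path returns no synthesized result at all, so nothing downstream can mistake a gap for evidence.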
10C. Context Budget Management
V1 operates within finite model context windows. The orchestrator must manage context deliberately.
10C.1 Context Priority Order
When context must be trimmed, preserve in this priority (highest first):
1. Current user question and task type;
2. Active thread `current_thesis`, `counter_thesis`, `invalidators`, `watch_questions`;
3. Most recent snapshot (full);
4. Evidence items for current claims;
5. User context and profile consent state;
6. Earlier snapshots (summarized, not full);
7. Advisor output history (summarized);
8. Reference and background material (summarized or dropped).
10C.2 Budget Rules
- Context Builder must estimate token usage before advisor invocation. If estimated usage exceeds 80% of the context window, trigger summarization of items at priority 6–8.
- Each advisor receives a scoped context subset relevant to its role, not the full context. Advisor Planner determines the subset.
- Refresh tasks always load the previous snapshot in full; older history loads as summaries.
- Sensitive input handling records are never summarized or dropped; they are compact by design.
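The 80% rule can be made concrete with a small sketch. Token counts are taken as given, and the `summary_ratio` (how much a summary shrinks an item) is a made-up parameter; `ContextItem` and `apply_budget` are hypothetical names. Priorities follow the order in 10C.1, where 1 is highest and 8 is lowest.

```python
from dataclasses import dataclass


@dataclass
class ContextItem:
    priority: int            # 1 (highest) to 8 (lowest), per 10C.1
    tokens: int              # estimated token count
    summarized: bool = False


def apply_budget(items: list[ContextItem], window_tokens: int,
                 summary_ratio: float = 0.25) -> int:
    """Summarize priority 6-8 items if usage exceeds 80% of the window.

    Returns the estimated token total after any summarization.
    """
    total = sum(item.tokens for item in items)
    if total <= 0.8 * window_tokens:
        return total
    for item in items:
        if item.priority >= 6 and not item.summarized:
            # Assumption: a summary keeps roughly summary_ratio of the tokens.
            item.tokens = int(item.tokens * summary_ratio)
            item.summarized = True
    return sum(item.tokens for item in items)
```

If the total is still above the threshold after this pass, the sketch does nothing further; what to trim next is a decision for Context Builder under the priority order.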
11. Engineering Handoff Boundary
Engineering implementation may choose concrete classes, queues, prompts or runtime structure later. This design only fixes:
- product-level responsibilities;
- object write targets;
- boundary guard behavior;
- evaluation hooks;
- forbidden execution fields.
Engineering must not infer that V1 already has live data, account access, production alerting, training pipeline or trial evidence.
12. Acceptance
This document can serve as Engineering-start draft input together with Journey, Schema, UI / UX and Evaluation Initial Plan. It does not by itself satisfy the Engineering-start gate.
13. Open Items
- Prompt / tool contract details remain engineering design work.
- Evaluation Review / Acceptance Plan must test boundary guard and object-write behavior.
- Trial Ops must define how failures are reported, reviewed and stopped.