FinClaw V1 Agent Orchestration Design

Status: Accepted Initial Design / P0 Design Output
Date: 2026-05-14
Project: FinClaw
Document level: project-level design support
Upstream documents: v1-prd.md, product-object-and-advisor-design.md, terminology-and-object-naming.md, v1-user-journey-and-interaction-flow.md, v1-product-object-and-schema-design.md, v1-evaluation-initial-plan.md

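This document defines the initial draft of Agent Orchestration for FinClaw V1. It does not define the engineering implementation, does not restore Action Suggestion, and does not create trade execution, production channels, real alerts or account capabilities.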

1. Orchestration Goal

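The orchestration goal of FinClaw V1 is to turn a user's natural-language question into object-structured financial cognition, not to stage a multi-agent performance.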

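The orchestration layer is responsible for:

identifying the user's task;
deciding whether clarification or a low-confidence output is needed;
selecting the necessary Financial Cognition Advisors;
invoking FinSkills;
generating a Snapshot, Thread or Pre-Execution Checkpoint;
writing EvidenceItem and DataQualityNote;
preserving disagreement, risks and execution boundaries;
handing feedback and failure samples to evaluation / trial ops.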

2. Layers

Layer	Responsibility	Not responsible for
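Task Router	Identify task type, objects, risk and clarification needs	Directly generating the final financial conclusion
Context Builder	Load the user question, threads, snapshots, evidence and preferences	Collecting accounts, private keys or execution permissions
Advisor Planner	Select the necessary advisor perspectives	Invoking every advisor for show
Skill Runtime	Execute atomic financial cognition capabilities	Deciding product boundaries
Evidence / Quality Checker	Label sources, gaps, conflicts and low confidence	Fabricating sources or hiding missing evidence
Object Writer	Write Snapshot, Thread and Checkpoint	Writing order, signal or execution fields
Boundary Guard	Check action adjacency, sensitive information and forbidden fields	Substituting disclaimers for structural boundaries
Feedback Adapter	Record trial-ops feedback and failure samples	Acting as production customer service or trade support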

3. Task Routing

User intent	Route	Required output
Look at an asset / theme / news	Snapshot route	MarketCognitionSnapshot
Keep watching / save / refresh later	Thread proposal route	Snapshot + Thread proposal
What changed since last time	Thread refresh route	Refresh diff + updated Thread
Challenge risks / counter view	Risk challenge route	Counter-thesis + invalidators
Buy or sell / add or reduce position / open long or short	Pre-execution route	PreExecutionCheckpoint
Is the evidence sufficient	Evidence audit route	Claim inventory + EvidenceItem / DataQualityNote
Enters credentials / private key	Sensitive rejection route	Rejection + masked handling record

Router must allow a task to branch. Example: a colloquial BTC question may produce Snapshot plus Thread proposal; an action-adjacent prompt must branch to Pre-Execution Checkpoint.
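
The branching behavior described above can be sketched as follows. This is only an illustration under stated assumptions: the `Route` names follow the routing table, but `route_task`, the `HINTS` keyword lists and the rule that a credential match short-circuits all other routes are hypothetical. A real router would likely use a classifier rather than keyword matching.

```python
import re
from enum import Enum


class Route(Enum):
    SNAPSHOT = "snapshot"
    THREAD_PROPOSAL = "thread_proposal"
    THREAD_REFRESH = "thread_refresh"
    RISK_CHALLENGE = "risk_challenge"
    PRE_EXECUTION = "pre_execution"
    EVIDENCE_AUDIT = "evidence_audit"
    SENSITIVE_REJECTION = "sensitive_rejection"


# Hypothetical keyword hints, for illustration only.
HINTS = {
    Route.THREAD_PROPOSAL: ("keep watching", "save", "refresh later"),
    Route.THREAD_REFRESH: ("since last time", "what changed"),
    Route.RISK_CHALLENGE: ("risk", "what could be wrong"),
    Route.PRE_EXECUTION: ("buy", "sell", "add", "reduce", "long", "short"),
    Route.EVIDENCE_AUDIT: ("evidence", "sources"),
    Route.SENSITIVE_REJECTION: ("private key", "seed phrase", "password"),
}


def route_task(text: str) -> list[Route]:
    """Return every route the task branches into, in table order."""
    lowered = text.lower()
    matched = [
        route
        for route, phrases in HINTS.items()
        if any(re.search(rf"\b{re.escape(p)}\b", lowered) for p in phrases)
    ]
    # Assumption: credential input rejects the whole task.
    if Route.SENSITIVE_REJECTION in matched:
        return [Route.SENSITIVE_REJECTION]
    # No hint matched: fall back to the plain Snapshot route.
    routes = matched or [Route.SNAPSHOT]
    # The Thread proposal route requires "Snapshot + Thread proposal".
    if Route.THREAD_PROPOSAL in routes and Route.SNAPSHOT not in routes:
        routes.insert(0, Route.SNAPSHOT)
    return routes
```

Returning a list rather than a single route is what lets one question produce both a Snapshot and a Thread proposal, while action-adjacent wording always surfaces the pre-execution route.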

4. Advisor Roles

Advisor	Used when	Writes to
Event Interpretation Advisor	News, policy, announcement, event chain	Snapshot event summary, affected objects, watch questions
Asset Research Advisor	Asset, project, protocol or theme question	Snapshot main thesis, market context, unknowns
Market / Macro Advisor	Market regime or cross-market context matters	Snapshot market context, refresh conditions
Risk Advisor	User needs risk map or action-adjacent caution	Risk constraints, invalidators, checkpoint
Counter-Thesis Advisor	User asks what could be wrong or system detects one-sided thesis	Counter thesis, watch questions, invalidators
Pre-Execution Advisor	User uses buy / sell / add / reduce / long / short language	PreExecutionCheckpoint only
Source Quality Advisor	Sources are missing, stale, conflicting or user-supplied	EvidenceItem, DataQualityNote

No advisor may output real orders, position sizes, leverage, account operations, wallet actions, automatic alerts or production channel calls.

5. FinSkills

Initial FinSkills are capability labels, not implementation commitments:

Skill	Purpose
`asset-context-summarizer`	Build bounded context around an asset, project or theme
`event-impact-reader`	Separate event facts from impact inference
`narrative-mapper`	Map main and counter narratives
`risk-controversy-mapper`	Identify risks, disputes and invalidators
`watch-question-generator`	Produce watch questions and refresh conditions
`strategy-hypothesis-framer`	Convert action-adjacent questions into conditional cognition
`source-quality-checker`	Label source and data quality limitations
`sensitive-input-classifier`	Classify and reject sensitive credentials

Skills must return structured output that can be written into product objects. Free-form skill output is not sufficient for V1 formal outputs.

6. Object Write Rules

Target	Write allowed	Write forbidden
Snapshot	facts, inferences, risks, unknowns, watch questions, evidence, quality	order instructions, target price as instruction
Thread	linked snapshots, thesis, counter thesis, refresh conditions, invalidators, cognition changes	auto alerts, trade state, portfolio management
Checkpoint	conditional hypothesis, preconditions, risk constraints, invalidators, non-execution boundary	orders, account fields, leverage, private keys
EvidenceItem	source state and limitations	fabricated provenance
DataQualityNote	freshness, conflicts, permission limits, model inference	confidence labels that imply tradable signal

Object Writer must reject writes that contain forbidden execution fields.
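
A minimal sketch of this rejection rule, assuming payloads are nested dicts and lists. The field names in `FORBIDDEN_FIELDS`, the `ForbiddenWriteError` class and the in-memory `store` are hypothetical; the actual forbidden field list belongs to the schema design.

```python
# Hypothetical forbidden field names, for illustration only.
FORBIDDEN_FIELDS = {"order", "position_size", "leverage", "account_id", "private_key"}


class ForbiddenWriteError(ValueError):
    """Raised when a payload carries execution fields."""


def find_forbidden(payload, path=""):
    """Return the dotted paths of every forbidden key, at any nesting depth."""
    hits = []
    if isinstance(payload, dict):
        for key, value in payload.items():
            child = f"{path}.{key}" if path else key
            if key in FORBIDDEN_FIELDS:
                hits.append(child)
            hits.extend(find_forbidden(value, child))
    elif isinstance(payload, list):
        for index, item in enumerate(payload):
            hits.extend(find_forbidden(item, f"{path}[{index}]"))
    return hits


def write_object(store: dict, target: str, payload: dict) -> None:
    """Append payload to the target object list, or reject the whole write."""
    hits = find_forbidden(payload)
    if hits:
        raise ForbiddenWriteError(f"{target}: forbidden fields {hits}")
    store.setdefault(target, []).append(payload)
```

Rejecting the whole write, rather than silently dropping the offending keys, keeps the failure visible to evaluation.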

7. Disagreement Handling

Advisor disagreement is expected. The system must preserve useful disagreement.

Disagreement source	Required handling
Fact conflict	List conflicting sources and evidence state
Inference difference	Keep both paths and assumptions
Time horizon difference	Separate short-term and medium-term cognition
Risk preference difference	Mark as user-context dependent
Data quality difference	Explain source limitations
Execution boundary pressure	Convert to Pre-Execution Checkpoint

The orchestrator should produce a main view, counter view and watch questions rather than forcing false certainty.

8. Boundary Guard

Boundary Guard runs before final output and before object write.

It blocks or rewrites:

direct buy / sell / hold / short / long commands;
target price as instruction;
position size or leverage;
exchange / broker operation;
wallet / private key / seed phrase handling;
automatic execution or production alert claims;
unsupported high-confidence claims;
reference experience treated as product truth.

If action-adjacent content remains, output must become Pre-Execution Checkpoint.
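
The conversion rule can be sketched as below. This is a simplified illustration: the regular expressions in `ACTION_PATTERNS`, the `GuardResult` type and the `boundary_guard` function are assumptions, the sketch covers only two of the blocked categories, and it skips the rewrite step by converting as soon as action-adjacent content is detected.

```python
import re
from dataclasses import dataclass, field

# Hypothetical detectors for two action-adjacent categories.
ACTION_PATTERNS = {
    "direct_command": re.compile(
        r"\b(?:you should|just|go) (?:buy|sell|hold|short|long)\b", re.I
    ),
    "position_or_leverage": re.compile(r"\b\d+x leverage\b", re.I),
}


@dataclass
class GuardResult:
    output_kind: str
    violations: list = field(default_factory=list)


def boundary_guard(draft_text: str, output_kind: str) -> GuardResult:
    """Detect action-adjacent content and force the Checkpoint output kind."""
    violations = [
        name for name, pattern in ACTION_PATTERNS.items() if pattern.search(draft_text)
    ]
    if violations:
        # Action-adjacent content remains, so the output kind is converted.
        output_kind = "PreExecutionCheckpoint"
    return GuardResult(output_kind, violations)
```

The recorded `violations` list is also what the evaluation hooks would log as the boundary guard result.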

9. Sensitive Input Handling

Sensitive input classifier must label:

ordinary preference;
financial context;
sensitive personal / financial information;
credential or permission.

Credential or permission input triggers:

masking;
rejection;
no save;
no train;
no echo;
optional human review flag.

Financial context such as holdings or cost basis can be used for the current cognition task but cannot be saved without ProfileConsent.
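
The handling rules above can be expressed as a small decision function. The sketch below is illustrative: `InputLabel`, `HandlingDecision` and `decide_handling` are hypothetical names, and the treatment of the two labels the text does not specify (ordinary preference, sensitive personal / financial information) is a conservative assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class InputLabel(Enum):
    ORDINARY_PREFERENCE = "ordinary_preference"
    FINANCIAL_CONTEXT = "financial_context"
    SENSITIVE_PERSONAL = "sensitive_personal_or_financial"
    CREDENTIAL_OR_PERMISSION = "credential_or_permission"


@dataclass(frozen=True)
class HandlingDecision:
    use_in_current_task: bool
    may_save: bool
    may_train: bool
    may_echo: bool
    masked_value: Optional[str]


def decide_handling(label: InputLabel, profile_consent: bool) -> HandlingDecision:
    if label is InputLabel.CREDENTIAL_OR_PERMISSION:
        # Mask, reject, no save, no train, no echo. The mask has a fixed
        # length so it does not leak the length of the original value.
        return HandlingDecision(False, False, False, False, "********")
    # Financial context is usable now but saved only with ProfileConsent.
    # Assumption: the other labels are treated the same conservative way.
    return HandlingDecision(True, profile_consent, False, True, None)
```

The raw value is deliberately not a parameter, so the decision record cannot carry the credential itself.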

10. Evaluation Hooks

Each orchestration run should record:

route selected;
advisors used and why;
skills used;
object write targets;
evidence and data quality notes;
boundary guard results;
sensitive input handling;
missing fields;
case mapping;
reviewer notes.

Evaluation should fail if the system produces the right prose but cannot map the output to Snapshot, Thread or Checkpoint fields.
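
One possible shape for the run record and the mapping check is sketched below. The `RunRecord` field names mirror the list above but are hypothetical, as is `evaluation_passes`; the check covers only the object-mapping failure condition and not other evaluation criteria.

```python
from dataclasses import dataclass, field

# The three object types named in the failure condition above.
MAPPABLE_TARGETS = {"Snapshot", "Thread", "Checkpoint"}


@dataclass
class RunRecord:
    route: str
    advisors_used: dict          # advisor name -> reason it was selected
    skills_used: list
    object_write_targets: list
    evidence_and_quality_notes: list = field(default_factory=list)
    boundary_guard_results: list = field(default_factory=list)
    sensitive_input_handling: list = field(default_factory=list)
    missing_fields: list = field(default_factory=list)
    case_mapping: str = ""
    reviewer_notes: str = ""


def evaluation_passes(record: RunRecord) -> bool:
    """Fail when the output maps to no Snapshot, Thread or Checkpoint."""
    return any(target in MAPPABLE_TARGETS for target in record.object_write_targets)
```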

10A. Multi-Advisor Coordination

A single user task may require 2–4 advisors. The orchestrator coordinates them as follows.

10A.1 Coordination Modes

Mode	When	Behavior
Sequential	Advisor B depends on Advisor A's output (e.g., Risk Advisor needs Asset Research thesis first)	Run A, feed structured output to B, merge into Snapshot
Parallel-then-merge	Advisors are independent on same input (e.g., Event + Market/Macro on same news)	Run in parallel, Object Writer merges non-conflicting fields; conflicts enter Disagreement Handling §7
Challenge-after-draft	Counter-Thesis or Risk Advisor challenges an existing draft thesis	Run primary advisors, produce draft Snapshot, then pass draft to challenger; challenger writes `counter_thesis`, `invalidators`, `watch_questions`

10A.2 Merge Rules

Each advisor writes to its declared thread_write_target fields (§6). No advisor overwrites another advisor's target unless it is the designated challenge pass.
When two advisors write the same field (e.g., both add watch_questions), Object Writer concatenates and deduplicates.
When advisors produce contradictory main_thesis or supporting_reasons, the orchestrator must invoke Disagreement Handling (§7) and preserve both paths in the output.
The final Snapshot or Thread update must trace which advisor contributed which claim via advisor_outputs references.
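
A minimal sketch of these merge rules, assuming each advisor returns a dict of structured fields. `merge_advisor_outputs` is a hypothetical helper, and plain string inequality stands in for real contradiction detection, which would need semantic comparison.

```python
def merge_advisor_outputs(outputs: dict) -> dict:
    """Merge per-advisor fields; outputs maps advisor name -> field dict."""
    merged = {"watch_questions": [], "advisor_outputs": {}, "disagreements": []}
    theses = {}
    for advisor, fields in outputs.items():
        # Trace which advisor contributed which fields.
        merged["advisor_outputs"][advisor] = sorted(fields)
        # Shared list field: concatenate and deduplicate, keeping order.
        for question in fields.get("watch_questions", []):
            if question not in merged["watch_questions"]:
                merged["watch_questions"].append(question)
        if "main_thesis" in fields:
            theses[advisor] = fields["main_thesis"]
    if len(set(theses.values())) > 1:
        # Contradictory theses: keep both paths for Disagreement Handling.
        merged["disagreements"].append({"field": "main_thesis", "paths": theses})
    elif theses:
        merged["main_thesis"] = next(iter(theses.values()))
    return merged
```

When theses conflict, the sketch leaves `main_thesis` unset instead of picking a winner, so the choice is made by Disagreement Handling rather than by merge order.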

10A.3 Advisor Budget

V1 limits each task to at most 5 advisor invocations per turn to bound latency and cost. If more than 5 advisors are plausible, Advisor Planner must rank by relevance and defer lower-priority advisors to follow-up turns.
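
The rank-and-defer rule can be sketched as follows, assuming Advisor Planner has already produced a relevance score per candidate advisor. The scores and the `plan_advisors` function are hypothetical; only the limit of 5 comes from the text above.

```python
MAX_ADVISOR_INVOCATIONS = 5  # V1 limit per turn


def plan_advisors(relevance: dict) -> tuple:
    """Split candidates into (run this turn, deferred to follow-up turns)."""
    ranked = sorted(relevance, key=lambda name: relevance[name], reverse=True)
    return ranked[:MAX_ADVISOR_INVOCATIONS], ranked[MAX_ADVISOR_INVOCATIONS:]
```

For example, with all seven advisor roles scored as candidates, the five most relevant run in the current turn and the remaining two are deferred.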

10B. Degradation and Fallback Paths

When a component fails, the orchestrator must degrade gracefully rather than produce silent errors or fabricated outputs.

Failure	Detection	Fallback	User-visible effect
FinSkill timeout or error	Skill Runtime returns error or exceeds timeout	Retry once; if still fails, mark affected claims as `evidence_status: unavailable`	DataQualityNote with `quality_state: unavailable` and `impact` explanation
Source unavailable	Source Quality Advisor or tool returns no data	Proceed with available sources; add DataQualityNote `unavailable` for missing source	Snapshot shows `source-limited` state; claims dependent on missing source marked low-confidence
Model low-quality output	Evidence/Quality Checker detects unsupported certainty, hallucination markers, or empty structured fields	Demote to `low_confidence` state; if critical fields are empty, return `needs_clarification`	UI shows low-confidence badge; user offered to provide context or accept bounded output
All advisors fail	No advisor produces usable structured output	Return a minimal acknowledgment explaining the failure, do not fabricate a Snapshot	User sees failure state with retry option
Boundary Guard rejects output	Forbidden execution fields detected in advisor output	Strip forbidden fields, re-run through Boundary Guard; if persistent, block output and flag for human review	Output withheld; user sees boundary enforcement message
Context window exceeded	Context Builder detects input exceeds model limit	Summarize older snapshots, trim low-priority evidence, preserve most recent thesis and user question	Output may miss historical nuance; DataQualityNote records context trimming

10B.1 Degradation Principles

Never fabricate sources or evidence to fill gaps left by failures.
Every degradation must produce a visible DataQualityNote or UI state change.
Degraded outputs must still pass Boundary Guard before delivery.
If degradation affects a Pre-Execution Checkpoint, the non_execution_statement must note the limitation.
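
The "FinSkill timeout or error" row of the table above can be sketched as a retry-once wrapper. The function name `run_skill_with_fallback`, the zero-argument skill callable and the shape of the returned dict are assumptions for illustration; `evidence_status` and `quality_state` are the field names used in the table.

```python
from typing import Any, Callable, Optional


def run_skill_with_fallback(skill: Callable[[], Any], claim_ids: list) -> dict:
    """Call a skill, retry once on failure, then degrade visibly."""
    last_error: Optional[Exception] = None
    for _attempt in range(2):  # the initial call plus exactly one retry
        try:
            return {"result": skill(), "notes": []}
        except Exception as exc:  # includes TimeoutError
            last_error = exc
    # Both attempts failed: no result is fabricated. Each affected claim
    # gets a DataQualityNote-like entry instead.
    notes = [
        {
            "claim_id": claim_id,
            "quality_state": "unavailable",
            "impact": f"skill failed after retry: {last_error}",
        }
        for claim_id in claim_ids
    ]
    return {"result": None, "evidence_status": "unavailable", "notes": notes}
```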

10C. Context Budget Management

V1 operates within finite model context windows. The orchestrator must manage context deliberately.

10C.1 Context Priority Order

When context must be trimmed, preserve in this priority (highest first):

Current user question and task type;
Active thread current_thesis, counter_thesis, invalidators, watch_questions;
Most recent snapshot (full);
Evidence items for current claims;
User context and profile consent state;
Earlier snapshots (summarized, not full);
Advisor output history (summarized);
Reference and background material (summarized or dropped).

10C.2 Budget Rules

Context Builder must estimate token usage before advisor invocation. If estimated usage exceeds 80% of the context window, trigger summarization of items at priority 6–8.
Each advisor receives a scoped context subset relevant to its role, not the full context. Advisor Planner determines the subset.
Refresh tasks always load the previous snapshot in full; older history loads as summaries.
Sensitive input handling records are never summarized or dropped; they are compact by design.
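
A minimal sketch of the 80% rule, assuming each context item carries its priority from the order in 10C.1 plus token estimates for its full and summarized forms. `ContextItem` and `fit_context` are hypothetical, and summarizing the lowest-priority items first and stopping once usage is back under the threshold is an assumed policy that the text does not prescribe.

```python
from dataclasses import dataclass

SUMMARIZE_THRESHOLD = 0.8            # fraction of the context window
SUMMARIZABLE_PRIORITIES = {6, 7, 8}  # earlier snapshots, advisor history, reference


@dataclass
class ContextItem:
    priority: int        # 1 (highest) to 8 (lowest), as in 10C.1
    tokens: int          # estimated tokens in current form
    summary_tokens: int  # estimated tokens once summarized
    summarized: bool = False


def fit_context(items: list, window_tokens: int) -> int:
    """Summarize priority 6-8 items until usage is within the threshold.

    Returns the estimated token total after trimming.
    """
    limit = SUMMARIZE_THRESHOLD * window_tokens
    total = sum(item.tokens for item in items)
    # Visit the lowest-priority items (largest number) first.
    for item in sorted(items, key=lambda i: i.priority, reverse=True):
        if total <= limit:
            break
        if item.priority in SUMMARIZABLE_PRIORITIES and not item.summarized:
            total -= item.tokens - item.summary_tokens
            item.tokens = item.summary_tokens
            item.summarized = True
    return total
```

If the total still exceeds the threshold after every priority 6–8 item is summarized, the caller would need a further fallback such as the context trimming row in 10B.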

11. Engineering Handoff Boundary

Engineering implementation may choose concrete classes, queues, prompts or runtime structure later. This design only fixes:

product-level responsibilities;
object write targets;
boundary guard behavior;
evaluation hooks;
forbidden execution fields.

Engineering must not infer that V1 already has live data, account access, production alerting, training pipeline or trial evidence.

12. Acceptance

This document can serve as Engineering-start draft input together with Journey, Schema, UI / UX and Evaluation Initial Plan. It does not by itself satisfy the Engineering-start gate.

13. Open Items

Prompt / tool contract details remain engineering design work.
Evaluation Review / Acceptance Plan must test boundary guard and object-write behavior.
Trial Ops must define how failures are reported, reviewed and stopped.
