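What is Claude Code Dynamic Workflows?

Dynamic Workflows is a research preview feature that Claude Code shipped alongside the Opus 4.8 release (2026-05-28). The main Claude session writes a JS script, the runtime executes it in the background, and the script can orchestrate hundreds of read-only Explore subagents in parallel. Intermediate results live in script variables rather than in Claude's context, so the main session's tokens stay uncontaminated, and the script can be saved as a slash command for reuse.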

Which plan does Dynamic Workflows require?

Every paid plan can use it: Pro, Max, Team, and Enterprise. It requires Claude Code 2.1.154 or later. On the Pro plan you first have to switch on Dynamic workflows in /config. It is available through the Anthropic API, Bedrock, Vertex AI, and Foundry.

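How do I start a Dynamic Workflow?

Three ways: (1) include the word workflow in your prompt; Claude Code highlights it and shows an approval prompt; (2) run /effort ultracode to enable automatic workflows for the whole session, which only Opus 4.7/4.8 support; (3) run the built-in /deep-research directly for web research. On the first run, I strongly recommend pressing View raw script to read the JS that Claude wrote.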

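Can Sonnet 4.6 use xhigh effort?

No. Only Opus 4.7 and Opus 4.8 support xhigh. Sonnet 4.6 and Opus 4.6 support four levels: low / medium / high / max. If you force xhigh onto Sonnet, the CLI silently downgrades it to high without raising an error. ultracode does not even appear in Sonnet's /effort menu.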

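How does a workflow differ from an ordinary subagent?

A subagent is a worker that Claude spawns inside its main turn, and its result returns to Claude's context. A workflow is a JS script executed by the runtime that spawns tens to hundreds of Explore subagents in parallel; intermediate results stay in script variables, and the achievable parallelism far exceeds that of subagents. The workflow runtime also automatically assigns workers to a cheaper model (Haiku 4.5 in my test), and only the main orchestrator uses Opus.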

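What kinds of tasks suit a workflow?

Tasks that meet all four conditions: (1) parallelizable into N independent chunks; (2) results can be constrained by a JSON Schema (a finite enum); (3) a ground truth exists to verify against; (4) preferably read-only. Typical scenarios: codebase audits, large-scale migrations, cross-tenant consistency sweeps, compliance checks. Poor fits: interactive debugging, tasks that need decisions midway, sequential logic, new-feature spikes, and budget-sensitive tasks (the run measured in this article cost roughly 8.5× the baseline).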

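Claude Code Dynamic Workflows in practice: Opus 4.8 + 11 Haiku workers auditing 109 resolvers in parallel

Key takeaways

On 2026-05-28 Anthropic released Claude Opus 4.8, together with Dynamic Workflows (a JS script running in the background that orchestrates hundreds of subagents), Effort Control (low/medium/high/xhigh/max/ultracode), and mid-conversation system entries in the Messages API.
The test: the same task run on two tracks, an authz pattern audit of the 109 resolver functions in Home123 (a Go + gqlgen backend), comparing a baseline (Opus 4.7 + 1 general-purpose agent) against a workflow (Opus 4.8 + 11 Haiku 4.5 workers).
Biggest finding: the workflow runtime automatically assigned Haiku 4.5 as the worker model while the main orchestrator stayed on Opus 4.8. This matches the "think → opus / find → haiku" division-of-labor doctrine; what used to require a hand-written AGENTS.md is now done by the runtime.
No wall-time savings: the workflow runtime's 132 s beats the baseline's 236 s, but once the main session's synthesis is included the full turn takes 392 s. The estimated cost of this run was ~$6.1 for the workflow versus ~$0.72 for the baseline, about 8.5×. Most of that is the fixed cost of runtime cache writes, which should amortize on larger tasks.
The payoff is not savings. It is an output contract enforced by JSON Schema, reproducibility (the script is persisted and can be rerun later with /<name>), and a main orchestrator that actively cross-checks downstream results (in this run it caught the synthesis subagent miscounting the total as 83 when the real number is 109).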

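At a glance: Baseline vs Workflow

The rest of the article breaks these down item by item; here is the five-second version first. Both tracks work on the same task: classifying the authz pattern of the 109 resolver functions in the Home123 GraphQL backend.

Aspect	Baseline	Workflow
Launch command	Main session dispatches one Agent	`claude --model claude-opus-4-8 --effort high`, then include the word `workflow` in the prompt
Main model	Opus 4.7	Opus 4.8
Worker count	1 (general-purpose, can write)	11 (Explore type, enforced read-only)
Worker model	Opus 4.7 (same as main)	All 11 on Haiku 4.5 (chosen by the runtime)
Parallel speedup	n/a	11 agents total 354s, actual run 132s = 2.7×
Main-session tool calls	24 (14 Bash + 10 Read)	8 (5 Bash + 2 Read + 1 Workflow)
Output contract	Free-form markdown	JSON Schema + enum locked to 4 values
Coverage	109/109	109/109
Total time	3:56 (236s)	6:32 turn, workflow runtime 2:12
Root causes found	1 (the `CapsAny` proposal)	6 (including the real culprit behind the 5× duplicated `guard` probe)
Structural insight	None	read-side vs write-side asymmetry
Active downstream cross-check	n/a (only 1 agent)	✅ caught the synthesis agent miscounting 83 vs the real 109
Estimated cost	~$0.72	~$6.10 (about 8.5× baseline)
Rerunnable	No	Script persisted; rerun later with a single `/<name>`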

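Architectural difference: one flow diagram

The question readers ask most is not "how much faster" but "what is structurally different". Here are the two modes drawn as flows: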

Baseline                              Workflow
─────────                             ─────────
User prompt                           User prompt
    │                                     │
    ▼                                     ▼
┌─────────────────┐                   ┌─────────────────┐
│ Main Claude     │                   │ Main Claude     │
│ (Opus 4.7)      │                   │ (Opus 4.8)      │
│                 │                   │ writes JS script│
│ Agent tool      │                   └────────┬────────┘
│   ↓ 1 agent     │                            │ Workflow tool
└────────┬────────┘                            ▼
         │                            ┌──────────────────┐
         ▼                            │ Runtime          │
   ┌──────────────┐                   │ (background run) │
   │ 1× Opus 4.7  │                   └────────┬─────────┘
   │ general-     │                            │ fan-out
   │ purpose      │                            ▼
   │              │            ┌──────────────────────────────────┐
   │ 24 tool calls│            │ Phase 1: Classify (parallel)     │
   │ free-form    │            │ ┌────┐┌────┐┌────┐  ...  ┌────┐  │
   │ classify +   │            │ │H4.5││H4.5││H4.5│       │H4.5│  │
   │ markdown     │            │ └────┘└────┘└────┘       └────┘  │
   └──────┬───────┘            │ 10 agents (schema in 8 chunks + │
          │                    │  visitor 1 + announcement 1)     │
          ▼                    └────────────┬─────────────────────┘
   markdown report                          │ JSON schema validation
                                            ▼
                               ┌──────────────────────────┐
                               │ Phase 2: Synthesize      │
                               │ ┌────┐                   │
                               │ │H4.5│ 1 agent           │
                               │ └────┘                   │
                               └────────────┬─────────────┘
                                            │
                                            ▼
                               ┌──────────────────────────┐
                               │ Main Claude (Opus 4.8)   │
                               │ cross-checks synthesis,  │
                               │ catches 83 vs 109 error, │
                               │ recomputes from the rows │
                               │ + writes final report    │
                               └────────────┬─────────────┘
                                            ▼
                                   final markdown report
                                   + JS script persisted

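The key difference is not the agent count. It is that the main Claude no longer does the work by hand and instead becomes orchestrator + verifier. Intermediate results land in script variables and are not poured into the main session's context.

Background: what Opus 4.8 brings

On 2026-05-28 Anthropic simultaneously released Claude Opus 4.8, the Dynamic Workflows research preview, Effort Control, and support for mid-conversation system entries in the Messages API. I read the official material on release day, and the next day (5/29) I ran a two-track comparison on a real project, Home123. This article records the full data and the launch prompts.

Three directly relevant capability updates:

Capability	Description	Available on
Dynamic Workflows	Claude writes a JS script that the runtime runs in the background; it can contain hundreds of read-only Explore subagents working in parallel	All paid plans (Pro must enable it in `/config` first)
Effort Control	5 levels: low / medium / high / xhigh / max, plus ultracode (= xhigh + automatic workflow orchestration)	All plans
Messages API system entries	A system entry can be inserted in the middle of the messages array, updating instructions without breaking the prompt cache and without faking a user turn	Direct API access

Not every model accepts every effort level. In my test, claude --effort foo returns: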

error: option '--effort <level>' argument 'foo' is invalid.
It must be one of: low, medium, high, xhigh, max

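Official support table (from code.claude.com/docs/en/model-config):

Model	Supported effort levels	Default
Opus 4.8, Opus 4.7	low, medium, high, xhigh, max	4.8 → high / 4.7 → xhigh
Opus 4.6, Sonnet 4.6	low, medium, high, max (no xhigh)	high
Others	effort not supported	n/a

A common misconception: can Sonnet use xhigh too? No. Sonnet 4.6 has no xhigh level (it supports low, medium, high, and max); if you force xhigh, it is silently downgraded to high with no error, so you get one level less than you asked for. ultracode does not even appear in the menu for Sonnet.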
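The support table and the silent downgrade can be restated as a small lookup. This is only an illustration of the behaviour described above, not the CLI's actual logic; the short model keys and the `resolve_effort` helper are hypothetical names introduced for the sketch.

```python
# Illustrative restatement of the support table; NOT the real CLI implementation.
# The short model keys below are hypothetical labels, not official model IDs.
SUPPORTED_EFFORT = {
    "opus-4.8": ["low", "medium", "high", "xhigh", "max"],
    "opus-4.7": ["low", "medium", "high", "xhigh", "max"],
    "opus-4.6": ["low", "medium", "high", "max"],
    "sonnet-4.6": ["low", "medium", "high", "max"],
}


def resolve_effort(model: str, requested: str) -> str:
    """Return the effort level that would effectively apply, per the article."""
    levels = SUPPORTED_EFFORT.get(model)
    if levels is None:
        raise ValueError(f"{model} does not support effort levels")
    if requested in levels:
        return requested
    if requested == "xhigh":
        # The article reports a silent downgrade to high on models without xhigh.
        return "high"
    raise ValueError(f"invalid effort level: {requested}")


print(resolve_effort("sonnet-4.6", "xhigh"))  # high
print(resolve_effort("opus-4.8", "xhigh"))    # xhigh
```

Note that `max` stays available on Sonnet 4.6 in this table; only `xhigh` is missing.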

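Experiment design: why the Home123 authz audit

A task that suits a workflow has 4 necessary conditions:

Parallelizable: the task naturally splits into N independent chunks
Constrainable by JSON Schema: the output structure is stable and the enum is finite
Has a ground truth: otherwise, after 11 agents finish, you still cannot tell whether they were right
Preferably read-only: Explore-type agents cannot modify files by mistake

The chosen task: scan the 3 resolver files of the Home123 backend GraphQL layer and classify the 109 resolver functions as USES_PRELUDE / LEGIT_PUBLIC / HAND_ROLLED / UNKNOWN.

Home123 is a multi-tenant SaaS. I wrote authzPrelude, a centralised authn+cap+scope+audit sequence, and the rule is that every protected resolver must call it as the first step after entering withTx. As the code accumulated, however, some resolvers hand-rolled their authorization logic for cases the prelude does not yet support, such as cap-OR. The purpose of this audit is to find every one of them.

Condition	Why this task meets it
Parallelizable	The 109 functions are independent; no classification depends on another result
Constrainable by schema	The `class` enum has only 4 values; every row must include file / func / line / class / evidence
Has a ground truth	`grep -c authzPrelude backend/graph/*.resolvers.go` gives an objective baseline
Read-only	Pure audit, no files modified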
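For readers who want the ground-truth count in a script rather than a shell one-liner, here is a minimal sketch that mirrors `grep -c`, which counts matching lines per file. The helper names are hypothetical. Keep in mind that the count includes any line mentioning `authzPrelude` (comments, helper definitions), so it is a rough baseline rather than an exact USES_PRELUDE total.

```python
from pathlib import Path


def count_matching_lines(path, needle="authzPrelude"):
    """Count lines containing `needle`, like `grep -c needle path`."""
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if needle in line)


def baseline_counts(root, pattern="backend/graph/*.resolvers.go"):
    """Return {file path: matching line count} for every resolver file under root."""
    return {
        str(path): count_matching_lines(path)
        for path in sorted(Path(root).glob(pattern))
    }


if __name__ == "__main__":
    # Run from the repository root; prints one count per resolver file.
    for name, count in baseline_counts(".").items():
        print(f"{name}: {count}")
```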

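Launch prompts: both sides together

I ran both. Below is a quick summary of the differences, followed by the complete prompts for both sides.

Differences at a glance

Aspect	Baseline	Workflow
Session launch	Any session works; call the Agent tool	`claude --model claude-opus-4-8 --effort high`
Trigger mechanism	Explicitly dispatch 1 Agent	Prompt contains the `workflow` keyword → Claude Code highlights it → approval prompt
Prompt length	~20 lines (task description)	~24 lines (task description + the opening "workflow" phrase)
Key wording difference	Direct imperative: "do this audit"	Opens with 「跑一個 workflow,」 ("run a workflow"), which triggers the runtime to write a script
Prerequisites	None (general-purpose is available by default)	Claude Code ≥ 2.1.154; Pro must enable it in `/config` first
Confirmation before running	None, it runs directly	The approval prompt requires choosing [1] Yes; on the first run, pressing [3] View raw script is strongly recommended

Left: Baseline prompt (verbatim, in the original Chinese)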

你在 /home/tom/Desktop/home123_new 做一次 read-only authz pattern audit。
禁止修改任何檔案。

審計三個 resolver 檔的每一個 resolver function:
- backend/graph/schema.resolvers.go (~80 funcs)
- backend/graph/visitor.resolvers.go (~15 funcs)
- backend/graph/announcement.resolvers.go (~16 funcs)

分類規則:
1. USES_PRELUDE — body 第一段 call authzPrelude(...)
2. LEGIT_PUBLIC — 真正不需要 auth 的 (Login / RefreshAccessToken 等)
3. HAND_ROLLED — 沒 call authzPrelude 但自己寫 cap/scope/role/audit 檢查
4. UNKNOWN — 你無法判定

輸出:markdown 表 | file | func | line | class | evidence |,
所有 function 一行不能少。

最後加 ## Summary:每 class 數量、HAND_ROLLED 跟 UNKNOWN 清單、可疑模式。

用 Bash grep / Read 自己抓,不要憑記憶或猜。

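Right: Workflow prompt (verbatim, in the original Chinese)

Start a new session:

cd /home/tom/Desktop/home123_new
claude --model claude-opus-4-8 --effort high

Once inside, paste this prompt (note that it opens with 「在這個 repo 跑一個 workflow」, "run a workflow in this repo"; the workflow keyword is the switch that triggers the runtime):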

在這個 repo 跑一個 workflow,做 read-only authz pattern audit。不准改任何檔案。

審計三個 resolver 檔的「每一個」resolver function:
- backend/graph/schema.resolvers.go (約 78 funcs)
- backend/graph/visitor.resolvers.go (約 15 funcs)
- backend/graph/announcement.resolvers.go (約 16 funcs)

每個 function 分類成 4 種之一:
1. USES_PRELUDE — body 第一段 call authzPrelude,或 1-line delegate 到呼叫
   authzPrelude 的 helper
2. LEGIT_PUBLIC — 真公開查詢 (Login / RefreshAccessToken / Healthcheck /
   ApplicationStatus 等),依據 schema directive / 註解 / 函式意圖判定
3. HAND_ROLLED — 沒 call authzPrelude 但自己寫 cap/scope/role/audit
   (CheckCap / hasRole / rbac.RequireCapability / 手動讀 ctx 判 user)
4. UNKNOWN — 邏輯太繞或無法判定

輸出格式:markdown 表 | file | func | line | class | evidence |,
每個 function 一行不能少。

最後加 ## Summary:每 class 數量、HAND_ROLLED 跟 UNKNOWN 名單、可疑模式
(eg. asymmetry / 同一 cap 兩種寫法 / 整個檔的傾向)。

建議擴展 prelude 才能解的問題也標出來 (eg. CapsAny 多 cap OR-chain)。

After you submit it, Claude Code asks:

Workflow plan:
  Phase 1: Classify — one Explore agent per function-range chunk
  Phase 2: Synthesize — cross-file pattern analysis + suspicious findings

[1] Yes, run it
[2] Yes, and don't ask again for <name> in <path>
[3] View raw script   ← strongly recommended on the first run
[4] No

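On the first run, press View raw script to see what JS Claude wrote, then press ESC to go back and choose Yes. This is the most instructive part of using workflows, and a completely different experience from the black box of a subagent: you can see the concrete orchestration code.

What the workflow script Claude wrote looks like

The opening fragment of the real script, pulled from the session jsonl: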

export const meta = {
  name: 'authz-pattern-audit',
  phases: [
    { title: 'Classify',   detail: 'one Explore agent per function-range chunk' },
    { title: 'Synthesize', detail: 'cross-file pattern analysis' },
  ],
}

// JSON Schema forces every row to include all 5 fields
const ROW_SCHEMA = {
  type: 'object',
  required: ['rows'],
  properties: {
    rows: {
      type: 'array',
      items: {
        type: 'object',
        required: ['file', 'func', 'line', 'class', 'evidence'],
        properties: {
          file: { type: 'string' },
          func: { type: 'string' },
          line: { type: 'integer' },
          class: {
            type: 'string',
            enum: ['USES_PRELUDE','LEGIT_PUBLIC','HAND_ROLLED','UNKNOWN']
          },
          evidence: { type: 'string' },
        },
      },
    },
  },
}

const PRELUDE_REF = `
REFERENCE — what each class means in THIS codebase (backend/graph/authz.go):
authzPrelude(ctx, tx, claims, PreludeOptions{Cap:..., Scope:..., ...})
is the centralised authn+cap+scope+audit sequence...
CLASSIFICATION RULES (pick exactly one per function):
...
`

// schema.resolvers.go is split into 8 chunks, one Explore agent each; visitor / announcement get 1 each
// fanOut(...) → 10 read-only Explore agents in parallel
// then 1 synthesis Explore agent does the cross-file analysis

return { rows, summary }

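Three things to note:

JSON Schema locks the enum: a subagent cannot emit something like "class": "MAYBE"; the runtime rejects it outright. The baseline's free-form markdown has no such protection.
agentType: 'Explore' is a hard constraint at the script level: none of the 11 agents can Edit/Write at all, which is different from a soft constraint that relies on the prompt saying "do not modify".
The chunking is hard-coded in the script: each agent's context stays uncontaminated, so it does not misremember line numbers.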
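To make the enum lock concrete, here is a minimal, standard-library sketch of the check that the row schema implies. It is not the workflow runtime's validator; `validate_row` is a hypothetical helper and the sample rows are invented for illustration.

```python
# Hand-written check mirroring ROW_SCHEMA's item definition; an illustration only,
# not the runtime's actual validation code.
ALLOWED_CLASSES = {"USES_PRELUDE", "LEGIT_PUBLIC", "HAND_ROLLED", "UNKNOWN"}
FIELD_TYPES = {"file": str, "func": str, "line": int, "class": str, "evidence": str}


def validate_row(row):
    """Return a list of violations; an empty list means the row conforms."""
    errors = []
    for field, expected in FIELD_TYPES.items():
        if field not in row:
            errors.append(f"missing required field: {field}")
            continue
        value = row[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{field} must be {expected.__name__}")
    if isinstance(row.get("class"), str) and row["class"] not in ALLOWED_CLASSES:
        errors.append(f"class not in enum: {row['class']}")
    return errors


# Hypothetical rows, not taken from the real audit output.
good = {"file": "example.resolvers.go", "func": "ExampleResolver", "line": 42,
        "class": "USES_PRELUDE", "evidence": "calls authzPrelude first"}
bad = dict(good, **{"class": "MAYBE"})

print(validate_row(good))  # []
print(validate_row(bad))   # ['class not in enum: MAYBE']
```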

Real process state while running

Observed live from the local process tree + session jsonl (about 110 seconds into the workflow):

$ ps -o pid,pcpu,pmem,vsz,rss,etime,cmd -p 1676911
    PID %CPU %MEM    VSZ   RSS     ELAPSED CMD
1676911 26.9  1.0 73786132 337864    01:49 claude --model claude-opus-4-8 --effort high

$ python3 - << 'PY'
import json
events = {}; tools = {}
with open('<session>.jsonl') as f:
    for line in f:
        d = json.loads(line); ...
print(events, tools)
PY
# Result:
# {'assistant': 14, 'system': 2, 'user': 6, ...}
# Tools: {'Bash': 3, 'Read': 1, 'Workflow': 1}

At that point the main session had spent only 5 tool calls (3 Bash + 1 Read for reconnaissance + 1 Workflow to dispatch) to delegate all the work, and was waiting for the callback.
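The tally snippet above is elided, so here is a self-contained sketch of how such a count could be done. The record shape is an assumption on my part (a top-level `type` field, and tool calls as `tool_use` blocks inside `message.content`); verify it against your own session file before relying on it. The sample lines are invented.

```python
import json
from collections import Counter


def tally(lines):
    """Count event types and tool_use names from jsonl lines (assumed record shape)."""
    events, tools = Counter(), Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        events[record.get("type", "unknown")] += 1
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if isinstance(content, list):
            for block in content:
                if isinstance(block, dict) and block.get("type") == "tool_use":
                    tools[block.get("name", "unknown")] += 1
    return events, tools


# Invented sample lines that follow the assumed shape.
sample = [
    '{"type": "user", "message": {"content": "run a workflow"}}',
    '{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "Bash"}]}}',
    '{"type": "assistant", "message": {"content": [{"type": "tool_use", "name": "Workflow"}]}}',
]
events, tools = tally(sample)
print(dict(events), dict(tools))
```

For a real file, pass the open file handle instead of the sample list: `tally(open(path, encoding="utf-8"))`.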

The workflow runtime writes the execution traces of the 11 agents to disk, in this structure:

~/.claude/projects/<project>/<session>/
├── workflows/
│   ├── scripts/
│   │   └── authz-pattern-audit-wf_d545fc61-597.js   ← script written by Claude
│   └── wf_d545fc61-597.json                         ← run state
└── subagents/workflows/wf_d545fc61-597/
    ├── journal.jsonl                                ← start/finish trace of the 11 agents
    ├── agent-a1b66c368221dfcbc.jsonl                ← full transcript of each agent
    ├── agent-a1b66c368221dfcbc.meta.json
    ├── ... (11 jsonl + meta pairs in total)
    └── agent-a24f70cfc4a9cea4c.jsonl                ← synthesis agent

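journal.jsonl shows the concurrency pattern: 7 agents started at the same time, after which result and start events interleave. On Tom's 8-core machine about 7 agents were active simultaneously, so the run never hit the limit of 16.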
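Peak concurrency can be derived from start/finish times with a simple sweep. The sketch below assumes you have already extracted one `(start, end)` pair per agent; the exact journal.jsonl field names are not shown in this article, so that extraction step is left out and the intervals here are made-up numbers.

```python
def max_concurrency(intervals):
    """Return the peak number of simultaneously active intervals.

    `intervals` is an iterable of (start, end) times in seconds.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # an agent starts
        events.append((end, -1))    # an agent finishes
    # Tuples sort by time first; at equal times -1 sorts before +1, so an agent
    # that ends exactly when another starts is not counted as overlapping.
    events.sort()
    active = peak = 0
    for _, delta in events:
        active += delta
        peak = max(peak, active)
    return peak


# Hypothetical intervals, not the real run's timings.
demo = [(0, 10), (0, 5), (5, 8), (9, 12)]
print(max_concurrency(demo))  # 2
```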

Biggest finding: the runtime automatically assigns workers to Haiku 4.5

Model and token figures pulled from each agent's jsonl:

aid          model         dur(s)   in     out     cache_r    cache_w
a1b66c3682   haiku-4-5     36.6     68     2913    704739     86473
a24f70cfc4   haiku-4-5     65.3     126    6780    831993     139390
a44126f699   haiku-4-5     46.0     113    2750    780310     99018
a5b0952489   haiku-4-5     18.1     36     2106    172988     79402
a677c4aa85   haiku-4-5     15.6     41     1955    164250     84687
a98dc4ecbc   haiku-4-5     18.3     26     2189    132348     57398
aa0358d7b4   haiku-4-5     39.4     132    3935    803238     66657
aac3435e5b   haiku-4-5     22.4     77     2798    384636     174664
aac426ca5b   haiku-4-5     24.5     41     2616    235382     64016
ab4b713305   haiku-4-5     28.6     120    2618    480910     158476
abe7fdedda   haiku-4-5     39.8     88     3173    896956     110064

All 11 workers are Haiku 4.5. I launched with --model claude-opus-4-8, but the workflow runtime chose on its own to send the workers to Haiku.
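The per-agent table can be summed to check the aggregate figures quoted elsewhere in this article. The rows below are copied from the table; the summed durations come to about 354.6 s (reported as 354 s), and dividing by the 132 s workflow runtime gives the 2.7× parallel speedup.

```python
# (agent id, duration s, input, output, cache_read, cache_write), copied from the table.
AGENTS = [
    ("a1b66c3682", 36.6, 68, 2913, 704739, 86473),
    ("a24f70cfc4", 65.3, 126, 6780, 831993, 139390),
    ("a44126f699", 46.0, 113, 2750, 780310, 99018),
    ("a5b0952489", 18.1, 36, 2106, 172988, 79402),
    ("a677c4aa85", 15.6, 41, 1955, 164250, 84687),
    ("a98dc4ecbc", 18.3, 26, 2189, 132348, 57398),
    ("aa0358d7b4", 39.4, 132, 3935, 803238, 66657),
    ("aac3435e5b", 22.4, 77, 2798, 384636, 174664),
    ("aac426ca5b", 24.5, 41, 2616, 235382, 64016),
    ("ab4b713305", 28.6, 120, 2618, 480910, 158476),
    ("abe7fdedda", 39.8, 88, 3173, 896956, 110064),
]
WORKFLOW_RUNTIME_S = 132  # wall-clock runtime of the workflow, from the article

total_duration = sum(row[1] for row in AGENTS)
totals = {
    "input": sum(row[2] for row in AGENTS),
    "output": sum(row[3] for row in AGENTS),
    "cache_read": sum(row[4] for row in AGENTS),
    "cache_write": sum(row[5] for row in AGENTS),
}
speedup = total_duration / WORKFLOW_RUNTIME_S

print(f"summed agent time: {total_duration:.1f}s, speedup: {speedup:.1f}x")
print(totals)
```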

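This corresponds to the model-tiering doctrine I keep in my MEMORY: "think → opus / do → sonnet / find → haiku". The runtime did it for me:

Main orchestrator (writing the script + synthesis) = Opus → "think"
11 workers (bulk classification / grep / read) = Haiku → "find"
No Sonnet, because this task has no "do" (a read-only audit writes nothing)

What this means: assigning models by hand in AGENTS.md during planning is now something the runtime does for you. This matters more than any single feature in the announcement, because adaptive staffing has moved from "future direction" to "shipped".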

Comparison results: the numbers laid out

Aspect	Baseline (Opus 4.7 + 1 agent)	Workflow (Opus 4.8 + 11 Haiku)
Total time	3:56 (236s)	6:32 turn / 2:12 workflow runtime
Actual parallel speedup	n/a	11 agents sum to 354s, actual run 132s → 2.7×
Main-session tool calls	24 (14 Bash + 10 Read)	8 (5 Bash + 2 Read + 1 Workflow)
Subagent count	1 (general-purpose, can write)	11 (Explore, read-only)
Subagent model	Opus 4.7	All 11 on Haiku 4.5
Coverage	109/109	109/109
Total tokens (est.)	~103K	~480K (main 315K + workers 165K)
Cache read	Low	6.95M (heavy reuse)
Estimated cost	~$0.72	~$6.10 (about 8.5× baseline)

Differences in classification results

Class	Baseline	Workflow	Reason for the difference
USES_PRELUDE	84	78	The workflow files "prelude + subsequent manual orchestration" under hand_rolled hybrid, a stricter standard
LEGIT_PUBLIC	13	17	The workflow treats Logout / Me / Community as public (a judgment call)
HAND_ROLLED	10	12	The workflow additionally flags 5 hybrid hand-rolls (prelude plus manual post-processing)
UNKNOWN	2	2	Same (the two *AuditUnacked functions, which rely purely on RLS)

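Qualitative differences: what baseline and workflow each saw on the same task

Both reach 109/109 coverage, but the two sides "see" different things in the same codebase. The 4 dimensions below put baseline on the left and workflow on the right.

Difference #1: whether downstream results get re-checked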

Baseline behaviour	Workflow behaviour
There is only 1 agent; it runs and counts on its own, with no downstream to doubt. Whatever the report says is what you get, and readers have to grep themselves to verify it.	The synthesis subagent computed `total: 83` by itself (26 short). The main orchestrator (Opus 4.8) refused to trust it, recounted 109 on its own, and recorded this openly: “The audit is complete, but the synthesis agent’s counts look off (it reported total=83, but I dispatched coverage for 109 functions). Let me verify completeness from the authoritative rows array myself.” This is a live demonstration of the claim touted in the 4.8 announcement: "4× less likely to allow flaws in code it wrote to pass unremarked".
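The orchestrator's re-count amounts to tallying the authoritative rows and comparing the result with the figure the synthesis step reported. A minimal sketch follows; `cross_check` is a hypothetical helper, and the rows are synthetic placeholders generated from the workflow's per-class counts (78 / 17 / 12 / 2), not the real audit rows.

```python
from collections import Counter


def cross_check(rows, reported_total):
    """Recount rows by class and compare with a downstream-reported total."""
    by_class = Counter(row["class"] for row in rows)
    actual_total = sum(by_class.values())
    return {
        "by_class": dict(by_class),
        "actual_total": actual_total,
        "reported_total": reported_total,
        "missing": actual_total - reported_total,
        "consistent": actual_total == reported_total,
    }


# Synthetic rows built from the workflow column of the classification table.
CLASS_COUNTS = {"USES_PRELUDE": 78, "LEGIT_PUBLIC": 17, "HAND_ROLLED": 12, "UNKNOWN": 2}
rows = [{"class": name} for name, count in CLASS_COUNTS.items() for _ in range(count)]

result = cross_check(rows, reported_total=83)
print(result["actual_total"], result["missing"], result["consistent"])  # 109 26 False
```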

Difference #2: depth of root-cause analysis

Baseline's observation	Workflow's observation
"`cap-OR` is the main reason 4 hand-rolls have not migrated." It stops at a first-level observation and does not reach the deeper structural problem.	"`Households` / `Parcels` / `ParcelsPage` / `CommunityVisitors` / `VisitorAuditLog` copy the same SQL 5 times. The real culprit is that `AuthzDecision.IsCommunityWide` excludes the `guard` role, forcing every resolver that treats guard as community-wide to re-probe on its own." It reaches the second-level structural root cause. That was possible because synthesis is a dedicated agent with its own context, not crowded out by the details of 109 functions.

Difference #3: number of actionable fix proposals

Baseline's proposals	Workflow's proposals
1: extend the prelude to support `CapsAny`, a multi-cap OR-chain	6 actionable prelude extensions, listed below

The 6 prelude extensions proposed by the workflow:

1. `CapsAny []string`: resolves 4 hand-rolls
2. Recognize `guard` as a broad role (`RoleListBroad bool`): removes the 5× duplicated SQL probe
3. `ScopeDowngradeOnRole map[role]AuthzScope`: resolves 2 visitor scope downgrades
4. `Cap` + `CapOwn` + `OwnershipIDCol`: resolves 2 announcement ownership cases
5. `authzPreludeAny([]PreludeOptions)` multi-path helper: resolves 2 multi-path cases
6. `ScopeRLSOnly` enum (explicitly declares RLS-only): resolves the 2 UNKNOWNs

Highest leverage: #1 + #2, two small API tweaks, eliminate 7 of the 12 hand-rolls.

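Difference #4: structural observations

Depth of baseline's conclusion	Depth of workflow's conclusion
Classifies function by function, then gives summary numbers. No cross-function structural insight. Readers who get the table have to go through it again themselves to discover that "all the hand-rolls are on the query side".	"All 7 pure hand-rolls + 2 UNKNOWNs are concentrated in the query tail of schema.resolvers.go (lines 6210-7130); mutations go through the prelude 100% of the time." Conclusion: "read-side authz is where the rot is." The baseline did not produce this level of structural insight because its agent was handling per-function classification and cross-function observation at the same time, and its context filled up.

The common root cause behind the 4 differences

The workflow wins on all 4 dimensions because of architecture, not because of the model:

Synthesis is a dedicated phase with a dedicated agent, so it does not compete with per-function classification for context
The main orchestrator does not do the work directly, so it has capacity left to cross-check downstream results
JSON Schema locks the enum, so the synthesis step's arithmetic error can be compared directly against the evidence and is easy to catch
Each Classify agent sees only its own chunk, so its context is not drowned in the overall noise of 109 functions

In other words, upgrading the baseline's single Opus 4.7 agent to Opus 4.8 would not fix it. What is missing is architectural separation of concerns, not model intelligence.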

Cost breakdown: a symmetric comparison

Calculated from public pricing (Opus 4.7 / 4.8 share the same prices: $5/M input, $25/M output, ~$0.50/M cache_read, $6.25/M cache_write; Haiku 4.5: $1/M input, $5/M output, ~$0.10/M cache_read, $1.25/M cache_write).

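Baseline breakdown

To be upfront: the baseline went through the Agent tool, and Anthropic only returns total_tokens: 103,078 with no input/output split. So I can only estimate lower and upper bounds.

Scenario	Assumed split	Cost
Lower bound (all input)	103K in / 0 out	$0.52
Reasonable estimate (read-heavy audit, ~10K of markdown output)	93K in / 10K out	~$0.72
Upper bound (all output, unrealistic)	0 in / 103K out	$2.58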
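The three bounds follow directly from the Opus prices quoted above. A short sketch of the arithmetic, assuming the 10K-output split from the table for the middle estimate (the `cost` helper is introduced here for illustration):

```python
TOTAL_TOKENS = 103_078            # total_tokens reported for the baseline agent
OPUS_IN, OPUS_OUT = 5.0, 25.0     # USD per million tokens, from the pricing above


def cost(input_tokens, output_tokens):
    """Cost in USD for a given input/output split at Opus prices."""
    return (input_tokens * OPUS_IN + output_tokens * OPUS_OUT) / 1_000_000


lower = cost(TOTAL_TOKENS, 0)                    # everything billed as input
estimate = cost(TOTAL_TOKENS - 10_000, 10_000)   # ~10K tokens of markdown output
upper = cost(0, TOTAL_TOKENS)                    # everything billed as output

print(f"lower ${lower:.2f}, estimate ${estimate:.2f}, upper ${upper:.2f}")
# lower $0.52, estimate $0.72, upper $2.58
```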

Workflow breakdown

The workflow runtime writes the token usage of every agent and of the main session to jsonl, so the cost can be computed precisely.

Item	Input	Output	Cache read	Cache write	Cost
Main session (Opus 4.8)	31,686	59,506	1,361,289	262,528	~$3.97
11 workers (Haiku 4.5)	~870	~33,833	~5,587,750	~1,120,245	~$2.13
Workflow total	~32,556	~93,339	~6,949,039	~1,382,773	~$6.10
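The workflow figures can be reproduced from the token counts and the prices quoted earlier. In this sketch the worker input is taken as 868, the exact sum of the per-agent table (the breakdown rounds it to ~870); the dictionary layout and the `cost` helper are mine.

```python
# USD per million tokens, from the pricing quoted in this section.
PRICES = {
    "opus":  {"input": 5.0, "output": 25.0, "cache_read": 0.50, "cache_write": 6.25},
    "haiku": {"input": 1.0, "output": 5.0,  "cache_read": 0.10, "cache_write": 1.25},
}

USAGE = {
    "main (Opus 4.8)":    ("opus",  {"input": 31_686, "output": 59_506,
                                     "cache_read": 1_361_289, "cache_write": 262_528}),
    "workers (Haiku 4.5)": ("haiku", {"input": 868, "output": 33_833,
                                      "cache_read": 5_587_750, "cache_write": 1_120_245}),
}


def cost(tier, tokens):
    """Sum tokens * price over the four billing categories, in USD."""
    return sum(tokens[kind] * PRICES[tier][kind] for kind in tokens) / 1_000_000


costs = {name: cost(tier, tokens) for name, (tier, tokens) in USAGE.items()}
total = sum(costs.values())
ratio = total / 0.72  # against the baseline's reasonable estimate

for name, value in costs.items():
    print(f"{name}: ${value:.2f}")
print(f"total: ${total:.2f}, ratio vs baseline: {ratio:.1f}x")
```

Cache writes alone account for roughly $1.64 of the main session's cost and $1.40 of the workers' cost, which is why the fixed overhead dominates at this scale.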

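Side-by-side comparison and cost ratio

Metric	Baseline	Workflow	Ratio
Total tokens (measurable)	~103K	~8.46M (main + workers, including cache)	~82×
Non-cache tokens	~103K	~125K (main + workers, pure in+out)	~1.2×
Estimated cost	~$0.72	~$6.10	~8.5×

A correction to the "2-3× cost" I casually quoted before: computed carefully, this workflow run cost ~8.5× the baseline. The main reason it is so high is the workflow runtime's massive cache writes (the main session prompt plus each of the 11 workers' system+task prompts all have to be cached), a fixed cost that is disproportionate on a small task.

At larger scale (for example a codebase-scale migration with 1000 functions, 5 files, and multiple phases) the ratio should drop because the cache cost amortizes. The 109 functions in this run may not yet be in the workflow's sweet spot.

The payoff is not money but structural guarantees: JSON Schema locks the output contract, Explore agents lock read-only access, the persisted script is reproducible, the main orchestrator actively cross-checks downstream arithmetic errors, and an independent synthesis pass produces structural insight. If this audit's results directly help you fix 5 bugs correctly, the ~$5 price difference is well worth it; if it is only an exploratory scan that you rerun often, the $0.72 baseline makes more sense.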

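When to use it, and when not to

Scenario	Use
Parallel audit / sweep / classification (N independent targets)	Workflow
Results can be constrained by JSON Schema	Workflow
A ground truth exists to verify against (grep / rule comparison)	Workflow
Codebase migration of hundreds to thousands of lines	Workflow
30-line small changes, interactive debugging, decisions needed midway	Subagent / direct conversation
Design / refactoring / exploratory spike	Direct conversation
Sequential logic (A's result decides what B does)	Subagent
Budget-sensitive, or on a deadline	Subagent

Hard limits (ones I ran into)

Limit	Number	Impact
Concurrent agents	At most 16 (~7 concurrent measured on an 8-core machine)	Design large chunks; do not aim for one agent per function
Cumulative agents per run	Hard cap of 1000	Split large migrations into multiple workflows
User input midway	Not possible	Split multi-stage sign-offs into multiple workflows and persist results in between
Direct fs / shell from the script	Not possible	Go through agents; the script only orchestrates
Subagent permission mode	Always `acceptEdits`	For read-only, always use `agentType: 'Explore'`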
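Given the concurrency cap, chunk planning is a matter of splitting N targets into at most as many near-equal contiguous ranges as you want agents. A minimal sketch with a hypothetical `plan_chunks` helper; the default of 16 is the concurrent-agent limit from the table, and the even split is an illustration (the real script cut chunks by function range per file).

```python
def plan_chunks(n_items, max_agents=16):
    """Split range(n_items) into at most max_agents contiguous, near-equal chunks.

    Returns a list of (start, end) index pairs with `end` exclusive.
    """
    if n_items <= 0 or max_agents <= 0:
        return []
    n_chunks = min(n_items, max_agents)
    base, extra = divmod(n_items, n_chunks)
    chunks, start = [], 0
    for index in range(n_chunks):
        size = base + (1 if index < extra else 0)  # spread the remainder evenly
        chunks.append((start, start + size))
        start += size
    return chunks


# 109 functions across 10 classify agents: nine chunks of 11 and one of 10.
chunks = plan_chunks(109, max_agents=10)
print(len(chunks), chunks[0], chunks[-1])  # 10 (0, 11) (99, 109)
```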

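Saving a workflow as a reusable slash command

Once you are happy with a run, inside the workflow session:

/workflows               # list runs
↑↓ select the run → Enter   # open the progress view
s                        # save as a slash command
  → choose .claude/workflows/<name>.js (shared with the repo)
  → or ~/.claude/workflows/<name>.js (personal, across projects)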

After that, use /<name> like any slash command. A project workflow takes precedence over a personal one with the same name.

To resume a failed run:
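The precedence rule can be pictured as a two-step lookup: check the project directory first, then the personal one. This is only a sketch of the rule as stated, not Claude Code's implementation; `resolve_workflow` and its parameters are hypothetical.

```python
from pathlib import Path


def resolve_workflow(name, project_root, home=None):
    """Return the script path that /<name> would resolve to under the stated rule.

    Project-level .claude/workflows/<name>.js wins over ~/.claude/workflows/<name>.js.
    Returns None when neither file exists.
    """
    home = Path(home) if home is not None else Path.home()
    candidates = [
        Path(project_root) / ".claude" / "workflows" / f"{name}.js",  # project (shared)
        home / ".claude" / "workflows" / f"{name}.js",                # personal
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Called as `resolve_workflow("authz-pattern-audit", ".")`, it returns the project copy when both exist.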

Workflow({
  scriptPath: "<session>/workflows/scripts/<name>-<runId>.js",
  resumeFromRunId: "wf_xxx"
})
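# completed agents return their cached results; only the failed ones are rerun

Turning workflows off

Scope	How to turn it off
Yourself	Toggle Dynamic workflows in `/config` / set `"disableWorkflows": true` in settings.json / set the environment variable `CLAUDE_CODE_DISABLE_WORKFLOWS=1`
Whole organization	Managed settings `"disableWorkflows": true`, or toggle it on the Claude Code admin page

Once disabled: the built-in /deep-research cannot be used, the workflow keyword no longer triggers anything, and ultracode disappears from the /effort menu.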

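Impact on day-to-day work

Scenarios where I plan to use workflows next:

Home123 RLS coverage sweep: confirm every table has all 4 CRUD RLS policies
Home123 invariant coverage: check whether each of the 170 INVs has a matching test sketch in the backlog
iDempiere @RestResource endpoint inspection: a three-way check of endpoint × test × OData filter doc
SimpleEC OMS Kafka event coverage matrix: topic × producer × consumer mapping

Scenarios where I will not use workflows: anything that goes through the decision-server (a workflow cannot stop midway to ask a human), 30-line small fixes, and new-feature spikes (where reproducibility has no value).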

How this relates to my earlier articles

「為什麼修不完?」— 從 21 輪迴圈到 5 type taxonomy 的多 agent 角色設計: the pain of a hard-coded AGENTS.md, which workflow's automatic staffing partly resolves
從 4 條原則到動態大腦:兩種 Claude Code 知識系統的差異: static vs dynamic knowledge systems; workflow is a concrete implementation of dynamic dispatch
「56 條 INV 全綠,user 點一次抓出 4 個 bug」— Multi-Agent 業界共識的五個自家補丁: the Generator-Verifier adversarial pair concept, validated by the workflow's active cross-check of synthesis
Cycle File:Multi-Agent 工程的狀態接力棒: persisting the workflow script is an execution-layer extension of the cycle file idea

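Conclusion: structural guarantees, not a money-saving tool

Dynamic Workflows is not "Claude got stronger, so it is cheaper". It is "Claude moves staffing decisions that used to be made at planning time into the runtime, and uses JSON Schema + the Explore type to write the output contract into code rather than into the prompt". For tasks that meet the 4 conditions (parallelizable, schema-constrainable, has a ground truth, read-only), the workflow's structural guarantees are worth the premium, which was about 8.5× in this run and should shrink on larger tasks. For everything else, keep using traditional subagents and direct conversation, and do not default to workflows just because they look cool.

The real surprise is not any single feature. It is that "the runtime automatically assigns Haiku 4.5 to workers and keeps Opus 4.8 for the orchestrator" is not even highlighted in the documentation. It turns the three-tier "think / do / find" division of labor, which used to be written into AGENTS.md, from documentation into system behaviour. The next model release (Mythos Preview has already been announced) will keep pushing along this axis: organizational prompts move from hand-written to runtime-derived.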