COG-7 · COG · P1/S-M · Spec-level · PROPOSED

Latent reasoning / test-time compute budget: when to think more, at which latent layer, and when to stop

Evaluation modality

Spec-level

A spec-motivation / governance borrow. Evaluated by spec review + contract tests, not A/B or ablation.

Primary owner: —
Phase-A verdict: —
Shadow profile: —
Source papers: Recurrent Depth 2025 + Coconut 2024 + Quiet-STaR 2024 + Snell 2024 + s1 2025
Specs: docs/specs/temporal-abstraction.md, docs/specs/evaluation.md, docs/specs/expression-layer.md

Blind spot (current state)

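VZ explicitly does not pursue token-space long-term RL, but that does not mean test-time compute can be ignored. The current gap is threefold: when should the metacontroller think more, at which latent layer does the extra thinking happen, and when should it stop? Without a budget curve, β_t / reasoning depth risks degrading into a fixed configuration or empirical hand-tuning.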
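The three questions above can be illustrated with a small stopping rule. This is only a sketch under stated assumptions, not the metacontroller's actual design: it assumes a scalar PE value can be observed after each extra latent step (the source does not define PE or how it is measured), and `run_with_budget`, `step_fn`, `max_steps`, `min_gain`, and `target_pe` are hypothetical names and defaults introduced for illustration. It does not address which latent layer the steps run in.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class BudgetDecision:
    steps_taken: int       # extra latent steps actually spent
    stop_reason: str       # "converged", "diminishing_returns", or "budget_exhausted"
    pe_trace: List[float]  # PE before any extra step, then after each step


def run_with_budget(
    step_fn: Callable[[int], float],  # hypothetical: runs latent step k, returns PE after it
    initial_pe: float,
    max_steps: int = 8,       # hard cap on extra steps (assumed value)
    min_gain: float = 0.01,   # smallest PE reduction that justifies the step just taken
    target_pe: float = 0.0,   # stop as soon as PE is at or below this level
) -> BudgetDecision:
    trace = [initial_pe]
    if initial_pe <= target_pe:
        # Nothing to gain from thinking more.
        return BudgetDecision(0, "converged", trace)
    for k in range(1, max_steps + 1):
        pe = step_fn(k)
        trace.append(pe)
        if pe <= target_pe:
            return BudgetDecision(k, "converged", trace)
        if trace[-2] - pe < min_gain:
            # The last step bought too little PE reduction: stop here.
            return BudgetDecision(k, "diminishing_returns", trace)
    return BudgetDecision(max_steps, "budget_exhausted", trace)


if __name__ == "__main__":
    # Toy PE sequence: large gains for two steps, then a plateau.
    pes = [0.5, 0.3, 0.29, 0.1]
    decision = run_with_budget(lambda k: pes[k - 1], initial_pe=1.0, min_gain=0.05)
    print(decision.steps_taken, decision.stop_reason)  # 3 diminishing_returns
```

Logging `pe_trace` per decision is what would let a budget curve be read off after the fact instead of fixing reasoning depth in configuration.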

Adoptable suggestions (actionable steps)

1. Add a "latent compute budget readout" to the temporal-abstraction spec: use Recurrent Depth / Coconut / Quiet-STaR / Snell / s1 as reference points; do not adopt token forcing, but absorb their budget and stopping criteria. Status: PROPOSED
   Not a runnable A/B candidate; evaluated by the path above, not ablation.
2. Define report-only metrics: extra latent steps vs. PE reduction, cost vs. relationship outcome, and overthinking drift. Status: PROPOSED
   Not a runnable A/B candidate; evaluated by the path above, not ablation.
3. Bind to the expression-layer boundary: more compute does not mean more user-facing CoT, and the hidden budget must not be turned into a reward on output token length. Status: PROPOSED
   Not a runnable A/B candidate; evaluated by the path above, not ablation.
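Suggestions 2 and 3 could be exercised roughly as follows. This is a hedged sketch, not a spec: the `Episode` fields, `BudgetReport`, `build_report`, and `reward_ignores_budget_and_length` are hypothetical names; `relationship_outcome` is treated as an opaque score because the source does not define it; and "overthinking drift" is approximated here as the share of episodes in which extra latent steps raised PE, which is only one possible reading of the term.

```python
from dataclasses import dataclass, replace
from statistics import mean
from typing import Callable, List


@dataclass(frozen=True)
class Episode:
    extra_latent_steps: int      # latent steps spent beyond the baseline
    pe_before: float             # PE before the extra steps
    pe_after: float              # PE after the extra steps
    cost: float                  # compute cost, in whatever unit the runtime logs
    relationship_outcome: float  # opaque score; its definition is not given here
    output_tokens: int           # length of the user-facing output


@dataclass(frozen=True)
class BudgetReport:
    pe_reduction_per_step: float      # total PE reduction / total extra steps
    mean_cost: float
    mean_relationship_outcome: float
    overthinking_rate: float          # assumed proxy for "overthinking drift"


def build_report(episodes: List[Episode]) -> BudgetReport:
    """Aggregate report-only metrics; nothing here feeds back into training."""
    if not episodes:
        raise ValueError("need at least one episode")
    total_steps = sum(e.extra_latent_steps for e in episodes)
    total_reduction = sum(e.pe_before - e.pe_after for e in episodes)
    with_extra = [e for e in episodes if e.extra_latent_steps > 0]
    worse = [e for e in with_extra if e.pe_after > e.pe_before]
    return BudgetReport(
        pe_reduction_per_step=total_reduction / total_steps if total_steps else 0.0,
        mean_cost=mean(e.cost for e in episodes),
        mean_relationship_outcome=mean(e.relationship_outcome for e in episodes),
        overthinking_rate=len(worse) / len(with_extra) if with_extra else 0.0,
    )


def reward_ignores_budget_and_length(
    reward_fn: Callable[[Episode], float], episode: Episode
) -> bool:
    """Contract check: reward must not move when only steps or output length change."""
    inflated = replace(
        episode,
        extra_latent_steps=episode.extra_latent_steps + 10,
        output_tokens=episode.output_tokens * 2 + 1,
    )
    return reward_fn(episode) == reward_fn(inflated)
```

A contract test in this style would pass for a reward such as `lambda e: e.relationship_outcome` and fail for one such as `lambda e: float(e.output_tokens)`, which is the kind of output-length reward suggestion 3 rules out.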

Traceability

No plugins / runs linked yet.

Expected benefit

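- Make "how long to think" a measurable control problem rather than a prompt / token-length strategy.
- Provide a compute-allocation baseline for the low-cost lifeform runtime.
- Complement the OA-2 Mind/Face isolation: extra internal thinking does not leak into CoT training for the expression layer.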

Cited papers

**Recurrent Depth** (2025), **Coconut** (2024), **Quiet-STaR** (2024), **Scaling LLM Test-Time Compute Optimally** (2024), **s1** (2025). For details, see [`research/probe/02_axis_walkthrough.md`](../../research/probe/02_axis_walkthrough.md), A1.