Latent reasoning / test-time compute budget: when to think more, at which latent layer, and when to stop
Evaluation modality
Spec-level: a spec-motivation / governance borrow. Evaluated by spec review + contract tests, not A/B or ablation.
- Primary owner: (not set)
- Phase-A verdict: (not set)
- Shadow profile: (not set)
- Source papers: Recurrent Depth 2025, Coconut 2024, Quiet-STaR 2024, Snell 2024, s1 2025
- Specs: docs/specs/temporal-abstraction.md, docs/specs/evaluation.md, docs/specs/expression-layer.md
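Blind spot (current gap)
VZ explicitly does not pursue token-space long-term RL, but that is not the same as ignoring test-time compute. The current gap is threefold: when should the metacontroller think more, at which latent layer does the extra thinking happen, and when does it stop? Without a budget curve, β_t / reasoning depth may degrade into a fixed configuration or empirical hand-tuning.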
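One way to make the "when to think more / when to stop" question concrete is a marginal-gain stopping rule: keep adding latent steps while each step still buys a meaningful error reduction, and stop at a hard cap. The sketch below is illustrative only. `run_latent_budget`, `step_fn`, `min_gain`, and `max_steps` are hypothetical names that appear in no VZ spec, and reading PE as a scalar prediction-error signal is an assumption. It does not address which latent layer the steps run in.

```python
from typing import Callable, List, Tuple


def run_latent_budget(
    step_fn: Callable[[int], float],
    max_steps: int = 8,
    min_gain: float = 0.01,
) -> Tuple[int, List[float]]:
    """Add latent steps until the marginal PE reduction drops below min_gain.

    step_fn(k) is a hypothetical probe returning the PE after k extra latent
    steps; k = 0 is the baseline with no extra thinking.
    Returns (steps kept, PE trace of the kept steps).
    """
    trace = [step_fn(0)]
    for k in range(1, max_steps + 1):
        pe = step_fn(k)
        gain = trace[-1] - pe
        if gain < min_gain:
            # This step did not pay for itself: discard it and stop.
            return k - 1, trace
        trace.append(pe)
    # Hard cap reached: stop even if steps were still paying off.
    return max_steps, trace


def toy_pe(k: int) -> float:
    """Toy PE curve with diminishing returns (halves at every step)."""
    return 0.5 ** k


steps, trace = run_latent_budget(toy_pe, max_steps=8, min_gain=0.05)
print(steps, trace)
```

Sweeping `min_gain` over a range and recording the resulting step counts gives a simple budget curve, which is the thing a fixed β_t or hand-tuned depth lacks.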
Adoptable suggestions (actionable steps)
- 1. Add a "latent compute budget readout" to the temporal-abstraction spec: use Recurrent Depth / Coconut / Quiet-STaR / Snell / s1 as reference points; do not adopt token forcing, but absorb their budget and stopping criteria. Status: PROPOSED
  Not a runnable A/B candidate; evaluated by the path above, not ablation.
- 2. Define report-only metrics: extra latent steps vs PE reduction, cost vs relationship outcome, overthinking drift. Status: PROPOSED
  Not a runnable A/B candidate; evaluated by the path above, not ablation.
- 3. Bind to the expression-layer boundary: more compute does not mean more user-facing CoT, and the hidden budget must not be turned into a reward on output token length. Status: PROPOSED
  Not a runnable A/B candidate; evaluated by the path above, not ablation.
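The report-only metrics in suggestion 2 could be derived from a per-episode PE trace, as in the minimal sketch below. The record layout and both definitions are assumptions made for illustration: PE is read as prediction error, "overthinking drift" is taken to mean how far the final PE sits above the best PE reached along the way, and `BudgetReadout` and `summarize` are hypothetical names. The "cost vs relationship outcome" metric is omitted because the source does not define the outcome signal.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BudgetReadout:
    """Report-only record: nothing here feeds back into training or control."""
    extra_steps: int
    pe_reduction: float           # PE(0) - PE(final); positive means improvement
    pe_reduction_per_step: float  # average gain per extra latent step
    overthinking_drift: float     # PE(final) - min PE; > 0 means it ran past the best point


def summarize(pe_trace: List[float]) -> BudgetReadout:
    """pe_trace[k] is the (assumed) prediction error after k extra latent steps."""
    if not pe_trace:
        raise ValueError("pe_trace needs at least the baseline value")
    extra_steps = len(pe_trace) - 1
    reduction = pe_trace[0] - pe_trace[-1]
    per_step = reduction / extra_steps if extra_steps else 0.0
    drift = pe_trace[-1] - min(pe_trace)
    return BudgetReadout(extra_steps, reduction, per_step, drift)


# Toy trace: PE improves for two steps, then gets worse on the third.
readout = summarize([1.0, 0.5, 0.25, 0.375])
print(readout)
```

Keeping the record frozen and free of any update method is one way to signal that the metrics are for reporting only.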
Traceability
No plugins / runs linked yet.
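Expected benefit
- Makes "how long to think" a measurable control problem rather than a prompt / token-length policy.
- Gives the low-cost lifeform runtime a compute-allocation baseline.
- Complements OA-2 Mind/Face isolation: extra internal thinking does not leak into expression-layer CoT training.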
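Because this item is evaluated by contract tests, the isolation between hidden compute and user-facing output could be pinned down as an invariance check: vary only the reply length and require the budget reward to stay the same. The sketch below is one possible shape for such a check. `Episode`, `budget_reward`, `leaky_reward`, `violates_length_contract`, and the `step_cost` value are all hypothetical and do not come from the VZ specs.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable


@dataclass(frozen=True)
class Episode:
    latent_steps: int    # hidden compute spent
    pe_reduction: float  # assumed gain signal from the extra thinking
    output_tokens: int   # length of the user-facing reply


def budget_reward(ep: Episode, step_cost: float = 0.02) -> float:
    """Hypothetical reward: gain minus hidden compute cost; ignores reply length."""
    return ep.pe_reduction - step_cost * ep.latent_steps


def leaky_reward(ep: Episode) -> float:
    """Counter-example that pays for longer replies; the contract should reject it."""
    return budget_reward(ep) + 0.001 * ep.output_tokens


def violates_length_contract(
    reward_fn: Callable[[Episode], float],
    base: Episode,
    lengths: Iterable[int] = (1, 50, 500, 5000),
) -> bool:
    """True if changing only output_tokens changes the reward."""
    rewards = {reward_fn(replace(base, output_tokens=n)) for n in lengths}
    return len(rewards) > 1


base = Episode(latent_steps=4, pe_reduction=0.3, output_tokens=120)
print(violates_length_contract(budget_reward, base))  # compliant reward
print(violates_length_contract(leaky_reward, base))   # length-rewarding reward
```

The check only probes a handful of lengths around one base episode, so it is a smoke test for the boundary, not a proof that a reward function is length-invariant.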
Cited papers
**Recurrent Depth** (2025), **Coconut** (2024), **Quiet-STaR** (2024), **Scaling LLM Test-Time Compute Optimally** (2024), **s1** (2025). For details, see [`research/probe/02_axis_walkthrough.md`](../../research/probe/02_axis_walkthrough.md), A1.

---