COG-7 · COG · P1/S-M · Spec-level · PROPOSED

Latent reasoning / test-time compute budget: when to think more, at which latent layer, and when to stop

Evaluation modality

Spec-level

A spec-motivation / governance borrow. Evaluated by spec review + contract tests, not A/B or ablation.

Primary owner
Phase-A verdict
Shadow profile
Source papers
Recurrent Depth 2025 + Coconut 2024 + Quiet-STaR 2024 + Snell 2024 + s1 2025
Specs
docs/specs/temporal-abstraction.md, docs/specs/evaluation.md, docs/specs/expression-layer.md

Blind spot

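VZ explicitly does not pursue token-space long-term RL, but that does not mean ignoring test-time compute. The current gap is threefold: when should the metacontroller think more, at which latent layer does the extra thinking happen, and when should it stop? Without a budget curve, β_t / reasoning depth may degenerate into a fixed configuration or empirical hand-tuning.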
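
As a rough illustration of what a budget curve makes possible, the sketch below reads a logged error curve offline and reports how many extra latent steps were still worth taking. It is not a proposed controller. It assumes that "PE" can be read as a scalar prediction error recorded after each extra latent step; the names `steps_to_take`, `pe_curve`, `min_gain`, and `max_steps` are hypothetical and do not come from any spec.

```python
from typing import Sequence


def steps_to_take(pe_curve: Sequence[float], min_gain: float, max_steps: int) -> int:
    """Return how many extra latent steps were worth taking.

    Assumption: pe_curve[k] is the prediction error after k extra latent
    steps, so pe_curve[0] is the error with no extra thinking at all.
    """
    if not pe_curve:
        return 0
    # Never look past the budget or past the end of the logged curve.
    limit = max(0, min(max_steps, len(pe_curve) - 1))
    for k in range(1, limit + 1):
        gain = pe_curve[k - 1] - pe_curve[k]
        if gain < min_gain:
            # Step k did not reduce error enough; stop before it.
            return k - 1
    return limit


curve = [1.00, 0.70, 0.55, 0.52, 0.53]
print(steps_to_take(curve, min_gain=0.05, max_steps=8))  # prints 2
```

In this example the third step reduces the error by only about 0.03, below the assumed threshold of 0.05, so the readout stops at two steps. The threshold and the step budget are exactly the kind of values that would otherwise end up as fixed configuration.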

Adoptable suggestions

  1. Add a "latent compute budget readout" to the temporal-abstraction spec: use Recurrent Depth / Coconut / Quiet-STaR / Snell / s1 as reference points; do not adopt token forcing, but absorb their budget and stopping criteria. PROPOSED

    Not a runnable A/B candidate — evaluated by the path above, not ablation.

  2. Define report-only metrics: extra latent steps vs PE reduction, cost vs relationship outcome, and overthinking drift. PROPOSED

    Not a runnable A/B candidate — evaluated by the path above, not ablation.

  3. Bind this to the expression-layer boundary: more compute does not mean more user-facing CoT, and the hidden budget must not be turned into a reward on output token length. PROPOSED

    Not a runnable A/B candidate — evaluated by the path above, not ablation.
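
The report-only metrics in suggestion 2 could be summarised per episode roughly as follows. This is a minimal sketch under stated assumptions, not a spec: it treats "PE" as a scalar prediction error, and `EpisodeLog`, `budget_readout`, `cost_per_step`, and the definition of overthinking drift as "final error minus best error reached" are all hypothetical. "Relationship outcome" is left out because nothing here defines how it would be measured.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class EpisodeLog:
    # Assumption: prediction error after 0, 1, 2, ... extra latent steps.
    pe_curve: Sequence[float]
    # Assumption: compute cost of one extra latent step, in arbitrary units.
    cost_per_step: float


def budget_readout(ep: EpisodeLog) -> dict:
    """Summarise one episode. The result is for reporting only, never a reward."""
    if not ep.pe_curve:
        raise ValueError("pe_curve must contain at least the zero-step error")
    steps = len(ep.pe_curve) - 1
    best = min(ep.pe_curve)
    final = ep.pe_curve[-1]
    reduction = ep.pe_curve[0] - final
    return {
        "extra_steps": steps,
        "pe_reduction": reduction,
        "pe_reduction_per_step": reduction / steps if steps else 0.0,
        "total_cost": steps * ep.cost_per_step,
        # Hypothetical definition: how much worse the final state is than the
        # best state reached along the way (0.0 means no overthinking).
        "overthinking_drift": final - best,
    }


report = budget_readout(EpisodeLog(pe_curve=[1.0, 0.5, 0.25, 0.375], cost_per_step=2.0))
print(report["extra_steps"], report["pe_reduction"], report["overthinking_drift"])
```

Keeping the readout as a plain dictionary that no training loop consumes is one way to honor the report-only constraint: the numbers can be reviewed in a spec review or asserted on in a contract test without becoming an optimization target.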

Traceability

No plugins / runs linked yet.

Expected benefit

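- Make "how long to think" a measurable control problem rather than a prompt / token-length strategy.
- Provide a compute-allocation baseline for the low-cost lifeform runtime.
- Complement OA-2 Mind/Face isolation: extra internal thinking does not leak into expression-layer CoT training.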

Cited papers

**Recurrent Depth** (2025), **Coconut** (2024), **Quiet-STaR** (2024), **Scaling LLM Test-Time Compute Optimally** (2024), **s1** (2025). See [`research/probe/02_axis_walkthrough.md`](../../research/probe/02_axis_walkthrough.md) A1 for details.