COG-5 · COG · P1/M · Spec-level · PROPOSED

R13: compression-reinforcement alternation as a first-class engineering procedure, not an implicit principle

—

Evaluation modality

Spec-level

A spec-motivation / governance borrow. Evaluated by spec review + contract tests, not A/B or ablation.

Primary owner: —
Phase-A verdict: —
Shadow profile: —
Source papers: NL 2025 + Algorithm Distillation 2022 + MesaNet 2025
Specs: docs/specs/multi-timescale-learning.md, docs/specs/thinking-loop.md, docs/specs/evidence_program.md

Blind spot

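R13 is central to the design laws: SSL compression → RL reinforcement, in alternation. But the current 26 entries are mostly local components: DM-5 imagination, DM-6 data value, EVO-5 proposer/verifier. What is missing is a direction that writes down, as an engineering procedure, how compression quality itself is measured, when reinforcement is triggered, and whether reinforcement acts only on the compressed structure.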

Adoptable suggestions

1. Add a "compression-reinforcement alternation evidence" subsection to [`docs/specs/multi-timescale-learning.md`](../specs/multi-timescale-learning.md). (PROPOSED)
Not a runnable A/B candidate — evaluated by the path above, not ablation.
2. Treat thinking-loop / snapshot replay exports as SSL compression artifacts, and define compression-quality readouts: prediction improvement, owner snapshot stability, and held-out reconstruction of semantic state. (PROPOSED)
Not a runnable A/B candidate — evaluated by the path above, not ablation.
3. Restrict the reinforcement phase to the controller / retention head / owner-internal policy; it must not reach around to Face tokens or substrate base weights. (PROPOSED)
Not a runnable A/B candidate — evaluated by the path above, not ablation.
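
Suggestions 2 and 3 could be pinned down by a contract test along the following lines. This is a minimal sketch under stated assumptions, not the spec's implementation: every identifier (`CompressionReadout`, `compression_passes`, `check_reinforcement_targets`, `may_reinforce`, the target names) and every threshold default is hypothetical and does not come from the linked specs. It only shows the shape of the contract: reinforcement starts when all three readouts clear a threshold, and an update plan that touches anything outside the allowlist is rejected.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical target names, mirroring suggestion 3; not taken from any spec.
ALLOWED_TARGETS = frozenset(
    {"controller", "retention_head", "owner_internal_policy"}
)


@dataclass(frozen=True)
class CompressionReadout:
    """One evaluation of a compression artifact (e.g. a snapshot replay export)."""

    prediction_improvement: float  # loss without artifact minus loss with it
    snapshot_stability: float      # similarity of consecutive owner snapshots, 0..1
    heldout_reconstruction: float  # share of held-out semantic state recovered, 0..1


def compression_passes(
    readout: CompressionReadout,
    min_improvement: float = 0.0,
    min_stability: float = 0.9,
    min_reconstruction: float = 0.8,
) -> bool:
    """Return True only if every readout clears its threshold.

    The default thresholds are placeholders chosen for illustration.
    """
    return (
        readout.prediction_improvement > min_improvement
        and readout.snapshot_stability >= min_stability
        and readout.heldout_reconstruction >= min_reconstruction
    )


def check_reinforcement_targets(targets: Iterable[str]) -> None:
    """Raise ValueError if the update plan touches anything off the allowlist."""
    illegal = sorted(set(targets) - ALLOWED_TARGETS)
    if illegal:
        raise ValueError(f"reinforcement may not update: {illegal}")


def may_reinforce(readout: CompressionReadout, targets: Iterable[str]) -> bool:
    """Gate for the alternation: validate targets, then check compression quality."""
    check_reinforcement_targets(targets)
    return compression_passes(readout)
```

Under these assumptions, `may_reinforce(CompressionReadout(0.12, 0.95, 0.85), {"controller"})` returns `True`, while adding `"face_token"` to the target set raises `ValueError` regardless of how good the readouts are. Checking targets before quality is a deliberate choice in this sketch: a forbidden target is a contract violation, not a "not yet" signal.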

Traceability

No plugins / runs linked yet.

Expected benefit

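- Turn R13 from a principle into an acceptance-testable pipeline.
- Make background-slow reflection produce a measurable compressed state, not only a text summary.
- Keep later learning engineering from degenerating into "train wherever there is a reward".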

Cited papers

**Nested Learning** (2025), **Algorithm Distillation** (2022), **MesaNet** (2025), **Mesa-Optimization in Transformers** (2023). For details, see [`research/core-author-paper-assessment-2026-05.md`](../../research/core-author-paper-assessment-2026-05.md) and [`research/probe/10_deep_synthesis_2026.md`](../../research/probe/10_deep_synthesis_2026.md), T4.