ADR: introduce docs/history/, merge 0011+0018, prune migration cruft

- CLAUDE.md: add ADR Lifecycle subsection (superseded → docs/history/, immutable numbering, no renumber)
- ADR-0011: merge ADR-0018 content as "Address Model: LA" section alongside PA / VA; status notes VA model is currently implemented
- ADR-0018 / 0029 / 0031: moved to docs/history/ with status updates (0018 merged into 0011, 0029 superseded by 0032, 0031 absorbed into 0001 rev 2)
- ADR-0019: rewrite Context as PE-HBM connectivity decision (self-contained, no LA model framing)
- ADR-0019/0020/0021/0023/0025/0027: Status Proposed → Accepted (code verified) and prune Implementation Notes / Affected files / Test strategy / "현재 상태" ("Current state") sub-sections describing pre-impl state
- ADR-0024/0026: same migration-flavor cleanup; 0026 also drops D6 Migration and D8 docs-update sub-decisions
- ADR-0030: status simplified (blocker ADR-0031 now superseded)
- SPEC.md: R10 + §0.2 reflect PA / VA / LA model names
- ADR-0008/0012/0013: refresh ADR-0011 subtitle in Links

21 files changed, 553 insertions(+), 1290 deletions(-).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-19 11:42:45 -07:00
parent ecc57d050d
commit 22fd0d2b9d
23 changed files with 553 additions and 1290 deletions
@@ -2,7 +2,7 @@

 ## Status

-Proposed
+Accepted

 ## Context

@@ -16,21 +16,6 @@ but do not actually read tensor data or perform computations.
 2. PE_GEMM, PE_MATH must be able to perform actual matrix operations and verify results
 3. Must minimize simulation performance degradation

-### Limitations of the Existing Kernel Execution Structure
-
-The current kernel execution is separated into 3 stages:
-
-```
-Phase 0: Kernel function execution in TLContext → PeCommand list generation (outside SimPy, no data)
-Phase 1: PE_CPU replays PeCommand list via SimPy (timing only)
-```
-
-Phase 0 requires the kernel to **complete execution entirely** before SimPy begins.
-`tl.load()` returns a TensorHandle (placeholder), so actual data cannot be accessed.
-Therefore, branching based on data values (dynamic control flow) is impossible.
-
-This ADR resolves this limitation **for memory operations only** (see D1, D3).
-
 ### Constraints

 - SimPy is a single-thread event loop — running numpy matmul inside it blocks everything
@@ -532,22 +517,3 @@ Per-dtype tolerance policy:
  (computations execute in Phase 2, result values are undetermined in Phase 1).
  Memory-data-based branching is supported via greenlet.
 - greenlet C extension dependency added (pip install greenlet)
-
---
-
-## Affected Files
-
-| File | Change |
-|------|--------|
-| `src/kernbench/components/base.py` | Add `_on_process_start/end` hooks |
-| `src/kernbench/common/pe_commands.py` | Add `data_op = True`, extend metadata fields |
-| `src/kernbench/sim_engine/op_log.py` | New: OpRecord, OpLogger |
-| `src/kernbench/sim_engine/data_executor.py` | New: DataExecutor, MemoryStore |
-| `src/kernbench/sim_engine/engine.py` | op_logger injection (optional) |
-| `src/kernbench/triton_emu/tl_context.py` | greenlet switch calls inside `tl.load()` etc. |
-| `src/kernbench/triton_emu/kernel_runner.py` | New: KernelRunner (greenlet ↔ SimPy bridge) |
-| `src/kernbench/components/builtin/pe_cpu.py` | Remove Phase 0, change to KernelRunner invocation |
-| `pyproject.toml` | Add greenlet dependency |
-
-Component implementation files (pe_gemm.py, pe_dma.py, hbm_ctrl.py, etc.): **no changes**
-Benchmark kernels (benches/*.py): **no user API changes**
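
Reviewer note: the ADR text in this diff states that memory-data-based branching is supported via greenlet, replacing a Phase 0 in which `tl.load()` returned only a placeholder. The sketch below illustrates that idea only. It is not the project's KernelRunner: it swaps in plain Python generators for greenlet and a simple loop for SimPy, and the names `branching_kernel`, `run_kernel`, the addresses, and the memory dict are all hypothetical.

```python
from typing import Callable, Dict, Generator, List

# Hypothetical protocol: the kernel yields an address it wants to read
# and is resumed with the value stored at that address.
KernelGen = Generator[int, float, str]


def branching_kernel() -> KernelGen:
    """Toy kernel whose control flow depends on a loaded value."""
    flag = yield 0x10              # stands in for tl.load(): suspend until data arrives
    if flag > 0.5:                 # data-dependent branch; not possible with a placeholder
        value = yield 0x20
        return f"hot path, value={value}"
    return "cold path"


def run_kernel(
    kernel: Callable[[], KernelGen],
    memory: Dict[int, float],
    trace: List[int],
) -> str:
    """Drive the kernel, servicing one load request per suspension.

    A real bridge would advance simulated time here before resuming
    the kernel; this sketch only records the order of requests.
    """
    gen = kernel()
    try:
        addr = next(gen)                   # run until the first load request
        while True:
            trace.append(addr)             # record the memory operation
            addr = gen.send(memory[addr])  # resume the kernel with real data
    except StopIteration as done:
        return done.value                  # the kernel's return value


hot_trace: List[int] = []
hot_result = run_kernel(branching_kernel, {0x10: 1.0, 0x20: 3.5}, hot_trace)

cold_trace: List[int] = []
cold_result = run_kernel(branching_kernel, {0x10: 0.0}, cold_trace)
```

With the first memory image the kernel issues two loads and takes the hot path; with the second it issues one load and returns early. The set of operations is decided while the driver runs, not in a pass that completes beforehand.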