ADR: introduce docs/history/, merge 0011+0018, prune migration cruft

- CLAUDE.md: add ADR Lifecycle subsection (superseded → docs/history/,
  immutable numbering, no renumber)
- ADR-0011: merge ADR-0018 content as "Address Model: LA" section
  alongside PA / VA; status notes VA model is currently implemented
- ADR-0018 / 0029 / 0031: moved to docs/history/ with status updates
  (0018 merged into 0011, 0029 superseded by 0032, 0031 absorbed
  into 0001 rev 2)
- ADR-0019: rewrite Context as PE-HBM connectivity decision
  (self-contained, no LA model framing)
- ADR-0019/0020/0021/0023/0025/0027: Status Proposed → Accepted
  (code verified) and prune Implementation Notes / Affected files /
  Test strategy / "현재 상태" sub-sections describing pre-impl state
- ADR-0024/0026: same migration-flavor cleanup; 0026 also drops D6
  Migration and D8 docs-update sub-decisions
- ADR-0030: status simplified (blocker ADR-0031 now superseded)
- SPEC.md: R10 + §0.2 reflect PA / VA / LA model names
- ADR-0008/0012/0013: refresh ADR-0011 subtitle in Links

21 files changed, 553 insertions(+), 1290 deletions(-).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-19 11:42:45 -07:00
parent ecc57d050d
commit 22fd0d2b9d
23 changed files with 553 additions and 1290 deletions
+1 -90
View File
@@ -2,7 +2,7 @@
## Status
Proposed (Revision 2 — Address-based matching; peer_direction field dropped)
Accepted (Revision 2 — Address-based matching; peer_direction field dropped)
## Context
@@ -13,34 +13,6 @@ topology / dict-order에 의존하지 않고 **주소 기반**으로 일관되
2-rank bidirectional ring (또는 여러 direction이 동일 peer를 가리키는
topology 일반)에서 정확히 동작하도록 한다.
### 현재 상태 (ADR-0023 D9 구현)
`src/kernbench/components/builtin/pe_ipcq.py``_handle_meta_arrival`:
```python
def _handle_meta_arrival(self, msg: IpcqMetaArrival) -> None:
token = msg.token
sender_key = (token.src_sip, token.src_cube, token.src_pe)
for d, qp in self._queue_pairs.items():
p = qp["peer"]
if (p.sip, p.cube, p.pe) == sender_key:
qp["peer_head_cache"] = max(qp["peer_head_cache"], token.sender_seq + 1)
# ... wake recv waiters ...
return
```
`_credit_worker`도 동일한 "sender-coord-first-match" 패턴.
`src/kernbench/ccl/install.py``reverse_direction`:
```python
def reverse_direction(my_rank: int, peer_rank: int) -> str | None:
for d, target in neighbor_table[peer_rank].items():
if target == my_rank:
return d
return None
```
### 드러난 버그 — 2-rank bidirectional ring
`ring_1d(rank, world_size=2)``{"E": 1, "W": 1}` (rank 0). 양쪽 방향이 같은 peer.
@@ -289,51 +261,6 @@ for plan in plans:
---
## Test strategy
### T1. Unit — `reverse_direction` opposite-preference
`tests/test_ccl_install.py` (확장):
- Ring ws=2: `reverse_direction(0, 1, "E")` → "W", `reverse_direction(0, 1, "W")` → "E"
- Ring ws=4: `reverse_direction(0, 1, "E")` → "W" (자연스러운 opposite)
- Mesh 2×2: `reverse_direction(r, peer, "N")` → "S", "E" ↔ "W"
- Tree binary: opposite 없는 direction (parent) → fallback 경로
- Non-symmetric topology: opposite가 peer에 없고 다른 direction만 있는 경우
### T2. Runtime — `_handle_meta_arrival` dst_addr 매칭
`tests/test_pe_ipcq.py` (확장):
- 2-rank pair install 후, E direction dst_addr로 meta arrival → E의 `peer_head_cache`
증가 (W는 불변)
- W direction dst_addr로 meta arrival → W의 `peer_head_cache` 증가
- 잘못된 dst_addr (어느 rx range에도 속하지 않음) → 에러 또는 silent drop
(결정 후 명시)
### T3. Credit — `dst_rx_base_pa` 매칭
`tests/test_pe_ipcq.py` (확장):
- E direction send 후 peer가 consume → credit에 자기 W의 `my_rx_base_pa`
담아 송신 → sender의 E direction `peer_tail_cache` 증가
- W direction도 동일
### T4. E2E — 2-rank bidirectional ring
`tests/test_ipcq_e2e.py`:
- 2-rank ring_1d로 tl.send(E) + tl.recv(W) pattern이 양방향으로 작동
- ADR-0024의 `test_ccl_allreduce_matrix.py`에서 ring at ws=2가 통과
### T5. Install invariant — rx_base range disjointness
`tests/test_ccl_install_plan.py` (확장):
- I3.1 검증: `build_install_plans` 결과에서 모든 qp의 rx_base range가 disjoint
### T6. 회귀
- 기존 ws≥3 ring / mesh / tree 테스트 그대로 통과
- `test_pe_ipcq`, `test_ipcq_e2e` 기존 케이스 회귀
---
## Consequences
### Positive
@@ -354,19 +281,3 @@ for plan in plans:
- IPCQ protocol의 semantic layer (sender가 dst_addr 계산, receiver가 수신)는
불변.
---
## Affected files
| File | Change |
|------|--------|
| `src/kernbench/ccl/install.py` | D1: `reverse_direction``my_dir` 인자 추가, opposite-preference |
| `src/kernbench/components/builtin/pe_ipcq.py` | D2: `_handle_meta_arrival` dst_addr 매칭 / D3: `_credit_worker` dst_rx_base_pa 매칭 / `_delayed_credit_send``dst_rx_base_pa` 필드 채움 |
| `src/kernbench/common/ipcq_types.py` | D3: `IpcqCreditMetadata``dst_rx_base_pa` 필드 추가 |
| `src/kernbench/ccl/install_plan.py` (ADR-0024 신규) | D6: I3.1 invariant 검증 (optional) |
| `docs/adr/ADR-0023-ipcq-pe-collective.md` | Reference note: runtime 매칭 방식이 ADR-0025에서 바뀜 |
| `tests/test_ccl_install.py` | T1 |
| `tests/test_pe_ipcq.py` | T2, T3 |
| `tests/test_ipcq_e2e.py` | T4 |
| `tests/test_ccl_install_plan.py` | T5 |