ADR: bilingual structure — EN canonical in adr/, KO mirror in adr-ko/
Establish English as the canonical ADR language with Korean translations
held in a parallel docs/adr-ko/ tree as derived artifacts (1:1 mirror).
Promotion from adr-proposed/ to adr/ now writes English to adr/ and the
Korean to adr-ko/; bidirectional sync rule documented in CLAUDE.md.

- Migrate 30 ADRs in docs/adr/: 28 Korean-only translated to English,
  2 bilingual pairs (ADR-0020, ADR-0023) consolidated (.en.md suffix
  dropped). ADR-0023 EN regenerated against KO source which had newer
  HW Realization Notes (D16-D23) section.
- docs/adr-history/ left frozen by design (transitional state).
- CLAUDE.md (Part 2): update ADR Lifecycle for 4-folder layout, mark
  docs/adr-ko/ as a Derived Artifact, add ADR Translation Discipline
  section covering bidirectional sync, conflict resolution (EN wins),
  and proposed-language freedom.
- tools/verify_adr_lang_pairs.py: new verification tool checking pair
  completeness, filename mirroring, ADR-ID match, Status byte-equality.
  Pre-commit hook intentionally not added; run on demand or in CI.
- tests/test_verify_adr_lang_pairs.py: 11 cases including CRLF/LF
  normalization, em-dash title separator, underscore-slug edge case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -6,51 +6,58 @@ Accepted (Revision 2 — Address-based matching; peer_direction field dropped)

## Context

### Goal

In the IPCQ protocol of ADR-0023, make the **identification of "which
direction pair this transfer belongs to"** consistent and **address-based**,
without depending on topology / dict-order. It must work correctly in a
2-rank bidirectional ring (and more generally in any topology where
multiple directions point to the same peer).

### The bug surfaced: 2-rank bidirectional ring

`ring_1d(rank, world_size=2)` → `{"E": 1, "W": 1}` (rank 0). Both directions
point to the same peer.

**Bug 1 (install)**:
- `reverse_direction(0, 1)` returns "E" by dict order (wrong; "W" is the
  correct answer under the opposite-direction convention)
- rank 0's E entry is set with `peer.rx_base_pa = rx_base(sip1, cube0, pe0, d="E")`
- `tl.send(E)` → data lands in sip1's E-rx buffer (should be W-rx)

**Bug 2 (runtime)**:
- Even if install set up the correct address, the receiver's
  `_handle_meta_arrival` matches direction by sender coordinates only → the
  first direction (E) wins
- `peer_head_cache[E]` is incremented; `peer_head_cache[W]` is unchanged
- The kernel's `tl.recv(W)` waits on `peer_head_cache[W]` → blocks forever →
  `IpcqDeadlock`

### Root cause

The same issue along two axes:
1. **Install-time pairing**: deciding "which of my directions pairs with
   which direction of the peer" depends on dict-iteration-order → fragile
   when multiple directions point to the same peer
2. **Runtime identification**: deciding "which qp should be updated" is
   based on sender coordinates alone → ambiguous when directions are
   duplicated
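
Both failure modes can be reproduced in isolation. The following is a minimal sketch, not the project's code: `ring_1d` is a simplified stand-in for the real topology helper, and `naive_reverse_direction` and `match_by_sender` are hypothetical functions that mimic the dict-order pairing and the sender-only matching described in this section.

```python
from __future__ import annotations

# Illustrative sketch only: simplified stand-ins, not kernbench's implementation.


def ring_1d(rank: int, world_size: int) -> dict[str, int]:
    """Neighbors of `rank` on a bidirectional 1-D ring (assumed convention)."""
    return {"E": (rank + 1) % world_size, "W": (rank - 1) % world_size}


def naive_reverse_direction(my_rank: int, peer_rank: int, world_size: int) -> str | None:
    """Dict-order lookup: first peer direction that points back at my_rank."""
    for d, r in ring_1d(peer_rank, world_size).items():
        if r == my_rank:
            return d
    return None


def match_by_sender(queue_pairs: dict[str, int], sender_rank: int) -> str | None:
    """Runtime matching on sender identity alone: the first direction wins."""
    for d, peer_rank in queue_pairs.items():
        if peer_rank == sender_rank:
            return d
    return None


nbrs = ring_1d(0, 2)  # both directions point at rank 1
# Install-time axis: every direction of rank 0 gets paired with "E" on the peer.
pairing = {d: naive_reverse_direction(0, p, 2) for d, p in nbrs.items()}
# Runtime axis: rank 1 cannot tell which of its directions rank 0 targeted.
matched = match_by_sender(ring_1d(1, 2), sender_rank=0)
print(nbrs, pairing, matched)
```

With `world_size=2` both of rank 0's directions are paired with the peer's "E", and the receiver always resolves the sender to "E", which is the ambiguity behind Bug 1 and Bug 2.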

### Solution direction: address-based matching

Each PE's rx buffer sits at a **unique address range per direction**
(rx_base_pa + direction_idx × bytes_per_direction). Therefore:

- **Runtime**: match by **dst_addr range** instead of sender coord →
  unambiguous
- **Install**: prefer the opposite direction as a heuristic (the natural
  symmetry of ring / mesh)
- No need for redundant metadata like `peer_direction`: the **address is the
  single source of truth**

This design works **independently of the PhysAddr transition (ADR-0030)**.
Whether the current addresses are synthetic or PhysAddr, the same approach
applies as long as the per-direction range uniqueness is preserved.
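
As a rough illustration of why a destination address alone identifies the direction, the sketch below lays out one rx slot per direction and resolves an address back to its slot. The direction order, the slot size and the function names (`rx_range`, `direction_for`) are assumptions made for this example; only the `rx_base_pa + direction_idx × bytes_per_direction` layout comes from the text above.

```python
from __future__ import annotations

# Hypothetical layout constants, chosen only for illustration.
DIRECTIONS = ("N", "E", "S", "W")
BYTES_PER_DIRECTION = 0x1000


def rx_range(rx_base_pa: int, direction: str) -> range:
    """Address range of one direction slot: base + idx * bytes_per_direction."""
    start = rx_base_pa + DIRECTIONS.index(direction) * BYTES_PER_DIRECTION
    return range(start, start + BYTES_PER_DIRECTION)


def direction_for(rx_base_pa: int, dst_addr: int) -> str | None:
    """Return the direction whose slot contains dst_addr, or None if unknown."""
    for d in DIRECTIONS:
        if dst_addr in rx_range(rx_base_pa, d):
            return d
    return None


base = 0x8000_0000
w_slot = rx_range(base, "W")
print(direction_for(base, w_slot.start + 64))  # address inside the W slot
print(direction_for(base, base - 1))           # address outside every slot
```

Because the slots are disjoint, at most one direction can match, regardless of how many directions share the same peer.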

---

@@ -91,17 +98,17 @@ def reverse_direction(my_rank: int, peer_rank: int, my_dir: str) -> str | None:
    return None
```

Call site:

```python
for d, peer_rank in nbrs.items():
    peer_dir = reverse_direction(r, peer_rank, d)  # pass my_dir
    if peer_dir is None:
        continue
    ...
```
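
The body of `reverse_direction` is not visible in this hunk. One possible shape of the opposite-preference heuristic is sketched below, under stated assumptions: the `OPPOSITE` table, the function name `reverse_direction_sketch`, the explicit `peer_nbrs` argument and the fallback to the first matching direction are all hypothetical and may differ from the actual implementation.

```python
from __future__ import annotations

OPPOSITE = {"E": "W", "W": "E", "N": "S", "S": "N"}  # assumed direction naming


def reverse_direction_sketch(
    my_rank: int, my_dir: str, peer_nbrs: dict[str, int]
) -> str | None:
    """Pick the peer direction that pairs with my_dir.

    Prefer the opposite of my_dir when it points back at my_rank; otherwise
    fall back to the first peer direction that does (assumed fallback).
    """
    preferred = OPPOSITE.get(my_dir)
    if preferred is not None and peer_nbrs.get(preferred) == my_rank:
        return preferred
    for d, r in peer_nbrs.items():
        if r == my_rank:
            return d
    return None


peer_nbrs = {"E": 0, "W": 0}  # rank 1's neighbors in a 2-rank ring
print(reverse_direction_sketch(0, "E", peer_nbrs))
print(reverse_direction_sketch(0, "W", peer_nbrs))
```

In the 2-rank ring this pairs E with W and W with E, so each direction gets its own rx slot on the peer instead of both collapsing onto the first dict entry.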

### D2. Runtime: `_handle_meta_arrival` dst_addr matching

`src/kernbench/components/builtin/pe_ipcq.py`:

@@ -138,9 +145,10 @@ def _handle_meta_arrival(self, msg: IpcqMetaArrival) -> None:
# Unknown dst_addr: diagnostic log (should not happen under correct install)
```

The sender-coordinate check is **removed**. `dst_addr` already determines
the direction.

### D3. Credit: add `dst_rx_base_pa` field

`src/kernbench/common/ipcq_types.py`:

@@ -148,25 +156,26 @@ Sender 좌표 검사는 **제거**. `dst_addr`가 이미 direction을 결정.
@dataclass(frozen=True)
class IpcqCreditMetadata:
    consumer_seq: int
    dst_rx_base_pa: int  # NEW: matches the original sender's peer.rx_base_pa
    # Existing fields (kept for diagnostic / logging purposes)
    src_sip: int
    src_cube: int
    src_pe: int
    src_direction: str
```

When the credit is generated (`_delayed_credit_send`), it carries this
direction's `my_rx_base_pa` as `dst_rx_base_pa` (this is the
`peer.rx_base_pa` the other side used when it was the sender).

Receiver side (`_credit_worker`):

```python
def _credit_worker(self, env):
    while True:
        credit = yield self._credit_inbox.get()
        for d, qp in self._queue_pairs.items():
            # Find the qp whose peer rx_base_pa matches the credit's dst_rx_base_pa
            if qp["peer"].rx_base_pa == credit.dst_rx_base_pa:
                qp["peer_tail_cache"] = max(qp["peer_tail_cache"],
                                            credit.consumer_seq)
@@ -178,41 +187,45 @@ def _credit_worker(self, env):
                break
```

The sender-coordinate check is removed. Matching by `dst_rx_base_pa` is
unambiguous.
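
The credit path can be exercised without the simulator. The sketch below is a trimmed, hypothetical model: `Credit` and `QueuePair` stand in for `IpcqCreditMetadata` and the qp dict entries, and `apply_credit` mirrors only the matching step of `_credit_worker`, not its event loop.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Credit:  # hypothetical, trimmed stand-in for IpcqCreditMetadata
    consumer_seq: int
    dst_rx_base_pa: int


@dataclass
class QueuePair:  # hypothetical stand-in for one qp entry
    peer_rx_base_pa: int
    peer_tail_cache: int = 0


def apply_credit(queue_pairs: dict[str, QueuePair], credit: Credit) -> str | None:
    """Advance the tail cache of the qp whose peer rx base matches the credit."""
    for d, qp in queue_pairs.items():
        if qp.peer_rx_base_pa == credit.dst_rx_base_pa:
            # max() keeps the cache monotonic if credits arrive out of order.
            qp.peer_tail_cache = max(qp.peer_tail_cache, credit.consumer_seq)
            return d
    return None


# Both directions target the same peer, but each has its own rx slot address.
qps = {"E": QueuePair(peer_rx_base_pa=0x3000), "W": QueuePair(peer_rx_base_pa=0x1000)}
hit = apply_credit(qps, Credit(consumer_seq=5, dst_rx_base_pa=0x1000))
print(hit, qps["E"].peer_tail_cache, qps["W"].peer_tail_cache)
```

Only the W entry advances even though E is iterated first, which is the behavior the sender-coordinate check could not provide.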

### D4. Do **not** add a `peer_direction` field to `IpcqInitEntry`

The `IpcqInitEntry.peer_direction` proposed in ADR-0025 rev 1 is
**unnecessary**. Reasons:
- Meta arrivals are matched by dst_addr (D2)
- Credits are matched by dst_rx_base_pa (D3)
- No need to store peer_direction on qp
- Install only uses peer_dir internally when computing rx_base_pa
  (`reverse_direction`)

No change to the IpcqInitEntry schema. **Simpler** than rev 1.

### D5. Keep `IpcqDmaToken.src_direction` (diagnostic only)

The existing `src_direction` field is not removed. It is retained for:
- Logging / trace: the `(rank, t, dir, nbytes)` output of
  `KERNBENCH_CCL_TRACE=1`
- Diagnostics: showing direction in pointer_dump, etc.
- Room for future extension

Runtime matching uses only `dst_addr`.

### D6. Invariants (strengthens ADR-0023 I3)

**I3 (strict)**: For each direction pair `(my_direction, peer_direction)`,
my rx_base and peer rx_base must point to **distinct direction slots**.
Install must guarantee this (reverse_direction opposite-preference).

**I3.1 (new)**: For every qp, `qp["my_rx_base_pa"]` and
`qp["peer"].rx_base_pa` occupy mutually disjoint address ranges (buffers
of different directions never overlap). This is the prerequisite for the
address-based matching of D2/D3.

Verifiable at install time:
```python
# ccl/install_plan.py: assertion at the end of build_install_plans
all_rx_ranges = set()
for plan in plans:
    for pe_install in plan.pe_installs:
@@ -228,36 +241,42 @@ for plan in plans:

## Dependencies

- **ADR-0023** (IPCQ protocol): this ADR modifies ADR-0023's runtime
  matching logic (D2, D3) and improves the install heuristic (D1). No
  change to the IPCQ protocol's semantic layer.
- **ADR-0024** (launcher): the 2-rank bidirectional ring actually occurs in
  ADR-0024's ws=SIP_count model. This ADR makes that case work.
- **ADR-0030** (PhysAddr transition, stub): **independent**. ADR-0025's
  address-based matching works identically whether the current addresses
  are synthetic or PhysAddr.

---

## Non-goals

- **Migrating IPCQ addressing to PhysAddr**: ADR-0030 scope. This ADR is
  agnostic to how addresses are encoded.
- **Multi-hop routing**: the single-hop DMA write assumption of ADR-0023
  D5 still holds.
- **Unidir ring specialization**: `ring_1d_unidir` only has a single
  direction, so the bug does not apply.

---

## Open questions

- **Address-matching performance**: `_handle_meta_arrival` and
  `_credit_worker` iterate qp linearly (max 4 directions). The performance
  impact is negligible. If it becomes an issue, this can be switched to a
  dict lookup (`_qp_by_rx_base`).
- **Re-evaluating the need for `IpcqDmaToken.src_direction`**: whether to
  keep this diagnostics-only field or to split it out of logging.
  Currently retained.
- **Cost of install-time invariant verification**: the I3.1 verification
  of D6 is O((N_PE × N_direction)^2). It could be slow on large topologies;
  data structures such as interval trees could improve it. Simple
  implementation first.
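
If the quadratic I3.1 check does become a bottleneck, a sort-based sweep is one simpler alternative to an interval tree. The sketch below swaps in that technique; it is not taken from the ADR or from `build_install_plans`. `find_overlap` is a hypothetical helper that assumes non-empty, half-open `(start, end)` ranges.

```python
from __future__ import annotations


def find_overlap(
    ranges: list[tuple[int, int]],
) -> tuple[tuple[int, int], tuple[int, int]] | None:
    """Return one overlapping pair of half-open [start, end) ranges, or None.

    After sorting by start address, any overlap in the set implies that some
    adjacent pair overlaps, so a single pass suffices: O(n log n) overall.
    """
    ordered = sorted(ranges)
    for prev, cur in zip(ordered, ordered[1:]):
        if cur[0] < prev[1]:
            return prev, cur
    return None


disjoint = [(0x0000, 0x1000), (0x1000, 0x2000), (0x3000, 0x4000)]
clashing = [(0x1000, 0x2000), (0x0000, 0x1800)]
print(find_overlap(disjoint))
print(find_overlap(clashing))
```

This only reports whether the invariant holds and gives one witness pair. Listing every overlapping pair would still need an interval tree or a fuller sweep.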

---

@@ -265,19 +284,26 @@ for plan in plans:

### Positive

- **Simplicity**: redundant `peer_direction` metadata removed. The address
  is the single source of truth.
- **Unambiguous matching**: works on every topology (including duplicate
  directions).
- **Minimal schema changes**: `IpcqInitEntry` unchanged, one field added
  to `IpcqCreditMetadata`.
- **Independent of PhysAddr transition (ADR-0030)**: address-based matching
  is agnostic to the address encoding.
- **Diagnostics retained**: `IpcqDmaToken.src_direction` is kept for
  logging.

### Negative

- Runtime matching is now by address comparison, so answering a debugging
  question like "why did peer_head_cache[W] update rather than [E]"
  requires tracing the address range (previously the direction name was
  enough). Mitigation: include a "direction ↔ rx_base_pa" mapping in
  pointer_dump.

### Neutral

- The semantic layer of the IPCQ protocol (sender computes dst_addr,
  receiver receives) is unchanged.