Establish English as the canonical ADR language with Korean translations held in a parallel docs/adr-ko/ tree as derived artifacts (1:1 mirror). Promotion from adr-proposed/ to adr/ now writes English to adr/ and the Korean to adr-ko/; bidirectional sync rule documented in CLAUDE.md. - Migrate 30 ADRs in docs/adr/: 28 Korean-only translated to English, 2 bilingual pairs (ADR-0020, ADR-0023) consolidated (.en.md suffix dropped). ADR-0023 EN regenerated against KO source which had newer HW Realization Notes (D16-D23) section. - docs/adr-history/ left frozen by design (transitional state). - CLAUDE.md (Part 2): update ADR Lifecycle for 4-folder layout, mark docs/adr-ko/ as a Derived Artifact, add ADR Translation Discipline section covering bidirectional sync, conflict resolution (EN wins), and proposed-language freedom. - tools/verify_adr_lang_pairs.py: new verification tool checking pair completeness, filename mirroring, ADR-ID match, Status byte-equality. Pre-commit hook intentionally not added; run on demand or in CI. - tests/test_verify_adr_lang_pairs.py: 11 cases including CRLF/LF normalization, em-dash title separator, underscore-slug edge case. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
11 KiB
ADR-0025: IPCQ Direction Addressing — address-based matching
Status
Accepted (Revision 2 — Address-based matching; peer_direction field dropped)
Context
Goal
In the IPCQ protocol of ADR-0023, make the identification of "which direction pair this transfer belongs to" consistent and address-based, without depending on topology / dict-order. It must work correctly in a 2-rank bidirectional ring (and more generally in any topology where multiple directions point to the same peer).
The bug surfaced — 2-rank bidirectional ring
ring_1d(rank, world_size=2) → {"E": 1, "W": 1} (rank 0). Both directions
point to the same peer.
Bug 1 (install):
reverse_direction(0, 1)→ returns "E" by dict order (wrong; "W" is the correct answer — opposite-direction convention)- rank 0's E entry is set with
peer.rx_base_pa = rx_base(sip1, cube0, pe0, d="E") - tl.send(E) → data lands in sip1's E-rx buffer (should be W-rx)
Bug 2 (runtime):
- Even if install set up the correct address, the receiver's
_handle_meta_arrivalmatches direction by sender coordinates only → the first direction (E) wins - peer_head_cache[E] is incremented; peer_head_cache[W] is unchanged
- The kernel's tl.recv(W) waits on peer_head_cache[W] → blocks forever → IpcqDeadlock
Root cause
The same issue along two axes:
- Install-time pairing: deciding "which of my directions pairs with which direction of the peer" depends on dict-iteration-order → fragile when multiple directions point to the same peer
- Runtime identification: deciding "which qp should be updated" is based on sender coordinates alone → ambiguous when directions are duplicated
Solution direction — address-based matching
Each PE's rx buffer sits at a unique address range per direction (rx_base_pa + direction_idx × bytes_per_direction). Therefore:
- Runtime: match by dst_addr range instead of sender coord → unambiguous
- Install: prefer the opposite direction as a heuristic (the natural symmetry of ring / mesh)
- No need for redundant metadata like
peer_direction— address is the single source of truth
This design works independently of the PhysAddr transition (ADR-0030). Whether the current addresses are synthetic or PhysAddr, the same approach applies as long as the per-direction range uniqueness is preserved.
Decision
D1. Install — reverse_direction opposite-preference
src/kernbench/ccl/install.py:
# Extended in ADR-0032 with global_* pairs for inter-SIP directions,
# which were introduced by configure_sfr_intercube_multisip to keep
# intercube (N/S/E/W) and inter-SIP (global_N/S/E/W) namespaces disjoint.
_OPPOSITE_DIR = {
"E": "W", "W": "E", "N": "S", "S": "N",
"global_E": "global_W", "global_W": "global_E",
"global_N": "global_S", "global_S": "global_N",
}
def reverse_direction(my_rank: int, peer_rank: int, my_dir: str) -> str | None:
"""Find peer's direction that reciprocates my_dir→peer_rank.
Prefer the OPPOSITE direction (E↔W, N↔S) when the peer has it
pointing back to us. This matters in 2-rank bidirectional rings
where both E and W on one side point to the same peer — without
the preference, the first-match-wins iteration would route data
into the wrong rx slot. Falls back to any direction pointing back
for topologies without an opposite convention (tree_binary's
parent/child).
"""
nt = neighbor_table[peer_rank]
opp = _OPPOSITE_DIR.get(my_dir)
if opp is not None and nt.get(opp) == my_rank:
return opp
for d, target in nt.items():
if target == my_rank:
return d
return None
Call site:
for d, peer_rank in nbrs.items():
peer_dir = reverse_direction(r, peer_rank, d) # pass my_dir
if peer_dir is None:
continue
...
D2. Runtime — _handle_meta_arrival dst_addr matching
src/kernbench/components/builtin/pe_ipcq.py:
def _handle_meta_arrival(self, msg: IpcqMetaArrival) -> None:
"""Match incoming token to the receiver-side direction by dst_addr range.
Each direction has a unique rx buffer address range
(my_rx_base_pa + n_slots * slot_size). The token's dst_addr (set by
the sender's IPCQ when computing peer's slot address) falls within
exactly one such range. This address-based matching is unambiguous
even when multiple directions have the same peer (2-rank ring).
"""
token = msg.token
dst_addr = token.dst_addr
for d, qp in self._queue_pairs.items():
base = qp["my_rx_base_pa"]
size = qp["n_slots"] * qp["slot_size"]
if base <= dst_addr < base + size:
qp["peer_head_cache"] = max(qp["peer_head_cache"],
token.sender_seq + 1)
self._arrived_tokens.setdefault(d, []).append(token)
waiters = self._recv_waiters.get(d, [])
self._recv_waiters[d] = []
for ev in waiters:
if not ev.triggered:
ev.succeed()
any_waiters = self._any_recv_waiters
self._any_recv_waiters = []
for ev in any_waiters:
if not ev.triggered:
ev.succeed()
return
# Unknown dst_addr — diagnostic log (should not happen under correct install)
The sender-coordinate check is removed. dst_addr already determines
the direction.
D3. Credit — add dst_rx_base_pa field
src/kernbench/common/ipcq_types.py:
@dataclass(frozen=True)
class IpcqCreditMetadata:
consumer_seq: int
dst_rx_base_pa: int # NEW: matches the original sender's peer.rx_base_pa
# Existing fields (kept for diagnostic / logging purposes)
src_sip: int
src_cube: int
src_pe: int
src_direction: str
When the credit is generated (_delayed_credit_send): it carries this
direction's my_rx_base_pa as dst_rx_base_pa (this is the
peer.rx_base_pa the other side used when it was the sender).
Receiver side (_credit_worker):
def _credit_worker(self, env):
while True:
credit = yield self._credit_inbox.get()
for d, qp in self._queue_pairs.items():
# Find the qp whose peer rx_base_pa matches the credit's dst_rx_base_pa
if qp["peer"].rx_base_pa == credit.dst_rx_base_pa:
qp["peer_tail_cache"] = max(qp["peer_tail_cache"],
credit.consumer_seq)
waiters = self._send_waiters.get(d, [])
self._send_waiters[d] = []
for ev in waiters:
if not ev.triggered:
ev.succeed()
break
Sender-coordinate check removed. Matching by dst_rx_base_pa is
unambiguous.
D4. Do not add a peer_direction field to IpcqInitEntry
The IpcqInitEntry.peer_direction proposed in ADR-0025 rev 1 is
unnecessary. Reasons:
- Meta arrivals are matched by dst_addr (D2)
- Credits are matched by dst_rx_base_pa (D3)
- No need to store peer_direction on qp
- Install only uses peer_dir internally when computing rx_base_pa
(
reverse_direction)
No change to the IpcqInitEntry schema. Simpler than rev 1.
D5. Keep IpcqDmaToken.src_direction (diagnostic only)
The existing src_direction field is not removed. It is retained for:
- Logging / trace: the
(rank, t, dir, nbytes)output ofKERNBENCH_CCL_TRACE=1 - Diagnostics: showing direction in pointer_dump, etc.
- Room for future extension
Runtime matching uses only dst_addr.
D6. Invariants (strengthens ADR-0023 I3)
I3 (strict): For each direction pair (my_direction, peer_direction),
my rx_base and peer rx_base must point to distinct direction slots.
Install must guarantee this (reverse_direction opposite-preference).
I3.1 (new): For every qp, qp["my_rx_base_pa"] and
qp["peer"].rx_base_pa occupy mutually disjoint address ranges (buffers
of different directions never overlap). This is the prerequisite for the
address-based matching of D2/D3.
Verifiable at install time:
# ccl/install_plan.py: assertion at the end of build_install_plans
all_rx_ranges = set()
for plan in plans:
for pe_install in plan.pe_installs:
for entry in pe_install.neighbors:
r = (entry.my_rx_base_pa,
entry.my_rx_base_pa + plan.n_slots * plan.slot_size)
overlap = any(_ranges_overlap(r, e) for e in all_rx_ranges)
assert not overlap
all_rx_ranges.add(r)
Dependencies
- ADR-0023 (IPCQ protocol): this ADR modifies ADR-0023's runtime matching logic (D2, D3) and improves the install heuristic (D1). No change to the IPCQ protocol's semantic layer.
- ADR-0024 (launcher): the case where a 2-rank bidirectional ring is actually used is the ws=SIP_count model of ADR-0024. This ADR makes that case work.
- ADR-0030 (PhysAddr transition, stub): independent — ADR-0025's address-based matching works identically whether the current addresses are synthetic or PhysAddr.
Non-goals
- Migrating IPCQ addressing to PhysAddr: ADR-0030 scope. This ADR is agnostic to how addresses are encoded.
- Multi-hop routing: the single-hop DMA write assumption of ADR-0023 D5 still holds.
- Unidir ring specialization:
ring_1d_unidironly has a single direction, so the bug does not apply.
Open questions
- Address-matching performance:
_handle_meta_arrivaland_credit_workeriterate qp linearly (max 4 directions). The performance impact is negligible. If it becomes an issue, this can be switched to a dict lookup (_qp_by_rx_base). - Re-evaluating the need for
IpcqDmaToken.src_direction: whether to keep this field, which is only kept for diagnostics, or to split it out of logging. Currently retained. - Cost of install-time invariant verification: the I3.1 verification of D6 is O(N_PE × N_direction)^2. It could be slow on large topologies → improvable via data structures such as interval trees. Simple implementation first.
Consequences
Positive
- Simplicity: redundant
peer_directionmetadata removed. Address is the single source of truth. - Unambiguous matching: works on every topology (including duplicate directions).
- Minimal schema changes:
IpcqInitEntryunchanged, one field added toIpcqCreditMetadata. - Independent of PhysAddr transition (ADR-0030): address-based matching is agnostic to the address encoding.
- Diagnostics retained:
IpcqDmaToken.src_directionis kept for logging.
Negative
- Runtime matching is now by address comparison, so when debugging questions like "why did peer_head_cache[W] update rather than [E]" one has to follow the address range (previously the direction name was enough). Mitigation: include a "direction ↔ rx_base_pa" mapping in pointer_dump.
Neutral
- The semantic layer of the IPCQ protocol (sender computes dst_addr, receiver receives) is unchanged.