adr: add ADR-0050-0053 — close /report's second-pass G4 candidates

Documents four cross-cutting surfaces one layer deeper than the prior
G4 batch:

- 0050 par-ccl-algorithm-module-contract: how to author a new CCL
  algorithm in src/kernbench/ccl/algorithms/. Pairs with ADR-0045's
  bench-module contract. Pins the four required public symbols
  (kernel, kernel_args, TOPO_NAME_TO_KIND constants, kernel alias),
  the 9 + tl standardized kernel signature, the kernel_args tuple
  format, sip_topo_kind dispatch, and the ccl.yaml entry workflow.

- 0051 lat-routing-helper-api: every public method of AddressResolver
  (resolve, find_m_cpu, find_pcie_ep, find_io_cpu, find_all_pcie_eps)
  and PathRouter (find_path, find_path_with_distance,
  find_mcpu_dma_path, find_memory_path, find_node_path + 2 shims).
  Pins the four adjacency graphs (_adj_all / _adj / _adj_mcpu_dma /
  _adj_local) and the edge-kind exclusion sets they use, plus the
  single-owner naming convention.

- 0052 dev-oplog-memory-store-schemas: OpRecord's 7 fields, the
  per-op_name params matrix (dma_read, dma_write, gemm_*, math, math
  reduction, composite_gemm, ipcq_copy, unknown), snapshot timing
  rules (math = all inputs, dma_write = HBM-only — ADR-0027 race
  avoidance), TileToken stage_type capture, and MemoryStore's
  (space, addr) two-level dict with reference-store semantics.

- 0053 dev-topology-builder-algorithms: the 6-stage compile pipeline,
  cube_mesh.yaml's source_hash cache and its 5 input fields, the
  cube NoC auto-layout algorithm (row/col placement, HBM exclusion
  zone, PE/M_CPU/SRAM attachment via nearest-router, UCIe N/S/E/W
  distribution), the node naming convention (single-owner with
  router.py), the edge-kind catalog, the 4 view projections, and a
  table of spec-field changes vs mesh regeneration.

Bilingual pair verifier passes for all four EN/KO pairs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 10:52:42 -07:00
parent 9a02955770
commit bd49c93703
8 changed files with 2566 additions and 0 deletions
@@ -0,0 +1,351 @@
# ADR-0053: Topology Builder + Visualizer Algorithms
## Status
Accepted (2026-05-22).
Pins down the key algorithmic choices of the topology compile and
visualization pipeline jointly implemented by `topology/builder.py`,
`topology/mesh_gen.py`, and `topology/visualizer.py`:
placement-driven router attachment, mesh auto-layout, the source_hash
cache, view projections, and SVG rendering. ADR-0006 defines the
high-level intent of topology compilation (compiled topology, distance
extraction, automatic diagram generation), but **which algorithms the
builder actually uses** was only discoverable via source grep.
## First action
When `resolve_topology(path_str)` is called, four steps run in order:
1. **Path validation** (`builder.py::resolve_topology`):
`Path(path_str).expanduser().resolve()`, existence check, file
check. Failure → `FileNotFoundError` or `ValueError`.
2. **YAML parsing** (`_read_spec`): `yaml.safe_load`. Parse errors
yield a `ValueError` with line/column. Non-dict roots are
rejected.
3. **Auto-generate the mesh** (`mesh_gen.ensure_mesh_file`): create or
reuse a `cube_mesh.yaml` next to the topology file. Cache hit on
matching source_hash; miss triggers regeneration. This step decides
the cube NoC's router grid and attachment information.
4. **Compile the graph** (`_compile_graph`): system → IO chiplets →
cubes → inter-cube edges → IO↔cube edges → system↔IO edges, then
build four view projections (system, sip, cube, pe) and wrap into
a `TopologyGraph`.
In short, **topology compilation's first act is "read topology.yaml as
a dict, create/validate cube_mesh.yaml in the same directory, then
build the flat graph + 4-view projection in system → sip → cube → pe
order"**.
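The path-validation step can be sketched as below. This is a minimal illustration, not the code in `builder.py`: the function names `validate_topology_path` and `mesh_path_for` are hypothetical, and the mapping of each failure to `FileNotFoundError` versus `ValueError` is an assumption, since the text only says one of the two is raised.

```python
from pathlib import Path


def validate_topology_path(path_str: str) -> Path:
    """Hypothetical stand-in for the path-validation step of resolve_topology."""
    path = Path(path_str).expanduser().resolve()
    if not path.exists():
        # Assumption: a missing path maps to FileNotFoundError.
        raise FileNotFoundError(f"topology file not found: {path}")
    if not path.is_file():
        # Assumption: an existing non-file (e.g. a directory) maps to ValueError.
        raise ValueError(f"topology path is not a file: {path}")
    return path


def mesh_path_for(topology_path: Path) -> Path:
    """cube_mesh.yaml is created or reused next to the topology file (step 3)."""
    return topology_path.with_name("cube_mesh.yaml")
```

Resolving the path before the existence check means later steps always see an absolute path, so the sibling `cube_mesh.yaml` location does not depend on the caller's working directory.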
## Context
`topology/` package responsibilities:
- **builder.py** (1207 lines): turns topology.yaml into a
`TopologyGraph` (nodes + edges + 4 view projections).
- **mesh_gen.py** (305 lines): auto-decides the cube NoC's router
grid and PE/UCIe/M_CPU/SRAM attachment positions and caches them in
`cube_mesh.yaml`.
- **visualizer.py** (887 lines): generates four SVG diagrams (system /
sip / cube / pe) from a `TopologyGraph`.
ADR-0006 makes the high-level decision that "the result of topology
compilation is the single source for distance metadata and diagram
generation", but specific algorithms (e.g., placement-driven nearest-
router attachment, the HBM exclusion zone, which fields in source_hash
trigger regeneration) are not in any ADR.
In particular, these decisions are absent at ADR level:
- Why is mesh_gen cached in a separate file (`cube_mesh.yaml`)?
- Which fields are in source_hash, and which changes force
regeneration?
- Why placement coordinates in mm rather than cube coordinates?
- How are the HBM exclusion zone and UCIe N/S/E/W distribution
decided inside the mesh?
- What is the abstraction-level difference among the four view
projections (system/sip/cube/pe)?
This ADR captures these decisions in one place.
## Decision
### D1. Compile pipeline — six stages
`_compile_graph(spec)`:
1. **System nodes** (`_instantiate_system`): add system-level nodes
like `fabric.switch0` and the host CPU.
2. **Per-SIP loop** (`for sip_id in range(system.sips.count)`):
- **IO chiplets** (`_instantiate_io_chiplets`): create pcie_ep /
io_cpu / io_noc / io_ucie PHYs / conn nodes and their bidirectional
internal edges.
- **Cube instantiation** (`_instantiate_cube`): using
cube_mesh.yaml's router grid, instantiate cube routers, PE
sub-components (pe_cpu, pe_dma, pe_fetch_store, pe_gemm, pe_math,
pe_mmu, pe_tcm, pe_scheduler, pe_ipcq), m_cpu, sram, hbm_ctrl,
and their internal edges.
- **Inter-cube edges** (`_add_inter_cube_edges`): the UCIe
N/S/E/W mesh edges.
- **IO ↔ cube edges** (`_add_io_to_cube_edges`): connect io_noc to
each cube's edge UCIe phy.
3. **Switch ↔ IO edges** (`_add_system_to_io_edges`): bidirectional
edges between `fabric.switch0` and each SIP's `pcie_ep` (the
cross-SIP IPCQ path of ADR-0038 D3 + ADR-0010).
4. **Build four view projections**:
- `_build_system_view(spec)` — Tray level: SIPs and the system
switch.
- `_build_sip_view(spec)` — inside one SIP: cube mesh + IO
chiplet.
- `_build_cube_view(spec)` — inside one cube: router grid + PE /
M_CPU / SRAM / HBM_CTRL attachments.
- `_build_pe_view(spec)` — inside one PE: nine sub-components +
internal edges (pe_internal kind).
5. **Return `TopologyGraph`**: `TopologyGraph(spec, nodes, edges,
system_view, sip_view, cube_view, pe_view)`.
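The six node/edge emission stages (system nodes, IO chiplets, cubes,
inter-cube edges, IO ↔ cube edges, switch ↔ IO edges) are **ordered
for a reason**: only after cubes exist do inter-cube edges have valid
src/dst, and IO chiplets must precede the IO ↔ cube edges that
reference them. View projection and the `TopologyGraph` wrap-up run
once the flat graph is complete. New node types must slot in the
right spot.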
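The ordering constraint can be illustrated with a toy graph that refuses edges whose endpoints do not exist yet. `GraphSketch` is a hypothetical class for illustration only (it is not `TopologyGraph`), and the IO chiplet id `io0` is an assumed placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class GraphSketch:
    """Hypothetical flat graph used only to show why stage order matters."""

    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_node(self, name: str) -> None:
        self.nodes.add(name)

    def add_edge(self, src: str, dst: str, kind: str) -> None:
        # An edge may only reference nodes created by an earlier stage.
        for end in (src, dst):
            if end not in self.nodes:
                raise KeyError(f"edge {src} -> {dst} references missing node {end}")
        self.edges.append((src, dst, kind))


g = GraphSketch()
g.add_node("fabric.switch0")      # system nodes come first
g.add_node("sip0.io0.pcie_ep")    # then the IO chiplet nodes ("io0" is assumed)
# The switch <-> IO edge is valid only because both endpoints already exist.
g.add_edge("fabric.switch0", "sip0.io0.pcie_ep", "pcie")
```

Running the edge stage before the node stages would raise `KeyError` here, which is the failure mode the stage order is meant to prevent.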
### D2. `cube_mesh.yaml` — a separate file with a source_hash cache
`mesh_gen.ensure_mesh_file(cube_spec, mesh_path)`:
1. Compute `source_hash = _compute_source_hash(cube_spec)` from these
input fields:
- `geometry` (cube_mm.w/h …).
- `pe_layout` (corners, pe_per_corner).
- `ucie.n_connections`.
- `memory_map.hbm_mapping_mode`.
- `placement` (m_cpu/sram pos_mm).
2. If `mesh_path` (= `cube_mesh.yaml` next to topology.yaml) exists
and `existing.source_hash == source_hash`, reuse it (cache hit).
3. Otherwise, generate a new mesh via
`_generate_mesh(cube_spec, source_hash)` and write to yaml.
The mesh is cached in a separate file because:
- Mesh generation involves nontrivial PE/UCIe/router attachment math
and is too expensive to redo every time.
- Multiple runs with the same cube spec must guarantee an identical
mesh.
- The resulting mesh is itself an inspectable / debuggable artifact.
The five fields listed in source_hash are the ones that determine
mesh shape; other changes (e.g., bandwidth, overhead_ns) do not
trigger mesh regeneration.
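A minimal sketch of the cache check follows, assuming the hash is a SHA-256 over a canonical JSON dump of the five inputs. The digest scheme, the function names, and the example spec values (`bw_gbs`, `hbm_capacity_gb`, `"interleaved"`) are assumptions for illustration; only the list of five input fields comes from this ADR.

```python
import hashlib
import json

# The five cube-spec inputs listed above; None means "the whole section".
HASH_INPUTS = (
    ("geometry", None),
    ("pe_layout", None),
    ("ucie", "n_connections"),
    ("memory_map", "hbm_mapping_mode"),
    ("placement", None),
)


def compute_source_hash(cube_spec: dict) -> str:
    """Hypothetical stand-in for _compute_source_hash."""
    picked = {}
    for section, key in HASH_INPUTS:
        value = cube_spec.get(section)
        if key is not None:
            value = value.get(key) if isinstance(value, dict) else None
        picked[section if key is None else f"{section}.{key}"] = value
    # Assumption: canonical JSON + SHA-256; the real digest scheme may differ.
    blob = json.dumps(picked, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def is_cache_hit(existing_mesh, cube_spec: dict) -> bool:
    """True when a previously written mesh carries the same source_hash."""
    if not isinstance(existing_mesh, dict):
        return False
    return existing_mesh.get("source_hash") == compute_source_hash(cube_spec)


# Illustrative spec; every value here is made up for the example.
base = {
    "geometry": {"cube_mm": {"w": 12.0, "h": 14.0}},
    "pe_layout": {"corners": ["NW", "NE", "SW", "SE"], "pe_per_corner": 2},
    "ucie": {"n_connections": 2, "bw_gbs": 64},
    "memory_map": {"hbm_mapping_mode": "interleaved", "hbm_capacity_gb": 24},
    "placement": {"m_cpu": {"pos_mm": [1.5, 5.5]}, "sram": {"pos_mm": [1.5, 8.5]}},
}
cached_mesh = {"source_hash": compute_source_hash(base)}
```

Because only the five listed inputs are hashed, editing `ucie.bw_gbs` in this sketch still hits the cache, while editing `ucie.n_connections` misses it.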
### D3. Cube NoC mesh auto-layout
`_generate_mesh(cube_spec, source_hash)`:
#### D3.1. Rows / columns
- `pe_positions = _corner_pe_positions(cube_w, cube_h)`: PE-center
coordinates (mm) per corner (NW/NE/SW/SE). Hardcoded patterns like
`(1.5, 1.5)` and `(cube_w-1.5, cube_h-1.5)`; with `pe_per_corner=2`,
each corner has two PE positions.
- `col_xs = _compute_col_positions(...)`: union of PE x-coordinates,
plus relay columns inserted when any gap exceeds
`max_spacing = 3.0 mm`.
- `row_ys, rows_per_half = _compute_row_positions(cube_h,
n_connections, pe_positions)`:
- `n_conn = max(n_connections, 2)` (hot-path minimum).
- `rows_per_half = ceil(n_conn / 2)`.
- Top half + two HBM rows + bottom half. HBM sits at
`(cube_h/2 - 1.5, cube_h/2 + 1.5)`. The gap between PE rows and
HBM rows is `hbm_gap = 1.5 mm`.
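The column and row rules above can be sketched as follows. The `rows_per_half` arithmetic is taken directly from the text; how relay columns are placed inside an oversized gap is not specified here, so the even subdivision below is an assumption, and both function names are hypothetical.

```python
import math

MAX_SPACING_MM = 3.0


def compute_col_positions(pe_xs, max_spacing=MAX_SPACING_MM):
    """Union of PE x-coordinates plus relay columns for oversized gaps.

    Expects at least one PE x-coordinate.
    """
    xs = sorted(set(pe_xs))
    cols = [xs[0]]
    for left, right in zip(xs, xs[1:]):
        gap = right - left
        if gap > max_spacing:
            # Assumption: relays split the gap into equal segments,
            # each no wider than max_spacing.
            n_seg = math.ceil(gap / max_spacing)
            cols.extend(left + gap * k / n_seg for k in range(1, n_seg))
        cols.append(right)
    return cols


def rows_per_half(n_connections: int) -> int:
    """rows_per_half = ceil(max(n_connections, 2) / 2), as stated above."""
    n_conn = max(n_connections, 2)
    return math.ceil(n_conn / 2)
```

For example, PE columns at 1.5 mm and 10.5 mm leave a 9 mm gap, so this sketch inserts relays at 4.5 mm and 7.5 mm; `n_connections` of 3 or 4 both give two rows per half.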
#### D3.2. HBM exclusion zone
`hbm_row_start = rows_per_half`,
`hbm_row_end = rows_per_half + 1`.
`hbm_col_start = n_cols // 2 - 1`,
`hbm_col_end = n_cols // 2`.
Router slots inside this (row, col) rectangle are marked `None` (no
router). HBM controllers are added separately as
`hbm_ctrl.pe{X}` nodes following ADR-0017 D9's per-PE partition
pattern.
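A sketch of the exclusion rectangle, assuming both the row and column bounds are inclusive and that the grid has `2 * rows_per_half + 2` rows (top half, two HBM rows, bottom half). The function name and the `r{R}c{C}` slot labels are illustrative.

```python
def build_router_grid(rows_per_half: int, n_cols: int):
    """Return a row-major grid of router labels with None in the HBM zone."""
    hbm_row_start = rows_per_half
    hbm_row_end = rows_per_half + 1
    hbm_col_start = n_cols // 2 - 1
    hbm_col_end = n_cols // 2
    # Assumption: total rows = top half + two HBM rows + bottom half.
    n_rows = 2 * rows_per_half + 2
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Assumption: start/end bounds are inclusive on both axes.
            in_hbm = (hbm_row_start <= r <= hbm_row_end
                      and hbm_col_start <= c <= hbm_col_end)
            row.append(None if in_hbm else f"r{r}c{c}")
        grid.append(row)
    return grid


grid = build_router_grid(rows_per_half=1, n_cols=4)
```

With `rows_per_half=1` and four columns, rows 1 and 2 lose their two middle slots, leaving a 2x2 hole for the HBM block.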
#### D3.3. PE attachment
Each corner's PEs map to a row:
- Top half: NW → row 0, NE → row 1 (top_corners index).
- Bottom half: SW → row `hbm_row_end + 1`, SE → row
`hbm_row_end + 2`.
Each PE's x-coordinate attaches to the nearest column's router
(`min(range(n_cols), key=lambda c: abs(col_xs[c] - pe_x))`).
Attachment items are `pe{pe_idx}.dma`, `pe{pe_idx}.cpu`,
`pe{pe_idx}.hbm` (pushed into the router's attach list).
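The nearest-column rule quoted above can be exercised in isolation. The `min(...)` expression is the one given in the text; the wrapper names and the dict-based attach list are hypothetical.

```python
def nearest_col(col_xs, pe_x):
    """Index of the column whose x-position is closest to pe_x."""
    return min(range(len(col_xs)), key=lambda c: abs(col_xs[c] - pe_x))


def attach_pe(attach, row, col_xs, pe_idx, pe_x):
    """Push the three PE attachment items onto the chosen router's list.

    `attach` maps (row, col) to that router's attach list (assumed shape).
    """
    col = nearest_col(col_xs, pe_x)
    attach.setdefault((row, col), []).extend(
        [f"pe{pe_idx}.dma", f"pe{pe_idx}.cpu", f"pe{pe_idx}.hbm"]
    )
    return col


attach = {}
col = attach_pe(attach, row=0, col_xs=[1.5, 4.5, 7.5, 10.5], pe_idx=0, pe_x=1.5)
```

Note that `min` returns the first minimum, so a PE exactly halfway between two columns attaches to the lower-indexed one.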
#### D3.4. M_CPU / SRAM attachment — nearest router by Euclidean distance
For `placement.m_cpu.pos_mm` (default `[1.5, 5.5]`) and
`placement.sram.pos_mm` (default `[1.5, 8.5]`), find the router with
the smallest Euclidean distance and append `"m_cpu"` / `"sram"` to
its attach list.
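A sketch of the Euclidean search, assuming `pos_mm` is `[x, y]` and that slots inside the HBM exclusion zone (marked `None`) are skipped. Both points are assumptions, as are the function name and the example grid.

```python
import math


def nearest_router(row_ys, col_xs, grid, pos_mm):
    """Return the (row, col) of the router closest to pos_mm, or None."""
    best, best_d = None, math.inf
    for r, y in enumerate(row_ys):
        for c, x in enumerate(col_xs):
            if grid[r][c] is None:
                # Assumption: excluded slots cannot receive attachments.
                continue
            d = math.hypot(x - pos_mm[0], y - pos_mm[1])
            if d < best_d:
                best, best_d = (r, c), d
    return best


# Illustrative 3x2 grid; the slot nearest the M_CPU default is excluded.
col_xs = [1.5, 4.5]
row_ys = [1.5, 5.0, 8.0]
example_grid = [["r0c0", "r0c1"], [None, "r1c1"], ["r2c0", "r2c1"]]
m_cpu_slot = nearest_router(row_ys, col_xs, example_grid, [1.5, 5.5])
```

Here the geometrically closest slot `(1, 0)` is excluded, so the search falls back to `(2, 0)` at 2.5 mm rather than `(1, 1)` at roughly 3.04 mm.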
#### D3.5. UCIe N/S/E/W distribution
`ucie_pe_rows = top_pe_rows + bot_pe_rows` (total
`2 * rows_per_half`).
- UCIe-E: one PE row at a time, attach `ucie_e.c{i}` to the rightmost
column's router.
- UCIe-W: attach `ucie_w.c{i}` to the leftmost column's router (E's
mirror).
- UCIe-N/S: split PE columns into left and right halves; attach to
the top row's / bottom row's matching columns.
Each UCIe connection is suffixed `c{i}`, distributing the
`ucie.n_connections` PHYs (ADR-0017 D5+).
### D4. Node naming convention — single ownership
builder.py creates nodes with the following naming convention (the
single-owner principle from ADR-0051 D5):
- `fabric.switch0` — system-level switch.
- `sip{S}.{io_id}.{pcie_ep|io_cpu|io_noc|io_ucie.{dir}|conn.{id}}` —
IO chiplet.
- `sip{S}.cube{C}.{m_cpu|sram|hbm_ctrl.pe{X}|noc.r{R}c{C}|...}` —
inside cube.
- `sip{S}.cube{C}.pe{P}.{pe_cpu|pe_dma|pe_fetch_store|pe_gemm|pe_math|pe_mmu|pe_tcm|pe_scheduler|pe_ipcq}` —
PE sub-components.
Changing this convention requires updating both builder.py and
router.py's helpers (ADR-0051). Components never know the convention
directly — they only call the helpers.
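The naming patterns above can be captured in small formatting helpers. The helper names below are hypothetical and are not router.py's actual API; only the name patterns and the nine PE sub-component names come from this ADR.

```python
PE_COMPONENTS = (
    "pe_cpu", "pe_dma", "pe_fetch_store", "pe_gemm", "pe_math",
    "pe_mmu", "pe_tcm", "pe_scheduler", "pe_ipcq",
)


def pe_component_node(sip: int, cube: int, pe: int, component: str) -> str:
    """Format a PE sub-component node name (hypothetical helper)."""
    if component not in PE_COMPONENTS:
        raise ValueError(f"unknown PE sub-component: {component}")
    return f"sip{sip}.cube{cube}.pe{pe}.{component}"


def router_node(sip: int, cube: int, row: int, col: int) -> str:
    """Format a cube NoC router node name (hypothetical helper)."""
    return f"sip{sip}.cube{cube}.noc.r{row}c{col}"


def hbm_ctrl_node(sip: int, cube: int, pe: int) -> str:
    """Format a per-PE HBM controller node name (hypothetical helper)."""
    return f"sip{sip}.cube{cube}.hbm_ctrl.pe{pe}"
```

Keeping every format string in one place is what makes the single-owner rule practical: a naming change touches these helpers, not their callers.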
### D5. Edge `kind` classification
Every edge gets a `kind`; routing policy (ADR-0051 D2) reads it. Major
kinds:
- `"pe_internal"` — within a PE between sub-components.
- `"pe_to_router"` — PE_DMA ↔ cube NoC router.
- `"router_mesh"` — between cube NoC routers.
- `"router_to_hbm"`, `"router_to_mcpu"`, `"router_to_sram"`,
`"sram_to_router"`, etc. — between cube-attached components.
- `"ucie_internal"`, `"ucie_conn_to_router"`,
`"router_to_ucie_conn"`, `"ucie_conn_to_noc"`,
`"noc_to_ucie_conn"`, `"ucie_mesh"` — UCIe-related.
- `"io_internal"` — inside IO chiplet.
- `"io_to_cube"`, `"cube_to_io"` — at the IO ↔ cube boundary.
- `"pcie"` — switch ↔ pcie_ep.
- `"command"` — control-plane edges only (e.g., M_CPU ↔ NOC; excluded
from PE DMA paths).
Adding a new edge kind requires picking a category in router.py's
four adjacency graphs (ADR-0051 D2). If you forget, it defaults to
`_adj_all` only, which can produce unintended routes.
### D6. View projection — four abstraction levels
`TopologyGraph` keeps four view projections alongside the flat
nodes+edges:
- **system_view** (`_build_system_view`): Tray level. SIP blocks and
`fabric.switch0`. PCIe links shown. For external high-level
overview.
- **sip_view** (`_build_sip_view`): inside one SIP — cube mesh + IO
chiplet (pcie_ep + io_cpu + io_noc). UCIe N/S/E/W appear as
cube-cube links.
- **cube_view** (`_build_cube_view`): inside one cube — router grid +
PE / M_CPU / SRAM / HBM_CTRL attachments + UCIe PHY edges. For
intra-cube routing / placement debugging.
- **pe_view** (`_build_pe_view`): inside one PE — nine sub-components
+ internal edges (pe_internal kind). For detailed PE-internal
dataflow review.
Views are selectively rendered via the spec's
`visualization.emit_views: [system, sip, cube]` (ADR-0006). The pe
view is omitted from default output but the code is retained for
detailed debugging.
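View selection can be sketched as a filter over the four known views. The fallback to `system`, `sip`, `cube` when the key is absent is an assumption based on the statement that the pe view is omitted from default output; `select_views` is a hypothetical name.

```python
VIEW_ORDER = ("system", "sip", "cube", "pe")
# Assumption: default when visualization.emit_views is absent.
DEFAULT_EMIT_VIEWS = ("system", "sip", "cube")


def select_views(spec: dict) -> list:
    """Return the views to render, in system -> sip -> cube -> pe order."""
    visualization = spec.get("visualization") or {}
    requested = visualization.get("emit_views", DEFAULT_EMIT_VIEWS)
    unknown = [v for v in requested if v not in VIEW_ORDER]
    if unknown:
        raise ValueError(f"unknown views in emit_views: {unknown}")
    return [v for v in VIEW_ORDER if v in requested]
```

Adding `pe` to `emit_views` is then the only change needed to bring the PE-internal diagram back for a debugging session.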
### D7. visualizer.py — SVG diagram output
`emit_diagrams(graph, out_dir)` renders every view as SVG. Key
functions:
- `_render_view_svg(view)` — generic view render (no router grid).
- `_render_cube_view_svg(view, spec)` — cube-view specific (HBM block,
router grid layout, PE/M_CPU/SRAM/HBM placement).
- `_draw_node`, `_draw_edge` — node/edge visual representation.
- `_pick_scale`, `_compute_node_sizes` — auto-scaling.
The visualizer is a **derived artifact** (ADR-0006); changes here are
not subject to production checks, in line with CLAUDE.md's "Derived
Artifacts" guidance.
### D8. Blast radius of spec changes
| spec field                             | effect                  | mesh regenerated? |
|----------------------------------------|-------------------------|-------------------|
| `system.sips.count`                    | SIP count, node count   | No                |
| `sip.cube_mesh.w/h`                    | cube mesh shape         | No                |
| `cube.geometry.cube_mm.w/h`            | cube size (mm)          | **Yes**           |
| `cube.pe_layout.corners/pe_per_corner` | PE attachment positions | **Yes**           |
| `cube.ucie.n_connections`              | UCIe PHY distribution   | **Yes**           |
| `cube.memory_map.hbm_mapping_mode`     | HBM distribution mode   | **Yes**           |
| `cube.placement`                       | M_CPU/SRAM positions    | **Yes**           |
| `cube.memory_map.*` (besides above)    | HBM capacity / BW       | No                |
| `*.links.*.bw_gbs`                     | edge bandwidth          | No                |
| `*.attrs.overhead_ns`                  | component latency       | No                |
The table mirrors D2's `_compute_source_hash` inputs. Changes that
require mesh regeneration automatically invalidate `cube_mesh.yaml`'s
source_hash.
## Alternatives Considered
### A1. Regenerate the mesh on every compile without a cache file
Rejected. The cost of mesh generation would be paid repeatedly (CLI
runs, probe, tests) for the same spec, and the human-inspectable
artifact would disappear.
### A2. Merge mesh generation into builder.py
Rejected (currently). It is a 305-line algorithm of its own, and the
mesh-layout decisions (placement-driven router attachment, HBM
exclusion zone) are different from builder's general node/edge
emission. Keeping it separate respects single-responsibility.
### A3. Express placement coordinates in cube coordinates (col/row)
Rejected. mm coordinates flow consistently between the visualizer and
mesh layout (for nearest-router computation). Cube coordinates are
undefined until the router grid is fixed, so they are unsuitable as
placement input.
### A4. Lazy view projection generation
Rejected (currently). The four views are cheap to build (typically <
100 ms), and eager construction guarantees `TopologyGraph` as the
single source of truth.
### A5. Visualizer output in formats besides SVG (PNG/PDF)
Rejected. SVG is vector + text-searchable + directly renderable in
browsers. PNG conversion, when required, is downstream
post-processing (e.g., rsvg-convert).
## Consequences
- ADR-0006's high-level intent is fleshed out via D1 through D7;
topology changes can be assessed quickly via D8's table.
- D3's mesh-layout algorithm is ADR-locked, so it is clear which stage
a future PE attachment pattern (e.g., a 6-zone HBM split) would
affect.
- D5's edge-kind list and D6's view structure are explicit, giving PR
reviewers a quick map of where (builder + router + visualizer) a
new component type ripples through.
- D2's source_hash invalidation rules are explicit, so a
`cube_mesh.yaml` that is left unchanged (e.g., when only bandwidth
changed) is recognized as correct behavior rather than a stale cache.