adr: add ADR-0050-0053 — close /report's second-pass G4 candidates

Documents four cross-cutting surfaces one layer deeper than the prior
G4 batch:

- 0050 par-ccl-algorithm-module-contract: how to author a new CCL
  algorithm in src/kernbench/ccl/algorithms/. Pairs with ADR-0045's
  bench-module contract. Pins the four required public symbols
  (kernel, kernel_args, TOPO_NAME_TO_KIND constants, kernel alias),
  the 9 + tl standardized kernel signature, the kernel_args tuple
  format, sip_topo_kind dispatch, and the ccl.yaml entry workflow.
- 0051 lat-routing-helper-api: every public method of AddressResolver
  (resolve, find_m_cpu, find_pcie_ep, find_io_cpu, find_all_pcie_eps)
  and PathRouter (find_path, find_path_with_distance,
  find_mcpu_dma_path, find_memory_path, find_node_path + 2 shims).
  Pins the four adjacency graphs (_adj_all / _adj / _adj_mcpu_dma /
  _adj_local) and the edge-kind exclusion sets they use, plus the
  single-owner naming convention.
- 0052 dev-oplog-memory-store-schemas: OpRecord's 7 fields, the
  per-op_name params matrix (dma_read, dma_write, gemm_*, math, math
  reduction, composite_gemm, ipcq_copy, unknown), snapshot timing
  rules (math = all inputs; dma_write = HBM-only for ADR-0027 race
  avoidance), TileToken stage_type capture, and MemoryStore's
  (space, addr) two-level dict with reference-store semantics.
- 0053 dev-topology-builder-algorithms: the 6-stage compile pipeline,
  cube_mesh.yaml's source_hash cache and its 5 input fields, the cube
  NoC auto-layout algorithm (row/col placement, HBM exclusion zone,
  PE/M_CPU/SRAM attachment via nearest-router, UCIe N/S/E/W
  distribution), the node naming convention (single-owner with
  router.py), the edge-kind catalog, the 4 view projections, and a
  table of spec-field changes vs mesh regeneration.

Bilingual pair verifier passes for all four EN/KO pairs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 10:52:42 -07:00
parent 9a02955770
commit bd49c93703
8 changed files with 2566 additions and 0 deletions
@@ -0,0 +1,351 @@
+# ADR-0053: Topology Builder + Visualizer Algorithms
+
+## Status
+
+Accepted (2026-05-22).
+
+Pins down the key algorithmic choices of the topology compile and
+visualization pipeline jointly implemented by `topology/builder.py`,
+`topology/mesh_gen.py`, and `topology/visualizer.py` —
+placement-driven router attachment, mesh auto-layout, the source_hash
+cache, view projections, and SVG rendering. ADR-0006 defines the
+high-level intent of topology compilation (compiled topology, distance
+extraction, automatic diagram generation), but **which algorithms the
+builder actually uses** was only discoverable via source grep.
+
+## First action
+
+When `resolve_topology(path_str)` is called, four steps run in order:
+
+1. **Path validation** (`builder.py::resolve_topology`):
+   `Path(path_str).expanduser().resolve()`, existence check, file
+   check. Failure → `FileNotFoundError` or `ValueError`.
+2. **YAML parsing** (`_read_spec`): `yaml.safe_load`. Parse errors
+   yield a `ValueError` with line/column. Non-dict roots are
+   rejected.
+3. **Auto-generate the mesh** (`mesh_gen.ensure_mesh_file`): create or
+   reuse a `cube_mesh.yaml` next to the topology file. Cache hit on
+   matching source_hash; miss triggers regeneration. This step decides
+   the cube NoC's router grid and attachment information.
+4. **Compile the graph** (`_compile_graph`): system → IO chiplets →
+   cubes → inter-cube edges → IO↔cube edges → system↔IO edges, then
+   build four view projections (system, sip, cube, pe) and wrap into
+   a `TopologyGraph`.
+
+In short, **topology compilation's first act is "read topology.yaml as
+a dict, create/validate cube_mesh.yaml in the same directory, then
+build the flat graph + 4-view projection in system → sip → cube → pe
+order"**.
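Steps 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the code in `builder.py`: the function names `validate_topology_path` and `require_dict_root` are hypothetical, and the mapping of each failure to `FileNotFoundError` versus `ValueError` is an assumption, since the text above only says that one of the two is raised.

```python
from pathlib import Path


def validate_topology_path(path_str: str) -> Path:
    """Hypothetical stand-in for step 1 (path validation)."""
    path = Path(path_str).expanduser().resolve()
    if not path.exists():
        # Assumed mapping: a missing path raises FileNotFoundError.
        raise FileNotFoundError(f"topology file not found: {path}")
    if not path.is_file():
        # Assumed mapping: an existing non-file (e.g. a directory) raises ValueError.
        raise ValueError(f"topology path is not a file: {path}")
    return path


def require_dict_root(root: object) -> dict:
    """Hypothetical check mirroring step 2's rejection of non-dict YAML roots."""
    if not isinstance(root, dict):
        raise ValueError(
            f"topology root must be a mapping, got {type(root).__name__}"
        )
    return root
```

The YAML load itself is left out so the sketch stays standard-library only; in the real flow the object passed to the second check would come from `yaml.safe_load`.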
+
+## Context
+
+`topology/` package responsibilities:
+
+- **builder.py** (1207 lines): turns topology.yaml into a
+  `TopologyGraph` (nodes + edges + 4 view projections).
+- **mesh_gen.py** (305 lines): auto-decides the cube NoC's router
+  grid and PE/UCIe/M_CPU/SRAM attachment positions and caches them in
+  `cube_mesh.yaml`.
+- **visualizer.py** (887 lines): generates four SVG diagrams (system /
+  sip / cube / pe) from a `TopologyGraph`.
+
+ADR-0006 makes the high-level decision that "the result of topology
+compilation is the single source for distance metadata and diagram
+generation", but specific algorithms (e.g., placement-driven nearest-
+router attachment, the HBM exclusion zone, which fields in source_hash
+trigger regeneration) are not in any ADR.
+
+In particular, these decisions are absent at ADR level:
+
+- Why is mesh_gen cached in a separate file (`cube_mesh.yaml`)?
+- Which fields are in source_hash, and which changes force
+  regeneration?
+- Why placement coordinates in mm rather than cube coordinates?
+- How are the HBM exclusion zone and UCIe N/S/E/W distribution
+  decided inside the mesh?
+- What is the abstraction-level difference among the four view
+  projections (system/sip/cube/pe)?
+
+This ADR captures these decisions in one place.
+
+## Decision
+
+### D1. Compile pipeline — six stages
+
+`_compile_graph(spec)`:
+
+1. **System nodes** (`_instantiate_system`): add system-level nodes
+   like `fabric.switch0` and the host CPU.
+2. **Per-SIP loop** (`for sip_id in range(system.sips.count)`):
+   - **IO chiplets** (`_instantiate_io_chiplets`): create pcie_ep /
+     io_cpu / io_noc / io_ucie PHYs / conn nodes and their bidirectional
+     internal edges.
+   - **Cube instantiation** (`_instantiate_cube`): using
+     cube_mesh.yaml's router grid, instantiate cube routers, PE
+     sub-components (pe_cpu, pe_dma, pe_fetch_store, pe_gemm, pe_math,
+     pe_mmu, pe_tcm, pe_scheduler, pe_ipcq), m_cpu, sram, hbm_ctrl,
+     and their internal edges.
+   - **Inter-cube edges** (`_add_inter_cube_edges`): the UCIe
+     N/S/E/W mesh edges.
+   - **IO ↔ cube edges** (`_add_io_to_cube_edges`): connect io_noc to
+     each cube's edge UCIe phy.
+3. **Switch ↔ IO edges** (`_add_system_to_io_edges`): bidirectional
+   edges between `fabric.switch0` and each SIP's `pcie_ep` (the
+   cross-SIP IPCQ path of ADR-0038 D3 + ADR-0010).
+4. **Build four view projections**:
+   - `_build_system_view(spec)` — Tray level: SIPs and the system
+     switch.
+   - `_build_sip_view(spec)` — inside one SIP: cube mesh + IO
+     chiplet.
+   - `_build_cube_view(spec)` — inside one cube: router grid + PE /
+     M_CPU / SRAM / HBM_CTRL attachments.
+   - `_build_pe_view(spec)` — inside one PE: nine sub-components +
+     internal edges (pe_internal kind).
+5. **Return `TopologyGraph`**: `TopologyGraph(spec, nodes, edges,
+   system_view, sip_view, cube_view, pe_view)`.
+
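+The six construction stages (step 1, the four sub-steps of step 2, and
+step 3) are **ordered for a reason**: only after cubes exist do
+inter-cube edges have valid src/dst, and IO chiplets must precede the
+IO ↔ cube edges that reference them. Steps 4 and 5 then project and
+wrap the finished graph. New node types must slot into the right
+stage.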
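The ordering constraint can be illustrated with a toy builder that refuses edges whose endpoints do not exist yet. This is only a sketch: `GraphBuilder` is a hypothetical class, `io0` is a made-up `io_id`, and whether `builder.py` performs such a check at all is an assumption made for illustration.

```python
class GraphBuilder:
    """Toy builder: an edge may only reference nodes that already exist."""

    def __init__(self) -> None:
        self.nodes: set[str] = set()
        self.edges: list[tuple[str, str, str]] = []

    def add_node(self, name: str) -> None:
        self.nodes.add(name)

    def add_edge(self, src: str, dst: str, kind: str) -> None:
        for end in (src, dst):
            if end not in self.nodes:
                raise KeyError(f"edge endpoint {end!r} has not been instantiated")
        self.edges.append((src, dst, kind))


def build_in_order() -> GraphBuilder:
    g = GraphBuilder()
    g.add_node("fabric.switch0")    # system nodes come first
    g.add_node("sip0.io0.pcie_ep")  # then the IO chiplet nodes
    # Switch <-> IO edges come last, once both endpoints exist.
    g.add_edge("fabric.switch0", "sip0.io0.pcie_ep", "pcie")
    g.add_edge("sip0.io0.pcie_ep", "fabric.switch0", "pcie")
    return g
```

Adding the `pcie` edges before the `pcie_ep` node would raise `KeyError` in this toy, which is the failure mode the stage ordering avoids.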
+
+### D2. `cube_mesh.yaml` — a separate file with a source_hash cache
+
+`mesh_gen.ensure_mesh_file(cube_spec, mesh_path)`:
+
+1. Compute `source_hash = _compute_source_hash(cube_spec)` from these
+   input fields:
+   - `geometry` (cube_mm.w/h …).
+   - `pe_layout` (corners, pe_per_corner).
+   - `ucie.n_connections`.
+   - `memory_map.hbm_mapping_mode`.
+   - `placement` (m_cpu/sram pos_mm).
+2. If `mesh_path` (= `cube_mesh.yaml` next to topology.yaml) exists
+   and `existing.source_hash == source_hash`, reuse it (cache hit).
+3. Otherwise, generate a new mesh via
+   `_generate_mesh(cube_spec, source_hash)` and write to yaml.
+
+The mesh is cached in a separate file because:
+
+- Mesh generation involves nontrivial PE/UCIe/router attachment math
+  and is too expensive to redo every time.
+- Multiple runs with the same cube spec must guarantee an identical
+  mesh.
+- The resulting mesh is itself an inspectable / debuggable artifact.
+
+The five fields listed in source_hash are the ones that determine
+mesh shape; other changes (e.g., bandwidth, overhead_ns) do not
+trigger mesh regeneration.
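A minimal sketch of the cache decision, under stated assumptions: the ADR does not say how `_compute_source_hash` serializes or hashes its inputs, so canonical JSON plus SHA-256 is an illustrative choice, and `compute_source_hash` / `needs_regeneration` are hypothetical names operating on plain dicts rather than files.

```python
import hashlib
import json
from typing import Optional

# The five mesh-shaping inputs from D2, as (section, key) pairs.
# A key of None means the whole section is hashed.
HASHED_FIELDS = [
    ("geometry", None),
    ("pe_layout", None),
    ("ucie", "n_connections"),
    ("memory_map", "hbm_mapping_mode"),
    ("placement", None),
]


def compute_source_hash(cube_spec: dict) -> str:
    picked = {}
    for section, key in HASHED_FIELDS:
        value = cube_spec.get(section)
        if key is not None and isinstance(value, dict):
            value = value.get(key)
        picked[f"{section}.{key}" if key else section] = value
    # Assumed serialization: sorted-key JSON so dict ordering cannot change the hash.
    canonical = json.dumps(picked, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def needs_regeneration(existing_mesh: Optional[dict], cube_spec: dict) -> bool:
    """True on a cache miss: no mesh yet, or its recorded hash differs."""
    if existing_mesh is None:
        return True
    return existing_mesh.get("source_hash") != compute_source_hash(cube_spec)
```

Because only the five listed inputs feed the digest, editing a field outside that set (a hypothetical HBM capacity entry, say) leaves the hash unchanged and the existing mesh is reused.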
+
+### D3. Cube NoC mesh auto-layout
+
+`_generate_mesh(cube_spec, source_hash)` (the call shown in D2):
+
+#### D3.1. Rows / columns
+
+- `pe_positions = _corner_pe_positions(cube_w, cube_h)`: PE-center
+  coordinates (mm) per corner (NW/NE/SW/SE). Hardcoded patterns like
+  `(1.5, 1.5)` and `(cube_w-1.5, cube_h-1.5)`; with `pe_per_corner=2`,
+  each corner has two PE positions.
+- `col_xs = _compute_col_positions(...)`: union of PE x-coordinates,
+  plus relay columns inserted when any gap exceeds
+  `max_spacing = 3.0 mm`.
+- `row_ys, rows_per_half = _compute_row_positions(cube_h,
+  n_connections, pe_positions)`:
+  - `n_conn = max(n_connections, 2)` (hot-path minimum).
+  - `rows_per_half = ceil(n_conn / 2)`.
+  - Top half + two HBM rows + bottom half. HBM sits at
+    `(cube_h/2 - 1.5, cube_h/2 + 1.5)`. The gap between PE rows and
+    HBM rows is `hbm_gap = 1.5 mm`.
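The column and row arithmetic above can be sketched as follows. The relay placement policy is an assumption: the text only says relay columns are inserted when a gap exceeds `max_spacing`, so this sketch splits an oversized gap into equal segments. The function names are hypothetical and the input list is assumed to be non-empty.

```python
import math

MAX_SPACING_MM = 3.0  # max_spacing from D3.1


def compute_col_positions(
    pe_xs: list[float], max_spacing: float = MAX_SPACING_MM
) -> list[float]:
    xs = sorted(set(pe_xs))  # union of PE x-coordinates
    cols = [xs[0]]
    for left, right in zip(xs, xs[1:]):
        gap = right - left
        if gap > max_spacing:
            # Assumed policy: split the gap evenly so no segment exceeds max_spacing.
            segments = math.ceil(gap / max_spacing)
            cols.extend(left + gap * k / segments for k in range(1, segments))
        cols.append(right)
    return cols


def rows_per_half(n_connections: int) -> int:
    n_conn = max(n_connections, 2)  # minimum of two connections
    return math.ceil(n_conn / 2)
```

For PE columns at 1.5 mm and 10.5 mm, the 9 mm gap is split into three segments, which adds relay columns at 4.5 mm and 7.5 mm.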
+
+#### D3.2. HBM exclusion zone
+
+`hbm_row_start = rows_per_half`,
+`hbm_row_end = rows_per_half + 1`.
+`hbm_col_start = n_cols // 2 - 1`,
+`hbm_col_end = n_cols // 2`.
+
+Router slots inside this (row, col) rectangle are marked `None` (no
+router). HBM controllers are added separately as
+`hbm_ctrl.pe{X}` nodes following ADR-0017 D9's per-PE partition
+pattern.
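As an illustration of the exclusion rectangle, the sketch below builds a router grid with `None` in the HBM slots. It assumes the start/end bounds are inclusive and that the grid has `2 * rows_per_half + 2` rows (top half, two HBM rows, bottom half); `build_router_grid` and the `r{row}c{col}` slot labels are hypothetical.

```python
from typing import Optional


def build_router_grid(n_cols: int, rows_per_half: int) -> list[list[Optional[str]]]:
    n_rows = 2 * rows_per_half + 2  # top half + two HBM rows + bottom half
    hbm_row_start, hbm_row_end = rows_per_half, rows_per_half + 1
    hbm_col_start, hbm_col_end = n_cols // 2 - 1, n_cols // 2
    grid: list[list[Optional[str]]] = []
    for r in range(n_rows):
        row: list[Optional[str]] = []
        for c in range(n_cols):
            # Assumed inclusive bounds on both axes.
            in_hbm = (
                hbm_row_start <= r <= hbm_row_end
                and hbm_col_start <= c <= hbm_col_end
            )
            row.append(None if in_hbm else f"r{r}c{c}")
        grid.append(row)
    return grid
```

With `n_cols=4` and `rows_per_half=2`, the grid has six rows and the four slots at rows 2-3, columns 1-2 hold no router.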
+
+#### D3.3. PE attachment
+
+Each corner's PEs map to a row:
+
+- Top half: NW → row 0, NE → row 1 (top_corners index).
+- Bottom half: SW → row `hbm_row_end + 1`, SE → row
+  `hbm_row_end + 2`.
+
+Each PE's x-coordinate attaches to the nearest column's router
+(`min(range(n_cols), key=lambda c: abs(col_xs[c] - pe_x))`).
+Attachment items are `pe{pe_idx}.dma`, `pe{pe_idx}.cpu`,
+`pe{pe_idx}.hbm` (pushed into the router's attach list).
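The nearest-column expression quoted above behaves as in this small sketch. `nearest_col` and `attach_pe` are hypothetical wrappers, and the attach list is modeled as a dict keyed by `(row, col)` purely for illustration.

```python
def nearest_col(col_xs: list[float], pe_x: float) -> int:
    # Same expression as in D3.3; on a tie, min() returns the lowest column index.
    return min(range(len(col_xs)), key=lambda c: abs(col_xs[c] - pe_x))


def attach_pe(
    attach: dict[tuple[int, int], list[str]],
    row: int,
    col_xs: list[float],
    pe_idx: int,
    pe_x: float,
) -> None:
    col = nearest_col(col_xs, pe_x)
    attach.setdefault((row, col), []).extend(
        [f"pe{pe_idx}.dma", f"pe{pe_idx}.cpu", f"pe{pe_idx}.hbm"]
    )
```

One consequence worth knowing: a PE exactly halfway between two columns attaches to the left one, because `min` keeps the first minimum it sees.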
+
+#### D3.4. M_CPU / SRAM attachment — nearest router by Euclidean distance
+
+For `placement.m_cpu.pos_mm` (default `[1.5, 5.5]`) and
+`placement.sram.pos_mm` (default `[1.5, 8.5]`), find the router with
+the smallest Euclidean distance and append `"m_cpu"` / `"sram"` to
+its attach list.
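A minimal sketch of the Euclidean nearest-router search, assuming that `None` slots (the HBM exclusion zone) are skipped and that router positions are given by `col_xs` / `row_ys` in mm. `nearest_router` is a hypothetical name, not the function in `mesh_gen.py`.

```python
import math
from typing import Optional, Sequence


def nearest_router(
    grid: Sequence[Sequence[Optional[str]]],
    col_xs: Sequence[float],
    row_ys: Sequence[float],
    pos_mm: Sequence[float],
) -> tuple[int, int]:
    best: Optional[tuple[int, int]] = None
    best_dist = math.inf
    for r, row in enumerate(grid):
        for c, slot in enumerate(row):
            if slot is None:
                continue  # assumed: empty slots cannot host an attachment
            dist = math.hypot(col_xs[c] - pos_mm[0], row_ys[r] - pos_mm[1])
            if dist < best_dist:
                best, best_dist = (r, c), dist
    if best is None:
        raise ValueError("grid has no routers")
    return best
```

The returned `(row, col)` identifies the router whose attach list would receive `"m_cpu"` or `"sram"`.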
+
+#### D3.5. UCIe N/S/E/W distribution
+
+`ucie_pe_rows = top_pe_rows + bot_pe_rows` (total
+`2 * rows_per_half`).
+
+- UCIe-E: walk the PE rows one at a time and attach `ucie_e.c{i}` to
+  the router in the rightmost column.
+- UCIe-W: attach `ucie_w.c{i}` to the leftmost column's router (E's
+  mirror).
+- UCIe-N/S: split PE columns into left and right halves; attach to
+  the top row's / bottom row's matching columns.
+
+Each UCIe connection is suffixed `c{i}`, which distributes the
+`ucie.n_connections` PHYs across routers (ADR-0017 D5+).
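The east/west half of the distribution could look like the sketch below. It assumes the suffix index `i` is the position of the row within `ucie_pe_rows`, which the text does not state explicitly; the N/S split is omitted because its column matching is not specified in enough detail here. `distribute_ucie_east_west` is a hypothetical name.

```python
def distribute_ucie_east_west(
    ucie_pe_rows: list[int], n_cols: int
) -> dict[tuple[int, int], list[str]]:
    attach: dict[tuple[int, int], list[str]] = {}
    for i, row in enumerate(ucie_pe_rows):
        # East PHY goes on the rightmost column, west PHY on the leftmost.
        attach.setdefault((row, n_cols - 1), []).append(f"ucie_e.c{i}")
        attach.setdefault((row, 0), []).append(f"ucie_w.c{i}")
    return attach
```

With `rows_per_half = 2` the PE rows would be two rows above and two below the HBM rows, giving four east and four west attachments.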
+
+### D4. Node naming convention — single ownership
+
+builder.py creates nodes with the following naming convention (the
+single-owner principle from ADR-0051 D5):
+
+- `fabric.switch0` — system-level switch.
+- `sip{S}.{io_id}.{pcie_ep|io_cpu|io_noc|io_ucie.{dir}|conn.{id}}` —
+  IO chiplet.
+- `sip{S}.cube{C}.{m_cpu|sram|hbm_ctrl.pe{X}|noc.r{R}c{C}|...}` —
+  inside cube.
+- `sip{S}.cube{C}.pe{P}.{pe_cpu|pe_dma|pe_fetch_store|pe_gemm|pe_math|pe_mmu|pe_tcm|pe_scheduler|pe_ipcq}` —
+  PE sub-components.
+
+Changing this convention requires updating both builder.py and
+router.py's helpers (ADR-0051). Components never know the convention
+directly — they only call the helpers.
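To show what single ownership of the naming convention buys, here is a sketch of formatter helpers that callers would use instead of building strings themselves. The helper names (`pe_node`, `router_node`, `hbm_ctrl_node`) are hypothetical and are not the `router.py` API described in ADR-0051; only the name patterns come from the list above.

```python
PE_PARTS = (
    "pe_cpu", "pe_dma", "pe_fetch_store", "pe_gemm", "pe_math",
    "pe_mmu", "pe_tcm", "pe_scheduler", "pe_ipcq",
)


def pe_node(sip: int, cube: int, pe: int, part: str) -> str:
    if part not in PE_PARTS:
        raise ValueError(f"unknown PE sub-component: {part!r}")
    return f"sip{sip}.cube{cube}.pe{pe}.{part}"


def router_node(sip: int, cube: int, row: int, col: int) -> str:
    return f"sip{sip}.cube{cube}.noc.r{row}c{col}"


def hbm_ctrl_node(sip: int, cube: int, pe: int) -> str:
    return f"sip{sip}.cube{cube}.hbm_ctrl.pe{pe}"
```

If the convention changes, only helpers like these need editing; callers that never format names by hand are unaffected.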
+
+### D5. Edge `kind` classification
+
+Every edge gets a `kind`; routing policy (ADR-0051 D2) reads it. Major
+kinds:
+
+- `"pe_internal"` — within a PE between sub-components.
+- `"pe_to_router"` — PE_DMA ↔ cube NoC router.
+- `"router_mesh"` — between cube NoC routers.
+- `"router_to_hbm"`, `"router_to_mcpu"`, `"router_to_sram"`,
+  `"sram_to_router"`, etc. — between cube-attached components.
+- `"ucie_internal"`, `"ucie_conn_to_router"`,
+  `"router_to_ucie_conn"`, `"ucie_conn_to_noc"`,
+  `"noc_to_ucie_conn"`, `"ucie_mesh"` — UCIe-related.
+- `"io_internal"` — inside IO chiplet.
+- `"io_to_cube"`, `"cube_to_io"` — at the IO ↔ cube boundary.
+- `"pcie"` — switch ↔ pcie_ep.
+- `"command"` — control-plane edges only (e.g., M_CPU ↔ NOC; excluded
+  from PE DMA paths).
+
+Adding a new edge kind requires picking a category in router.py's
+four adjacency graphs (ADR-0051 D2). If you forget, it defaults to
+`_adj_all` only, which can produce unintended routes.
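The relationship between edge kinds and adjacency graphs can be sketched as a filter over the edge list. The exclusion set here is an assumption for illustration: D5 only states that `"command"` edges are kept out of PE DMA paths, and the real sets live in ADR-0051 D2. `build_adjacency` and the shortened node names are hypothetical.

```python
from collections import defaultdict

# Assumed exclusion set; the actual sets are defined in router.py (ADR-0051 D2).
DMA_EXCLUDED_KINDS = frozenset({"command"})


def build_adjacency(
    edges: list[tuple[str, str, str]],
    excluded_kinds: frozenset = frozenset(),
) -> dict[str, list[str]]:
    adj: dict[str, list[str]] = defaultdict(list)
    for src, dst, kind in edges:
        if kind not in excluded_kinds:
            adj[src].append(dst)
    return dict(adj)


EXAMPLE_EDGES = [
    ("pe0.pe_dma", "noc.r0c0", "pe_to_router"),
    ("noc.r0c0", "noc.r0c1", "router_mesh"),
    ("m_cpu", "noc.r0c0", "command"),
]

adj_all = build_adjacency(EXAMPLE_EDGES)                      # every edge kind
adj_dma = build_adjacency(EXAMPLE_EDGES, DMA_EXCLUDED_KINDS)  # no command edges
```

A new edge kind that is not added to any exclusion set shows up in every graph built this way, which is the unintended-route risk described above.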
+
+### D6. View projection — four abstraction levels
+
+`TopologyGraph` keeps four view projections alongside the flat
+nodes+edges:
+
+- **system_view** (`_build_system_view`): Tray level. SIP blocks and
+  `fabric.switch0`. PCIe links shown. For external high-level
+  overview.
+- **sip_view** (`_build_sip_view`): inside one SIP — cube mesh + IO
+  chiplet (pcie_ep + io_cpu + io_noc). UCIe N/S/E/W appear as
+  cube-cube links.
+- **cube_view** (`_build_cube_view`): inside one cube — router grid +
+  PE / M_CPU / SRAM / HBM_CTRL attachments + UCIe PHY edges. For
+  intra-cube routing / placement debugging.
+- **pe_view** (`_build_pe_view`): inside one PE — nine sub-components
+  + internal edges (pe_internal kind). For detailed PE-internal
+  dataflow review.
+
+Views are selectively rendered via the spec's
+`visualization.emit_views: [system, sip, cube]` (ADR-0006). The pe
+view is omitted from default output but the code is retained for
+detailed debugging.
+
+### D7. visualizer.py — SVG diagram output
+
+`emit_diagrams(graph, out_dir)` renders each view selected by
+`visualization.emit_views` (D6) as SVG. Key functions:
+
+- `_render_view_svg(view)` — generic view render (no router grid).
+- `_render_cube_view_svg(view, spec)` — cube-view specific (HBM block,
+  router grid layout, PE/M_CPU/SRAM/HBM placement).
+- `_draw_node`, `_draw_edge` — node/edge visual representation.
+- `_pick_scale`, `_compute_node_sizes` — auto-scaling.
+
+The visualizer's output is a **derived artifact** (ADR-0006); changes
+here are not subject to production checks. This aligns with
+CLAUDE.md's "Derived Artifacts" guidance.
+
+### D8. Blast radius of spec changes
+
+| spec field                             | effect                  | mesh regenerated? |
+|----------------------------------------|-------------------------|-------------------|
+| `system.sips.count`                    | SIP count, node count   | No                |
+| `sip.cube_mesh.w/h`                    | cube mesh shape         | No                |
+| `cube.geometry.cube_mm.w/h`            | cube size (mm)          | **Yes**           |
+| `cube.pe_layout.corners/pe_per_corner` | PE attachment positions | **Yes**           |
+| `cube.ucie.n_connections`              | UCIe PHY distribution   | **Yes**           |
+| `cube.memory_map.hbm_mapping_mode`     | HBM distribution mode   | **Yes**           |
+| `cube.placement`                       | M_CPU/SRAM positions    | **Yes**           |
+| `cube.memory_map.*` (besides above)    | HBM capacity / BW       | No                |
+| `*.links.*.bw_gbs`                     | edge bandwidth          | No                |
+| `*.attrs.overhead_ns`                  | component latency       | No                |
+
+The table mirrors D2's `_compute_source_hash` inputs. Changes that
+require mesh regeneration automatically invalidate `cube_mesh.yaml`'s
+source_hash.
+
+## Alternatives Considered
+
+### A1. Regenerate the mesh on every compile without a cache file
+
+Rejected. The cost of mesh generation would be paid repeatedly (CLI
+runs, probe, tests) for the same spec, and the human-inspectable
+artifact would disappear.
+
+### A2. Merge mesh generation into builder.py
+
+Rejected (currently). It is a 305-line algorithm of its own, and the
+mesh-layout decisions (placement-driven router attachment, HBM
+exclusion zone) are different from builder's general node/edge
+emission. Keeping it separate respects single-responsibility.
+
+### A3. Express placement coordinates in cube coordinates (col/row)
+
+Rejected. mm coordinates flow consistently between the visualizer and
+mesh layout (for nearest-router computation). Cube coordinates are
+undefined until the router grid is fixed, so they are unsuitable as
+placement input.
+
+### A4. Lazy view projection generation
+
+Rejected (currently). The four views are cheap to build (typically <
+100 ms), and eager construction guarantees `TopologyGraph` as the
+single source of truth.
+
+### A5. Visualizer output in formats besides SVG (PNG/PDF)
+
+Rejected. SVG is vector + text-searchable + directly renderable in
+browsers. PNG conversion, when required, is downstream
+post-processing (e.g., rsvg-convert).
+
+## Consequences
+
+- ADR-0006's high-level intent is fleshed out via D1–D7; topology
+  changes can be assessed quickly via D8's table.
+- D3's mesh-layout algorithm is ADR-locked, so it is clear which
+  stage a future PE attachment pattern (e.g., a 6-zone HBM split)
+  would affect.
+- D5's edge-kind list and D6's view structure are explicit, giving PR
+  reviewers a quick map of where (builder + router + visualizer) a
+  new component type ripples through.
+- D2's source_hash invalidation rules are explicit, so a
+  `cube_mesh.yaml` left unchanged after a spec edit (e.g., when only
+  bandwidth changed) is recognized as correct behavior, not a stale
+  cache.