# ADR-0053: Topology Builder + Visualizer Algorithms

## Status

Accepted (2026-05-22).

Pins down the key algorithmic choices of the topology compile and visualization pipeline jointly implemented by `topology/builder.py`, `topology/mesh_gen.py`, and `topology/visualizer.py`: placement-driven router attachment, mesh auto-layout, the source_hash cache, view projections, and SVG rendering.

ADR-0006 defines the high-level intent of topology compilation (compiled topology, distance extraction, automatic diagram generation), but **which algorithms the builder actually uses** was only discoverable via source grep.

## First action

When `resolve_topology(path_str)` is called, four steps run in order:

1. **Path validation** (`builder.py::resolve_topology`): `Path(path_str).expanduser().resolve()`, existence check, file check. Failure → `FileNotFoundError` or `ValueError`.
2. **YAML parsing** (`_read_spec`): `yaml.safe_load`. Parse errors yield a `ValueError` with line/column. Non-dict roots are rejected.
3. **Auto-generate the mesh** (`mesh_gen.ensure_mesh_file`): create or reuse a `cube_mesh.yaml` next to the topology file. A matching source_hash is a cache hit; a mismatch triggers regeneration. This step decides the cube NoC's router grid and attachment information.
4. **Compile the graph** (`_compile_graph`): system → IO chiplets → cubes → inter-cube edges → IO↔cube edges → system↔IO edges, then build four view projections (system, sip, cube, pe) and wrap the result in a `TopologyGraph`.

In short, **topology compilation's first act is "read topology.yaml as a dict, create/validate cube_mesh.yaml in the same directory, then build the flat graph + 4-view projection in system → sip → cube → pe order"**.

## Context

`topology/` package responsibilities:

- **builder.py** (1207 lines): turns topology.yaml into a `TopologyGraph` (nodes + edges + 4 view projections).
- **mesh_gen.py** (305 lines): auto-decides the cube NoC's router grid and PE/UCIe/M_CPU/SRAM attachment positions and caches them in `cube_mesh.yaml`.
- **visualizer.py** (887 lines): generates four SVG diagrams (system / sip / cube / pe) from a `TopologyGraph`.

ADR-0006 makes the high-level decision that "the result of topology compilation is the single source for distance metadata and diagram generation", but the specific algorithms (e.g., placement-driven nearest-router attachment, the HBM exclusion zone, which fields in source_hash trigger regeneration) are not recorded in any ADR.

In particular, these decisions are absent at ADR level:

- Why is mesh_gen cached in a separate file (`cube_mesh.yaml`)?
- Which fields are in source_hash, and which changes force regeneration?
- Why are placement coordinates given in mm rather than cube coordinates?
- How are the HBM exclusion zone and UCIe N/S/E/W distribution decided inside the mesh?
- What is the abstraction-level difference among the four view projections (system/sip/cube/pe)?

This ADR captures these decisions in one place.

## Decision

### D1. Compile pipeline: six stages

`_compile_graph(spec)`:

1. **System nodes** (`_instantiate_system`): add system-level nodes like `fabric.switch0` and the host CPU.
2. **Per-SIP loop** (`for sip_id in range(system.sips.count)`):
   - **IO chiplets** (`_instantiate_io_chiplets`): create pcie_ep / io_cpu / io_noc / io_ucie PHYs / conn nodes and their bidirectional internal edges.
   - **Cube instantiation** (`_instantiate_cube`): using cube_mesh.yaml's router grid, instantiate cube routers, PE sub-components (pe_cpu, pe_dma, pe_fetch_store, pe_gemm, pe_math, pe_mmu, pe_tcm, pe_scheduler, pe_ipcq), m_cpu, sram, hbm_ctrl, and their internal edges.
   - **Inter-cube edges** (`_add_inter_cube_edges`): the UCIe N/S/E/W mesh edges.
   - **IO ↔ cube edges** (`_add_io_to_cube_edges`): connect io_noc to each cube's edge UCIe phy.
3. **Switch ↔ IO edges** (`_add_system_to_io_edges`): bidirectional edges between `fabric.switch0` and each SIP's `pcie_ep` (the cross-SIP IPCQ path of ADR-0038 D3 + ADR-0010).
4. **Build four view projections**:
   - `_build_system_view(spec)`: Tray level, SIPs and the system switch.
   - `_build_sip_view(spec)`: inside one SIP, cube mesh + IO chiplet.
   - `_build_cube_view(spec)`: inside one cube, router grid + PE / M_CPU / SRAM / HBM_CTRL attachments.
   - `_build_pe_view(spec)`: inside one PE, nine sub-components + internal edges (pe_internal kind).
5. **Return `TopologyGraph`**: `TopologyGraph(spec, nodes, edges, system_view, sip_view, cube_view, pe_view)`.

The six graph-building stages (system nodes, IO chiplets, cubes, inter-cube edges, IO ↔ cube edges, switch ↔ IO edges) are **ordered for a reason**: inter-cube edges have valid src/dst only after the cubes exist, and IO chiplets must precede the IO ↔ cube edges that reference them. A new node type must be inserted at the matching point in this order.

### D2. `cube_mesh.yaml`: a separate file with a source_hash cache

`mesh_gen.ensure_mesh_file(cube_spec, mesh_path)`:

1. Compute `source_hash = _compute_source_hash(cube_spec)` from these input fields:
   - `geometry` (cube_mm.w/h …).
   - `pe_layout` (corners, pe_per_corner).
   - `ucie.n_connections`.
   - `memory_map.hbm_mapping_mode`.
   - `placement` (m_cpu/sram pos_mm).
2. If `mesh_path` (= `cube_mesh.yaml` next to topology.yaml) exists and `existing.source_hash == source_hash`, reuse it (cache hit).
3. Otherwise, generate a new mesh via `_generate_mesh(cube_spec, source_hash)` and write it to yaml.

The mesh is cached as a separate file because:

- Mesh generation involves nontrivial PE/UCIe/router attachment math and is too expensive to redo on every compile.
- Multiple runs with the same cube spec must produce an identical mesh.
- The resulting mesh is itself an inspectable / debuggable artifact.

The five fields that feed source_hash are the ones that determine mesh shape; other changes (e.g., bandwidth, overhead_ns) do not trigger mesh regeneration.

### D3. Cube NoC mesh auto-layout

`_generate_mesh(cube_spec)`:

#### D3.1. Rows / columns

- `pe_positions = _corner_pe_positions(cube_w, cube_h)`: PE-center coordinates (mm) per corner (NW/NE/SW/SE). These are hardcoded patterns like `(1.5, 1.5)` and `(cube_w-1.5, cube_h-1.5)`; with `pe_per_corner=2`, each corner has two PE positions.
- `col_xs = _compute_col_positions(...)`: union of PE x-coordinates, plus relay columns inserted when any gap exceeds `max_spacing = 3.0 mm`.
- `row_ys, rows_per_half = _compute_row_positions(cube_h, n_connections, pe_positions)`:
  - `n_conn = max(n_connections, 2)` (hot-path minimum).
  - `rows_per_half = ceil(n_conn / 2)`.
  - Row order is top half + two HBM rows + bottom half. The HBM rows sit at `(cube_h/2 - 1.5, cube_h/2 + 1.5)`. The gap between PE rows and HBM rows is `hbm_gap = 1.5 mm`.

#### D3.2. HBM exclusion zone

`hbm_row_start = rows_per_half`, `hbm_row_end = rows_per_half + 1`.
`hbm_col_start = n_cols // 2 - 1`, `hbm_col_end = n_cols // 2`.

Router slots inside this (row, col) rectangle are marked `None` (no router). HBM controllers are added separately as `hbm_ctrl.pe{X}` nodes following ADR-0017 D9's per-PE partition pattern.

#### D3.3. PE attachment

Each corner's PEs map to a row:

- Top half: NW → row 0, NE → row 1 (top_corners index).
- Bottom half: SW → row `hbm_row_end + 1`, SE → row `hbm_row_end + 2`.

Within its row, each PE attaches to the router in the column whose x-coordinate is nearest to the PE's (`min(range(n_cols), key=lambda c: abs(col_xs[c] - pe_x))`). The attachment items `pe{pe_idx}.dma`, `pe{pe_idx}.cpu`, and `pe{pe_idx}.hbm` are pushed into that router's attach list.

#### D3.4. M_CPU / SRAM attachment: nearest router by Euclidean distance

For `placement.m_cpu.pos_mm` (default `[1.5, 5.5]`) and `placement.sram.pos_mm` (default `[1.5, 8.5]`), find the router with the smallest Euclidean distance and append `"m_cpu"` / `"sram"` to its attach list.

#### D3.5. UCIe N/S/E/W distribution

`ucie_pe_rows = top_pe_rows + bot_pe_rows` (total `2 * rows_per_half`).

- UCIe-E: for each PE row in turn, attach `ucie_e.c{i}` to the rightmost column's router.
- UCIe-W: attach `ucie_w.c{i}` to the leftmost column's router (E's mirror).
- UCIe-N/S: split PE columns into left and right halves; attach to the matching columns of the top row (N) and bottom row (S).

Each UCIe connection is suffixed `c{i}`, distributing ucie_n_connections PHYs (ADR-0017 D5+).

### D4. Node naming convention: single ownership

builder.py creates nodes with the following naming convention (the single-owner principle from ADR-0051 D5):

- `fabric.switch0`: system-level switch.
- `sip{S}.{io_id}.{pcie_ep|io_cpu|io_noc|io_ucie.{dir}|conn.{id}}`: IO chiplet.
- `sip{S}.cube{C}.{m_cpu|sram|hbm_ctrl.pe{X}|noc.r{R}c{C}|...}`: inside cube (in `noc.r{R}c{C}`, `{R}` and the second `{C}` are the router's row and column, not the cube index).
- `sip{S}.cube{C}.pe{P}.{pe_cpu|pe_dma|pe_fetch_store|pe_gemm|pe_math|pe_mmu|pe_tcm|pe_scheduler|pe_ipcq}`: PE sub-components.

Changing this convention requires updating both builder.py and router.py's helpers (ADR-0051). Components never know the convention directly; they only call the helpers.

### D5. Edge `kind` classification

Every edge gets a `kind`; routing policy (ADR-0051 D2) reads it. Major kinds:

- `"pe_internal"`: within a PE, between sub-components.
- `"pe_to_router"`: PE_DMA ↔ cube NoC router.
- `"router_mesh"`: between cube NoC routers.
- `"router_to_hbm"`, `"router_to_mcpu"`, `"router_to_sram"`, `"sram_to_router"`, etc.: between routers and cube-attached components.
- `"ucie_internal"`, `"ucie_conn_to_router"`, `"router_to_ucie_conn"`, `"ucie_conn_to_noc"`, `"noc_to_ucie_conn"`, `"ucie_mesh"`: UCIe-related.
- `"io_internal"`: inside the IO chiplet.
- `"io_to_cube"`, `"cube_to_io"`: at the IO ↔ cube boundary.
- `"pcie"`: switch ↔ pcie_ep.
- `"command"`: control-plane edges only (e.g., M_CPU ↔ NOC; excluded from PE DMA paths).

Adding a new edge kind requires assigning it to a category in router.py's four adjacency graphs (ADR-0051 D2). If this is forgotten, the kind lands in `_adj_all` only, which can produce unintended routes.

### D6. View projection: four abstraction levels

`TopologyGraph` keeps four view projections alongside the flat nodes+edges:

- **system_view** (`_build_system_view`): Tray level. SIP blocks and `fabric.switch0`, with PCIe links shown. For an external high-level overview.
- **sip_view** (`_build_sip_view`): inside one SIP. Cube mesh + IO chiplet (pcie_ep + io_cpu + io_noc). UCIe N/S/E/W appear as cube-cube links.
- **cube_view** (`_build_cube_view`): inside one cube. Router grid + PE / M_CPU / SRAM / HBM_CTRL attachments + UCIe PHY edges. For intra-cube routing / placement debugging.
- **pe_view** (`_build_pe_view`): inside one PE. Nine sub-components + internal edges (pe_internal kind). For detailed PE-internal dataflow review.

Views are selectively rendered via the spec's `visualization.emit_views: [system, sip, cube]` (ADR-0006). The pe view is omitted from default output, but the code is retained for detailed debugging.

### D7. visualizer.py: SVG diagram output

`emit_diagrams(graph, out_dir)` renders every view as SVG. Key functions:

- `_render_view_svg(view)`: generic view render (no router grid).
- `_render_cube_view_svg(view, spec)`: cube-view specific (HBM block, router grid layout, PE/M_CPU/SRAM/HBM placement).
- `_draw_node`, `_draw_edge`: node/edge visual representation.
- `_pick_scale`, `_compute_node_sizes`: auto-scaling.

The visualizer output is a **derived artifact** (ADR-0006); changes here do not go through production checks. This aligns with CLAUDE.md's "Derived Artifacts" guidance.

### D8. Blast radius of spec changes

| spec field | effect | mesh regenerated? |
|---|---|---|
| `system.sips.count` | SIP count, node count | No |
| `sip.cube_mesh.w/h` | cube mesh shape | No |
| `cube.geometry.cube_mm.w/h` | cube size (mm) | **Yes** |
| `cube.pe_layout.corners/pe_per_corner` | PE attachment positions | **Yes** |
| `cube.ucie.n_connections` | UCIe PHY distribution | **Yes** |
| `cube.memory_map.hbm_mapping_mode` | HBM distribution mode | **Yes** |
| `cube.placement` | M_CPU/SRAM positions | **Yes** |
| `cube.memory_map.*` (besides above) | HBM capacity / BW | No |
| `*.links.*.bw_gbs` | edge bandwidth | No |
| `*.attrs.overhead_ns` | component latency | No |

The **Yes** rows mirror D2's `_compute_source_hash` inputs. A change to any of them alters the computed source_hash, so the existing `cube_mesh.yaml` no longer matches and is regenerated automatically.

## Alternatives Considered

### A1. Regenerate the mesh on every compile without a cache file

Rejected. The cost of mesh generation would be paid repeatedly (CLI runs, probe, tests) for the same spec, and the human-inspectable artifact would disappear.

### A2. Merge mesh generation into builder.py

Rejected (currently). It is a 305-line algorithm of its own, and the mesh-layout decisions (placement-driven router attachment, HBM exclusion zone) are a different concern from builder's general node/edge emission. Keeping it separate respects single responsibility.

### A3. Express placement coordinates in cube coordinates (col/row)

Rejected. mm coordinates flow consistently between the visualizer and mesh layout (for nearest-router computation). Cube coordinates are undefined until the router grid is fixed, so they are unsuitable as placement input.

### A4. Lazy view projection generation

Rejected (currently). The four views are cheap to build (typically < 100 ms), and eager construction guarantees `TopologyGraph` as the single source of truth.

### A5. Visualizer output in formats besides SVG (PNG/PDF)

Rejected.
SVG is a vector format, text-searchable, and directly renderable in browsers. PNG conversion, when required, is downstream post-processing (e.g., rsvg-convert).

## Consequences

- ADR-0006's high-level intent is fleshed out via D1–D7; the impact of a topology change can be assessed quickly via D8's table.
- D3's mesh-layout algorithm is ADR-locked, so it is clear which stage a future PE attachment pattern (e.g., a 6-zone HBM split) affects.
- D5's edge-kind list and D6's view structure are explicit, giving PR reviewers a quick map of where (builder + router + visualizer) a new component type ripples through.
- D2's source_hash invalidation rules are explicit, so a `cube_mesh.yaml` that is left untouched after a spec edit (e.g., when only bandwidth changed) is recognized as correct behavior rather than a stale cache.
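
To make D2's invalidation rule concrete, here is a minimal sketch of a field-scoped hash. It is not the code in `mesh_gen.py`: the function names `compute_source_hash` and `is_cache_hit`, the JSON + SHA-256 serialization, and every value in the sample spec (including the `bw_gbs` and `hbm_capacity_gb` keys and the `"interleaved"` mode) are assumptions made for illustration. Only the five hashed field paths come from D2.

```python
import copy
import hashlib
import json

# The five inputs named in D2. How the real _compute_source_hash selects and
# serializes them is not specified in this ADR; this layout is an assumption.
HASHED_PATHS = (
    ("geometry",),
    ("pe_layout",),
    ("ucie", "n_connections"),
    ("memory_map", "hbm_mapping_mode"),
    ("placement",),
)


def _lookup(spec, path):
    """Walk nested dict keys; return None if any key is missing."""
    node = spec
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def compute_source_hash(cube_spec):
    """Hash only the mesh-shaping fields, using a canonical JSON encoding."""
    selected = {".".join(p): _lookup(cube_spec, p) for p in HASHED_PATHS}
    canonical = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def is_cache_hit(existing_mesh, cube_spec):
    """True when a previously written mesh carries the same source_hash."""
    return bool(existing_mesh) and (
        existing_mesh.get("source_hash") == compute_source_hash(cube_spec)
    )


# Hypothetical cube spec; all values are made up for the example.
base = {
    "geometry": {"cube_mm": {"w": 12.0, "h": 14.0}},
    "pe_layout": {"corners": ["NW", "NE", "SW", "SE"], "pe_per_corner": 2},
    "ucie": {"n_connections": 4, "bw_gbs": 64},
    "memory_map": {"hbm_mapping_mode": "interleaved", "hbm_capacity_gb": 24},
    "placement": {"m_cpu": {"pos_mm": [1.5, 5.5]}, "sram": {"pos_mm": [1.5, 8.5]}},
}

bw_only = copy.deepcopy(base)
bw_only["ucie"]["bw_gbs"] = 128          # not a hashed field

more_phys = copy.deepcopy(base)
more_phys["ucie"]["n_connections"] = 8   # hashed field

cached = {"source_hash": compute_source_hash(base)}
print(is_cache_hit(cached, bw_only))     # True: mesh is reused
print(is_cache_hit(cached, more_phys))   # False: mesh would be regenerated
```

Hashing a whitelist of fields, rather than the whole cube spec, is what keeps bandwidth or latency edits from rewriting `cube_mesh.yaml`; the trade-off is that any new mesh-shaping field has to be added to the list explicitly.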