adr: add ADR-0050-0053 — close /report's second-pass G4 candidates
Documents four cross-cutting surfaces one layer deeper than the prior
G4 batch:

- 0050 par-ccl-algorithm-module-contract: how to author a new CCL
  algorithm in src/kernbench/ccl/algorithms/. Pairs with ADR-0045's
  bench-module contract. Pins the four required public symbols
  (kernel, kernel_args, TOPO_NAME_TO_KIND constants, kernel alias),
  the 9 + tl standardized kernel signature, the kernel_args tuple
  format, sip_topo_kind dispatch, and the ccl.yaml entry workflow.
- 0051 lat-routing-helper-api: every public method of AddressResolver
  (resolve, find_m_cpu, find_pcie_ep, find_io_cpu, find_all_pcie_eps)
  and PathRouter (find_path, find_path_with_distance,
  find_mcpu_dma_path, find_memory_path, find_node_path + 2 shims).
  Pins the four adjacency graphs (_adj_all / _adj / _adj_mcpu_dma /
  _adj_local) and the edge-kind exclusion sets they use, plus the
  single-owner naming convention.
- 0052 dev-oplog-memory-store-schemas: OpRecord's 7 fields, the
  per-op_name params matrix (dma_read, dma_write, gemm_*, math, math
  reduction, composite_gemm, ipcq_copy, unknown), snapshot timing
  rules (math = all inputs, dma_write = HBM-only — ADR-0027 race
  avoidance), TileToken stage_type capture, and MemoryStore's
  (space, addr) two-level dict with reference-store semantics.
- 0053 dev-topology-builder-algorithms: the 6-stage compile pipeline,
  cube_mesh.yaml's source_hash cache and its 5 input fields, the cube
  NoC auto-layout algorithm (row/col placement, HBM exclusion zone,
  PE/M_CPU/SRAM attachment via nearest-router, UCIe N/S/E/W
  distribution), the node naming convention (single-owner with
  router.py), the edge-kind catalog, the 4 view projections, and a
  table of spec-field changes vs mesh regeneration.

Bilingual pair verifier passes for all four EN/KO pairs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -0,0 +1,351 @@
# ADR-0053: Topology Builder + Visualizer Algorithms

## Status

Accepted (2026-05-22).

This ADR pins down the key algorithmic choices of the topology compile and
visualization pipeline jointly implemented by `topology/builder.py`,
`topology/mesh_gen.py`, and `topology/visualizer.py` —
placement-driven router attachment, mesh auto-layout, the source_hash
cache, view projections, and SVG rendering. ADR-0006 defines the
high-level intent of topology compilation (compiled topology, distance
extraction, automatic diagram generation), but **which algorithms the
builder actually uses** was only discoverable via source grep.

## First action

When `resolve_topology(path_str)` is called, four steps run in order:

1. **Path validation** (`builder.py::resolve_topology`):
   `Path(path_str).expanduser().resolve()`, existence check, file
   check. Failure → `FileNotFoundError` or `ValueError`.
2. **YAML parsing** (`_read_spec`): `yaml.safe_load`. Parse errors
   yield a `ValueError` with line/column. Non-dict roots are
   rejected.
3. **Auto-generate the mesh** (`mesh_gen.ensure_mesh_file`): create or
   reuse a `cube_mesh.yaml` next to the topology file. Cache hit on
   matching source_hash; miss triggers regeneration. This step decides
   the cube NoC's router grid and attachment information.
4. **Compile the graph** (`_compile_graph`): system → IO chiplets →
   cubes → inter-cube edges → IO↔cube edges → system↔IO edges, then
   build four view projections (system, sip, cube, pe) and wrap into
   a `TopologyGraph`.

In short, **topology compilation's first act is "read topology.yaml as
a dict, create/validate cube_mesh.yaml in the same directory, then
build the flat graph + 4-view projection in system → sip → cube → pe
order"**.

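
Steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the code in `builder.py`: the function names `validate_topology_path` and `require_dict_root` are hypothetical, and the mapping of each failure to `FileNotFoundError` versus `ValueError` is an assumption (the text above only says that one of the two is raised).

```python
from pathlib import Path


def validate_topology_path(path_str: str) -> Path:
    """Expand and resolve the path, then run the existence and file checks."""
    path = Path(path_str).expanduser().resolve()
    if not path.exists():
        # Assumption: a missing path maps to FileNotFoundError.
        raise FileNotFoundError(f"topology file not found: {path}")
    if not path.is_file():
        # Assumption: an existing non-file (e.g. a directory) maps to ValueError.
        raise ValueError(f"topology path is not a file: {path}")
    return path


def require_dict_root(spec: object) -> dict:
    """Reject a parsed document whose root is not a mapping (step 2)."""
    if not isinstance(spec, dict):
        raise ValueError(
            f"topology root must be a mapping, got {type(spec).__name__}"
        )
    return spec
```

In this sketch the parsed object would come from `yaml.safe_load`, which is left out so the block depends only on the standard library.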

## Context

`topology/` package responsibilities:

- **builder.py** (1207 lines): turns topology.yaml into a
  `TopologyGraph` (nodes + edges + 4 view projections).
- **mesh_gen.py** (305 lines): auto-decides the cube NoC's router
  grid and PE/UCIe/M_CPU/SRAM attachment positions and caches them in
  `cube_mesh.yaml`.
- **visualizer.py** (887 lines): generates four SVG diagrams (system /
  sip / cube / pe) from a `TopologyGraph`.

ADR-0006 makes the high-level decision that "the result of topology
compilation is the single source for distance metadata and diagram
generation", but specific algorithms (e.g., placement-driven
nearest-router attachment, the HBM exclusion zone, which fields in
source_hash trigger regeneration) are not in any ADR.

In particular, these decisions are absent at ADR level:

- Why is mesh_gen cached in a separate file (`cube_mesh.yaml`)?
- Which fields are in source_hash, and which changes force
  regeneration?
- Why placement coordinates in mm rather than cube coordinates?
- How are the HBM exclusion zone and UCIe N/S/E/W distribution
  decided inside the mesh?
- What is the abstraction-level difference among the four view
  projections (system/sip/cube/pe)?

This ADR captures these decisions in one place.

## Decision

### D1. Compile pipeline — six stages

`_compile_graph(spec)`:

1. **System nodes** (`_instantiate_system`): add system-level nodes
   like `fabric.switch0` and the host CPU.
2. **Per-SIP loop** (`for sip_id in range(system.sips.count)`):
   - **IO chiplets** (`_instantiate_io_chiplets`): create pcie_ep /
     io_cpu / io_noc / io_ucie PHYs / conn nodes and their bidirectional
     internal edges.
   - **Cube instantiation** (`_instantiate_cube`): using
     cube_mesh.yaml's router grid, instantiate cube routers, PE
     sub-components (pe_cpu, pe_dma, pe_fetch_store, pe_gemm, pe_math,
     pe_mmu, pe_tcm, pe_scheduler, pe_ipcq), m_cpu, sram, hbm_ctrl,
     and their internal edges.
   - **Inter-cube edges** (`_add_inter_cube_edges`): the UCIe
     N/S/E/W mesh edges.
   - **IO ↔ cube edges** (`_add_io_to_cube_edges`): connect io_noc to
     each cube's edge UCIe phy.
3. **Switch ↔ IO edges** (`_add_system_to_io_edges`): bidirectional
   edges between `fabric.switch0` and each SIP's `pcie_ep` (the
   cross-SIP IPCQ path of ADR-0038 D3 + ADR-0010).
4. **Build four view projections**:
   - `_build_system_view(spec)` — Tray level: SIPs and the system
     switch.
   - `_build_sip_view(spec)` — inside one SIP: cube mesh + IO
     chiplet.
   - `_build_cube_view(spec)` — inside one cube: router grid + PE /
     M_CPU / SRAM / HBM_CTRL attachments.
   - `_build_pe_view(spec)` — inside one PE: nine sub-components +
     internal edges (pe_internal kind).
5. **Return `TopologyGraph`**: `TopologyGraph(spec, nodes, edges,
   system_view, sip_view, cube_view, pe_view)`.

The six graph-construction stages (system nodes, IO chiplets, cubes,
inter-cube edges, IO ↔ cube edges, switch ↔ IO edges) are **ordered
for a reason**: only after cubes exist do inter-cube edges have valid
src/dst, and IO chiplets must precede the IO ↔ cube edges that
reference them. New node types must slot in the right spot.

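
The ordering constraint can be illustrated with a toy graph that refuses edges whose endpoints do not exist yet. This is only a sketch: `GraphDraft` and `compile_in_order` are hypothetical names, the `io0` chiplet id and the bare `sip{S}.cube{C}` nodes are simplified placeholders, and each link is added in one direction only, whereas the real builder emits bidirectional edges between full node names.

```python
class GraphDraft:
    """Toy graph that rejects edges whose endpoints do not exist yet."""

    def __init__(self) -> None:
        self.nodes: set[str] = set()
        self.edges: list[tuple[str, str, str]] = []

    def add_node(self, name: str) -> None:
        self.nodes.add(name)

    def add_edge(self, src: str, dst: str, kind: str) -> None:
        for endpoint in (src, dst):
            if endpoint not in self.nodes:
                raise KeyError(f"edge endpoint {endpoint!r} does not exist yet")
        self.edges.append((src, dst, kind))


def compile_in_order(n_sips: int, cubes_per_sip: int = 2) -> GraphDraft:
    """Run the six stages in order (requires cubes_per_sip >= 1)."""
    g = GraphDraft()
    g.add_node("fabric.switch0")                        # stage 1: system nodes
    for s in range(n_sips):
        pcie_ep = f"sip{s}.io0.pcie_ep"                 # stage 2: IO chiplet
        io_noc = f"sip{s}.io0.io_noc"
        g.add_node(pcie_ep)
        g.add_node(io_noc)
        g.add_edge(pcie_ep, io_noc, "io_internal")
        cubes = [f"sip{s}.cube{c}" for c in range(cubes_per_sip)]
        for cube in cubes:                              # stage 3: cubes
            g.add_node(cube)
        for a, b in zip(cubes, cubes[1:]):              # stage 4: inter-cube edges
            g.add_edge(a, b, "ucie_mesh")
        g.add_edge(io_noc, cubes[0], "io_to_cube")      # stage 5: IO <-> cube
    for s in range(n_sips):                             # stage 6: switch <-> IO
        g.add_edge("fabric.switch0", f"sip{s}.io0.pcie_ep", "pcie")
    return g
```

Moving stage 4 ahead of stage 3 in this toy raises `KeyError` immediately, which is the failure mode the stage order is meant to rule out.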

### D2. `cube_mesh.yaml` — a separate file with a source_hash cache

`mesh_gen.ensure_mesh_file(cube_spec, mesh_path)`:

1. Compute `source_hash = _compute_source_hash(cube_spec)` from these
   input fields:
   - `geometry` (cube_mm.w/h …).
   - `pe_layout` (corners, pe_per_corner).
   - `ucie.n_connections`.
   - `memory_map.hbm_mapping_mode`.
   - `placement` (m_cpu/sram pos_mm).
2. If `mesh_path` (= `cube_mesh.yaml` next to topology.yaml) exists
   and `existing.source_hash == source_hash`, reuse it (cache hit).
3. Otherwise, generate a new mesh via
   `_generate_mesh(cube_spec, source_hash)` and write to yaml.

The mesh is cached in a separate file because:

- Mesh generation involves nontrivial PE/UCIe/router attachment math
  and is too expensive to redo every time.
- Multiple runs with the same cube spec must guarantee an identical
  mesh.
- The resulting mesh is itself an inspectable / debuggable artifact.

The five fields listed in source_hash are the ones that determine
mesh shape; other changes (e.g., bandwidth, overhead_ns) do not
trigger mesh regeneration.

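
The cache rule can be sketched as a hash over only the five listed fields. The hashing scheme here (SHA-256 over canonical JSON) and the names `HASHED_FIELDS`, `compute_source_hash`, and `needs_regeneration` are assumptions made for illustration; the actual `_compute_source_hash` may serialize and hash differently.

```python
import hashlib
import json

# The five inputs listed above, as key paths into the cube spec.
HASHED_FIELDS = (
    ("geometry",),
    ("pe_layout",),
    ("ucie", "n_connections"),
    ("memory_map", "hbm_mapping_mode"),
    ("placement",),
)


def _pick(spec, path):
    """Follow a key path; return None when any key is missing."""
    node = spec
    for key in path:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def compute_source_hash(cube_spec: dict) -> str:
    """Hash only the mesh-shaping fields (assumed scheme: SHA-256 of JSON)."""
    payload = {".".join(p): _pick(cube_spec, p) for p in HASHED_FIELDS}
    blob = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def needs_regeneration(cube_spec: dict, cached_mesh) -> bool:
    """Cache miss when there is no cached mesh or its source_hash differs."""
    if cached_mesh is None:
        return True
    return cached_mesh.get("source_hash") != compute_source_hash(cube_spec)
```

Because fields outside the list never enter the payload, editing a bandwidth or capacity value leaves the hash unchanged and the cached mesh is reused.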

### D3. Cube NoC mesh auto-layout

`_generate_mesh(cube_spec, source_hash)`:

#### D3.1. Rows / columns

- `pe_positions = _corner_pe_positions(cube_w, cube_h)`: PE-center
  coordinates (mm) per corner (NW/NE/SW/SE). Hardcoded patterns like
  `(1.5, 1.5)` and `(cube_w-1.5, cube_h-1.5)`; with `pe_per_corner=2`,
  each corner has two PE positions.
- `col_xs = _compute_col_positions(...)`: union of PE x-coordinates,
  plus relay columns inserted when any gap exceeds
  `max_spacing = 3.0 mm`.
- `row_ys, rows_per_half = _compute_row_positions(cube_h,
  n_connections, pe_positions)`:
  - `n_conn = max(n_connections, 2)` (hot-path minimum).
  - `rows_per_half = ceil(n_conn / 2)`.
  - Top half + two HBM rows + bottom half. HBM sits at
    `(cube_h/2 - 1.5, cube_h/2 + 1.5)`. The gap between PE rows and
    HBM rows is `hbm_gap = 1.5 mm`.

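
The column and row-count rules can be sketched as below. The `rows_per_half` arithmetic follows the formulas above directly; how relay columns are spaced inside an oversized gap is not stated here, so the even subdivision in `compute_col_positions` is an assumption, and both function names are hypothetical.

```python
import math

MAX_SPACING_MM = 3.0


def compute_col_positions(pe_xs, max_spacing=MAX_SPACING_MM):
    """Union of PE x-coordinates plus relay columns for oversized gaps."""
    xs = sorted(set(pe_xs))
    if not xs:
        return []
    cols = [xs[0]]
    for nxt in xs[1:]:
        prev = cols[-1]
        gap = nxt - prev
        if gap > max_spacing:
            # Assumption: split the gap evenly so no segment exceeds max_spacing.
            n_segments = math.ceil(gap / max_spacing)
            step = gap / n_segments
            cols.extend(prev + step * i for i in range(1, n_segments))
        cols.append(nxt)
    return cols


def rows_per_half(n_connections: int) -> int:
    """rows_per_half = ceil(max(n_connections, 2) / 2)."""
    n_conn = max(n_connections, 2)
    return math.ceil(n_conn / 2)
```

For example, PE columns at 1.5 mm and 10.5 mm leave a 9 mm gap, so this sketch inserts two relay columns between them.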

#### D3.2. HBM exclusion zone

`hbm_row_start = rows_per_half`,
`hbm_row_end = rows_per_half + 1`.
`hbm_col_start = n_cols // 2 - 1`,
`hbm_col_end = n_cols // 2`.

Router slots inside this (row, col) rectangle are marked `None` (no
router). HBM controllers are added separately as
`hbm_ctrl.pe{X}` nodes following ADR-0017 D9's per-PE partition
pattern.

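
A minimal sketch of the exclusion rectangle, assuming the start/end indices are inclusive and that the grid has `2 * rows_per_half + 2` rows (top half, two HBM rows, bottom half). The function name `build_router_grid` and the `r{R}c{C}` slot labels are illustrative only.

```python
def build_router_grid(rows_per_half: int, n_cols: int):
    """Return a row-major grid with None in the HBM exclusion zone."""
    n_rows = 2 * rows_per_half + 2
    hbm_row_start, hbm_row_end = rows_per_half, rows_per_half + 1
    hbm_col_start, hbm_col_end = n_cols // 2 - 1, n_cols // 2
    grid = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            # Assumption: both bounds are inclusive.
            in_hbm = (hbm_row_start <= r <= hbm_row_end
                      and hbm_col_start <= c <= hbm_col_end)
            row.append(None if in_hbm else f"r{r}c{c}")
        grid.append(row)
    return grid
```

With `rows_per_half=2` and `n_cols=4`, rows 2 to 3 and columns 1 to 2 are emptied, leaving a 2x2 hole in the middle of a 6x4 grid.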

#### D3.3. PE attachment

Each corner's PEs map to a row:

- Top half: NW → row 0, NE → row 1 (top_corners index).
- Bottom half: SW → row `hbm_row_end + 1`, SE → row
  `hbm_row_end + 2`.

Each PE's x-coordinate attaches to the nearest column's router
(`min(range(n_cols), key=lambda c: abs(col_xs[c] - pe_x))`).
Attachment items are `pe{pe_idx}.dma`, `pe{pe_idx}.cpu`,
`pe{pe_idx}.hbm` (pushed into the router's attach list).

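
The nearest-column expression above can be wrapped in a small helper to see how it behaves. The name `attach_pe` and its return shape are hypothetical; only the `min(...)` expression and the three attachment item names come from the text.

```python
def attach_pe(col_xs, pe_idx, pe_x):
    """Pick the nearest column for a PE and list its attachment items."""
    n_cols = len(col_xs)
    col = min(range(n_cols), key=lambda c: abs(col_xs[c] - pe_x))
    items = [f"pe{pe_idx}.dma", f"pe{pe_idx}.cpu", f"pe{pe_idx}.hbm"]
    return col, items
```

Because `min` returns the first minimal element, a PE that sits exactly halfway between two columns attaches to the lower-indexed one.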

#### D3.4. M_CPU / SRAM attachment — nearest router by Euclidean distance

For `placement.m_cpu.pos_mm` (default `[1.5, 5.5]`) and
`placement.sram.pos_mm` (default `[1.5, 8.5]`), find the router with
the smallest Euclidean distance and append `"m_cpu"` / `"sram"` to
its attach list.

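
A sketch of the nearest-router search, under two assumptions that the text does not state: `pos_mm` is ordered `[x, y]`, and empty (`None`) slots such as the HBM exclusion zone are skipped. The function name `nearest_router` is hypothetical.

```python
import math


def nearest_router(pos_mm, col_xs, row_ys, grid):
    """Return the (row, col) of the router closest to pos_mm."""
    best = None
    best_dist = math.inf
    for r, y in enumerate(row_ys):
        for c, x in enumerate(col_xs):
            if grid[r][c] is None:
                continue  # assumption: slots without a router are skipped
            dist = math.hypot(x - pos_mm[0], y - pos_mm[1])
            if dist < best_dist:
                best, best_dist = (r, c), dist
    if best is None:
        raise ValueError("grid contains no routers")
    return best
```

The returned `(row, col)` identifies the router whose attach list would receive `"m_cpu"` or `"sram"`.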

#### D3.5. UCIe N/S/E/W distribution

`ucie_pe_rows = top_pe_rows + bot_pe_rows` (total
`2 * rows_per_half`).

- UCIe-E: one PE row at a time, attach `ucie_e.c{i}` to the rightmost
  column's router.
- UCIe-W: attach `ucie_w.c{i}` to the leftmost column's router (E's
  mirror).
- UCIe-N/S: split PE columns into left and right halves; attach to
  the top row's / bottom row's matching columns.

Each UCIe connection is suffixed `c{i}`, distributing
ucie_n_connections PHYs (ADR-0017 D5+).

### D4. Node naming convention — single ownership

builder.py creates nodes with the following naming convention (the
single-owner principle from ADR-0051 D5):

- `fabric.switch0` — system-level switch.
- `sip{S}.{io_id}.{pcie_ep|io_cpu|io_noc|io_ucie.{dir}|conn.{id}}` —
  IO chiplet.
- `sip{S}.cube{C}.{m_cpu|sram|hbm_ctrl.pe{X}|noc.r{R}c{C}|...}` —
  inside cube.
- `sip{S}.cube{C}.pe{P}.{pe_cpu|pe_dma|pe_fetch_store|pe_gemm|pe_math|pe_mmu|pe_tcm|pe_scheduler|pe_ipcq}` —
  PE sub-components.

Changing this convention requires updating both builder.py and
router.py's helpers (ADR-0051). Components never know the convention
directly — they only call the helpers.

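
The idea of keeping name construction behind helpers can be sketched as follows. These functions are hypothetical stand-ins written for this illustration; they are not the helpers that router.py actually exposes (those are specified in ADR-0051), and only the name patterns themselves are taken from the list above.

```python
# The nine PE sub-components named in the convention above.
PE_PARTS = (
    "pe_cpu", "pe_dma", "pe_fetch_store", "pe_gemm", "pe_math",
    "pe_mmu", "pe_tcm", "pe_scheduler", "pe_ipcq",
)


def router_name(sip: int, cube: int, row: int, col: int) -> str:
    """Cube NoC router: sip{S}.cube{C}.noc.r{R}c{C}."""
    return f"sip{sip}.cube{cube}.noc.r{row}c{col}"


def hbm_ctrl_name(sip: int, cube: int, pe: int) -> str:
    """Per-PE HBM controller: sip{S}.cube{C}.hbm_ctrl.pe{X}."""
    return f"sip{sip}.cube{cube}.hbm_ctrl.pe{pe}"


def pe_part_name(sip: int, cube: int, pe: int, part: str) -> str:
    """PE sub-component: sip{S}.cube{C}.pe{P}.{part}."""
    if part not in PE_PARTS:
        raise ValueError(f"unknown PE sub-component: {part!r}")
    return f"sip{sip}.cube{cube}.pe{pe}.{part}"
```

A component that needs a PE's DMA node would call `pe_part_name(0, 1, 3, "pe_dma")` instead of formatting the string itself, so a change to the convention stays confined to the helper module.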

### D5. Edge `kind` classification

Every edge gets a `kind`; routing policy (ADR-0051 D2) reads it. Major
kinds:

- `"pe_internal"` — within a PE between sub-components.
- `"pe_to_router"` — PE_DMA ↔ cube NoC router.
- `"router_mesh"` — between cube NoC routers.
- `"router_to_hbm"`, `"router_to_mcpu"`, `"router_to_sram"`,
  `"sram_to_router"`, etc. — between cube-attached components.
- `"ucie_internal"`, `"ucie_conn_to_router"`,
  `"router_to_ucie_conn"`, `"ucie_conn_to_noc"`,
  `"noc_to_ucie_conn"`, `"ucie_mesh"` — UCIe-related.
- `"io_internal"` — inside IO chiplet.
- `"io_to_cube"`, `"cube_to_io"` — at the IO ↔ cube boundary.
- `"pcie"` — switch ↔ pcie_ep.
- `"command"` — control-plane edges only (e.g., M_CPU ↔ NOC; excluded
  from PE DMA paths).

Adding a new edge kind requires picking a category in router.py's
four adjacency graphs (ADR-0051 D2). If you forget, it defaults to
`_adj_all` only, which can produce unintended routes.

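
One way to catch a forgotten classification early is a check against the kind catalog, sketched below together with a kind-filtered adjacency build. Both functions are hypothetical and do not reproduce router.py's four graphs or their exclusion sets (see ADR-0051 D2); the catalog lists only the kinds named above, and the "etc." entry means the real set is larger.

```python
# Kinds named in the list above; the real catalog has more entries.
KNOWN_EDGE_KINDS = frozenset({
    "pe_internal", "pe_to_router", "router_mesh",
    "router_to_hbm", "router_to_mcpu", "router_to_sram", "sram_to_router",
    "ucie_internal", "ucie_conn_to_router", "router_to_ucie_conn",
    "ucie_conn_to_noc", "noc_to_ucie_conn", "ucie_mesh",
    "io_internal", "io_to_cube", "cube_to_io", "pcie", "command",
})


def unclassified_kinds(edges):
    """Return the sorted edge kinds that are missing from the catalog."""
    return sorted({kind for _, _, kind in edges} - KNOWN_EDGE_KINDS)


def adjacency_without(edges, excluded_kinds):
    """Build src -> [dst] adjacency, skipping edges of excluded kinds."""
    adj = {}
    for src, dst, kind in edges:
        if kind in excluded_kinds:
            continue
        adj.setdefault(src, []).append(dst)
    return adj
```

Calling `adjacency_without(edges, {"command"})` mirrors the rule that control-plane edges are kept out of PE DMA paths, while `unclassified_kinds` could run in a test to flag a new kind before it silently falls through.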

### D6. View projection — four abstraction levels

`TopologyGraph` keeps four view projections alongside the flat
nodes+edges:

- **system_view** (`_build_system_view`): Tray level. SIP blocks and
  `fabric.switch0`. PCIe links shown. For external high-level
  overview.
- **sip_view** (`_build_sip_view`): inside one SIP — cube mesh + IO
  chiplet (pcie_ep + io_cpu + io_noc). UCIe N/S/E/W appear as
  cube-cube links.
- **cube_view** (`_build_cube_view`): inside one cube — router grid +
  PE / M_CPU / SRAM / HBM_CTRL attachments + UCIe PHY edges. For
  intra-cube routing / placement debugging.
- **pe_view** (`_build_pe_view`): inside one PE — nine sub-components
  \+ internal edges (pe_internal kind). For detailed PE-internal
  dataflow review.

Views are selectively rendered via the spec's
`visualization.emit_views: [system, sip, cube]` (ADR-0006). The pe
view is omitted from default output but the code is retained for
detailed debugging.

### D7. visualizer.py — SVG diagram output

`emit_diagrams(graph, out_dir)` renders every view as SVG. Key
functions:

- `_render_view_svg(view)` — generic view render (no router grid).
- `_render_cube_view_svg(view, spec)` — cube-view specific (HBM block,
  router grid layout, PE/M_CPU/SRAM/HBM placement).
- `_draw_node`, `_draw_edge` — node/edge visual representation.
- `_pick_scale`, `_compute_node_sizes` — auto-scaling.

The visualizer is a **derived artifact** (ADR-0006); changes here are
not subject to production checks. This aligns with CLAUDE.md's
"Derived Artifacts" guidance.

### D8. Blast radius of spec changes

| spec field                             | effect                  | mesh regenerated? |
|----------------------------------------|-------------------------|-------------------|
| `system.sips.count`                    | SIP count, node count   | No                |
| `sip.cube_mesh.w/h`                    | cube mesh shape         | No                |
| `cube.geometry.cube_mm.w/h`            | cube size (mm)          | **Yes**           |
| `cube.pe_layout.corners/pe_per_corner` | PE attachment positions | **Yes**           |
| `cube.ucie.n_connections`              | UCIe PHY distribution   | **Yes**           |
| `cube.memory_map.hbm_mapping_mode`     | HBM distribution mode   | **Yes**           |
| `cube.placement`                       | M_CPU/SRAM positions    | **Yes**           |
| `cube.memory_map.*` (besides above)    | HBM capacity / BW       | No                |
| `*.links.*.bw_gbs`                     | edge bandwidth          | No                |
| `*.attrs.overhead_ns`                  | component latency       | No                |

The table mirrors D2's `_compute_source_hash` inputs. Changes that
require mesh regeneration automatically invalidate `cube_mesh.yaml`'s
source_hash.

## Alternatives Considered

### A1. Regenerate the mesh on every compile without a cache file

Rejected. The cost of mesh generation would be paid repeatedly (CLI
runs, probe, tests) for the same spec, and the human-inspectable
artifact would disappear.

### A2. Merge mesh generation into builder.py

Rejected (currently). It is a 305-line algorithm of its own, and the
mesh-layout decisions (placement-driven router attachment, HBM
exclusion zone) are different from builder's general node/edge
emission. Keeping it separate respects single-responsibility.

### A3. Express placement coordinates in cube coordinates (col/row)

Rejected. mm coordinates flow consistently between the visualizer and
mesh layout (for nearest-router computation). Cube coordinates are
undefined until the router grid is fixed, so they are unsuitable as
placement input.

### A4. Lazy view projection generation

Rejected (currently). The four views are cheap to build (typically <
100 ms), and eager construction guarantees `TopologyGraph` as the
single source of truth.

### A5. Visualizer output in formats besides SVG (PNG/PDF)

Rejected. SVG is vector + text-searchable + directly renderable in
browsers. PNG conversion, when required, is downstream
post-processing (e.g., rsvg-convert).

## Consequences

- ADR-0006's high-level intent is fleshed out via D1–D7; topology
  changes can be assessed quickly via D8's table.
- D3's mesh-layout algorithm is ADR-locked, so future PE attachment
  patterns (e.g., a 6-zone HBM split) make clear which stage they
  affect.
- D5's edge-kind list and D7's view structure are explicit, giving PR
  reviewers a quick map of where (builder + router + visualizer) a
  new component type ripples through.
- D2's source_hash invalidation rules are explicit, so a stale
  `cube_mesh.yaml` (e.g., when only bandwidth changed) is recognized
  as correct behavior.