Files
kernbench2/docs/adr/ADR-0005-diagram-views-distance-layout.md
T
ywkang 5917b3497c Replace xbar/bridge/single-NOC with explicit router mesh (ADR-0019)
- Remove xbar_top/bot, bridge, single noc node from topology
- Each cube_mesh.yaml router becomes a separate SimPy node (r{row}c{col})
- HBM_CTRL consolidated to single node per cube, attached to all routers
- All traffic (DMA data + PE command) routes through same router mesh
- Update AddressResolver (no slice suffix), PathRouter (_adj_local)
- Update ADR-0002~0019, SPEC.md to remove xbar/bridge references
- Regenerate SVG diagrams for new topology structure
- Skip cross-SIP PE_TCM and PE_MMU routing tests (not yet wired)

326 passed, 13 skipped

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 17:51:28 -07:00

185 lines
4.5 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# ADR-0005: Diagram Views & Distance-Aware Layout Rules
## Status
Accepted
## Context
We require verifiable and inspectable system modeling for a large-scale,
parameterized AI Accelerator system.
Humans must be able to:
- visually inspect the modeled topology,
- reason about communication structure and relative distance,
- do so at multiple abstraction levels without being overwhelmed by detail.
The simulator models distance (accumulated latency) as a first-class concept.
Diagrams must reflect this distance by default.
---
## Global Defaults
- All diagrams MUST be **distance-aware by default**.
- All diagrams MUST render **representative views** of the architecture.
- Instance indices (e.g., sip0, cube2, pe3) MUST NOT be required for diagram generation.
- Instance indices MAY be used ONLY:
- to define a distance anchor in asymmetric or debugging scenarios, or
- when explicitly requested.
---
## Representative Rendering Rule
- All CUBEs share the same internal structure.
- All PEs share the same internal structure.
Therefore:
- SIP-level diagrams render representative CUBEs and IO chiplets.
- CUBE-level diagrams render representative PEs as opaque blocks.
- PE-level diagrams render a representative PE with fully expanded internals.
Diagrams MUST NOT depend on specific SIP, CUBE, or PE indices
unless explicitly requested.
---
## Diagram Views
### View A — SIP-Level Diagram
**Purpose**
Explain system-scale structure and connectivity.
**Visible elements**
- SIP boundaries (optional)
- CUBEs (opaque blocks)
- IO chiplets (opaque blocks)
- Optional UCIe stubs only if needed to clarify connectivity
**Hidden elements**
- PE internals
- CUBE internal fabric
- IO chiplet internals
**Visible links**
- Host ↔ IO chiplets (PCIe)
- SIP ↔ SIP (PCIe / UAL via switches)
- IO ↔ CUBE (on-package links)
---
### View B — CUBE-Level Diagram
**Purpose**
Explain cube-internal structure and data/control flow.
**Visible elements**
- Router mesh: 2D grid of NOC routers (from cube_mesh.yaml), all traffic routes through mesh
- HBM_CTRL attached to PE routers (local HBM = 0 hop)
- HBM subsystem (HBM_CTRL)
- Shared SRAM: cube-level shared memory
- Management CPU (M_CPU)
- PEs as opaque blocks (PE[0..N1])
- UCIe endpoints (N/E/W/S) as ports
**Hidden elements**
- PE internals
**Visible links**
- PE → router (HBM + non-HBM data path via mesh)
- Router ↔ HBM_CTRL (local HBM access)
- Router ↔ Router (mesh hops for remote access)
- Router ↔ UCIe endpoints
- Router ↔ shared SRAM
- M_CPU ↔ router (command path)
- Router → PE_CPU (command delivery, collapsed into PE block)
---
### View C — PE-Level Diagram
**Purpose**
Explain internal PE behavior and execution structure.
**Visible elements**
- PE_CPU
- Command handler / scheduler
- PE_TCM (local SRAM)
- HW accelerators (DMA, GEMM, MATH, etc.)
- Local HBM interface
- Optional IPCQ / messaging endpoints
**Visible links**
- Control paths (CPU → scheduler → engines)
- Data paths (engines ↔ TCM, DMA ↔ local HBM)
- External fabric ports as abstract ports only
---
## Distance-Aware Layout (Default)
### Distance definition
- Distance is defined as **accumulated latency**, consistent with ADR-0002.
- Distance is computed from a single anchor node.
### Default anchor selection
- SIP view: IO chiplet (or Host CPU if present)
- CUBE view: a representative PE
- PE view: PE_CPU or Command Handler
Anchors are **implicit defaults** and MUST NOT be required to be specified.
### Layout rules
- Diagrams MUST be laid out in layers based on distance buckets.
- Layout direction MUST be consistent within a view type
(preferred: left-to-right).
- Nodes with equal distance MUST have stable ordering
(by role or identifier, deterministically).
Cycles MAY be rendered using dashed or curved edges for readability,
without affecting distance semantics.
---
## Generation Contract (for Tools / Claude Code)
When generating diagrams:
- Assume distance-aware layout by default.
- Assume representative rendering by default.
- Do NOT ask for SIP/CUBE/PE indices unless required.
- Do NOT expand hidden abstraction levels.
- Prefer architectural clarity over micro-hop fidelity.
---
## Consequences
- Diagrams are stable across topology scaling.
- Changes in distance or routing policy are reflected visually.
- Diagrams serve as verifiable artifacts derived from the simulator model,
not as hand-maintained documentation.
---
## Links
- SPEC Section 4 (Output, Debuggability, and Diagrams)
- ADR-0002 (Routing distance semantics)
- ADR-0006 (Topology compilation & automatic diagram generation)