Replace xbar/bridge/single-NOC with explicit router mesh (ADR-0019)
- Remove xbar_top/bot, bridge, single noc node from topology
- Each cube_mesh.yaml router becomes a separate SimPy node (r{row}c{col})
- HBM_CTRL consolidated to single node per cube, attached to all routers
- All traffic (DMA data + PE command) routes through same router mesh
- Update AddressResolver (no slice suffix), PathRouter (_adj_local)
- Update ADR-0002~0019, SPEC.md to remove xbar/bridge references
- Regenerate SVG diagrams for new topology structure
- Skip cross-SIP PE_TCM and PE_MMU routing tests (not yet wired)
Tests: 326 passed, 13 skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
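
The topology described in the message above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: `build_cube_mesh`, the plain adjacency-map representation, and the 2x2 grid size are all assumptions made for the example. Only the `r{row}c{col}` naming and the single `hbm_ctrl` node attached to every router come from the commit message.

```python
from itertools import product


def build_cube_mesh(rows: int, cols: int) -> dict[str, set[str]]:
    """Return an undirected adjacency map for one cube (hypothetical helper)."""
    # One consolidated HBM controller node per cube.
    adj: dict[str, set[str]] = {"hbm_ctrl": set()}
    # One node per router, named r{row}c{col}.
    for row, col in product(range(rows), range(cols)):
        adj[f"r{row}c{col}"] = set()

    def link(a: str, b: str) -> None:
        adj[a].add(b)
        adj[b].add(a)

    for row, col in product(range(rows), range(cols)):
        name = f"r{row}c{col}"
        if col + 1 < cols:  # east neighbour
            link(name, f"r{row}c{col + 1}")
        if row + 1 < rows:  # south neighbour
            link(name, f"r{row + 1}c{col}")
        link(name, "hbm_ctrl")  # hbm_ctrl is attached to all routers
    return adj


mesh = build_cube_mesh(2, 2)  # grid size chosen only for illustration
print(sorted(mesh["r0c0"]))  # ['hbm_ctrl', 'r0c1', 'r1c0']
```

With this shape, every router reaches `hbm_ctrl` in one hop, which matches the "switching overhead only" wording used for local HBM access in the diff.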
@@ -44,14 +44,15 @@ Each PE contains the following logical components.
 **PE_DMA**
 
 - Handles memory transfers between PE_TCM and external memory domains.
-- PE_DMA has **dual egress** at the CUBE level:
-  - **→ XBAR**: dedicated path to HBM (local and cross-half via bridge)
-  - **→ NOC**: path to non-HBM destinations (shared SRAM, inter-cube UCIe, etc.)
+- PE_DMA connects to the NOC router mesh at the CUBE level (ADR-0019):
+  - All destinations (HBM, shared SRAM, inter-cube UCIe) are reached via the router mesh
+  - Local HBM access: PE_DMA → local router → hbm_ctrl (switching overhead only)
+  - Remote/shared: PE_DMA → local router → (mesh hops) → destination
 - Supported directions include:
-  - HBM → PE_TCM (via XBAR)
-  - PE_TCM → HBM (via XBAR)
-  - PE_TCM → shared SRAM (via NOC)
-  - PE_TCM → other memory domains (via NOC, if supported by topology)
+  - HBM → PE_TCM (via router mesh)
+  - PE_TCM → HBM (via router mesh)
+  - PE_TCM → shared SRAM (via router mesh)
+  - PE_TCM → other memory domains (via router mesh, if supported by topology)
 
 **PE_GEMM**
 
@@ -251,7 +252,7 @@ Compute operations use a TCM-centric dataflow model.
 **Input path (HBM)**
 
 ```text
-HBM → XBAR → PE_DMA (DMA_READ) → PE_TCM
+HBM → router mesh → PE_DMA (DMA_READ) → PE_TCM
 ```
 
 **Input path (shared SRAM)**
@@ -268,14 +269,14 @@ Compute engines read input tensors from PE_TCM.
 PE_TCM → GEMM / MATH
 ```
 
-Weights for GEMM may optionally stream directly from HBM (via XBAR).
+Weights for GEMM may optionally stream directly from HBM (via router mesh).
 
 **Output path (HBM)**
 
 Compute results are written to PE_TCM, then DMA writes to HBM.
 
 ```text
-PE_TCM → PE_DMA (DMA_WRITE) → XBAR → HBM
+PE_TCM → PE_DMA (DMA_WRITE) → router mesh → HBM
 ```
 
 **Output path (shared SRAM)**
@@ -347,9 +348,9 @@ PE instances are derived from `cube.pe_layout`.
 
 External connectivity such as:
 
-- PE_DMA → XBAR (HBM data path)
-- PE_DMA → NOC (non-HBM data path: shared SRAM, inter-cube UCIe)
-- NOC → PE_CPU (command path from M_CPU)
+- PE_DMA → router mesh → HBM (data path, ADR-0019)
+- PE_DMA → router mesh → shared SRAM, inter-cube UCIe (non-HBM data path)
+- router mesh → PE_CPU (command path from M_CPU)
 
 is modeled at the CUBE level (see ADR-0003 D3).
 
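
The two PE_DMA path shapes introduced by this change (local HBM versus remote/shared destinations) can be illustrated with a small sketch. Everything here is hypothetical: `mesh_hops`, `dma_path`, the `pe_dma` and `shared_sram` node names, and the column-then-row hop order are assumptions for the example and may not match what `PathRouter` actually does. Only the `r{row}c{col}` router naming and the `hbm_ctrl` node come from the commit.

```python
import re
from typing import Optional

_ROUTER = re.compile(r"r(\d+)c(\d+)")


def mesh_hops(src: str, dst: str) -> list[str]:
    """Routers visited from src to dst, moving along columns first, then rows.

    The hop order is an assumption made for illustration only.
    """
    row, col = map(int, _ROUTER.fullmatch(src).groups())
    dst_row, dst_col = map(int, _ROUTER.fullmatch(dst).groups())
    path = [src]
    while col != dst_col:
        col += 1 if dst_col > col else -1
        path.append(f"r{row}c{col}")
    while row != dst_row:
        row += 1 if dst_row > row else -1
        path.append(f"r{row}c{col}")
    return path


def dma_path(local_router: str, dest: str,
             dest_router: Optional[str] = None) -> list[str]:
    """Node sequence for a PE_DMA transfer (hypothetical helper)."""
    if dest == "hbm_ctrl":
        # Local HBM: PE_DMA -> local router -> hbm_ctrl, no mesh hops.
        return ["pe_dma", local_router, "hbm_ctrl"]
    if dest_router is None:
        raise ValueError("non-HBM destinations need the router they attach to")
    # Remote/shared: PE_DMA -> local router -> (mesh hops) -> destination.
    return ["pe_dma", *mesh_hops(local_router, dest_router), dest]


print(dma_path("r0c0", "hbm_ctrl"))
# ['pe_dma', 'r0c0', 'hbm_ctrl']
print(dma_path("r0c0", "shared_sram", "r1c1"))
# ['pe_dma', 'r0c0', 'r0c1', 'r1c1', 'shared_sram']
```

Because `hbm_ctrl` attaches to every router, the HBM case never traverses mesh links in this sketch, while the shared-SRAM case pays one hop per router crossed.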