Add reverse-path response latency for PE_DMA and PE_CPU→M_CPU

Model fabric response hop latency for PE-internal operations:
- HBM_CTRL sends PeDmaMsg response on reverse path instead of direct done signal
- PE_CPU sends ResponseMsg via NOC→M_CPU on kernel completion
- Add NOC→PE_DMA and PE_CPU→NOC edges in topology builder
- Make HBM BW test assertions dynamic based on topology efficiency

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 15:40:56 -07:00
parent 8b5afef5eb
commit 62fb01ae18
8 changed files with 88 additions and 24 deletions
+15
@@ -473,6 +473,14 @@ def _instantiate_cube(
    kind="pe_to_noc",
))
# noc → PE_DMA (response delivery, reverse of pe_to_noc)
edges.append(Edge(
    src=f"{cp}.noc", dst=f"{pp}.pe_dma",
    distance_mm=pe_noc_distances.get(pe_idx, 0.0),
    bw_gbs=clinks["pe_dma_to_noc_bw_gbs"],
    kind="noc_to_pe",
))
# noc → PE_CPU (command delivery)
edges.append(Edge(
    src=f"{cp}.noc", dst=f"{pp}.pe_cpu",
@@ -480,6 +488,13 @@ def _instantiate_cube(
    kind="command",
))
# PE_CPU → noc (response delivery, reverse of command)
edges.append(Edge(
    src=f"{pp}.pe_cpu", dst=f"{cp}.noc",
    distance_mm=clinks["noc_to_pe_cpu_mm"],
    kind="pe_response",
))
pe_idx += 1
# ── xbar_top/bot → HBM slices ──
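
Reviewer sketch (not part of the commit): one way to check that each PE-facing edge now has a reverse partner. The field names `src`, `dst`, `distance_mm`, `bw_gbs`, and `kind` are taken from the hunks above. The `Edge` dataclass shape, the `pe_edges` and `missing_reverse` helpers, the node names, and the numeric values are hypothetical stand-ins, and the direction of the forward `pe_to_noc` edge is assumed from the "reverse of pe_to_noc" comment.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Edge:
    # Hypothetical stand-in for the topology builder's Edge type.
    src: str
    dst: str
    distance_mm: float
    kind: str
    bw_gbs: Optional[float] = None


def pe_edges(cp: str, pp: str, dma_mm: float, cpu_mm: float,
             dma_bw_gbs: float) -> List[Edge]:
    """Build the four PE<->NOC edges for one PE (illustrative only)."""
    return [
        # Forward request path (assumed direction: PE_DMA -> noc).
        Edge(f"{pp}.pe_dma", f"{cp}.noc", dma_mm, "pe_to_noc", dma_bw_gbs),
        # Reverse response path, reusing the forward distance and bandwidth.
        Edge(f"{cp}.noc", f"{pp}.pe_dma", dma_mm, "noc_to_pe", dma_bw_gbs),
        # Command delivery to the PE CPU.
        Edge(f"{cp}.noc", f"{pp}.pe_cpu", cpu_mm, "command"),
        # Kernel-completion response back toward the NOC.
        Edge(f"{pp}.pe_cpu", f"{cp}.noc", cpu_mm, "pe_response"),
    ]


def missing_reverse(edges: List[Edge]) -> List[Edge]:
    """Return edges whose (dst, src) counterpart is absent from the list."""
    pairs = {(e.src, e.dst) for e in edges}
    return [e for e in edges if (e.dst, e.src) not in pairs]


edges = pe_edges("cube0", "cube0.pe0", dma_mm=1.5, cpu_mm=0.5,
                 dma_bw_gbs=256.0)
unpaired = missing_reverse(edges)
```

With all four edges present `unpaired` is empty. Dropping either reverse edge makes its forward partner show up in the result, which could serve as a cheap topology sanity check in a test.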