Latency model: HBM PC striping + chunk-loop drain (ADR-0033)

Previous model double-counted slow-upstream paths (e.g., 64KB via UCIe 128 GB/s was ~2x pessimistic). HBM CTRL now distributes bursts across 8 pseudo-channels via global round-robin, with per-chunk commit timing that pipelines correctly against the bottleneck link's data arrival. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 21:59:07 -07:00
parent f6d262e359
commit 5fdb6f8797
11 changed files with 1192 additions and 52 deletions
@@ -106,7 +106,7 @@ cube:
  components:
    noc_router: { kind: noc_router, impl: builtin.forwarding, attrs: { overhead_ns: 2.0 } }
    m_cpu:      { kind: m_cpu,   impl: builtin.m_cpu,   attrs: { overhead_ns: 5.0 } }
-    hbm_ctrl:   { kind: hbm_ctrl, impl: builtin.hbm_ctrl, attrs: { capacity: 1, efficiency: 1.0 } }
+    hbm_ctrl:   { kind: hbm_ctrl, impl: builtin.hbm_ctrl, attrs: { capacity: 1, efficiency: 1.0, num_pcs: 8, burst_bytes: 256, switch_penalty_ns: 0.0 } }
    sram:       { kind: sram,     impl: builtin.sram,     attrs: { size_mb: 32, overhead_ns: 2.0 } }

  # Physical placement of non-PE components (mm coordinates)