Add probe CLI improvements, D2H read, UCIe/HBM tuning, BW sweep

- Probe CLI: restructured output (tables first, routes below), per-hop timestamps, split cross-cube into best/worst cases, D2H read section
- UCIe overhead: 1ns -> 8ns per port (16ns per crossing) to fix cross-cube-best < cross-half latency inversion
- HBM efficiency: added efficiency=0.8 factor to hbm_ctrl, reducing effective BW from 256 to 204.8 GB/s
- Multi-size BW sweep: saturation tables (4KB-1MB) for all probe cases
- Probe default data size: 4KB -> 32KB for more realistic measurements
- IOChiplet NOC + D2H topology and tests
- NOC mesh, xbar, BW occupancy components and tests
- Cube mesh visualization diagram

278 tests pass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@@ -2,7 +2,7 @@
 
 ## Status
 
-Proposed
+Accepted
 
 ## Context
 
@@ -43,22 +43,33 @@ Each directed edge (src → dst) results in:
 
 ---
 
-### D2. Wire process (propagation delay)
+### D2. Wire process (propagation delay + BW occupancy)
 
 For each directed edge (src, dst) in the topology graph, a SimPy wire process
-models propagation delay:
+models propagation delay and BW occupancy:
 
 ```python
-def wire_process(env, out_port, in_port, delay_ns):
+def wire_process(env, out_port, in_port, delay_ns, bw_gbs):
+    available_at = 0.0
     while True:
         cmd = yield out_port.get()
+        if bw_gbs > 0:
+            nbytes = getattr(cmd, "nbytes", 0)
+            if nbytes > 0:
+                wait = available_at - env.now
+                if wait > 0:
+                    yield env.timeout(wait)
+                available_at = env.now + (nbytes / bw_gbs)
         yield env.timeout(delay_ns)
         yield in_port.put(cmd)
 ```
 
 Wire processes are started at engine initialization.
-BW constraints are enforced by the sending component's out_port capacity or token model,
-not by the wire process itself.
+Each directed edge maintains an `available_at` timestamp tracking when the link
+becomes free for the next transaction. When a transaction occupies a link, the
+next transaction on the same directed link must wait until occupancy clears
+(back-to-back serialization). TX and RX directions are independent (separate
+wire processes with separate `available_at` state).
 
 ---
 
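
To illustrate what the `available_at` bookkeeping in D2 implies for timing, the sketch below replays the same per-link logic in plain Python, without SimPy. It is an illustration under stated assumptions, not the committed code: `link_timeline`, the `(ready_ns, nbytes)` tuple format, and the 64 GB/s and 10 ns figures are hypothetical, and it assumes `in_port.put` never blocks. Because 1 GB/s equals 1 byte/ns (taking GB as 10^9 bytes), `nbytes / bw_gbs` is already in nanoseconds.

```python
def link_timeline(transfers, delay_ns, bw_gbs):
    """Replay the D2 wire-process logic for one directed link.

    transfers: list of (ready_ns, nbytes) in dequeue order (hypothetical format).
    Returns a list of (start_ns, arrive_ns), one entry per transfer.
    Assumes in_port.put() never blocks.
    """
    available_at = 0.0  # time at which the link is free again
    now = 0.0           # local clock of the (sequential) wire process
    timeline = []
    for ready_ns, nbytes in transfers:
        # out_port.get() completes once the command is available.
        now = max(now, ready_ns)
        if bw_gbs > 0 and nbytes > 0:
            # Wait for the previous occupancy to clear, then occupy the link.
            now = max(now, available_at)
            available_at = now + nbytes / bw_gbs  # bytes / (bytes per ns) = ns
        start = now
        # The process is blocked for the propagation delay before the put.
        now += delay_ns
        timeline.append((start, now))
    return timeline


if __name__ == "__main__":
    # Three 4 KB commands queued at t=0 on a hypothetical 64 GB/s, 10 ns link.
    for start, arrive in link_timeline([(0, 4096)] * 3, delay_ns=10, bw_gbs=64):
        print(f"start={start:.1f} ns  arrive={arrive:.1f} ns")
```

In this replay, consecutive starts are spaced by the occupancy time (4096 / 64 = 64 ns), while each command still arrives `delay_ns` after its own start: as the loop in the diff reads, occupancy is charged to the next transaction on the link rather than to the current one. With `bw_gbs=0` the occupancy branch is skipped and only the propagation delay applies.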