Latency model: HBM PC striping + chunk-loop drain (ADR-0033)
Previous model double-counted slow-upstream paths (e.g., 64KB via UCIe 128 GB/s was ~2x pessimistic). HBM CTRL now distributes bursts across 8 pseudo-channels via global round-robin, with per-chunk commit timing that pipelines correctly against the bottleneck link's data arrival.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
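To make the "~2x pessimistic" figure concrete, here is a minimal sketch of the two ways of counting. It is not the simulator's code: the function names, the 4 KiB chunk size, and the 128 GB/s drain rate are assumptions chosen for illustration. Only the 64KB size and the 128 GB/s UCIe rate come from the message above. Since 1 GB/s equals 1 byte/ns, rates are written as bytes per ns.

```python
def serial_latency_ns(size_bytes, link_bytes_per_ns, drain_bytes_per_ns):
    """Double-counting estimate: the whole link transfer, then the whole drain."""
    return size_bytes / link_bytes_per_ns + size_bytes / drain_bytes_per_ns


def pipelined_latency_ns(size_bytes, link_bytes_per_ns, drain_bytes_per_ns,
                         chunk_bytes):
    """Per-chunk commit: a chunk starts draining once it has fully arrived
    and the previous chunk has committed."""
    arrived = 0.0   # time at which the current chunk has fully arrived
    commit = 0.0    # time at which the most recent chunk was committed
    remaining = size_bytes
    while remaining > 0:
        n = min(chunk_bytes, remaining)
        arrived += n / link_bytes_per_ns
        commit = max(arrived, commit) + n / drain_bytes_per_ns
        remaining -= n
    return commit


# Hypothetical numbers: 64 KiB payload, 128 B/ns link, 128 B/ns drain, 4 KiB chunks.
serial = serial_latency_ns(64 * 1024, 128.0, 128.0)               # 1024.0 ns
pipelined = pipelined_latency_ns(64 * 1024, 128.0, 128.0, 4096)   # 544.0 ns
```

With these assumed values the serial estimate is about 1.9x the pipelined one, because only the last chunk's drain is left exposed after the link finishes delivering data.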
@@ -380,12 +380,18 @@ def test_pe_dma_record_start_after_channel_acquire():
     )
 
     durations = [r.t_end - r.t_start for r in dma_records]
-    # All three should have the same actual transfer time within ±1 ns.
+    # All three should have similar transfer time. Under the PC striping
+    # model (ADR-0033 D1), per-PC `available_at` state introduces small
+    # timing differences between consecutive same-direction reads to the
+    # same PC set (the second read's start = max(eff_start, pc_avail[pc])).
+    # Tolerance widened from ±1ns to ±3ns to absorb this variance without
+    # weakening the invariant that queue wait is excluded from the recorded
+    # interval (still validated by the t_start >= prev_end check below).
     base = durations[0]
     assert base > 0, f"first dma duration must be positive, got {base}"
     for i, d in enumerate(durations):
-        assert abs(d - base) <= 1.0, (
-            f"op {i} duration {d} differs from baseline {base} by >1 ns "
+        assert abs(d - base) <= 3.0, (
+            f"op {i} duration {d} differs from baseline {base} by >3 ns "
             f"— record_start may still be including queue wait"
         )
 
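The variance described in the new test comment can be reproduced with a toy model. The sketch below is an assumption-laden illustration, not the HBM CTRL implementation: the class `PcStriper`, its `issue` method, and the 2 ns burst time are hypothetical. Only the 8 pseudo-channels, the global round-robin, and the rule start = max(eff_start, pc_avail[pc]) are taken from the commit.

```python
NUM_PCS = 8  # pseudo-channel count stated in the commit message


class PcStriper:
    """Toy model: bursts are striped over PCs by a global round-robin cursor."""

    def __init__(self, num_pcs=NUM_PCS):
        self.pc_avail = [0.0] * num_pcs  # per-PC available_at, in ns
        self.next_pc = 0                 # global round-robin cursor

    def issue(self, eff_start, num_bursts, burst_ns):
        """Schedule num_bursts bursts and return the time the last one ends."""
        done = eff_start
        for _ in range(num_bursts):
            pc = self.next_pc
            self.next_pc = (self.next_pc + 1) % len(self.pc_avail)
            # A burst cannot start before its PC is free again.
            start = max(eff_start, self.pc_avail[pc])
            self.pc_avail[pc] = start + burst_ns
            done = max(done, self.pc_avail[pc])
        return done


striper = PcStriper()
first_end = striper.issue(eff_start=0.0, num_bursts=16, burst_ns=2.0)
second_end = striper.issue(eff_start=3.0, num_bursts=16, burst_ns=2.0)
first_duration = first_end - 0.0    # 4.0 ns
second_duration = second_end - 3.0  # 5.0 ns
```

In this toy run the second transfer takes 1 ns longer than the first, because every PC is still busy until t = 4 ns when the second transfer becomes eligible at t = 3 ns. A small, bounded offset of this kind is what the widened ±3 ns tolerance is meant to absorb.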