ywkang
|
dcbc41571f
|
Add web topology viewer with hot path visualization
- FastAPI backend (server.py) with REST API + WebSocket for event streaming
- SVG-based topology viewer (index.html) with SIP/CUBE/PE drill-down views
- Event logging infrastructure (event_log.py) generating events from real
probe cases (H2D/D2H/PE-DMA) and bench workloads (QKV GEMM single/multi-PE)
- Timeline replay engine with play/pause, speed control, and scrubbing
- Workload selector dropdown grouped by category (Probe/Bench)
- CLI entry points: kernbench web, ./kernbench wrapper scripts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-03-19 03:19:19 -07:00 |
|
ywkang
|
d75da439c6
|
Add probe CLI improvements, D2H read, UCIe/HBM tuning, BW sweep
- Probe CLI: restructured output (tables first, routes below), per-hop
timestamps, split cross-cube into best/worst cases, D2H read section
- UCIe overhead: 1ns -> 8ns per port (16ns per crossing) to fix
cross-cube-best < cross-half latency inversion
- HBM efficiency: added efficiency=0.8 factor to hbm_ctrl, reducing
effective BW from 256 to 204.8 GB/s
- Multi-size BW sweep: saturation tables (4KB-1MB) for all probe cases
- Probe default data size: 4KB -> 32KB for more realistic measurements
- IOChiplet NOC + D2H topology and tests
- NOC mesh, xbar, BW occupancy components and tests
- Cube mesh visualization diagram
278 tests pass.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-03-19 01:16:18 -07:00 |
|