Yangwook Kang
|
d40d0cceea
|
updated venv:create to use python3 instead of python
|
2026-03-19 23:55:41 -07:00 |
|
ywkang
|
dcbc41571f
|
Add web topology viewer with hot path visualization
- FastAPI backend (server.py) with REST API + WebSocket for event streaming
- SVG-based topology viewer (index.html) with SIP/CUBE/PE drill-down views
- Event logging infrastructure (event_log.py) generating events from real
probe cases (H2D/D2H/PE-DMA) and bench workloads (QKV GEMM single/multi-PE)
- Timeline replay engine with play/pause, speed control, and scrubbing
- Workload selector dropdown grouped by category (Probe/Bench)
- CLI entry points: kernbench web, ./kernbench wrapper scripts
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-03-19 03:19:19 -07:00 |
|
ywkang
|
fc6abbc8ee
|
Add CHANGES.md, README, update SPEC/ADRs for release 2
- CHANGES.md: detailed changelog for release 1 and 2
- README.md: full project docs with install, probe, run, test usage
- SPEC.md: add ADR-0014~0017 references, update R7 for pcie_ep endpoint
- ADR-0003: update NOC description to reference ADR-0017
- ADR-0004: add HBM efficiency factor (0.8) to BW guarantee contract
- ADR-0014: status Proposed -> Accepted
- ADR-0015: update D4 to M_CPU bypass for Memory R/W, add ADR-0016/0017 links
- ADR-0016 (new): IOChiplet NOC and memory data path
- ADR-0017 (new): Cube NOC 2D mesh architecture
- Fix MD lint warnings (unfenced code blocks) across all docs
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-03-19 01:43:15 -07:00 |
|
ywkang
|
d75da439c6
|
Add probe CLI improvements, D2H read, UCIe/HBM tuning, BW sweep
- Probe CLI: restructured output (tables first, routes below), per-hop
timestamps, split cross-cube into best/worst cases, D2H read section
- UCIe overhead: 1ns -> 8ns per port (16ns per crossing) to fix
cross-cube-best < cross-half latency inversion
- HBM efficiency: added efficiency=0.8 factor to hbm_ctrl, reducing
effective BW from 256 to 204.8 GB/s
- Multi-size BW sweep: saturation tables (4KB-1MB) for all probe cases
- Probe default data size: 4KB -> 32KB for more realistic measurements
- IOChiplet NOC + D2H topology and tests
- NOC mesh, xbar, BW occupancy components and tests
- Cube mesh visualization diagram
278 tests pass.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
2026-03-19 01:16:18 -07:00 |
|
ywkang
|
6f43807900
|
commit - release 1
|
2026-03-18 11:47:48 -07:00 |
|