0e346b939d
Mirror the sccl pattern for GEMM figures: a tests/gemm/ package renders the GEMM bar charts as PNGs from the committed docs/diagrams/gemm_sweep.json, so the figures are fast test artifacts (run by default) while the heavy sim sweep stays a manual script (scripts/gemm_sweep.py, kept) wrapped by a slow regenerator test. tests/gemm/: - _gemm_plot_helpers.py: matplotlib renderers (series logic mirrors the GEMM _render_* functions in scripts/build_overview_slides.py). - test_plot_gemm_stage_breakdown.py: gemm_stage_breakdown.png (load_ref). - test_plot_gemm_mac_utilization.py: gemm_mac_utilization_measured.png + gemm_mac_utilization_theoretical_vs_measured.png (load_ref). - test_gemm_sweep.py: @pytest.mark.slow regenerator (runs scripts/gemm_sweep.py). Chart set trimmed to three (stage breakdown, MAC util, theoretical-vs-measured); "formula" relabeled to "theoretical" throughout the comparison chart. Known follow-ups (not blocking): - gemm_mac_utilization_measured.png currently plots the theoretical ideal- pipeline model, not simulator-measured data; the name is a misnomer pending a decision to repoint its content or retitle. - The theoretical-model constants (HBM 256 GB/s, T_stage 16 ns, 3 stages) are inherited verbatim from build_overview_slides.py and not yet verified against ADR-0033 / ADR-0014 / topology. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Generated Diagrams
This directory contains diagrams generated from topology compilation.
What these files are
- Derived artifacts generated from:
- compiled topology graph
- distance (accumulated latency) metadata
- view/layout rules (ADR-0005)
These files are meant for quick visual inspection and review.
Default outputs
- SIP view:
sip_view.mmd(and/orsip_view.dot) - CUBE view:
cube_view.mmd(and/orcube_view.dot) - PE view:
pe_view.mmd(and/orpe_view.dot)
How to preview
- In VS Code:
- open
.mmdor.mdcontaining Mermaid blocks and use Markdown Preview - for
.dot, use a Graphviz preview extension ordot -Tpng
- open
Notes
- Diagrams are representative and distance-aware by default.
- Instance indices are not required unless debugging asymmetry.
- Outputs should be deterministic for the same topology and rules.