eval: commit milestone bench output (track generated figures + results)

Per request, the milestone bench output is now tracked in git instead of
gitignored, so the figures/results are viewable on the remote:

- src/kernbench/benches/1H_milestone_output/gemm/  (3 PNGs + gemm_sweep.json)
- src/kernbench/benches/1H_milestone_output/ccl/   (3 per-topology PNGs,
  buffer-kind PNG+CSV, FSIM comparison PNG, topology.png, summary.csv)

Drop the .gitignore rule; update ADR-0054 D3 and the Negative section
(EN+KO) to say the output is committed and can be regenerated by rerunning
the bench. The artifacts were produced by full bench runs
(milestone-1h-gemm non-FAST, milestone-1h-ccl).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 15:37:27 -07:00
parent cc1bbd0ab7
commit b1d6fafd3a
15 changed files with 1695 additions and 9 deletions
+5 -3
@@ -61,8 +61,9 @@ Both benches write to `src/kernbench/benches/1H_milestone_output/{gemm,ccl}/`
(per user request — artifacts beside the bench). The directory holds only
generated PNG/CSV/JSON (never a `.py`/`__init__.py`), so the eager-import
audit (ADR-0045 first action) ignores it — `pkgutil.iter_modules` does not
-yield non-package subdirectories. It is **git-ignored** (regenerable on
-demand), unlike the committed `docs/diagrams/` artifacts.
+yield non-package subdirectories. It is **committed** (like the
+`docs/diagrams/` artifacts) so the figures are viewable on the remote;
+rerunning the bench regenerates it in place.
### D4. GEMM heavy sweep — fresh by default, `MILESTONE_FAST` to reuse
@@ -118,7 +119,8 @@ ADR-0045 D1).
sweeps, and matplotlib drawing). A "bench" that is mostly an eval harness
is unusual; this ADR legitimizes it.
- Generated artifacts live inside the source tree (`src/kernbench/benches/`)
-  by explicit request; git-ignored to avoid committing them.
+  by explicit request and are committed (so the figures are viewable on the
+  remote); rerunning the bench regenerates them.
- `milestone-1h-ccl` (and the default `milestone-1h-gemm`) take minutes —
acceptable for an on-demand milestone artifact, not for routine runs.
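
The D3 claim that `pkgutil.iter_modules` does not yield non-package subdirectories can be checked with a small sketch like the one below. It is not the project's actual audit code: the temporary tree and the names `gemm_bench.py` and `helpers` are hypothetical stand-ins, and only the `1H_milestone_output/gemm/gemm_sweep.json` path mirrors the layout described in the commit message.

```python
import pkgutil
import tempfile
from pathlib import Path


def discoverable_modules(root: Path) -> list[str]:
    """Return the names pkgutil.iter_modules yields directly under root."""
    return sorted(info.name for info in pkgutil.iter_modules([str(root)]))


def build_demo_tree(root: Path) -> None:
    # Hypothetical module and package: both should be discovered.
    (root / "gemm_bench.py").write_text("")
    (root / "helpers").mkdir()
    (root / "helpers" / "__init__.py").write_text("")
    # Artifact-only directory (no .py, no __init__.py): should be skipped.
    out = root / "1H_milestone_output" / "gemm"
    out.mkdir(parents=True)
    (out / "gemm_sweep.json").write_text("{}")


with tempfile.TemporaryDirectory() as tmp:
    demo_root = Path(tmp)
    build_demo_tree(demo_root)
    found = discoverable_modules(demo_root)
    print(found)  # ['gemm_bench', 'helpers']
```

The output directory is absent from the result because it holds no `__init__.py`. The guarantee lasts only while the directory stays free of Python files: adding an `__init__.py` to it would make it visible to the same scan.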