eval: commit milestone bench output (track generated figures + results)

Per request, the milestone bench output is now tracked in git instead of
gitignored, so the figures/results are viewable on the remote:

- src/kernbench/benches/1H_milestone_output/gemm/  (3 PNGs + gemm_sweep.json)
- src/kernbench/benches/1H_milestone_output/ccl/   (3 per-topology PNGs,
  buffer-kind PNG+CSV, FSIM comparison PNG, topology.png, summary.csv)

Drop the .gitignore rule; update ADR-0054 D3 + Negative (EN+KO) to say the
output is committed (regenerable by rerunning the bench). Artifacts produced
by full bench runs (milestone-1h-gemm non-FAST, milestone-1h-ccl).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-05-22 15:37:27 -07:00
parent cc1bbd0ab7
commit b1d6fafd3a
15 changed files with 1695 additions and 9 deletions
@@ -0,0 +1,37 @@
algorithm,sip_topology,n_sips,n_elem,bytes_per_pe,bytes_per_sip,latency_ns
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,8,16,256,2666.552500000015
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,32,64,1024,2747.7400000000152
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,64,128,2048,2855.990000000018
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,128,256,4096,3072.490000000019
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,512,1024,16384,3337.1133333333582
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,1024,2048,32768,3708.0333333333692
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,2048,4096,65536,4449.873333333393
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,4096,8192,131072,5933.020000000124
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,8192,16384,262144,8900.379999999863
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,16384,32768,524288,14835.099999999224
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,32768,65536,1048576,26704.540000000765
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,49152,98304,1572864,38573.97999999701
lrab_hierarchical_allreduce,ring_1d,6,8,16,256,2365.255833333347
lrab_hierarchical_allreduce,ring_1d,6,32,64,1024,2436.9433333333473
lrab_hierarchical_allreduce,ring_1d,6,64,128,2048,2532.526666666683
lrab_hierarchical_allreduce,ring_1d,6,128,256,4096,2723.693333333349
lrab_hierarchical_allreduce,ring_1d,6,512,1024,16384,3048.635000000021
lrab_hierarchical_allreduce,ring_1d,6,1024,2048,32768,3393.4016666666957
lrab_hierarchical_allreduce,ring_1d,6,2048,4096,65536,4082.401666666714
lrab_hierarchical_allreduce,ring_1d,6,4096,8192,131072,5458.80166666677
lrab_hierarchical_allreduce,ring_1d,6,8192,16384,262144,8216.934999999943
lrab_hierarchical_allreduce,ring_1d,6,16384,32768,524288,13733.201666665835
lrab_hierarchical_allreduce,ring_1d,6,32768,65536,1048576,24765.73500000064
lrab_hierarchical_allreduce,ring_1d,6,49152,98304,1572864,35798.268333331536
lrab_hierarchical_allreduce,torus_2d,6,8,16,256,1700.6025000000095
lrab_hierarchical_allreduce,torus_2d,6,32,64,1024,1753.2900000000102
lrab_hierarchical_allreduce,torus_2d,6,64,128,2048,1823.540000000012
lrab_hierarchical_allreduce,torus_2d,6,128,256,4096,1964.040000000012
lrab_hierarchical_allreduce,torus_2d,6,512,1024,16384,2196.8183333333463
lrab_hierarchical_allreduce,torus_2d,6,1024,2048,32768,2477.2783333333473
lrab_hierarchical_allreduce,torus_2d,6,2048,4096,65536,3038.1983333333583
lrab_hierarchical_allreduce,torus_2d,6,4096,8192,131072,4159.5050000000665
lrab_hierarchical_allreduce,torus_2d,6,8192,16384,262144,6403.185000000109
lrab_hierarchical_allreduce,torus_2d,6,16384,32768,524288,10890.5449999995
lrab_hierarchical_allreduce,torus_2d,6,32768,65536,1048576,19865.265000000378
lrab_hierarchical_allreduce,torus_2d,6,49152,98304,1572864,28839.98500000059
1 algorithm sip_topology n_sips n_elem bytes_per_pe bytes_per_sip latency_ns
2 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 8 16 256 2666.552500000015
3 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 32 64 1024 2747.7400000000152
4 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 64 128 2048 2855.990000000018
5 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 128 256 4096 3072.490000000019
6 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 512 1024 16384 3337.1133333333582
7 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 1024 2048 32768 3708.0333333333692
8 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 2048 4096 65536 4449.873333333393
9 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 4096 8192 131072 5933.020000000124
10 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 8192 16384 262144 8900.379999999863
11 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 16384 32768 524288 14835.099999999224
12 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 32768 65536 1048576 26704.540000000765
13 lrab_hierarchical_allreduce mesh_2d_no_wrap 6 49152 98304 1572864 38573.97999999701
14 lrab_hierarchical_allreduce ring_1d 6 8 16 256 2365.255833333347
15 lrab_hierarchical_allreduce ring_1d 6 32 64 1024 2436.9433333333473
16 lrab_hierarchical_allreduce ring_1d 6 64 128 2048 2532.526666666683
17 lrab_hierarchical_allreduce ring_1d 6 128 256 4096 2723.693333333349
18 lrab_hierarchical_allreduce ring_1d 6 512 1024 16384 3048.635000000021
19 lrab_hierarchical_allreduce ring_1d 6 1024 2048 32768 3393.4016666666957
20 lrab_hierarchical_allreduce ring_1d 6 2048 4096 65536 4082.401666666714
21 lrab_hierarchical_allreduce ring_1d 6 4096 8192 131072 5458.80166666677
22 lrab_hierarchical_allreduce ring_1d 6 8192 16384 262144 8216.934999999943
23 lrab_hierarchical_allreduce ring_1d 6 16384 32768 524288 13733.201666665835
24 lrab_hierarchical_allreduce ring_1d 6 32768 65536 1048576 24765.73500000064
25 lrab_hierarchical_allreduce ring_1d 6 49152 98304 1572864 35798.268333331536
26 lrab_hierarchical_allreduce torus_2d 6 8 16 256 1700.6025000000095
27 lrab_hierarchical_allreduce torus_2d 6 32 64 1024 1753.2900000000102
28 lrab_hierarchical_allreduce torus_2d 6 64 128 2048 1823.540000000012
29 lrab_hierarchical_allreduce torus_2d 6 128 256 4096 1964.040000000012
30 lrab_hierarchical_allreduce torus_2d 6 512 1024 16384 2196.8183333333463
31 lrab_hierarchical_allreduce torus_2d 6 1024 2048 32768 2477.2783333333473
32 lrab_hierarchical_allreduce torus_2d 6 2048 4096 65536 3038.1983333333583
33 lrab_hierarchical_allreduce torus_2d 6 4096 8192 131072 4159.5050000000665
34 lrab_hierarchical_allreduce torus_2d 6 8192 16384 262144 6403.185000000109
35 lrab_hierarchical_allreduce torus_2d 6 16384 32768 524288 10890.5449999995
36 lrab_hierarchical_allreduce torus_2d 6 32768 65536 1048576 19865.265000000378
37 lrab_hierarchical_allreduce torus_2d 6 49152 98304 1572864 28839.98500000059