b1d6fafd3a
Per request, the milestone bench output is now tracked in git instead of
gitignored, so the figures/results are viewable on the remote:

- src/kernbench/benches/1H_milestone_output/gemm/ (3 PNGs + gemm_sweep.json)
- src/kernbench/benches/1H_milestone_output/ccl/ (3 per-topology PNGs,
  buffer-kind PNG+CSV, FSIM comparison PNG, topology.png, summary.csv)

Drop the .gitignore rule; update ADR-0054 D3 + Negative (EN+KO) to say the
output is committed (regenerable by rerunning the bench).

Artifacts produced by full bench runs (milestone-1h-gemm non-FAST,
milestone-1h-ccl).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2.7 KiB
| # | algorithm | sip_topology | n_sips | n_elem | bytes_per_pe | bytes_per_sip | latency_ns |
|---|---|---|---|---|---|---|---|
| 2 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 8 | 16 | 256 | 2666.552500000015 |
| 3 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 32 | 64 | 1024 | 2747.7400000000152 |
| 4 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 64 | 128 | 2048 | 2855.990000000018 |
| 5 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 128 | 256 | 4096 | 3072.490000000019 |
| 6 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 512 | 1024 | 16384 | 3337.1133333333582 |
| 7 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 1024 | 2048 | 32768 | 3708.0333333333692 |
| 8 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 2048 | 4096 | 65536 | 4449.873333333393 |
| 9 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 4096 | 8192 | 131072 | 5933.020000000124 |
| 10 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 8192 | 16384 | 262144 | 8900.379999999863 |
| 11 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 16384 | 32768 | 524288 | 14835.099999999224 |
| 12 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 32768 | 65536 | 1048576 | 26704.540000000765 |
| 13 | lrab_hierarchical_allreduce | mesh_2d_no_wrap | 6 | 49152 | 98304 | 1572864 | 38573.97999999701 |
| 14 | lrab_hierarchical_allreduce | ring_1d | 6 | 8 | 16 | 256 | 2365.255833333347 |
| 15 | lrab_hierarchical_allreduce | ring_1d | 6 | 32 | 64 | 1024 | 2436.9433333333473 |
| 16 | lrab_hierarchical_allreduce | ring_1d | 6 | 64 | 128 | 2048 | 2532.526666666683 |
| 17 | lrab_hierarchical_allreduce | ring_1d | 6 | 128 | 256 | 4096 | 2723.693333333349 |
| 18 | lrab_hierarchical_allreduce | ring_1d | 6 | 512 | 1024 | 16384 | 3048.635000000021 |
| 19 | lrab_hierarchical_allreduce | ring_1d | 6 | 1024 | 2048 | 32768 | 3393.4016666666957 |
| 20 | lrab_hierarchical_allreduce | ring_1d | 6 | 2048 | 4096 | 65536 | 4082.401666666714 |
| 21 | lrab_hierarchical_allreduce | ring_1d | 6 | 4096 | 8192 | 131072 | 5458.80166666677 |
| 22 | lrab_hierarchical_allreduce | ring_1d | 6 | 8192 | 16384 | 262144 | 8216.934999999943 |
| 23 | lrab_hierarchical_allreduce | ring_1d | 6 | 16384 | 32768 | 524288 | 13733.201666665835 |
| 24 | lrab_hierarchical_allreduce | ring_1d | 6 | 32768 | 65536 | 1048576 | 24765.73500000064 |
| 25 | lrab_hierarchical_allreduce | ring_1d | 6 | 49152 | 98304 | 1572864 | 35798.268333331536 |
| 26 | lrab_hierarchical_allreduce | torus_2d | 6 | 8 | 16 | 256 | 1700.6025000000095 |
| 27 | lrab_hierarchical_allreduce | torus_2d | 6 | 32 | 64 | 1024 | 1753.2900000000102 |
| 28 | lrab_hierarchical_allreduce | torus_2d | 6 | 64 | 128 | 2048 | 1823.540000000012 |
| 29 | lrab_hierarchical_allreduce | torus_2d | 6 | 128 | 256 | 4096 | 1964.040000000012 |
| 30 | lrab_hierarchical_allreduce | torus_2d | 6 | 512 | 1024 | 16384 | 2196.8183333333463 |
| 31 | lrab_hierarchical_allreduce | torus_2d | 6 | 1024 | 2048 | 32768 | 2477.2783333333473 |
| 32 | lrab_hierarchical_allreduce | torus_2d | 6 | 2048 | 4096 | 65536 | 3038.1983333333583 |
| 33 | lrab_hierarchical_allreduce | torus_2d | 6 | 4096 | 8192 | 131072 | 4159.5050000000665 |
| 34 | lrab_hierarchical_allreduce | torus_2d | 6 | 8192 | 16384 | 262144 | 6403.185000000109 |
| 35 | lrab_hierarchical_allreduce | torus_2d | 6 | 16384 | 32768 | 524288 | 10890.5449999995 |
| 36 | lrab_hierarchical_allreduce | torus_2d | 6 | 32768 | 65536 | 1048576 | 19865.265000000378 |
| 37 | lrab_hierarchical_allreduce | torus_2d | 6 | 49152 | 98304 | 1572864 | 28839.98500000059 |
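Since the committed CSV is meant to be read on the remote, a short script can help compare the three topologies. The following is a minimal sketch, not part of the bench itself: it assumes the file is plain comma-separated text with the column names shown in the header row above, and it embeds only the smallest and largest payload rows per topology, copied verbatim from the table. The helper names `load_rows`, `fastest_per_size`, and `marginal_ns_per_byte` are hypothetical.

```python
import csv
import io

# Smallest and largest payload rows per topology, copied from the table above.
# Assumption: summary.csv is plain CSV with exactly these column names.
SAMPLE = """algorithm,sip_topology,n_sips,n_elem,bytes_per_pe,bytes_per_sip,latency_ns
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,8,16,256,2666.552500000015
lrab_hierarchical_allreduce,mesh_2d_no_wrap,6,49152,98304,1572864,38573.97999999701
lrab_hierarchical_allreduce,ring_1d,6,8,16,256,2365.255833333347
lrab_hierarchical_allreduce,ring_1d,6,49152,98304,1572864,35798.268333331536
lrab_hierarchical_allreduce,torus_2d,6,8,16,256,1700.6025000000095
lrab_hierarchical_allreduce,torus_2d,6,49152,98304,1572864,28839.98500000059
"""


def load_rows(text):
    """Parse CSV text into dicts with typed size and latency fields."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "topology": rec["sip_topology"],
            "bytes_per_sip": int(rec["bytes_per_sip"]),
            "latency_ns": float(rec["latency_ns"]),
        })
    return rows


def fastest_per_size(rows):
    """Map each bytes_per_sip value to the topology with the lowest latency."""
    best = {}
    for row in rows:
        current = best.get(row["bytes_per_sip"])
        if current is None or row["latency_ns"] < current["latency_ns"]:
            best[row["bytes_per_sip"]] = row
    return {size: row["topology"] for size, row in best.items()}


def marginal_ns_per_byte(rows, topology):
    """Slope between the smallest and largest payload for one topology.

    This is a crude two-point estimate of the per-byte cost, not a fit.
    """
    points = sorted(
        (r["bytes_per_sip"], r["latency_ns"])
        for r in rows
        if r["topology"] == topology
    )
    (b0, t0), (b1, t1) = points[0], points[-1]
    return (t1 - t0) / (b1 - b0)


rows = load_rows(SAMPLE)
winners = fastest_per_size(rows)
slopes = {
    name: marginal_ns_per_byte(rows, name)
    for name in ("mesh_2d_no_wrap", "ring_1d", "torus_2d")
}
```

To run this against the committed file, replace `SAMPLE` with the contents of `summary.csv` read from disk. The two-point slope is only a rough indicator; a least-squares fit over all twelve sizes per topology would be the more careful choice.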