kernbench2/docs/diagrams/pe2pe_latency_plots/summary.csv
mukesh 1c5752a9ec Intercube allreduce: center root + bidirectional reduce
Move the algorithmic root cube from the corner (cube_w-1,
cube_h-1) to the geometric center (cube_w//2, cube_h//2) and
have each phase converge bidirectionally so the intra-SIP
critical path drops from ~12 hops to ~8 hops on a 4×4 mesh
(left half W→E + right half E→W in row reduce; top half N→S +
bottom half S→N in col reduce; mirrored on broadcast).
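
The hop counts above can be reproduced with a small sketch. It assumes a mesh without wraparound, one hop per neighbouring-cube step, and a row phase followed by a column phase that is mirrored on broadcast. The helper name critical_path_hops is hypothetical and is not taken from the kernbench2 sources.

```python
def critical_path_hops(cube_w, cube_h, root):
    """Hops on the longest reduce + broadcast path for a given root cube.

    Assumptions (not from the repo): no wraparound links, one hop per
    neighbouring-cube step, row phase then column phase, mirrored on broadcast.
    """
    root_x, root_y = root
    # Farthest cube in the row phase: whichever side of the root column is longer.
    row_hops = max(root_x, cube_w - 1 - root_x)
    # Farthest cube in the column phase: whichever side of the root row is longer.
    col_hops = max(root_y, cube_h - 1 - root_y)
    # Reduce path plus the mirrored broadcast path.
    return 2 * (row_hops + col_hops)


cube_w, cube_h = 4, 4
corner = critical_path_hops(cube_w, cube_h, (cube_w - 1, cube_h - 1))
center = critical_path_hops(cube_w, cube_h, (cube_w // 2, cube_h // 2))
print(corner, center)  # 12 8
```

With both halves of a row or column converging on the root at the same time, only the longer half sits on the critical path, which is where the 12 to 8 reduction comes from under these assumptions.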

Result on torus_2d 6 SIPs at 96 KB / PE on TCM:
  before (corner root)  : 22.0 µs
  after  (center root)  : 17.2 µs   (−22%)

The same trend holds on ring_1d (−7%) and mesh_2d_no_wrap (−12%),
and across SRAM and HBM (about −20% each).

Phase 1 test (test_intercube_root_center.py) asserts the
torus_2d 96 KB latency drops below 20.5 µs and that all 96
cubes still validate (correctness preserved).

Plot updates:
- overview.png: replace constant 10.6 µs theoretical line with
  user-supplied hand-derived curve (per-cube packet count =
  bytes_per_pe × 8 PEs ÷ 128 B; 1346 ns startup + 1.20 ns/pkt).
- All summary.csv numbers and per-topology PNGs regenerated.
- pe2pe_latency_plots and ipcq diagram emitter PNGs refreshed.
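
For reference, the hand-derived curve described for overview.png can be evaluated as below. This is only a sketch of the stated formula: the function name is hypothetical, and reading 96 KB as 96 × 1024 bytes is an assumption.

```python
PACKET_BYTES = 128      # packet size from the formula above
PES_PER_CUBE = 8        # PEs per cube from the formula above
STARTUP_NS = 1346.0     # fixed startup term
NS_PER_PACKET = 1.20    # per-packet term


def theoretical_latency_ns(bytes_per_pe):
    """Startup cost plus a per-packet cost for one cube's worth of data."""
    packets_per_cube = bytes_per_pe * PES_PER_CUBE / PACKET_BYTES
    return STARTUP_NS + NS_PER_PACKET * packets_per_cube


# Assumption: "96 KB" means 96 * 1024 bytes per PE, giving 6144 packets per cube.
latency_ns = theoretical_latency_ns(96 * 1024)
print(f"{latency_ns / 1000:.2f} µs")  # 8.72 µs
```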

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 21:28:58 -07:00

6.6 KiB

hop,label,size_bytes,path,total_ns
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),128,ipcq,31.6399999999976
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),128,raw,12.019999999996799
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),256,ipcq,33.6399999999976
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),256,raw,13.019999999996799
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),384,ipcq,35.6399999999976
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),384,raw,14.019999999996799
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),512,ipcq,37.6399999999976
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),512,raw,15.019999999996799
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),768,ipcq,41.6399999999976
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),768,raw,17.0199999999968
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),1024,ipcq,45.6399999999976
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),1024,raw,19.0199999999968
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),2048,ipcq,61.6399999999976
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),2048,raw,27.0199999999968
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),4096,ipcq,93.6399999999976
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),4096,raw,43.0199999999968
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),8192,ipcq,157.64000000000306
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),8192,raw,75.02000000000407
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),10240,ipcq,189.64000000000306
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),10240,raw,91.02000000000407
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),128,ipcq,31.6399999999976
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),128,raw,12.019999999996799
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),256,ipcq,33.6399999999976
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),256,raw,13.019999999996799
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),384,ipcq,35.6399999999976
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),384,raw,14.019999999996799
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),512,ipcq,37.6399999999976
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),512,raw,15.019999999996799
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),768,ipcq,41.6399999999976
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),768,raw,17.0199999999968
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),1024,ipcq,45.6399999999976
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),1024,raw,19.0199999999968
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),2048,ipcq,61.6399999999976
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),2048,raw,27.0199999999968
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),4096,ipcq,93.6399999999976
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),4096,raw,43.0199999999968
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),8192,ipcq,157.64000000000306
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),8192,raw,75.02000000000407
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),10240,ipcq,189.64000000000306
h2_intra_vertical,Intra-cube vertical (pe0 to pe4),10240,raw,91.02000000000407
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),128,ipcq,67.65999999999804
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),128,raw,68.53999999999724
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),256,ipcq,69.65999999999804
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),256,raw,70.03999999999724
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),384,ipcq,71.65999999999804
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),384,raw,71.53999999999724
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),512,ipcq,73.65999999999804
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),512,raw,73.03999999999724
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),768,ipcq,77.65999999999804
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),768,raw,76.03999999999724
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),1024,ipcq,81.65999999999804
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),1024,raw,79.03999999999724
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),2048,ipcq,97.65999999999804
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),2048,raw,91.03999999999724
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),4096,ipcq,129.65999999999804
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),4096,raw,115.03999999999724
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),8192,ipcq,193.65999999999985
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),8192,raw,163.04000000000087
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),10240,ipcq,225.65999999999985
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),10240,raw,187.04000000000087
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),128,ipcq,87.65999999999804
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),128,raw,88.53999999999724
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),256,ipcq,89.65999999999804
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),256,raw,90.03999999999724
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),384,ipcq,91.65999999999804
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),384,raw,91.53999999999724
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),512,ipcq,93.65999999999804
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),512,raw,93.03999999999724
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),768,ipcq,97.65999999999804
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),768,raw,96.03999999999724
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),1024,ipcq,101.65999999999804
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),1024,raw,99.03999999999724
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),2048,ipcq,117.65999999999804
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),2048,raw,111.03999999999724
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),4096,ipcq,149.65999999999804
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),4096,raw,135.03999999999724
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),8192,ipcq,213.65999999999985
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),8192,raw,183.04000000000087
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),10240,ipcq,245.65999999999985
h4_inter_cube_vertical,Inter-cube vertical (cube0 to cube4),10240,raw,207.04000000000087
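
One way to read the table above is as a fixed cost plus a per-byte cost for each (hop, path) pair. The sketch below estimates the incremental cost per 128 B from the smallest and largest sizes of each series. It embeds a few rows copied from the table so that it runs on its own; the function name ns_per_128_bytes is hypothetical, and treating each series as linear is an assumption, not something the file states.

```python
import csv
import io

# A handful of rows copied from the table above; the full summary.csv could be
# read from disk instead.
SAMPLE = """hop,label,size_bytes,path,total_ns
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),128,ipcq,31.6399999999976
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),128,raw,12.019999999996799
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),10240,ipcq,189.64000000000306
h1_intra_horizontal,Intra-cube horizontal (pe0 to pe1),10240,raw,91.02000000000407
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),128,ipcq,67.65999999999804
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),128,raw,68.53999999999724
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),10240,ipcq,225.65999999999985
h3_inter_cube_horizontal,Inter-cube horizontal (cube0 to cube1),10240,raw,187.04000000000087
"""


def ns_per_128_bytes(csv_text):
    """Return {(hop, path): added nanoseconds per extra 128 B of payload}."""
    series = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (row["hop"], row["path"])
        point = (int(row["size_bytes"]), float(row["total_ns"]))
        series.setdefault(key, []).append(point)

    slopes = {}
    for key, points in series.items():
        points.sort()  # order by size_bytes
        (size_lo, ns_lo), (size_hi, ns_hi) = points[0], points[-1]
        # Assumes latency is linear in size between the two endpoints.
        slopes[key] = (ns_hi - ns_lo) / (size_hi - size_lo) * 128
    return slopes


slopes = ns_per_128_bytes(SAMPLE)
for (hop, path), slope in sorted(slopes.items()):
    print(f"{hop:26s} {path:5s} {slope:.2f} ns per 128 B")
```

On these rows the ipcq path adds about 2 ns per 128 B on both hops, while the raw path adds about 1 ns intra-cube and about 1.5 ns inter-cube, which is why ipcq and raw cross over between 256 and 384 bytes on the inter-cube hops.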