CCL allreduce: rename to lrab_hierarchical_allreduce + descriptive plots
Rename the intercube all-reduce identity to lrab_hierarchical_allreduce (module, config key, distributed test) so the name reflects both levels it implements: LRAB intra-SIP (local reduce to center root + broadcast) and the hierarchical inter-SIP topology exchange (ring/torus/mesh). ADR-0032 slug kept as the stable decision id; pure rename, no logic change.

Also in this batch:
- ADR-0032 (EN+KO): document the shipped center-root bidirectional reduce (doc was stale corner-root); annotate ccl.yaml root_cube as a placeholder.
- Rename allreduce + pe2pe latency plots to descriptive, title-matching filenames and retitle the in-plot headings; drop overview/overview_log.
- Point the PPTX image refs at the new plot names.

Doc + derived-artifact + rename only; no simulation behavior changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
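
For readers unfamiliar with the two levels the new name refers to, here is a minimal sketch of the data flow, not the simulator's implementation. The function names, the `values[s][c]` layout, and the use of a plain unidirectional ring for the inter-SIP step are all assumptions made for illustration; the real module also supports torus and mesh exchange and models latency, which this sketch ignores.

```python
def ring_allreduce(partials):
    """Hypothetical inter-SIP step: every SIP root ends with the global sum.

    Each of the n roots forwards the value it last received to its ring
    neighbour; after n - 1 steps every root has accumulated all partials.
    """
    n = len(partials)
    acc = list(partials)
    in_flight = list(partials)
    for _ in range(n - 1):
        # Root i receives whatever root (i - 1) % n sent in this step.
        in_flight = [in_flight[(i - 1) % n] for i in range(n)]
        acc = [a + m for a, m in zip(acc, in_flight)]
    return acc


def lrab_hierarchical_allreduce_sketch(values):
    """values[s][c] is the scalar held by cube c of SIP s (assumed layout)."""
    # Intra-SIP, "LR": local reduce of every cube's value to the SIP root.
    sip_partials = [sum(sip) for sip in values]
    # Inter-SIP: exchange partial sums between SIP roots (ring assumed here).
    sip_totals = ring_allreduce(sip_partials)
    # Intra-SIP, "AB": broadcast the result from the root back to all cubes.
    return [[sip_totals[s] for _ in sip] for s, sip in enumerate(values)]


# 6 SIPs x 16 cubes, matching the shape used in the test below.
values = [[s * 16 + c for c in range(16)] for s in range(6)]
result = lrab_hierarchical_allreduce_sketch(values)
```

After the call, every cube in every SIP holds the same global sum, which is the property the `ok_cubes` assertion in the distributed test checks for.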
@@ -1,7 +1,7 @@
-"""Phase 1 test for moving the intercube_allreduce root cube from the
+"""Phase 1 test for moving the lrab_hierarchical_allreduce root cube from the
 bottom-right corner (3,3) to the geometric center (2,2).
 
-Today's algorithm (intercube_allreduce.py) hardcodes
+Today's algorithm (lrab_hierarchical_allreduce.py) hardcodes
 ``root_cube = (cube_h-1) * cube_w + (cube_w-1)`` (= cube 15 in 4×4).
 The intra-SIP critical path for one allreduce is therefore::
 
@@ -55,7 +55,7 @@ def _run_torus_96kb(tmp_path: Path) -> float:
 sub,
 sip_topology="torus_2d",
 n_sips=6,
-algorithm="intercube_allreduce",
+algorithm="lrab_hierarchical_allreduce",
 sip_w=3, sip_h=2,
 n_elem_override=49152, # 49152 × 2 = 96 KB / slot
 )
@@ -70,7 +70,7 @@ def _run_torus_96kb(tmp_path: Path) -> float:
 ) as ctx:
 result = run_allreduce(
 ctx, engine, spec,
-algorithm="intercube_allreduce", ccl_yaml=ccl_path,
+algorithm="lrab_hierarchical_allreduce", ccl_yaml=ccl_path,
 )
 assert result["ok_cubes"] > 0
 pe_exec_vals = [
@@ -121,7 +121,7 @@ def test_correctness_preserved(tmp_path):
 sub,
 sip_topology="torus_2d",
 n_sips=6,
-algorithm="intercube_allreduce",
+algorithm="lrab_hierarchical_allreduce",
 sip_w=3, sip_h=2,
 n_elem_override=128, # tiny payload to keep this fast
 )
@@ -136,7 +136,7 @@ def test_correctness_preserved(tmp_path):
 ) as ctx:
 result = run_allreduce(
 ctx, engine, spec,
-algorithm="intercube_allreduce", ccl_yaml=ccl_path,
+algorithm="lrab_hierarchical_allreduce", ccl_yaml=ccl_path,
 )
 n_cubes = 6 * 16 # 6 SIPs × 16 cubes/SIP
 assert result["ok_cubes"] == n_cubes, (
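
The test docstring in the first hunk contrasts a corner root with a center root. As a rough illustration of why that matters, the sketch below reproduces the `root_cube` formula and compares the worst-case hop count to each root. The helper names, the row-major `(row, col)` indexing, and the use of Manhattan distance as a stand-in for the intra-SIP critical path are assumptions; the docstring's actual critical-path expression is elided in this diff and is not reconstructed here.

```python
def cube_index(row, col, cube_w):
    """Row-major cube id, consistent with the root_cube formula in the docstring."""
    return row * cube_w + col


def max_hops(root_row, root_col, cube_h, cube_w):
    """Worst-case Manhattan distance from any cube to the root (assumed cost model)."""
    return max(
        abs(r - root_row) + abs(c - root_col)
        for r in range(cube_h)
        for c in range(cube_w)
    )


cube_h = cube_w = 4

# Corner root as quoted in the docstring: (cube_h-1) * cube_w + (cube_w-1).
corner_root = (cube_h - 1) * cube_w + (cube_w - 1)
# Center root at (2,2), the target named in the docstring.
center_root = cube_index(2, 2, cube_w)

corner_depth = max_hops(3, 3, cube_h, cube_w)  # farthest cube is (0,0)
center_depth = max_hops(2, 2, cube_h, cube_w)  # farthest cube is (0,0)
```

Under these assumptions the farthest cube is 6 hops from the corner root but only 4 hops from the center root, which is the kind of reduction the Phase 1 test is set up to measure.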