CCL allreduce: rename to lrab_hierarchical_allreduce + descriptive plots

Rename the intercube all-reduce identity to lrab_hierarchical_allreduce
(module, config key, distributed test) so the name reflects both levels it
implements: LRAB intra-SIP (local reduce to center root + broadcast) and the
hierarchical inter-SIP topology exchange (ring/torus/mesh). ADR-0032 slug
kept as the stable decision id; pure rename, no logic change.

Also in this batch:
- ADR-0032 (EN+KO): document the shipped center-root bidirectional reduce
  (doc was stale corner-root); annotate ccl.yaml root_cube as a placeholder.
- Rename allreduce + pe2pe latency plots to descriptive, title-matching
  filenames and retitle the in-plot headings; drop overview/overview_log.
- Point the PPTX image refs at the new plot names.

Doc + derived-artifact + rename only; no simulation behavior changed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
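
As a rough illustration of the two levels the new name refers to, the sketch below simulates an LRAB-style sum all-reduce in plain Python: each group reduces to one local root, the roots exchange partial sums around a ring, and each root broadcasts the result back to its group. The function name, the list-of-lists data layout, and the use of a ring for the inter-group step are assumptions made for illustration. None of it is taken from the repository, and it does not model the center-root bidirectional reduce over a cube topology.

```python
from typing import List


def lrab_allreduce_sketch(groups: List[List[float]]) -> List[List[float]]:
    """Simulate a two-level sum all-reduce over groups of scalar values.

    Hypothetical model: each inner list stands for one SIP and each
    element for one participant's value. Not the repository's code.
    """
    # Level 1 (intra-group): local reduce. After this step each group's
    # root holds the sum of that group's values.
    partial = [sum(g) for g in groups]

    # Level 2 (inter-group): ring exchange among the roots. In every step
    # each root forwards the value it last received to its right-hand
    # neighbour and adds the value arriving from its left-hand neighbour.
    n = len(groups)
    acc = list(partial)
    in_flight = list(partial)
    for _ in range(n - 1):
        in_flight = [in_flight[(i - 1) % n] for i in range(n)]
        acc = [a + v for a, v in zip(acc, in_flight)]

    # Level 1 again (intra-group): each root broadcasts the global sum.
    return [[acc[i]] * len(g) for i, g in enumerate(groups)]
```

After `n - 1` ring steps every root has seen every other root's partial sum exactly once, so all roots agree on the global sum before the broadcast.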
@@ -56,13 +56,17 @@ class Hop:
 
 
 HOPS = [
-    Hop("h1_intra_horizontal", "Intra-cube horizontal (pe0 to pe1)",
+    Hop("latency_intracube_PE0_to_PE1_horizontal",
+        "Intra-cube PE-to-PE latency: PE0 → PE1 (horizontal)",
         (0, 0, 0), (0, 0, 1), "intra_E", "intra_W", True),
-    Hop("h2_intra_vertical", "Intra-cube vertical (pe0 to pe4)",
+    Hop("latency_intracube_PE0_to_PE4_vertical",
+        "Intra-cube PE-to-PE latency: PE0 → PE4 (vertical)",
         (0, 0, 0), (0, 0, 4), "intra_S", "intra_N", True),
-    Hop("h3_inter_cube_horizontal", "Inter-cube horizontal (cube0 to cube1)",
+    Hop("latency_intercube_C0PE0_to_C1PE0_horizontal",
+        "Inter-cube PE-to-PE latency: Cube0.PE0 → Cube1.PE0 (horizontal)",
         (0, 0, 0), (0, 1, 0), "E", "W", True),
-    Hop("h4_inter_cube_vertical", "Inter-cube vertical (cube0 to cube4)",
+    Hop("latency_intercube_C0PE0_to_C4PE0_vertical",
+        "Inter-cube PE-to-PE latency: Cube0.PE0 → Cube4.PE0 (vertical)",
         (0, 0, 0), (0, 4, 0), "S", "N", True),
 ]
 
@@ -80,7 +84,7 @@ def _measure_ipcq(hop: Hop, nbytes: int) -> float:
     engine, spec = _make_engine()
 
     cfg = load_ccl_config()
-    merged = resolve_algorithm_config(cfg, name="intercube_allreduce")
+    merged = resolve_algorithm_config(cfg, name="lrab_hierarchical_allreduce")
     merged["slot_size"] = max(int(merged.get("slot_size", 4096)), nbytes)
 
     n_elem = nbytes // ELEM_BYTES
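
The sizing logic visible in the second hunk can be tried in isolation. The sketch below is a minimal stand-alone version, assuming the resolved config is a plain dict; the helper name `size_for_payload` and the value `ELEM_BYTES = 2` are hypothetical, since the real constant is defined outside the lines shown in this diff.

```python
from typing import Any, Dict, Tuple

ELEM_BYTES = 2  # assumption: 2-byte elements; the real value is not shown in the diff


def size_for_payload(merged: Dict[str, Any], nbytes: int) -> Tuple[Dict[str, Any], int]:
    """Return a copy of the config whose slot fits the payload, plus the element count."""
    sized = dict(merged)  # copy so the caller's config is left untouched
    # Grow the slot to the payload size if the configured (or default 4096)
    # slot is smaller; never shrink it.
    sized["slot_size"] = max(int(sized.get("slot_size", 4096)), nbytes)
    # Integer division: any trailing partial element is dropped.
    n_elem = nbytes // ELEM_BYTES
    return sized, n_elem
```

With these assumptions, a 1024-byte payload keeps the default 4096-byte slot, while a 16384-byte payload raises `slot_size` to 16384.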