kernbench2/docs/adr-ko/INDEX.md
ywkang e33e76f2d1 adr: add INDEX.md (auto-generated by tools/generate_adr_index.py)
Adds a section-based table of contents for the 46-ADR corpus, mirroring
the /report skill's classification (Design Principles / High-level
Architecture / Detailed Architecture by component / Implementation
Decisions by topic). Generated for both docs/adr/ (EN titles) and
docs/adr-ko/ (KO titles) from one tool.

tools/generate_adr_index.py:
- A single CLASSIFICATION dict holds one entry per ADR. Add an entry when
  introducing a new ADR; the script fails loudly if any ADR file is missing
  from the table.
- DETAILED_COMPONENTS lists each builtin component and the ADR(s) that
  cover it (ADR-0014 appears under six PE engines; ADR-0023 under
  pe_dma + pe_ipcq).
- Accepts both ":" and "—" title separators (matching ADR-0033's
  existing format).
- --check mode for CI: exits 1 if INDEX.md is stale.
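
The title-separator handling and the `--check` exit code described above can be sketched as follows. This is a minimal illustration, not the actual contents of tools/generate_adr_index.py: the names `TITLE_RE`, `parse_title`, and `check_index`, and the assumption that a title line looks like `# ADR-NNNN<separator> Title`, are hypothetical.

```python
import re

# Hypothetical pattern: accepts "ADR-0033: Title" as well as the
# em-dash form (U+2014) used by most other ADR titles.
TITLE_RE = re.compile(r"^#?\s*ADR-(\d{4})\s*(?::|\u2014)\s*(.+?)\s*$")


def parse_title(line: str) -> tuple[str, str]:
    """Split a title line into (ADR id, title text)."""
    match = TITLE_RE.match(line)
    if match is None:
        raise ValueError(f"unrecognized ADR title line: {line!r}")
    return f"ADR-{match.group(1)}", match.group(2)


def check_index(current: str, regenerated: str) -> int:
    """Exit code for a --check style run: 0 if INDEX.md is fresh, 1 if stale."""
    return 0 if current == regenerated else 1
```

Only the first separator after the ADR number is consumed, so a title that itself contains a further separator is kept intact in the title text.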

Also includes the docs/report/architecture-2026-1H.md generated by the
prior /report write (the public-facing architecture document; 836 lines,
76 source-attribution comments).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-22 11:15:37 -07:00


ADR Index

Auto-generated by tools/generate_adr_index.py. Total ADRs: 46.

Classification mirrors the /report skill's section assignment. When adding a new ADR, also add an entry to the CLASSIFICATION table in tools/generate_adr_index.py.
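
As a rough sketch of what adding an entry involves, the snippet below models the table as a mapping from ADR id to section and fails loudly on any ADR that has no entry. The dict shape, the `classify` function, and the error message are assumptions made for illustration; the real CLASSIFICATION table in tools/generate_adr_index.py may be structured differently.

```python
# Hypothetical shape of the table; the three entries mirror this index.
CLASSIFICATION = {
    "ADR-0013": "Design Principles",
    "ADR-0003": "High-level Architecture",
    "ADR-0010": "CLI Surface and Semantics",
}


def classify(adr_ids):
    """Group ADR ids by section, aborting if any id lacks a table entry."""
    missing = sorted(set(adr_ids) - set(CLASSIFICATION))
    if missing:
        # Fail loudly rather than silently omitting an ADR from the index.
        raise SystemExit(f"unclassified ADRs: {', '.join(missing)}")
    sections = {}
    for adr_id in sorted(adr_ids):
        sections.setdefault(CLASSIFICATION[adr_id], []).append(adr_id)
    return sections
```

Under this sketch, a newly added ADR file without a matching table entry stops generation instead of producing an incomplete index.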

Design Principles

  • ADR-0013 — Verification Strategy and Phase 1 Test Plan
  • ADR-0033 — Latency Model: Assumptions and Known Simplifications

High-level Architecture

  • ADR-0003 — Target System Hierarchy and Modeling Scope (Tray / SIP / CUBE / PE)
  • ADR-0007 — Runtime API and Simulation Engine Boundaries (Runtime API ↔ sim_engine)
  • ADR-0016 — IOChiplet NoC and Memory Data Path
  • ADR-0017 — Cube NoC and HBM Connectivity

Detailed Architecture

One subsection per component file under src/kernbench/components/builtin/.

forwarding

  • ADR-0037 — Forwarding Component (forwarding_v1)

hbm_ctrl

  • ADR-0034 — HBM Controller Internal Design

io_cpu

  • ADR-0036 — IO_CPU Component Model

m_cpu

  • ADR-0035 — M_CPU and M_CPU.DMA Component Model

pcie_ep

pe_cpu

  • ADR-0014 — PE Pipeline Execution Model

pe_dma

  • ADR-0014 — PE Pipeline Execution Model
  • ADR-0023 — PE-level IPCQ — Inter-PE Collective Communication

pe_fetch_store

  • ADR-0014 — PE Pipeline Execution Model

pe_gemm

  • ADR-0014 — PE Pipeline Execution Model

pe_ipcq

  • ADR-0023 — PE-level IPCQ — Inter-PE Collective Communication

pe_math

  • ADR-0014 — PE Pipeline Execution Model

pe_mmu

  • ADR-0039 — PE_MMU Component Model — Dual Role as Component and Utility

pe_scheduler

  • ADR-0014 — PE Pipeline Execution Model

pe_tcm

  • ADR-0040 — PE_TCM Component Model — Dual-Channel BW Serialization

sram

  • ADR-0041 — Cube SRAM Component Model — terminal scratchpad on cube NoC

tiling

  • ADR-0042 — Tile Plan Generators — GEMM/Math Pipeline Plan Builders

Implementation Decisions

Address Scheme

  • ADR-0001 — 51-bit Physical Address Layout and Decoding Contract
  • ADR-0011 — Memory Addressing — PA / VA / LA Address Model

Routing & Helper API

  • ADR-0002 — Routing Distance, Ordering, and Detour Rules
  • ADR-0051 — Routing Helper API — AddressResolver + PathRouter

Memory Semantics & Local-HBM Bandwidth

  • ADR-0004 — Memory Semantics and Local HBM Bandwidth Guarantees

Topology Compilation, Diagrams & Builder Algorithms

  • ADR-0005 — Diagram Views and Distance-Based Layout Rules
  • ADR-0006 — Topology Compilation, Distance Extraction, and Automatic Diagram Generation
  • ADR-0053 — Topology Builder + Visualizer Algorithms

Tensor Deployment and Allocation

  • ADR-0008 — Tensor Deployment and Allocation (Host Allocator, PA-first)

Kernel Execution and Host-Device Messaging

  • ADR-0009 — Kernel Execution Messaging and Completion Semantics
  • ADR-0012 — Host ↔ IO_CPU Message Schema (PA-first, PE-tagged)

CLI Surface and Semantics

  • ADR-0010 — Command-Line Interface and Execution Semantics

Component Port/Wire Fabric Model

  • ADR-0015 — Component Port/Wire Model and Fabric Routing

Two-Pass Data Execution

  • ADR-0020 — 2-Pass Data Execution Model (Timing / Data Separation)

2D Grid Program Identity

  • ADR-0022 — 2D Grid program_id Semantics

Parallelism (Launcher, DP, TP, AHBM backend, CCL algorithm)

  • ADR-0024 — SIP-level Launcher — rank = SIP
  • ADR-0026 — DPPolicy = Intra-Device Only — Removal of the sip/num_sips Fields
  • ADR-0027 — Megatron-style Tensor Parallelism API
  • ADR-0047 — AHBM CCL Backend — torch.distributed-compat shim
  • ADR-0050 — CCL Algorithm Module Contract — ccl/algorithms/*.py

IPCQ Direction Addressing

  • ADR-0025 — IPCQ Direction Addressing — address-based matching

Intercube All-Reduce

  • ADR-0032 — Inter-Cube All-Reduce — pe0 Cube-Mesh Reduce + Multi-SIP Exchange

Evaluation Harnesses

  • ADR-0043 — Allreduce Evaluation Harness — tests/sccl/
  • ADR-0044 — GEMM Evaluation Harness — scripts/gemm_sweep.py + tests/gemm/

Bench Module Contract

  • ADR-0045 — Bench Module Contract — registration, dispatch, and authoring

Kernel-side tl.* API (TLContext)

  • ADR-0046 — TLContext — Kernel-side tl.* API Contract

Memory Allocator Algorithms

  • ADR-0048 — Memory Allocator Algorithms — VirtualAllocator + PEMemAllocator

Probe Subcommand

  • ADR-0049 — kernbench probe Subcommand — Traffic-Pattern Verification Harness

Sim-engine Op Log and Memory Store Schemas

  • ADR-0052 — OpLog + MemoryStore Schemas — sim_engine internals