63669f82cb
- DPPolicy: 3-level (sip/cube/pe), unified naming (column_wise/row_wise)
- PE_CPU: auto num_programs from cube shard count
- context.launch(): per-SIP KernelLaunchMsg with local va_base + auto local shape
- deploy_tensor: removed mmus param, MMU mapping is context-only responsibility
- ComponentRegistry: YAML-based lazy loading (components.yaml), impls→builtin rename
- VA offset bench + tests: 2D/1D, standard Triton kernel pattern

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
54 lines
2.6 KiB
YAML
# Component implementation registry.
# Maps impl names (used in topology.yaml) to Python class paths.
# Format: impl_name: module.path:ClassName
#
# ── Adding custom components ──────────────────────────────────────────
#
# 1. Create your implementation in:
#      src/kernbench/components/custom/<your_component>.py
#
#    Your class must inherit from ComponentBase (or PeEngineBase for PE engines).
#
# 2. Register it below under "Custom" with a unique impl name:
#      my_pe_cpu_v2: kernbench.components.custom.my_pe_cpu:MyPeCpuComponent
#
# 3. Reference it in topology.yaml:
#      pe_cpu: { kind: pe_cpu, impl: my_pe_cpu_v2, attrs: { ... } }
#
# 4. Add unit tests in:
#      tests/custom/test_<your_component>.py
#
# External packages also work; use the full module path:
#      fast_gemm_v1: my_team.accel.fast_gemm:FastGemmComponent
# ──────────────────────────────────────────────────────────────────────

components:
  # Infrastructure
  forwarding_v1: kernbench.components.builtin.forwarding:TransitComponent
  switch_v1: kernbench.components.builtin.forwarding:TransitComponent
  noc_v1: kernbench.components.builtin.forwarding:TransitComponent
  ucie_v1: kernbench.components.builtin.forwarding:TransitComponent
  noc_2d_mesh_v1: kernbench.components.builtin.noc:TwoDMeshNocComponent
  xbar_v1: kernbench.components.builtin.xbar:PositionAwareXbarComponent

  # IO / Host interface
  pcie_ep_v1: kernbench.components.builtin.pcie_ep:PcieEpComponent
  io_cpu_v1: kernbench.components.builtin.io_cpu:IoCpuComponent

  # Cube-level
  m_cpu_v1: kernbench.components.builtin.m_cpu:MCpuComponent
  hbm_ctrl_v1: kernbench.components.builtin.hbm_ctrl:HbmCtrlComponent
  sram_v1: kernbench.components.builtin.sram:SramComponent

  # PE-level
  pe_cpu_v1: kernbench.components.builtin.pe_cpu:PeCpuComponent
  pe_scheduler_v1: kernbench.components.builtin.pe_scheduler:PeSchedulerComponent
  pe_dma_v1: kernbench.components.builtin.pe_dma:PeDmaComponent
  pe_gemm_v1: kernbench.components.builtin.pe_gemm:PeGemmComponent
  pe_math_v1: kernbench.components.builtin.pe_math:PeMathComponent
  pe_mmu_v1: kernbench.components.builtin.pe_mmu:PeMmuComponent
  pe_tcm_v1: kernbench.components.builtin.pe_tcm:PeTcmComponent

  # Custom: add your implementations here
  # pe_cpu_v2: kernbench.components.custom.my_pe_cpu:MyPeCpuComponent
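
The `impl_name: module.path:ClassName` format used in this registry can be resolved lazily with the standard library alone. The following is a minimal sketch, not the actual `ComponentRegistry` implementation: the names `resolve_impl` and `LazyRegistry` are hypothetical, and the example entry points at a standard-library class (`collections:OrderedDict`) so that it runs without kernbench installed.

```python
import importlib


def resolve_impl(spec: str):
    """Resolve a 'module.path:ClassName' string to the class object it names."""
    module_path, sep, class_name = spec.partition(":")
    if not sep or not module_path or not class_name:
        raise ValueError(f"expected 'module.path:ClassName', got {spec!r}")
    # The import happens here, so unused entries never pay the import cost.
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


class LazyRegistry:
    """Hypothetical registry: stores spec strings, imports a class on first lookup."""

    def __init__(self, components: dict):
        self._specs = dict(components)  # impl name -> 'module.path:ClassName'
        self._cache = {}                # impl name -> resolved class

    def get(self, impl_name: str):
        if impl_name not in self._cache:
            self._cache[impl_name] = resolve_impl(self._specs[impl_name])
        return self._cache[impl_name]


# Stand-in for the parsed `components:` mapping; the entry is illustrative only.
registry = LazyRegistry({"ordered_map_v1": "collections:OrderedDict"})
cls = registry.get("ordered_map_v1")
```

Deferring the import until `get()` is called means that a topology referencing only a few impl names never imports the modules behind the others, which is presumably the motivation for lazy loading here.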