kernbench2

ywkang/kernbench2

Fork 0

Commit Graph

Author	SHA1	Message	Date
ywkang	ff2c677a9c	Add 2D grid program_id semantics (ADR-0022) tl.program_id(axis=0) returns local PE id within cube, tl.program_id(axis=1) returns cube id. Enables cube-aware sharding in benchmark kernels. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 16:49:56 -07:00
ywkang	08812eda58	Add virtual memory support: PE_MMU, VA allocator, fabric MmuMapMsg Implement VA/MMU layer (ADR-0011 Phase 1) enabling Triton kernels to use contiguous virtual addresses on sharded tensors. Key changes: - PE_MMU component: hybrid inbox (MmuMapMsg) + sync translate() for PE_DMA - VirtualAllocator + PEMemAllocator: free-list with coalescing - MmuMapMsg/MmuUnmapMsg fabric path with SIP-level routing - DPPolicy-based mapping: replicate=local, sharded=broadcast - Tensor lifecycle: del + weakref cleanup, context manager - Rename: TensorHandle.pa→addr, DmaReadCmd.src_pa→src_addr, ctx→torch Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 00:01:47 -07:00
ywkang	6f43807900	commit - release 1	2026-03-18 11:47:48 -07:00

Author

SHA1

Message

Date

ywkang

ff2c677a9c

Add 2D grid program_id semantics (ADR-0022)

tl.program_id(axis=0) returns local PE id within cube,
tl.program_id(axis=1) returns cube id. Enables cube-aware
sharding in benchmark kernels.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-09 16:49:56 -07:00

ywkang

08812eda58

Add virtual memory support: PE_MMU, VA allocator, fabric MmuMapMsg

Implement VA/MMU layer (ADR-0011 Phase 1) enabling Triton kernels to use
contiguous virtual addresses on sharded tensors.

Key changes:
- PE_MMU component: hybrid inbox (MmuMapMsg) + sync translate() for PE_DMA
- VirtualAllocator + PEMemAllocator: free-list with coalescing
- MmuMapMsg/MmuUnmapMsg fabric path with SIP-level routing
- DPPolicy-based mapping: replicate=local, sharded=broadcast
- Tensor lifecycle: del + weakref cleanup, context manager
- Rename: TensorHandle.pa→addr, DmaReadCmd.src_pa→src_addr, ctx→torch

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-26 00:01:47 -07:00

ywkang

6f43807900

commit - release 1

2026-03-18 11:47:48 -07:00

3 Commits