kernbench2/benches/ipcq_allreduce.py at 84a1325e5c8b9b2236610a25d08f41a8a607e361 - kernbench2 - YWGitServer

ywkang 08812eda58 Add virtual memory support: PE_MMU, VA allocator, fabric MmuMapMsg

Implement VA/MMU layer (ADR-0011 Phase 1) enabling Triton kernels to use
contiguous virtual addresses on sharded tensors.

Key changes:
- PE_MMU component: hybrid inbox (MmuMapMsg) + sync translate() for PE_DMA
- VirtualAllocator + PEMemAllocator: free-list with coalescing
- MmuMapMsg/MmuUnmapMsg fabric path with SIP-level routing
- DPPolicy-based mapping: replicate=local, sharded=broadcast
- Tensor lifecycle: del + weakref cleanup, context manager
- Rename: TensorHandle.pa→addr, DmaReadCmd.src_pa→src_addr, ctx→torch
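
The "free-list with coalescing" idea named in the list above can be illustrated with a minimal sketch. This is a hypothetical first-fit allocator written for illustration only; it is not the repository's `VirtualAllocator` or `PEMemAllocator`, and the class name, method names, and first-fit policy are all assumptions.

```python
class FreeListAllocator:
    """Hypothetical first-fit allocator over one contiguous address range."""

    def __init__(self, base, size):
        # Free ranges are kept as (start, length) tuples, sorted by start.
        self.free = [(base, size)]

    def alloc(self, length):
        """Return the start address of the first free range that fits."""
        for i, (start, free_len) in enumerate(self.free):
            if free_len >= length:
                if free_len == length:
                    # Exact fit: the free range is consumed entirely.
                    del self.free[i]
                else:
                    # Partial fit: shrink the free range from the front.
                    self.free[i] = (start + length, free_len - length)
                return start
        raise MemoryError("no free range large enough")

    def release(self, start, length):
        """Return a range to the free list and merge adjacent ranges."""
        self.free.append((start, length))
        self.free.sort()
        merged = []
        for s, n in self.free:
            if merged and merged[-1][0] + merged[-1][1] == s:
                # The previous range ends where this one begins: coalesce.
                merged[-1] = (merged[-1][0], merged[-1][1] + n)
            else:
                merged.append((s, n))
        self.free = merged


allocator = FreeListAllocator(base=0, size=100)
first = allocator.alloc(30)
second = allocator.alloc(30)
allocator.release(first, 30)
allocator.release(second, 30)
```

After both releases the three adjacent free ranges collapse back into one, which is the property coalescing is meant to preserve.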

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-03-26 00:01:47 -07:00

3 lines

58 B

Python

	`def run(torch):`
	`    print("IPCQ all reduce kernel bench")`
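
As a rough sketch of how a harness might call this entry point, the snippet below passes a stand-in object as `torch` and captures what the bench prints. The `StubTorch` class and the `capture_bench_output` helper are hypothetical and exist only for this example; the real object passed by kernbench2 is not shown in this file.

```python
import io
from contextlib import redirect_stdout


def run(torch):
    # Same body as the bench file: a placeholder that only prints a banner.
    print("IPCQ all reduce kernel bench")


class StubTorch:
    """Hypothetical stand-in for the object the harness passes as `torch`."""


def capture_bench_output(bench, ctx):
    """Run a bench entry point and return everything it wrote to stdout."""
    buf = io.StringIO()
    with redirect_stdout(buf):
        bench(ctx)
    return buf.getvalue()


output = capture_bench_output(run, StubTorch())
```

Because the bench does not yet touch its argument, any object works as `torch` here; that will stop being true once the kernel body is filled in.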