Add virtual memory support: PE_MMU, VA allocator, fabric MmuMapMsg

Implement VA/MMU layer (ADR-0011 Phase 1) enabling Triton kernels to use
contiguous virtual addresses on sharded tensors.

Key changes:
- PE_MMU component: hybrid inbox (MmuMapMsg) + sync translate() for PE_DMA
- VirtualAllocator + PEMemAllocator: free-list with coalescing
- MmuMapMsg/MmuUnmapMsg fabric path with SIP-level routing
- DPPolicy-based mapping: replicate=local, sharded=broadcast
- Tensor lifecycle: del + weakref cleanup, context manager
- Rename: TensorHandle.pa→addr, DmaReadCmd.src_pa→src_addr, ctx→torch

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
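
For readers unfamiliar with the "free-list with coalescing" technique named in the commit message, here is a minimal sketch. It is not the commit's VirtualAllocator or PEMemAllocator; the class name `SketchFreeListAllocator`, the `Block` record, the first-fit policy, and the address range are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    start: int  # first address of the free range
    size: int   # length of the free range in bytes


class SketchFreeListAllocator:
    """Hypothetical first-fit allocator over one contiguous address range."""

    def __init__(self, base: int, size: int) -> None:
        self._free: list[Block] = [Block(base, size)]  # kept sorted by start
        self._live: dict[int, int] = {}                # start address -> size

    def alloc(self, nbytes: int) -> int:
        if nbytes <= 0:
            raise ValueError("nbytes must be positive")
        for i, blk in enumerate(self._free):
            if blk.size >= nbytes:
                addr = blk.start
                if blk.size == nbytes:
                    del self._free[i]      # exact fit: drop the free block
                else:
                    blk.start += nbytes    # carve from the front
                    blk.size -= nbytes
                self._live[addr] = nbytes
                return addr
        raise MemoryError(f"no free block of {nbytes} bytes")

    def free(self, addr: int) -> None:
        size = self._live.pop(addr)  # KeyError on double free or unknown addr
        # Find the sorted insertion point for the released range.
        i = 0
        while i < len(self._free) and self._free[i].start < addr:
            i += 1
        self._free.insert(i, Block(addr, size))
        # Coalesce with the following block if the two ranges touch.
        if i + 1 < len(self._free) and addr + size == self._free[i + 1].start:
            self._free[i].size += self._free[i + 1].size
            del self._free[i + 1]
        # Coalesce with the preceding block if the two ranges touch.
        if i > 0:
            prev = self._free[i - 1]
            if prev.start + prev.size == self._free[i].start:
                prev.size += self._free[i].size
                del self._free[i]

    def free_blocks(self) -> list[tuple[int, int]]:
        return [(b.start, b.size) for b in self._free]


# Usage: three 64-byte allocations, freed out of order, merge back into one block.
va = SketchFreeListAllocator(base=0x1000, size=256)
a, b, c = va.alloc(64), va.alloc(64), va.alloc(64)
va.free(a)
va.free(c)
va.free(b)
assert va.free_blocks() == [(0x1000, 256)]
```

Coalescing on free keeps the list short and lets a later large allocation succeed even after many small tensors have come and gone, which matters when kernels expect one contiguous virtual range.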
@@ -28,7 +28,7 @@ class TensorHandle:
     """
 
     id: str
-    pa: int  # physical address in HBM/TCM
+    addr: int  # address (VA when MMU enabled, PA otherwise)
     shape: tuple[int, ...]
     dtype: str
     nbytes: int  # total byte size
@@ -50,19 +50,19 @@ class CompletionHandle:
 
 @dataclass(frozen=True)
 class DmaReadCmd:
-    """DMA READ: HBM → PE_TCM."""
+    """DMA READ: HBM → PE_TCM. src_addr is VA (translated to PA by PE_DMA)."""
 
     handle: TensorHandle
-    src_pa: int
+    src_addr: int
     nbytes: int
 
 
 @dataclass(frozen=True)
 class DmaWriteCmd:
-    """DMA WRITE: PE_TCM → HBM."""
+    """DMA WRITE: PE_TCM → HBM. dst_addr is VA (translated to PA by PE_DMA)."""
 
     handle: TensorHandle
-    dst_pa: int
+    dst_addr: int
     nbytes: int
 
 
@@ -108,7 +108,7 @@ class CompositeCmd:
     op: Literal["gemm", "math"]
     a: TensorHandle
     b: TensorHandle | None
-    out_pa: int
+    out_addr: int
     out_nbytes: int
     math_op: str | None = None  # for op="math": which math operation
 
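
The new docstrings say `src_addr` and `dst_addr` are virtual addresses that PE_DMA translates to physical ones. A rough sketch of what a synchronous page-table lookup of that kind might look like follows. It is not the commit's PE_MMU: the class `SketchMmu`, its `map`/`unmap`/`translate` signatures, and the 4 KiB page size are assumptions, since the diff shown here does not include that code.

```python
PAGE_SIZE = 4096  # assumed page size; the commit does not state one


class SketchMmu:
    """Hypothetical single-level page table mapping VA pages to PA pages."""

    def __init__(self) -> None:
        # virtual page number -> base physical address of the backing page
        self._table: dict[int, int] = {}

    def map(self, va: int, pa: int, nbytes: int) -> None:
        if va % PAGE_SIZE or pa % PAGE_SIZE:
            raise ValueError("va and pa must be page-aligned")
        pages = -(-nbytes // PAGE_SIZE)  # ceiling division
        for i in range(pages):
            self._table[va // PAGE_SIZE + i] = pa + i * PAGE_SIZE

    def unmap(self, va: int, nbytes: int) -> None:
        pages = -(-nbytes // PAGE_SIZE)
        for i in range(pages):
            self._table.pop(va // PAGE_SIZE + i, None)

    def translate(self, va: int) -> int:
        # Split the VA into page number and in-page offset, then look up the page.
        vpn, offset = divmod(va, PAGE_SIZE)
        try:
            return self._table[vpn] + offset
        except KeyError:
            raise LookupError(f"unmapped VA {va:#x}") from None


# Usage: map two pages, then translate addresses the way a DMA engine
# might before issuing a read for a command's src_addr.
mmu = SketchMmu()
mmu.map(va=0x10000, pa=0x80000, nbytes=2 * PAGE_SIZE)
assert mmu.translate(0x10010) == 0x80010
assert mmu.translate(0x11004) == 0x81004
```

Keeping translation a plain synchronous call, as the commit message describes for PE_DMA, means the command dataclasses only need to carry an address and never have to know whether the MMU is enabled.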