# ADR-0001: 51-bit Physical Address Layout & Decoding Contract

## Status

Accepted (Revision 2, 2026-04-27: concrete bit layout, rack_id removal, Tray->SIP / SIP->DIE renaming, PE/MCPU/IOCPU sub-unit tables. Supersedes ADR-0031.)

## Date

2026-04-27 (original: 2026-02-27)

## Context

KernBench requires a stable, parsable physical address scheme that:

- can be decoded into routing domains (SIP / die / HBM / PE-resource / IOCPU)
- remains topology-agnostic (no hardcoded counts)
- supports swappable policy and DI-first components
- covers multiple SIPs, AHBM dies, and IO chiplet dies in a unified space

### History

- Original ADR-0001 defined a 51-bit layout with `rack_id(4) + sip_id(4) + sip_seg(5) + local_offset(38)`. `rack_id` was never used in practice.
- ADR-0031 (stub) requested PE-resource range partition but was never implemented.

Revision 2 removes `rack_id`, renames `sip_seg -> die_id`, and provides concrete sub-unit tables for PE, MCPU, CUBE_SRAM, and IOCPU resources. ADR-0031 is superseded.

## Decision

We define a **PhysAddr value object** and an **address decoding contract** that converts an integer address into routing domains.

### D1. PhysAddr is an immutable value object

- PhysAddr is immutable and comparable as a pure value.
- Any allocator returns a **fully specified PhysAddr** (not partial metadata).
- No global state may be required to interpret a PhysAddr.

### D2. 51-bit Physical Address Layout

A 51-bit physical address is adopted.

#### 2.1 Top-Level Address Map

```text
[50:47] sip_id        (4)   -- 16 SIPs
[46:42] die_id        (5)   -- 32 dies per SIP
[41: 0] local_offset  (42)  -- 4 TB per die
```

```text
 50     47 46      42 41                      0
+---------+----------+-------------------------+
| sip_id  | die_id   |      local_offset       |
+---------+----------+-------------------------+
```

#### 2.2 die_id Allocation

| die_id | Meaning |
|--------|---------|
| 0..15 | AHBM dies |
| 16..20 | IOCHIPLET dies |
| 21..31 | Reserved |

#### 2.3 AHBM Die Layout

Only lower 256 GB of the 4 TB die-local window is assigned.

```text
[41:38] MBZ          (4)
[37]    addr_space   (1)   -- 0 = local resource, 1 = HBM memory
[36: 0] sub-address  (37)
```

| addr_space | Meaning |
|------------|---------|
| 0 | Local resource |
| 1 | HBM memory |

##### 2.3.1 HBM Window (addr_space = 1)

```text
[36:0] hbm_offset  (37)  -- 128 GB decode window
```

The architectural decode window is fixed at 128 GB. Implemented capacity may be smaller depending on SKU/topology (see D4).

##### 2.3.2 Resource Window (addr_space = 0)

```text
[36:34] resource_kind  (3)
[33: 0] kind_local     (34)  -- 16 GB per kind
```

| resource_kind | Meaning |
|---------------|---------|
| 000 | PE_LOCAL |
| 001 | MCPU_LOCAL |
| 010 | CUBE_SRAM |
| 011..111 | Reserved |

Each kind gets a 16 GB decode region.

##### 2.3.3 PE_LOCAL (resource_kind = 000)

```text
[33]    MBZ          (1)
[32:29] pe_id        (4)   -- 0..15
[28:25] pe_sub_unit  (4)
[24: 0] sub_offset   (25)  -- 32 MB per slot
```

16 PEs x 16 sub-unit slots x 32 MB = 8 GB active decode.

| pe_sub_unit | Name | Budget |
|-------------|------|--------|
| 0 | PE_CPU_DTCM | 8 KB |
| 1 | MATH_ENGINE_DTCM | 8 KB |
| 2 | IPCQ | 256 KB |
| 3 | PE_CPU_SFR | 16 KB |
| 4 | MATH_ENGINE_SFR | 16 KB |
| 5 | DMA_ENGINE_SFR | 192 KB |
| 6 | PE_TCM | 2 MB |
| 7..15 | Reserved | -- |

##### 2.3.4 MCPU_LOCAL (resource_kind = 001)

```text
[33:30] MBZ            (4)
[29:25] mcpu_sub_unit  (5)
[24: 0] sub_offset     (25)  -- 32 MB per slot
```

1 GB active decode.


| mcpu_sub_unit | Name | Budget |
|---------------|------|--------|
| 0 | MCPU_ITCM | 512 KB |
| 1 | MCPU_DTCM | 512 KB |
| 2 | IPCQ | 256 KB |
| 3 | MCPU_SFR | 8 KB |
| 4 | MCPU_DMA_SFR | 16 KB |
| 5 | MCPU_SRAM | 10 MB |
| 6..31 | Reserved | -- |

##### 2.3.5 CUBE_SRAM (resource_kind = 010)

```text
[33:25] MBZ          (9)
[24: 0] sram_offset  (25)  -- flat 32 MB
```

#### 2.4 IOCHIPLET Die Layout

Only lower 1 TB of the 4 TB die-local window is assigned.

```text
[41:40] MBZ             (2)
[39: 0] chiplet_offset  (40)  -- 1 TB
```

Region split by address range:

| Range | Meaning | Decode condition |
|-------|---------|------------------|
| [0, 2 GB) | IOCPU resource | chiplet_offset < 0x8000_0000 |
| [2 GB, 1 TB) | UAL | chiplet_offset >= 0x8000_0000 |

##### 2.4.1 IOCPU Region

```text
[30:27] iocpu_sub_unit  (4)
[26: 0] sub_offset      (27)  -- 128 MB per slot
```

16 x 128 MB slots. 2 GB active decode.

| iocpu_sub_unit | Name | Budget |
|----------------|------|--------|
| 0 | IOCPU_ITCM | 512 KB |
| 1 | IOCPU_DTCM | 512 KB |
| 2 | IPCQ | 2 MB |
| 3 | IOCPU_SFR | 8 KB |
| 4 | IO_DMA_SFR | 16 KB |
| 5 | IO_SRAM | 64 MB |
| 6..15 | Reserved | -- |

##### 2.4.2 UAL Region

Sub-layout TBD (separate ADR).

#### 2.5 Addressing Rules

1. MBZ bits must be zero. An address with non-zero MBZ bits is **architecturally invalid**. Implementation may raise a decode fault or return an error -- behavior is not prescribed by this ADR.
2. Fixed slot sizes are chosen for simple hardware decode; actual implemented capacity may be smaller than the slot.
3. Access beyond a sub-unit's implemented budget within a slot is **architecturally invalid** (same policy as MBZ).

### D3. Bitfield decoding is deterministic

Given an integer address, field extraction (`sip_id`, `die_id`, `kind`, `sub_unit`, `offset`) is purely positional. No runtime state is required.

Decoding deterministically maps an integer address to destination domains: `sip_id`, `die_id`, target kind (HBM / PE_LOCAL / MCPU_LOCAL / CUBE_SRAM / IOCPU / UAL).

### D4. Capacity validation may depend on topology config

Whether a decoded address falls within **implemented capacity** (e.g., HBM 96 GB on a specific SKU) is checked against topology parameters provided via DI/config. Decode itself (D3) never consults topology -- only validation does.

These parameters must live in the topology/config layer, not in node implementations.

### D5. Routing consumes decoded domains, not raw bits

Routing policy uses decoded domains:

- `src` location (sip / die / pe or node_id)
- `dst` domains derived from PhysAddr decoding
- `size_bytes` for size-aware link latency

Routing must not inspect raw bit-fields directly except inside the decoding module.

## Alternatives Considered

1. **Keep `rack_id` (4 bits)**: Rejected -- never used in practice, consumes 4 bits that enable die-local expansion to 42 bits (IOCHIPLET 1 TB).
2. **Uniform 256 GB per die**: Rejected -- IOCHIPLET UAL requires ~1 TB. Freed rack_id bits enable 42-bit local_offset.
3. **Variable-width die windows (AHBM 256 GB, CHIPLET 1 TB via multi-seg spanning)**: Rejected -- complicates D3 (deterministic decoding). Uniform 4 TB window with MBZ padding is simpler.
4. **Use raw integers everywhere, decode ad-hoc in routing**: Rejected -- leads to duplicated logic, inconsistent routing, and hidden assumptions.
5. **Hardcode topology sizes (SIP/CUBE/PE counts) into decoding**: Rejected -- violates SPEC R3 and breaks swappability.
6. **Put decoding inside memory controllers or routers**: Rejected -- leaks policy into components, violates SPEC R4 / D5.

## Consequences

### Positive

- Simple hierarchical decoder: SIP -> die -> kind -> sub-unit.
- Clean separation of memory (HBM) vs local resource (PE/MCPU/SRAM/IOCPU).
- Deterministic routing domains enable clear test invariants (SPEC R1, R5).
- Expandable: 11 reserved die_id slots, reserved resource_kind / sub-unit slots, reserved MBZ bits.
- DI-first: decoder can be swapped without changing components (SPEC R4).

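
The hierarchical decode described in D2 and D3 (SIP -> die -> kind -> sub-unit) can be sketched in Python roughly as follows. This is a minimal, non-normative illustration: the `Decoded` type, the `bits` helper, and all function names are hypothetical and are not the ADR's module API. Raising `ValueError` for MBZ violations and reserved encodings is only one possible choice, since rule 2.5.1 leaves fault behavior open. Capacity and sub-unit budget checks (D4) are deliberately left out, and the UAL offset is returned as the raw `chiplet_offset` because its sub-layout is TBD.

```python
from dataclasses import dataclass
from typing import Optional


def bits(value: int, hi: int, lo: int) -> int:
    """Extract the inclusive bit range [hi:lo] from value."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


@dataclass(frozen=True)
class Decoded:
    """Hypothetical decode result; field names are illustrative only."""
    sip_id: int
    die_id: int
    kind: str
    pe_id: Optional[int]      # set only for PE_LOCAL
    sub_unit: Optional[int]   # set for PE_LOCAL / MCPU_LOCAL / IOCPU
    offset: int


def _require_zero(local: int, hi: int, lo: int) -> None:
    # Assumption: MBZ violations raise; the ADR does not prescribe this.
    if bits(local, hi, lo):
        raise ValueError(f"MBZ bits [{hi}:{lo}] are non-zero")


def _decode_ahbm(sip: int, die: int, local: int) -> Decoded:
    _require_zero(local, 41, 38)
    if bits(local, 37, 37):  # addr_space = 1 -> HBM window
        return Decoded(sip, die, "HBM", None, None, bits(local, 36, 0))
    kind = bits(local, 36, 34)
    if kind == 0b000:
        _require_zero(local, 33, 33)
        return Decoded(sip, die, "PE_LOCAL", bits(local, 32, 29),
                       bits(local, 28, 25), bits(local, 24, 0))
    if kind == 0b001:
        _require_zero(local, 33, 30)
        return Decoded(sip, die, "MCPU_LOCAL", None,
                       bits(local, 29, 25), bits(local, 24, 0))
    if kind == 0b010:
        _require_zero(local, 33, 25)
        return Decoded(sip, die, "CUBE_SRAM", None, None, bits(local, 24, 0))
    raise ValueError(f"reserved resource_kind {kind:#05b}")


def _decode_iochiplet(sip: int, die: int, local: int) -> Decoded:
    _require_zero(local, 41, 40)
    chiplet_offset = bits(local, 39, 0)
    if chiplet_offset < 0x8000_0000:  # [0, 2 GB) -> IOCPU region
        return Decoded(sip, die, "IOCPU", None,
                       bits(chiplet_offset, 30, 27), bits(chiplet_offset, 26, 0))
    return Decoded(sip, die, "UAL", None, None, chiplet_offset)


def decode(addr: int) -> Decoded:
    """Purely positional decode of a 51-bit address; no topology lookups."""
    if not 0 <= addr < (1 << 51):
        raise ValueError("address does not fit in 51 bits")
    sip, die, local = bits(addr, 50, 47), bits(addr, 46, 42), bits(addr, 41, 0)
    if die <= 15:
        return _decode_ahbm(sip, die, local)
    if die <= 20:
        return _decode_iochiplet(sip, die, local)
    raise ValueError(f"reserved die_id {die}")
```

Because `decode` depends only on the integer it is given, it can be unit-tested without any topology fixture; a separate validator would then compare the returned `offset` against configured capacities.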
### Tradeoffs

- Sparse address holes due to power-of-2 slot alignment.
- Large reserved/MBZ regions (intentional for future extension).
- Requires explicit configuration for topology-derived sizes (D4).
- Introduces a single "blessed" decoding module that must remain stable and well-tested.

## Supersedes

- **ADR-0031 (PhysAddr PE-Resource Extension)**: stub status. The PE_LOCAL / MCPU_LOCAL / CUBE_SRAM sub-unit tables in D2.3.3-D2.3.5 fulfill ADR-0031's stated goals.

## Implementation Notes (Non-normative)

- Recommended module: `src/kernbench/policy/address/phyaddr.py`
- Tests should cover: encode/decode round-trip per kind, MBZ enforcement, die_id dispatch (AHBM / IOCHIPLET / reserved), sub-unit boundary values, backward compatibility of factory APIs.
- Factory methods: `hbm_addr`, `pe_hbm_addr`, `pe_tcm_addr`, `cube_sram_addr` retain signatures (minus `rack_id`); `cube_id` parameter renamed to `die_id`.
- New factories: `pe_resource_addr`, `mcpu_resource_addr`, `iocpu_resource_addr`, `ual_addr`.

## Appendix A. Address Examples

### A.1 AHBM HBM access

sip=2, die=5, HBM offset=0x1000

```text
sip_id     = 2       -> [50:47] = 0b0010
die_id     = 5       -> [46:42] = 0b00101
addr_space = 1       -> [37]    = 1 (HBM)
hbm_offset = 0x1000  -> [36:0]

51-bit addr = (2 << 47) | (5 << 42) | (1 << 37) | 0x1000
```

### A.2 AHBM PE_LOCAL -- PE3 PE_TCM, offset=0x400

```text
sip_id        = 0      -> [50:47] = 0
die_id        = 0      -> [46:42] = 0
addr_space    = 0      -> [37]    = 0
resource_kind = 0      -> [36:34] = 000 (PE_LOCAL)
pe_id         = 3      -> [32:29] = 0011
pe_sub_unit   = 6      -> [28:25] = 0110 (PE_TCM)
sub_offset    = 0x400  -> [24:0]

local_offset = (0 << 34) | (3 << 29) | (6 << 25) | 0x400
```

### A.3 AHBM MCPU_LOCAL -- MCPU_SRAM, offset=0x0

```text
sip_id        = 1  -> [50:47] = 0001
die_id        = 3  -> [46:42] = 00011
addr_space    = 0  -> [37]    = 0
resource_kind = 1  -> [36:34] = 001 (MCPU_LOCAL)
mcpu_sub_unit = 5  -> [29:25] = 00101 (MCPU_SRAM)
sub_offset    = 0  -> [24:0]  = 0

local_offset = (1 << 34) | (5 << 25)
```

### A.4 IOCHIPLET -- IOCPU IPCQ, offset=0x20000

```text
sip_id         = 1        -> [50:47] = 0001
die_id         = 17       -> [46:42] = 10001 (IOCHIPLET[1])
iocpu_sub_unit = 2        -> [30:27] = 0010 (IPCQ)
sub_offset     = 0x20000  -> [26:0]

chiplet_offset = (2 << 27) | 0x20000  (< 0x8000_0000 -> IOCPU region)
```

### A.5 IOCHIPLET -- UAL region, offset=4 GB

```text
sip_id         = 0   -> [50:47] = 0
die_id         = 16  -> [46:42] = 10000 (IOCHIPLET[0])
chiplet_offset = 0x1_0000_0000  (4 GB >= 2 GB -> UAL region)
```

## Links

- SPEC.md: R1 (routing), R3 (configurable topology), R4 (DI-first), R5 (multi-domain comm)
- ADR-0031: Superseded
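
## Appendix B. Encoding Sketch (Non-normative)

As a cross-check of the Appendix A examples, the field packing can be expressed as a small Python sketch that assembles each example into a full 51-bit integer. The helper names (`field`, `phys`, `hbm_local`, `pe_local`, `mcpu_local`, `iocpu_local`) are hypothetical and are not the factory APIs listed in the Implementation Notes; they only mirror the bit positions given in D2, and the range checks are an assumption about how an encoder might reject out-of-range fields.

```python
def field(value: int, width: int, shift: int) -> int:
    """Range-check value against a field width, then shift it into place."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{value:#x} does not fit in {width} bits")
    return value << shift


def phys(sip_id: int, die_id: int, local_offset: int) -> int:
    """Top-level map (2.1): sip_id[50:47] | die_id[46:42] | local_offset[41:0]."""
    return field(sip_id, 4, 47) | field(die_id, 5, 42) | field(local_offset, 42, 0)


def hbm_local(hbm_offset: int) -> int:
    """AHBM HBM window (2.3.1): addr_space = 1 at bit 37, 37-bit offset."""
    return field(1, 1, 37) | field(hbm_offset, 37, 0)


def pe_local(pe_id: int, sub_unit: int, sub_offset: int) -> int:
    """PE_LOCAL (2.3.3): resource_kind = 000, pe_id[32:29], pe_sub_unit[28:25]."""
    return (field(0, 3, 34) | field(pe_id, 4, 29)
            | field(sub_unit, 4, 25) | field(sub_offset, 25, 0))


def mcpu_local(sub_unit: int, sub_offset: int) -> int:
    """MCPU_LOCAL (2.3.4): resource_kind = 001, mcpu_sub_unit[29:25]."""
    return field(1, 3, 34) | field(sub_unit, 5, 25) | field(sub_offset, 25, 0)


def iocpu_local(sub_unit: int, sub_offset: int) -> int:
    """IOCPU region (2.4.1): iocpu_sub_unit[30:27], sub_offset[26:0]."""
    return field(sub_unit, 4, 27) | field(sub_offset, 27, 0)


examples = {
    "A.1": phys(2, 5, hbm_local(0x1000)),         # 0x1_1420_0000_1000
    "A.2": phys(0, 0, pe_local(3, 6, 0x400)),     # 0x6C00_0400
    "A.3": phys(1, 3, mcpu_local(5, 0)),          # 0x8C04_0A00_0000
    "A.4": phys(1, 17, iocpu_local(2, 0x20000)),  # 0xC400_1002_0000
    "A.5": phys(0, 16, 0x1_0000_0000),            # 0x4001_0000_0000
}

for name, addr in examples.items():
    # 13 hex digits are enough to show all 51 bits.
    print(f"{name}: {addr:#015x}")
```

Keeping every shift and width in one helper makes an out-of-range field fail loudly at encode time, for instance `phys(16, 0, 0)` raises because `sip_id` is only 4 bits wide.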