commit - release 1
# ADR-0001: PhysAddr Layout & Address Decoding Contract

## Status

Accepted

## Date

2026-02-27

## Context

The KernBench Graph Latency Simulator must route requests deterministically and compute end-to-end latency strictly by graph traversal.
To model local vs remote traffic (same/different SIP, same/different CUBE, optional PE-group), requests need a stable, parsable address/location scheme that:

- can be decoded into routing domains (SIP/CUBE/HBM/PE-resource, etc.)
- remains topology-agnostic (no hardcoded counts)
- supports swappable policy and DI-first components without leaking topology assumptions into node implementations

## Decision

We define a **PhysAddr value object** and an **address decoding contract** that converts an integer address into routing domains.

### D1. PhysAddr is an immutable value object

- PhysAddr is immutable and comparable as a pure value.
- Any allocator returns a **fully specified PhysAddr** (not partial metadata).
- No global state may be required to interpret a PhysAddr.
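
One way to satisfy these three points in Python is a frozen dataclass. The sketch below is illustrative only: the class body, the choice of `None` for an unused `rack_id`, and the example values are assumptions, not the project's actual implementation. Field names are taken from D2.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Optional


@dataclass(frozen=True)
class PhysAddr:
    """Hypothetical PhysAddr value object; field names follow D2."""
    sip_id: int
    sip_seg: int
    local_offset: int
    rack_id: Optional[int] = None  # assumption: None when scale-out is unused


# Two addresses built from the same fields are equal and hash identically,
# so they behave as pure values (usable as dict keys or set members).
a = PhysAddr(sip_id=1, sip_seg=2, local_offset=0x40)
b = PhysAddr(sip_id=1, sip_seg=2, local_offset=0x40)
same_value = (a == b) and (hash(a) == hash(b))

# Attempting to mutate a frozen instance raises FrozenInstanceError.
try:
    a.sip_id = 3  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because every field is stored on the instance itself, nothing outside the object is needed to interpret it, which matches the "no global state" requirement.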

### D2. PhysAddr fields (logical contract)

PhysAddr must be able to represent at least:

- `rack_id` (optional but reserved for scale-out)
- `sip_id` (device / SIP domain)
- `sip_seg` (SIP-level segment/window selection, e.g., cube window)
- `local_offset` (offset within the chosen segment/window)

Decoded/derived fields may include (optional):

- `cube_id`
- `kind` (e.g., HBM vs PE-resource vs raw)
- `unit_type` / `pe_id` (if PE-level addressing is modeled)

**Important:** The exact bit allocation may evolve, but the *semantic fields above* must remain decodable without hidden assumptions.
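
To illustrate how the bit allocation can change while the semantic fields stay decodable, here is a minimal sketch that treats the layout as an explicit parameter. The `AddrLayout` class, the `pack`/`unpack` helpers, and all bit widths are hypothetical; the ADR does not fix any of them.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AddrLayout:
    """Bit width of each required field (hypothetical; widths are examples)."""
    rack_bits: int
    sip_bits: int
    seg_bits: int
    offset_bits: int

    def fields(self) -> List[Tuple[str, int]]:
        # Most significant field first.
        return [("rack_id", self.rack_bits), ("sip_id", self.sip_bits),
                ("sip_seg", self.seg_bits), ("local_offset", self.offset_bits)]


def pack(layout: AddrLayout, **values: int) -> int:
    """Combine the named fields into one integer address."""
    addr = 0
    for name, width in layout.fields():
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        addr = (addr << width) | value
    return addr


def unpack(layout: AddrLayout, addr: int) -> Dict[str, int]:
    """Recover the named fields, peeling them off from the low bits upward."""
    out: Dict[str, int] = {}
    for name, width in reversed(layout.fields()):
        out[name] = addr & ((1 << width) - 1)
        addr >>= width
    return out


fields = dict(rack_id=1, sip_id=2, sip_seg=3, local_offset=0x1000)
narrow = AddrLayout(rack_bits=4, sip_bits=6, seg_bits=6, offset_bits=32)
wide = AddrLayout(rack_bits=8, sip_bits=8, seg_bits=8, offset_bits=40)

# The integers differ between layouts, but the semantic fields round-trip.
assert pack(narrow, **fields) != pack(wide, **fields)
assert unpack(narrow, pack(narrow, **fields)) == fields
assert unpack(wide, pack(wide, **fields)) == fields
```

Passing the layout explicitly is what keeps the decoding free of hidden assumptions: a caller cannot decode without naming the layout it expects.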

### D3. Decoding is deterministic and policy-compatible

- Decoding must deterministically map an integer address to:
  - destination SIP domain (`sip_id`)
  - destination sub-domain (`cube_id` if applicable)
  - destination target kind (HBM/PE-resource/other)
- Decoding must not depend on runtime topology sizes. It may depend on **explicit topology parameters** provided through configuration (e.g., segment size, slice size); those parameters must live in the topology/config layer, not scattered across individual components.
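
The following sketch shows one possible shape for such a decoder: a pure function of the address and an explicit parameter object. The parameter names (`sip_span_bytes`, `cube_window_bytes`, `hbm_slice_bytes`), the flat span-based address scheme, and the rule that the leading part of each window is HBM are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    HBM = "hbm"
    PE_RESOURCE = "pe_resource"


@dataclass(frozen=True)
class DecodeParams:
    """Explicit parameters supplied by configuration (names are hypothetical)."""
    sip_span_bytes: int      # address span owned by one SIP
    cube_window_bytes: int   # span of one cube window inside a SIP
    hbm_slice_bytes: int     # leading part of each window treated as HBM


@dataclass(frozen=True)
class Decoded:
    sip_id: int
    cube_id: int
    kind: Kind
    local_offset: int


def decode(addr: int, p: DecodeParams) -> Decoded:
    """Pure function: the result depends only on `addr` and `p`."""
    if addr < 0:
        raise ValueError("address must be non-negative")
    sip_id, in_sip = divmod(addr, p.sip_span_bytes)
    cube_id, in_window = divmod(in_sip, p.cube_window_bytes)
    kind = Kind.HBM if in_window < p.hbm_slice_bytes else Kind.PE_RESOURCE
    return Decoded(sip_id, cube_id, kind, in_window)


params = DecodeParams(sip_span_bytes=1 << 20,
                      cube_window_bytes=1 << 18,
                      hbm_slice_bytes=1 << 17)
addr = 2 * (1 << 20) + 3 * (1 << 18) + 0x100
d = decode(addr, params)
assert (d.sip_id, d.cube_id, d.kind, d.local_offset) == (2, 3, Kind.HBM, 0x100)
# Same input and parameters always give the same result.
assert decode(addr, params) == d
```

Note that the function never asks how many SIPs or cubes exist at runtime; it only uses the sizes it was handed.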

### D4. Topology-derived constants live in the topology layer

Constants such as segment sizes (e.g., HBM slice size / window size) are derived from topology configuration (YAML/JSON/dict) and are provided to the decoder via DI/config.
They must not be hardcoded in node implementations.
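
As a rough sketch of this flow, the example below reads a window size from a JSON topology document, builds a decoder from it, and injects that decoder into a node through its constructor. The key path `address.cube_window_bytes`, the `DecoderConfig` and `ExampleNode` classes, and the value 4096 are hypothetical and chosen only to make the example runnable.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class DecoderConfig:
    cube_window_bytes: int

    @classmethod
    def from_topology(cls, topo: Mapping[str, Any]) -> "DecoderConfig":
        # Hypothetical key path; a real topology schema may differ.
        return cls(cube_window_bytes=int(topo["address"]["cube_window_bytes"]))


def make_cube_decoder(cfg: DecoderConfig) -> Callable[[int], int]:
    """Build a decoder closure from config; the size lives only in `cfg`."""
    def cube_of(offset: int) -> int:
        return offset // cfg.cube_window_bytes
    return cube_of


class ExampleNode:
    """A node that receives its decoder via the constructor (DI).

    It contains no size constants of its own.
    """
    def __init__(self, cube_of: Callable[[int], int]) -> None:
        self._cube_of = cube_of

    def target_cube(self, offset: int) -> int:
        return self._cube_of(offset)


topology = json.loads('{"address": {"cube_window_bytes": 4096}}')
node = ExampleNode(make_cube_decoder(DecoderConfig.from_topology(topology)))
assert node.target_cube(8192 + 5) == 2
```

Swapping in a topology with a different window size changes the node's behavior without touching the node's code.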

### D5. Routing consumes decoded domains, not raw bits

Routing policy uses decoded domains:

- `src` location (sip/cube/pe or node_id)
- `dst` domains derived from PhysAddr decoding
- `size_bytes` for size-aware link latency

Routing must not inspect raw bit-fields directly except inside the decoding module.
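
A minimal sketch of a routing-side classifier that works purely on decoded domains is shown below. The `Location`, `Request`, and `Locality` types and the three locality classes are assumptions for illustration; the ADR only requires that routing see decoded fields rather than raw bits.

```python
from dataclasses import dataclass
from enum import Enum


class Locality(Enum):
    SAME_CUBE = "same_cube"
    SAME_SIP = "same_sip"      # same SIP, different cube
    REMOTE_SIP = "remote_sip"  # different SIP


@dataclass(frozen=True)
class Location:
    sip_id: int
    cube_id: int


@dataclass(frozen=True)
class Request:
    src: Location
    dst: Location    # assumed to be produced by the decoding module
    size_bytes: int  # carried along for size-aware link latency


def classify(req: Request) -> Locality:
    """Compare decoded domains only; no masks or shifts appear here."""
    if req.src.sip_id != req.dst.sip_id:
        return Locality.REMOTE_SIP
    if req.src.cube_id != req.dst.cube_id:
        return Locality.SAME_SIP
    return Locality.SAME_CUBE


local = Request(Location(0, 1), Location(0, 1), size_bytes=64)
cross_cube = Request(Location(0, 1), Location(0, 2), size_bytes=64)
cross_sip = Request(Location(0, 1), Location(3, 1), size_bytes=64)

assert classify(local) is Locality.SAME_CUBE
assert classify(cross_cube) is Locality.SAME_SIP
assert classify(cross_sip) is Locality.REMOTE_SIP
```

Since the classifier never touches an integer address, a change in bit allocation would be confined to the decoding module.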

## Alternatives Considered

1) **Use raw integers everywhere, decode ad-hoc in routing**

   - Rejected: leads to duplicated logic, inconsistent routing, and hidden assumptions embedded in multiple components.

2) **Hardcode topology sizes (SIP/CUBE/PE counts) into decoding**

   - Rejected: violates SPEC (R3) and breaks swappability and configuration-driven topologies.

3) **Put decoding inside memory controllers or routers**

   - Rejected: leaks policy into components and undermines DI-first, swappable implementations (SPEC R4).

## Consequences

### Positive

- Deterministic routing domains enable clear test invariants for local vs remote paths (SPEC R1, R5).
- Keeps topology variability (SPEC R3) while preserving consistent semantics.
- DI-first: decoder can be swapped or extended without changing components or tests (SPEC R4).

### Tradeoffs / Costs

- Requires explicit configuration for any topology-derived sizes.
- Introduces a single “blessed” decoding module that must remain stable and well-tested.

## Implementation Notes (Non-normative)

- Recommended module boundary:
  - `src/kernbench/policy/address/phyaddr.py`

- Tests should cover:
  - deterministic decoding
  - local vs remote classification from decoded fields
  - invariants: “allocator returns full PhysAddr”, “decoding requires no global state”

## Links

- SPEC.md: R1 (routing), R3 (configurable topology), R4 (DI-first), R5 (multi-domain comm)