# Claude Code Instructions (Repo)

This repository uses Claude Code with strict architectural and verification rules.
SPEC.md and ADRs are the source of truth.

---

# Part 1 — General Behavior

> Reusable across repos. Describes *how* Claude Code interacts with the user
> and constructs changes, independent of this project's domain.

## Design Questions

- Design / architecture questions are ALWAYS allowed.
- Design questions MUST NOT modify:
  - production code
  - test code
  - SPEC.md
  - ADRs
- If a design question implies a change, default to Phase 1.

## Surfacing Choices

Applies to both design discussions and Phase 1 proposals.

- If multiple valid interpretations of the request exist, present them.
  Do NOT pick one silently.
- If a simpler approach exists, say so. Push back when warranted — do NOT
  just implement the more complex path the user proposed.
- State required assumptions explicitly. If uncertain, ask before assuming.

## Change & Test Protocol (Mandatory)

All non-trivial changes MUST follow a two-phase process.
Design discussion is always allowed.
Production code changes require Phase 1 approval before Phase 2 applies them.

### Phase 1 — Proposal + Verification (No Production Code Changes)

#### Purpose

- Decide *what* to change and *how it will be validated*
- Establish verification coverage BEFORE touching production code

#### Phase 1 MUST include

1) **Design Proposal**
   - Explain the design change.
   - Explain why the change is needed.
   - Explain consistency with SPEC.md and relevant ADRs.

2) **Verification Plan**
   - SPEC requirement(s) / ADR(s) affected.
   - Tests that validate the change:
     - existing tests to run, and/or
     - new tests to add.
   - Concrete input cases used by the tests.
   - Expected observable assertions.
   - Expected changes (or no changes) in generated artifacts, if applicable.

   (Project-specific expectations for what these inputs/assertions look like:
   see Part 2 → *Verification Plan — Project Expectations*.)


If the Verification Plan is missing or vague, STOP.

#### Allowed in Phase 1

- Creating or modifying **test code only**
- Running tests and reporting results

#### Forbidden in Phase 1

- Any production code changes
- Any SPEC.md or ADR modifications
- Final, ready-to-apply unified diffs (Phase 2 only)

#### Permitted for design discussion

- Pseudocode, interface sketches, type signatures
- Small illustrative snippets to clarify a design point
- "Before / after" excerpts (not full diffs)

#### Phase 1 Output

- Proposal + Verification Plan
- Tests added/modified (if any)
- Test execution results (PASS / FAIL)
- Clear recommendation:
  - "No Phase 2 needed" OR
  - "Await approval for Phase 2"

### Phase 2 — Apply + Verify + Rollback

#### Trigger

Phase 2 is triggered ONLY by the exact user approval phrase:

**"ok"**

#### Phase 2 Rules

- Keep changes minimal and scoped to the approved Phase 1 proposal.
- Modify only production files declared in Phase 1.
- Avoid unrelated edits, cleanup, or formatting churn.
- Automatically apply approved changes to the working tree.

#### Mandatory Verification

- Run the tests defined in the Phase 1 Verification Plan

#### Success Path

If ALL tests PASS:

- Keep the applied changes
- Ensure generated artifacts (if affected) are consistent
- Report success concisely

#### Failure Path (Mandatory)

If ANY test FAILS:

- Immediately roll back ALL Phase 2 changes
- Do NOT keep partial changes
- Report:
  - failing test names
  - error messages / assertions
  - brief hypothesis of the root cause
- Return to Phase 1 state

Tests must NEVER be weakened, removed, or altered to force Phase 2 to pass.

Failing tests may indicate:

- invalid assumptions,
- architectural violations,
- or incomplete modeling.

Do not assume the test is wrong without explicit evidence.

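
The Phase 2 flow above (apply, verify, roll back on any failure) can be sketched as a small all-or-nothing transaction. This is an illustrative sketch only: the in-memory `tree` dict stands in for the working tree, and `is_phase2_approved`, `run_phase2`, and `Phase2Result` are hypothetical names, not part of this repo. Tolerating surrounding whitespace in the approval phrase is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set


@dataclass
class Phase2Result:
    applied: bool
    failures: List[str] = field(default_factory=list)


def is_phase2_approved(message: str) -> bool:
    # Assumption: surrounding whitespace is ignored; the phrase itself
    # must match exactly (case-sensitive).
    return message.strip() == "ok"


def run_phase2(
    tree: Dict[str, str],
    declared_files: Set[str],
    changes: Dict[str, str],
    tests: Dict[str, Callable[[Dict[str, str]], bool]],
) -> Phase2Result:
    """Apply `changes` to `tree`, run `tests`, and roll back on any failure."""
    undeclared = sorted(set(changes) - declared_files)
    if undeclared:
        # Only files declared in Phase 1 may be modified.
        raise ValueError(f"files not declared in Phase 1: {undeclared}")

    snapshot = dict(tree)   # saved state for rollback
    tree.update(changes)    # apply the approved changes

    # Run every test from the Verification Plan; collect failing names.
    failures = [name for name, check in tests.items() if not check(tree)]
    if failures:
        # Failure path: restore everything, keep no partial changes.
        tree.clear()
        tree.update(snapshot)
        return Phase2Result(applied=False, failures=failures)
    return Phase2Result(applied=True)
```

Note that the tests are passed in unchanged and are never edited inside `run_phase2`; the only two outcomes are "all changes kept" or "tree restored to its Phase 1 state".
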
## Allowed Exceptions (Protocol Still Required)

- comments or docstrings
- formatting-only changes
- type annotation changes with no runtime behavior change

In exceptions, Phase 1 MUST explicitly state:

**"No behavior change; tests unchanged."**

## Coding Style

Applies to all production code changes (Phase 2) and test code (Phase 1).
The Phase 1/2 protocol decides *whether* and *what* to change; this section
decides *how* the resulting diff should look.

### Simplicity First

**Minimum code that solves the problem. Nothing speculative.**

- Write the minimum code that satisfies the Phase 1 proposal.
- No abstractions for single-use code.
- No "flexibility"/"configurability" not declared in Phase 1.
- No error handling for impossible scenarios.

Ask yourself: "Would a senior engineer say this is overcomplicated?"
If yes, simplify.

### Surgical Changes

**Touch only what you must. Clean up only your own mess.**

- Touch only files declared in the Phase 1 proposal.
- Don't "improve" adjacent code, comments, or formatting.
- Match existing style in the file, even if you'd do it differently.
- If your changes orphan imports/variables/functions, remove them.
- If you notice pre-existing dead code, do NOT delete it silently.
  Mention it, and present options:
  (a) delete (with approval),
  (b) keep as-is,
  (c) refactor to make it reachable / repurposed.
  Let the user choose before acting.
- Every changed line must trace to the Phase 1 proposal.

## Enforcement Defaults

General fallbacks. Apply to anything not explicitly covered above.

- If unsure whether a change is non-trivial → treat it as non-trivial.
- If unsure whether Phase 2 is allowed → STOP and ask.

---

# Part 2 — Project-Specific (kernbench)

> Specific to this repo's domain (SIP/CUBE/PE topology, runtime API, sim_engine).
> Replace this entire Part when adapting the framework to another repo.
>
> Contains **foundations** (Authority & Scope → Terminology → Mental Model →
> Common Failure Modes) followed by **rules** (Non-Trivial, Verification Plan,
> CLI, Derived Artifacts, runtime API / sim_engine Boundaries).

## Authority & Scope

- SPEC.md defines the architectural contract.
- ADRs (docs/adr/ADR-*.md) define non-trivial architectural decisions.
- If a change conflicts with SPEC.md or an ADR:
  - STOP.
  - Explain the conflict.
  - Propose options (keep spec, update ADR, or narrow scope).
- Do NOT silently change architecture.
- The repository structure reflects architectural intent; Claude Code MUST
  respect existing module boundaries and file locations.

## Terminology

- runtime API:
  Host-facing public API used by benchmarks and user code
  (e.g., tensor deployment, kernel launch).
- simulation engine (sim_engine):
  Discrete-event engine responsible for request injection, scheduling,
  and completion tracking.
- components:
  Device-side nodes modeling hardware behavior
  (IO_CPU, M_CPU, PE_CPU, routers, engines, etc.).

## Mental Model

The simulator is layered along **request flow**:

    runtime API   (host-facing: tensor ops, kernel launch;
                   topology-agnostic, no routing — ADR-0007)
        ↓
    sim_engine    (schedules events, routes requests,
                   tracks completion via correlation IDs)
        ↓
    components    (device-side nodes: IO_CPU, M_CPU, PE_CPU, routers,
                   engines — model HW behavior including interconnect)

Configuration & decisions (orthogonal to request flow):

- **topology** — compiled at config time (ADR-0006); defines which components
  exist and how they connect. Authoritative graph for sim_engine.
- **policy** (routing / address / placement) — consulted by sim_engine during
  request handling.

Invariant: all latency arises from **explicit scheduled events on modeled
components and links** (SPEC §0.1, R8). No implicit waits, no magic delays.

Stay within layer boundaries; do not collapse or bypass for convenience.

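
To make the latency invariant concrete, here is a minimal discrete-event sketch in which every delay comes from an explicitly scheduled event. It is not the repo's sim_engine: `MiniEngine`, `send`, and the per-hop latency values are hypothetical, chosen only to illustrate the idea. Events are ordered by `(time, sequence number)` so that ties resolve deterministically.

```python
import heapq
from typing import Callable, List, Tuple


class MiniEngine:
    """Toy event loop: time advances only by popping scheduled events."""

    def __init__(self) -> None:
        self.now = 0
        self._seq = 0  # insertion counter; breaks ties deterministically
        self._queue: List[Tuple[int, int, Callable[[], None]]] = []

    def schedule(self, delay: int, action: Callable[[], None]) -> None:
        heapq.heappush(self._queue, (self.now + delay, self._seq, action))
        self._seq += 1

    def run(self) -> None:
        while self._queue:
            self.now, _, action = heapq.heappop(self._queue)
            action()


def send(
    engine: MiniEngine,
    hops: List[Tuple[str, int]],
    trace: List[Tuple[int, str]],
) -> None:
    """Walk a request through `hops`; each hop is one scheduled event."""

    def step(i: int) -> None:
        if i == len(hops):
            return
        name, latency = hops[i]

        def arrive() -> None:
            trace.append((engine.now, name))  # record the hop
            step(i + 1)                       # schedule the next hop

        engine.schedule(latency, arrive)

    step(0)


engine = MiniEngine()
hop_trace: List[Tuple[int, str]] = []
# Hypothetical hop latencies in abstract time units.
send(engine, [("IO_CPU", 2), ("router", 1), ("PE_CPU", 3)], hop_trace)
engine.run()
print(hop_trace)  # [(2, 'IO_CPU'), (3, 'router'), (6, 'PE_CPU')]
```

The end-to-end latency (6 units here) is simply the sum of the scheduled hop delays; nothing in the sketch sleeps or adds a constant outside the event queue.
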
## Common Failure Modes

Anti-patterns that violate the Mental Model or Golden Invariants (SPEC §0.1).
If your change does any of these, STOP and reconsider.

- **runtime topology mutation** — topology is compiled at config time; do not
  add/remove nodes or edges during simulation (ADR-0006).
- **nondeterministic iteration order** — never iterate sets, unordered dicts,
  or anything else with implementation-defined order on the critical path.
  Determinism is required (SPEC §0.1).
- **routing policy inside runtime API** — runtime API is topology-agnostic;
  routing/fan-out belongs in policy + sim_engine (ADR-0007).
- **latency modeled outside sim_engine scheduling** — every delay must come
  from an explicit scheduled event on a modeled component or link
  (SPEC §0.1, R8). No magic sleeps, no hardcoded constants smuggled in.
- **hidden cross-layer coupling** — do not skip layer interfaces.
  e.g., runtime API must not call into components directly, bypassing sim_engine.
- **silent ADR/SPEC reinterpretation** — surface conflicts; do not paper over
  them. See *Authority & Scope* above.
- **weakening tests to make Phase 2 pass** — fix the code, not the test.
  See *Part 1 → Phase 2 → Failure Path*.

## What Counts as "Non-Trivial" (Protocol Required)

Any of the following:

- routing policy or ordering changes
- topology builder changes (nodes, links, parameters)
- address decoding / PhysAddr behavior
- latency composition rules
- changes affecting determinism or connectivity
- changes touching two or more production files

## Verification Plan — Project Expectations

Concrete forms that Part 1's *Verification Plan* MUST take in this repo:

- SPEC requirement(s) / ADR(s) affected (e.g., R1/R2/R5, ADR-0002).
- Concrete input cases:
  - topology (SIP / CUBE / PE layout)
  - request parameters (src, dst, size_bytes).
- Expected observable assertions, such as:
  - hop trace contains key waypoints,
  - latency invariants (e.g., > 0, monotonic increase),
  - deterministic route selection.
- **expected changes (or no changes) in generated diagrams**, if applicable.

## CLI Semantics

- `kernbench run --device ` runs the benchmark on a single device.
- Omitting `--device` runs the benchmark on all devices discovered in the topology
  (logically parallel).
- Device enumeration is handled by the CLI only; benchmarks MUST remain single-device.

## Derived Artifacts (Clarification)

- Generated diagrams under `docs/diagrams/` are **derived artifacts**, not production code.
- Creating or updating files in `docs/diagrams/`:
  - does NOT count as a production code change,
  - does NOT require Phase 2 approval,
  - MUST be consistent with SPEC.md and ADRs.

## runtime API / sim_engine Boundaries

- runtime API MUST NOT hardcode topology/routing or internal hop sequences.
- sim_engine MUST remain independent of runtime API semantics
  (no tensor/kernel policy logic).
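
One way to picture the boundary rules above is the sketch below, in which the host-facing layer only describes *what* to move and the engine asks a routing policy *how*. All class and method names (`Request`, `RoutingPolicy`, `TableRoutingPolicy`, `SimEngine`, `RuntimeAPI`, `copy`, `submit`) are hypothetical and do not reflect the repo's actual interfaces; only the `src`, `dst`, and `size_bytes` parameters come from the Verification Plan expectations.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Tuple


@dataclass(frozen=True)
class Request:
    src: str
    dst: str
    size_bytes: int


class RoutingPolicy(Protocol):
    def route(self, src: str, dst: str) -> List[str]: ...


class TableRoutingPolicy:
    """Hypothetical policy: looks up a precomputed hop sequence."""

    def __init__(self, table: Dict[Tuple[str, str], List[str]]) -> None:
        self._table = table

    def route(self, src: str, dst: str) -> List[str]:
        return list(self._table[(src, dst)])


class SimEngine:
    """Owns routing decisions; sees only generic requests, no tensor/kernel logic."""

    def __init__(self, policy: RoutingPolicy) -> None:
        self._policy = policy

    def submit(self, request: Request) -> List[str]:
        return self._policy.route(request.src, request.dst)


class RuntimeAPI:
    """Host-facing layer: names endpoints and size, never hop sequences."""

    def __init__(self, engine: SimEngine) -> None:
        self._engine = engine

    def copy(self, src: str, dst: str, size_bytes: int) -> List[str]:
        return self._engine.submit(Request(src, dst, size_bytes))


# Hypothetical routing table; node names are illustrative.
policy = TableRoutingPolicy({("host", "pe0"): ["IO_CPU", "router0", "PE_CPU"]})
api = RuntimeAPI(SimEngine(policy))
print(api.copy("host", "pe0", 64))  # ['IO_CPU', 'router0', 'PE_CPU']
```

Swapping in a different policy changes the hop trace without touching `RuntimeAPI`, which is the property the two boundary rules are meant to protect.
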