kernbench2/docs
ywkang 161132cdcb ADR-0021: PE pipeline refactor — component separation + token self-routing
Design for refactoring pe_accel monolith into independent builtin components:
- D1: 6 independent components (scheduler, DMA, fetch_store, GEMM, MATH, TCM)
- D2: Token self-routing — scheduler only dispatches + tracks completion
- D3: done signal = simpy.Event (HW wire), data = message (queue)
- D4: Async pipeline with single FIFO feeder, command-level ordering
- D5: PE_FETCH_STORE separates TCM↔register from compute
- D6: Compute components implement _process() only; chaining is handled in the base class
- D7: Topology adds pe_fetch_store + chaining edges
- D8: Existing builtin/pe_accel moves to builtin_legacy as a backup; the new components go in a new builtin
- D9: TileToken with plan + stage_idx for self-routing
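
A minimal sketch of how D2, D6, and D9 could fit together, assuming a token that carries its own route. Everything here is illustrative rather than taken from the ADR: the field names other than `plan` and `stage_idx`, the `Component`/`Recorder` classes, the `registry` dict standing in for topology edges, and the `on_done` callback standing in for the `simpy.Event` done signal (a plain callback is used to keep the sketch dependency-free).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TileToken:
    """Hypothetical token: carries its route (plan) and a cursor (stage_idx)."""
    tile_id: int
    plan: tuple                      # ordered component names
    stage_idx: int = 0
    trace: list = field(default_factory=list)


class Component:
    """Base class owns chaining; subclasses implement _process() only."""

    def __init__(self, name, registry, on_done):
        self.name = name
        self.registry = registry     # name -> Component (assumed stand-in for edges)
        self.on_done = on_done       # assumed stand-in for the done signal
        self.inbox = deque()         # assumed stand-in for the message queue
        registry[name] = self

    def _process(self, token):
        raise NotImplementedError

    def step(self):
        """Handle one queued token, then forward it or report completion."""
        token = self.inbox.popleft()
        self._process(token)
        token.stage_idx += 1
        if token.stage_idx < len(token.plan):
            # Self-routing: the token's plan names the next hop.
            self.registry[token.plan[token.stage_idx]].inbox.append(token)
        else:
            self.on_done(token)


class Recorder(Component):
    """Toy compute component that only records that it ran."""

    def _process(self, token):
        token.trace.append(self.name)


def run(tokens, names=("dma", "fetch_store", "gemm", "math")):
    """Dispatch tokens to their first stage and drain all inboxes."""
    registry, done = {}, []
    for name in names:
        Recorder(name, registry, done.append)
    for token in tokens:
        registry[token.plan[0]].inbox.append(token)
    while any(c.inbox for c in registry.values()):
        for component in registry.values():
            if component.inbox:
                component.step()
    return done
```

In this sketch the dispatcher only places each token at the first stage in its plan and collects completions; it never decides the next hop, which is the division of labor D2 describes.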

Key decisions from review:
- No PipelineManager object — scheduler + existing ports sufficient
- PipelineContext with exactly-once completion contract
- _feed_loop singleton per scheduler, FIFO command ordering
- Intra-PE chaining: no explicit latency model
- Latency models ported from the current pe_accel implementation
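
The exactly-once completion contract and the single FIFO feeder can be sketched as follows. This is an assumption-laden illustration, not the ADR's implementation: the `Scheduler` class, `submit`, `tile_done`, the tile-count bookkeeping, and the re-entrancy guard are all hypothetical names and mechanisms. The sketch also lets completions arrive in any order, which the notes above do not specify.

```python
from collections import deque


class PipelineContext:
    """Hypothetical per-command context that signals completion exactly once."""

    def __init__(self, command_id, num_tiles, on_complete):
        self.command_id = command_id
        self.remaining = num_tiles
        self._on_complete = on_complete
        self._completed = False

    def tile_done(self):
        # A report after completion would violate the contract, so fail loudly.
        if self._completed:
            raise RuntimeError("completion already signalled")
        self.remaining -= 1
        if self.remaining == 0:
            self._completed = True
            self._on_complete(self.command_id)


class Scheduler:
    """Hypothetical scheduler with one feeder draining commands in FIFO order."""

    def __init__(self):
        self.pending = deque()       # (command_id, num_tiles) in submission order
        self.issued = []             # (command_id, tile_idx) in issue order
        self.completed = []          # command ids in completion order
        self._feeding = False        # guard: at most one feed loop at a time

    def submit(self, command_id, num_tiles):
        self.pending.append((command_id, num_tiles))

    def feed_loop(self):
        if self._feeding:
            raise RuntimeError("feed loop already running")
        self._feeding = True
        try:
            contexts = []
            while self.pending:
                command_id, num_tiles = self.pending.popleft()
                ctx = PipelineContext(command_id, num_tiles,
                                      self.completed.append)
                # All tiles of one command are issued before the next command.
                for tile_idx in range(num_tiles):
                    self.issued.append((command_id, tile_idx))
                contexts.append(ctx)
            return contexts
        finally:
            self._feeding = False
```

Raising on a duplicate report is one way to enforce the contract; silently ignoring duplicates would be another, at the cost of hiding double-completion bugs.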

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 23:21:40 -07:00