ywkang 168b0c89f0 ADR: translate adr-ko/ to Korean, fix ADR-0013 slug, refine Status check

Follow-up to the bilingual-structure commit: docs/adr-ko/ now holds
only Korean versions (24 files translated from English placeholders),
ADR-0013 slug uses kebab-case in both folders, and the verify tool
allows translated parenthetical commentary in the Status block.

- Translate 24 English files in docs/adr-ko/ to Korean. The previous
  bilingual-structure commit had left these as English copies because
  their source content was already English; this commit fulfills the
  policy that docs/adr-ko/ contains only Korean.
- Rename ADR-0013 in both adr/ and adr-ko/ from
  ver-verification_strategy.md to ver-verification-strategy.md
  (kebab-case consistency with other ADRs).
- CLAUDE.md (ADR Translation Discipline): clarify that only the
  Status lifecycle keyword (Accepted / Proposed / Stub / Draft /
  Superseded by ADR-NNNN / Merged into ADR-NNNN) must match across
  EN and KO; parenthetical commentary and trailing list items may be
  translated.
- tools/verify_adr_lang_pairs.py: replace byte-equal Status check
  with normalize_status_keyword() which strips parenthetical
  commentary and takes only the first non-empty line.
- tests/test_verify_adr_lang_pairs.py: update existing test names,
  add coverage for translated parenthetical, translated trailing
  list, and Superseded-by-NNNN keyword equality.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-20 08:17:56 -07:00

3.1 KiB


ADR-0008: Tensor Deployment and Allocation (Host Allocator, PA-First)

Status

Accepted

Context

The benchmark requires PyTorch-like tensor semantics:

tensor creation (empty, fill),
deployment to an accelerator device (tensor.to()).

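In a realistic system, host software manages allocation and mapping and installs the DMA/MMU mappings. Phase 0 (ADR-0011) simplifies this as follows:

device memory operations use PA only,
VA/MMU/IOMMU are not modeled.

To keep the host↔device interface minimal, a separate AllocateTensorMeta message is avoided. Instead, host allocation produces a PA shard map that MemoryWrite/Read and KernelLaunch use directly.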

Decision

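D1. A Tensor is a host-owned handle with a PA shard mapping

A Tensor object is a host-owned handle that encapsulates:

shape and dtype,
initialization intent,
device placement and allocation metadata in the form of a PA shard map.

After deployment, the Tensor handle must contain:

a list of shards, each carrying (sip, cube, pe, pa, nbytes, offset_bytes).

This PA shard mapping is the single source of truth for kernel argument binding.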
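
The handle described in D1 can be pictured as a small pair of dataclasses. This is a minimal sketch, not the project's actual class: the field names follow the shard tuple above, while `init`, `is_deployed`, and `kernel_args` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class Shard:
    """One contiguous PA range on a single PE (fields as listed in D1)."""
    sip: int
    cube: int
    pe: int
    pa: int
    nbytes: int
    offset_bytes: int


@dataclass
class Tensor:
    """Host-owned handle; `init` and the methods are illustrative assumptions."""
    shape: Tuple[int, ...]
    dtype: str
    init: str = "empty"                      # initialization intent, e.g. "empty" or "fill"
    shards: List[Shard] = field(default_factory=list)

    def is_deployed(self) -> bool:
        # A tensor counts as deployed once it carries at least one shard.
        return bool(self.shards)

    def kernel_args(self) -> List[Tuple[Tuple[int, int, int], int, int]]:
        # Kernel argument binding reads only the shard list (single source of truth).
        return [((s.sip, s.cube, s.pe), s.pa, s.nbytes) for s in self.shards]


t = Tensor(shape=(4, 4), dtype="float32")
t.shards.append(Shard(sip=0, cube=0, pe=1, pa=0x1000, nbytes=64, offset_bytes=0))
```

Keeping `Shard` frozen means a shard cannot be mutated after allocation, which fits the requirement that the mapping stays authoritative once deployment has happened.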

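D2. Deployment uses the host allocator (Phase 0)

In Phase 0, tensor deployment produces the PA shard mapping through the host allocator:

placement (split/replicate/hybrid) is determined by the DP policy,
allocation assigns PA ranges at PE granularity and returns the shard mapping,
the Tensor handle stores the resulting shard list deterministically.

Phase 0 does not require a separate host-visible device allocation RPC.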
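
The flow in D2 can be sketched with a per-PE bump allocator. Everything here is an assumption made for illustration: the `HostAllocator` and `deploy` names, the 64-byte alignment, the even byte-wise split, and the reading of `offset_bytes` as the shard's offset within the logical tensor. The hybrid policy is left out.

```python
from collections import defaultdict
from typing import List, NamedTuple, Sequence, Tuple

PeId = Tuple[int, int, int]  # (sip, cube, pe)


class Shard(NamedTuple):
    sip: int
    cube: int
    pe: int
    pa: int
    nbytes: int
    offset_bytes: int


class HostAllocator:
    """Hypothetical bump allocator: one monotonically growing PA cursor per PE."""

    def __init__(self, base_pa: int = 0, align: int = 64) -> None:
        self._next = defaultdict(lambda: base_pa)
        self._align = align

    def alloc(self, pe: PeId, nbytes: int) -> int:
        pa = self._next[pe]
        # Round the reservation up to the alignment so later ranges stay aligned.
        reserved = -(-nbytes // self._align) * self._align
        self._next[pe] = pa + reserved
        return pa


def deploy(alloc: HostAllocator, total_bytes: int,
           pes: Sequence[PeId], policy: str = "split") -> List[Shard]:
    """Return the shard list for one tensor; iteration order makes it deterministic."""
    if policy == "replicate":
        # Every PE receives a full copy starting at logical offset 0.
        return [Shard(*pe, alloc.alloc(pe, total_bytes), total_bytes, 0) for pe in pes]
    if policy != "split":
        raise ValueError(f"unsupported policy in this sketch: {policy}")
    chunk = -(-total_bytes // len(pes))      # ceiling division
    shards = []
    for i, pe in enumerate(pes):
        offset = i * chunk
        n = min(chunk, total_bytes - offset)
        if n <= 0:
            break
        shards.append(Shard(*pe, alloc.alloc(pe, n), n, offset))
    return shards


pes = [(0, 0, 0), (0, 0, 1)]
shards = deploy(HostAllocator(), 100, pes)
```

Because the allocator state lives entirely on the host and the PEs are visited in a fixed order, repeating the same sequence of deployments yields the same shard lists, which is what lets Phase 0 skip a device-side allocation RPC.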

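D3. Data initialization and transfer use only MemoryWrite/Read

Every data initialization or transfer implied by a tensor (e.g., fill, copy) must be expressed using only Host ↔ IO_CPU messages:

MemoryWrite
MemoryRead

Rules:

MemoryWrite/Read must reference a PA plus (sip, cube, pe) tags (ADR-0012).
Allocation metadata must not be embedded in a separate allocation message.
Bulk tensor data must not be embedded in Phase 0 messages.

The simulation engine schedules MemoryWrite/Read through the graph, so latency is computed by explicit traversal.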
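
As an illustration of the D3 rules, the sketch below lowers a shard list into one message per shard. The message field layout is an assumption (ADR-0012 defines the real one), and `fill_byte` is a hypothetical descriptor used so that no bulk payload travels in the message.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# (sip, cube, pe, pa, nbytes, offset_bytes), as listed in D1.
Shard = Tuple[int, int, int, int, int, int]


@dataclass(frozen=True)
class MemoryWrite:
    sip: int
    cube: int
    pe: int
    pa: int
    nbytes: int
    fill_byte: int   # hypothetical: a one-byte pattern instead of bulk data


@dataclass(frozen=True)
class MemoryRead:
    sip: int
    cube: int
    pe: int
    pa: int
    nbytes: int


def fill_messages(shards: Iterable[Shard], fill_byte: int = 0) -> List[MemoryWrite]:
    """Express a tensor fill as one MemoryWrite per shard."""
    if not 0 <= fill_byte <= 255:
        raise ValueError("fill_byte must fit in one byte")
    return [MemoryWrite(sip, cube, pe, pa, nbytes, fill_byte)
            for sip, cube, pe, pa, nbytes, _ in shards]


def read_messages(shards: Iterable[Shard]) -> List[MemoryRead]:
    """Express a device-to-host read-back as one MemoryRead per shard."""
    return [MemoryRead(sip, cube, pe, pa, nbytes)
            for sip, cube, pe, pa, nbytes, _ in shards]


shard_list = [(0, 0, 0, 0x0, 50, 0), (0, 0, 1, 0x0, 50, 50)]
writes = fill_messages(shard_list, fill_byte=0xFF)
reads = read_messages(shard_list)
```

Each message carries only the PA, the size, and the (sip, cube, pe) tags, so no allocation message is needed and the simulation engine can schedule every message as a separate traversal of the graph.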

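D4. Extension path (compatibility preserved)

A future ADR may introduce optional VA/MMU/IOMMU modeling by adding:

virtual addresses on the tensor handle,
a mapping-installation step,
translation latency and page granularity.

The Phase 0 PA shard map remains a valid fast-path configuration.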
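
One way such an extension could stay compatible is an optional virtual address on each shard, with the PA used directly when it is absent. This is purely a hypothetical sketch: the `va` field, the `resolve_pa` helper, the flat page-table dictionary, and the 4 KiB page size are all assumptions, not decisions recorded in this ADR.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Shard:
    sip: int
    cube: int
    pe: int
    pa: int
    nbytes: int
    offset_bytes: int
    va: Optional[int] = None   # hypothetical future field; None means PA-only


def resolve_pa(shard: Shard,
               page_table: Optional[Dict[int, int]] = None,
               page_size: int = 4096) -> int:
    """Return the physical address a message should target."""
    if shard.va is None:
        # Phase 0 fast path: the shard map already holds the physical address.
        return shard.pa
    if page_table is None:
        raise ValueError("a VA shard needs an installed mapping")
    # page_table maps virtual page number -> physical page number.
    vpn, page_offset = divmod(shard.va, page_size)
    return page_table[vpn] * page_size + page_offset


phase0 = Shard(0, 0, 0, pa=0x1000, nbytes=64, offset_bytes=0)
mapped = Shard(0, 0, 0, pa=0, nbytes=64, offset_bytes=0, va=0x2010)
```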

Consequences

The Host↔IO_CPU contract stays minimal (MemoryRead/Write + KernelLaunch).
KernelLaunch can convey per-PE data placement explicitly through shard tags.
The initial implementation stays simple and testable.
