TMA/WGMMA Async Double-Buffer Pipeline
H100 Producer-Consumer Warp Groups • Issue is instant • Execution is async
Producer WG: Idle
Consumer WG: Idle
GMEM (Global Memory)
↓
SMEM (Shared Memory): Buffer 0 (Empty), Buffer 1 (Empty)
↓
GMEM (Results)
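
The data flow above (global memory, two shared-memory buffers, results) can be sketched as a host-side analogy. This is not CUDA and does not use TMA or WGMMA: two Python threads play the producer and consumer warp groups, and a pair of semaphores per buffer stands in for whatever barriers a real kernel would synchronize on. The names `run_pipeline`, `NUM_BUFFERS`, `empty`, and `full` are hypothetical and chosen only for illustration.

```python
import threading

NUM_BUFFERS = 2  # double buffering: Buffer 0 and Buffer 1


def run_pipeline(tiles):
    """Stream `tiles` (a list of lists) through two staging buffers.

    Returns one result per tile, in tile order.
    """
    buffers = [None] * NUM_BUFFERS
    # One "empty" and one "full" signal per buffer (assumption: these
    # stand in for the barriers used between warp groups on real hardware).
    empty = [threading.Semaphore(1) for _ in range(NUM_BUFFERS)]
    full = [threading.Semaphore(0) for _ in range(NUM_BUFFERS)]
    results = []

    def producer():
        for i, tile in enumerate(tiles):
            slot = i % NUM_BUFFERS
            empty[slot].acquire()       # wait until the consumer released this buffer
            buffers[slot] = list(tile)  # stand-in for the GMEM -> SMEM copy
            full[slot].release()        # signal: buffer now holds valid data

    def consumer():
        for i in range(len(tiles)):
            slot = i % NUM_BUFFERS
            full[slot].acquire()        # wait until the producer filled this buffer
            # Stand-in for the tensor-core work on the staged tile.
            results.append(sum(x * x for x in buffers[slot]))
            buffers[slot] = None        # mark the buffer as Empty again
            empty[slot].release()       # signal: buffer may be overwritten

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With two buffers the producer can fill one slot while the consumer works on the other, which is the overlap the diagram illustrates; with a single buffer the two sides would strictly alternate.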
Hardware Timeline — Producer/Consumer Warp Groups
Producer WG (TMA)
Consumer WG (TC)