Single-SSD pathway (official-kernel form)
Realises the Mamba-3 trapezoidal recurrence as a single SSD call (the
official Triton-SISO / Tilelang-MIMO form): a key scale
scaleₜ = γₜ + (1−λₜ₊₁)·Δₜ₊₁, a strict lower-triangular intra-chunk mask, a
same-step γ correction, and a boundary-β seed folded into the initial state.
This pathway uses roughly half the training memory of the
double_ssd pathway. Its cache’s SSM
accumulator h' has different mid-sequence semantics than the double-SSD
state (hence a distinct cache type), but the two coincide at sequence
boundaries and inter-convert via field-identity From impls.
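
The key scale and the strict intra-chunk mask described above can be illustrated with a minimal sketch. It is not this crate's implementation: the function names `key_scale` and `strict_lower_mask` are hypothetical, the inputs are plain per-step scalars rather than tensors, and treating the look-ahead term as zero at the final step (where t+1 does not exist) is an assumption made only to keep the example self-contained.

```rust
/// Per-step key scale: scale[t] = gamma[t] + (1 - lambda[t+1]) * delta[t+1].
///
/// Assumption (not taken from the crate): at the last step there is no
/// t+1, so the look-ahead term is taken to be zero.
fn key_scale(gamma: &[f32], lambda: &[f32], delta: &[f32]) -> Vec<f32> {
    assert_eq!(gamma.len(), lambda.len());
    assert_eq!(gamma.len(), delta.len());
    let n = gamma.len();
    (0..n)
        .map(|t| {
            let lookahead = if t + 1 < n {
                (1.0 - lambda[t + 1]) * delta[t + 1]
            } else {
                0.0
            };
            gamma[t] + lookahead
        })
        .collect()
}

/// Strict lower-triangular mask for one chunk of length `len`:
/// entry (i, j) is true only when j < i, so the diagonal is excluded.
fn strict_lower_mask(len: usize) -> Vec<Vec<bool>> {
    (0..len)
        .map(|i| (0..len).map(|j| j < i).collect())
        .collect()
}
```

Because the mask is strict, position i never attends to itself inside the chunk, which is consistent with the same-step contribution being handled by the separate γ correction mentioned above.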
Modules
- cache: The single-SSD cache (same fields as double-SSD, different ssm semantics).
- prelude: Public re-exports for the single-SSD pathway.
- single_ssd: forward_single_ssd (scale + boundary-β seed) and step_single_ssd.
- ssd: The standard MIMO-first SSD kernels specialised to the single-pass scale/mask.