§Mamba-3 Inference Caches
During autoregressive (token-by-token) generation, four pieces of state must be preserved between calls:
- SSM hidden state: hₜ ∈ ℝ^{P×N} per head, the compressed context.
- Previous K state: B_{t-1} per rank, shape [batch, mimo_rank, nheads, state_rank], needed for the β term of the trapezoidal recurrence.
- Previous V state: x_{t-1} per head, shape [batch, nheads, per_head_dim], paired with k_state to reconstruct β B_{t-1} ⊗ x_{t-1}.
- Cumulative RoPE angle: the accumulated rotation angle up to position t, needed to correctly continue data-dependent rotary embeddings.
Note: Mamba-3 has no conv cache. The short 1-D convolution present in earlier Mamba versions is removed; its role is absorbed by the trapezoidal discretization and the learnable B/C biases.
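To make the role of each cached quantity concrete, here is a minimal sketch for a single head, a single batch element, and MIMO rank 1. It assumes the recurrence has the form hₜ = α hₜ₋₁ + β B_{t-1} ⊗ x_{t-1} + γ Bₜ ⊗ xₜ, and it takes the coefficients α, β, γ and the per-step angle increment as given inputs. The type `ToyCache`, its fields, and `step` are hypothetical names for illustration only; they are not this crate's `Mamba3Cache` API, and the rotation itself is not applied here, only accumulated.

```rust
/// Hypothetical single-head, single-batch cache (illustration only).
struct ToyCache {
    h: Vec<f32>,      // SSM hidden state, row-major [p, n]
    k_prev: Vec<f32>, // previous B vector, length n
    v_prev: Vec<f32>, // previous x vector, length p
    angle: f32,       // cumulative rotation angle (a scalar here for simplicity)
    p: usize,
    n: usize,
}

impl ToyCache {
    /// Start from an all-zero state, as before the first token.
    fn new(p: usize, n: usize) -> Self {
        Self {
            h: vec![0.0; p * n],
            k_prev: vec![0.0; n],
            v_prev: vec![0.0; p],
            angle: 0.0,
            p,
            n,
        }
    }

    /// One decoding step under the assumed recurrence.
    /// `alpha`, `beta`, `gamma`, and `dtheta` are supplied by the caller.
    fn step(&mut self, x: &[f32], b: &[f32], alpha: f32, beta: f32, gamma: f32, dtheta: f32) {
        assert_eq!(x.len(), self.p);
        assert_eq!(b.len(), self.n);
        for i in 0..self.p {
            for j in 0..self.n {
                // β term: rebuilt from the cached previous K and V states.
                let prev = beta * self.k_prev[j] * self.v_prev[i];
                // γ term: contribution of the current token.
                let cur = gamma * b[j] * x[i];
                let idx = i * self.n + j;
                self.h[idx] = alpha * self.h[idx] + prev + cur;
            }
        }
        // The current inputs become the "previous" states for the next call.
        self.k_prev.copy_from_slice(b);
        self.v_prev.copy_from_slice(x);
        // Accumulate the rotation angle so the next position continues from it.
        self.angle += dtheta;
    }
}

fn main() {
    let mut cache = ToyCache::new(2, 2);
    cache.step(&[1.0, 2.0], &[1.0, 0.0], 0.5, 0.25, 1.0, 0.1);
    cache.step(&[1.0, 1.0], &[0.0, 1.0], 0.5, 0.25, 1.0, 0.1);
    println!("h = {:?}, angle = {}", cache.h, cache.angle);
}
```

The sketch shows why all four pieces must persist: the β term cannot be formed without the previous K and V states, and the angle must carry over so that the rotation at position t + 1 starts where position t left off.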
Structs§
- Mamba3Cache - The mutable state carried between decoding steps for a single Mamba-3 layer.
- Mamba3CacheConfig - Configuration / factory for a single Mamba3Cache.
- Mamba3CacheRecord - The record type for the module.
- Mamba3CacheRecordItem - The record item type for the module.
- Mamba3Caches - A collection of per-layer caches for a complete Mamba-3 network.
- Mamba3CachesConfig - Configuration / factory for Mamba3Caches.
- Mamba3CachesRecord - The record type for the module.
- Mamba3CachesRecordItem - The record item type for the module.