Module cache

The double-SSD cache (ssm/k_state/v_state/cum_angle; no conv cache).

§Mamba-3 Inference Caches

During autoregressive (token-by-token) generation, four pieces of state must be preserved between calls:

  1. SSM hidden state hₜ ∈ ℝ^{per_head_dim×state_rank} per head: the compressed context.
  2. Previous K state Bₜ₋₁ per rank, shape [batch, mimo_rank, nheads, state_rank], needed for the β term of the (double-SSD) trapezoidal recurrence.
  3. Previous V state xₜ₋₁ per head, shape [batch, nheads, per_head_dim], paired with k_state to reconstruct β Bₜ₋₁ ⊗ xₜ₋₁.
  4. Cumulative RoPE angle: the accumulated rotation angle up to position t, needed to correctly continue data-dependent rotary embeddings.
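
To make the bookkeeping concrete, here is a minimal sketch of how the four pieces of state could be laid out and carried from one decoding step to the next. It is not the crate's actual `Mamba3DoubleSsdCache`: the type `CacheSketch`, its methods, the flat `Vec<f32>` storage, and the `n_angles` dimension of the angle buffer are all hypothetical and chosen only for illustration. The SSM recurrence itself is omitted, so `ssm` is allocated but never updated here.

```rust
/// Hypothetical, simplified stand-in for one layer's decoding state.
/// Each tensor is stored flattened (row-major) in a Vec<f32>.
#[derive(Debug, Clone)]
struct CacheSketch {
    /// SSM hidden state, shape [batch, nheads, per_head_dim, state_rank].
    ssm: Vec<f32>,
    /// Previous K state B_{t-1}, shape [batch, mimo_rank, nheads, state_rank].
    k_state: Vec<f32>,
    /// Previous V state x_{t-1}, shape [batch, nheads, per_head_dim].
    v_state: Vec<f32>,
    /// Cumulative RoPE angle, shape [batch, nheads, n_angles] (assumed layout).
    cum_angle: Vec<f32>,
}

impl CacheSketch {
    /// Allocate an all-zero cache, as one would before the first token.
    fn zeros(
        batch: usize,
        mimo_rank: usize,
        nheads: usize,
        per_head_dim: usize,
        state_rank: usize,
        n_angles: usize,
    ) -> Self {
        Self {
            ssm: vec![0.0; batch * nheads * per_head_dim * state_rank],
            k_state: vec![0.0; batch * mimo_rank * nheads * state_rank],
            v_state: vec![0.0; batch * nheads * per_head_dim],
            cum_angle: vec![0.0; batch * nheads * n_angles],
        }
    }

    /// End-of-step bookkeeping: the current B_t and x_t become the
    /// "previous" K/V state for the next step, and the per-step rotation
    /// increment is added to the running angle.
    fn advance(&mut self, k_t: &[f32], v_t: &[f32], delta_angle: &[f32]) {
        assert_eq!(k_t.len(), self.k_state.len());
        assert_eq!(v_t.len(), self.v_state.len());
        assert_eq!(delta_angle.len(), self.cum_angle.len());
        self.k_state.copy_from_slice(k_t);
        self.v_state.copy_from_slice(v_t);
        for (acc, d) in self.cum_angle.iter_mut().zip(delta_angle) {
            *acc += *d;
        }
    }
}
```

In this sketch a caller would build the cache once with `CacheSketch::zeros(...)` and call `advance` at the end of every decoding step, after the layer has consumed the old `k_state` and `v_state` for the β term.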

Note: Mamba-3 has no conv cache. The short 1-dimensional convolution present in Mamba-2 is removed; its role is absorbed by the trapezoidal discretization and the learnable B/C biases.

Structs§

Mamba3DoubleSsdCache
The mutable state carried between decoding steps for a single Mamba-3 layer.
Mamba3DoubleSsdCacheConfig
Configuration / factory for a single Mamba3DoubleSsdCache.
Mamba3DoubleSsdCacheRecord
The record type for the module.
Mamba3DoubleSsdCacheRecordItem
The record item type for the module.
Mamba3DoubleSsdCaches
A collection of per-layer caches for a complete Mamba-3 network.
Mamba3DoubleSsdCachesConfig
Configuration / factory for Mamba3DoubleSsdCaches.
Mamba3DoubleSsdCachesRecord
The record type for the module.
Mamba3DoubleSsdCachesRecordItem
The record item type for the module.