Module cache

§Mamba-3 Inference Caches

During autoregressive (token-by-token) generation, four pieces of state must be preserved between calls:

  1. SSM hidden state hₜ ∈ ℝ^{P×N} per head, the compressed context.
  2. Previous K state B_{t-1} per rank [batch, mimo_rank, nheads, state_rank], needed for the β term of the trapezoidal recurrence.
  3. Previous V state x_{t-1} per head [batch, nheads, per_head_dim], paired with k_state to reconstruct β B_{t-1} ⊗ x_{t-1}.
  4. Cumulative RoPE angle — the accumulated rotation angle up to position t, needed to correctly continue data-dependent rotary embeddings.
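
To see why each of these must survive between calls, here is a minimal sketch of one decoding step for a single head with no batch or MIMO dimension. It assumes the recurrence has the form hₜ = α hₜ₋₁ + β B_{t-1} ⊗ x_{t-1} + γ Bₜ ⊗ xₜ; only the β term is stated above, so the α and γ terms, the `HeadCacheSketch` type, its field names, and the N/2 length of the angle vector are all illustrative assumptions, not the crate's actual `Mamba3Cache` layout. The rotation itself is not applied here; the sketch only accumulates the angle.

```rust
/// Hypothetical single-head, batch-free stand-in for the cached state.
/// Not the crate's `Mamba3Cache`; names and shapes are assumptions.
#[derive(Debug, Clone)]
struct HeadCacheSketch {
    p: usize,             // per-head dimension P
    n: usize,             // state rank N
    h: Vec<f32>,          // SSM hidden state, row-major P x N
    b_prev: Vec<f32>,     // previous K state B_{t-1}, length N
    x_prev: Vec<f32>,     // previous V state x_{t-1}, length P
    rope_angle: Vec<f32>, // cumulative RoPE angle (assumed length N / 2)
}

impl HeadCacheSketch {
    /// All state starts at zero before the first token.
    fn new(p: usize, n: usize) -> Self {
        Self {
            p,
            n,
            h: vec![0.0; p * n],
            b_prev: vec![0.0; n],
            x_prev: vec![0.0; p],
            rope_angle: vec![0.0; n / 2],
        }
    }

    /// One step of the assumed recurrence, followed by the cache updates.
    fn step(
        &mut self,
        alpha: f32,
        beta: f32,
        gamma: f32,
        b_t: &[f32],
        x_t: &[f32],
        d_angle: &[f32],
    ) {
        assert_eq!(b_t.len(), self.n);
        assert_eq!(x_t.len(), self.p);
        assert_eq!(d_angle.len(), self.rope_angle.len());

        for i in 0..self.p {
            for j in 0..self.n {
                let idx = i * self.n + j;
                // The beta term needs the cached previous K and V states.
                self.h[idx] = alpha * self.h[idx]
                    + beta * self.x_prev[i] * self.b_prev[j]
                    + gamma * x_t[i] * b_t[j];
            }
        }

        // Store the current inputs so the next step can form its beta term.
        self.b_prev.copy_from_slice(b_t);
        self.x_prev.copy_from_slice(x_t);

        // Accumulate the rotation angle up to the current position.
        for (a, d) in self.rope_angle.iter_mut().zip(d_angle) {
            *a += *d;
        }
    }
}

fn main() {
    let mut cache = HeadCacheSketch::new(2, 2);

    // First token: the cache is empty, so only the gamma term contributes.
    cache.step(0.5, 0.25, 1.0, &[1.0, 2.0], &[3.0, 4.0], &[0.25]);
    assert_eq!(cache.h, vec![3.0, 6.0, 4.0, 8.0]);

    // Second token: the beta term now reads the cached B and x of the first.
    cache.step(0.5, 0.25, 1.0, &[1.0, 0.0], &[1.0, 1.0], &[0.5]);
    assert_eq!(cache.h, vec![3.25, 4.5, 4.0, 6.0]);
    assert_eq!(cache.rope_angle, vec![0.75]);
}
```

Dropping any one of the four fields between calls would change the second step's result: without `b_prev` and `x_prev` the β term vanishes, and without `rope_angle` the rotation would restart from zero.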

Note: Mamba-3 has no conv cache. The short 1-D convolution present in Mamba-2 is removed; its role is absorbed by the trapezoidal discretization and the learnable B/C biases.

Structs§

Mamba3Cache
The mutable state carried between decoding steps for a single Mamba-3 layer.
Mamba3CacheConfig
Configuration / factory for a single Mamba3Cache.
Mamba3CacheRecord
The record type for the module.
Mamba3CacheRecordItem
The record item type for the module.
Mamba3Caches
A collection of per-layer caches for a complete Mamba-3 network.
Mamba3CachesConfig
Configuration / factory for Mamba3Caches.
Mamba3CachesRecord
The record type for the module.
Mamba3CachesRecordItem
The record item type for the module.