Struct Mamba3Cache
pub struct Mamba3Cache<B: Backend> {
pub ssm_bhpr: Tensor<B, 4>,
pub k_state_brhn: Tensor<B, 4>,
pub v_state_bhp: Tensor<B, 3>,
pub cum_angle_bhr: Tensor<B, 3>,
}
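ssm_bhpr: Tensor<B, 4>
SSM hidden state hₜ.
Updated via the trapezoidal recurrence:
hₜ = αₜ hₜ₋₁ + βₜ (sum_r B_{t-1}[r] ⊗ (x_{t-1} * mimo_x[r])) + γₜ (sum_r Bₜ[r] ⊗ (xₜ * mimo_x[r]))
Here B_{t-1} and x_{t-1} are the previous token's values, cached in k_state_brhn and v_state_bhp.
Shape: [batch, nheads, per_head_dim, state_rank]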
k_state_brhn: Tensor<B, 4>
Previous token’s B per rank, B_{t-1}[r].
Used to reconstruct the β term: βₜ * sum_r B_{t-1}[r] ⊗ (x_{t-1} * mimo_x[r]).
Shape: [batch, mimo_rank, nheads, state_rank]
For SISO (mimo_rank = 1) the shape is [batch, 1, nheads, state_rank].
v_state_bhp: Tensor<B, 3>
Previous token’s x, x_{t-1}.
Combined with k_state_brhn and mimo_x to produce the β term.
Shape: [batch, nheads, per_head_dim]
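The recurrence and the two previous-token fields can be illustrated with a minimal sketch for a single batch element and a single head, written in plain Rust rather than with Burn tensors. The names `HeadCache`, `outer_sum`, and `step` are hypothetical, as is the assumed layout of `mimo_x` as `[mimo_rank][per_head_dim]` and the choice to index the outer product as `h[p][n]` to match the documented state shape; none of these are taken from the crate.

```rust
/// Hypothetical per-head view of the cache (one batch element, one head).
struct HeadCache {
    h: Vec<Vec<f64>>,      // hidden state, [per_head_dim][state_rank]
    b_prev: Vec<Vec<f64>>, // B_{t-1}, [mimo_rank][state_rank]
    x_prev: Vec<f64>,      // x_{t-1}, [per_head_dim]
}

/// Computes sum_r B[r] ⊗ (x * mimo_x[r]) as a [per_head_dim][state_rank] matrix.
/// Assumes mimo_x is laid out as [mimo_rank][per_head_dim].
fn outer_sum(b: &[Vec<f64>], x: &[f64], mimo_x: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let p_dim = x.len();
    let n_dim = b[0].len();
    let mut out = vec![vec![0.0; n_dim]; p_dim];
    for (b_r, m_r) in b.iter().zip(mimo_x.iter()) {
        for p in 0..p_dim {
            let scaled = x[p] * m_r[p];
            for n in 0..n_dim {
                out[p][n] += scaled * b_r[n];
            }
        }
    }
    out
}

/// One trapezoidal step: h_t = alpha * h_{t-1} + beta * prev_term + gamma * cur_term,
/// after which the current B and x become the cached "previous" values.
fn step(
    cache: &mut HeadCache,
    alpha: f64,
    beta: f64,
    gamma: f64,
    b_t: &[Vec<f64>],
    x_t: &[f64],
    mimo_x: &[Vec<f64>],
) {
    let prev_term = outer_sum(&cache.b_prev, &cache.x_prev, mimo_x);
    let cur_term = outer_sum(b_t, x_t, mimo_x);
    for p in 0..cache.h.len() {
        for n in 0..cache.h[p].len() {
            cache.h[p][n] = alpha * cache.h[p][n] + beta * prev_term[p][n] + gamma * cur_term[p][n];
        }
    }
    cache.b_prev = b_t.to_vec();
    cache.x_prev = x_t.to_vec();
}
```

The sketch shows why both B_{t-1} and x_{t-1} have to be cached: the β term at step t depends on inputs from step t-1 that would otherwise be gone once the hidden state has been updated.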
cum_angle_bhr: Tensor<B, 3>
Cumulative data-dependent RoPE angle up to the current position.
Each step updates: cum_angle_{t} = cum_angle_{t-1} + Δ_t · tanh(θ_t) · π
Starts at zero for a fresh sequence and is carried over between calls when streaming.
Shape: [batch, nheads, num_rope_angles]
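The angle update can be sketched in plain Rust for one batch element and one head. The function name `advance_angles` is hypothetical, and the sketch assumes Δ_t is a scalar per head while θ_t has one entry per RoPE angle; the crate itself operates on Burn tensors and may broadcast differently.

```rust
use std::f64::consts::PI;

/// Adds delta_t * tanh(theta_t[i]) * pi to each cumulative angle in place.
/// cum_angle and theta_t both have length num_rope_angles.
fn advance_angles(cum_angle: &mut [f64], delta_t: f64, theta_t: &[f64]) {
    for (acc, theta) in cum_angle.iter_mut().zip(theta_t) {
        *acc += delta_t * theta.tanh() * PI;
    }
}
```

Because the update is a running sum, processing a sequence in several chunks and passing the accumulated angles from one call to the next gives the same result as a single pass, which is what makes carrying this field in the cache sufficient for streaming. Since tanh is bounded by 1 in magnitude, each step changes an angle by at most |Δ_t| · π.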
Inner module without auto-differentiation.
Returns the same module, but on the inner backend without auto-differentiation.
Wraps an inner module back into an auto-diff module.
Performs copy-assignment from source.
Formats the value using the given formatter.
Formats the value using the given formatter.
The module with auto-differentiation.
Type to save and load the module.
Load the module state from a record.
Convert the module into a record containing the state.
Get the number of parameters the module has, including all of its sub-modules.
Visit each tensor parameter in the module with a visitor.
Map each tensor parameter in the module with a mapper.
Return all the devices found in the underneath module tree added to the given vector without duplicates.
Move the module and all of its sub-modules to the given device.
Fork the module and all of its sub-modules to the given device.
Return all the devices found in the underneath module tree without duplicates.
Each tensor in the module tree will not require grad.
Move the module and all of its sub-modules to the autodiff backend.
Quantize the weights of the module.
Formats the module with provided display settings.
Custom display settings for the module.
Attributes of the module used for display purposes.
Gets the number of the parameters of the module.
Immutably borrows from an owned value.
Mutably borrows from an owned value.
🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.
Returns the argument unchanged.
Calls U::from(self).
That is, this conversion is whatever the implementation of From<T> for U chooses to do.
The resulting type after obtaining ownership.
Creates owned data from borrowed data, usually by cloning.
Uses borrowed data to replace owned data, usually by cloning.
Converts the given value to a String.
The type returned in the event of a conversion error.
Performs the conversion.
The type returned in the event of a conversion error.
Performs the conversion.