Enum Mamba3SsdPath

pub enum Mamba3SsdPath {
    Minimal(Option<usize>),
    Serial(Option<usize>),
    SerialRecalculated(Option<usize>),
}

SSD algorithm selection.

Each variant carries an optional chunk length Q (an Option<usize>) for the SSD algorithm. A larger Q increases the intra-chunk GEMM work and shortens the inter-chunk scan. The optimal value is approximately √(state_rank · per_head_dim).
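The trade-off above can be made concrete with a small sketch. The helpers `approx_chunk_len` and `num_chunks` below are hypothetical and not part of this crate; the rounding rule and the assumption that the last chunk may be partial are illustrative choices, and `optimal_default` may well round differently.

```rust
/// Hypothetical helper (not the crate's API): a chunk length close to
/// sqrt(state_rank * per_head_dim), rounded to the nearest integer.
/// The rounding rule is an assumption made for illustration.
fn approx_chunk_len(state_rank: usize, per_head_dim: usize) -> usize {
    let product = (state_rank * per_head_dim) as f64;
    // Clamp to at least 1 so the result is always a usable chunk length.
    (product.sqrt().round() as usize).max(1)
}

/// Hypothetical helper: number of chunks the inter-chunk scan visits,
/// assuming the final chunk may be shorter than `chunk_len`.
fn num_chunks(seq_len: usize, chunk_len: usize) -> usize {
    assert!(chunk_len > 0, "chunk length must be positive");
    (seq_len + chunk_len - 1) / chunk_len
}
```

With state_rank = 64 and per_head_dim = 64 the sketch suggests Q = 64. For a 2048-token sequence, raising Q from 64 to 256 cuts the scan from 32 chunks to 8, while each chunk's matmuls grow accordingly.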

Variants§

Minimal(Option<usize>)

Minimal SSD.

This algorithm mostly uses batched matmuls. The backward pass relies on autodiff. See [chunked_selective_scan] for more info.

For training, you may prefer using SerialRecalculated instead.

Based on /mamba_ssm/modules/ssd_minimal.py from the state-spaces/mamba GitHub reference.

Serial(Option<usize>)

(Hybrid) Serial SSD.

This algorithm uses a serial loop over the nchunks chunks in addition to batched matmuls. The backward pass relies on autodiff. For a custom backward pass that saves memory, see SerialRecalculated.

Based on 5 kernels in /mamba_ssm/ops/triton/ from the state-spaces/mamba GitHub reference:

ssd_chunk_state.py (K1, K3).
ssd_bmm.py (K2).
ssd_state_passing.py (K4).
ssd_chunk_scan.py (K5).
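To illustrate the idea of a serial loop over chunks, here is a minimal scalar sketch. It is not this crate's implementation and uses no tensors: it assumes a toy recurrence h_t = a_t · h_{t-1} + b_t · x_t with output y_t = c_t · h_t, and the function names are hypothetical. Each chunk computes its local state from a zero initial state together with its cumulative decay, and only the chunk-final state is passed serially to the next chunk.

```rust
/// Sequential reference for the toy recurrence
/// h_t = a_t * h_{t-1} + b_t * x_t, y_t = c_t * h_t.
fn scan_sequential(a: &[f64], b: &[f64], c: &[f64], x: &[f64]) -> Vec<f64> {
    let mut h = 0.0;
    let mut y = Vec::with_capacity(x.len());
    for t in 0..x.len() {
        h = a[t] * h + b[t] * x[t];
        y.push(c[t] * h);
    }
    y
}

/// Chunked variant (illustrative only): a serial loop over chunks that
/// carries one state value from each chunk to the next.
fn scan_chunked(a: &[f64], b: &[f64], c: &[f64], x: &[f64], chunk_len: usize) -> Vec<f64> {
    assert!(chunk_len > 0, "chunk length must be positive");
    let mut y = Vec::with_capacity(x.len());
    let mut carried = 0.0; // state entering the current chunk
    let mut start = 0;
    while start < x.len() {
        let end = (start + chunk_len).min(x.len());
        let mut local = 0.0; // state built inside the chunk from a zero start
        let mut decay = 1.0; // product of a_t from the chunk start up to t
        for t in start..end {
            local = a[t] * local + b[t] * x[t];
            decay *= a[t];
            // Full state = local contribution + decayed incoming state.
            y.push(c[t] * (local + decay * carried));
        }
        // Inter-chunk state passing: the only serial dependency between chunks.
        carried = local + decay * carried;
        start = end;
    }
    y
}
```

Both functions produce the same outputs up to floating-point rounding for any positive chunk length. In the real algorithm the per-chunk work is expressed as batched matmuls rather than the inner loop shown here.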

SerialRecalculated(Option<usize>)

(Hybrid) Serial SSD that triggers recalculations for the backward pass.

This algorithm uses a serial loop over the nchunks chunks in addition to batched matmuls. It contains a custom backward operation that saves memory. For an autodiff backward pass, see Serial.

Based on the combined kernel /mamba_ssm/ops/triton/ssd_combined.py from the state-spaces/mamba GitHub reference.

Implementations§

impl Mamba3SsdPath

pub fn optimal_default(state_rank: usize, per_head_dim: usize) -> usize

pub fn core_optimal(state_rank: usize, per_head_dim: usize) -> Self

pub fn core_optimal_from_block<B: Backend>(block: &Mamba3<B>) -> Self

pub fn chunked_optimal(state_rank: usize, per_head_dim: usize) -> Self

pub fn chunked_optimal_from_block<B: Backend>(block: &Mamba3<B>) -> Self

pub fn chunked_recalculated_optimal(state_rank: usize, per_head_dim: usize) -> Self

pub fn chunked_recalculated_optimal_from_block<B: Backend>(block: &Mamba3<B>) -> Self

pub fn chunk_len(&self) -> Option<usize>

pub fn chunk_len_or_optimal(&self, state_rank: usize, per_head_dim: usize) -> usize
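The signatures of chunk_len and chunk_len_or_optimal suggest that an explicit chunk length is used when present and a computed default otherwise. The sketch below shows that pattern on a stand-in enum. `SsdPathSketch`, `fallback_chunk_len`, and `chunk_len_or_fallback` are hypothetical names, and the fallback formula is an assumption based on the enum description, not the crate's actual code.

```rust
/// Stand-in for the documented enum, used only for illustration.
#[derive(Clone, Debug)]
enum SsdPathSketch {
    Minimal(Option<usize>),
    Serial(Option<usize>),
    SerialRecalculated(Option<usize>),
}

/// Hypothetical fallback: roughly sqrt(state_rank * per_head_dim).
fn fallback_chunk_len(state_rank: usize, per_head_dim: usize) -> usize {
    (((state_rank * per_head_dim) as f64).sqrt().round() as usize).max(1)
}

impl SsdPathSketch {
    /// Returns the chunk length carried by the variant, if any.
    fn chunk_len(&self) -> Option<usize> {
        match self {
            SsdPathSketch::Minimal(q)
            | SsdPathSketch::Serial(q)
            | SsdPathSketch::SerialRecalculated(q) => *q,
        }
    }

    /// Uses the explicit chunk length when set, otherwise the fallback.
    /// This mirrors the assumed behavior, not a confirmed implementation.
    fn chunk_len_or_fallback(&self, state_rank: usize, per_head_dim: usize) -> usize {
        self.chunk_len()
            .unwrap_or_else(|| fallback_chunk_len(state_rank, per_head_dim))
    }
}
```

Under these assumptions, `SsdPathSketch::Serial(Some(128))` resolves to 128 regardless of the dimensions, while `SsdPathSketch::Minimal(None)` with state_rank = 64 and per_head_dim = 64 resolves to 64.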

pub fn run<B: Backend + Mamba3BackendExt>(&self, input: Mamba3SsdInput<B>) -> (Tensor<B, 6>, Tensor<B, 4>)

Trait Implementations§

impl Clone for Mamba3SsdPath

fn clone(&self) -> Mamba3SsdPath

fn clone_from(&mut self, source: &Self)

impl Debug for Mamba3SsdPath

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Default for Mamba3SsdPath

fn default() -> Mamba3SsdPath

Auto Trait Implementations§

impl Freeze for Mamba3SsdPath

impl RefUnwindSafe for Mamba3SsdPath

impl Send for Mamba3SsdPath

impl Sync for Mamba3SsdPath

impl Unpin for Mamba3SsdPath

impl UnsafeUnpin for Mamba3SsdPath

impl UnwindSafe for Mamba3SsdPath

Blanket Implementations§

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T> ToOwned for T where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
