Enum Mamba2SsdPath

pub enum Mamba2SsdPath {
    Minimal(Option<usize>),
    Serial(Option<usize>),
    SerialRecalculated(Option<usize>),
}

SSD algorithm selection.

Each variant carries an optional chunk length Q (an Option<usize>) for the SSD algorithm.
Larger values of Q increase the intra-chunk GEMM work and shorten the inter-chunk scan.
The optimal value is approximately √(state_rank · per_head_dim).
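
As a rough illustration of the trade-off above, the sketch below computes the approximate optimum and the resulting number of chunks for a given sequence length. Both function names are hypothetical and are not part of this crate. The rounding rule and the handling of a partial last chunk are assumptions, and the actual optimal_default may round differently.

```rust
/// Hypothetical helper (not the crate's `optimal_default`): rounds
/// sqrt(state_rank * per_head_dim) to the nearest integer, with a floor of 1.
/// The rounding rule is an assumption made for illustration.
fn approx_optimal_chunk_len(state_rank: usize, per_head_dim: usize) -> usize {
    let product = (state_rank * per_head_dim) as f64;
    (product.sqrt().round() as usize).max(1)
}

/// Number of chunks the inter-chunk scan would visit, assuming a partial
/// final chunk still counts as one chunk (an assumption; the library may
/// instead pad or require divisibility). Panics if `chunk_len` is 0.
fn num_chunks(seq_len: usize, chunk_len: usize) -> usize {
    seq_len.div_ceil(chunk_len)
}
```

Under these assumptions, state_rank = 128 and per_head_dim = 64 give Q = 91, and doubling Q from 64 to 128 on a sequence of length 2048 halves the scan from 32 chunks to 16.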

Variants

Minimal(Option<usize>)

Minimal SSD.

This algorithm mostly uses batched matmuls. The backward pass relies on autodiff.
See [chunked_selective_scan] for more info.

For training, you may prefer SerialRecalculated instead.

Based on /mamba_ssm/modules/ssd_minimal.py from the state-spaces/mamba GitHub reference.

Serial(Option<usize>)

(Hybrid) Serial SSD.

This algorithm uses a serial loop over the nchunks chunks in addition to batched matmuls. The backward pass relies on autodiff.
For a custom backward pass that saves memory, see SerialRecalculated.

Based on 5 kernels in /mamba_ssm/ops/triton/ from the state-spaces/mamba GitHub reference:

ssd_chunk_state.py (K1, K3).
ssd_bmm.py (K2).
ssd_state_passing.py (K4).
ssd_chunk_scan.py (K5).

SerialRecalculated(Option<usize>)

(Hybrid) Serial SSD that recalculates intermediate values during the backward pass.

This algorithm uses a serial loop over the nchunks chunks in addition to batched matmuls. It includes a custom backward operation that saves memory.
For a backward pass that relies on autodiff, see Serial.

Based on the combined kernel /mamba_ssm/ops/triton/ssd_combined.py from the state-spaces/mamba GitHub reference.

Implementations

impl Mamba2SsdPath

pub fn optimal_default(state_rank: usize, per_head_dim: usize) -> usize

pub fn core_optimal(state_rank: usize, per_head_dim: usize) -> Self

pub fn core_optimal_from_block<B: Backend>(block: &Mamba2<B>) -> Self

pub fn chunked_optimal(state_rank: usize, per_head_dim: usize) -> Self

pub fn chunked_optimal_from_block<B: Backend>(block: &Mamba2<B>) -> Self

pub fn chunked_recalculated_optimal(state_rank: usize, per_head_dim: usize) -> Self

pub fn chunked_recalculated_optimal_from_block<B: Backend>(block: &Mamba2<B>) -> Self

pub fn chunk_len(&self) -> Option<usize>

pub fn chunk_len_or_optimal(&self, state_rank: usize, per_head_dim: usize) -> usize
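
The two accessors above suggest that a variant carrying None falls back to the optimal chunk length. The sketch below illustrates that reading on a local stand-in enum. SsdPathSketch and chunk_len_or_fallback are hypothetical names, the fallback formula uses an assumed rounding rule, and none of this is the crate's actual implementation.

```rust
/// Local stand-in that mirrors the documented variant shapes.
/// It is not the crate's `Mamba2SsdPath` type.
#[derive(Clone, Debug)]
enum SsdPathSketch {
    Minimal(Option<usize>),
    Serial(Option<usize>),
    SerialRecalculated(Option<usize>),
}

impl SsdPathSketch {
    /// Returns the chunk length carried by whichever variant is active.
    fn chunk_len(&self) -> Option<usize> {
        match self {
            Self::Minimal(q) | Self::Serial(q) | Self::SerialRecalculated(q) => *q,
        }
    }

    /// Uses the explicit chunk length when present; otherwise falls back to
    /// round(sqrt(state_rank * per_head_dim)), clamped to at least 1.
    /// The fallback formula is an assumption based on the description above.
    fn chunk_len_or_fallback(&self, state_rank: usize, per_head_dim: usize) -> usize {
        self.chunk_len().unwrap_or_else(|| {
            let product = (state_rank * per_head_dim) as f64;
            (product.sqrt().round() as usize).max(1)
        })
    }
}
```

With this stand-in, an explicit Some(32) is returned unchanged, while None with state_rank = 64 and per_head_dim = 64 resolves to 64.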

pub fn run<B: Backend + Mamba2BackendExt>(&self, input: Mamba2SsdInput<B>) -> (Tensor<B, 5>, Tensor<B, 4>)

Returns

Trait Implementations

impl Clone for Mamba2SsdPath

fn clone(&self) -> Mamba2SsdPath

fn clone_from(&mut self, source: &Self)

impl Debug for Mamba2SsdPath

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Default for Mamba2SsdPath

fn default() -> Mamba2SsdPath

Auto Trait Implementations

impl Freeze for Mamba2SsdPath

impl RefUnwindSafe for Mamba2SsdPath

impl Send for Mamba2SsdPath

impl Sync for Mamba2SsdPath

impl Unpin for Mamba2SsdPath

impl UnsafeUnpin for Mamba2SsdPath

impl UnwindSafe for Mamba2SsdPath

Blanket Implementations

impl<T> Any for T where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T where U: From<T>,

fn into(self) -> U

impl<T> ToOwned for T where T: Clone,

type Owned = T

fn to_owned(&self) -> T

fn clone_into(&self, target: &mut T)

impl<T, U> TryFrom<U> for T where U: Into<T>,

type Error = Infallible

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

impl<T, U> TryInto<U> for T where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>
