Enum MambaVocabNetConfig 

pub enum MambaVocabNetConfig {
    Mamba1 {
        n_real_layers: usize,
        n_virtual_layers: Option<(usize, Schedule)>,
        vocab_size: usize,
        pad_vocab_size_multiple: usize,
        mamba_block: Mamba1Config,
        missing_lm_head: bool,
        ignore_first_residual: bool,
        ignore_last_residual: bool,
        residuals: ResidualsConfig,
    },
    Mamba2 {
        n_real_layers: usize,
        n_virtual_layers: Option<(usize, Schedule)>,
        vocab_size: usize,
        pad_vocab_size_multiple: usize,
        mamba_block: Mamba2Config,
        missing_lm_head: bool,
        ignore_first_residual: bool,
        ignore_last_residual: bool,
        residuals: ResidualsConfig,
    },
    Mamba3 {
        n_real_layers: usize,
        n_virtual_layers: Option<(usize, Schedule)>,
        vocab_size: usize,
        pad_vocab_size_multiple: usize,
        mamba_block: Mamba3Config,
        missing_lm_head: bool,
        ignore_first_residual: bool,
        ignore_last_residual: bool,
        residuals: ResidualsConfig,
    },
}
Serializable configuration for MambaVocabNet. Each variant holds the concrete settings for one Mamba family (Mamba-1, Mamba-2, or Mamba-3), so #[derive(Config)] applies to the enum directly; init builds the matching network variant.
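The per-family layout can be pictured with a small standalone sketch. Everything in it (NetConfig, Net, BlockV1Config, BlockV2Config, and their fields) is a hypothetical stand-in invented for illustration, not this crate's types; it only shows the general pattern of a config enum whose init matches on the variant and returns the corresponding network variant. The "layers" here are plain copies of the block config, used as placeholders for real modules.

```rust
// Hypothetical stand-ins for illustration; none of these are the crate's real types.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
struct BlockV1Config {
    d_model: usize,
}

#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
struct BlockV2Config {
    d_model: usize,
    n_heads: usize,
}

/// Config enum with one concrete variant per family.
#[allow(dead_code)]
#[derive(Clone, Debug, PartialEq)]
enum NetConfig {
    V1 { n_real_layers: usize, block: BlockV1Config },
    V2 { n_real_layers: usize, block: BlockV2Config },
}

/// Built network whose variants mirror the config's variants.
/// Each "layer" is just a copy of the block config (a placeholder).
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Net {
    V1 { layers: Vec<BlockV1Config> },
    V2 { layers: Vec<BlockV2Config> },
}

impl NetConfig {
    /// Build the network variant that matches this config variant.
    #[allow(dead_code)]
    fn init(&self) -> Net {
        match self {
            NetConfig::V1 { n_real_layers, block } => Net::V1 {
                layers: vec![block.clone(); *n_real_layers],
            },
            NetConfig::V2 { n_real_layers, block } => Net::V2 {
                layers: vec![block.clone(); *n_real_layers],
            },
        }
    }
}
```

Because every variant names a concrete block config type, the enum needs no generic parameters, which is what lets derive macros treat it like any other plain data type.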

Variants§

Mamba1

Build a Mamba-1 language model.

Fields

§n_real_layers: usize

Number of real layers.

§n_virtual_layers: Option<(usize, Schedule)>

Optional virtual-layer scheduling.

§vocab_size: usize

Unpadded vocabulary size.

§pad_vocab_size_multiple: usize

Round vocab_size up to a multiple of this (1 disables rounding).
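The rounding rule can be sketched as a small helper. The function name padded_vocab_size is an assumption made for this example and is not part of the documented API; the crate may compute the padded size differently, and the panic on a zero multiple is a choice made here, not documented behaviour.

```rust
/// Round `vocab_size` up to the nearest multiple of `pad_vocab_size_multiple`.
/// Hypothetical helper for illustration; not part of the documented API.
#[allow(dead_code)]
fn padded_vocab_size(vocab_size: usize, pad_vocab_size_multiple: usize) -> usize {
    // A multiple of zero is meaningless; this sketch rejects it explicitly.
    assert!(pad_vocab_size_multiple > 0, "multiple must be non-zero");
    let remainder = vocab_size % pad_vocab_size_multiple;
    if remainder == 0 {
        // Already aligned (always the case when the multiple is 1).
        vocab_size
    } else {
        // Add just enough to reach the next multiple.
        vocab_size + (pad_vocab_size_multiple - remainder)
    }
}
```

For instance, under this sketch a vocabulary of 50277 tokens with a multiple of 8 pads to 50280, while a multiple of 1 leaves it at 50277.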

§mamba_block: Mamba1Config

Shared block config.

§missing_lm_head: bool

When true, tie the LM head to the (transposed) embedding weights instead of using a separate output projection.

§ignore_first_residual: bool

Suppress the first virtual layer’s residual (Pre-LN skip / MultiGate seed carry). See Layers.

§ignore_last_residual: bool

Suppress the last virtual layer’s residual (output is the last layer’s transform alone). See Layers.

§residuals: ResidualsConfig

Inter-layer residual scheme (plain additive vs Multi-Gate).

Mamba2

Build a Mamba-2 language model.

Fields

§n_real_layers: usize

Number of real layers.

§n_virtual_layers: Option<(usize, Schedule)>

Optional virtual-layer scheduling.

§vocab_size: usize

Unpadded vocabulary size.

§pad_vocab_size_multiple: usize

Round vocab_size up to a multiple of this (1 disables rounding).

§mamba_block: Mamba2Config

Shared block config.

§missing_lm_head: bool

When true, tie the LM head to the (transposed) embedding weights instead of using a separate output projection.

§ignore_first_residual: bool

Suppress the first virtual layer’s residual (Pre-LN skip / MultiGate seed carry). See Layers.

§ignore_last_residual: bool

Suppress the last virtual layer’s residual (output is the last layer’s transform alone). See Layers.

§residuals: ResidualsConfig

Inter-layer residual scheme (plain additive vs Multi-Gate).

Mamba3

Build a Mamba-3 language model.

Fields

§n_real_layers: usize

Number of real layers.

§n_virtual_layers: Option<(usize, Schedule)>

Optional virtual-layer scheduling.

§vocab_size: usize

Unpadded vocabulary size.

§pad_vocab_size_multiple: usize

Round vocab_size up to a multiple of this (1 disables rounding).

§mamba_block: Mamba3Config

Shared block config.

§missing_lm_head: bool

When true, tie the LM head to the (transposed) embedding weights instead of using a separate output projection.

§ignore_first_residual: bool

Suppress the first virtual layer’s residual (Pre-LN skip / MultiGate seed carry). See Layers.

§ignore_last_residual: bool

Suppress the last virtual layer’s residual (output is the last layer’s transform alone). See Layers.

§residuals: ResidualsConfig

Inter-layer residual scheme (plain additive vs Multi-Gate).

Implementations§

impl MambaVocabNetConfig

pub fn init(&self, device: &Device) -> MambaVocabNet

Allocates and initialises the selected language model on device, returning the MambaVocabNet variant that matches this config's variant.

Trait Implementations§

impl Clone for MambaVocabNetConfig

fn clone(&self) -> Self

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Config for MambaVocabNetConfig

fn load_binary(data: &[u8]) -> Result<Self, ConfigError>

Loads the configuration from a binary buffer.

impl Debug for MambaVocabNetConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for MambaVocabNetConfig

fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Display for MambaVocabNetConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Serialize for MambaVocabNetConfig

fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where S: Serializer,

Serialize this value into the given Serde serializer.
