burn_mamba::mamba2::network

Struct Mamba2NetworkConfig

pub struct Mamba2NetworkConfig {
    pub n_real_layers: usize,
    pub n_virtual_layers: Option<(usize, Schedule)>,
    pub vocab_size: usize,
    pub pad_vocab_size_multiple: usize,
    pub mamba_block: Mamba2Config,
    pub missing_lm_head: bool,
}

Configuration / factory for Mamba2Network.

Fields§

§n_real_layers: usize

Number of real (weight-bearing) Mamba-2 layers.

§n_virtual_layers: Option<(usize, Schedule)>

Optional virtual-layer scheduling. See Mamba2Layers for details.

§vocab_size: usize

The unpadded vocabulary size as specified by the tokenizer.

At initialisation, this value is rounded up to the nearest multiple of `pad_vocab_size_multiple`. The result, `padded_vocab_size`, is the actual embedding / logit dimension.

§pad_vocab_size_multiple: usize

Vocabulary size will be rounded up to a multiple of this value.

Set to 1 to disable rounding. Common values: 8, 16, 64.
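
The rounding described for `vocab_size` and `pad_vocab_size_multiple` can be sketched as follows. The free function `padded_vocab_size` is a hypothetical helper written for illustration; the crate may compute the padded size differently internally.

```rust
/// Hypothetical helper (not part of burn_mamba): rounds `vocab_size` up to
/// the next multiple of `pad_vocab_size_multiple`.
fn padded_vocab_size(vocab_size: usize, pad_vocab_size_multiple: usize) -> usize {
    // A multiple of zero is meaningless; 1 disables rounding.
    assert!(pad_vocab_size_multiple > 0, "multiple must be non-zero");
    let remainder = vocab_size % pad_vocab_size_multiple;
    if remainder == 0 {
        // Already aligned, so no padding rows are needed.
        vocab_size
    } else {
        // Add just enough rows to reach the next multiple.
        vocab_size + (pad_vocab_size_multiple - remainder)
    }
}
```

Under this sketch, a tokenizer vocabulary of 50277 with a multiple of 8 gives a padded size of 50280, and a multiple of 1 leaves any size unchanged.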

§mamba_block: Mamba2Config

Configuration shared by all Mamba-2 blocks.

§missing_lm_head: bool

When true, the LM head weight is not allocated separately; instead the transposed embedding matrix is used directly (weight tying).
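
To illustrate what weight tying means here, the sketch below computes logits as the dot product of a hidden state with each row of the embedding matrix, which is equivalent to multiplying by the transposed embedding. It uses plain `Vec<f32>` rather than Burn tensors, and the function names `embed` and `tied_logits` are hypothetical, not part of the crate.

```rust
/// Hypothetical lookup: returns the embedding row for `token`.
/// `embedding` has shape [vocab][d_model].
fn embed(embedding: &[Vec<f32>], token: usize) -> &[f32] {
    &embedding[token]
}

/// Hypothetical tied LM head: reuses the same `embedding` matrix to map a
/// hidden state of length d_model to one logit per vocabulary entry.
fn tied_logits(embedding: &[Vec<f32>], hidden: &[f32]) -> Vec<f32> {
    embedding
        .iter()
        .map(|row| {
            // Dot product of one embedding row with the hidden state,
            // i.e. one column of hidden * embedding^T.
            row.iter().zip(hidden).map(|(w, h)| w * h).sum::<f32>()
        })
        .collect()
}
```

Because both functions read the same matrix, no second `[vocab][d_model]` weight needs to be stored, which is the saving that weight tying provides.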

Implementations§

impl Mamba2NetworkConfig

pub fn new(
    n_real_layers: usize,
    vocab_size: usize,
    pad_vocab_size_multiple: usize,
    mamba_block: Mamba2Config,
    missing_lm_head: bool,
) -> Self

Create a new instance of the config.

§Arguments

§Required Arguments

§`n_real_layers`

Number of real (weight-bearing) Mamba-2 layers.

§`vocab_size`

The unpadded vocabulary size as specified by the tokenizer.

At initialisation, this value is rounded up to the nearest multiple of `pad_vocab_size_multiple`. The result, `padded_vocab_size`, is the actual embedding / logit dimension.

§`pad_vocab_size_multiple`

Vocabulary size will be rounded up to a multiple of this value.

Set to 1 to disable rounding. Common values: 8, 16, 64.

§`mamba_block`

Configuration shared by all Mamba-2 blocks.

§`missing_lm_head`

When true, the LM head weight is not allocated separately; instead the transposed embedding matrix is used directly (weight tying).

§Default Arguments

§`n_virtual_layers`

Optional virtual-layer scheduling. See Mamba2Layers for details.

Defaults to `None`.

impl Mamba2NetworkConfig

pub fn with_n_virtual_layers(
    self,
    n_virtual_layers: Option<(usize, Schedule)>,
) -> Self

Sets the value for the field n_virtual_layers.

Optional virtual-layer scheduling. See Mamba2Layers for details.

Defaults to `None`.
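
The split between required arguments in `new` and the defaulted `n_virtual_layers` setter follows a consuming-builder pattern. The sketch below mirrors that pattern with stand-in types so that it compiles without the crate: `DemoSchedule` and `DemoNetworkConfig` are hypothetical, and the `mamba_block` field is omitted because the shape of `Mamba2Config` is not shown here.

```rust
// Hypothetical stand-in for `Schedule`; the real type lives in burn_mamba.
#[derive(Clone, Debug, PartialEq)]
struct DemoSchedule;

// Hypothetical stand-in that mirrors the documented fields, minus `mamba_block`.
#[derive(Clone, Debug, PartialEq)]
struct DemoNetworkConfig {
    n_real_layers: usize,
    n_virtual_layers: Option<(usize, DemoSchedule)>,
    vocab_size: usize,
    pad_vocab_size_multiple: usize,
    missing_lm_head: bool,
}

impl DemoNetworkConfig {
    /// Takes only the required arguments; the optional field starts at `None`.
    fn new(
        n_real_layers: usize,
        vocab_size: usize,
        pad_vocab_size_multiple: usize,
        missing_lm_head: bool,
    ) -> Self {
        Self {
            n_real_layers,
            n_virtual_layers: None,
            vocab_size,
            pad_vocab_size_multiple,
            missing_lm_head,
        }
    }

    /// Consumes the config and returns it with the optional field replaced.
    fn with_n_virtual_layers(mut self, n_virtual_layers: Option<(usize, DemoSchedule)>) -> Self {
        self.n_virtual_layers = n_virtual_layers;
        self
    }
}
```

Since the setter takes `self` by value, calls chain naturally, for example `DemoNetworkConfig::new(24, 50277, 8, true).with_n_virtual_layers(Some((48, DemoSchedule)))`; the numbers are illustrative only.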

impl Mamba2NetworkConfig

pub fn init<B: Backend>(&self, device: &B::Device) -> Mamba2Network<B>

Allocate and initialise the full network on device.

Trait Implementations§

impl Clone for Mamba2NetworkConfig

fn clone(&self) -> Self

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Config for Mamba2NetworkConfig

fn load_binary(data: &[u8]) -> Result<Self, ConfigError>

Loads the configuration from a binary buffer.

impl Debug for Mamba2NetworkConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'de> Deserialize<'de> for Mamba2NetworkConfig

fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
where D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl Display for Mamba2NetworkConfig

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl Serialize for Mamba2NetworkConfig

fn serialize<S>(&self, serializer: S) -> Result<S::Ok, S::Error>
where S: Serializer,

Serialize this value into the given Serde serializer.

Auto Trait Implementations§

impl Freeze for Mamba2NetworkConfig

impl RefUnwindSafe for Mamba2NetworkConfig

impl Send for Mamba2NetworkConfig

impl Sync for Mamba2NetworkConfig

impl Unpin for Mamba2NetworkConfig

impl UnsafeUnpin for Mamba2NetworkConfig

impl UnwindSafe for Mamba2NetworkConfig

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T> ToString for T
where T: Display + ?Sized,

fn to_string(&self) -> String

Converts the given value to a String.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,