burn_mamba::modules::bidi

Struct BidiLayers

pub struct BidiLayers<M: Module> {
    pub n_real_layers: usize,
    pub n_virtual_layers: Option<(usize, BidiSchedule)>,
    pub real_layers: Vec<Layer<M>>,
    pub ignore_first_residual: bool,
    pub ignore_last_residual: bool,
    pub outputs_merge: Vec<OutputMerge>,
    pub residuals: Residuals,
    pub class_latents: Vec<ClassLatent>,
    pub class_latents_emb: Option<Param<Tensor<2>>>,
}

A stack of bidirectional Layer pairs with optional virtual-layer scheduling. A single struct serves every Mamba-x family.

Fields§

§n_real_layers: usize

Number of real (weight-bearing) layers; must be even (used in pairs).

§n_virtual_layers: Option<(usize, BidiSchedule)>

Optional (n_virtual_layers, schedule) tuple for weight sharing: the number of virtual layers and the schedule that assigns them to the real layers.
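To make the idea of weight sharing concrete, here is a minimal sketch of one possible mapping from virtual pairs to real pairs. It is purely illustrative: `cyclic_real_pair` and `cyclic_schedule` are hypothetical names, the cyclic rule is an assumption, and the actual `BidiSchedule` variants are not shown on this page and may behave differently.

```rust
/// Hypothetical illustration only: not part of the crate. The real
/// `BidiSchedule` may map virtual pairs onto real pairs differently.
/// Maps a virtual pair index onto one of the real (weight-bearing) pairs
/// by cycling through the real pairs in order.
fn cyclic_real_pair(virtual_pair: usize, n_real_layers: usize) -> usize {
    assert!(
        n_real_layers >= 2 && n_real_layers % 2 == 0,
        "n_real_layers must be a positive even number"
    );
    let n_real_pairs = n_real_layers / 2;
    virtual_pair % n_real_pairs
}

/// Lists which real pair each virtual pair would reuse under the cyclic rule.
/// Assumes virtual layers are also consumed in pairs.
fn cyclic_schedule(n_virtual_layers: usize, n_real_layers: usize) -> Vec<usize> {
    assert!(n_virtual_layers % 2 == 0, "n_virtual_layers must be even");
    (0..n_virtual_layers / 2)
        .map(|v| cyclic_real_pair(v, n_real_layers))
        .collect()
}
```

Under this assumed rule, 8 virtual layers over 4 real layers give 4 virtual pairs that alternate between the 2 real pairs, so each set of weights is applied twice.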

§real_layers: Vec<Layer<M>>

The weight-bearing layers, length n_real_layers.

§ignore_first_residual: bool

Zero the first virtual pair’s residual when true.

§ignore_last_residual: bool

Zero the last virtual pair’s residual when true.

§outputs_merge: Vec<OutputMerge>

One direction-merge per pair, length n_real_layers / 2.
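The length relationships stated in the field docs (an even `n_real_layers`, `real_layers` of length `n_real_layers`, and `outputs_merge` of length `n_real_layers / 2`) can be summarised in a small checker. This is a sketch, not a crate API: `check_stack_lengths` is a hypothetical helper that takes the lengths as plain integers.

```rust
/// Hypothetical checker (not a crate API) for the length invariants stated
/// in the field docs. Returns the number of pairs on success.
fn check_stack_lengths(
    n_real_layers: usize,
    real_layers_len: usize,
    outputs_merge_len: usize,
) -> Result<usize, String> {
    // Real layers are used in pairs, so the count must be even.
    if n_real_layers % 2 != 0 {
        return Err(format!("n_real_layers = {} must be even", n_real_layers));
    }
    // `real_layers` holds exactly one entry per real layer.
    if real_layers_len != n_real_layers {
        return Err(format!(
            "real_layers has {} entries, expected {}",
            real_layers_len, n_real_layers
        ));
    }
    // `outputs_merge` holds one direction-merge per pair.
    let n_pairs = n_real_layers / 2;
    if outputs_merge_len != n_pairs {
        return Err(format!(
            "outputs_merge has {} entries, expected {}",
            outputs_merge_len, n_pairs
        ));
    }
    Ok(n_pairs)
}
```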

§residuals: Residuals

How residuals are threaded between pairs: plain additive or Multi-Gate. The Multi-Gate residual (MGR) unit is the pair, with one module per real/virtual pair.

§class_latents: Vec<ClassLatent>

Positions of the stack-level class latents, spliced into the sequence once before the first pair (independent of any per-pair class latents).

§class_latents_emb: Option<Param<Tensor<2>>>

The stack-level class-latent embeddings, [num_class_latents, d_model].

Implementations§

impl<M: MambaBlock + Clone> BidiLayers<M>
where M::SsdPath: Clone,

pub fn class_latent_output_indices(&self, orig_len: usize) -> Vec<usize>

Output positions of the stack-level class latents for an orig_len input.

fn insert_latents(&self, x: Tensor<3>) -> Tensor<3>

Splice this stack’s class latents into x (no-op when there are none).

fn multi_gate_streams_seed(&self, x: &Tensor<3>) -> Option<Tensor<4>>

Seed the MultiGate streams from a full-sequence input: n_stream copies of x, shaped [batch, sequence, n_stream, d_model]. Returns None for the Standard path. Panics if MultiGate is combined with stack-level class latents.
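The seeding described above can be sketched with nested `Vec`s standing in for tensors. Everything here is an assumption made for illustration: `seed_streams` and `maybe_seed` are hypothetical names, and the `Option<usize>` argument is a stand-in for "MultiGate with `n_stream` streams" versus the Standard path.

```rust
/// Toy stand-in for a [batch, sequence, n_stream, d_model] tensor.
type Streams4 = Vec<Vec<Vec<Vec<f32>>>>;

/// Copies every token `n_stream` times, turning a toy
/// [batch][sequence][d_model] input into [batch][sequence][n_stream][d_model].
fn seed_streams(x: &[Vec<Vec<f32>>], n_stream: usize) -> Streams4 {
    x.iter()
        .map(|sequence| {
            sequence
                .iter()
                .map(|token| vec![token.clone(); n_stream])
                .collect()
        })
        .collect()
}

/// Hypothetical wrapper mirroring the documented behaviour: `None` on the
/// Standard path, a panic when MultiGate meets stack-level class latents.
fn maybe_seed(
    x: &[Vec<Vec<f32>>],
    multi_gate_n_stream: Option<usize>,
    num_class_latents: usize,
) -> Option<Streams4> {
    let n_stream = multi_gate_n_stream?; // Standard path: nothing to seed.
    assert!(
        num_class_latents == 0,
        "MultiGate cannot be combined with stack-level class latents"
    );
    Some(seed_streams(x, n_stream))
}
```

Every stream starts as an exact copy of the input token, so any pooling whose weights sum to one returns the original input before the first pair runs.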

pub fn forward( &self, x: Tensor<3>, caches: Option<M::Caches>, ssd_path: M::SsdPath, ) -> (Tensor<3>, M::Caches)

[batch, sequence, d_model] → [batch, sequence, d_model], where the output sequence is longer than the input by the number of stack-level class latents.
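The shape contract can be written out as simple arithmetic. This is a sketch under the assumption that only the sequence dimension changes; `forward_output_shape` is a hypothetical helper, not something the crate provides.

```rust
/// Hypothetical helper: computes the output shape of `forward` from the
/// input shape and the number of stack-level class latents, assuming only
/// the sequence dimension grows.
fn forward_output_shape(input_shape: [usize; 3], num_class_latents: usize) -> [usize; 3] {
    let [batch, sequence, d_model] = input_shape;
    [batch, sequence + num_class_latents, d_model]
}
```

For example, a [2, 16, 64] input with 3 stack-level class latents would produce a [2, 19, 64] output, and the shape is unchanged when there are no class latents.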

Each pair returns its merged transform F_l, without any residual. With Residuals::Standard, the input skip is added per pair unless it is suppressed (see ignore_first_residual and ignore_last_residual). With Residuals::MultiGate, the skip is dropped and n_stream parallel streams, seeded from x, carry the residual between pairs: each pair takes the attention-pooled aggregate of the streams as its input, and its merged output is gated back into every stream (see MultiGate).
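The two residual paths can be contrasted on a single token vector. The sketch below is a simplification and not the crate's implementation: `double` stands in for a pair's merged transform F_l, and `pool_logits` and `gates` are hypothetical fixed per-stream parameters, whereas the real MultiGate module presumably computes its attention weights and gates from learned parameters and the data.

```rust
/// Toy per-pair transform F_l: doubles every element.
fn double(x: &[f32]) -> Vec<f32> {
    x.iter().map(|v| 2.0 * v).collect()
}

/// Standard path (sketch): h <- h + F_l(h), with the skip dropped for the
/// first or last pair when the corresponding flag is set.
fn standard_stack(
    x: &[f32],
    pairs: &[fn(&[f32]) -> Vec<f32>],
    ignore_first_residual: bool,
    ignore_last_residual: bool,
) -> Vec<f32> {
    let last = pairs.len().saturating_sub(1);
    let mut h = x.to_vec();
    for (i, pair) in pairs.iter().enumerate() {
        let f = pair(&h);
        let suppressed =
            (i == 0 && ignore_first_residual) || (i == last && ignore_last_residual);
        h = if suppressed {
            f // residual zeroed: keep only the pair's transform
        } else {
            h.iter().zip(&f).map(|(a, b)| a + b).collect()
        };
    }
    h
}

/// Numerically stable softmax over a small slice of logits.
fn softmax(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|l| (l - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|e| e / sum).collect()
}

/// MultiGate path (sketch): pool the streams with softmax weights, run the
/// pair on the pooled input, then gate its output back into every stream.
/// `pool_logits` and `gates` are hypothetical per-stream parameters.
fn multi_gate_step(
    streams: &mut [Vec<f32>],
    pool_logits: &[f32],
    gates: &[f32],
    pair: fn(&[f32]) -> Vec<f32>,
) {
    let weights = softmax(pool_logits);
    let d_model = streams[0].len();
    // Weighted sum of the streams: the pair's input.
    let mut pooled = vec![0.0f32; d_model];
    for (stream, w) in streams.iter().zip(&weights) {
        for (p, s) in pooled.iter_mut().zip(stream) {
            *p += w * s;
        }
    }
    let f = pair(&pooled);
    // Each stream receives the pair output scaled by its own gate.
    for (stream, g) in streams.iter_mut().zip(gates) {
        for (s, o) in stream.iter_mut().zip(&f) {
            *s += g * o;
        }
    }
}
```

With two `double` pairs, `standard_stack` turns x into 9x when no residual is suppressed and into 6x when only the first one is. In `multi_gate_step`, two streams seeded with x, equal logits, and gates of 1.0 and 0.5 end up at 3x and 2x, which shows how the streams diverge even though they start identical.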

Trait Implementations§

impl<M> AutodiffModule for BidiLayers<M>
where M: AutodiffModule + ModuleDisplay + Module,

fn valid(&self) -> Self

Returns the same module, but on the inner backend without auto-differentiation.

fn from_inner(module: Self) -> Self

Wraps an inner module back into an auto-diff module.

impl<M> Clone for BidiLayers<M>
where M: Module + ModuleDisplay + Module,

fn clone(&self) -> Self

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl<M: Debug + Module> Debug for BidiLayers<M>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<M> Display for BidiLayers<M>
where M: Module + ModuleDisplay + Module,

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<M> Module for BidiLayers<M>
where M: Module + ModuleDisplay + Module,

type Record = BidiLayersRecord<M>

Type to save and load the module.

fn load_record(self, record: Self::Record) -> Self

Load the module state from a record.

fn into_record(self) -> Self::Record

Convert the module into a record containing the state.

fn num_params(&self) -> usize

Get the number of parameters the module has, including all of its sub-modules.

fn visit<Visitor: ModuleVisitor>(&self, visitor: &mut Visitor)

Visit each tensor parameter in the module with a visitor.

fn map<Mapper: ModuleMapper>(self, mapper: &mut Mapper) -> Self

Map each tensor parameter in the module with a mapper.

fn collect_devices(&self, devices: Devices) -> Devices

Return all the devices found in the underneath module tree added to the given vector without duplicates.

fn to_device(self, device: &Device) -> Self

Move the module and all of its sub-modules to the given device.

fn fork(self, device: &Device) -> Self

Fork the module and all of its sub-modules to the given device.

fn devices(&self) -> Vec<Device>

Return all the devices found in the underneath module tree without duplicates.

fn no_grad(self) -> Self

Each tensor in the module tree will not require grad.

fn train(self) -> Self
where Self: AutodiffModule,

Move the module and all of its sub-modules to the autodiff backend.

fn quantize_weights(self, quantizer: &mut Quantizer) -> Self

Quantize the weights of the module.

impl<M> ModuleDisplay for BidiLayers<M>
where M: Module + ModuleDisplay + Module,

fn format(&self, passed_settings: DisplaySettings) -> String

Formats the module with provided display settings.

fn custom_settings(&self) -> Option<DisplaySettings>

Custom display settings for the module.

fn custom_content(&self, _content: Content) -> Option<Content>

Custom attributes for the module.

impl<M> ModuleDisplayDefault for BidiLayers<M>
where M: Module + ModuleDisplay + Module,

fn content(&self, content: Content) -> Option<Content>

Attributes of the module used for display purposes.

fn num_params(&self) -> usize

Gets the number of the parameters of the module.

Auto Trait Implementations§

impl<M> !Freeze for BidiLayers<M>

impl<M> !RefUnwindSafe for BidiLayers<M>

impl<M> !UnwindSafe for BidiLayers<M>

impl<M> Send for BidiLayers<M>

impl<M> Sync for BidiLayers<M>
where M: Sync,

impl<M> Unpin for BidiLayers<M>
where M: Unpin,

impl<M> UnsafeUnpin for BidiLayers<M>

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)

Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T> ToString for T
where T: Display + ?Sized,

fn to_string(&self) -> String

Converts the given value to a String.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.