pub struct Layer<M: Module> {
    pub norm: RmsNorm,
    pub mamba_block: M,
    pub class_latents: Vec<ClassLatent>,
    pub class_latents_emb: Option<Param<Tensor<2>>>,
}
A single Pre-LN block wrapper that computes M(RMSNorm(x)). The residual is
not applied here: the enclosing Layers owns
that decision (add the input back, suppress it on the first/last layer, or
thread it through Multi-Gate streams), so no input clone or zero-add is wasted
when no residual is wanted.
A Layer may carry its own ClassLatents. In step they are spliced in via the
index cursor; in forward the caller splices them first (via
Self::insert_latents) so that the residual it adds sees the same lengthened
sequence. These latents are independent of any class latents on the enclosing
Layers.
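The split of responsibilities described above can be illustrated with a minimal, self-contained sketch. It does not use this crate's types: `rms_norm`, `toy_block`, `layer_forward` and `with_residual` are hypothetical stand-ins operating on plain `f32` slices, and the RMSNorm here has no learned scale.

```rust
/// Toy stand-in for RMSNorm over one feature vector (no learned scale).
fn rms_norm(x: &[f32], eps: f32) -> Vec<f32> {
    let mean_sq = x.iter().map(|v| v * v).sum::<f32>() / x.len() as f32;
    let inv_rms = 1.0 / (mean_sq + eps).sqrt();
    x.iter().map(|v| v * inv_rms).collect()
}

/// Hypothetical inner block M: here just a fixed element-wise scaling.
fn toy_block(x: &[f32]) -> Vec<f32> {
    x.iter().map(|v| 0.5 * v).collect()
}

/// What the layer computes: M(RMSNorm(x)), with no residual added.
fn layer_forward(x: &[f32]) -> Vec<f32> {
    toy_block(&rms_norm(x, 1e-6))
}

/// What an enclosing container might do when it does want the residual.
fn with_residual(x: &[f32]) -> Vec<f32> {
    let y = layer_forward(x);
    x.iter().zip(&y).map(|(a, b)| a + b).collect()
}
```

A container that wants no residual calls `layer_forward` alone and never needs to clone the input or add zeros, which is the saving the description refers to.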
Fields
norm: RmsNorm
Pre-norm applied before the inner block.
mamba_block: M
The inner Mamba-x SSM block.
class_latents: Vec<ClassLatent>
Positions of this layer’s class latents (empty ⇒ none).
class_latents_emb: Option<Param<Tensor<2>>>
The class-latent embeddings, [num_class_latents, d_model] (None ⇒ none).
Implementations
impl<M: MambaBlock> Layer<M>
pub(crate) fn insert_latents(&self, x: Tensor<3>) -> Tensor<3>
Splice this layer’s class latents into x (no-op when there are none).
Public to the crate so Layers can lengthen the
sequence itself (and add the matching residual) before calling
Self::forward.
pub fn forward(
    &self,
    x: Tensor<3>,
    cache: Option<M::Cache>,
    ssd_path: M::SsdPath,
) -> (Tensor<3>, M::Cache)
Full-sequence Pre-LN block without the residual: M(RMSNorm(x)).
The caller owns any class-latent insertion (Self::insert_latents) and
the residual.
pub fn step(
    &self,
    x: Tensor<2>,
    cache: Option<M::Cache>,
    index: Option<&mut usize>,
) -> (Tensor<2>, M::Cache)
Single-token Pre-LN block step without the residual.
index is the running cursor into this layer’s output sequence. With
Some, whenever the cursor lands on one of this layer’s class-latent
positions, those latents are stepped first (each advancing index and
recursing with None); only the user token’s output and cache are returned.
With None, no class latents are injected, and Middle/End latents cause a
panic (their positions need the full sequence; use forward instead). The
residual is the caller’s responsibility.
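The cursor behaviour can be sketched with a toy scalar recurrence. Everything here is hypothetical: `ToyLayer`, its `latents` representation (explicit output positions paired with a token value) and the rule that the user token also advances the cursor by one are assumptions made for illustration. The Middle/End panic is not modelled.

```rust
/// Hypothetical toy layer: scalar recurrence h' = decay * h + x, output = h'.
struct ToyLayer {
    decay: f32,
    /// Assumed representation: (output-sequence position, latent token value).
    latents: Vec<(usize, f32)>,
}

impl ToyLayer {
    /// One recurrent update; returns (output, new state).
    fn raw_step(&self, x: f32, h: f32) -> (f32, f32) {
        let h_new = self.decay * h + x;
        (h_new, h_new)
    }

    /// Cursor-aware step: any latents sitting at the cursor are consumed
    /// first, then the user token. Only the user token's output is returned.
    fn step(&self, x: f32, mut h: f32, index: Option<&mut usize>) -> (f32, f32) {
        if let Some(idx) = index {
            while let Some(&(_, latent)) = self.latents.iter().find(|l| l.0 == *idx) {
                // The latent's own output is discarded; only the state is kept.
                let (_, h_new) = self.raw_step(latent, h);
                h = h_new;
                *idx += 1;
            }
            // Assumption: the user token occupies one output slot as well.
            *idx += 1;
        }
        self.raw_step(x, h)
    }
}
```

Passing `None` skips the latent handling entirely, so the same function covers both the cursor-driven and the cursorless call.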
pub fn step_infinite(&self, x: Tensor<2>) -> Tensor<2>
Stationary fixed point of the Pre-LN block under a constant token,
without the residual: the step counterpart of infinitely many
identical tokens (closed form, no cache; see
MambaBlock::block_step_infinite). Cursorless: class latents are not
injected (Middle/End latents panic, as in a None-cursor step).
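To make the idea of a stationary fixed point concrete, here is a minimal sketch on a scalar linear recurrence rather than the real block. The functions `toy_step` and `toy_step_infinite` and the parameters `a` and `b` are hypothetical; the sketch assumes |a| < 1 so that repeated steps converge.

```rust
/// Toy scalar state update: h' = a * h + b * x, assuming |a| < 1.
fn toy_step(a: f32, b: f32, x: f32, h: f32) -> f32 {
    a * h + b * x
}

/// Fixed point under a constant token x, obtained by solving
/// h = a * h + b * x for h: h* = b * x / (1 - a). No cache is needed.
fn toy_step_infinite(a: f32, b: f32, x: f32) -> f32 {
    b * x / (1.0 - a)
}

/// Reference: iterate the recurrence `n` times from state `h0`.
fn iterate(a: f32, b: f32, x: f32, n: usize, h0: f32) -> f32 {
    (0..n).fold(h0, |h, _| toy_step(a, b, x, h))
}
```

In this toy setting the closed form does not depend on the starting state, which matches the "no cache" signature of `step_infinite`.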
pub fn step_n_approx(
    &self,
    x: Tensor<2>,
    n: usize,
    cache: Option<M::Cache>,
) -> (Tensor<2>, M::Cache)
Closed-form jump equivalent to n cursorless Self::step calls on
the same constant token, without the residual (see
MambaBlock::block_step_n_approx).
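A closed-form jump of this kind can be sketched on the same sort of toy scalar recurrence. The names `toy_step` and `toy_step_n` and the parameters `a` and `b` are hypothetical, and the sketch assumes a != 1. In this linear toy case the jump is exact; the text above does not say where the approximation in the real method comes from.

```rust
/// Toy scalar state update: h' = a * h + b * x.
fn toy_step(a: f32, b: f32, x: f32, h: f32) -> f32 {
    a * h + b * x
}

/// State after `n` steps on the same token x, via the geometric series:
/// h_n = a^n * h0 + b * x * (1 - a^n) / (1 - a), assuming a != 1.
fn toy_step_n(a: f32, b: f32, x: f32, n: u32, h0: f32) -> f32 {
    let a_n = a.powi(n as i32);
    a_n * h0 + b * x * (1.0 - a_n) / (1.0 - a)
}

/// Reference: apply `toy_step` `n` times in a loop.
fn toy_step_loop(a: f32, b: f32, x: f32, n: u32, h0: f32) -> f32 {
    (0..n).fold(h0, |h, _| toy_step(a, b, x, h))
}
```

The jump costs one power and a handful of multiplications regardless of `n`, whereas the loop costs `n` updates.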