Struct LatentNetwork

pub struct LatentNetwork<M: Module> {
    pub in_proj: Linear,
    pub layers: Layers<M>,
    pub out_proj: Linear,
    pub class_tokens: Vec<ClassToken>,
    pub class_tokens_emb: Option<Param<Tensor<2>>>,
}

A feature/regression network on latents: in_proj (input_size → d_model) → Layers<M> → out_proj (d_model → output_size).

Fields

in_proj: Linear

Linear projection input_size → d_model.

layers: Layers<M>

The shared Mamba-x layer stack.

out_proj: Linear

Linear projection d_model → output_size.

class_tokens: Vec<ClassToken>

Positions of the network’s class tokens, spliced into the input sequence (at input_size width) before in_proj. Empty ⇒ none.

class_tokens_emb: Option<Param<Tensor<2>>>

The class-token embeddings, with shape [num_class_tokens, input_size].

Implementations

impl<M: MambaBlock> LatentNetwork<M>
where M::SsdPath: Clone,

pub fn class_token_output_indices(&self, orig_len: usize) -> Vec<usize>

Positions of the class tokens in the output sequence, for an input of orig_len tokens.
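
To illustrate why the original length matters here, the following is a minimal sketch under stated assumptions, not the crate's implementation. `ToyMarker`, its three variants, the rule that `Middle` resolves to `orig_len / 2`, and the ascending order of the result are all hypothetical; the real `ClassToken` definition is not shown on this page.

```rust
/// Hypothetical marker type; the real `ClassToken` may look different.
#[derive(Clone, Copy)]
enum ToyMarker {
    /// Insert before this index of the original sequence.
    At(usize),
    /// Insert before index `orig_len / 2` (assumed rule).
    Middle,
    /// Append after the last original token.
    End,
}

/// Positions the class tokens occupy once spliced into a sequence of
/// `orig_len` user tokens, in ascending order.
fn toy_output_indices(markers: &[ToyMarker], orig_len: usize) -> Vec<usize> {
    // Resolve each marker to an insertion point in the original sequence.
    let mut points: Vec<usize> = markers
        .iter()
        .map(|m| match *m {
            ToyMarker::At(i) => i.min(orig_len),
            ToyMarker::Middle => orig_len / 2,
            ToyMarker::End => orig_len,
        })
        .collect();
    points.sort_unstable();
    // Every class token inserted earlier shifts later positions right by one.
    points.iter().enumerate().map(|(k, p)| p + k).collect()
}
```

With markers `[At(0), Middle, End]` and `orig_len = 4`, the spliced sequence has seven slots and the class tokens sit at positions 0, 3 and 6. Under this reading, `Middle` and `End` cannot be resolved without knowing the full length, which would be consistent with those markers being rejected in step mode.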

fn insert_tokens(&self, x: Tensor<3>) -> Tensor<3>

Splice this network’s class latents into x (no-op when there are none).

pub fn forward(
    &self,
    x: Tensor<3>,
    caches: Option<M::Caches>,
    ssd_path: M::SsdPath,
) -> (Tensor<3>, M::Caches)

in_proj → layers → out_proj over a full sequence ([batch, sequence, input_size] → [batch, sequence (+ class tokens), output_size]).
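
The shape contract above can be traced with a toy stand-in. This is a sketch only: `ToyLinear` and `toy_forward` are hypothetical names, the layer stack is replaced by the identity, the batch dimension is dropped, and every class token is assumed to be prepended (the real positions come from `class_tokens`).

```rust
/// Hypothetical stand-in for a linear projection (weights only, no bias).
struct ToyLinear {
    /// weight[o][i]: one row per output feature.
    weight: Vec<Vec<f32>>,
}

impl ToyLinear {
    fn apply(&self, token: &[f32]) -> Vec<f32> {
        self.weight
            .iter()
            .map(|row| row.iter().zip(token).map(|(w, x)| w * x).sum::<f32>())
            .collect()
    }
}

/// Toy forward pass over one sequence: [sequence, input_size] in,
/// [sequence + num_class_tokens, output_size] out.
fn toy_forward(
    seq: &[Vec<f32>],
    class_emb: &[Vec<f32>], // [num_class_tokens, input_size]
    in_proj: &ToyLinear,
    out_proj: &ToyLinear,
) -> Vec<Vec<f32>> {
    // 1. Splice the class tokens in at input_size width, before in_proj.
    //    Assumption: all of them go at the front.
    let spliced: Vec<&Vec<f32>> = class_emb.iter().chain(seq.iter()).collect();
    // 2. in_proj, then the layer stack (identity placeholder), then out_proj.
    spliced
        .into_iter()
        .map(|tok| out_proj.apply(&in_proj.apply(tok)))
        .collect()
}
```

Feeding four user tokens and one class token through a 2 → 3 → 1 pair of projections yields five output tokens of width one, matching the `sequence (+ class tokens)` dimension in the description.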

pub fn step(
    &self,
    x: Tensor<2>,
    caches: Option<M::Caches>,
    own_index: Option<&mut usize>,
    layers_own_index: Option<&mut usize>,
    layer_indices: Option<&mut Vec<usize>>,
) -> (Tensor<2>, M::Caches)

Single-token step ([batch, input_size] → [batch, output_size]).

Three independent class-token cursors:

own_index: tracks the network’s own Self::class_tokens (spliced before in_proj). When it lands on a class-token position, those tokens are stepped first (each a full network pass that advances own_index), then the user token.
layers_own_index and layer_indices: forwarded unchanged to the inner Layers::step (the cursor for the stack-level latents and the per-virtual-layer cursor vector, respectively).

As in forward, the network’s class tokens are part of the sequence that enters the layers, so each one is threaded through the layers (carrying the inner cursors) just like the user token. Only the user token’s output is returned. Passing None for a cursor skips class-token handling at that level. Middle/End markers panic at any level that has a cursor; use forward for those.
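
The own_index behaviour described above can be sketched with a toy model. Everything here is hypothetical: `ToyStepper`, its fields, and the running sum that stands in for the caches are illustration only, and class-token positions are given as plain indices into the spliced sequence rather than as `ClassToken` values.

```rust
/// Hypothetical toy network: a running sum stands in for the recurrent state.
struct ToyStepper {
    /// Positions (in the spliced sequence) occupied by class tokens.
    class_positions: Vec<usize>,
    /// One scalar "embedding" per class token, same order as the positions.
    class_values: Vec<f32>,
}

impl ToyStepper {
    /// One user-token step. `state` plays the role of the caches.
    fn step(&self, x: f32, state: &mut f32, own_index: Option<&mut usize>) -> f32 {
        if let Some(cursor) = own_index {
            // Step every class token scheduled at the cursor before the user token.
            while let Some(k) = self.class_positions.iter().position(|&p| p == *cursor) {
                *state += self.class_values[k]; // full pass, output discarded
                *cursor += 1;
            }
            // The user token occupies the next slot of the spliced sequence.
            *cursor += 1;
        }
        // With no cursor, class tokens are skipped entirely.
        *state += x;
        *state // only the user token's output is returned
    }
}
```

With class positions `[0, 3]` and values `[10.0, 100.0]`, stepping user tokens 1, 2, 3 from a zero state returns 11, 13 and 116 and leaves the cursor at 5, which is the length of the spliced sequence. Passing `None` instead returns the plain running sum.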

pub fn step_infinite(&self, x: Tensor<2>) -> Tensor<2>

Stationary fixed point of the network under a constant input token: in_proj → Layers::step_infinite → out_proj, no caches. Cursorless (class tokens are not injected).
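
As an illustration of what a stationary fixed point under a constant input means, consider a scalar linear recurrence h' = a·h + b·x. This is a deliberately simplified sketch, not the computation performed by Layers::step_infinite; `recur` and `stationary_state` are hypothetical names, and the closed form assumes |a| < 1.

```rust
/// One step of a scalar linear recurrence: h' = a*h + b*x.
fn recur(a: f64, b: f64, h: f64, x: f64) -> f64 {
    a * h + b * x
}

/// Closed-form stationary state under a constant input x.
/// Solving h = a*h + b*x gives h = b*x / (1 - a); this needs |a| < 1
/// so that repeated stepping actually converges to it.
fn stationary_state(a: f64, b: f64, x: f64) -> f64 {
    assert!(a.abs() < 1.0, "recurrence must be contractive");
    b * x / (1.0 - a)
}
```

For a = 0.5, b = 1 and x = 2 the stationary state is 4: applying `recur` to 4 returns 4 again, and iterating `recur` from any starting state approaches that value.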

pub fn step_n_approx(
    &self,
    x: Tensor<2>,
    n: usize,
    caches: Option<M::Caches>,
) -> (Tensor<2>, M::Caches)

Approximates, in a single jump, the result of n consecutive cursorless Self::step calls on the same constant token. See Layers::step_n_approx for the approximation contract.
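
The idea of replacing n identical steps with one jump can be shown on a scalar linear recurrence h' = a·h + b·x, where the geometric series has a closed form. This is a sketch under simplifying assumptions: `advance_once` and `jump_n` are hypothetical names, and for this linear toy the jump is exact up to rounding, whereas the contract of the real method is the one documented on Layers::step_n_approx.

```rust
/// One step of a scalar linear recurrence: h' = a*h + b*x.
fn advance_once(a: f64, b: f64, h: f64, x: f64) -> f64 {
    a * h + b * x
}

/// State after n steps on the same constant input x, computed in one shot:
/// h_n = a^n * h0 + (1 - a^n) / (1 - a) * b * x   (requires a != 1).
fn jump_n(a: f64, b: f64, h0: f64, x: f64, n: u32) -> f64 {
    assert!(a != 1.0, "closed form is undefined for a == 1");
    let a_n = a.powi(n as i32);
    a_n * h0 + (1.0 - a_n) / (1.0 - a) * b * x
}
```

Calling `jump_n` with n = 10 gives the same state as ten calls to `advance_once`, and n = 0 returns the starting state unchanged.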