pub struct LatentNetwork<M: Module> {
    pub in_proj: Linear,
    pub layers: Layers<M>,
    pub out_proj: Linear,
    pub class_tokens: Vec<ClassToken>,
    pub class_tokens_emb: Option<Param<Tensor<2>>>,
}
A feature/regression network on latents:
in_proj (input_size → d_model) → Layers<M> → out_proj (d_model → output_size).
Fields§
in_proj: Linear
Linear projection input_size → d_model.
layers: Layers<M>
The shared Mamba-x layer stack.
out_proj: Linear
Linear projection d_model → output_size.
class_tokens: Vec<ClassToken>
Positions of the network’s class tokens, spliced into the input sequence (at input_size width) before in_proj. Empty ⇒ none.
class_tokens_emb: Option<Param<Tensor<2>>>
The class-token embeddings, [num_class_tokens, input_size].
Implementations§
impl<M: MambaBlock> LatentNetwork<M>
pub fn class_token_output_indices(&self, orig_len: usize) -> Vec<usize>
Returns the output-sequence positions of the class tokens for an input of length orig_len.
fn insert_tokens(&self, x: Tensor<3>) -> Tensor<3>
Splice this network’s class latents into x (no-op when there are none).
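The splice and the matching index lookup can be illustrated with a small standalone sketch. Everything in it is an assumption made for illustration: `ToyPos`, `toy_insert_tokens` and `toy_output_indices` are hypothetical names, the real `ClassToken` type and its placement rules are not shown on this page, and the batch dimension is omitted.

```rust
/// Hypothetical placement marker; the real `ClassToken` type is not shown here.
#[derive(Clone, Copy)]
enum ToyPos {
    Start, // placed before the first user token
    End,   // placed after the last user token
}

/// Toy analogue of `class_token_output_indices`: where each class token ends
/// up once it has been spliced into a sequence of `orig_len` user tokens.
fn toy_output_indices(tokens: &[ToyPos], orig_len: usize) -> Vec<usize> {
    let n_start = tokens.iter().filter(|t| matches!(t, ToyPos::Start)).count();
    let mut next_start = 0;
    let mut next_end = n_start + orig_len;
    tokens
        .iter()
        .map(|t| {
            // Pick the running counter for this token's region and advance it.
            let slot = match t {
                ToyPos::Start => &mut next_start,
                ToyPos::End => &mut next_end,
            };
            let idx = *slot;
            *slot += 1;
            idx
        })
        .collect()
}

/// Toy analogue of `insert_tokens` for one sequence of `input_size`-wide rows.
/// `emb[k]` is the embedding of `tokens[k]`.
fn toy_insert_tokens(tokens: &[ToyPos], emb: &[Vec<f32>], x: &[Vec<f32>]) -> Vec<Vec<f32>> {
    if tokens.is_empty() {
        return x.to_vec(); // no class tokens: no-op
    }
    let mut out = Vec::with_capacity(x.len() + tokens.len());
    for (t, e) in tokens.iter().zip(emb) {
        if matches!(t, ToyPos::Start) {
            out.push(e.clone());
        }
    }
    out.extend(x.iter().cloned());
    for (t, e) in tokens.iter().zip(emb) {
        if matches!(t, ToyPos::End) {
            out.push(e.clone());
        }
    }
    out
}
```

In this toy, the row found at `toy_output_indices(..)[k]` of the spliced sequence is the embedding of class token `k`, which is the relationship the two methods above are expected to maintain.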
pub fn forward(
    &self,
    x: Tensor<3>,
    caches: Option<M::Caches>,
    ssd_path: M::SsdPath,
) -> (Tensor<3>, M::Caches)
Runs in_proj → layers → out_proj over a full sequence:
[batch, sequence, input_size] → [batch, sequence (+ class tokens), output_size].
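The shape flow of the full-sequence pass can be traced with a minimal sketch on nested `Vec`s. This is not the crate's implementation: `ToyLinear`, `toy_layers` and `ToyNetwork` are hypothetical stand-ins, the layer stack is replaced by a simple causal recurrence, and caches, `ssd_path` and class tokens are left out.

```rust
/// Toy dense layer; weights are stored as [out_features][in_features].
struct ToyLinear {
    w: Vec<Vec<f32>>,
}

impl ToyLinear {
    fn new(in_features: usize, out_features: usize, fill: f32) -> Self {
        Self { w: vec![vec![fill; in_features]; out_features] }
    }

    /// Maps one token of width in_features to width out_features.
    fn apply(&self, x: &[f32]) -> Vec<f32> {
        self.w
            .iter()
            .map(|row| row.iter().zip(x).map(|(a, b)| a * b).sum::<f32>())
            .collect()
    }
}

/// Stand-in for the layer stack: a causal mix h_t = 0.5 * h_{t-1} + x_t.
/// It keeps the token width (d_model) and the sequence length unchanged.
fn toy_layers(seq: &[Vec<f32>]) -> Vec<Vec<f32>> {
    let mut h = vec![0.0f32; seq.first().map_or(0, |t| t.len())];
    seq.iter()
        .map(|tok| {
            for (hi, xi) in h.iter_mut().zip(tok) {
                *hi = 0.5 * *hi + xi;
            }
            h.clone()
        })
        .collect()
}

struct ToyNetwork {
    in_proj: ToyLinear,  // input_size -> d_model
    out_proj: ToyLinear, // d_model -> output_size
}

impl ToyNetwork {
    /// [batch][sequence][input_size] -> [batch][sequence][output_size]
    fn forward(&self, x: &[Vec<Vec<f32>>]) -> Vec<Vec<Vec<f32>>> {
        x.iter()
            .map(|seq| {
                let hidden: Vec<Vec<f32>> =
                    seq.iter().map(|t| self.in_proj.apply(t)).collect();
                toy_layers(&hidden)
                    .iter()
                    .map(|t| self.out_proj.apply(t))
                    .collect()
            })
            .collect()
    }
}
```

With input_size = 4, d_model = 8 and output_size = 5, a `[2][3][4]` input comes back as `[2][3][5]`: only the last dimension changes, since this toy splices in no class tokens.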
pub fn step(
    &self,
    x: Tensor<2>,
    caches: Option<M::Caches>,
    own_index: Option<&mut usize>,
    layers_own_index: Option<&mut usize>,
    layer_indices: Option<&mut Vec<usize>>,
) -> (Tensor<2>, M::Caches)
Single-token step ([batch, input_size] → [batch, output_size]).
Three independent class cursors:
- own_index: the network’s own Self::class_tokens (spliced before in_proj). When it lands on a class-token position, those tokens are stepped first (each a full network pass, advancing own_index), then the user token; only the user token’s output is returned.
- layers_own_index / layer_indices: forwarded straight to the inner Layers::step (stack-level latents, and the per-virtual-layer cursor vector respectively).
As in forward, the network’s class tokens are part of the sequence that
enters the layers, so each one is threaded through the layers (carrying the
inner cursors) just like the user token. A None cursor skips class-token
handling at that level. Middle/End markers panic at any level that has a
cursor; use forward for those.
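The behaviour of the own_index cursor can be sketched on a scalar toy model. All names here (`ToyNet`, `raw_step`, `stream`) are hypothetical, the state is a single `f32` rather than `M::Caches`, and class-token positions are assumed to be plain indices into the spliced sequence, which this page does not confirm.

```rust
/// Scalar toy network with state h: h' = a * h + x, output y = 2 * h'.
struct ToyNet {
    a: f32,
    class_pos: Vec<usize>, // assumed: indices in the spliced sequence
    class_emb: Vec<f32>,   // class_emb[k] is injected at class_pos[k]
}

impl ToyNet {
    /// One plain pass for one token; returns (output, new state).
    fn raw_step(&self, x: f32, h: f32) -> (f32, f32) {
        let h = self.a * h + x;
        (2.0 * h, h)
    }

    /// Cursor-aware step: class tokens at the cursor are stepped first and
    /// their outputs dropped; only the user token's output is returned.
    fn step(&self, x: f32, mut h: f32, own_index: Option<&mut usize>) -> (f32, f32) {
        if let Some(idx) = own_index {
            loop {
                let cur = *idx;
                match self.class_pos.iter().position(|&p| p == cur) {
                    Some(k) => {
                        h = self.raw_step(self.class_emb[k], h).1;
                        *idx += 1;
                    }
                    None => break,
                }
            }
            *idx += 1; // the user token occupies the next position
        }
        self.raw_step(x, h)
    }

    /// Full-sequence reference: splice class tokens, return every output.
    fn forward(&self, xs: &[f32]) -> Vec<f32> {
        let total = xs.len() + self.class_pos.len();
        let mut user = xs.iter();
        let (mut h, mut out) = (0.0, Vec::with_capacity(total));
        for i in 0..total {
            let x = match self.class_pos.iter().position(|&p| p == i) {
                Some(k) => self.class_emb[k],
                None => *user.next().expect("ran out of user tokens"),
            };
            let (y, nh) = self.raw_step(x, h);
            h = nh;
            out.push(y);
        }
        out
    }
}

/// Streams user tokens one at a time, sharing one cursor across calls.
fn stream(net: &ToyNet, xs: &[f32]) -> (Vec<f32>, usize) {
    let (mut h, mut idx) = (0.0, 0usize);
    let mut ys = Vec::new();
    for &x in xs {
        let (y, nh) = net.step(x, h, Some(&mut idx));
        h = nh;
        ys.push(y);
    }
    (ys, idx)
}
```

In this toy, streaming with a cursor reproduces the full-sequence outputs at the user-token positions, while passing `None` skips class-token injection entirely.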
pub fn step_infinite(&self, x: Tensor<2>) -> Tensor<2>
Stationary fixed point of the network under a constant input token:
in_proj → Layers::step_infinite → out_proj, no caches.
Cursorless (class tokens are not injected).
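The idea of a stationary fixed point under a constant input can be shown on a scalar linear recurrence. This is only an analogy: `toy_step`, `toy_step_infinite` and `iterate` are hypothetical names, and the recurrence h' = a * h + b * x with |a| < 1 is an assumed stand-in for the real layer dynamics.

```rust
/// One step of the toy recurrence h' = a * h + b * x.
fn toy_step(a: f64, b: f64, x: f64, h: f64) -> f64 {
    a * h + b * x
}

/// Fixed point of the recurrence for constant x: solving h = a * h + b * x
/// gives h = b * x / (1 - a). Needs no state, matching "no caches".
fn toy_step_infinite(a: f64, b: f64, x: f64) -> f64 {
    assert!(a.abs() < 1.0, "the recurrence must be contractive");
    b * x / (1.0 - a)
}

/// Reference: feed the same token n times, starting from a zero state.
fn iterate(a: f64, b: f64, x: f64, n: usize) -> f64 {
    let mut h = 0.0;
    for _ in 0..n {
        h = toy_step(a, b, x, h);
    }
    h
}
```

Repeatedly applying `toy_step` to the same token converges to the value that `toy_step_infinite` computes directly, and stepping once more from that value leaves it unchanged.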
pub fn step_n_approx(
    &self,
    x: Tensor<2>,
    n: usize,
    caches: Option<M::Caches>,
) -> (Tensor<2>, M::Caches)
Approximates, in a single jump, the result of n consecutive cursorless
Self::step calls on the same constant token. See Layers::step_n_approx for
the approximation contract.
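The notion of jumping n steps at once can be sketched on a scalar linear recurrence, where the jump happens to be exact. The names `toy_step`, `toy_step_n` and `toy_loop` are hypothetical, the recurrence h' = a * h + b * x (with a ≠ 1) is an assumed stand-in, and the real method is only approximate under the contract documented on Layers::step_n_approx, which is not reproduced here.

```rust
/// One step of the toy recurrence h' = a * h + b * x.
fn toy_step(a: f64, b: f64, x: f64, h: f64) -> f64 {
    a * h + b * x
}

/// Closed-form jump over n identical steps, using the geometric series:
/// h_n = a^n * h_0 + b * x * (1 - a^n) / (1 - a).   Requires a != 1.
fn toy_step_n(a: f64, b: f64, x: f64, n: usize, h0: f64) -> f64 {
    let an = a.powi(n as i32);
    an * h0 + b * x * (1.0 - an) / (1.0 - a)
}

/// Reference: n explicit steps on the same constant token.
fn toy_loop(a: f64, b: f64, x: f64, n: usize, h0: f64) -> f64 {
    let mut h = h0;
    for _ in 0..n {
        h = toy_step(a, b, x, h);
    }
    h
}
```

For this toy, `toy_step_n` and `toy_loop` agree up to floating-point rounding, and n = 0 returns the starting state unchanged.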