pub enum MambaLatentNetConfig {
    Mamba1 {
        input_size: usize,
        n_real_layers: usize,
        n_virtual_layers: Option<(usize, Schedule)>,
        mamba_block: Mamba1Config,
        output_size: usize,
        class_tokens: Vec<ClassToken>,
        ignore_first_residual: bool,
        ignore_last_residual: bool,
        residuals: ResidualsConfig,
    },
    Mamba2 {
        input_size: usize,
        n_real_layers: usize,
        n_virtual_layers: Option<(usize, Schedule)>,
        mamba_block: Mamba2Config,
        output_size: usize,
        class_tokens: Vec<ClassToken>,
        ignore_first_residual: bool,
        ignore_last_residual: bool,
        residuals: ResidualsConfig,
    },
    Mamba3 {
        input_size: usize,
        n_real_layers: usize,
        n_virtual_layers: Option<(usize, Schedule)>,
        mamba_block: Mamba3Config,
        output_size: usize,
        class_tokens: Vec<ClassToken>,
        ignore_first_residual: bool,
        ignore_last_residual: bool,
        residuals: ResidualsConfig,
    },
}
The serializable, documentation-friendly config for MambaLatentNet. Each
variant is concrete (one per Mamba family: Mamba-1, Mamba-2, Mamba-3), so
#[derive(Config)] applies; init builds the matching network variant.
Variants
Mamba1
Build a Mamba-1 latent network.
Fields
mamba_block: Mamba1Config
    Shared block config.
class_tokens: Vec<ClassToken>
    Network-level class tokens, spliced into the input before in_proj.
ignore_first_residual: bool
    Suppress the first virtual layer’s residual (Pre-LN skip / MultiGate seed carry). See Layers.
ignore_last_residual: bool
    Suppress the last virtual layer’s residual (output is the last layer’s transform alone). See Layers.
residuals: ResidualsConfig
    Inter-layer residual scheme (plain additive vs Multi-Gate).
Mamba2
Build a Mamba-2 latent network.
Fields
mamba_block: Mamba2Config
    Shared block config.
class_tokens: Vec<ClassToken>
    Network-level class tokens, spliced into the input before in_proj.
ignore_first_residual: bool
    Suppress the first virtual layer’s residual (Pre-LN skip / MultiGate seed carry). See Layers.
ignore_last_residual: bool
    Suppress the last virtual layer’s residual (output is the last layer’s transform alone). See Layers.
residuals: ResidualsConfig
    Inter-layer residual scheme (plain additive vs Multi-Gate).
Mamba3
Build a Mamba-3 latent network.
Fields
mamba_block: Mamba3Config
    Shared block config.
class_tokens: Vec<ClassToken>
    Network-level class tokens, spliced into the input before in_proj.
ignore_first_residual: bool
    Suppress the first virtual layer’s residual (Pre-LN skip / MultiGate seed carry). See Layers.
ignore_last_residual: bool
    Suppress the last virtual layer’s residual (output is the last layer’s transform alone). See Layers.
residuals: ResidualsConfig
    Inter-layer residual scheme (plain additive vs Multi-Gate).
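The effect of ignore_first_residual and ignore_last_residual can be illustrated with a toy model. The sketch below is not the crate's implementation: it assumes the plain additive scheme only (h + f(h)), uses scalar activations, and the names Layer, forward, double and add_one are hypothetical. Pre-LN and Multi-Gate behaviour are not modelled.

```rust
// Hypothetical stand-in: a "layer" is just a function on a scalar activation.
type Layer = fn(f64) -> f64;

/// Run `layers` in order with a plain additive residual (`h + f(h)`),
/// optionally dropping the residual on the first and/or last layer.
/// This is an assumption-laden sketch, not the library's forward pass.
fn forward(
    layers: &[Layer],
    x: f64,
    ignore_first_residual: bool,
    ignore_last_residual: bool,
) -> f64 {
    let last = layers.len().saturating_sub(1);
    let mut h = x;
    for (i, layer) in layers.iter().enumerate() {
        // The residual is suppressed only at the flagged boundary layers.
        let suppress =
            (i == 0 && ignore_first_residual) || (i == last && ignore_last_residual);
        let out = layer(h);
        h = if suppress { out } else { h + out };
    }
    h
}

// Two toy layer transforms used for illustration.
fn double(x: f64) -> f64 {
    2.0 * x
}

fn add_one(x: f64) -> f64 {
    x + 1.0
}
```

With ignore_last_residual set, the result is exactly the last layer's transform of its input, which matches the field description above; with both flags clear, every layer adds its output to the running activation.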
Implementations
impl MambaLatentNetConfig
pub fn init(&self, device: &Device) -> MambaLatentNet
Allocate and initialise the selected network on device.
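The dispatch pattern described here (a per-family config enum whose init builds the matching network variant) might look roughly like the following. Everything in this sketch is a hypothetical stand-in: SketchConfig, SketchNet, Block1Config, Block2Config, Block1, Block2 and the width field are invented for illustration, the device argument is omitted, and only two families are shown. It also assumes the shared block config is instantiated once per real layer, which the documentation above does not state explicitly.

```rust
// Hypothetical stand-ins for the per-family block configs and blocks.
#[derive(Clone, Debug, PartialEq)]
pub struct Block1Config {
    pub width: usize,
}

#[derive(Clone, Debug, PartialEq)]
pub struct Block2Config {
    pub width: usize,
}

#[derive(Debug, PartialEq)]
pub struct Block1 {
    pub width: usize,
}

#[derive(Debug, PartialEq)]
pub struct Block2 {
    pub width: usize,
}

/// Config enum: one concrete variant per family.
pub enum SketchConfig {
    Mamba1 { n_real_layers: usize, mamba_block: Block1Config },
    Mamba2 { n_real_layers: usize, mamba_block: Block2Config },
}

/// Network enum mirroring the config variants.
#[derive(Debug, PartialEq)]
pub enum SketchNet {
    Mamba1(Vec<Block1>),
    Mamba2(Vec<Block2>),
}

impl SketchConfig {
    /// Build the network variant that matches the selected config variant.
    pub fn init(&self) -> SketchNet {
        match self {
            SketchConfig::Mamba1 { n_real_layers, mamba_block } => SketchNet::Mamba1(
                (0..*n_real_layers)
                    .map(|_| Block1 { width: mamba_block.width })
                    .collect(),
            ),
            SketchConfig::Mamba2 { n_real_layers, mamba_block } => SketchNet::Mamba2(
                (0..*n_real_layers)
                    .map(|_| Block2 { width: mamba_block.width })
                    .collect(),
            ),
        }
    }
}
```

Keeping each variant concrete means the match in init is exhaustive and the compiler flags any family that is added to the config without a corresponding network arm.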