Module rms_norm_gated

RMS normalisation fused with a SiLU(z) gate (the Mamba-2 output norm).

norm_before_gate selects the order of the two operations:

  • true: normalise, then gate: y = (x / rms(x) · γ) · SiLU(z)
  • false: gate, then normalise: y = (u / rms(u)) · γ, where u = x · SiLU(z)
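
The two orderings can be sketched on plain `f32` slices. This is only an illustration of the formulas above, not the module's implementation: `silu`, `rms`, and `rms_norm_gated` are hypothetical free functions, `eps` is passed explicitly, and adding it under the square root is an assumption about where the epsilon enters.

```rust
/// SiLU activation: z * sigmoid(z).
fn silu(z: f32) -> f32 {
    z / (1.0 + (-z).exp())
}

/// Root mean square of a row. Assumption: `eps` is added under the square root.
fn rms(v: &[f32], eps: f32) -> f32 {
    let mean_sq = v.iter().map(|a| a * a).sum::<f32>() / v.len() as f32;
    (mean_sq + eps).sqrt()
}

/// Gated RMS norm over one row (hypothetical helper, not the crate's API).
fn rms_norm_gated(
    x: &[f32],
    z: &[f32],
    gamma: &[f32],
    eps: f32,
    norm_before_gate: bool,
) -> Vec<f32> {
    assert!(x.len() == z.len() && x.len() == gamma.len());
    if norm_before_gate {
        // y = (x / rms(x) * gamma) * SiLU(z)
        let r = rms(x, eps);
        (0..x.len())
            .map(|i| x[i] / r * gamma[i] * silu(z[i]))
            .collect()
    } else {
        // u = x * SiLU(z); y = (u / rms(u)) * gamma
        let u: Vec<f32> = x.iter().zip(z).map(|(a, b)| a * silu(*b)).collect();
        let r = rms(&u, eps);
        u.iter().zip(gamma).map(|(a, g)| a / r * g).collect()
    }
}

fn main() {
    let x: [f32; 4] = [1.0, -2.0, 3.0, 0.5];
    let z: [f32; 4] = [0.5, 1.0, -1.0, 2.0];
    let gamma: [f32; 4] = [1.0; 4];

    let norm_then_gate = rms_norm_gated(&x, &z, &gamma, 1e-6, true);
    let gate_then_norm = rms_norm_gated(&x, &z, &gamma, 1e-6, false);
    println!("norm, then gate: {:?}", norm_then_gate);
    println!("gate, then norm: {:?}", gate_then_norm);
}
```

With γ = 1, the gate-then-normalise output always has unit RMS, whereas in the normalise-then-gate order the gate rescales the output after normalisation.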

The numerical-stability epsilon is the per-dtype div_eps, so there is no configurable epsilon; the fp16 path uses the same max(|x|)-rescaling trick as RmsNorm.
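
The idea behind max(|x|) rescaling can be sketched as follows. This is a guess at the general technique, not the module's code: the function names are hypothetical, and because stable Rust has no `f16` type, the sketch uses `f32` with inputs large enough that their squares overflow. Dividing by the largest magnitude before squaring keeps every intermediate value in [0, 1].

```rust
/// Naive RMS: squares the raw inputs, so large values can overflow.
fn rms_naive(x: &[f32]) -> f32 {
    let mean_sq = x.iter().map(|a| a * a).sum::<f32>() / x.len() as f32;
    mean_sq.sqrt()
}

/// Rescaled RMS (hypothetical sketch): rms(x) = m * rms(x / m), m = max(|x|).
fn rms_rescaled(x: &[f32]) -> f32 {
    let m = x.iter().fold(0.0_f32, |acc, a| acc.max(a.abs()));
    if m == 0.0 {
        return 0.0; // all-zero input: avoid dividing by zero
    }
    // Each scaled element lies in [-1, 1], so its square cannot overflow.
    let mean_sq = x.iter().map(|a| (a / m) * (a / m)).sum::<f32>() / x.len() as f32;
    m * mean_sq.sqrt()
}

fn main() {
    // The squares of these values exceed f32::MAX (about 3.4e38).
    let x: [f32; 2] = [3.0e20, 4.0e20];
    println!("naive:    {}", rms_naive(&x));
    println!("rescaled: {}", rms_rescaled(&x));
}
```

The same reasoning applies at half precision, where the largest finite value is 65504 and overflow sets in for much smaller inputs.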

Structs

RmsNormGated
Applies gated RMS normalization to an input tensor along the last dimension.
RmsNormGatedConfig
Configuration to create an RmsNormGated layer.
RmsNormGatedRecord
The record type for the module.
RmsNormGatedRecordItem
The record item type for the module.