Function apply_rope_partial 

pub fn apply_rope_partial<const D: usize>(
    x: Tensor<D>,
    angles: Tensor<D>,
    rope_dim: usize,
    rotate_pairwise: bool,
) -> Tensor<D>
Apply RoPE to only the rotation-active entries of the last dimension; the remaining entries pass through unchanged. Falls back to apply_rope when rope_dim == state_rank (full RoPE) and is the identity when rope_dim == 0 (RoPE disabled, rope_fraction = 0), in which case angles is ignored.

Pairing scheme (must match the reference kernels; see the section “Data-Dependent RoPE” in the paper, and mamba3_siso_fwd.py / mamba3_mimo_fwd.py):

  • rotate_pairwise = true (SISO, interleaved/GPT-J style): pairs are (0,1), (2,3), …. Only pairs 0..num_rope_angles are rotated; later pairs pass through. Equivalent to slicing the first rope_dim entries and rotating them.
  • rotate_pairwise = false (MIMO, half-and-half/GPT-NeoX style): the pair distance is always state_rank/2, i.e. element n is paired with element state_rank/2 + n. With partial RoPE, only the first num_rope_angles pairs are rotated; the remaining elements in both halves pass through.
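
The two pairing schemes above can be illustrated with a minimal sketch on plain slices. This is not the crate's implementation: rope_partial_sketch is a hypothetical helper, it takes one angle per rotated pair (so angles.len() stands in for num_rope_angles and rope_dim is assumed to be twice that), and it assumes the usual rotation convention (a, b) → (a·cos θ − b·sin θ, a·sin θ + b·cos θ), which may differ in sign or layout from the reference kernels.

```rust
/// Hypothetical slice-based sketch of partial RoPE; not the crate's Tensor API.
/// `x` is one vector along the last dimension (length = state_rank),
/// `angles` holds one angle per rotated pair (assumed layout).
fn rope_partial_sketch(x: &[f32], angles: &[f32], rotate_pairwise: bool) -> Vec<f32> {
    let state_rank = x.len();
    let half = state_rank / 2;
    assert!(state_rank % 2 == 0, "last dimension must be even");
    assert!(angles.len() <= half, "at most state_rank / 2 pairs can be rotated");

    // Start from a copy so every entry that is not rotated passes through.
    let mut out = x.to_vec();
    for (n, &theta) in angles.iter().enumerate() {
        // Interleaved: (0,1), (2,3), ...; half-and-half: (n, state_rank/2 + n).
        let (i, j) = if rotate_pairwise {
            (2 * n, 2 * n + 1)
        } else {
            (n, half + n)
        };
        let (sin, cos) = theta.sin_cos();
        // Assumed sign convention: counter-clockwise rotation by theta.
        out[i] = x[i] * cos - x[j] * sin;
        out[j] = x[i] * sin + x[j] * cos;
    }
    out
}
```

With state_rank = 8 and two angles, the interleaved variant touches indices 0..4 and leaves 4..8 alone, while the half-and-half variant touches indices 0, 1, 4 and 5 and leaves 2, 3, 6 and 7 alone. An empty angles slice gives the identity, and state_rank/2 angles gives full RoPE.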