pub fn apply_rope_partial<const D: usize>(
x: Tensor<D>,
angles: Tensor<D>,
rope_dim: usize,
rotate_pairwise: bool,
) -> Tensor<D>
Apply RoPE to only the rotation-active entries of the last dimension; the
remainder passes through unchanged. Falls back to apply_rope when
rope_dim == state_rank (full RoPE), and is the identity when
rope_dim == 0 (RoPE disabled, rope_fraction = 0), in which case angles is ignored.
Pairing scheme (must match the reference kernels — see Section
“Data-Dependent RoPE” in the paper, and mamba3_siso_fwd.py /
mamba3_mimo_fwd.py):
- rotate_pairwise = true (SISO, interleaved/GPT-J): pairs (0,1), (2,3), …. Only pairs 0..num_rope_angles are rotated; pairs beyond that range pass through. Equivalent to slicing the first rope_dim entries and rotating them.
- rotate_pairwise = false (MIMO, half-and-half/GPT-NeoX): the pair distance is always state_rank/2, i.e. element n is paired with element state_rank/2 + n. With partial RoPE only the first num_rope_angles pairs are rotated; the remaining elements in both halves pass through.
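The two pairing schemes can be illustrated on a single vector. The following is a minimal sketch and not this crate's implementation: rope_partial_1d is a hypothetical helper that works on plain f32 slices instead of Tensor<D>, takes state_rank as x.len() and num_rope_angles as angles.len(), and assumes the usual counter-clockwise rotation (a, b) -> (a cos θ - b sin θ, a sin θ + b cos θ). The reference kernels may use a different sign convention.

```rust
/// Hypothetical 1-D reference for partial RoPE (illustration only).
/// `x` holds one state vector; `angles` holds one angle per rotated pair.
fn rope_partial_1d(x: &[f32], angles: &[f32], rotate_pairwise: bool) -> Vec<f32> {
    let state_rank = x.len();
    let num_rope_angles = angles.len();
    assert!(state_rank % 2 == 0, "state_rank must be even");
    assert!(num_rope_angles <= state_rank / 2, "more angles than pairs");

    // Start from a copy so every entry that is not rotated passes through.
    let mut out = x.to_vec();
    for (n, &theta) in angles.iter().enumerate() {
        // Pick the two indices that form pair `n` under each scheme.
        let (i, j) = if rotate_pairwise {
            (2 * n, 2 * n + 1) // interleaved: (0,1), (2,3), ...
        } else {
            (n, state_rank / 2 + n) // half-and-half: n with state_rank/2 + n
        };
        let (sin, cos) = theta.sin_cos();
        // Assumed convention: counter-clockwise 2-D rotation by theta.
        out[i] = x[i] * cos - x[j] * sin;
        out[j] = x[i] * sin + x[j] * cos;
    }
    out
}
```

With state_rank = 8 and a single angle, rotate_pairwise = true touches only indices 0 and 1, while rotate_pairwise = false touches only indices 0 and 4. An empty angles slice returns the input unchanged, which mirrors the rope_dim == 0 case.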