Group→head expansion of the B/C projections, following the sharing scheme of Grouped-Query Attention (GQA).
Mamba-2 and Mamba-3 produce B and C projections per-group (size ngroups),
but the chunkwise SSD algorithms consume them per-head (size nheads).
This helper bridges the two by replicating each group’s vector across the
heads_per_group = nheads / ngroups heads of that group.
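As a rough illustration of the mapping described above, the sketch below computes which group a given head reads from. The name `group_of_head` is hypothetical and not part of this crate, and the sketch assumes that the heads of one group are contiguous (head `h` belongs to group `h / heads_per_group`); the actual helper may lay heads out differently.

```rust
/// Hypothetical helper (not part of this crate): the group whose B/C vector
/// head `h` shares, assuming the heads of each group are stored contiguously.
fn group_of_head(h: usize, nheads: usize, ngroups: usize) -> usize {
    // The expansion is only well defined when every group owns the same
    // number of heads.
    assert!(ngroups > 0 && nheads % ngroups == 0, "nheads must be a multiple of ngroups");
    assert!(h < nheads, "head index out of range");
    let heads_per_group = nheads / ngroups;
    h / heads_per_group
}
```

With `nheads = 8` and `ngroups = 2`, heads 0 to 3 map to group 0 and heads 4 to 7 map to group 1 under this assumption.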
Functions
- gqa_expand_to_heads - Expand a tensor’s ngroups dim at group_dim into an nheads dim, by replicating each group’s slice across heads_per_group = nheads / ngroups heads of that group.
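To make the replication concrete, here is a minimal sketch on a flat, row-major `f32` buffer with an explicit shape. It is not the signature or implementation of `gqa_expand_to_heads`, which presumably operates on the crate's own tensor type; the function name `expand_groups_to_heads`, its parameters, and the contiguous head ordering are all assumptions made for illustration.

```rust
/// Illustrative only: expands the dimension `group_dim` of a row-major tensor
/// from `ngroups` entries to `nheads` entries by repeating each group's slice
/// `heads_per_group` times in a row. Returns the new buffer and its shape.
fn expand_groups_to_heads(
    data: &[f32],
    shape: &[usize],
    group_dim: usize,
    nheads: usize,
) -> (Vec<f32>, Vec<usize>) {
    assert!(group_dim < shape.len(), "group_dim out of range");
    let ngroups = shape[group_dim];
    assert!(ngroups > 0 && nheads % ngroups == 0, "nheads must be a multiple of ngroups");
    assert_eq!(data.len(), shape.iter().product::<usize>(), "shape does not match data");

    let heads_per_group = nheads / ngroups;
    // Number of elements before and after the group dimension.
    let outer: usize = shape[..group_dim].iter().product();
    let inner: usize = shape[group_dim + 1..].iter().product();

    let mut out = Vec::with_capacity(outer * nheads * inner);
    for o in 0..outer {
        for g in 0..ngroups {
            // The contiguous slice that belongs to group `g` at outer index `o`.
            let start = (o * ngroups + g) * inner;
            let slice = &data[start..start + inner];
            // Every head of this group receives an identical copy.
            for _ in 0..heads_per_group {
                out.extend_from_slice(slice);
            }
        }
    }

    let mut out_shape = shape.to_vec();
    out_shape[group_dim] = nheads;
    (out, out_shape)
}
```

For example, a `[1, 2, 2]` input (one batch entry, two groups, state size two) expanded along dimension 1 to `nheads = 4` yields a `[1, 4, 2]` output in which each group's two values appear twice in a row. A tensor library would typically express the same thing with a reshape and broadcast rather than an explicit copy loop.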