Module gqa


Group→head expansion of the B and C projections, using Grouped-Query Attention (GQA) style sharing.

Mamba-2 and Mamba-3 produce the B and C projections per group (ngroups of them), but the chunkwise SSD algorithms consume them per head (nheads of them). This helper bridges the two by replicating each group’s vector across the heads_per_group = nheads / ngroups heads that belong to that group.
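
The replication described above can be sketched as follows. This is a minimal illustration, not this module's actual API: the function name expand_groups_to_heads, the flat row-major [ngroups, d_state] layout, the f32 element type, and the d_state parameter are all assumptions made for the example. It also assumes that heads are assigned to groups contiguously (head h belongs to group h / heads_per_group) and that nheads is a multiple of ngroups.

```rust
/// Hypothetical helper (not the module's real signature): expands a
/// per-group buffer of shape [ngroups, d_state] into a per-head buffer of
/// shape [nheads, d_state] by copying each group's vector to every head
/// in that group.
pub fn expand_groups_to_heads(
    grouped: &[f32],
    ngroups: usize,
    nheads: usize,
    d_state: usize,
) -> Vec<f32> {
    // Assumption: heads divide evenly among groups.
    assert!(
        ngroups > 0 && nheads % ngroups == 0,
        "nheads must be a multiple of ngroups"
    );
    assert_eq!(
        grouped.len(),
        ngroups * d_state,
        "expected ngroups * d_state values"
    );

    let heads_per_group = nheads / ngroups;
    let mut expanded = Vec::with_capacity(nheads * d_state);

    for head in 0..nheads {
        // Assumption: contiguous mapping, so heads 0..heads_per_group
        // share group 0, the next heads_per_group share group 1, and so on.
        let group = head / heads_per_group;
        let start = group * d_state;
        expanded.extend_from_slice(&grouped[start..start + d_state]);
    }

    expanded
}
```

Under these assumptions, with ngroups = 2, nheads = 4, and d_state = 2, the input [1, 2, 3, 4] expands to [1, 2, 1, 2, 3, 4, 3, 4]: heads 0 and 1 both receive group 0's vector, and heads 2 and 3 both receive group 1's. A real implementation would likely work on batched tensors and might avoid the copy by broadcasting, which this sketch does not attempt.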