state-space-models on Sparse Notes

Recent content in state-space-models on Sparse Notes
https://sparsenotes.com/tags/state-space-models/

Gated DeltaNet-2: Decoupling Erase and Write in Linear Attention
https://sparsenotes.com/posts/2026/05/papers/gated-deltanet-2/
Sat, 23 May 2026 00:00:00 +0000

NVIDIA's Hatamizadeh, Choi, and Kautz introduce a linear-attention layer that splits the single scalar 'delta gate' into separate channel-wise erase and write gates, cleanly recovering KDA and Gated DeltaNet as tied subspaces and beating both on long-context recall.