latent-reasoning

Weekly Paper Notes: one of the top picks from the May 24–30, 2026 CS paper digest.

Area: AI / ML
Authors: Lukas Aichberger, Sepp Hochreiter (JKU Linz / NXAI)
arXiv: 2605.30343

TL;DR

Modern reasoning LLMs scale test-time compute by emitting long chains of thought, but every “thought token” is forced to round-trip through the autoregressive decoder, conflating internal computation with external communication. Reasoning in Memory (RiM) instead inserts blocks of fixed special tokens that act as scratch space for the model’s working memory....
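To make the idea of inserting blocks of fixed special tokens concrete, here is a minimal sketch of the sequence-level operation only. It is not the authors' implementation. The notes above do not say where the blocks go, how long they are, or how they are trained, so the function name `insert_memory_blocks`, the `every` spacing parameter, the placeholder token ids, and the returned memory mask are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def insert_memory_blocks(
    token_ids: Sequence[int],
    memory_block: Sequence[int],
    every: int,
) -> Tuple[List[int], List[bool]]:
    """Interleave a fixed block of special-token ids into a token sequence.

    Assumption (not from the source): one copy of `memory_block` is placed
    after every `every` ordinary tokens. The second return value marks which
    positions hold memory tokens, so downstream code could treat them as
    scratch space rather than as text to be emitted.
    """
    if every <= 0:
        raise ValueError("every must be a positive integer")

    out_ids: List[int] = []
    is_memory: List[bool] = []
    for position, token in enumerate(token_ids, start=1):
        out_ids.append(token)
        is_memory.append(False)
        # After each full group of `every` ordinary tokens, add the block.
        if position % every == 0:
            out_ids.extend(memory_block)
            is_memory.extend([True] * len(memory_block))
    return out_ids, is_memory


# Hypothetical ids: 900 and 901 stand in for two reserved special tokens.
ids, mask = insert_memory_blocks([1, 2, 3, 4, 5], memory_block=[900, 901], every=2)
```

The mask is kept separate from the ids so that a caller can decide what to do with the memory positions (for example, excluding them from the visible output), which is one plausible way to keep internal computation apart from external communication.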