training-methods on Sparse Notes

Recent content in training-methods on Sparse Notes: https://sparsenotes.com/tags/training-methods/

Pretraining Recurrent Networks without Recurrence
https://sparsenotes.com/posts/2026/06/papers/2026-06-06-smt-rnn-without-recurrence/
Sat, 06 Jun 2026 00:00:00 +0000

Supervised Memory Training (SMT) sidesteps BPTT by reducing RNN training to one-step memory-transition labels supplied by a Transformer-based predictive-state encoder, recovering O(1) gradient paths and time-parallel training.