MLn CLUB · WEEK 6

Self-Distillation Enables Continual Learning

About this week

Why does on-policy learning dramatically reduce catastrophic forgetting? Does SDFT outperform SFT, RLHF-style pipelines, and continual pre-training? This week we read Self-Distillation Fine-Tuning (SDFT), a framework for continual learning that casts the model as both teacher and student at once — generating its own on-policy training data so it can absorb new tasks while preserving what it already knows. Join us at CASI for discussion at 8 pm, (optional) quiet reading from 7 pm.

Reading