LeJEPA: Provable and Scalable Self-Supervised Learning Without the Heuristics
About this week
Most self-supervised learning methods work by carefully managing instability: remove the stop-gradients, momentum encoders, or augmentation tricks, and they collapse. They deliver strong results, but without a clean understanding of why they work. LeJEPA pushes in the opposite direction: instead of stabilizing training with heuristics, it builds a system where the objective itself prevents collapse. The core idea is simple but sharp: learn representations by predicting latent structure across views, while designing the objective so that trivial solutions (e.g., constant embeddings) are provably suboptimal. This removes the need for the asymmetric updates used in methods like BYOL and for the contrastive negatives used in SimCLR.

Discussion starts at 20:00, with optional quiet reading from 19:00.
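
If you want something concrete to chew on before the discussion, here is a toy sketch of the collapse argument in the description above. It is not the objective from the paper: the per-dimension variance penalty is a simple stand-in chosen for illustration, and every name in it (`prediction_loss`, `spread_penalty`, `toy_objective`, `weight`) is hypothetical. It only shows the general shape of the idea: once the objective includes a term that rewards spread-out embeddings, a constant embedding no longer minimizes it, even though it predicts the other view perfectly.

```python
from statistics import fmean, pvariance


def prediction_loss(view_a, view_b):
    """Mean squared distance between paired embeddings of two views."""
    return fmean(
        sum((a - b) ** 2 for a, b in zip(za, zb))
        for za, zb in zip(view_a, view_b)
    )


def spread_penalty(embeddings):
    """Toy anti-collapse term (an assumption, not the paper's regularizer):
    penalize each dimension whose variance over the batch differs from 1."""
    dims = list(zip(*embeddings))  # transpose: one tuple per dimension
    return fmean((pvariance(d) - 1.0) ** 2 for d in dims)


def toy_objective(view_a, view_b, weight=1.0):
    """Cross-view prediction term plus the averaged spread penalty."""
    reg = 0.5 * (spread_penalty(view_a) + spread_penalty(view_b))
    return prediction_loss(view_a, view_b) + weight * reg


# Collapsed encoder: every input maps to the same point.
collapsed = [[0.5, 0.5]] * 4
# Spread-out encoder: both views agree, and each dimension has unit variance.
spread = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]

collapsed_loss = toy_objective(collapsed, collapsed)
spread_loss = toy_objective(spread, spread)
print(collapsed_loss, spread_loss)
```

In both cases the prediction term is zero because the two views agree exactly, so the gap between the two losses comes entirely from the spread penalty. With only the prediction term, the collapsed solution would be just as good as the spread one, which is the failure that stop-gradients and negatives are usually brought in to patch.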