Research

Academic or research source. Check the methodology, sample size, and whether it's been replicated.

Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models

arXiv cs.CV · Feb 27, 2026 18:32 UTC · Paper: ~15 min

2-Minute Brief

According to arXiv cs.CV: Compositional generalization, the ability to recognize familiar parts in novel contexts, is a defining property of intelligent systems. Although modern models are trained on massive datasets, they still cover only a tiny fraction of the combinatorial space of possible inputs, raising the question of what structure representations must have to support generalization to unseen combinations. We formalize three desiderata for compositional generalization under standard training (divisibility, transf...
