Skip to content
Mobrief
Research

Academic or research source. Check the methodology, sample size, and whether it's been replicated.

Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models

Compositional generalization, the ability to recognize familiar parts in novel contexts, is a defining property of intelligent systems. Although modern models are trained on massive datasets, they...

2-Minute Brief
  • According to arXiv cs.CV: Compositional generalization, the ability to recognize familiar parts in novel contexts, is a defining property of intelligent systems. Although modern models are trained on massive datasets, they still cover only a tiny fraction of the combinatorial space of possible inputs, raising the question of what structure representations must have to support generalization to unseen combinations. We formalize three desiderata for compositional generalization under standard training (divisibility, transf
Read Original

Compositional Generalization Requires Linear, Orthogonal Representations in Vision Embedding Models

TLDR

Compositional generalization, the ability to recognize familiar parts in novel contexts, is a defining property of intelligent systems. Although modern models are trained on massive datasets, they...

Artifacts
Paper PDF
2-Minute Brief
  • According to arXiv cs.CV: Compositional generalization, the ability to recognize familiar parts in novel contexts, is a defining property of intelligent systems. Although modern models are trained on massive datasets, they still cover only a tiny fraction of the combinatorial space of possible inputs, raising the question of what structure representations must have to support generalization to unseen combinations. We formalize three desiderata for compositional generalization under standard training (divisibility, transf
Open
O open S save B back M mode