5. Cross-Scenario Universal Phenomena: Paper List
"When many different complex systems exhibit the same universal behavior, it strongly suggests that a description exists that is simpler than any particular model in isolation... behavior shared across many systems should depend primarily on the features common to all such systems."
A. Universal Inductive Biases & Architectural Convergences
Focuses on how disparate network architectures (such as ConvNets vs. Transformers, or UNet vs. Vision Transformers) converge to near-identical performance limits and pixel-level input-output mappings when given equal compute budgets.
- Dosovitskiy et al. [2020] — An image is worth 16x16 words: Transformers for image recognition at scale
- Liu et al. [2022] — A convnet for the 2020s
- Peebles & Xie [2023] — Scalable diffusion models with transformers
- Bhojanapalli et al. [2021] — Understanding robustness of transformers for image classification
- Cordonnier et al. [2020] — On the relationship between self-attention and convolutional layers
B. Universal Statistical Structure Latent in Data
Focuses on isolating the mathematical structure shared across diverse data modalities, such as multiscale (wavelet) structure in signals, Zipfian power laws in text, and compositional semantic hierarchies.
- Zipf [1949] — Human behavior and the principle of least effort: An introduction to human ecology
- Mallat [1999] — A wavelet tour of signal processing: The sparse way
- Poggio et al. [2017] — Why and when can deep-but not shallow-networks avoid the curse of dimensionality: a review
- Bietti & Mairal [2019] — On the inductive bias of deep convolutional networks
- Michaud et al. [2023] — The quantization model of neural scaling
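As a small illustration of the Zipfian power laws mentioned in this subsection, the sketch below estimates the exponent s in the relation frequency ∝ rank^(-s) by ordinary least squares in log-log space. The function name `zipf_exponent` and the synthetic counts are assumptions made for this example and are not taken from any of the listed papers. A log-log fit is only a rough estimator; maximum-likelihood fitting is generally preferred for real data.

```python
import math

def zipf_exponent(counts):
    """Estimate s in frequency ~ rank**(-s) via least squares on log-log data."""
    # Rank items by descending frequency, ignoring zero counts.
    freqs = sorted((c for c in counts if c > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Slope of the regression line of log-frequency on log-rank.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return -cov_xy / var_x

# Hypothetical counts that follow an exact power law with exponent 1.
synthetic_counts = [1000.0 / rank for rank in range(1, 201)]
estimated_s = zipf_exponent(synthetic_counts)
```

On real word counts the fitted exponent depends on the rank range included in the fit, so the result should be read as a descriptive summary and not as a test of the power-law hypothesis.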
C. The Platonic Representation Hypothesis & Alignment
Focuses on why the internal representations of networks with different weights, objectives (supervised vs. self-supervised), and modalities, and even those of biological brains, tend to align toward a shared representation of reality.
- Huh et al. [2024] — The platonic representation hypothesis
- Kornblith et al. [2019] — Similarity of neural network representations revisited
- Yamins et al. [2014] — Performance-optimized hierarchical models predict neural responses in higher visual cortex
- Radford et al. [2021] — Learning transferable visual models from natural language supervision
- Schrimpf et al. [2021] — Brain-score: Which artificial neural network for object recognition is most brain-like?
- Raghu et al. [2021] — Do vision transformers see like convolutional networks?
- Chung et al. [2024] — Neural representation similarity metrics: A foundational guide for understanding universality classes
- Bordelon et al. [2024c] — The geometry of representation alignment in deep networks
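Representation alignment of the kind studied in this subsection is often quantified with centered kernel alignment (CKA), the similarity index introduced in Kornblith et al. [2019]. The following is a minimal sketch of the linear variant, assuming two activation matrices with one row per input example. The function name `linear_cka` and the random activations are illustrative assumptions, not code from any listed paper.

```python
import numpy as np

def linear_cka(x, y):
    """Linear CKA between activation matrices of shape (n_examples, n_features)."""
    # Center each feature so the index compares covariance structure only.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(0)
acts_a = rng.normal(size=(100, 16))           # hypothetical layer activations
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
acts_b = 3.0 * acts_a @ rotation              # same features, rotated and rescaled
acts_c = rng.normal(size=(100, 16))           # independent activations

same = linear_cka(acts_a, acts_b)             # close to 1 up to rounding
unrelated = linear_cka(acts_a, acts_c)        # noticeably lower
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of the features, which is why the rotated and rescaled copy scores approximately 1 while independent activations score lower.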