5. Cross-Scenario Universal Phenomena: Paper List

"When many different complex systems exhibit the same universal behavior, it strongly suggests that a description exists that is simpler than any particular model in isolation... behavior shared across many systems should depend primarily on the features common to all such systems."

A. Universal Inductive Biases & Architectural Convergences

Focuses on how disparate architectures, such as ConvNets vs. Transformers or U-Nets vs. Vision Transformers, converge to near-identical performance limits and pixel-level functional mappings under equal compute budgets.

Dosovitskiy et al. [2020] — An image is worth 16x16 words: Transformers for image recognition at scale
Liu et al. [2022] — A convnet for the 2020s
Peebles & Xie [2023] — Scalable diffusion models with transformers
Bhojanapalli et al. [2021] — Understanding robustness of visual transformers
Cordonnier et al. [2020] — On the relationship between self-attention and convolutional layers

B. Universal Statistical Structure Latent in Data

Focuses on isolating the mathematical commonalities shared across diverse data modalities, such as multiscale wavelet structure, Zipfian power laws in text, and compositional semantic hierarchies.

Zipf [1949] — Human behavior and the principle of least effort: An introduction to human ecology
Mallat [1999] — A wavelet tour of signal processing: The sparse way
Poggio et al. [2017] — Why and when can deep-but not shallow-networks avoid the curse of dimensionality: a review
Bietti & Mairal [2019] — On the inductive bias of deep convolutional networks
Michaud et al. [2023] — The quantization model of neural scaling
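
As a concrete illustration of the "Zipfian power law" mentioned above, the rank-frequency relation f(r) ∝ r^(-s) can be checked on any table of token counts. The sketch below is not taken from any of the listed papers; the function name `zipf_exponent` is hypothetical, and the log-log least-squares fit is a deliberately crude estimator chosen for brevity (a maximum-likelihood fit is generally preferred for real data).

```python
import numpy as np


def zipf_exponent(counts):
    """Estimate s in f(r) ~ r**(-s) by a least-squares fit in log-log space.

    `counts` is any 1-D collection of non-negative occurrence counts
    (for example, word frequencies). Zero counts are dropped.
    """
    # Sort frequencies in descending order so that index + 1 is the rank.
    freqs = np.sort(np.asarray(counts, dtype=float))[::-1]
    freqs = freqs[freqs > 0]
    ranks = np.arange(1, len(freqs) + 1)

    # A power law is a straight line in log-log space; its slope is -s.
    slope, _intercept = np.polyfit(np.log(ranks), np.log(freqs), 1)
    return float(-slope)


# Synthetic check: counts that follow an exact 1/r law.
synthetic_counts = 1000.0 / np.arange(1, 201)
estimated_s = zipf_exponent(synthetic_counts)
```

On the synthetic counts the fitted exponent should be very close to 1; on real text the estimate depends on tokenization and on how much of the low-frequency tail is included.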

C. The Platonic Representation Hypothesis & Alignment

Focuses on why internal representations tend to align toward a shared representation of reality across different weights, training objectives (supervised vs. self-supervised), modalities, and even biological brains.

Huh et al. [2024] — The platonic representation hypothesis
Kornblith et al. [2019] — Similarity of neural network representations revisited
Yamins et al. [2014] — Performance-optimized hierarchical models predict neural responses in higher visual cortex
Radford et al. [2021] — Learning transferable visual models from natural language supervision
Schrimpf et al. [2021] — Brain-score: Which artificial neural network is most brain-like?
Raghu et al. [2021] — Do vision transformers see like convolutional networks?
Chung et al. [2024] — Neural representation similarity metrics: A foundational guide for understanding universality classes
Bordelon et al. [2024c] — The geometry of representation alignment in deep networks
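
Representation alignment of the kind discussed in this group is often quantified with linear centered kernel alignment (CKA), the similarity index studied by Kornblith et al. [2019]. The sketch below is a minimal NumPy version of the feature-space form of linear CKA; the function name `linear_cka` and the random test matrices are illustrative assumptions, not code from any of the listed papers.

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two activation matrices.

    x: array of shape (n_examples, d1), y: array of shape (n_examples, d2).
    Rows must correspond to the same inputs in both matrices.
    """
    # Center each feature (column) so the comparison ignores mean offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)

    # ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


# Illustrative data: a rotated, rescaled copy versus an unrelated matrix.
rng = np.random.default_rng(0)
acts = rng.standard_normal((50, 8))
rotation, _ = np.linalg.qr(rng.standard_normal((8, 8)))  # orthogonal matrix
cka_same = linear_cka(acts, 2.5 * acts @ rotation)
cka_other = linear_cka(acts, rng.standard_normal((50, 8)))
```

Linear CKA is invariant to orthogonal transformations and isotropic scaling of the features, so `cka_same` should equal 1 up to floating-point error, while `cka_other` should be noticeably lower. That invariance is why the measure is used to compare layers across networks whose neurons have no natural correspondence.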
