I am a senior research scientist at Google Brain. For the past decade I have been doing research at the intersection of Machine Learning and Physics. My work has focused on two lines of study: 1) modelling complex physical systems using machine learning, and 2) understanding neural networks in the highly overparameterized limit using statistical physics. In each case I have led the development of large open-source software projects that distill our results so they may be as impactful as possible.

Topic (1) began during graduate school with a series of papers using Support Vector Machines to identify structure hidden within disordered systems. The techniques we developed (published in Science [1], Nature Physics [2, 3], PNAS [4, 5], and Physical Review Letters [6]) are now essential tools for physicists studying disordered systems of all kinds and have garnered over 1,000 citations. Subsequently, upon joining Google Brain, I worked on one of the first papers proposing Graph Neural Networks as a promising model family for molecular systems [7]. This work was published at ICML and has been cited over 3,500 times. Recently, I realized that existing Physics software was a poor match for the Machine Learning techniques we were developing. I therefore spent several years leading the development of JAX MD, a molecular simulation library written in JAX. JAX MD has been used in 10 publications so far, has 770 GitHub stars, and was published as a spotlight at NeurIPS [8].

I began studying topic (2) when I arrived at Google. I helped organize an effort to use techniques from statistical physics to precisely characterize signal propagation in randomly initialized neural networks. The formalism we developed, and extended to most commonly used neural network architectures, has led to actionable initialization strategies for practitioners (published at ICLR [9, 10], ICML [11, 12, 13], and NeurIPS [14, 15]). This work paved the way for research establishing a connection between deep overparameterized neural networks and Gaussian Processes (GPs), the Neural Tangent Kernel (NTK) theory, and Tensor Programs. Altogether, the papers that I co-authored as part of this effort have been cited 2,400 times. In 2018 I realized that much of the laborious mathematical analysis could be automated. I therefore spent several years assembling and leading a team to build the Neural Tangents library, which automatically computes the GP and NTK for a wide range of neural network architectures. Neural Tangents powers wide-network research and has been used in 83 publications so far, with 1.8k stars on GitHub. The Neural Tangents paper was published as a spotlight at ICLR [16].

Employment History
2019-Present Senior Research Scientist Google Brain
2017-19 Research Scientist
2016-17 Brain Resident
2015-16 Postdoctoral Researcher Harvard University
2010-15 Ph.D. in Physics University of Pennsylvania
Thesis: A structural perspective on disordered solids
Awarded the Herbert B. Callen Memorial Prize,
"For his pioneering work using machine learning methods to identify a structural order parameter for glassy and plastic dynamics."
2006-10 B.A. in Physics & Mathematics (High Honors, Phi Beta Kappa) Swarthmore College
Open-Source Projects
2019-Present Neural Tangents: Fast and Easy Infinite Neural Networks in Python ICLR 2020 Spotlight
Stars: 1.8k; Used in 83 Publications.
2019-Present JAX MD: A Framework for Differentiable Physics NeurIPS 2020 Spotlight
Stars: 770; Used in 10 Publications.
Publications (* = First Author, † = Last Author)
2022 Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models arXiv
Deep equilibrium networks are sensitive to initialization statistics ICML
Fast Finite-Width Neural Tangent Kernel ICML
∂PV: An end-to-end differentiable solar-cell simulator Comp. Phys. Comm.
2021 JAX MD: A Framework for Differentiable Physics* J. Stat. Mech.
Gradients are Not All You Need arXiv
Deep Kernel Shaping arXiv
Learn2Hop: Learned Optimization on Rough Landscapes ICML
Tilting the playing field: Dynamical loss functions for machine learning ICML (Long Oral)
Whitening and Second Order Optimization Impair Generalization ICML
Designing self-assembling kinetics with differentiable statistical physics models PNAS
2020 JAX MD: A Framework for Differentiable Physics* NeurIPS (Spotlight)
Finite Versus Infinite Neural Networks: an Empirical Study NeurIPS (Spotlight)
Unifying framework for strong and fragile liquids via machine learning arXiv
Disentangling Trainability and Generalization in Deep Neural Networks ICML
Unveiling the predictive power of static structure in glassy systems Nature Physics
Statistical mechanics of deep learning Annu. Rev. Cond. Matt. Phys.
On the infinite width limit of neural networks with a standard parameterization arXiv
Adversarial forces of physical models NeurIPS ML4PS Workshop
Message passing neural networks ML Meets Quantum Physics
2019 Neural Tangents: Fast and Easy Infinite Neural Networks in Python* ICLR (Spotlight)
MetaInit: Initializing learning by learning to initialize NeurIPS
Wide neural networks of any depth evolve as linear models under gradient descent NeurIPS (Spotlight)
A mean field theory of batch normalization ICLR
Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs arXiv
Heterogeneous activation, local structure, and softness in supercooled liquids Phys. Rev. Lett.
2018 Dynamical isometry and a mean field theory of RNNs ICML
Dynamical isometry and a mean field theory of CNNs ICML
Combining machine learning and physics to understand glassy systems* J. of Phys.: Conf. Series
The emergence of spectral universality in deep networks AAAI
Adversarial Spheres arXiv
2017 Mean Field Residual Networks: On the Edge of Chaos NeurIPS
Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice NeurIPS
Structure-property relationships from universal signatures of plasticity in disordered solids* Science
Intriguing properties of adversarial examples arXiv
Deep neural networks as Gaussian processes ICLR
A correspondence between random neural networks and statistical field theory arXiv
Relationship between local structure and relaxation in out-of-equilibrium glassy systems* PNAS
Disconnecting structure and dynamics in glassy thin films PNAS
Prediction errors of molecular machine learning models lower than hybrid DFT error J. Chem. Theory Comput.
Neural message passing for quantum chemistry ICML
Explaining the learning dynamics of direct feedback alignment ICLR (Workshop)
2016 A structural approach to relaxation in glassy liquids* Nature Physics
Deep Information Propagation* ICLR
Structural properties of defects in glassy liquids* J. Phys. Chem. B
Nonlinear sigma models with compact hyperbolic target spaces J. of High Energy Physics
2015 Strain fluctuations and elastic moduli in disordered solids* Phys. Rev. E
Identifying structural flow defects in disordered solids using machine-learning methods* Phys. Rev. Lett.
2014 Understanding plastic deformation in thermal glasses from single-soft-spot dynamics* Phys. Rev. X
Predicting plasticity with soft vibrational modes: From dislocations to glasses Phys. Rev. E
Phonon dispersion and elastic moduli of two-dimensional disordered colloidal packings Phys. Rev. E
2013 Phonons in two-dimensional soft colloidal crystals Phys. Rev. E
Stability of jammed packings II: the transverse length scale* Soft Matter
2009 Specialization as an optimal strategy under varying external conditions ICRA
Invited Talks


Differentiable Programming for Materials Design
 Materials Research Society, Honolulu, HI
 Data Driven Design of Heterogeneous Materials, University of Chicago

JAX MD: A Framework for Differentiable Molecular Dynamics
 National Institutes of Health
 Weights & Biases
 JAX Ecosystem Seminar
 Google Research Conference
 Physics ∩ ML Seminar
 Scientific Machine Learning Webinar Series (Hosted by CMU)
 Glassy Systems and Inter-disciplinary applications, Cargese, France
 AI in Multi-Fidelity, Multi-Scale, and Multi-Physics Simulation of Materials, Oak Ridge National Lab, TN
 Condensed Matter Physics Seminar at UPenn, Philadelphia, PA
 Condensed Matter Physics Seminar at Univ. of Minnesota, St. Paul, MN
 Keynote at Materials Science & Technology Workshop

Machine Learning in Shifting Environments
 Advanced Control Methods for Particle Accelerators, Santa Fe, NM

Priors for deep infinite networks
 Theoretical Physics for Machine Learning, Aspen, CO
 Glassy Systems and Inter-disciplinary applications, Cargese, France

Combining machine learning and physics to understand glassy systems
 High-Dimensional Data-Driven Science, Kyoto, Japan

Understanding deep neural networks using statistical field theory
 Statistical Machine Learning Theory in Big Data Sciences, Sendai, Japan
 Condensed Matter Theory Group Meeting, Tokyo, Japan

Understanding disordered systems in- and out-of-equilibrium from local structure
 Condensed Matter Theory Kids Seminar, Harvard University, Cambridge, MA

A structural approach to relaxation in glassy liquids
 Kavli Seminar, Harvard University, Cambridge, MA
 Condensed Matter Theory Seminar, Institute of Physics, San Luis Potosi, Mexico
 Condensed Matter Theory Seminar, Peking University, Beijing, China

Softness and kinetic heterogeneities in glassy liquids
 Glotzer Group, University of Michigan, Ann Arbor, MI