new

Get trending papers in your email inbox!

Subscribe

Daily Papers

byAK and the research community

Aug 20

Robust Binding Energy Distribution Sampling on Amorphous Solid Water Models. Method testing and validation with NH3, CO and CH4

This work aims to develop a method based on a structurally reliable ice model and a statistically and physico-chemically robust approach for BE distribution inference, with the aim to be applicable to various relevant interstellar species. A multiscale computational approach is presented, with a Molecular Dynamics (MD) Heat & Quench protocol for the amorphous water ice model, and an ONIOM(B3LYP-D3(BJ)/6-311+G**:GFN2-xtb) scheme for the BE inference, with a prime emphasis onto the BE/real system size convergence. The sampling of the binding configurations is twofold, exploring both regularly spaced binding sites, as well as various adsorbate-to-substrate orientations on each locally distinct site. This second source of BE diversity accounts for the local roughness of the potential energy landscape of the substrate. Three different adsorbate test cases are considered, i.e. NH3, CO and CH4, owing to their significance in dust icy mantles, and their distinct binding behavior with water ices. The BE distributions for NH3, CO and CH4 have been inferred, with converged statistics. The distribution for NH3 is better represented by a double Gaussian component profile. Three starting adsorbate orientations per site are required to reach convergence for both Gaussian components of NH3, while 2 orientations are sufficient for CO, and one unique for CH4 (symmetric). Further geometrical and molecular surrounding insights have been provided. These results encompass previously reported results.

Generative Discovery of Novel Chemical Designs using Diffusion Modeling and Transformer Deep Neural Networks with Application to Deep Eutectic Solvents

We report a series of deep learning models to solve complex forward and inverse design problems in molecular modeling and design. Using both diffusion models inspired by nonequilibrium thermodynamics and attention-based transformer architectures, we demonstrate a flexible framework to capture complex chemical structures. First trained on the QM9 dataset and a series of quantum mechanical properties (e.g. homo, lumo, free energy, heat capacity, etc.), we then generalize the model to study and design key properties of deep eutectic solvents. In addition to separate forward and inverse models, we also report an integrated fully prompt-based multi-task generative pretrained transformer model that solves multiple forward, inverse design, and prediction tasks, flexibly and within one model. We show that the multi-task generative model has the overall best performance and allows for flexible integration of multiple objectives, within one model, and for distinct chemistries, suggesting that synergies emerge during training of this large language model. Trained jointly in tasks related to the QM9 dataset and deep eutectic solvents (DESs), the model can predict various quantum mechanical properties and critical properties to achieve deep eutectic solvent behavior. Several novel combinations of DESs are proposed based on this framework.

Deep learning probability flows and entropy production rates in active matter

Active matter systems, from self-propelled colloids to motile bacteria, are characterized by the conversion of free energy into useful work at the microscopic scale. These systems generically involve physics beyond the reach of equilibrium statistical mechanics, and a persistent challenge has been to understand the nature of their nonequilibrium states. The entropy production rate and the magnitude of the steady-state probability current provide quantitative ways to do so by measuring the breakdown of time-reversal symmetry and the strength of nonequilibrium transport of measure. Yet, their efficient computation has remained elusive, as they depend on the system's unknown and high-dimensional probability density. Here, building upon recent advances in generative modeling, we develop a deep learning framework that estimates the score of this density. We show that the score, together with the microscopic equations of motion, gives direct access to the entropy production rate, the probability current, and their decomposition into local contributions from individual particles, spatial regions, and degrees of freedom. To represent the score, we introduce a novel, spatially-local transformer-based network architecture that learns high-order interactions between particles while respecting their underlying permutation symmetry. We demonstrate the broad utility and scalability of the method by applying it to several high-dimensional systems of interacting active particles undergoing motility-induced phase separation (MIPS). We show that a single instance of our network trained on a system of 4096 particles at one packing fraction can generalize to other regions of the phase diagram, including systems with as many as 32768 particles. We use this observation to quantify the spatial structure of the departure from equilibrium in MIPS as a function of the number of particles and the packing fraction.

Nonequilibrium Phenomena in Driven and Active Coulomb Field Theories

The classical Coulomb gas model has served as one of the most versatile frameworks in statistical physics, connecting a vast range of phenomena across many different areas. Nonequilibrium generalisations of this model have so far been studied much more scarcely. With the abundance of contemporary research into active and driven systems, one would naturally expect that such generalisations of systems with long-ranged Coulomb-like interactions will form a fertile playground for interesting developments. Here, we present two examples of novel macroscopic behaviour that arise from nonequilibrium fluctuations in long-range interacting systems, namely (1) unscreened long-ranged correlations in strong electrolytes driven by an external electric field and the associated fluctuation-induced forces in the confined Casimir geometry, and (2) out-of-equilibrium critical behaviour in self-chemotactic models that incorporate the particle polarity in the chemotactic response of the cells. Both of these systems have nonlocal Coulomb-like interactions among their constituent particles, namely, the electrostatic interactions in the case of the driven electrolyte, and the chemotactic forces mediated by fast-diffusing signals in the case of self-chemotactic systems. The results presented here hint to the rich phenomenology of nonequilibrium effects that can arise from strong fluctuations in Coulomb interacting systems, and a rich variety of potential future directions, which are discussed.

Linear statistics for Coulomb gases: higher order cumulants

We consider N classical particles interacting via the Coulomb potential in spatial dimension d and in the presence of an external trap, at equilibrium at inverse temperature beta. In the large N limit, the particles are confined within a droplet of finite size. We study smooth linear statistics, i.e. the fluctuations of sums of the form {cal L}_N = sum_{i=1}^N f({bf x}_i), where {bf x}_i's are the positions of the particles and where f({bf x}_i) is a sufficiently regular function. There exists at present standard results for the first and second moments of {cal L}_N in the large N limit, as well as associated Central Limit Theorems in general dimension and for a wide class of confining potentials. Here we obtain explicit expressions for the higher order cumulants of {cal L}_N at large N, when the function f({bf x})=f(|{bf x}|) and the confining potential are both rotationnally invariant. A remarkable feature of our results is that these higher cumulants depend only on the value of f'(|{bf x}|) and its higher order derivatives evaluated exactly at the boundary of the droplet, which in this case is a d-dimensional sphere. In the particular two-dimensional case d=2 at the special value beta=2, a connection to the Ginibre ensemble allows us to derive these results in an alternative way using the tools of determinantal point processes. Finally we also obtain the large deviation form of the full probability distribution function of {cal L}_N.

Rise and Fall of Anderson Localization by Lattice Vibrations: A Time-Dependent Machine Learning Approach

The intricate relationship between electrons and the crystal lattice is a linchpin in condensed matter, traditionally described by the Fr\"ohlich model encompassing the lowest-order lattice-electron coupling. Recently developed quantum acoustics, emphasizing the wave nature of lattice vibrations, has enabled the exploration of previously uncharted territories of electron-lattice interaction not accessible with conventional tools such as perturbation theory. In this context, our agenda here is two-fold. First, we showcase the application of machine learning methods to categorize various interaction regimes within the subtle interplay of electrons and the dynamical lattice landscape. Second, we shed light on a nebulous region of electron dynamics identified by the machine learning approach and then attribute it to transient localization, where strong lattice vibrations result in a momentary Anderson prison for electronic wavepackets, which are later released by the evolution of the lattice. Overall, our research illuminates the spectrum of dynamics within the Fr\"ohlich model, such as transient localization, which has been suggested as a pivotal factor contributing to the mysteries surrounding strange metals. Furthermore, this paves the way for utilizing time-dependent perspectives in machine learning techniques for designing materials with tailored electron-lattice properties.

Simulating 2+1D Lattice Quantum Electrodynamics at Finite Density with Neural Flow Wavefunctions

We present a neural flow wavefunction, Gauge-Fermion FlowNet, and use it to simulate 2+1D lattice compact quantum electrodynamics with finite density dynamical fermions. The gauge field is represented by a neural network which parameterizes a discretized flow-based transformation of the amplitude while the fermionic sign structure is represented by a neural net backflow. This approach directly represents the U(1) degree of freedom without any truncation, obeys Guass's law by construction, samples autoregressively avoiding any equilibration time, and variationally simulates Gauge-Fermion systems with sign problems accurately. In this model, we investigate confinement and string breaking phenomena in different fermion density and hopping regimes. We study the phase transition from the charge crystal phase to the vacuum phase at zero density, and observe the phase seperation and the net charge penetration blocking effect under magnetic interaction at finite density. In addition, we investigate a magnetic phase transition due to the competition effect between the kinetic energy of fermions and the magnetic energy of the gauge field. With our method, we further note potential differences on the order of the phase transitions between a continuous U(1) system and one with finite truncation. Our state-of-the-art neural network approach opens up new possibilities to study different gauge theories coupled to dynamical matter in higher dimensions.

ConStellaration: A dataset of QI-like stellarator plasma boundaries and optimization benchmarks

Stellarators are magnetic confinement devices under active development to deliver steady-state carbon-free fusion energy. Their design involves a high-dimensional, constrained optimization problem that requires expensive physics simulations and significant domain expertise. Recent advances in plasma physics and open-source tools have made stellarator optimization more accessible. However, broader community progress is currently bottlenecked by the lack of standardized optimization problems with strong baselines and datasets that enable data-driven approaches, particularly for quasi-isodynamic (QI) stellarator configurations, considered as a promising path to commercial fusion due to their inherent resilience to current-driven disruptions. Here, we release an open dataset of diverse QI-like stellarator plasma boundary shapes, paired with their ideal magnetohydrodynamic (MHD) equilibria and performance metrics. We generated this dataset by sampling a variety of QI fields and optimizing corresponding stellarator plasma boundaries. We introduce three optimization benchmarks of increasing complexity: (1) a single-objective geometric optimization problem, (2) a "simple-to-build" QI stellarator, and (3) a multi-objective ideal-MHD stable QI stellarator that investigates trade-offs between compactness and coil simplicity. For every benchmark, we provide reference code, evaluation scripts, and strong baselines based on classical optimization techniques. Finally, we show how learned models trained on our dataset can efficiently generate novel, feasible configurations without querying expensive physics oracles. By openly releasing the dataset along with benchmark problems and baselines, we aim to lower the entry barrier for optimization and machine learning researchers to engage in stellarator design and to accelerate cross-disciplinary progress toward bringing fusion energy to the grid.

Extreme Event Prediction with Multi-agent Reinforcement Learning-based Parametrization of Atmospheric and Oceanic Turbulence

Global climate models (GCMs) are the main tools for understanding and predicting climate change. However, due to limited numerical resolutions, these models suffer from major structural uncertainties; e.g., they cannot resolve critical processes such as small-scale eddies in atmospheric and oceanic turbulence. Thus, such small-scale processes have to be represented as a function of the resolved scales via closures (parametrization). The accuracy of these closures is particularly important for capturing climate extremes. Traditionally, such closures are based on heuristics and simplifying assumptions about the unresolved physics. Recently, supervised-learned closures, trained offline on high-fidelity data, have been shown to outperform the classical physics-based closures. However, this approach requires a significant amount of high-fidelity training data and can also lead to instabilities. Reinforcement learning is emerging as a potent alternative for developing such closures as it requires only low-order statistics and leads to stable closures. In Scientific Multi-Agent Reinforcement Learning (SMARL) computational elements serve a dual role of discretization points and learning agents. We leverage SMARL and fundamentals of turbulence physics to learn closures for prototypes of atmospheric and oceanic turbulence. The policy is trained using only the enstrophy spectrum, which is nearly invariant and can be estimated from a few high-fidelity samples (these few samples are far from enough for supervised/offline learning). We show that these closures lead to stable low-resolution simulations that, at a fraction of the cost, can reproduce the high-fidelity simulations' statistics, including the tails of the probability density functions. The results demonstrate the high potential of SMARL for closure modeling for GCMs, especially in the regime of scarce data and indirect observations.

Modeling transport in weakly collisional plasmas using thermodynamic forcing

How momentum, energy, and magnetic fields are transported in the presence of macroscopic gradients is a fundamental question in plasma physics. Answering this question is especially challenging for weakly collisional, magnetized plasmas, where macroscopic gradients influence the plasma's microphysical structure. In this paper, we introduce thermodynamic forcing, a new method for systematically modeling how macroscopic gradients in magnetized or unmagnetized plasmas shape the distribution functions of constituent particles. In this method, we propose to apply an anomalous force to those particles inducing the anisotropy that would naturally emerge due to macroscopic gradients in weakly collisional plasmas. We implement thermodynamic forcing in particle-in-cell (TF-PIC) simulations using a modified Vay particle pusher and validate it against analytic solutions of the equations of motion. We then carry out a series of simulations of electron-proton plasmas with periodic boundary conditions using TF-PIC. First, we confirm that the properties of two electron-scale kinetic instabilities -- one driven by a temperature gradient and the other by pressure anisotropy -- are consistent with previous results. Then, we demonstrate that in the presence of multiple macroscopic gradients, the saturated state can differ significantly from current expectations. This work enables, for the first time, systematic and self-consistent transport modeling in weakly collisional plasmas, with broad applications in astrophysics, laser-plasma physics, and inertial confinement fusion.

CHGNet: Pretrained universal neural network potential for charge-informed atomistic modeling

The simulation of large-scale systems with complex electron interactions remains one of the greatest challenges for the atomistic modeling of materials. Although classical force fields often fail to describe the coupling between electronic states and ionic rearrangements, the more accurate ab-initio molecular dynamics suffers from computational complexity that prevents long-time and large-scale simulations, which are essential to study many technologically relevant phenomena, such as reactions, ion migrations, phase transformations, and degradation. In this work, we present the Crystal Hamiltonian Graph neural Network (CHGNet) as a novel machine-learning interatomic potential (MLIP), using a graph-neural-network-based force field to model a universal potential energy surface. CHGNet is pretrained on the energies, forces, stresses, and magnetic moments from the Materials Project Trajectory Dataset, which consists of over 10 years of density functional theory static and relaxation trajectories of sim 1.5 million inorganic structures. The explicit inclusion of magnetic moments enables CHGNet to learn and accurately represent the orbital occupancy of electrons, enhancing its capability to describe both atomic and electronic degrees of freedom. We demonstrate several applications of CHGNet in solid-state materials, including charge-informed molecular dynamics in Li_xMnO_2, the finite temperature phase diagram for Li_xFePO_4 and Li diffusion in garnet conductors. We critically analyze the significance of including charge information for capturing appropriate chemistry, and we provide new insights into ionic systems with additional electronic degrees of freedom that can not be observed by previous MLIPs.

A Periodic Bayesian Flow for Material Generation

Generative modeling of crystal data distribution is an important yet challenging task due to the unique periodic physical symmetry of crystals. Diffusion-based methods have shown early promise in modeling crystal distribution. More recently, Bayesian Flow Networks were introduced to aggregate noisy latent variables, resulting in a variance-reduced parameter space that has been shown to be advantageous for modeling Euclidean data distributions with structural constraints (Song et al., 2023). Inspired by this, we seek to unlock its potential for modeling variables located in non-Euclidean manifolds e.g. those within crystal structures, by overcoming challenging theoretical issues. We introduce CrysBFN, a novel crystal generation method by proposing a periodic Bayesian flow, which essentially differs from the original Gaussian-based BFN by exhibiting non-monotonic entropy dynamics. To successfully realize the concept of periodic Bayesian flow, CrysBFN integrates a new entropy conditioning mechanism and empirically demonstrates its significance compared to time-conditioning. Extensive experiments over both crystal ab initio generation and crystal structure prediction tasks demonstrate the superiority of CrysBFN, which consistently achieves new state-of-the-art on all benchmarks. Surprisingly, we found that CrysBFN enjoys a significant improvement in sampling efficiency, e.g., ~100x speedup 10 v.s. 2000 steps network forwards) compared with previous diffusion-based methods on MP-20 dataset. Code is available at https://github.com/wu-han-lin/CrysBFN.

Lagrangian PINNs: A causality-conforming solution to failure modes of physics-informed neural networks

Physics-informed neural networks (PINNs) leverage neural-networks to find the solutions of partial differential equation (PDE)-constrained optimization problems with initial conditions and boundary conditions as soft constraints. These soft constraints are often considered to be the sources of the complexity in the training phase of PINNs. Here, we demonstrate that the challenge of training (i) persists even when the boundary conditions are strictly enforced, and (ii) is closely related to the Kolmogorov n-width associated with problems demonstrating transport, convection, traveling waves, or moving fronts. Given this realization, we describe the mechanism underlying the training schemes such as those used in eXtended PINNs (XPINN), curriculum regularization, and sequence-to-sequence learning. For an important category of PDEs, i.e., governed by non-linear convection-diffusion equation, we propose reformulating PINNs on a Lagrangian frame of reference, i.e., LPINNs, as a PDE-informed solution. A parallel architecture with two branches is proposed. One branch solves for the state variables on the characteristics, and the second branch solves for the low-dimensional characteristics curves. The proposed architecture conforms to the causality innate to the convection, and leverages the direction of travel of the information in the domain. Finally, we demonstrate that the loss landscapes of LPINNs are less sensitive to the so-called "complexity" of the problems, compared to those in the traditional PINNs in the Eulerian framework.

Physics-Informed Neural Networks for One-Dimensional Quantum Well Problems

We implement physics-informed neural networks (PINNs) to solve the time-independent Schr\"odinger equation for three canonical one-dimensional quantum potentials: an infinite square well, a finite square well, and a finite barrier. The PINN models incorporate trial wavefunctions that exactly satisfy boundary conditions (Dirichlet zeros at domain boundaries), and they optimize a loss functional combining the PDE residual with a normalization constraint. For the infinite well, the ground-state energy is known (E = pi^2 in dimensionless units) and held fixed in training, whereas for the finite well and barrier, the eigenenergy is treated as a trainable parameter. We use fully-connected neural networks with smooth activation functions to represent the wavefunction and demonstrate that PINNs can learn the ground-state eigenfunctions and eigenvalues for these quantum systems. The results show that the PINN-predicted wavefunctions closely match analytical solutions or expected behaviors, and the learned eigenenergies converge to known values. We present training logs and convergence of the energy parameter, as well as figures comparing the PINN solutions to exact results. The discussion addresses the performance of PINNs relative to traditional numerical methods, highlighting challenges such as convergence to the correct eigenvalue, sensitivity to initialization, and the difficulty of modeling discontinuous potentials. We also discuss the importance of the normalization term to resolve the scaling ambiguity of the wavefunction. Finally, we conclude that PINNs are a viable approach for quantum eigenvalue problems, and we outline future directions including extensions to higher-dimensional and time-dependent Schr\"odinger equations.

Empirical Analysis of the Hessian of Over-Parametrized Neural Networks

We study the properties of common loss surfaces through their Hessian matrix. In particular, in the context of deep learning, we empirically show that the spectrum of the Hessian is composed of two parts: (1) the bulk centered near zero, (2) and outliers away from the bulk. We present numerical evidence and mathematical justifications to the following conjectures laid out by Sagun et al. (2016): Fixing data, increasing the number of parameters merely scales the bulk of the spectrum; fixing the dimension and changing the data (for instance adding more clusters or making the data less separable) only affects the outliers. We believe that our observations have striking implications for non-convex optimization in high dimensions. First, the flatness of such landscapes (which can be measured by the singularity of the Hessian) implies that classical notions of basins of attraction may be quite misleading. And that the discussion of wide/narrow basins may be in need of a new perspective around over-parametrization and redundancy that are able to create large connected components at the bottom of the landscape. Second, the dependence of small number of large eigenvalues to the data distribution can be linked to the spectrum of the covariance matrix of gradients of model outputs. With this in mind, we may reevaluate the connections within the data-architecture-algorithm framework of a model, hoping that it would shed light into the geometry of high-dimensional and non-convex spaces in modern applications. In particular, we present a case that links the two observations: small and large batch gradient descent appear to converge to different basins of attraction but we show that they are in fact connected through their flat region and so belong to the same basin.

Foundation Inference Models for Markov Jump Processes

Markov jump processes are continuous-time stochastic processes which describe dynamical systems evolving in discrete state spaces. These processes find wide application in the natural sciences and machine learning, but their inference is known to be far from trivial. In this work we introduce a methodology for zero-shot inference of Markov jump processes (MJPs), on bounded state spaces, from noisy and sparse observations, which consists of two components. First, a broad probability distribution over families of MJPs, as well as over possible observation times and noise mechanisms, with which we simulate a synthetic dataset of hidden MJPs and their noisy observation process. Second, a neural network model that processes subsets of the simulated observations, and that is trained to output the initial condition and rate matrix of the target MJP in a supervised way. We empirically demonstrate that one and the same (pretrained) model can infer, in a zero-shot fashion, hidden MJPs evolving in state spaces of different dimensionalities. Specifically, we infer MJPs which describe (i) discrete flashing ratchet systems, which are a type of Brownian motors, and the conformational dynamics in (ii) molecular simulations, (iii) experimental ion channel data and (iv) simple protein folding models. What is more, we show that our model performs on par with state-of-the-art models which are finetuned to the target datasets.

Enhanced Sampling, Public Dataset and Generative Model for Drug-Protein Dissociation Dynamics

Drug-protein binding and dissociation dynamics are fundamental to understanding molecular interactions in biological systems. While many tools for drug-protein interaction studies have emerged, especially artificial intelligence (AI)-based generative models, predictive tools on binding/dissociation kinetics and dynamics are still limited. We propose a novel research paradigm that combines molecular dynamics (MD) simulations, enhanced sampling, and AI generative models to address this issue. We propose an enhanced sampling strategy to efficiently implement the drug-protein dissociation process in MD simulations and estimate the free energy surface (FES). We constructed a program pipeline of MD simulations based on this sampling strategy, thus generating a dataset including 26,612 drug-protein dissociation trajectories containing about 13 million frames. We named this dissociation dynamics dataset DD-13M and used it to train a deep equivariant generative model UnbindingFlow, which can generate collision-free dissociation trajectories. The DD-13M database and UnbindingFlow model represent a significant advancement in computational structural biology, and we anticipate its broad applicability in machine learning studies of drug-protein interactions. Our ongoing efforts focus on expanding this methodology to encompass a broader spectrum of drug-protein complexes and exploring novel applications in pathway prediction.

Von Mises Mixture Distributions for Molecular Conformation Generation

Molecules are frequently represented as graphs, but the underlying 3D molecular geometry (the locations of the atoms) ultimately determines most molecular properties. However, most molecules are not static and at room temperature adopt a wide variety of geometries or conformations. The resulting distribution on geometries p(x) is known as the Boltzmann distribution, and many molecular properties are expectations computed under this distribution. Generating accurate samples from the Boltzmann distribution is therefore essential for computing these expectations accurately. Traditional sampling-based methods are computationally expensive, and most recent machine learning-based methods have focused on identifying modes in this distribution rather than generating true samples. Generating such samples requires capturing conformational variability, and it has been widely recognized that the majority of conformational variability in molecules arises from rotatable bonds. In this work, we present VonMisesNet, a new graph neural network that captures conformational variability via a variational approximation of rotatable bond torsion angles as a mixture of von Mises distributions. We demonstrate that VonMisesNet can generate conformations for arbitrary molecules in a way that is both physically accurate with respect to the Boltzmann distribution and orders of magnitude faster than existing sampling methods.

BAMBOO: a predictive and transferable machine learning force field framework for liquid electrolyte development

Despite the widespread applications of machine learning force field (MLFF) on solids and small molecules, there is a notable gap in applying MLFF to complex liquid electrolytes. In this work, we introduce BAMBOO (ByteDance AI Molecular Simulation Booster), a novel framework for molecular dynamics (MD) simulations, with a demonstration of its capabilities in the context of liquid electrolytes for lithium batteries. We design a physics-inspired graph equivariant transformer architecture as the backbone of BAMBOO to learn from quantum mechanical simulations. Additionally, we pioneer an ensemble knowledge distillation approach and apply it on MLFFs to improve the stability of MD simulations. Finally, we propose the density alignment algorithm to align BAMBOO with experimental measurements. BAMBOO demonstrates state-of-the-art accuracy in predicting key electrolyte properties such as density, viscosity, and ionic conductivity across various solvents and salt combinations. Our current model, trained on more than 15 chemical species, achieves the average density error of 0.01 g/cm^3 on various compositions compared with experimental data. Moreover, our model demonstrates transferability to molecules not included in the quantum mechanical dataset. We envision this work as paving the way to a "universal MLFF" capable of simulating properties of common organic liquids.

Cybloids - Creation and Control of Cybernetic Colloids

Colloids play an important role in fundamental science as well as in nature and technology. They have had a strong impact on the fundamental understanding of statistical physics. For example, colloids have helped to obtain a better understanding of collective phenomena, ranging from phase transitions and glass formation to the swarming of active Brownian particles. Yet the success of colloidal systems hinges crucially on the specific physical and chemical properties of the colloidal particles, i.e. particles with the appropriate characteristics must be available. Here we present an idea to create particles with freely selectable properties. The properties might depend, for example, on the presence of other particles (hence mimicking specific pair or many-body interactions), previous configurations (hence introducing some memory or feedback), or a directional bias (hence changing the dynamics). Without directly interfering with the sample, each particle is fully controlled and can receive external commands through a predefined algorithm that can take into account any input parameters. This is realized with computer-controlled colloids, which we term cybloids - short for cybernetic colloids. The potential of cybloids is illustrated by programming a time-delayed external potential acting on a single colloid and interaction potentials for many colloids. Both an attractive harmonic potential and an annular potential are implemented. For a single particle, this programming can cause subdiffusive behavior or lend activity. For many colloids, the programmed interaction potential allows to select a crystal structure at wish. Beyond these examples, we discuss further opportunities which cybloids offer.

Ergotropy and Capacity Optimization in Heisenberg Spin Chain Quantum Batteries

This study examines the performance of finite spin quantum batteries (QBs) using Heisenberg spin models with Dzyaloshinsky-Moriya (DM) and Kaplan--Shekhtman--Entin-Wohlman--Aharony (KSEA) interactions. The QBs are modeled as interacting quantum spins in local inhomogeneous magnetic fields, inducing variable Zeeman splitting. We derive analytical expressions for the maximal extractable work, ergotropy and the capacity of QBs, as recently examined by Yang et al. [Phys. Rev. Lett. 131, 030402 (2023)]. These quantities are analytically linked through certain quantum correlations, as posited in the aforementioned study. Different Heisenberg spin chain models exhibit distinct behaviors under varying conditions, emphasizing the importance of model selection for optimizing QB performance. In antiferromagnetic (AFM) systems, maximum ergotropy occurs with a Zeeman splitting field applied to either spin, while ferromagnetic (FM) systems benefit from a uniform Zeeman field. Temperature significantly impacts QB performance, with ergotropy in the AFM case being generally more robust against temperature increases compared to the FM case. Incorporating DM and KSEA couplings can significantly enhance the capacity and ergotropy extraction of QBs. However, there exists a threshold beyond which additional increases in these interactions cause a sharp decline in capacity and ergotropy. This behavior is influenced by temperature and quantum coherence, which signal the occurrence of a sudden phase transition. The resource theory of quantum coherence proposed by Baumgratz et al. [Phys. Rev. Lett. 113, 140401 (2014)] plays a crucial role in enhancing ergotropy and capacity. However, ergotropy is limited by both the system's capacity and the amount of coherence. These findings support the theoretical framework of spin-based QBs and may benefit future research on quantum energy storage devices.

Latent Traversals in Generative Models as Potential Flows

Despite the significant recent progress in deep generative models, the underlying structure of their latent spaces is still poorly understood, thereby making the task of performing semantically meaningful latent traversals an open research challenge. Most prior work has aimed to solve this challenge by modeling latent structures linearly, and finding corresponding linear directions which result in `disentangled' generations. In this work, we instead propose to model latent structures with a learned dynamic potential landscape, thereby performing latent traversals as the flow of samples down the landscape's gradient. Inspired by physics, optimal transport, and neuroscience, these potential landscapes are learned as physically realistic partial differential equations, thereby allowing them to flexibly vary over both space and time. To achieve disentanglement, multiple potentials are learned simultaneously, and are constrained by a classifier to be distinct and semantically self-consistent. Experimentally, we demonstrate that our method achieves both more qualitatively and quantitatively disentangled trajectories than state-of-the-art baselines. Further, we demonstrate that our method can be integrated as a regularization term during training, thereby acting as an inductive bias towards the learning of structured representations, ultimately improving model likelihood on similarly structured data.

Scalable Bayesian Uncertainty Quantification for Neural Network Potentials: Promise and Pitfalls

Neural network (NN) potentials promise highly accurate molecular dynamics (MD) simulations within the computational complexity of classical MD force fields. However, when applied outside their training domain, NN potential predictions can be inaccurate, increasing the need for Uncertainty Quantification (UQ). Bayesian modeling provides the mathematical framework for UQ, but classical Bayesian methods based on Markov chain Monte Carlo (MCMC) are computationally intractable for NN potentials. By training graph NN potentials for coarse-grained systems of liquid water and alanine dipeptide, we demonstrate here that scalable Bayesian UQ via stochastic gradient MCMC (SG-MCMC) yields reliable uncertainty estimates for MD observables. We show that cold posteriors can reduce the required training data size and that for reliable UQ, multiple Markov chains are needed. Additionally, we find that SG-MCMC and the Deep Ensemble method achieve comparable results, despite shorter training and less hyperparameter tuning of the latter. We show that both methods can capture aleatoric and epistemic uncertainty reliably, but not systematic uncertainty, which needs to be minimized by adequate modeling to obtain accurate credible intervals for MD observables. Our results represent a step towards accurate UQ that is of vital importance for trustworthy NN potential-based MD simulations required for decision-making in practice.

PFGM++: Unlocking the Potential of Physics-Inspired Generative Models

We introduce a new family of physics-inspired generative models termed PFGM++ that unifies diffusion models and Poisson Flow Generative Models (PFGM). These models realize generative trajectories for N dimensional data by embedding paths in N{+}D dimensional space while still controlling the progression with a simple scalar norm of the D additional variables. The new models reduce to PFGM when D{=}1 and to diffusion models when D{to}infty. The flexibility of choosing D allows us to trade off robustness against rigidity as increasing D results in more concentrated coupling between the data and the additional variable norms. We dispense with the biased large batch field targets used in PFGM and instead provide an unbiased perturbation-based objective similar to diffusion models. To explore different choices of D, we provide a direct alignment method for transferring well-tuned hyperparameters from diffusion models (D{to} infty) to any finite D values. Our experiments show that models with finite D can be superior to previous state-of-the-art diffusion models on CIFAR-10/FFHQ 64{times}64 datasets, with FID scores of 1.91/2.43 when D{=}2048/128. In class-conditional setting, D{=}2048 yields current state-of-the-art FID of 1.74 on CIFAR-10. In addition, we demonstrate that models with smaller D exhibit improved robustness against modeling errors. Code is available at https://github.com/Newbeeer/pfgmpp

Replica symmetry breaking in dense neural networks

Understanding the glassy nature of neural networks is pivotal both for theoretical and computational advances in Machine Learning and Theoretical Artificial Intelligence. Keeping the focus on dense associative Hebbian neural networks, the purpose of this paper is two-fold: at first we develop rigorous mathematical approaches to address properly a statistical mechanical picture of the phenomenon of {\em replica symmetry breaking} (RSB) in these networks, then -- deepening results stemmed via these routes -- we aim to inspect the {\em glassiness} that they hide. In particular, regarding the methodology, we provide two techniques: the former is an adaptation of the transport PDE to the case, while the latter is an extension of Guerra's interpolation breakthrough. Beyond coherence among the results, either in replica symmetric and in the one-step replica symmetry breaking level of description, we prove the Gardner's picture and we identify the maximal storage capacity by a ground-state analysis in the Baldi-Venkatesh high-storage regime. In the second part of the paper we investigate the glassy structure of these networks: in contrast with the replica symmetric scenario (RS), RSB actually stabilizes the spin-glass phase. We report huge differences w.r.t. the standard pairwise Hopfield limit: in particular, it is known that it is possible to express the free energy of the Hopfield neural network as a linear combination of the free energies of an hard spin glass (i.e. the Sherrington-Kirkpatrick model) and a soft spin glass (the Gaussian or "spherical" model). This is no longer true when interactions are more than pairwise (whatever the level of description, RS or RSB): for dense networks solely the free energy of the hard spin glass survives, proving a huge diversity in the underlying glassiness of associative neural networks.

Meta Learning of Interface Conditions for Multi-Domain Physics-Informed Neural Networks

Physics-informed neural networks (PINNs) are emerging as popular mesh-free solvers for partial differential equations (PDEs). Recent extensions decompose the domain, applying different PINNs to solve the equation in each subdomain and aligning the solution at the interface of the subdomains. Hence, they can further alleviate the problem complexity, reduce the computational cost, and allow parallelization. However, the performance of the multi-domain PINNs is sensitive to the choice of the interface conditions for solution alignment. While quite a few conditions have been proposed, there is no suggestion about how to select the conditions according to specific problems. To address this gap, we propose META Learning of Interface Conditions (METALIC), a simple, efficient yet powerful approach to dynamically determine the optimal interface conditions for solving a family of parametric PDEs. Specifically, we develop two contextual multi-arm bandit models. The first one applies to the entire training procedure, and online updates a Gaussian process (GP) reward surrogate that given the PDE parameters and interface conditions predicts the solution error. The second one partitions the training into two stages, one is the stochastic phase and the other deterministic phase; we update a GP surrogate for each phase to enable different condition selections at the two stages so as to further bolster the flexibility and performance. We have shown the advantage of METALIC on four bench-mark PDE families.

A Graph Neural Network for the Era of Large Atomistic Models

Foundation models, or large atomistic models (LAMs), aim to universally represent the ground-state potential energy surface (PES) of atomistic systems as defined by density functional theory (DFT). The scaling law is pivotal in the development of large models, suggesting that their generalizability in downstream tasks consistently improves with increased model size, expanded training datasets, and larger computational budgets. In this study, we present DPA3, a multi-layer graph neural network founded on line graph series (LiGS), designed explicitly for the era of LAMs. We demonstrate that the generalization error of the DPA3 model adheres to the scaling law. The scalability in the number of model parameters is attained by stacking additional layers within DPA3. Additionally, the model employs a dataset encoding mechanism that decouples the scaling of training data size from the model size within its multi-task training framework. When trained as problem-oriented potential energy models, the DPA3 model exhibits superior accuracy in the majority of benchmark cases, encompassing systems with diverse features, including molecules, bulk materials, surface and cluster catalysts, two-dimensional materials, and battery materials. When trained as a LAM on the OpenLAM-v1 dataset, the DPA-3.1-3M model exhibits state-of-the-art performance in the LAMBench benchmark suite for LAMs, demonstrating lowest overall zero-shot generalization error across 17 downstream tasks from a broad spectrum of research domains. This performance suggests superior accuracy as an out-of-the-box potential model, requiring minimal fine-tuning data for downstream scientific applications.

Statistical mechanics of continual learning: variational principle and mean-field potential

An obstacle to artificial general intelligence is set by continual learning of multiple tasks of different nature. Recently, various heuristic tricks, both from machine learning and from neuroscience angles, were proposed, but they lack a unified theory ground. Here, we focus on continual learning in single-layered and multi-layered neural networks of binary weights. A variational Bayesian learning setting is thus proposed, where the neural networks are trained in a field-space, rather than gradient-ill-defined discrete-weight space, and furthermore, weight uncertainty is naturally incorporated, and modulates synaptic resources among tasks. From a physics perspective, we translate the variational continual learning into Franz-Parisi thermodynamic potential framework, where previous task knowledge acts as a prior and a reference as well. We thus interpret the continual learning of the binary perceptron in a teacher-student setting as a Franz-Parisi potential computation. The learning performance can then be analytically studied with mean-field order parameters, whose predictions coincide with numerical experiments using stochastic gradient descent methods. Based on the variational principle and Gaussian field approximation of internal preactivations in hidden layers, we also derive the learning algorithm considering weight uncertainty, which solves the continual learning with binary weights using multi-layered neural networks, and performs better than the currently available metaplasticity algorithm. Our proposed principled frameworks also connect to elastic weight consolidation, weight-uncertainty modulated learning, and neuroscience inspired metaplasticity, providing a theory-grounded method for the real-world multi-task learning with deep networks.

Dense Hebbian neural networks: a replica symmetric picture of supervised learning

We consider dense, associative neural-networks trained by a teacher (i.e., with supervision) and we investigate their computational capabilities analytically, via statistical-mechanics of spin glasses, and numerically, via Monte Carlo simulations. In particular, we obtain a phase diagram summarizing their performance as a function of the control parameters such as quality and quantity of the training dataset, network storage and noise, that is valid in the limit of large network size and structureless datasets: these networks may work in a ultra-storage regime (where they can handle a huge amount of patterns, if compared with shallow neural networks) or in a ultra-detection regime (where they can perform pattern recognition at prohibitive signal-to-noise ratios, if compared with shallow neural networks). Guided by the random theory as a reference framework, we also test numerically learning, storing and retrieval capabilities shown by these networks on structured datasets as MNist and Fashion MNist. As technical remarks, from the analytic side, we implement large deviations and stability analysis within Guerra's interpolation to tackle the not-Gaussian distributions involved in the post-synaptic potentials while, from the computational counterpart, we insert Plefka approximation in the Monte Carlo scheme, to speed up the evaluation of the synaptic tensors, overall obtaining a novel and broad approach to investigate supervised learning in neural networks, beyond the shallow limit, in general.

ProtAgents: Protein discovery via large language model multi-agent collaborations combining physics and machine learning

Designing de novo proteins beyond those found in nature holds significant promise for advancements in both scientific and engineering applications. Current methodologies for protein design often rely on AI-based models, such as surrogate models that address end-to-end problems by linking protein structure to material properties or vice versa. However, these models frequently focus on specific material objectives or structural properties, limiting their flexibility when incorporating out-of-domain knowledge into the design process or comprehensive data analysis is required. In this study, we introduce ProtAgents, a platform for de novo protein design based on Large Language Models (LLMs), where multiple AI agents with distinct capabilities collaboratively address complex tasks within a dynamic environment. The versatility in agent development allows for expertise in diverse domains, including knowledge retrieval, protein structure analysis, physics-based simulations, and results analysis. The dynamic collaboration between agents, empowered by LLMs, provides a versatile approach to tackling protein design and analysis problems, as demonstrated through diverse examples in this study. The problems of interest encompass designing new proteins, analyzing protein structures and obtaining new first-principles data -- natural vibrational frequencies -- via physics simulations. The concerted effort of the system allows for powerful automated and synergistic design of de novo proteins with targeted mechanical properties. The flexibility in designing the agents, on one hand, and their capacity in autonomous collaboration through the dynamic LLM-based multi-agent environment on the other hand, unleashes great potentials of LLMs in addressing multi-objective materials problems and opens up new avenues for autonomous materials discovery and design.

Towards Cross Domain Generalization of Hamiltonian Representation via Meta Learning

Recent advances in deep learning for physics have focused on discovering shared representations of target systems by incorporating physics priors or inductive biases into neural networks. While effective, these methods are limited to the system domain, where the type of system remains consistent and thus cannot ensure the adaptation to new, or unseen physical systems governed by different laws. For instance, a neural network trained on a mass-spring system cannot guarantee accurate predictions for the behavior of a two-body system or any other system with different physical laws. In this work, we take a significant leap forward by targeting cross domain generalization within the field of Hamiltonian dynamics. We model our system with a graph neural network and employ a meta learning algorithm to enable the model to gain experience over a distribution of tasks and make it adapt to new physics. Our approach aims to learn a unified Hamiltonian representation that is generalizable across multiple system domains, thereby overcoming the limitations of system-specific models. Our results demonstrate that the meta-trained model not only adapts effectively to new systems but also captures a generalized Hamiltonian representation that is consistent across different physical domains. Overall, through the use of meta learning, we offer a framework that achieves cross domain generalization, providing a step towards a unified model for understanding a wide array of dynamical systems via deep learning.

Robust Model-Based Optimization for Challenging Fitness Landscapes

Protein design, a grand challenge of the day, involves optimization on a fitness landscape, and leading methods adopt a model-based approach where a model is trained on a training set (protein sequences and fitness) and proposes candidates to explore next. These methods are challenged by sparsity of high-fitness samples in the training set, a problem that has been in the literature. A less recognized but equally important problem stems from the distribution of training samples in the design space: leading methods are not designed for scenarios where the desired optimum is in a region that is not only poorly represented in training data, but also relatively far from the highly represented low-fitness regions. We show that this problem of "separation" in the design space is a significant bottleneck in existing model-based optimization tools and propose a new approach that uses a novel VAE as its search model to overcome the problem. We demonstrate its advantage over prior methods in robustly finding improved samples, regardless of the imbalance and separation between low- and high-fitness training samples. Our comprehensive benchmark on real and semi-synthetic protein datasets as well as solution design for physics-informed neural networks, showcases the generality of our approach in discrete and continuous design spaces. Our implementation is available at https://github.com/sabagh1994/PGVAE.

Autoregressive Transformer Neural Network for Simulating Open Quantum Systems via a Probabilistic Formulation

The theory of open quantum systems lays the foundations for a substantial part of modern research in quantum science and engineering. Rooted in the dimensionality of their extended Hilbert spaces, the high computational complexity of simulating open quantum systems calls for the development of strategies to approximate their dynamics. In this paper, we present an approach for tackling open quantum system dynamics. Using an exact probabilistic formulation of quantum physics based on positive operator-valued measure (POVM), we compactly represent quantum states with autoregressive transformer neural networks; such networks bring significant algorithmic flexibility due to efficient exact sampling and tractable density. We further introduce the concept of String States to partially restore the symmetry of the autoregressive transformer neural network and improve the description of local correlations. Efficient algorithms have been developed to simulate the dynamics of the Liouvillian superoperator using a forward-backward trapezoid method and find the steady state via a variational formulation. Our approach is benchmarked on prototypical one and two-dimensional systems, finding results which closely track the exact solution and achieve higher accuracy than alternative approaches based on using Markov chain Monte Carlo to sample restricted Boltzmann machines. Our work provides general methods for understanding quantum dynamics in various contexts, as well as techniques for solving high-dimensional probabilistic differential equations in classical setups.

Understanding and Mitigating Distribution Shifts For Machine Learning Force Fields

Machine Learning Force Fields (MLFFs) are a promising alternative to expensive ab initio quantum mechanical molecular simulations. Given the diversity of chemical spaces that are of interest and the cost of generating new data, it is important to understand how MLFFs generalize beyond their training distributions. In order to characterize and better understand distribution shifts in MLFFs, we conduct diagnostic experiments on chemical datasets, revealing common shifts that pose significant challenges, even for large foundation models trained on extensive data. Based on these observations, we hypothesize that current supervised training methods inadequately regularize MLFFs, resulting in overfitting and learning poor representations of out-of-distribution systems. We then propose two new methods as initial steps for mitigating distribution shifts for MLFFs. Our methods focus on test-time refinement strategies that incur minimal computational cost and do not use expensive ab initio reference labels. The first strategy, based on spectral graph theory, modifies the edges of test graphs to align with graph structures seen during training. Our second strategy improves representations for out-of-distribution systems at test-time by taking gradient steps using an auxiliary objective, such as a cheap physical prior. Our test-time refinement strategies significantly reduce errors on out-of-distribution systems, suggesting that MLFFs are capable of and can move towards modeling diverse chemical spaces, but are not being effectively trained to do so. Our experiments establish clear benchmarks for evaluating the generalization capabilities of the next generation of MLFFs. Our code is available at https://tkreiman.github.io/projects/mlff_distribution_shifts/.

Stability of Superconducting Strings

We investigate the stability of superconducting strings as bound states of strings and fermion zero modes at both the classical and quantum levels. The dynamics of these superconducting strings can result in a stable configuration, known as a vorton. We mainly focus on global strings, but the majority of the discussion can be applied to local strings. Using lattice simulations, we study the classical dynamics of superconducting strings and confirm that they relax to the vorton configuration through Nambu-Goldstone boson radiation, with no evidence of over-shooting that would destabilize the vorton. We explore the tunneling of fermion zero modes out of the strings. Both our classical analysis and quantum calculations yield consistent results: the maximum energy of the zero mode significantly exceeds the fermion mass, in contrast to previous literature. Additionally, we introduce a world-sheet formalism to evaluate the decay rate of zero modes into other particles, which constitute the dominant decay channel. We also identify additional processes that trigger zero-mode decay due to non-adiabatic changes of the string configuration. In these decay processes, the rates are suppressed by the curvature of string loops, with exponential suppression for large masses of the final states. We further study the scattering with light charged particles surrounding the string core produced by the zero-mode current and find that a wide zero-mode wavefunction can enhance vorton stability.

Multiphysics Bench: Benchmarking and Investigating Scientific Machine Learning for Multiphysics PDEs

Solving partial differential equations (PDEs) with machine learning has recently attracted great attention, as PDEs are fundamental tools for modeling real-world systems that range from fundamental physical science to advanced engineering disciplines. Most real-world physical systems across various disciplines are actually involved in multiple coupled physical fields rather than a single field. However, previous machine learning studies mainly focused on solving single-field problems, but overlooked the importance and characteristics of multiphysics problems in real world. Multiphysics PDEs typically entail multiple strongly coupled variables, thereby introducing additional complexity and challenges, such as inter-field coupling. Both benchmarking and solving multiphysics problems with machine learning remain largely unexamined. To identify and address the emerging challenges in multiphysics problems, we mainly made three contributions in this work. First, we collect the first general multiphysics dataset, the Multiphysics Bench, that focuses on multiphysics PDE solving with machine learning. Multiphysics Bench is also the most comprehensive PDE dataset to date, featuring the broadest range of coupling types, the greatest diversity of PDE formulations, and the largest dataset scale. Second, we conduct the first systematic investigation on multiple representative learning-based PDE solvers, such as PINNs, FNO, DeepONet, and DiffusionPDE solvers, on multiphysics problems. Unfortunately, naively applying these existing solvers usually show very poor performance for solving multiphysics. Third, through extensive experiments and discussions, we report multiple insights and a bag of useful tricks for solving multiphysics with machine learning, motivating future directions in the study and simulation of complex, coupled physical systems.

Str2Str: A Score-based Framework for Zero-shot Protein Conformation Sampling

The dynamic nature of proteins is crucial for determining their biological functions and properties, for which Monte Carlo (MC) and molecular dynamics (MD) simulations stand as predominant tools to study such phenomena. By utilizing empirically derived force fields, MC or MD simulations explore the conformational space through numerically evolving the system via Markov chain or Newtonian mechanics. However, the high-energy barrier of the force fields can hamper the exploration of both methods by the rare event, resulting in inadequately sampled ensemble without exhaustive running. Existing learning-based approaches perform direct sampling yet heavily rely on target-specific simulation data for training, which suffers from high data acquisition cost and poor generalizability. Inspired by simulated annealing, we propose Str2Str, a novel structure-to-structure translation framework capable of zero-shot conformation sampling with roto-translation equivariant property. Our method leverages an amortized denoising score matching objective trained on general crystal structures and has no reliance on simulation data during both training and inference. Experimental results across several benchmarking protein systems demonstrate that Str2Str outperforms previous state-of-the-art generative structure prediction models and can be orders of magnitude faster compared to long MD simulations. Our open-source implementation is available at https://github.com/lujiarui/Str2Str

Stochastic acceleration in arbitrary astrophysical environments

Turbulent magnetic fields are to some extent a universal feature in astrophysical phenomena. Charged particles that encounter these turbulence get on average accelerated according to the so-called second-order Fermi process. However, in most astrophysical environments there are additional competing processes, such as different kinds of first-order energy changes and particle escape, that effect the resulting momentum distribution of the particles. In this work we provide to our knowledge the first semi-analytical solution of the isotropic steady-state momentum diffusion equation including continuous and catastrophic momentum changes that can be applied to any arbitrary astrophysical system of interest. Here, we adopt that the assigned magnetic turbulence is constrained on a finite range and the particle flux vanishes beyond these boundaries. Consequently, we show that the so-called pile-up bump -- that has for some special cases long been established -- is a universal feature of stochastic acceleration that emerges around the momentum chi_{rm eq} where acceleration and continuous loss are in equilibrium if the particle's residence time in the system is sufficient at chi_{rm eq}. In general, the impact of continuous and catastrophic momentum changes plays a crucial role in the shape of the steady-state momentum distribution of the accelerated particles, where simplified unbroken power-law approximations are often not adequate.

The Open Catalyst 2020 (OC20) Dataset and Community Challenges

Catalyst discovery and optimization is key to solving many societal and energy challenges including solar fuels synthesis, long-term energy storage, and renewable fertilizer production. Despite considerable effort by the catalysis community to apply machine learning models to the computational catalyst discovery process, it remains an open challenge to build models that can generalize across both elemental compositions of surfaces and adsorbate identity/configurations, perhaps because datasets have been smaller in catalysis than related fields. To address this we developed the OC20 dataset, consisting of 1,281,040 Density Functional Theory (DFT) relaxations (~264,890,000 single point evaluations) across a wide swath of materials, surfaces, and adsorbates (nitrogen, carbon, and oxygen chemistries). We supplemented this dataset with randomly perturbed structures, short timescale molecular dynamics, and electronic structure analyses. The dataset comprises three central tasks indicative of day-to-day catalyst modeling and comes with pre-defined train/validation/test splits to facilitate direct comparisons with future model development efforts. We applied three state-of-the-art graph neural network models (CGCNN, SchNet, Dimenet++) to each of these tasks as baseline demonstrations for the community to build on. In almost every task, no upper limit on model size was identified, suggesting that even larger models are likely to improve on initial results. The dataset and baseline models are both provided as open resources, as well as a public leader board to encourage community contributions to solve these important tasks.

Low-energy Injection and Nonthermal Particle Acceleration in Relativistic Magnetic Turbulence

Relativistic magnetic turbulence has been proposed as a process for producing nonthermal particles in high-energy astrophysics. Particle energization may be contributed by both magnetic reconnection and turbulent fluctuations, but their interplay is poorly understood. It has been suggested that during magnetic reconnection the parallel electric field dominates particle acceleration up to the lower bound of the power-law particle spectrum, but recent studies show that electric fields perpendicular to magnetic field can play an important, if not dominant role. In this study, we carry out 2D fully kinetic particle-in-cell simulations of magnetically dominated decaying turbulence in a relativistic pair plasma. For a fixed magnetization parameter sigma_0=20, we find that the injection energy {varepsilon}_{rm inj} converges with increasing domain size to {varepsilon}_{rm inj}simeq 10m_ec^2. In contrast, the power-law index, the cut-off energy, and the power-law extent increase steadily with domain size. We trace a large number of particles and evaluate the contributions of the work done by the parallel (W_parallel) and perpendicular (W_perp) electric fields during both the injection phase and the post-injection phase. We find that during the injection phase, the W_perp contribution increases with domain size, suggesting that it may eventually dominate injection for a sufficiently large domain. In contrast, both components contribute equally during the post-injection phase, insensitive to the domain size. For high energy ({varepsilon}varepsilon_{rm inj}) particles, W_perp dominates the subsequent energization. These findings may improve our understanding of nonthermal particles and their emissions in astrophysical plasmas.

Stochastic Interpolants: A Unifying Framework for Flows and Diffusions

A class of generative models that unifies flow-based and diffusion-based methods is introduced. These models extend the framework proposed in Albergo & Vanden-Eijnden (2023), enabling the use of a broad class of continuous-time stochastic processes called `stochastic interpolants' to bridge any two arbitrary probability density functions exactly in finite time. These interpolants are built by combining data from the two prescribed densities with an additional latent variable that shapes the bridge in a flexible way. The time-dependent probability density function of the stochastic interpolant is shown to satisfy a first-order transport equation as well as a family of forward and backward Fokker-Planck equations with tunable diffusion coefficient. Upon consideration of the time evolution of an individual sample, this viewpoint immediately leads to both deterministic and stochastic generative models based on probability flow equations or stochastic differential equations with an adjustable level of noise. The drift coefficients entering these models are time-dependent velocity fields characterized as the unique minimizers of simple quadratic objective functions, one of which is a new objective for the score of the interpolant density. We show that minimization of these quadratic objectives leads to control of the likelihood for generative models built upon stochastic dynamics, while likelihood control for deterministic dynamics is more stringent. We also discuss connections with other methods such as score-based diffusion models, stochastic localization processes, probabilistic denoising techniques, and rectifying flows. In addition, we demonstrate that stochastic interpolants recover the Schr\"odinger bridge between the two target densities when explicitly optimizing over the interpolant. Finally, algorithmic aspects are discussed and the approach is illustrated on numerical examples.

Improving equilibrium propagation without weight symmetry through Jacobian homeostasis

Equilibrium propagation (EP) is a compelling alternative to the backpropagation of error algorithm (BP) for computing gradients of neural networks on biological or analog neuromorphic substrates. Still, the algorithm requires weight symmetry and infinitesimal equilibrium perturbations, i.e., nudges, to estimate unbiased gradients efficiently. Both requirements are challenging to implement in physical systems. Yet, whether and how weight asymmetry affects its applicability is unknown because, in practice, it may be masked by biases introduced through the finite nudge. To address this question, we study generalized EP, which can be formulated without weight symmetry, and analytically isolate the two sources of bias. For complex-differentiable non-symmetric networks, we show that the finite nudge does not pose a problem, as exact derivatives can still be estimated via a Cauchy integral. In contrast, weight asymmetry introduces bias resulting in low task performance due to poor alignment of EP's neuronal error vectors compared to BP. To mitigate this issue, we present a new homeostatic objective that directly penalizes functional asymmetries of the Jacobian at the network's fixed point. This homeostatic objective dramatically improves the network's ability to solve complex tasks such as ImageNet 32x32. Our results lay the theoretical groundwork for studying and mitigating the adverse effects of imperfections of physical networks on learning algorithms that rely on the substrate's relaxation dynamics.

Grad DFT: a software library for machine learning enhanced density functional theory

Density functional theory (DFT) stands as a cornerstone method in computational quantum chemistry and materials science due to its remarkable versatility and scalability. Yet, it suffers from limitations in accuracy, particularly when dealing with strongly correlated systems. To address these shortcomings, recent work has begun to explore how machine learning can expand the capabilities of DFT; an endeavor with many open questions and technical challenges. In this work, we present Grad DFT: a fully differentiable JAX-based DFT library, enabling quick prototyping and experimentation with machine learning-enhanced exchange-correlation energy functionals. Grad DFT employs a pioneering parametrization of exchange-correlation functionals constructed using a weighted sum of energy densities, where the weights are determined using neural networks. Moreover, Grad DFT encompasses a comprehensive suite of auxiliary functions, notably featuring a just-in-time compilable and fully differentiable self-consistent iterative procedure. To support training and benchmarking efforts, we additionally compile a curated dataset of experimental dissociation energies of dimers, half of which contain transition metal atoms characterized by strong electronic correlations. The software library is tested against experimental results to study the generalization capabilities of a neural functional across potential energy surfaces and atomic species, as well as the effect of training data noise on the resulting model accuracy.

Multi-marginal Schrödinger Bridges with Iterative Reference Refinement

Practitioners frequently aim to infer an unobserved population trajectory using sample snapshots at multiple time points. For instance, in single-cell sequencing, scientists would like to learn how gene expression evolves over time. But sequencing any cell destroys that cell. So we cannot access any cell's full trajectory, but we can access snapshot samples from many cells. Stochastic differential equations are commonly used to analyze systems with full individual-trajectory access; since here we have only sample snapshots, these methods are inapplicable. The deep learning community has recently explored using Schr\"odinger bridges (SBs) and their extensions to estimate these dynamics. However, these methods either (1) interpolate between just two time points or (2) require a single fixed reference dynamic within the SB, which is often just set to be Brownian motion. But learning piecewise from adjacent time points can fail to capture long-term dependencies. And practitioners are typically able to specify a model class for the reference dynamic but not the exact values of the parameters within it. So we propose a new method that (1) learns the unobserved trajectories from sample snapshots across multiple time points and (2) requires specification only of a class of reference dynamics, not a single fixed one. In particular, we suggest an iterative projection method inspired by Schr\"odinger bridges; we alternate between learning a piecewise SB on the unobserved trajectories and using the learned SB to refine our best guess for the dynamics within the reference class. We demonstrate the advantages of our method via a well-known simulated parametric model from ecology, simulated and real data from systems biology, and real motion-capture data.

Causality and Renormalization in Finite-Time-Path Out-of-Equilibrium φ^3 QFT

Our aim is to contribute to quantum field theory (QFT) formalisms useful for descriptions of short time phenomena, dominant especially in heavy ion collisions. We formulate out-of-equilibrium QFT within the finite-time-path formalism (FTP) and renormalization theory (RT). The potential conflict of FTP and RT is investigated in g phi^3 QFT, by using the retarded/advanced (R/A) basis of Green functions and dimensional renormalization (DR). For example, vertices immediately after (in time) divergent self-energy loops do not conserve energy, as integrals diverge. We "repair" them, while keeping d<4, to obtain energy conservation at those vertices. Already in the S-matrix theory, the renormalized, finite part of Feynman self-energy Sigma_{F}(p_0) does not vanish when |p_0|rightarrowinfty and cannot be split to retarded and advanced parts. In the Glaser--Epstein approach, the causality is repaired in the composite object G_F(p_0)Sigma_{F}(p_0). In the FTP approach, after repairing the vertices, the corresponding composite objects are G_R(p_0)Sigma_{R}(p_0) and Sigma_{A}(p_0)G_A(p_0). In the limit drightarrow 4, one obtains causal QFT. The tadpole contribution splits into diverging and finite parts. The diverging, constant component is eliminated by the renormalization condition langle 0|phi|0rangle =0 of the S-matrix theory. The finite, oscillating energy-nonconserving tadpole contributions vanish in the limit trightarrow infty .

MatterGen: a generative model for inorganic materials design

The design of functional materials with desired properties is essential in driving technological advances in areas like energy storage, catalysis, and carbon capture. Generative models provide a new paradigm for materials design by directly generating entirely novel materials given desired property constraints. Despite recent progress, current generative models have low success rate in proposing stable crystals, or can only satisfy a very limited set of property constraints. Here, we present MatterGen, a model that generates stable, diverse inorganic materials across the periodic table and can further be fine-tuned to steer the generation towards a broad range of property constraints. To enable this, we introduce a new diffusion-based generative process that produces crystalline structures by gradually refining atom types, coordinates, and the periodic lattice. We further introduce adapter modules to enable fine-tuning towards any given property constraints with a labeled dataset. Compared to prior generative models, structures produced by MatterGen are more than twice as likely to be novel and stable, and more than 15 times closer to the local energy minimum. After fine-tuning, MatterGen successfully generates stable, novel materials with desired chemistry, symmetry, as well as mechanical, electronic and magnetic properties. Finally, we demonstrate multi-property materials design capabilities by proposing structures that have both high magnetic density and a chemical composition with low supply-chain risk. We believe that the quality of generated materials and the breadth of MatterGen's capabilities represent a major advancement towards creating a universal generative model for materials design.

Gradual Optimization Learning for Conformational Energy Minimization

Molecular conformation optimization is crucial to computer-aided drug discovery and materials design. Traditional energy minimization techniques rely on iterative optimization methods that use molecular forces calculated by a physical simulator (oracle) as anti-gradients. However, this is a computationally expensive approach that requires many interactions with a physical simulator. One way to accelerate this procedure is to replace the physical simulator with a neural network. Despite recent progress in neural networks for molecular conformation energy prediction, such models are prone to distribution shift, leading to inaccurate energy minimization. We find that the quality of energy minimization with neural networks can be improved by providing optimization trajectories as additional training data. Still, it takes around 5 times 10^5 additional conformations to match the physical simulator's optimization quality. In this work, we present the Gradual Optimization Learning Framework (GOLF) for energy minimization with neural networks that significantly reduces the required additional data. The framework consists of an efficient data-collecting scheme and an external optimizer. The external optimizer utilizes gradients from the energy prediction model to generate optimization trajectories, and the data-collecting scheme selects additional training data to be processed by the physical simulator. Our results demonstrate that the neural network trained with GOLF performs on par with the oracle on a benchmark of diverse drug-like molecules using 50x less additional data.

Neural Network Approximations of PDEs Beyond Linearity: A Representational Perspective

A burgeoning line of research leverages deep neural networks to approximate the solutions to high dimensional PDEs, opening lines of theoretical inquiry focused on explaining how it is that these models appear to evade the curse of dimensionality. However, most prior theoretical analyses have been limited to linear PDEs. In this work, we take a step towards studying the representational power of neural networks for approximating solutions to nonlinear PDEs. We focus on a class of PDEs known as nonlinear elliptic variational PDEs, whose solutions minimize an Euler-Lagrange energy functional E(u) = int_Omega L(x, u(x), nabla u(x)) - f(x) u(x)dx. We show that if composing a function with Barron norm b with partial derivatives of L produces a function of Barron norm at most B_L b^p, the solution to the PDE can be epsilon-approximated in the L^2 sense by a function with Barron norm Oleft(left(dB_Lright)^{max{p log(1/ epsilon), p^{log(1/epsilon)}}}right). By a classical result due to Barron [1993], this correspondingly bounds the size of a 2-layer neural network needed to approximate the solution. Treating p, epsilon, B_L as constants, this quantity is polynomial in dimension, thus showing neural networks can evade the curse of dimensionality. Our proof technique involves neurally simulating (preconditioned) gradient in an appropriate Hilbert space, which converges exponentially fast to the solution of the PDE, and such that we can bound the increase of the Barron norm at each iterate. Our results subsume and substantially generalize analogous prior results for linear elliptic PDEs over a unit hypercube.

Algorithm-assisted discovery of an intrinsic order among mathematical constants

In recent decades, a growing number of discoveries in fields of mathematics have been assisted by computer algorithms, primarily for exploring large parameter spaces that humans would take too long to investigate. As computers and algorithms become more powerful, an intriguing possibility arises - the interplay between human intuition and computer algorithms can lead to discoveries of novel mathematical concepts that would otherwise remain elusive. To realize this perspective, we have developed a massively parallel computer algorithm that discovers an unprecedented number of continued fraction formulas for fundamental mathematical constants. The sheer number of formulas discovered by the algorithm unveils a novel mathematical structure that we call the conservative matrix field. Such matrix fields (1) unify thousands of existing formulas, (2) generate infinitely many new formulas, and most importantly, (3) lead to unexpected relations between different mathematical constants, including multiple integer values of the Riemann zeta function. Conservative matrix fields also enable new mathematical proofs of irrationality. In particular, we can use them to generalize the celebrated proof by Ap\'ery for the irrationality of zeta(3). Utilizing thousands of personal computers worldwide, our computer-supported research strategy demonstrates the power of experimental mathematics, highlighting the prospects of large-scale computational approaches to tackle longstanding open problems and discover unexpected connections across diverse fields of science.

Interplay between thermal and compositional gradients decides the microstructure during thermomigration: a phase-field study

The presence of thermal gradients in alloys often leads to non-uniformity in concentration profiles, which can induce the thermomigration of microstructural features such as precipitates. To investigate such microstructural changes, we present a phase-field model that incorporates coupling between concentration and thermal gradients. First, we simulated the evolution of non-uniform concentration profiles in the single-phase regions of Fe-C and Fe-N alloy systems due to imposed thermal gradients. To validate our model with the classical experiments performed by Darken and Oriani, we studied the evolution of spatially varying concentration profiles where thermal gradients encompass single-phase and two-phase regions. We developed a parameterized thermodynamic description of the two-phase region of a binary alloy to systematically study the effect of interactions between chemically-driven and thermal gradient-driven diffusion of solute on the evolution of precipitates. Our simulations show how thermal gradient, precipitate size, and interparticle distance influence the migration and associated morphological changes of precipitates. The composition profiles and migration rates obtained from single-particle simulations show an exact match with our analytical model. We use twoparticle simulations to show conditions under which thermomigration induces the growth of the smaller particle and shrinkage of the larger one in contrast to the isothermal Ostwald ripening behavior. Our multiparticle simulations show similar behavior during coarsening. Moreover, in the presence of a thermal gradient, there is a shift in the center of mass of the precipitates towards the high-temperature region. Thus, our study offers new insights into the phenomena of microstructure evolution in the presence of thermal gradient.