by AK and the research community

Jun 6

Generative Marginalization Models

We introduce marginalization models (MaMs), a new family of generative models for high-dimensional discrete data. They offer scalable and flexible generative modeling with tractable likelihoods by explicitly modeling all induced marginal distributions. Marginalization models enable fast evaluation of arbitrary marginal probabilities with a single forward pass of the neural network, which overcomes a major limitation of methods with exact marginal inference, such as autoregressive models (ARMs). We propose scalable methods for learning the marginals, grounded in the concept of "marginalization self-consistency". Unlike previous methods, MaMs support scalable training of any-order generative models for high-dimensional problems in the energy-based training setting, where the goal is to match the learned distribution to a given target distribution (specified by an unnormalized (log) probability function, such as an energy function or a reward function). We demonstrate the effectiveness of the proposed model on a variety of discrete data distributions, including binary images, language, physical systems, and molecules, in both maximum likelihood and energy-based training settings. MaMs achieve orders-of-magnitude speedups in evaluating marginal probabilities in both settings. For energy-based training tasks, MaMs enable any-order generative modeling of high-dimensional problems beyond the capability of previous methods. Code is at https://github.com/PrincetonLIPS/MaM.
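
The "marginalization self-consistency" idea mentioned above can be illustrated on a toy example: for any subset S of variables and any index i outside S, the marginal p(x_S) should equal the sum of p(x_S, x_i) over the values of x_i. The sketch below is not the MaM implementation. It assumes a small, hypothetical table-based joint distribution over three binary variables, and the helper names `marginal` and `self_consistency_gap` are made up for illustration. In a learned model the marginals would come from a neural network, and a gap like this one would presumably be what training tries to drive toward zero.

```python
import itertools
import random

D = 3  # number of binary variables (toy size, assumed for illustration)

# Hypothetical toy joint distribution over {0,1}^D, stored as a lookup table.
random.seed(0)
weights = {x: random.random() for x in itertools.product((0, 1), repeat=D)}
total = sum(weights.values())
joint = {x: w / total for x, w in weights.items()}


def marginal(assignment):
    """Exact marginal p(x_S) for a partial assignment {index: value}."""
    return sum(
        p for x, p in joint.items()
        if all(x[i] == v for i, v in assignment.items())
    )


def self_consistency_gap(assignment, i):
    """Return |p(x_S) - sum over x_i of p(x_S, x_i)| for an index i not in S."""
    assert i not in assignment
    extended = sum(marginal({**assignment, i: v}) for v in (0, 1))
    return abs(marginal(assignment) - extended)


# For an exact table the gap is zero up to floating-point error.
gap = self_consistency_gap({0: 1}, 2)
```

Because the marginals here are computed exactly from the table, the gap only reflects rounding error; a model that outputs marginals directly would need to be trained to satisfy the same identity.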

Learning from Label Proportions: Bootstrapping Supervised Learners via Belief Propagation

Learning from Label Proportions (LLP) is a learning problem where only aggregate-level labels are available for groups of instances, called bags, during training, and the aim is to get the best instance-level performance on the test data. This setting arises in domains like advertising and medicine due to privacy considerations. We propose a novel algorithmic framework for this problem that iteratively performs two main steps. In the first step (Pseudo Labeling) of every iteration, we define a Gibbs distribution over binary instance labels that incorporates (a) covariate information, through the constraint that instances with similar covariates should have similar labels, and (b) the bag-level aggregated label. We then use Belief Propagation (BP) to marginalize the Gibbs distribution to obtain pseudo labels. In the second step (Embedding Refinement), we use the pseudo labels to provide supervision for a learner that yields a better embedding. We then repeat the two steps, using the second step's embeddings as new covariates for the next iteration. In the final iteration, a classifier is trained using the pseudo labels. Our algorithm displays strong gains (up to 15%) over several state-of-the-art baselines for the LLP binary classification problem on various dataset types, both tabular and image. Thanks to Belief Propagation, we achieve these improvements with minimal computational overhead over standard supervised learning, for large bag sizes and even for a million samples.
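
The Pseudo Labeling step can be sketched for a single tiny bag. This is only an illustration under stated assumptions: the energy function, the similarity kernel, and the weights `lam_sim` and `lam_bag` are hypothetical choices rather than the ones used by the authors, and brute-force enumeration of all label vectors stands in for Belief Propagation, which is what makes the real method scale.

```python
import itertools
import math


def pseudo_label_marginals(features, bag_count, lam_sim=1.0, lam_bag=2.0):
    """Marginals P(y_i = 1) under a toy Gibbs distribution for one bag.

    Assumed energy: similar instances are penalized for disagreeing, and the
    number of positive labels is pulled toward the bag-level count.
    """
    n = len(features)

    def sim(a, b):
        # Hypothetical similarity: Gaussian kernel on squared distance.
        return math.exp(-sum((u - v) ** 2 for u, v in zip(a, b)))

    unnorm = {}
    for y in itertools.product((0, 1), repeat=n):  # brute force, not BP
        e_sim = sum(
            sim(features[i], features[j]) * (y[i] != y[j])
            for i in range(n) for j in range(i + 1, n)
        )
        e_bag = (sum(y) - bag_count) ** 2
        unnorm[y] = math.exp(-(lam_sim * e_sim + lam_bag * e_bag))

    z = sum(unnorm.values())
    return [sum(w for y, w in unnorm.items() if y[i] == 1) / z for i in range(n)]


# Two nearby instances and one isolated instance, with one positive in the bag.
features = [(0.0,), (0.1,), (2.0,)]
marginals = pseudo_label_marginals(features, bag_count=1)
```

In this toy bag the isolated instance receives the highest marginal, since labeling it positive satisfies the bag count without splitting the two similar instances. Enumeration costs 2^n per bag, so it is only viable for very small bags.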

An analytic redshift-independent formulation of baryonic effects on the matter power spectrum

Baryonic effects created by feedback processes associated with galaxy formation are an important, poorly constrained systematic effect for models of large-scale structure as probed by weak gravitational lensing. Upcoming surveys require fast methods to predict and marginalize over the potential impact of baryons on the total matter power spectrum. Here we use the FLAMINGO cosmological hydrodynamical simulations to test a recent proposal to approximate the matter power spectrum as the sum of the linear matter power spectrum and a constant multiple, $A_{\rm mod}$, of the difference between the linear and non-linear gravity-only power spectra. We show that replacing this constant multiple with a one-parameter family of sigmoid functions of the wavenumber $k$ allows us to match the predictions of simulations with different feedback strengths for $z \leq 1$, $k < 3~h\,{\rm Mpc}^{-1}$, and the different cosmological models in the FLAMINGO suite. The baryonic response predicted by FLAMINGO models that use jet-like AGN feedback instead of the fiducial thermally-driven AGN feedback can also be reproduced, but at the cost of increasing the number of parameters in the sigmoid function from one to three. The assumption that $A_{\rm mod}$ depends only on $k$ breaks down for decaying dark matter models, highlighting the need for more advanced baryon response models when studying cosmological models that deviate strongly from $\Lambda$CDM.
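
The approximation described above can be written as P_m(k) = P_L(k) + A_mod [P_NL(k) - P_L(k)], where P_L and P_NL are the linear and non-linear gravity-only power spectra. The sign convention is assumed here so that A_mod = 1 recovers the gravity-only non-linear spectrum. The sketch below evaluates that expression and a k-dependent A_mod. The sigmoid form, its parameter names (`k_mid`, `a_low`, `a_high`, `width`), and the default values are hypothetical placeholders, since the abstract does not give the actual functional form.

```python
import math


def a_mod_sigmoid(k, k_mid, a_low=1.0, a_high=0.7, width=0.3):
    """Hypothetical sigmoid in log10(k).

    Tends to a_low for k << k_mid and to a_high for k >> k_mid.
    All default values are illustrative, not fitted to any simulation.
    """
    t = (math.log10(k) - math.log10(k_mid)) / width
    return a_low + (a_high - a_low) / (1.0 + math.exp(-t))


def total_matter_power(p_lin, p_nl, a_mod):
    """P_m = P_L + A_mod * (P_NL - P_L), evaluated at a single wavenumber."""
    return p_lin + a_mod * (p_nl - p_lin)


# Made-up power spectrum values at one wavenumber, for illustration only.
k = 1.0
p_m = total_matter_power(p_lin=2.0, p_nl=5.0, a_mod=a_mod_sigmoid(k, k_mid=1.0))
```

With a constant `a_mod` this reduces to the original one-number proposal; passing a function of k, as in the last line, corresponds to the k-dependent generalization the abstract describes.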