Showing 1–18 of 18 results for author: Berner, J

  1. arXiv:2407.07873  [pdf, other]

    cs.LG math.DS math.OC math.PR stat.ML

    Dynamical Measure Transport and Neural PDE Solvers for Sampling

    Authors: Jingtong Sun, Julius Berner, Lorenz Richter, Marius Zeinhofer, Johannes Müller, Kamyar Azizzadenesheli, Anima Anandkumar

    Abstract: The task of sampling from a probability density can be approached as transporting a tractable density function to the target, known as dynamical measure transport. In this work, we tackle it through a principled unified framework using deterministic or stochastic evolutions described by partial differential equations (PDEs). This framework incorporates prior trajectory-based sampling methods, such…

    Submitted 10 July, 2024; originally announced July 2024.

  2. arXiv:2407.03925  [pdf, other]

    cs.LG

    Reduced-Order Neural Operators: Learning Lagrangian Dynamics on Highly Sparse Graphs

    Authors: Hrishikesh Viswanath, Yue Chang, Julius Berner, Peter Yichen Chen, Aniket Bera

    Abstract: We present a neural operator architecture to simulate Lagrangian dynamics, such as fluid flow, granular flows, and elastoplasticity. Traditional numerical methods, such as the finite element method (FEM), suffer from long run times and large memory consumption. On the other hand, approaches based on graph neural networks are faster but still suffer from long computation times on dense graphs, whic…

    Submitted 4 July, 2024; originally announced July 2024.

  3. arXiv:2407.01521  [pdf, other]

    cs.LG cs.AI cs.CV

    Improving Diffusion Inverse Problem Solving with Decoupled Noise Annealing

    Authors: Bingliang Zhang, Wenda Chu, Julius Berner, Chenlin Meng, Anima Anandkumar, Yang Song

    Abstract: Diffusion models have recently achieved success in solving Bayesian inverse problems with learned data priors. Current methods build on top of the diffusion sampling process, where each denoising step makes small modifications to samples from the previous step. However, this process struggles to correct errors from earlier sampling steps, leading to worse performance in complicated nonlinear inver…

    Submitted 1 July, 2024; originally announced July 2024.

  4. arXiv:2406.03494  [pdf, other]

    cs.LG math.NA stat.ML

    Solving Poisson Equations using Neural Walk-on-Spheres

    Authors: Hong Chul Nam, Julius Berner, Anima Anandkumar

    Abstract: We propose Neural Walk-on-Spheres (NWoS), a novel neural PDE solver for the efficient solution of high-dimensional Poisson equations. Leveraging stochastic representations and Walk-on-Spheres methods, we develop novel losses for neural networks based on the recursive solution of Poisson equations on spheres inside the domain. The resulting method is highly parallelizable and does not require spati…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Accepted at ICML 2024

  5. arXiv:2403.12553  [pdf, other]

    cs.LG

    Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs

    Authors: Md Ashiqur Rahman, Robert Joseph George, Mogab Elleithy, Daniel Leibovici, Zongyi Li, Boris Bonev, Colin White, Julius Berner, Raymond A. Yeh, Jean Kossaifi, Kamyar Azizzadenesheli, Anima Anandkumar

    Abstract: Existing neural operator architectures face challenges when solving multiphysics problems with coupled partial differential equations (PDEs), due to complex geometries, interactions between physical variables, and the lack of large amounts of high-resolution training data. To address these issues, we propose Codomain Attention Neural Operator (CoDA-NO), which tokenizes functions along the codomain…

    Submitted 5 April, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

  6. arXiv:2403.03542  [pdf, other]

    cs.LG math.NA

    DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training

    Authors: Zhongkai Hao, Chang Su, Songming Liu, Julius Berner, Chengyang Ying, Hang Su, Anima Anandkumar, Jian Song, Jun Zhu

    Abstract: Pre-training has been investigated to improve the efficiency and performance of training neural operators in data-scarce settings. However, it is largely in its infancy due to the inherent complexity and diversity, such as long trajectories, multiple scales and varying dimensions of partial differential equations (PDEs) data. In this paper, we present a new auto-regressive denoising pre-training s…

    Submitted 6 May, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

  7. arXiv:2402.16845  [pdf, other]

    cs.LG cs.AI math.NA

    Neural Operators with Localized Integral and Differential Kernels

    Authors: Miguel Liu-Schiaffini, Julius Berner, Boris Bonev, Thorsten Kurth, Kamyar Azizzadenesheli, Anima Anandkumar

    Abstract: Neural operators learn mappings between function spaces, which is practical for learning solution operators of PDEs and other scientific modeling applications. Among them, the Fourier neural operator (FNO) is a popular architecture that performs global convolutions in the Fourier space. However, such global operations are often prone to over-smoothing and may fail to capture local details. In cont…

    Submitted 8 June, 2024; v1 submitted 26 February, 2024; originally announced February 2024.

    Comments: Accepted at 2024 International Conference on Machine Learning

  8. arXiv:2312.04556  [pdf, other]

    cs.CL cs.AI cs.LG math.HO

    Large Language Models for Mathematicians

    Authors: Simon Frieder, Julius Berner, Philipp Petersen, Thomas Lukasiewicz

    Abstract: Large language models (LLMs) such as ChatGPT have received immense interest for their general-purpose language understanding and, in particular, their ability to generate high-quality text or computer code. For many professions, LLMs represent an invaluable tool that can speed up and improve the quality of work. In this note, we discuss to what extent they can aid professional mathematicians. We f…

    Submitted 2 April, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

    Journal ref: International Mathematical News 254 (2023) 1-20

  9. arXiv:2307.01198  [pdf, other]

    cs.LG math.OC math.PR stat.ML

    Improved sampling via learned diffusions

    Authors: Lorenz Richter, Julius Berner

    Abstract: Recently, a series of papers proposed deep learning-based approaches to sample from target distributions using controlled diffusion processes, being trained only on the unnormalized target densities without access to samples. Building on previous work, we identify these approaches as special cases of a generalized Schrödinger bridge problem, seeking a stochastic evolution between a given prior dis…

    Submitted 23 May, 2024; v1 submitted 3 July, 2023; originally announced July 2023.

    Comments: Accepted at ICLR 2024

    Journal ref: International Conference on Learning Representations, 2024

  10. arXiv:2301.13867  [pdf, other]

    cs.LG cs.AI cs.CL

    Mathematical Capabilities of ChatGPT

    Authors: Simon Frieder, Luca Pinchetti, Alexis Chevalier, Ryan-Rhys Griffiths, Tommaso Salvatori, Thomas Lukasiewicz, Philipp Christian Petersen, Julius Berner

    Abstract: We investigate the mathematical capabilities of two iterations of ChatGPT (released 9-January-2023 and 30-January-2023) and of GPT-4 by testing them on publicly available datasets, as well as hand-crafted ones, using a novel methodology. In contrast to formal mathematics, where large databases of formal proofs are available (e.g., the Lean Mathematical Library), current datasets of natural-languag…

    Submitted 20 July, 2023; v1 submitted 31 January, 2023; originally announced January 2023.

    Comments: Added further evaluations on another ChatGPT version and on GPT-4. The GHOSTS and miniGHOSTS datasets are available at https://github.com/xyfrieder/science-GHOSTS

    Journal ref: NeurIPS 2023 Datasets and Benchmarks

  11. arXiv:2211.01364  [pdf, other]

    cs.LG math.OC stat.ML

    An optimal control perspective on diffusion-based generative modeling

    Authors: Julius Berner, Lorenz Richter, Karen Ullrich

    Abstract: We establish a connection between stochastic optimal control and generative models based on stochastic differential equations (SDEs), such as recently developed diffusion probabilistic models. In particular, we derive a Hamilton-Jacobi-Bellman equation that governs the evolution of the log-densities of the underlying SDE marginals. This perspective allows us to transfer methods from optimal control t…

    Submitted 26 March, 2024; v1 submitted 2 November, 2022; originally announced November 2022.

    Comments: Accepted for oral presentation at NeurIPS 2022 Workshop on Score-Based Methods

    Journal ref: Transactions on Machine Learning Research, 2024

  12. arXiv:2206.10588  [pdf, other]

    cs.LG math.NA stat.ML

    Robust SDE-Based Variational Formulations for Solving Linear PDEs via Deep Learning

    Authors: Lorenz Richter, Julius Berner

    Abstract: The combination of Monte Carlo methods and deep learning has recently led to efficient algorithms for solving partial differential equations (PDEs) in high dimensions. Related learning problems are often stated as variational formulations based on associated stochastic differential equations (SDEs), which allow the minimization of corresponding losses using gradient-based optimization methods. In…

    Submitted 5 August, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: Accepted at ICML 2022

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, 2022, pp. 18649-18666

  13. arXiv:2205.13531  [pdf, other]

    cs.LG stat.ML

    Learning ReLU networks to high uniform accuracy is intractable

    Authors: Julius Berner, Philipp Grohs, Felix Voigtlaender

    Abstract: Statistical learning theory provides bounds on the necessary number of training samples needed to reach a prescribed accuracy in a learning problem formulated over a given target class. This accuracy is typically measured in terms of a generalization error, that is, an expected value of a given loss function. However, for several applications -- for example in a security-critical context or for pr…

    Submitted 28 February, 2023; v1 submitted 26 May, 2022; originally announced May 2022.

    Comments: Accepted at ICLR 2023

  14. The Modern Mathematics of Deep Learning

    Authors: Julius Berner, Philipp Grohs, Gitta Kutyniok, Philipp Petersen

    Abstract: We describe the new field of mathematical analysis of deep learning. This field emerged around a list of research questions that were not answered within the classical framework of learning theory. These questions concern: the outstanding generalization power of overparametrized neural networks, the role of depth in deep architectures, the apparent absence of the curse of dimensionality, the surpr…

    Submitted 8 February, 2023; v1 submitted 9 May, 2021; originally announced May 2021.

    Comments: A version of this review paper appears as a chapter in the book "Mathematical Aspects of Deep Learning" by Cambridge University Press

    Journal ref: Mathematical Aspects of Deep Learning, pp. 1-111. Cambridge University Press, 2022

  15. arXiv:2011.04602  [pdf, other]

    cs.LG math.NA stat.ML

    Numerically Solving Parametric Families of High-Dimensional Kolmogorov Partial Differential Equations via Deep Learning

    Authors: Julius Berner, Markus Dablander, Philipp Grohs

    Abstract: We present a deep learning algorithm for the numerical solution of parametric families of high-dimensional linear Kolmogorov partial differential equations (PDEs). Our method is based on reformulating the numerical approximation of a whole family of Kolmogorov PDEs as a single statistical learning problem using the Feynman-Kac formula. Successful numerical experiments are presented, which empirica…

    Submitted 9 November, 2020; originally announced November 2020.

    Comments: Accepted at NeurIPS 2020

    Journal ref: Advances in Neural Information Processing Systems 33, 2020, pp. 16615-16627

  16. arXiv:1905.09803  [pdf, ps, other]

    cs.LG math.FA math.OC stat.ML

    How degenerate is the parametrization of neural networks with the ReLU activation function?

    Authors: Julius Berner, Dennis Elbrächter, Philipp Grohs

    Abstract: Neural network training is usually accomplished by solving a non-convex optimization problem using stochastic gradient descent. Although one optimizes over the network's parameters, the main loss function generally only depends on the realization of the neural network, i.e. the function it computes. Studying the optimization problem over the space of realizations opens up new ways to understand neu…

    Submitted 8 February, 2023; v1 submitted 23 May, 2019; originally announced May 2019.

    Comments: Accepted at NeurIPS 2019

    Journal ref: Advances in Neural Information Processing Systems 32, 2019, pp. 7790-7801

  17. Towards a regularity theory for ReLU networks -- chain rule and global error estimates

    Authors: Julius Berner, Dennis Elbrächter, Philipp Grohs, Arnulf Jentzen

    Abstract: Although for neural networks with locally Lipschitz continuous activation functions the classical derivative exists almost everywhere, the standard chain rule is in general not applicable. We will consider a way of introducing a derivative for neural networks that admits a chain rule, which is both rigorous and easy to work with. In addition we will present a method of converting approximation res…

    Submitted 13 May, 2019; originally announced May 2019.

    Comments: Accepted for presentation at SampTA 2019

    Journal ref: 13th International conference on Sampling Theory and Applications (SampTA), 2019, pp. 1-5

  18. arXiv:1809.03062  [pdf, ps, other]

    cs.LG math.NA stat.ML

    Analysis of the Generalization Error: Empirical Risk Minimization over Deep Artificial Neural Networks Overcomes the Curse of Dimensionality in the Numerical Approximation of Black-Scholes Partial Differential Equations

    Authors: Julius Berner, Philipp Grohs, Arnulf Jentzen

    Abstract: The development of new classification and regression algorithms based on empirical risk minimization (ERM) over deep neural network hypothesis classes, coined deep learning, revolutionized the area of artificial intelligence, machine learning, and data analysis. In particular, these methods have been applied to the numerical solution of high-dimensional partial differential equations with great su…

    Submitted 11 November, 2020; v1 submitted 9 September, 2018; originally announced September 2018.

    MSC Class: 60H30; 65C30; 62M45; 68T05

    Journal ref: SIAM Journal on Mathematics of Data Science 2(3), 2020, pp. 631-657