-
Dynamical Measure Transport and Neural PDE Solvers for Sampling
Authors:
Jingtong Sun,
Julius Berner,
Lorenz Richter,
Marius Zeinhofer,
Johannes Müller,
Kamyar Azizzadenesheli,
Anima Anandkumar
Abstract:
The task of sampling from a probability density can be approached as transporting a tractable density function to the target, known as dynamical measure transport. In this work, we tackle it through a principled unified framework using deterministic or stochastic evolutions described by partial differential equations (PDEs). This framework incorporates prior trajectory-based sampling methods, such as diffusion models or Schrödinger bridges, without relying on the concept of time-reversals. Moreover, it allows us to propose novel numerical methods for solving the transport task and thus sampling from complicated targets without the need for the normalization constant or data samples. We employ physics-informed neural networks (PINNs) to approximate the respective PDE solutions, which brings both conceptual and computational advantages. In particular, PINNs allow for simulation- and discretization-free optimization and can be trained very efficiently, leading to significantly better mode coverage in the sampling task compared to alternative methods. Moreover, they can readily be fine-tuned with Gauss-Newton methods to achieve high accuracy in sampling.
Submitted 10 July, 2024;
originally announced July 2024.
-
Kronecker-Factored Approximate Curvature for Physics-Informed Neural Networks
Authors:
Felix Dangel,
Johannes Müller,
Marius Zeinhofer
Abstract:
Physics-informed neural networks (PINNs) are infamous for being hard to train. Recently, second-order methods based on natural gradient and Gauss-Newton methods have shown promising performance, improving the accuracy achieved by first-order methods by several orders of magnitude. While promising, the proposed methods only scale to networks with a few thousand parameters due to the high computational cost of evaluating, storing, and inverting the curvature matrix. We propose Kronecker-factored approximate curvature (KFAC) for PINN losses that greatly reduces the computational cost and allows scaling to much larger networks. Our approach goes beyond the established KFAC for traditional deep learning problems as it captures contributions from a PDE's differential operator that are crucial for optimization. To establish KFAC for such losses, we use Taylor-mode automatic differentiation to describe the differential operator's computation graph as a forward network with shared weights. This allows us to apply KFAC thanks to a recently developed general formulation for networks with weight sharing. Empirically, we find that our KFAC-based optimizers are competitive with expensive second-order methods on small problems, scale more favorably to higher-dimensional neural networks and PDEs, and consistently outperform first-order methods and L-BFGS.
Submitted 27 May, 2024; v1 submitted 24 May, 2024;
originally announced May 2024.
-
Achieving High Accuracy with PINNs via Energy Natural Gradients
Authors:
Johannes Müller,
Marius Zeinhofer
Abstract:
We propose energy natural gradient descent, a natural gradient method with respect to a Hessian-induced Riemannian metric as an optimization algorithm for physics-informed neural networks (PINNs) and the deep Ritz method. As a main motivation we show that the update direction in function space resulting from the energy natural gradient corresponds to the Newton direction modulo an orthogonal projection onto the model's tangent space. We demonstrate experimentally that energy natural gradient descent yields highly accurate solutions with errors several orders of magnitude smaller than what is obtained when training PINNs with standard optimizers like gradient descent or Adam, even when those are allowed significantly more computation time.
Submitted 15 August, 2023; v1 submitted 25 February, 2023;
originally announced February 2023.
-
The Deep Ritz Method for Parametric $p$-Dirichlet Problems
Authors:
Alex Kaltenbach,
Marius Zeinhofer
Abstract:
We establish error estimates for the approximation of parametric $p$-Dirichlet problems using the Deep Ritz Method. Parametric dependencies include, e.g., varying geometries and exponents $p\in (1,\infty)$. Combining the derived error estimates with quantitative approximation theorems yields error decay rates and establishes that the Deep Ritz Method retains the favorable approximation capabilities of neural networks in the approximation of high-dimensional functions, which makes the method attractive for parametric problems. Finally, we present numerical examples to illustrate potential applications.
Submitted 5 July, 2022;
originally announced July 2022.
-
Error Estimates for the Deep Ritz Method with Boundary Penalty
Authors:
Johannes Müller,
Marius Zeinhofer
Abstract:
We estimate the error of the Deep Ritz Method for linear elliptic equations. For Dirichlet boundary conditions, we estimate the error when the boundary values are imposed through the boundary penalty method. Our results apply to arbitrary sets of ansatz functions and estimate the error in terms of the optimization accuracy, the approximation capabilities of the ansatz class and (in the case of Dirichlet boundary values) the penalization strength $\lambda$. To the best of our knowledge, our results are presently the only ones in the literature that treat the case of Dirichlet boundary conditions in full generality, i.e., without a lower order term that leads to coercivity on all of $H^1(\Omega)$. Further, we discuss the implications of our results for ansatz classes which are given through ReLU networks and the relation to existing estimates for finite element functions. For high-dimensional problems, our results show that the favourable approximation capabilities of neural networks for smooth functions are inherited by the Deep Ritz Method.
Submitted 5 September, 2022; v1 submitted 1 March, 2021;
originally announced March 2021.
-
Deep Ritz revisited
Authors:
Johannes Müller,
Marius Zeinhofer
Abstract:
Recently, progress has been made in the application of neural networks to the numerical analysis of partial differential equations (PDEs). In one such approach, the variational formulation of the Poisson problem is used to obtain an objective function, a regularised Dirichlet energy, for the optimisation of neural networks. In these notes we use the notion of $\Gamma$-convergence to show that ReLU networks of growing architecture that are trained with respect to suitably regularised Dirichlet energies converge to the true solution of the Poisson problem. We discuss how this approach generalises to arbitrary variational problems under certain universality assumptions of neural networks and see that this covers some nonlinear stationary PDEs like the $p$-Laplace.
Submitted 10 January, 2020; v1 submitted 9 December, 2019;
originally announced December 2019.