-
LeanAgent: Lifelong Learning for Formal Theorem Proving
Authors:
Adarsh Kumarappan,
Mo Tiwari,
Peiyang Song,
Robert Joseph George,
Chaowei Xiao,
Anima Anandkumar
Abstract:
Large Language Models (LLMs) have been successful in mathematical reasoning tasks such as formal theorem proving when integrated with interactive proof assistants like Lean. Existing approaches involve training or fine-tuning an LLM on a specific dataset to perform well on particular domains, such as undergraduate-level mathematics. These methods struggle with generalizability to advanced mathemat…
▽ More
Large Language Models (LLMs) have been successful in mathematical reasoning tasks such as formal theorem proving when integrated with interactive proof assistants like Lean. Existing approaches involve training or fine-tuning an LLM on a specific dataset to perform well on particular domains, such as undergraduate-level mathematics. These methods struggle with generalizability to advanced mathematics. A fundamental limitation is that these approaches operate on static domains, failing to capture how mathematicians often work across multiple domains and projects simultaneously or cyclically. We present LeanAgent, a novel lifelong learning framework for theorem proving that continuously generalizes to and improves on ever-expanding mathematical knowledge without forgetting previously learned knowledge. LeanAgent introduces several key innovations, including a curriculum learning strategy that optimizes the learning trajectory in terms of mathematical difficulty, a dynamic database for efficient management of evolving mathematical knowledge, and progressive training to balance stability and plasticity. LeanAgent successfully proves 162 theorems previously unproved by humans across 23 diverse Lean repositories, many from advanced mathematics. It performs significantly better than the static LLM baseline, proving challenging theorems in domains like abstract algebra and algebraic topology while showcasing a clear progression of learning from basic concepts to advanced topics. In addition, we analyze LeanAgent's superior performance on key lifelong learning metrics. LeanAgent achieves exceptional scores in stability and backward transfer, where learning new tasks improves performance on previously learned tasks. This emphasizes LeanAgent's continuous generalizability and improvement, explaining its superior theorem-proving performance.
△ Less
Submitted 17 October, 2024; v1 submitted 8 October, 2024;
originally announced October 2024.
-
Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs
Authors:
Md Ashiqur Rahman,
Robert Joseph George,
Mogab Elleithy,
Daniel Leibovici,
Zongyi Li,
Boris Bonev,
Colin White,
Julius Berner,
Raymond A. Yeh,
Jean Kossaifi,
Kamyar Azizzadenesheli,
Anima Anandkumar
Abstract:
Existing neural operator architectures face challenges when solving multiphysics problems with coupled partial differential equations (PDEs), due to complex geometries, interactions between physical variables, and the lack of large amounts of high-resolution training data. To address these issues, we propose Codomain Attention Neural Operator (CoDA-NO), which tokenizes functions along the codomain…
▽ More
Existing neural operator architectures face challenges when solving multiphysics problems with coupled partial differential equations (PDEs), due to complex geometries, interactions between physical variables, and the lack of large amounts of high-resolution training data. To address these issues, we propose Codomain Attention Neural Operator (CoDA-NO), which tokenizes functions along the codomain or channel space, enabling self-supervised learning or pretraining of multiple PDE systems. Specifically, we extend positional encoding, self-attention, and normalization layers to the function space. CoDA-NO can learn representations of different PDE systems with a single model. We evaluate CoDA-NO's potential as a backbone for learning multiphysics PDEs over multiple systems by considering few-shot learning settings. On complex downstream tasks with limited data, such as fluid flow simulations and fluid-structure interactions, we found CoDA-NO to outperform existing methods on the few-shot learning task by over $36\%$. The code is available at https://github.com/ashiq24/CoDA-NO.
△ Less
Submitted 5 April, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
Hybrid Dealiased Convolutions
Authors:
Robert Joseph George,
Noel Murasko,
John C. Bowman
Abstract:
This paper proposes a practical and efficient solution for computing convolutions using hybrid dealiasing. It offers an alternative to explicit or implicit dealiasing and includes an optimized hyperparameter tuning algorithm that uses experience to find the optimal parameters. Machine learning algorithms and efficient heuristics are also developed to estimate optimal parameters for larger convolut…
▽ More
This paper proposes a practical and efficient solution for computing convolutions using hybrid dealiasing. It offers an alternative to explicit or implicit dealiasing and includes an optimized hyperparameter tuning algorithm that uses experience to find the optimal parameters. Machine learning algorithms and efficient heuristics are also developed to estimate optimal parameters for larger convolution problems using only small squares/rectangles.
△ Less
Submitted 14 May, 2023;
originally announced June 2023.
-
Incremental Spatial and Spectral Learning of Neural Operators for Solving Large-Scale PDEs
Authors:
Robert Joseph George,
Jiawei Zhao,
Jean Kossaifi,
Zongyi Li,
Anima Anandkumar
Abstract:
Fourier Neural Operators (FNO) offer a principled approach to solving challenging partial differential equations (PDE) such as turbulent flows. At the core of FNO is a spectral layer that leverages a discretization-convergent representation in the Fourier domain, and learns weights over a fixed set of frequencies. However, training FNO presents two significant challenges, particularly in large-sca…
▽ More
Fourier Neural Operators (FNO) offer a principled approach to solving challenging partial differential equations (PDE) such as turbulent flows. At the core of FNO is a spectral layer that leverages a discretization-convergent representation in the Fourier domain, and learns weights over a fixed set of frequencies. However, training FNO presents two significant challenges, particularly in large-scale, high-resolution applications: (i) Computing Fourier transform on high-resolution inputs is computationally intensive but necessary since fine-scale details are needed for solving many PDEs, such as fluid flows, (ii) selecting the relevant set of frequencies in the spectral layers is challenging, and too many modes can lead to overfitting, while too few can lead to underfitting. To address these issues, we introduce the Incremental Fourier Neural Operator (iFNO), which progressively increases both the number of frequency modes used by the model as well as the resolution of the training data. We empirically show that iFNO reduces total training time while maintaining or improving generalization performance across various datasets. Our method demonstrates a 10% lower testing error, using 20% fewer frequency modes compared to the existing Fourier Neural Operator, while also achieving a 30% faster training.
△ Less
Submitted 4 March, 2024; v1 submitted 28 November, 2022;
originally announced November 2022.
-
Numerical Analysis for Real-time Nonlinear Model Predictive Control of Ethanol Steam Reformers
Authors:
Robert Joseph George,
Xinwei Yu
Abstract:
The utilization of renewable energy technologies, particularly hydrogen, has seen a boom in interest and has spread throughout the world. Ethanol steam reformation is one of the primary methods capable of producing hydrogen efficiently and reliably. This paper provides an in-depth study of the reformulated system both theoretically and numerically, as well as a plan to explore the possibility of c…
▽ More
The utilization of renewable energy technologies, particularly hydrogen, has seen a boom in interest and has spread throughout the world. Ethanol steam reformation is one of the primary methods capable of producing hydrogen efficiently and reliably. This paper provides an in-depth study of the reformulated system both theoretically and numerically, as well as a plan to explore the possibility of converting the system into its conservation form. Lastly, we offer an overview of several numerical approaches for solving the general first-order quasi-linear hyperbolic equation to the particular model for ethanol steam reforming (ESR). We conclude by presenting some results that would enable the usage of these ODE/PDE solvers to be used in non-linear model predictive control (NMPC) algorithms and discuss the limitations of our approach and directions for future work.
△ Less
Submitted 25 April, 2023; v1 submitted 24 October, 2022;
originally announced October 2022.