subscribe to arXiv mailings

Celcomen: spatial causal disentanglement for single-cell and tissue perturbation modeling

Authors: Stathis Megas, Daniel G. Chen, Krzysztof Polanski, Moshe Eliasof, Carola-Bibiane Schonlieb, Sarah A. Teichmann

Abstract: Celcomen leverages a mathematical causality framework to disentangle intra- and inter- cellular gene regulation programs in spatial transcriptomics and single-cell data through a generative graph neural network. It can learn gene-gene interactions, as well as generate post-perturbation counterfactual spatial transcriptomics, thereby offering access to experimentally inaccessible samples. We valida… ▽ More Celcomen leverages a mathematical causality framework to disentangle intra- and inter- cellular gene regulation programs in spatial transcriptomics and single-cell data through a generative graph neural network. It can learn gene-gene interactions, as well as generate post-perturbation counterfactual spatial transcriptomics, thereby offering access to experimentally inaccessible samples. We validated its disentanglement, identifiability, and counterfactual prediction capabilities through simulations and in clinically relevant human glioblastoma, human fetal spleen, and mouse lung cancer samples. Celcomen provides the means to model disease and therapy induced changes allowing for new insights into single-cell spatially resolved tissue responses relevant to human health. △ Less

Submitted 9 September, 2024; originally announced September 2024.

arXiv:2408.10436 [pdf, other]

Learning Regularization for Graph Inverse Problems

Authors: Moshe Eliasof, Md Shahriar Rahim Siddiqui, Carola-Bibiane Schönlieb, Eldad Haber

Abstract: In recent years, Graph Neural Networks (GNNs) have been utilized for various applications ranging from drug discovery to network design and social networks. In many applications, it is impossible to observe some properties of the graph directly; instead, noisy and indirect measurements of these properties are available. These scenarios are coined as Graph Inverse Problems (GRIP). In this work, we… ▽ More In recent years, Graph Neural Networks (GNNs) have been utilized for various applications ranging from drug discovery to network design and social networks. In many applications, it is impossible to observe some properties of the graph directly; instead, noisy and indirect measurements of these properties are available. These scenarios are coined as Graph Inverse Problems (GRIP). In this work, we introduce a framework leveraging GNNs to solve GRIPs. The framework is based on a combination of likelihood and prior terms, which are used to find a solution that fits the data while adhering to learned prior information. Specifically, we propose to combine recent deep learning techniques that were developed for inverse problems, together with GNN architectures, to formulate and solve GRIP. We study our approach on a number of representative problems that demonstrate the effectiveness of the framework. △ Less

Submitted 19 August, 2024; originally announced August 2024.

arXiv:2407.02013 [pdf, other]

DiGRAF: Diffeomorphic Graph-Adaptive Activation Function

Authors: Krishna Sri Ipsit Mantri, Xinzhi Wang, Carola-Bibiane Schönlieb, Bruno Ribeiro, Beatrice Bevilacqua, Moshe Eliasof

Abstract: In this paper, we propose a novel activation function tailored specifically for graph data in Graph Neural Networks (GNNs). Motivated by the need for graph-adaptive and flexible activation functions, we introduce DiGRAF, leveraging Continuous Piecewise-Affine Based (CPAB) transformations, which we augment with an additional GNN to learn a graph-adaptive diffeomorphic activation function in an end-… ▽ More In this paper, we propose a novel activation function tailored specifically for graph data in Graph Neural Networks (GNNs). Motivated by the need for graph-adaptive and flexible activation functions, we introduce DiGRAF, leveraging Continuous Piecewise-Affine Based (CPAB) transformations, which we augment with an additional GNN to learn a graph-adaptive diffeomorphic activation function in an end-to-end manner. In addition to its graph-adaptivity and flexibility, DiGRAF also possesses properties that are widely recognized as desirable for activation functions, such as differentiability, boundness within the domain and computational efficiency. We conduct an extensive set of experiments across diverse datasets and tasks, demonstrating a consistent and superior performance of DiGRAF compared to traditional and graph-specific activation functions, highlighting its effectiveness as an activation function for GNNs. △ Less

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.19253 [pdf, other]

Advection Augmented Convolutional Neural Networks

Authors: Niloufar Zakariaei, Siddharth Rout, Eldad Haber, Moshe Eliasof

Abstract: Many problems in physical sciences are characterized by the prediction of space-time sequences. Such problems range from weather prediction to the analysis of disease propagation and video prediction. Modern techniques for the solution of these problems typically combine Convolution Neural Networks (CNN) architecture with a time prediction mechanism. However, oftentimes, such approaches underperfo… ▽ More Many problems in physical sciences are characterized by the prediction of space-time sequences. Such problems range from weather prediction to the analysis of disease propagation and video prediction. Modern techniques for the solution of these problems typically combine Convolution Neural Networks (CNN) architecture with a time prediction mechanism. However, oftentimes, such approaches underperform in the long-range propagation of information and lack explainability. In this work, we introduce a physically inspired architecture for the solution of such problems. Namely, we propose to augment CNNs with advection by designing a novel semi-Lagrangian push operator. We show that the proposed operator allows for the non-local transformation of information compared with standard convolutional kernels. We then complement it with Reaction and Diffusion neural components to form a network that mimics the Reaction-Advection-Diffusion equation, in high dimensions. We demonstrate the effectiveness of our network on a number of spatio-temporal datasets that show their merit. △ Less

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.10871 [pdf, other]

Graph Neural Reaction Diffusion Models

Authors: Moshe Eliasof, Eldad Haber, Eran Treister

Abstract: The integration of Graph Neural Networks (GNNs) and Neural Ordinary and Partial Differential Equations has been extensively studied in recent years. GNN architectures powered by neural differential equations allow us to reason about their behavior, and develop GNNs with desired properties such as controlled smoothing or energy conservation. In this paper we take inspiration from Turing instabiliti… ▽ More The integration of Graph Neural Networks (GNNs) and Neural Ordinary and Partial Differential Equations has been extensively studied in recent years. GNN architectures powered by neural differential equations allow us to reason about their behavior, and develop GNNs with desired properties such as controlled smoothing or energy conservation. In this paper we take inspiration from Turing instabilities in a Reaction Diffusion (RD) system of partial differential equations, and propose a novel family of GNNs based on neural RD systems. We \textcolor{black}{demonstrate} that our RDGNN is powerful for the modeling of various data types, from homophilic, to heterophilic, and spatio-temporal datasets. We discuss the theoretical properties of our RDGNN, its implementation, and show that it improves or offers competitive performance to state-of-the-art methods. △ Less

Submitted 16 June, 2024; originally announced June 2024.

Comments: Accepted at SIAM Journal on Scientific Computing (Submitted for review on 06/2023)

arXiv:2406.10863 [pdf, other]

Global-Local Graph Neural Networks for Node-Classification

Authors: Moshe Eliasof, Eran Treister

Abstract: The task of graph node classification is often approached by utilizing a local Graph Neural Network (GNN), that learns only local information from the node input features and their adjacency. In this paper, we propose to improve the performance of node classification GNNs by utilizing both global and local information, specifically by learning label- and node- features. We therefore call our metho… ▽ More The task of graph node classification is often approached by utilizing a local Graph Neural Network (GNN), that learns only local information from the node input features and their adjacency. In this paper, we propose to improve the performance of node classification GNNs by utilizing both global and local information, specifically by learning label- and node- features. We therefore call our method Global-Local-GNN (GLGNN). To learn proper label features, for each label, we maximize the similarity between its features and nodes features that belong to the label, while maximizing the distance between nodes that do not belong to the considered label. We then use the learnt label features to predict the node classification map. We demonstrate our GLGNN using three different GNN backbones, and show that our approach improves baseline performance, revealing the importance of global information utilization for node classification. △ Less

Submitted 16 June, 2024; originally announced June 2024.

Comments: Accepted to Pattern Recognition Letters (Submitted for review on 03/2023)

arXiv:2405.01009 [pdf, other]

Tackling Graph Oversquashing by Global and Local Non-Dissipativity

Authors: Alessio Gravina, Moshe Eliasof, Claudio Gallicchio, Davide Bacciu, Carola-Bibiane Schönlieb

Abstract: A common problem in Message-Passing Neural Networks is oversquashing -- the limited ability to facilitate effective information flow between distant nodes. Oversquashing is attributed to the exponential decay in information transmission as node distances increase. This paper introduces a novel perspective to address oversquashing, leveraging properties of global and local non-dissipativity, that e… ▽ More A common problem in Message-Passing Neural Networks is oversquashing -- the limited ability to facilitate effective information flow between distant nodes. Oversquashing is attributed to the exponential decay in information transmission as node distances increase. This paper introduces a novel perspective to address oversquashing, leveraging properties of global and local non-dissipativity, that enable the maintenance of a constant information flow rate. Namely, we present SWAN, a uniquely parameterized model GNN with antisymmetry both in space and weight domains, as a means to obtain non-dissipativity. Our theoretical analysis asserts that by achieving these properties, SWAN offers an enhanced ability to transmit information over extended distances. Empirical evaluations on synthetic and real-world benchmarks that emphasize long-range interactions validate the theoretical understanding of SWAN, and its ability to mitigate oversquashing. △ Less

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.13344 [pdf, other]

GRANOLA: Adaptive Normalization for Graph Neural Networks

Authors: Moshe Eliasof, Beatrice Bevilacqua, Carola-Bibiane Schönlieb, Haggai Maron

Abstract: In recent years, significant efforts have been made to refine the design of Graph Neural Network (GNN) layers, aiming to overcome diverse challenges, such as limited expressive power and oversmoothing. Despite their widespread adoption, the incorporation of off-the-shelf normalization layers like BatchNorm or InstanceNorm within a GNN architecture may not effectively capture the unique characteris… ▽ More In recent years, significant efforts have been made to refine the design of Graph Neural Network (GNN) layers, aiming to overcome diverse challenges, such as limited expressive power and oversmoothing. Despite their widespread adoption, the incorporation of off-the-shelf normalization layers like BatchNorm or InstanceNorm within a GNN architecture may not effectively capture the unique characteristics of graph-structured data, potentially reducing the expressive power of the overall architecture. Moreover, existing graph-specific normalization layers often struggle to offer substantial and consistent benefits. In this paper, we propose GRANOLA, a novel graph-adaptive normalization layer. Unlike existing normalization layers, GRANOLA normalizes node features by adapting to the specific characteristics of the graph, particularly by generating expressive representations of its neighborhood structure, obtained by leveraging the propagation of Random Node Features (RNF) in the graph. We present theoretical results that support our design choices. Our extensive empirical evaluation of various graph benchmarks underscores the superior performance of GRANOLA over existing normalization techniques. Furthermore, GRANOLA emerges as the top-performing method among all baselines within the same time complexity of Message Passing Neural Networks (MPNNs). △ Less

Submitted 20 April, 2024; originally announced April 2024.

arXiv:2404.04874 [pdf, other]

Graph Neural Networks for Binary Programming

Authors: Moshe Eliasof, Eldad Haber

Abstract: This paper investigates a link between Graph Neural Networks (GNNs) and Binary Programming (BP) problems, laying the groundwork for GNNs to approximate solutions for these computationally challenging problems. By analyzing the sensitivity of BP problems, we are able to frame the solution of BP problems as a heterophilic node classification task. We then propose Binary-Programming GNN (BPGNN), an a… ▽ More This paper investigates a link between Graph Neural Networks (GNNs) and Binary Programming (BP) problems, laying the groundwork for GNNs to approximate solutions for these computationally challenging problems. By analyzing the sensitivity of BP problems, we are able to frame the solution of BP problems as a heterophilic node classification task. We then propose Binary-Programming GNN (BPGNN), an architecture that integrates graph representation learning techniques with BP-aware features to approximate BP solutions efficiently. Additionally, we introduce a self-supervised data generation mechanism, to enable efficient and tractable training data acquisition even for large-scale BP problems. Experimental evaluations of BPGNN across diverse BP problem sizes showcase its superior performance compared to exhaustive search and heuristic approaches. Finally, we discuss open challenges in the under-explored field of BP problems with GNNs. △ Less

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2402.04653 [pdf, other]

An Over Complete Deep Learning Method for Inverse Problems

Authors: Moshe Eliasof, Eldad Haber, Eran Treister

Abstract: Obtaining meaningful solutions for inverse problems has been a major challenge with many applications in science and engineering. Recent machine learning techniques based on proximal and diffusion-based methods have shown promising results. However, as we show in this work, they can also face challenges when applied to some exemplary problems. We show that similar to previous works on over-complet… ▽ More Obtaining meaningful solutions for inverse problems has been a major challenge with many applications in science and engineering. Recent machine learning techniques based on proximal and diffusion-based methods have shown promising results. However, as we show in this work, they can also face challenges when applied to some exemplary problems. We show that similar to previous works on over-complete dictionaries, it is possible to overcome these shortcomings by embedding the solution into higher dimensions. The novelty of the work proposed is that we jointly design and learn the embedding and the regularizer for the embedding vector. We demonstrate the merit of this approach on several exemplary and common inverse problems. △ Less

Submitted 7 February, 2024; originally announced February 2024.

arXiv:2401.11074 [pdf, other]

On The Temporal Domain of Differential Equation Inspired Graph Neural Networks

Authors: Moshe Eliasof, Eldad Haber, Eran Treister, Carola-Bibiane Schönlieb

Abstract: Graph Neural Networks (GNNs) have demonstrated remarkable success in modeling complex relationships in graph-structured data. A recent innovation in this field is the family of Differential Equation-Inspired Graph Neural Networks (DE-GNNs), which leverage principles from continuous dynamical systems to model information flow on graphs with built-in properties such as feature smoothing or preservat… ▽ More Graph Neural Networks (GNNs) have demonstrated remarkable success in modeling complex relationships in graph-structured data. A recent innovation in this field is the family of Differential Equation-Inspired Graph Neural Networks (DE-GNNs), which leverage principles from continuous dynamical systems to model information flow on graphs with built-in properties such as feature smoothing or preservation. However, existing DE-GNNs rely on first or second-order temporal dependencies. In this paper, we propose a neural extension to those pre-defined temporal dependencies. We show that our model, called TDE-GNN, can capture a wide range of temporal dynamics that go beyond typical first or second-order methods, and provide use cases where existing temporal models are challenged. We demonstrate the benefit of learning the temporal dependencies using our method rather than using pre-defined temporal dynamics on several graph benchmarks. △ Less

Submitted 19 January, 2024; originally announced January 2024.

Comments: AISTATS 2024

arXiv:2311.06942 [pdf, other]

Resilient Graph Neural Networks: A Coupled Dynamical Systems Approach

Authors: Moshe Eliasof, Davide Murari, Ferdia Sherry, Carola-Bibiane Schönlieb

Abstract: Graph Neural Networks (GNNs) have established themselves as a key component in addressing diverse graph-based tasks. Despite their notable successes, GNNs remain susceptible to input perturbations in the form of adversarial attacks. This paper introduces an innovative approach to fortify GNNs against adversarial perturbations through the lens of coupled dynamical systems. Our method introduces gra… ▽ More Graph Neural Networks (GNNs) have established themselves as a key component in addressing diverse graph-based tasks. Despite their notable successes, GNNs remain susceptible to input perturbations in the form of adversarial attacks. This paper introduces an innovative approach to fortify GNNs against adversarial perturbations through the lens of coupled dynamical systems. Our method introduces graph neural layers based on differential equations with contractive properties, which, as we show, improve the robustness of GNNs. A distinctive feature of the proposed approach is the simultaneous learned evolution of both the node features and the adjacency matrix, yielding an intrinsic enhancement of model robustness to perturbations in the input features and the connectivity of the graph. We mathematically derive the underpinnings of our novel architecture and provide theoretical insights to reason about its expected behavior. We demonstrate the efficacy of our method through numerous real-world benchmarks, reading on par or improved performance compared to existing methods. △ Less

Submitted 11 September, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

Comments: ECAI 2024

arXiv:2310.20082 [pdf, other]

Efficient Subgraph GNNs by Learning Effective Selection Policies

Authors: Beatrice Bevilacqua, Moshe Eliasof, Eli Meirom, Bruno Ribeiro, Haggai Maron

Abstract: Subgraph GNNs are provably expressive neural architectures that learn graph representations from sets of subgraphs. Unfortunately, their applicability is hampered by the computational complexity associated with performing message passing on many subgraphs. In this paper, we consider the problem of learning to select a small subset of the large set of possible subgraphs in a data-driven fashion. We… ▽ More Subgraph GNNs are provably expressive neural architectures that learn graph representations from sets of subgraphs. Unfortunately, their applicability is hampered by the computational complexity associated with performing message passing on many subgraphs. In this paper, we consider the problem of learning to select a small subset of the large set of possible subgraphs in a data-driven fashion. We first motivate the problem by proving that there are families of WL-indistinguishable graphs for which there exist efficient subgraph selection policies: small subsets of subgraphs that can already identify all the graphs within the family. We then propose a new approach, called Policy-Learn, that learns how to select subgraphs in an iterative manner. We prove that, unlike popular random policies and prior work addressing the same problem, our architecture is able to learn the efficient policies mentioned above. Our experimental results demonstrate that Policy-Learn outperforms existing baselines across a wide range of datasets. △ Less

Submitted 20 March, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

Comments: ICLR 2024 Camera Ready; 27 pages, 3 figures

arXiv:2307.16092 [pdf, ps, other]

Feature Transportation Improves Graph Neural Networks

Authors: Moshe Eliasof, Eldad Haber, Eran Treister

Abstract: Graph neural networks (GNNs) have shown remarkable success in learning representations for graph-structured data. However, GNNs still face challenges in modeling complex phenomena that involve feature transportation. In this paper, we propose a novel GNN architecture inspired by Advection-Diffusion-Reaction systems, called ADR-GNN. Advection models feature transportation, while diffusion captures… ▽ More Graph neural networks (GNNs) have shown remarkable success in learning representations for graph-structured data. However, GNNs still face challenges in modeling complex phenomena that involve feature transportation. In this paper, we propose a novel GNN architecture inspired by Advection-Diffusion-Reaction systems, called ADR-GNN. Advection models feature transportation, while diffusion captures the local smoothing of features, and reaction represents the non-linear transformation between feature channels. We provide an analysis of the qualitative behavior of ADR-GNN, that shows the benefit of combining advection, diffusion, and reaction. To demonstrate its efficacy, we evaluate ADR-GNN on real-world node classification and spatio-temporal datasets, and show that it improves or offers competitive performance compared to state-of-the-art networks. △ Less

Submitted 20 December, 2023; v1 submitted 29 July, 2023; originally announced July 2023.

Comments: AAAI 2024

arXiv:2304.00015 [pdf, other]

DRIP: Deep Regularizers for Inverse Problems

Authors: Moshe Eliasof, Eldad Haber, Eran Treister

Abstract: In this paper we consider inverse problems that are mathematically ill-posed. That is, given some (noisy) data, there is more than one solution that approximately fits the data. In recent years, deep neural techniques that find the most appropriate solution, in the sense that it contains a-priori information, were developed. However, they suffer from several shortcomings. First, most techniques ca… ▽ More In this paper we consider inverse problems that are mathematically ill-posed. That is, given some (noisy) data, there is more than one solution that approximately fits the data. In recent years, deep neural techniques that find the most appropriate solution, in the sense that it contains a-priori information, were developed. However, they suffer from several shortcomings. First, most techniques cannot guarantee that the solution fits the data at inference. Second, while the derivation of the techniques is inspired by the existence of a valid scalar regularization function, such techniques do not in practice rely on such a function, and therefore veer away from classical variational techniques. In this work we introduce a new family of neural regularizers for the solution of inverse problems. These regularizers are based on a variational formulation and are guaranteed to fit the data. We demonstrate their use on a number of highly ill-posed problems, from image deblurring to limited angle tomography. △ Less

Submitted 25 August, 2023; v1 submitted 30 March, 2023; originally announced April 2023.

arXiv:2303.02918 [pdf, other]

Graph Positional Encoding via Random Feature Propagation

Authors: Moshe Eliasof, Fabrizio Frasca, Beatrice Bevilacqua, Eran Treister, Gal Chechik, Haggai Maron

Abstract: Two main families of node feature augmentation schemes have been explored for enhancing GNNs: random features and spectral positional encoding. Surprisingly, however, there is still no clear understanding of the relation between these two augmentation schemes. Here we propose a novel family of positional encoding schemes which draws a link between the above two approaches and improves over both. T… ▽ More Two main families of node feature augmentation schemes have been explored for enhancing GNNs: random features and spectral positional encoding. Surprisingly, however, there is still no clear understanding of the relation between these two augmentation schemes. Here we propose a novel family of positional encoding schemes which draws a link between the above two approaches and improves over both. The new approach, named Random Feature Propagation (RFP), is inspired by the power iteration method and its generalizations. It concatenates several intermediate steps of an iterative algorithm for computing the dominant eigenvectors of a propagation matrix, starting from random node features. Notably, these propagation steps are based on graph-dependent propagation operators that can be either predefined or learned. We explore the theoretical and empirical benefits of RFP. First, we provide theoretical justifications for using random features, for incorporating early propagation steps, and for using multiple random initializations. Then, we empirically demonstrate that RFP significantly outperforms both spectral PE and random features in multiple node classification and graph classification benchmarks. △ Less

Submitted 19 July, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

Comments: ICML 2023

arXiv:2211.16631 [pdf, other]

Every Node Counts: Improving the Training of Graph Neural Networks on Node Classification

Authors: Moshe Eliasof, Eldad Haber, Eran Treister

Abstract: Graph Neural Networks (GNNs) are prominent in handling sparse and unstructured data efficiently and effectively. Specifically, GNNs were shown to be highly effective for node classification tasks, where labelled information is available for only a fraction of the nodes. Typically, the optimization process, through the objective function, considers only labelled nodes while ignoring the rest. In th… ▽ More Graph Neural Networks (GNNs) are prominent in handling sparse and unstructured data efficiently and effectively. Specifically, GNNs were shown to be highly effective for node classification tasks, where labelled information is available for only a fraction of the nodes. Typically, the optimization process, through the objective function, considers only labelled nodes while ignoring the rest. In this paper, we propose novel objective terms for the training of GNNs for node classification, aiming to exploit all the available data and improve accuracy. Our first term seeks to maximize the mutual information between node and label features, considering both labelled and unlabelled nodes in the optimization process. Our second term promotes anisotropic smoothness in the prediction maps. Lastly, we propose a cross-validating gradients approach to enhance the learning from labelled data. Our proposed objectives are general and can be applied to various GNNs and require no architectural modifications. Extensive experiments demonstrate our approach using popular GNNs like GCN, GAT and GCNII, reading a consistent and significant accuracy improvement on 10 real-world node classification datasets. △ Less

Submitted 29 November, 2022; originally announced November 2022.

arXiv:2210.17224 [pdf, other]

Improving Graph Neural Networks with Learnable Propagation Operators

Authors: Moshe Eliasof, Lars Ruthotto, Eran Treister

Abstract: Graph Neural Networks (GNNs) are limited in their propagation operators. In many cases, these operators often contain non-negative elements only and are shared across channels, limiting the expressiveness of GNNs. Moreover, some GNNs suffer from over-smoothing, limiting their depth. On the other hand, Convolutional Neural Networks (CNNs) can learn diverse propagation filters, and phenomena like ov… ▽ More Graph Neural Networks (GNNs) are limited in their propagation operators. In many cases, these operators often contain non-negative elements only and are shared across channels, limiting the expressiveness of GNNs. Moreover, some GNNs suffer from over-smoothing, limiting their depth. On the other hand, Convolutional Neural Networks (CNNs) can learn diverse propagation filters, and phenomena like over-smoothing are typically not apparent in CNNs. In this paper, we bridge these gaps by incorporating trainable channel-wise weighting factors $ω$ to learn and mix multiple smoothing and sharpening propagation operators at each layer. Our generic method is called $ω$GNN, and is easy to implement. We study two variants: $ω$GCN and $ω$GAT. For $ω$GCN, we theoretically analyse its behaviour and the impact of $ω$ on the obtained node features. Our experiments confirm these findings, demonstrating and explaining how both variants do not over-smooth. Additionally, we experiment with 15 real-world datasets on node- and graph-classification tasks, where our $ω$GCN and $ω$GAT perform on par with state-of-the-art methods. △ Less

Submitted 5 May, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

Comments: Accepted to the International Conference on Machine Learning (ICML) 2023

arXiv:2210.11810 [pdf, other]

Unsupervised Image Semantic Segmentation through Superpixels and Graph Neural Networks

Authors: Moshe Eliasof, Nir Ben Zikri, Eran Treister

Abstract: Unsupervised image segmentation is an important task in many real-world scenarios where labelled data is of scarce availability. In this paper we propose a novel approach that harnesses recent advances in unsupervised learning using a combination of Mutual Information Maximization (MIM), Neural Superpixel Segmentation and Graph Neural Networks (GNNs) in an end-to-end manner, an approach that has n… ▽ More Unsupervised image segmentation is an important task in many real-world scenarios where labelled data is of scarce availability. In this paper we propose a novel approach that harnesses recent advances in unsupervised learning using a combination of Mutual Information Maximization (MIM), Neural Superpixel Segmentation and Graph Neural Networks (GNNs) in an end-to-end manner, an approach that has not been explored yet. We take advantage of the compact representation of superpixels and combine it with GNNs in order to learn strong and semantically meaningful representations of images. Specifically, we show that our GNN based approach allows to model interactions between distant pixels in the image and serves as a strong prior to existing CNNs for an improved accuracy. Our experiments reveal both the qualitative and quantitative advantages of our approach compared to current state-of-the-art methods over four popular datasets. △ Less

Submitted 21 October, 2022; originally announced October 2022.

arXiv:2208.09433 [pdf, other]

Estimating a potential without the agony of the partition function

Authors: Eldad Haber, Moshe Eliasof, Luis Tenorio

Abstract: Estimating a Gibbs density function given a sample is an important problem in computational statistics and statistical learning. Although the well established maximum likelihood method is commonly used, it requires the computation of the partition function (i.e., the normalization of the density). This function can be easily calculated for simple low-dimensional problems but its computation is d… ▽ More Estimating a Gibbs density function given a sample is an important problem in computational statistics and statistical learning. Although the well established maximum likelihood method is commonly used, it requires the computation of the partition function (i.e., the normalization of the density). This function can be easily calculated for simple low-dimensional problems but its computation is difficult or even intractable for general densities and high-dimensional problems. In this paper we propose an alternative approach based on Maximum A-Posteriori (MAP) estimators, we name Maximum Recovery MAP (MR-MAP), to derive estimators that do not require the computation of the partition function, and reformulate the problem as an optimization problem. We further propose a least-action type potential that allows us to quickly solve the optimization problem as a feed-forward hyperbolic neural network. We demonstrate the effectiveness of our methods on some standard data sets. △ Less

Submitted 11 March, 2023; v1 submitted 19 August, 2022; originally announced August 2022.

arXiv:2207.07408 [pdf, other]

pathGCN: Learning General Graph Spatial Operators from Paths

Authors: Moshe Eliasof, Eldad Haber, Eran Treister

Abstract: Graph Convolutional Networks (GCNs), similarly to Convolutional Neural Networks (CNNs), are typically based on two main operations - spatial and point-wise convolutions. In the context of GCNs, differently from CNNs, a pre-determined spatial operator based on the graph Laplacian is often chosen, allowing only the point-wise operations to be learnt. However, learning a meaningful spatial operator i… ▽ More Graph Convolutional Networks (GCNs), similarly to Convolutional Neural Networks (CNNs), are typically based on two main operations - spatial and point-wise convolutions. In the context of GCNs, differently from CNNs, a pre-determined spatial operator based on the graph Laplacian is often chosen, allowing only the point-wise operations to be learnt. However, learning a meaningful spatial operator is critical for developing more expressive GCNs for improved performance. In this paper we propose pathGCN, a novel approach to learn the spatial operator from random paths on the graph. We analyze the convergence of our method and its difference from existing GCNs. Furthermore, we discuss several options of combining our learnt spatial operator with point-wise convolutions. Our extensive experiments on numerous datasets suggest that by properly learning both the spatial and point-wise convolutions, phenomena like over-smoothing can be inherently avoided, and new state-of-the-art performance is achieved. △ Less

Submitted 15 July, 2022; originally announced July 2022.

Comments: ICML 2022

arXiv:2206.10213 [pdf, other]

Rethinking Unsupervised Neural Superpixel Segmentation

Authors: Moshe Eliasof, Nir Ben Zikri, Eran Treister

Abstract: Recently, the concept of unsupervised learning for superpixel segmentation via CNNs has been studied. Essentially, such methods generate superpixels by convolutional neural network (CNN) employed on a single image, and such CNNs are trained without any labels or further information. Thus, such approach relies on the incorporation of priors, typically by designing an objective function that guides… ▽ More Recently, the concept of unsupervised learning for superpixel segmentation via CNNs has been studied. Essentially, such methods generate superpixels by convolutional neural network (CNN) employed on a single image, and such CNNs are trained without any labels or further information. Thus, such approach relies on the incorporation of priors, typically by designing an objective function that guides the solution towards a meaningful superpixel segmentation. In this paper we propose three key elements to improve the efficacy of such networks: (i) the similarity of the \emph{soft} superpixelated image compared to the input image, (ii) the enhancement and consideration of object edges and boundaries and (iii) a modified architecture based on atrous convolution, which allow for a wider field of view, functioning as a multi-scale component in our network. By experimenting with the BSDS500 dataset, we find evidence to the significance of our proposal, both qualitatively and quantitatively. △ Less

Submitted 21 June, 2022; originally announced June 2022.

Comments: ICIP 2022

arXiv:2110.04824 [pdf, other]

Haar Wavelet Feature Compression for Quantized Graph Convolutional Networks

Authors: Moshe Eliasof, Benjamin Bodner, Eran Treister

Abstract: Graph Convolutional Networks (GCNs) are widely used in a variety of applications, and can be seen as an unstructured version of standard Convolutional Neural Networks (CNNs). As in CNNs, the computational cost of GCNs for large input graphs (such as large point clouds or meshes) can be high and inhibit the use of these networks, especially in environments with low computational resources. To ease… ▽ More Graph Convolutional Networks (GCNs) are widely used in a variety of applications, and can be seen as an unstructured version of standard Convolutional Neural Networks (CNNs). As in CNNs, the computational cost of GCNs for large input graphs (such as large point clouds or meshes) can be high and inhibit the use of these networks, especially in environments with low computational resources. To ease these costs, quantization can be applied to GCNs. However, aggressive quantization of the feature maps can lead to a significant degradation in performance. On a different note, Haar wavelet transforms are known to be one of the most effective and efficient approaches to compress signals. Therefore, instead of applying aggressive quantization to feature maps, we propose to utilize Haar wavelet compression and light quantization to reduce the computations and the bandwidth involved with the network. We demonstrate that this approach surpasses aggressive feature quantization by a significant margin, for a variety of problems ranging from node classification to point cloud classification and part and semantic segmentation. △ Less

Submitted 10 October, 2021; originally announced October 2021.

arXiv:2109.00095 [pdf, ps, other]

doi 10.1007/s40687-022-00354-y

Quantized Convolutional Neural Networks Through the Lens of Partial Differential Equations

Authors: Ido Ben-Yair, Gil Ben Shalom, Moshe Eliasof, Eran Treister

Abstract: Quantization of Convolutional Neural Networks (CNNs) is a common approach to ease the computational burden involved in the deployment of CNNs, especially on low-resource edge devices. However, fixed-point arithmetic is not natural to the type of computations involved in neural networks. In this work, we explore ways to improve quantized CNNs using PDE-based perspective and analysis. First, we harn… ▽ More Quantization of Convolutional Neural Networks (CNNs) is a common approach to ease the computational burden involved in the deployment of CNNs, especially on low-resource edge devices. However, fixed-point arithmetic is not natural to the type of computations involved in neural networks. In this work, we explore ways to improve quantized CNNs using PDE-based perspective and analysis. First, we harness the total variation (TV) approach to apply edge-aware smoothing to the feature maps throughout the network. This aims to reduce outliers in the distribution of values and promote piece-wise constant maps, which are more suitable for quantization. Secondly, we consider symmetric and stable variants of common CNNs for image classification, and Graph Convolutional Networks (GCNs) for graph node-classification. We demonstrate through several experiments that the property of forward stability preserves the action of a network under different quantization rates. As a result, stable quantized networks behave similarly to their non-quantized counterparts even though they rely on fewer parameters. We also find that at times, stability even aids in improving accuracy. These properties are of particular interest for sensitive, resource-constrained, low-power or real-time applications like autonomous driving. △ Less

Submitted 3 August, 2022; v1 submitted 31 August, 2021; originally announced September 2021.

arXiv:2108.01938 [pdf, other]

PDE-GCN: Novel Architectures for Graph Neural Networks Motivated by Partial Differential Equations

Authors: Moshe Eliasof, Eldad Haber, Eran Treister

Abstract: Graph neural networks are increasingly becoming the go-to approach in various fields such as computer vision, computational biology and chemistry, where data are naturally explained by graphs. However, unlike traditional convolutional neural networks, deep graph networks do not necessarily yield better performance than shallow graph networks. This behavior usually stems from the over-smoothing phe… ▽ More Graph neural networks are increasingly becoming the go-to approach in various fields such as computer vision, computational biology and chemistry, where data are naturally explained by graphs. However, unlike traditional convolutional neural networks, deep graph networks do not necessarily yield better performance than shallow graph networks. This behavior usually stems from the over-smoothing phenomenon. In this work, we propose a family of architectures to control this behavior by design. Our networks are motivated by numerical methods for solving Partial Differential Equations (PDEs) on manifolds, and as such, their behavior can be explained by similar analysis. Moreover, as we demonstrate using an extensive set of experiments, our PDE-motivated networks can generalize and be effective for various types of problems from different fields. Our architectures obtain better or on par with the current state-of-the-art results for problems that are typically approached using different architectures. △ Less

Submitted 26 October, 2021; v1 submitted 4 August, 2021; originally announced August 2021.

Comments: NeurIPS 2021

arXiv:2102.03881 [pdf, other]

Mimetic Neural Networks: A unified framework for Protein Design and Folding

Authors: Moshe Eliasof, Tue Boesen, Eldad Haber, Chen Keasar, Eran Treister

Abstract: Recent advancements in machine learning techniques for protein folding motivate better results in its inverse problem -- protein design. In this work we introduce a new graph mimetic neural network, MimNet, and show that it is possible to build a reversible architecture that solves the structure and design problems in tandem, allowing to improve protein design when the structure is better estimate… ▽ More Recent advancements in machine learning techniques for protein folding motivate better results in its inverse problem -- protein design. In this work we introduce a new graph mimetic neural network, MimNet, and show that it is possible to build a reversible architecture that solves the structure and design problems in tandem, allowing to improve protein design when the structure is better estimated. We use the ProteinNet data set and show that the state of the art results in protein design can be improved, given recent architectures for protein folding. △ Less

Submitted 7 February, 2021; originally announced February 2021.

arXiv:2011.09128 [pdf, other]

MGIC: Multigrid-in-Channels Neural Network Architectures

Authors: Moshe Eliasof, Jonathan Ephrath, Lars Ruthotto, Eran Treister

Abstract: We present a multigrid-in-channels (MGIC) approach that tackles the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs). Thereby our approach addresses the redundancy in CNNs that is also exposed by the recent success of lightweight CNNs. Lightweight CNNs can achieve comparable accuracy to standard CNNs with fewer par… ▽ More We present a multigrid-in-channels (MGIC) approach that tackles the quadratic growth of the number of parameters with respect to the number of channels in standard convolutional neural networks (CNNs). Thereby our approach addresses the redundancy in CNNs that is also exposed by the recent success of lightweight CNNs. Lightweight CNNs can achieve comparable accuracy to standard CNNs with fewer parameters; however, the number of weights still scales quadratically with the CNN's width. Our MGIC architectures replace each CNN block with an MGIC counterpart that utilizes a hierarchy of nested grouped convolutions of small group size to address this. Hence, our proposed architectures scale linearly with respect to the network's width while retaining full coupling of the channels as in standard CNNs. Our extensive experiments on image classification, segmentation, and point cloud classification show that applying this strategy to different architectures like ResNet and MobileNetV3 reduces the number of parameters while obtaining similar or better accuracy. △ Less

Submitted 26 September, 2022; v1 submitted 17 November, 2020; originally announced November 2020.

Comments: This paper supersedes arXiv:2006.06799 Accepted to SISC

arXiv:2006.04115 [pdf, ps, other]

DiffGCN: Graph Convolutional Networks via Differential Operators and Algebraic Multigrid Pooling

Authors: Moshe Eliasof, Eran Treister

Abstract: Graph Convolutional Networks (GCNs) have shown to be effective in handling unordered data like point clouds and meshes. In this work we propose novel approaches for graph convolution, pooling and unpooling, inspired from finite differences and algebraic multigrid frameworks. We form a parameterized convolution kernel based on discretized differential operators, leveraging the graph mass, gradient… ▽ More Graph Convolutional Networks (GCNs) have shown to be effective in handling unordered data like point clouds and meshes. In this work we propose novel approaches for graph convolution, pooling and unpooling, inspired from finite differences and algebraic multigrid frameworks. We form a parameterized convolution kernel based on discretized differential operators, leveraging the graph mass, gradient and Laplacian. This way, the parameterization does not depend on the graph structure, only on the meaning of the network convolutions as differential operators. To allow hierarchical representations of the input, we propose pooling and unpooling operations that are based on algebraic multigrid methods, which are mainly used to solve partial differential equations on unstructured grids. To motivate and explain our method, we compare it to standard convolutional neural networks, and show their similarities and relations in the case of a regular grid. Our proposed method is demonstrated in various experiments like classification and part-segmentation, achieving on par or better than state of the art results. We also analyze the computational cost of our method compared to other GCNs. △ Less

Submitted 22 October, 2020; v1 submitted 7 June, 2020; originally announced June 2020.

arXiv:1910.13157 [pdf, other]

doi 10.1109/JSTSP.2020.2972775

LeanConvNets: Low-cost Yet Effective Convolutional Neural Networks

Authors: Jonathan Ephrath, Moshe Eliasof, Lars Ruthotto, Eldad Haber, Eran Treister

Abstract: Convolutional Neural Networks (CNNs) have become indispensable for solving machine learning tasks in speech recognition, computer vision, and other areas that involve high-dimensional data. A CNN filters the input feature using a network containing spatial convolution operators with compactly supported stencils. In practice, the input data and the hidden features consist of a large number of chann… ▽ More Convolutional Neural Networks (CNNs) have become indispensable for solving machine learning tasks in speech recognition, computer vision, and other areas that involve high-dimensional data. A CNN filters the input feature using a network containing spatial convolution operators with compactly supported stencils. In practice, the input data and the hidden features consist of a large number of channels, which in most CNNs are fully coupled by the convolution operators. This coupling leads to immense computational cost in the training and prediction phase. In this paper, we introduce LeanConvNets that are derived by sparsifying fully-coupled operators in existing CNNs. Our goal is to improve the efficiency of CNNs by reducing the number of weights, floating point operations and latency times, with minimal loss of accuracy. Our lean convolution operators involve tuning parameters that controls the trade-off between the network's accuracy and computational costs. These convolutions can be used in a wide range of existing networks, and we exemplify their use in residual networks (ResNets). Using a range of benchmark problems from image classification and semantic segmentation, we demonstrate that the resulting LeanConvNet's accuracy is close to state-of-the-art networks while being computationally less expensive. In our tests, the lean versions of ResNet in most cases outperform comparable reduced architectures such as MobileNets and ShuffleNets. △ Less

Submitted 12 February, 2020; v1 submitted 29 October, 2019; originally announced October 2019.

arXiv:1904.10379 [pdf, other]

Multi-modal 3D Shape Reconstruction Under Calibration Uncertainty using Parametric Level Set Methods

Authors: Moshe Eliasof, Andrei Sharf, Eran Treister

Abstract: We consider the problem of 3D shape reconstruction from multi-modal data, given uncertain calibration parameters. Typically, 3D data modalities can be in diverse forms such as sparse point sets, volumetric slices, 2D photos and so on. To jointly process these data modalities, we exploit a parametric level set method that utilizes ellipsoidal radial basis functions. This method not only allows us t… ▽ More We consider the problem of 3D shape reconstruction from multi-modal data, given uncertain calibration parameters. Typically, 3D data modalities can be in diverse forms such as sparse point sets, volumetric slices, 2D photos and so on. To jointly process these data modalities, we exploit a parametric level set method that utilizes ellipsoidal radial basis functions. This method not only allows us to analytically and compactly represent the object, it also confers on us the ability to overcome calibration related noise that originates from inaccurate acquisition parameters. This essentially implicit regularization leads to a highly robust and scalable reconstruction, surpassing other traditional methods. In our results we first demonstrate the ability of the method to compactly represent complex objects. We then show that our reconstruction method is robust both to a small number of measurements and to noise in the acquisition parameters. Finally, we demonstrate our reconstruction abilities from diverse modalities such as volume slices obtained from liquid displacement (similar to CTscans and XRays), and visual measurements obtained from shape silhouettes. △ Less

Submitted 20 December, 2019; v1 submitted 23 April, 2019; originally announced April 2019.

Showing 1–30 of 30 results for author: Eliasof, M