-
gRNAde: Geometric Deep Learning for 3D RNA inverse design
Authors:
Chaitanya K. Joshi,
Arian R. Jamasb,
Ramon Viñas,
Charles Harris,
Simon V. Mathis,
Alex Morehead,
Rishabh Anand,
Pietro Liò
Abstract:
Computational RNA design tasks are often posed as inverse problems, where sequences are designed based on adopting a single desired secondary structure without considering 3D geometry and conformational diversity. We introduce gRNAde, a geometric RNA design pipeline operating on 3D RNA backbones to design sequences that explicitly account for structure and dynamics. Under the hood, gRNAde is a mul…
▽ More
Computational RNA design tasks are often posed as inverse problems, where sequences are designed based on adopting a single desired secondary structure without considering 3D geometry and conformational diversity. We introduce gRNAde, a geometric RNA design pipeline operating on 3D RNA backbones to design sequences that explicitly account for structure and dynamics. Under the hood, gRNAde is a multi-state Graph Neural Network that generates candidate RNA sequences conditioned on one or more 3D backbone structures where the identities of the bases are unknown. On a single-state fixed backbone re-design benchmark of 14 RNA structures from the PDB identified by Das et al. [2010], gRNAde obtains higher native sequence recovery rates (56% on average) compared to Rosetta (45% on average), taking under a second to produce designs compared to the reported hours for Rosetta. We further demonstrate the utility of gRNAde on a new benchmark of multi-state design for structurally flexible RNAs, as well as zero-shot ranking of mutational fitness landscapes in a retrospective analysis of a recent ribozyme. Open source code: https://github.com/chaitjo/geometric-rna-design
△ Less
Submitted 6 October, 2024; v1 submitted 24 May, 2023;
originally announced May 2023.
-
A Graph-based Imputation Method for Sparse Medical Records
Authors:
Ramon Vinas,
Xu Zheng,
Jer Hayes
Abstract:
Electronic Medical Records (EHR) are extremely sparse. Only a small proportion of events (symptoms, diagnoses, and treatments) are observed in the lifetime of an individual. The high degree of missingness of EHR can be attributed to a large number of factors, including device failure, privacy concerns, or other unexpected reasons. Unfortunately, many traditional imputation methods are not well sui…
▽ More
Electronic Medical Records (EHR) are extremely sparse. Only a small proportion of events (symptoms, diagnoses, and treatments) are observed in the lifetime of an individual. The high degree of missingness of EHR can be attributed to a large number of factors, including device failure, privacy concerns, or other unexpected reasons. Unfortunately, many traditional imputation methods are not well suited for highly sparse data and scale poorly to high dimensional datasets. In this paper, we propose a graph-based imputation method that is both robust to sparsity and to unreliable unmeasured events. Our approach compares favourably to several standard and state-of-the-art imputation methods in terms of performance and runtime. Moreover, results indicate that the model learns to embed different event types in a clinically meaningful way. Our work can facilitate the diagnosis of novel diseases based on the clinical history of past events, with the potential to increase our understanding of the landscape of comorbidities.
△ Less
Submitted 17 November, 2021;
originally announced November 2021.
-
Attentional Meta-learners for Few-shot Polythetic Classification
Authors:
Ben Day,
Ramon Viñas,
Nikola Simidjievski,
Pietro Liò
Abstract:
Polythetic classifications, based on shared patterns of features that need neither be universal nor constant among members of a class, are common in the natural world and greatly outnumber monothetic classifications over a set of features. We show that threshold meta-learners, such as Prototypical Networks, require an embedding dimension that is exponential in the number of task-relevant features…
▽ More
Polythetic classifications, based on shared patterns of features that need neither be universal nor constant among members of a class, are common in the natural world and greatly outnumber monothetic classifications over a set of features. We show that threshold meta-learners, such as Prototypical Networks, require an embedding dimension that is exponential in the number of task-relevant features to emulate these functions. In contrast, attentional classifiers, such as Matching Networks, are polythetic by default and able to solve these problems with a linear embedding dimension. However, we find that in the presence of task-irrelevant features, inherent to meta-learning problems, attentional models are susceptible to misclassification. To address this challenge, we propose a self-attention feature-selection mechanism that adaptively dilutes non-discriminative features. We demonstrate the effectiveness of our approach in meta-learning Boolean functions, and synthetic and real-world few-shot learning tasks.
△ Less
Submitted 27 June, 2022; v1 submitted 9 June, 2021;
originally announced June 2021.
-
Investigating Estimated Kolmogorov Complexity as a Means of Regularization for Link Prediction
Authors:
Paris D. L. Flood,
Ramon Viñas,
Pietro Liò
Abstract:
Link prediction in graphs is an important task in the fields of network science and machine learning. We investigate a flexible means of regularization for link prediction based on an approximation of the Kolmogorov complexity of graphs that is differentiable and compatible with recent advances in link prediction algorithms. Informally, the Kolmogorov complexity of an object is the length of the s…
▽ More
Link prediction in graphs is an important task in the fields of network science and machine learning. We investigate a flexible means of regularization for link prediction based on an approximation of the Kolmogorov complexity of graphs that is differentiable and compatible with recent advances in link prediction algorithms. Informally, the Kolmogorov complexity of an object is the length of the shortest computer program that produces the object. Complex networks are often generated, in part, by simple mechanisms; for example, many citation networks and social networks are approximately scale-free and can be explained by preferential attachment. A preference for predicting graphs with simpler generating mechanisms motivates our choice of Kolmogorov complexity as a regularization term. In our experiments the regularization method shows good performance on many diverse real-world networks, however we determine that this is likely due to an aggregation method rather than any actual estimation of Kolmogorov complexity.
△ Less
Submitted 23 February, 2021; v1 submitted 7 June, 2020;
originally announced June 2020.