Quantitative Biology (q-bio)

On the control of recurrent neural networks using constant inputs
Cyprien Tamekue, Ruiqi Chen, ShiNung Ching
Oct 23 2024 math.OC math.DS q-bio.NC arXiv:2410.17199v1

@misc{2410.17199, author = {Cyprien Tamekue and Ruiqi Chen and ShiNung Ching}, title = {{O}n the control of recurrent neural networks using constant inputs}, year = {2024}, eprint = {2410.17199}, note = {arXiv:2410.17199v1} }
This paper investigates the controllability properties of a general class of recurrent neural networks that are widely used for hypothesis generation in theoretical neuroscience, including the modeling of large-scale human brain dynamics. Our study focuses on the control synthesis of such networks using constant and piecewise constant inputs, motivated by emerging applications in non-invasive neurostimulation such as transcranial direct current stimulation (tDCS). The neural network model considered is a continuous Hopfield-type system with nonlinear activation functions and arbitrary input matrices representing interactions among multiple brain regions. Our main contribution is the formulation and solution of a control synthesis problem for these nonlinear systems. We provide a proper generalization of the variation of the constants formula that constitutes a novel representation of the system's state trajectory. This representation admits a verifiable condition on the existence of the constant control input to solve a short-time two-point boundary value problem in the state space. This formulation admits a synthesis for the input in question, which can be realized using modern algorithmic optimization tools. In the case of linear activation functions, this analysis and synthesis reduces to the verification of algebraic conditions on the system matrices. Simulation results are presented to illustrate the theoretical findings and demonstrate the efficacy of the proposed control strategies. These results offer a novel control synthesis for an important class of neural network models that may, in turn, enable the design of brain stimulation protocols to modulate whole-brain activity in therapeutic and cognitive enhancement applications.
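As an illustration of the linear case mentioned in the abstract, the following is a minimal worked sketch. It assumes a generic linear time-invariant system with a constant input, not the paper's exact model; the symbols $A$, $B$, $x_0$, $x_1$, $T$, and $G(T)$ are introduced here purely for illustration.

```latex
% Assumed generic linear system with a constant input u (not the paper's exact model)
\dot{x}(t) = A x(t) + B u, \qquad x(0) = x_0 .
% Variation of constants with u held constant:
x(T) = e^{AT} x_0 + \Big( \int_0^T e^{As}\, ds \Big) B u .
% Steering x_0 to x_1 in time T then amounts to a linear system in u:
G(T)\, u = x_1 - e^{AT} x_0 , \qquad G(T) := \Big( \int_0^T e^{As}\, ds \Big) B ,
% which is solvable exactly when the right-hand side lies in the range of G(T).
% If A is invertible, the integral has the closed form
\int_0^T e^{As}\, ds = A^{-1} \big( e^{AT} - I \big) .
```

Under these assumptions, the existence of a suitable constant input reduces to a rank-type condition on matrices built from $A$, $B$, and $T$, which matches the flavor of the "algebraic conditions on the system matrices" described in the abstract.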
Variational autoencoders stabilise TCN performance when classifying weakly labelled bioacoustics data
Laia Garrobé Fonollosa, Douglas Gillespie, Lina Stankovic, Vladimir Stankovic, Luke Rendell
Oct 23 2024 cs.SD eess.AS q-bio.QM arXiv:2410.17006v1

@misc{2410.17006, author = {Laia Garrobé Fonollosa and Douglas Gillespie and Lina Stankovic and Vladimir Stankovic and Luke Rendell}, title = {{V}ariational autoencoders stabilise {TCN} performance when classifying weakly labelled bioacoustics data}, year = {2024}, eprint = {2410.17006}, note = {arXiv:2410.17006v1} }
Passive acoustic monitoring (PAM) data is often weakly labelled, audited at the scale of detection presence or absence on timescales of minutes to hours. Moreover, this data exhibits great variability from one deployment to the next, due to differences in ambient noise and the signals across sources and geographies. This study proposes a two-step solution to leverage weakly annotated data for training Deep Learning (DL) detection models. Our case study involves binary classification of the presence/absence of sperm whale (\textit{Physeter macrocephalus}) click trains in 4-minute-long recordings from a dataset comprising diverse sources and deployment conditions to maximise generalisability. We tested methods for extracting acoustic features from lengthy audio segments and integrated Temporal Convolutional Networks (TCNs) trained on the extracted features for sequence classification. For feature extraction, we introduced a new approach using Variational AutoEncoders (VAEs) to extract information from both waveforms and spectrograms, which eliminates the necessity for manual threshold setting or time-consuming strong labelling. For classification, TCNs were trained separately on sequences of either VAE embeddings or handpicked acoustic features extracted from the waveform and spectrogram representations using classical methods, to compare the efficacy of the two approaches. The TCN demonstrated robust classification capabilities on a validation set, achieving accuracies exceeding 85\% when applied to 4-minute acoustic recordings. Notably, TCNs trained on handpicked acoustic features exhibited greater variability in performance across recordings from diverse deployment conditions, whereas those trained on VAEs showed a more consistent performance, highlighting the robust transferability of VAEs for feature extraction across different deployment conditions.
Expected Density of Random Minimizers
Shay Golan, Arseny M. Shur
Oct 23 2024 math.CO q-bio.GN arXiv:2410.16968v1

@misc{2410.16968, author = {Shay Golan and Arseny M.~Shur}, title = {{E}xpected {D}ensity of {R}andom {M}inimizers}, year = {2024}, eprint = {2410.16968}, note = {arXiv:2410.16968v1} }
Minimizer schemes, or just minimizers, are a very important computational primitive in sampling and sketching biological strings. Assuming a fixed alphabet of size $\sigma$, a minimizer is defined by two integers $k,w\ge2$ and a total order $\rho$ on strings of length $k$ (also called $k$-mers). A string is processed by a sliding window algorithm that chooses, in each window of length $w+k-1$, its minimal $k$-mer with respect to $\rho$. A key characteristic of the minimizer is the expected density of chosen $k$-mers among all $k$-mers in a random infinite $\sigma$-ary string. Random minimizers, in which the order $\rho$ is chosen uniformly at random, are often used in applications. However, little is known about their expected density $\mathcal{DR}_\sigma(k,w)$ besides the fact that it is close to $\frac{2}{w+1}$ unless $w\gg k$. We first show that $\mathcal{DR}_\sigma(k,w)$ can be computed in $O(k\sigma^{k+w})$ time. Then we attend to the case $w\le k$ and present a formula that allows one to compute $\mathcal{DR}_\sigma(k,w)$ in just $O(w^2)$ time. Further, we describe the behaviour of $\mathcal{DR}_\sigma(k,w)$ in this case, establishing the connection between $\mathcal{DR}_\sigma(k,w)$, $\mathcal{DR}_\sigma(k+1,w)$, and $\mathcal{DR}_\sigma(k,w+1)$. In particular, we show that $\mathcal{DR}_\sigma(k,w)<\frac{2}{w+1}$ (by a tiny margin) unless $w$ is small. We conclude with some partial results and conjectures for the case $w>k$.
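The sliding-window definition above can be sketched in a few lines of Python. This is a naive illustration, not the algorithm from the paper: the function names (`minimizer_positions`, `density`, `random_order`) are hypothetical, ties are broken by the leftmost position (an assumption), and the random order is emulated by lazily giving each distinct $k$-mer an independent uniform rank.

```python
import random


def minimizer_positions(s, k, w, rank):
    """Return sorted start positions of the k-mers chosen by a minimizer scheme.

    Each window holds w consecutive k-mers (w + k - 1 characters). The k-mer
    with the smallest rank is chosen; ties go to the leftmost position.
    """
    n_kmers = len(s) - k + 1
    chosen = set()
    for start in range(n_kmers - w + 1):
        best = min(range(start, start + w),
                   key=lambda i: (rank(s[i:i + k]), i))
        chosen.add(best)
    return sorted(chosen)


def density(s, k, w, rank):
    """Fraction of k-mer positions in s that are chosen at least once."""
    n_kmers = len(s) - k + 1
    return len(minimizer_positions(s, k, w, rank)) / n_kmers


def random_order(seed):
    """Emulate a random order: each distinct k-mer gets an i.i.d. uniform rank."""
    rng = random.Random(seed)
    ranks = {}

    def rank(kmer):
        if kmer not in ranks:
            ranks[kmer] = rng.random()
        return ranks[kmer]

    return rank


if __name__ == "__main__":
    rng = random.Random(0)
    text = "".join(rng.choice("ACGT") for _ in range(20000))
    k, w = 8, 5
    print("empirical density:", density(text, k, w, random_order(1)))
    print("2 / (w + 1):      ", 2 / (w + 1))
```

On a long random string the empirical value should land near $\frac{2}{w+1}$; this only estimates the density from one sample and does not reproduce the exact computations described in the abstract.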
Optimizing First-Line Therapeutics in Non-Small Cell Lung Cancer: Insights from Joint Modeling and Large-Scale Data Analysis
Benjamin K. Schneider, Sebastien Benzekry, Jonathan P. Mochel
Oct 23 2024 q-bio.BM q-bio.TO arXiv:2410.16967v1

@misc{2410.16967, author = {Benjamin K.~Schneider and Sebastien Benzekry and Jonathan P.~Mochel}, title = {{O}ptimizing {F}irst-{L}ine {T}herapeutics in {N}on-{S}mall {C}ell {L}ung {C}ancer: {I}nsights from {J}oint {M}odeling and {L}arge-{S}cale {D}ata {A}nalysis}, year = {2024}, eprint = {2410.16967}, note = {arXiv:2410.16967v1} }
Non-small cell lung cancer (NSCLC) is often intrinsically resistant to several first- and second-line therapeutics and can rapidly acquire further resistance after a patient begins receiving treatment. Treatment outcomes are therefore significantly impacted by the optimization of therapeutic scheduling. Previous preclinical research has suggested that scheduling bevacizumab in sequence with combination antiproliferatives could significantly improve clinical outcomes. Mathematical modeling is a well-suited tool for investigating this proposed scheduling modification. To address this critical need, individual patient tumor data from 11 clinical trials in NSCLC has been collated and used to develop a semi-mechanistic model of NSCLC growth and response to the various therapeutics represented in those trials. Precise estimates of clinical parameters fundamental to cancer modeling have been produced, such as the rate of acquired resistance to various pharmaceuticals, the relationship between drug concentration and cancer cell death, and the fine temporal dynamics of vascular remodeling in response to bevacizumab. In a reserved portion of the dataset, this model was used to predict the efficacy of individual treatment time courses with a mean error rate of 59.7% after a single tumor measurement and 11.7% after three successive tumor measurements. A delay of 9.6 hours between pemetrexed-cisplatin and bevacizumab administration is predicted to optimize the benefit of sequential administration. At this gap, approximately 93.5% of simulated patients benefited from a gap in sequential administration compared with concomitant administration. Of those simulated patients, the mean improvement in tumor reduction was 20.7%. This result suggests that scheduling a modest gap between the administration of bevacizumab and its partner antiproliferatives could meaningfully improve patient outcomes in NSCLC.
IdenBAT: Disentangled Representation Learning for Identity-Preserved Brain Age Transformation
Junyeong Maeng, Kwanseok Oh, Wonsik Jung, Heung-Il Suk
Oct 23 2024 eess.IV cs.AI cs.CV q-bio.NC arXiv:2410.16945v1

@misc{2410.16945, author = {Junyeong Maeng and Kwanseok Oh and Wonsik Jung and Heung-Il Suk}, title = {{I}den{BAT}: {D}isentangled {R}epresentation {L}earning for {I}dentity-{P}reserved {B}rain {A}ge {T}ransformation}, year = {2024}, eprint = {2410.16945}, note = {arXiv:2410.16945v1} }
Brain age transformation aims to convert reference brain images into synthesized images that accurately reflect the age-specific features of a target age group. The primary objective of this task is to modify only the age-related attributes of the reference image while preserving all other age-irrelevant attributes. However, achieving this goal poses substantial challenges due to the inherent entanglement of various image attributes within features extracted from a backbone encoder, resulting in simultaneous alterations during the image generation. To address this challenge, we propose a novel architecture that employs disentangled representation learning for identity-preserved brain age transformation called IdenBAT. This approach facilitates the decomposition of image features, ensuring the preservation of individual traits while selectively transforming age-related characteristics to match those of the target age group. Through comprehensive experiments conducted on both 2D and full-size 3D brain datasets, our method adeptly converts input images to target age while retaining individual characteristics accurately. Furthermore, our approach demonstrates superiority over existing state-of-the-art regarding performance fidelity.
DNAHLM -- DNA sequence and Human Language mixed large language Model
Wang Liang
Oct 23 2024 q-bio.GN cs.LG arXiv:2410.16917v1

@misc{2410.16917, author = {Wang Liang}, title = {{DNAHLM} -- {DNA} sequence and {H}uman {L}anguage mixed large language {M}odel}, year = {2024}, eprint = {2410.16917}, note = {arXiv:2410.16917v1} }
There are already many DNA large language models, but most of them still follow traditional uses, such as extracting sequence features for classification tasks. More innovative applications of large language models, such as prompt engineering, RAG, and zero-shot or few-shot prediction, remain challenging for DNA-based models. The key issue is that DNA models and human natural language models are entirely separate; however, techniques like prompt engineering require the use of natural language, which significantly limits the application of DNA large language models. This paper introduces a hybrid model trained on the GPT-2 network, combining DNA sequences and English text to explore the potential of using prompts and fine-tuning in DNA models. The model has demonstrated its effectiveness in DNA-related zero-shot prediction and multitask applications.
Topological and Graph Theoretical Analysis of Dynamic Functional Connectivity for Autism Spectrum Disorder
Yuzhe Chen, Dayu Qin, Ercan Engin Kuruoglu
Oct 23 2024 q-bio.NC q-bio.QM arXiv:2410.16874v1

@misc{2410.16874, author = {Yuzhe Chen and Dayu Qin and Ercan Engin Kuruoglu}, title = {{T}opological and {G}raph {T}heoretical {A}nalysis of {D}ynamic {F}unctional {C}onnectivity for {A}utism {S}pectrum {D}isorder}, year = {2024}, eprint = {2410.16874}, note = {arXiv:2410.16874v1} }
Autism Spectrum Disorder (ASD) is a prevalent neurological disorder. However, the multi-faceted symptoms and large individual differences among ASD patients are hindering the diagnosis process, which largely relies on subject descriptions and lacks quantitative biomarkers. To remediate such problems, this paper explores the use of graph theory and topological data analysis (TDA) to study brain activity in ASD patients and normal controls. We employ the Mapper algorithm in TDA and the distance correlation graphical model (DCGM) in graph theory to create brain state networks, then innovatively adopt complex network metrics in Graph signal processing (GSP) and physical quantities to analyze brain activities over time. Our findings reveal statistical differences in network characteristics between ASD and control groups. Compared to normal subjects, brain state networks of ASD patients tend to have decreased modularity, higher von Neumann entropy, increased Betti-0 numbers, and decreased Betti-1 numbers. These findings attest to the biological traits of ASD, suggesting less transitioning in brain dynamics. These findings offer potential biomarkers for ASD diagnosis and deepen our understanding of its neural correlations.
Cortical Dynamics of Neural-Connectivity Fields
Gerald K. Cooray, Vernon Cooray, Karl J. Friston
Oct 23 2024 q-bio.NC arXiv:2410.16852v1

@misc{2410.16852, author = {Gerald K.~Cooray and Vernon Cooray and Karl J.~Friston}, title = {{C}ortical {D}ynamics of {N}eural-{C}onnectivity {F}ields}, year = {2024}, eprint = {2410.16852}, note = {arXiv:2410.16852v1} }
Macroscopic studies of cortical tissue reveal a prevalence of oscillatory activity that reflects a fine tuning of neural interactions. This research extends neural field theories by incorporating generalized oscillatory dynamics into previous work on conservative or semi-conservative neural field dynamics. Prior studies have largely assumed isotropic connections among neural units; however, this study demonstrates that a broad range of anisotropic and fluctuating connections can still sustain oscillations. Using Lagrangian field methods, we examine different types of connectivity, their dynamics, and potential interactions with neural fields. From this theoretical foundation, we derive a framework that incorporates Hebbian and non-Hebbian learning, i.e., plasticity, into the study of neural fields via the concept of a connectivity field.
Bacterial Pathogenicity Regulation by RNA-binding Antiterminators
Diane Soussan, Ali Tahrioui, R R de la Haba, Adrien Forge, Sylvie Chevalier, Olivier Lesouhaitier, Cécile Muller
Oct 23 2024 q-bio.MN arXiv:2410.16752v1

@misc{2410.16752, author = {Diane Soussan and Ali Tahrioui and R R de la Haba and Adrien Forge and Sylvie Chevalier and Olivier Lesouhaitier and Cécile Muller}, title = {{B}acterial {P}athogenicity {R}egulation by {RNA}-binding {A}ntiterminators}, year = {2024}, eprint = {2410.16752}, note = {arXiv:2410.16752v1} }
Antiterminators are essential components of bacterial transcriptional regulation, allowing the control of gene expression in response to fluctuating environmental conditions. RNA-binding antiterminators are particularly important regulatory proteins that play a significant role in preventing transcription termination by binding to specific RNA sequences. These RNA-binding antiterminators have been extensively studied for their roles in regulating various metabolic pathways. However, their role in modulating the physiology of pathogens requires further investigations. This review focuses on these RNA-binding proteins in both Gram-positive and Gram-negative bacteria, particularly on their structures, mechanism of action, and target genes. Additionally, the involvement of the antitermination mechanisms in bacterial pathogenicity will be discussed. This knowledge is crucial for understanding the regulatory mechanisms that govern bacterial pathogenicity, opening up exciting prospects for future research, and potentially new alternative strategies to fight against infectious diseases.
Hierarchical Classification for Predicting Metastasis Using Elastic-Net Regularization on Gene Expression Data
Alex Chu, Benjamin Osafo Agyare, Blessing Oloyede
Oct 23 2024 q-bio.GN arXiv:2410.16741v1

@misc{2410.16741, author = {Alex Chu and Benjamin Osafo Agyare and Blessing Oloyede}, title = {{H}ierarchical {C}lassification for {P}redicting {M}etastasis {U}sing {E}lastic-{N}et {R}egularization on {G}ene {E}xpression {D}ata}, year = {2024}, eprint = {2410.16741}, note = {arXiv:2410.16741v1} }
Metastasis is a leading cause of cancer-related mortality and remains challenging to detect during early stages. Accurate identification of cancers likely to metastasize can improve treatment strategies and patient outcomes. This study leverages publicly available gene expression profiles from primary cancers, with and without distal metastasis, to build predictive models. We utilize elastic net regularization within a hierarchical classification framework to predict both the tissue of origin and the metastasis status of primary tumors. Our elastic net-based hierarchical classification achieved a tissue-of-origin prediction accuracy of 97%, and a metastasis prediction accuracy of 90%. Notably, mitochondrial gene expression exhibited significant negative correlations with metastasis, providing potential biological insights into the underlying mechanisms of cancer progression.
MeMDLM: De Novo Membrane Protein Design with Masked Discrete Diffusion Protein Language Models
Shrey Goel, Vishrut Thoutam, Edgar Mariano Marroquin, Aaron Gokaslan, Arash Firouzbakht, Sophia Vincoff, Volodymyr Kuleshov, Huong T. Kratochvil, Pranam Chatterjee
Oct 23 2024 q-bio.BM arXiv:2410.16735v1

@misc{2410.16735, author = {Shrey Goel and Vishrut Thoutam and Edgar Mariano Marroquin and Aaron Gokaslan and Arash Firouzbakht and Sophia Vincoff and Volodymyr Kuleshov and Huong T.~Kratochvil and Pranam Chatterjee}, title = {{M}e{MDLM}: {D}e {N}ovo {M}embrane {P}rotein {D}esign with {M}asked {D}iscrete {D}iffusion {P}rotein {L}anguage {M}odels}, year = {2024}, eprint = {2410.16735}, note = {arXiv:2410.16735v1} }
Masked Diffusion Language Models (MDLMs) have recently emerged as a strong class of generative models, paralleling state-of-the-art (SOTA) autoregressive (AR) performance across natural language modeling domains. While there have been advances in AR as well as both latent and discrete diffusion-based approaches for protein sequence design, masked diffusion language modeling with protein language models (pLMs) is unexplored. In this work, we introduce MeMDLM, an MDLM tailored for membrane protein design, harnessing the SOTA pLM ESM-2 to de novo generate realistic membrane proteins for downstream experimental applications. Our evaluations demonstrate that MeMDLM-generated proteins exceed AR-based methods by generating sequences with greater transmembrane (TM) character. We further apply our design framework to scaffold soluble and TM motifs in sequences, demonstrating that MeMDLM-reconstructed sequences achieve greater biological similarity to their original counterparts compared to SOTA inpainting methods. Finally, we show that MeMDLM captures physicochemical membrane protein properties with similar fidelity as SOTA pLMs, paving the way for experimental applications. In total, our pipeline motivates future exploration of MDLM-based pLMs for protein design.
AskBeacon -- Performing genomic data exchange and analytics with natural language
Anuradha Wickramarachchi, Shakila Tonni, Sonali Majumdar, Sarvnaz Karimi, Sulev Kõks, Brendan Hosking, Jordi Rambla, Natalie A. Twine, Yatish Jain, Denis C. Bauer
Oct 23 2024 cs.AI cs.CY q-bio.GN arXiv:2410.16700v1

@misc{2410.16700, author = {Anuradha Wickramarachchi and Shakila Tonni and Sonali Majumdar and Sarvnaz Karimi and Sulev Kõks and Brendan Hosking and Jordi Rambla and Natalie A.~Twine and Yatish Jain and Denis C.~Bauer}, title = {{A}sk{B}eacon -- {P}erforming genomic data exchange and analytics with natural language}, year = {2024}, eprint = {2410.16700}, note = {arXiv:2410.16700v1} }
Enabling clinicians and researchers to directly interact with global genomic data resources by removing technological barriers is vital for medical genomics. AskBeacon enables Large Language Models to be applied to securely shared cohorts via the GA4GH Beacon protocol. By simply "asking" Beacon, actionable insights can be gained, analyzed and made publication-ready.
An Exploration of Modeling Approaches for Capturing Seasonal Transmission in Stochastic Epidemic Models
Mahmudul Bari Hridoy
Oct 23 2024 q-bio.PE stat.AP arXiv:2410.16664v1

@misc{2410.16664, author = {Mahmudul Bari Hridoy}, title = {{A}n {E}xploration of {M}odeling {A}pproaches for {C}apturing {S}easonal {T}ransmission in {S}tochastic {E}pidemic {M}odels}, year = {2024}, eprint = {2410.16664}, note = {arXiv:2410.16664v1} }
Seasonal variations in the incidence of infectious diseases are a well-established phenomenon, driven by factors such as climate changes, social behaviors, and ecological interactions that influence host susceptibility and transmission rates. While seasonality plays a significant role in shaping epidemiological dynamics, it is often overlooked in both empirical and theoretical studies. Incorporating seasonal parameters into mathematical models of infectious diseases is crucial for accurately capturing disease dynamics, enhancing the predictive power of these models, and developing successful control strategies. This paper highlights key modeling approaches for incorporating seasonality into disease transmission, including sinusoidal functions, periodic piecewise linear functions, Fourier series expansions, Gaussian functions, and data-driven methods, accompanied by real-world examples. Additionally, a stochastic Susceptible-Infected-Recovered (SIR) model with seasonal transmission is demonstrated through numerical simulations. Important outcome measures, such as the basic and instantaneous reproduction numbers and the probability of a disease outbreak using branching process approximation of the Markov chain, are also presented to illustrate the impact of seasonality on disease dynamics.
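To make the idea of a stochastic SIR model with seasonal transmission concrete, here is a minimal sketch using a sinusoidal transmission rate and an event-driven simulation with thinning (rejection of candidate events). It is not the paper's implementation: the function names, the parameter names (`beta0`, `amp`, `period`, `gamma`), and the frequency-dependent form `beta(t) * S * I / N` are assumptions chosen for illustration.

```python
import math
import random


def seasonal_beta(t, beta0, amp, period):
    """Assumed sinusoidal forcing: beta(t) = beta0 * (1 + amp * cos(2*pi*t/period))."""
    return beta0 * (1.0 + amp * math.cos(2.0 * math.pi * t / period))


def simulate_seasonal_sir(S, I, R, beta0, amp, period, gamma, t_max, seed=0):
    """Simulate a continuous-time Markov chain SIR model with seasonal transmission.

    Because the infection rate changes with time, candidate events are drawn
    at a constant upper-bound rate and accepted in proportion to the true
    rates at the candidate time (thinning). Returns a list of (t, S, I, R).
    """
    if not 0.0 <= amp <= 1.0:
        raise ValueError("amp must lie in [0, 1] so that beta(t) stays non-negative")
    rng = random.Random(seed)
    N = S + I + R
    t = 0.0
    path = [(t, S, I, R)]
    beta_max = beta0 * (1.0 + amp)
    while I > 0:
        # Upper bound on the total event rate while the state is unchanged.
        rate_bound = beta_max * S * I / N + gamma * I
        t += rng.expovariate(rate_bound)
        if t >= t_max:
            break
        infection_rate = seasonal_beta(t, beta0, amp, period) * S * I / N
        recovery_rate = gamma * I
        u = rng.random() * rate_bound
        if u < infection_rate:
            S, I = S - 1, I + 1
        elif u < infection_rate + recovery_rate:
            I, R = I - 1, R + 1
        else:
            continue  # candidate rejected; the state does not change
        path.append((t, S, I, R))
    return path


if __name__ == "__main__":
    path = simulate_seasonal_sir(990, 10, 0, beta0=0.5, amp=0.3,
                                 period=365.0, gamma=0.1, t_max=50.0, seed=1)
    print("events:", len(path) - 1, "final state:", path[-1])
```

Swapping `seasonal_beta` for a piecewise linear, Fourier, or Gaussian forcing function would only require updating the bound `beta_max` accordingly.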
Real-time Sub-milliwatt Epilepsy Detection Implemented on a Spiking Neural Network Edge Inference Processor
Ruixin Lia, Guoxu Zhaoa, Dylan Richard Muir, Yuya Ling, Karla Burelo, Mina Khoei, Dong Wang, Yannan Xing, Ning Qiao
Oct 23 2024 eess.SP cs.LG cs.NE q-bio.NC arXiv:2410.16613v1

@misc{2410.16613, author = {Ruixin Lia and Guoxu Zhaoa and Dylan Richard Muir and Yuya Ling and Karla Burelo and Mina Khoei and Dong Wang and Yannan Xing and Ning Qiao}, title = {{R}eal-time {S}ub-milliwatt {E}pilepsy {D}etection {I}mplemented on a {S}piking {N}eural {N}etwork {E}dge {I}nference {P}rocessor}, year = {2024}, eprint = {2410.16613}, howpublished = {Computers in Biology and Medicine (2024), 183, 109225}, doi = {10.1016/j.compbiomed.2024.109225}, note = {arXiv:2410.16613v1} }
Analyzing electroencephalogram (EEG) signals to detect the epileptic seizure status of a subject presents a challenge to existing technologies aimed at providing timely and efficient diagnosis. In this study, we aimed to detect interictal and ictal periods of epileptic seizures using a spiking neural network (SNN). Our proposed approach provides an online and real-time preliminary diagnosis of epileptic seizures and helps to detect possible pathological conditions. To validate our approach, we conducted experiments using multiple datasets. We utilized a trained SNN to identify the presence of epileptic seizures and compared our results with those of related studies. The SNN model was deployed on Xylo, a digital SNN neuromorphic processor designed to process temporal signals. Xylo efficiently simulates spiking leaky integrate-and-fire neurons with exponential input synapses. Xylo has much lower energy requirements than traditional approaches to signal processing, making it an ideal platform for developing low-power seizure detection systems. Our proposed method has a high test accuracy of 93.3% and 92.9% when classifying ictal and interictal periods. At the same time, the application has an average power consumption of 87.4 uW (I/O power) + 287.9 uW (computational power) when deployed to Xylo. Our method demonstrates excellent low-latency performance when tested on multiple datasets. Our work provides a new solution for seizure detection, and it is expected to be widely used in portable and wearable devices in the future.
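For readers unfamiliar with the neuron model named in the abstract, the following is a generic textbook-style sketch of a discrete-time leaky integrate-and-fire neuron with an exponentially decaying input synapse. It is not Xylo's actual implementation; the function name, the parameter values, and the reset-to-zero rule are assumptions made for illustration.

```python
import math


def simulate_lif(input_spikes, weight=1.0, tau_syn=5.0, tau_mem=10.0,
                 threshold=2.0, dt=1.0):
    """Simulate one leaky integrate-and-fire neuron in discrete time.

    input_spikes: sequence of 0/1 values, one per time step.
    The synaptic current and the membrane potential both decay exponentially;
    each input spike adds `weight` to the synaptic current. When the membrane
    potential reaches `threshold`, the neuron emits a spike and resets to 0.
    Returns the list of output spikes (0/1 per time step).
    """
    syn_decay = math.exp(-dt / tau_syn)
    mem_decay = math.exp(-dt / tau_mem)
    i_syn = 0.0
    v_mem = 0.0
    output = []
    for spike in input_spikes:
        i_syn = i_syn * syn_decay + weight * spike
        v_mem = v_mem * mem_decay + i_syn
        if v_mem >= threshold:
            output.append(1)
            v_mem = 0.0  # assumed reset rule
        else:
            output.append(0)
    return output


if __name__ == "__main__":
    silent = simulate_lif([0] * 20)
    driven = simulate_lif([1] * 20)
    print("spikes without input:", sum(silent))
    print("spikes with constant input:", sum(driven))
```

The two exponential decay factors are the only state-update constants, which is part of why this class of neuron maps well onto low-power digital hardware.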
Gradient-Free Supervised Learning using Spike-Timing-Dependent Plasticity for Image Recognition
Wei Xie
Oct 23 2024 cs.CV q-bio.NC arXiv:2410.16524v1

@misc{2410.16524, author = {Wei Xie}, title = {{G}radient-{F}ree {S}upervised {L}earning using {S}pike-{T}iming-{D}ependent {P}lasticity for {I}mage {R}ecognition}, year = {2024}, eprint = {2410.16524}, note = {arXiv:2410.16524v1} }
An approach to supervised learning in spiking neural networks is presented using a gradient-free method combined with spike-timing-dependent plasticity for image recognition. The proposed network architecture is scalable to multiple layers, enabling the development of more complex and deeper SNN models. The effectiveness of this method is demonstrated by its application to the MNIST dataset, showing good learning accuracy. The proposed method provides a robust and efficient alternative to the backpropagation-based method in supervised learning.
QuickBind: A Light-Weight And Interpretable Molecular Docking Model
Wojtek Treyde, Seohyun Chris Kim, Nazim Bouatta, Mohammed AlQuraishi
Oct 23 2024 q-bio.BM cs.LG arXiv:2410.16474v1

@misc{2410.16474, author = {Wojtek Treyde and Seohyun Chris Kim and Nazim Bouatta and Mohammed AlQuraishi}, title = {{Q}uick{B}ind: {A} {L}ight-{W}eight {A}nd {I}nterpretable {M}olecular {D}ocking {M}odel}, year = {2024}, eprint = {2410.16474}, note = {arXiv:2410.16474v1} }
Predicting a ligand's bound pose to a target protein is a key component of early-stage computational drug discovery. Recent developments in machine learning methods have focused on improving pose quality at the cost of model runtime. For high-throughput virtual screening applications, this exposes a capability gap that can be filled by moderately accurate but fast pose prediction. To this end, we developed QuickBind, a light-weight pose prediction algorithm. We assess QuickBind on widely used benchmarks and find that it provides an attractive trade-off between model accuracy and runtime. To facilitate virtual screening applications, we augment QuickBind with a binding affinity module and demonstrate its capabilities for multiple clinically-relevant drug targets. Finally, we investigate the mechanistic basis by which QuickBind makes predictions and find that it has learned key physicochemical properties of molecular docking, providing new insights into how machine learning models generate protein-ligand poses. By virtue of its simplicity, QuickBind can serve as both an effective virtual screening tool and a minimal test bed for exploring new model architectures and innovations. Model code and weights are available at https://github.com/aqlaboratory/QuickBind .
Comprehensive benchmarking of large language models for RNA secondary structure prediction
L.I. Zablocki, L.A. Bugnon, M. Gerard, L. Di Persia, G. Stegmayer, D.H. Milone
Oct 23 2024 cs.AI cs.LG q-bio.BM arXiv:2410.16212v1

@misc{2410.16212, author = {L.I.~Zablocki and L.A.~Bugnon and M.~Gerard and L.~Di Persia and G.~Stegmayer and D.H.~Milone}, title = {{C}omprehensive benchmarking of large language models for {RNA} secondary structure prediction}, year = {2024}, eprint = {2410.16212}, note = {arXiv:2410.16212v1} }
Inspired by the success of large language models (LLM) for DNA and proteins, several LLM for RNA have been developed recently. RNA-LLM uses large datasets of RNA sequences to learn, in a self-supervised way, how to represent each RNA base with a semantically rich numerical vector. This is done under the hypothesis that obtaining high-quality RNA representations can enhance data-costly downstream tasks. Among them, predicting the secondary structure is a fundamental task for uncovering RNA functional mechanisms. In this work, we present a comprehensive experimental analysis of several pre-trained RNA-LLM, comparing them for the RNA secondary structure prediction task in a unified deep learning framework. The RNA-LLM were assessed with increasing generalization difficulty on benchmark datasets. Results showed that two LLM clearly outperform the other models, and revealed significant challenges for generalization in low-homology scenarios.
The Interplay Between Physical Activity, Protein Consumption, and Sleep Quality in Muscle Protein Synthesis
Ayush Devkota, Manakamana Gautam, Uttam Dhakal, Suman Devkota, Gaurav Kumar Gupta, Ujjwal Nepal, Amey Dinesh Dhuru, Aniket Kumar Singh
Oct 23 2024 q-bio.TO q-bio.CB arXiv:2410.16169v1

@misc{2410.16169, author = {Ayush Devkota and Manakamana Gautam and Uttam Dhakal and Suman Devkota and Gaurav Kumar Gupta and Ujjwal Nepal and Amey Dinesh Dhuru and Aniket Kumar Singh}, title = {{T}he {I}nterplay {B}etween {P}hysical {A}ctivity, {P}rotein {C}onsumption, and {S}leep {Q}uality in {M}uscle {P}rotein {S}ynthesis}, year = {2024}, eprint = {2410.16169}, note = {arXiv:2410.16169v1} }
This systematic review examines the synergistic and individual influences of resistance exercise, dietary protein supplementation, and sleep/recovery on muscle protein synthesis (MPS). Electronic databases such as Scopus, Google Scholar, and Web of Science were extensively used. Studies were selected based on relevance to the criteria and were ensured to be directly applicable to the objectives. Research indicates that a protein dose of 20 to 25 grams maximally stimulates MPS post-resistance training. It is observed that physically frail individuals aged 76 to 92 and middle-aged adults aged 62 to 74 have lower mixed muscle protein synthetic rates than individuals aged 20 to 32. High-whey protein and leucine-enriched supplements enhance MPS more efficiently than standard dairy products in older adults engaged in resistance programs. Similarly, protein intake before sleep boosts overnight MPS rates, which helps prevent muscle loss associated with sleep debt, exercise-induced damage, and muscle-wasting conditions like sarcopenia and cachexia. Resistance exercise is a functional intervention to achieve muscular adaptation and improve function. Future research should focus on variables such as fluctuating fitness levels, age groups, genetics, and lifestyle factors to generate more accurate and beneficial results.
Networks: The Visual Language of Complexity
Blai Vidiella, Salva Duran-Nebreda, Sergi Valverde
Oct 23 2024 cond-mat.dis-nn physics.soc-ph q-bio.MN q-bio.PE arXiv:2410.16158v1

@misc{2410.16158, author = {Blai Vidiella and Salva Duran-Nebreda and Sergi Valverde}, title = {{N}etworks: {T}he {V}isual {L}anguage of {C}omplexity}, year = {2024}, eprint = {2410.16158}, note = {arXiv:2410.16158v1} }
Understanding the origins of complexity is a fundamental challenge with implications for biological and technological systems. Network theory emerges as a powerful tool to model complex systems. Networks are an intuitive framework to represent inter-dependencies among many system components, facilitating the study of both local and global properties. However, it is unclear whether we can define a universal theoretical framework for evolving networks. While basic growth mechanisms, like preferential attachment, recapitulate common properties such as the power-law degree distribution, they fall short in capturing other system-specific properties. Tinkering, on the other hand, has been shown to be very successful in generating modular or nested structures "for free", highlighting the role of internal, non-adaptive mechanisms in the evolution of complexity. Different network extensions, like hypergraphs, have been recently developed to integrate exogenous factors in evolutionary models, as pairwise interactions are insufficient to capture environmentally-mediated species associations. As we confront global societal and climatic challenges, the study of networks and hypergraphs provides valuable insights, emphasizing the importance of scientific exploration in understanding and managing complexity.
Modeling dynamic neural activity by combining naturalistic video stimuli and stimulus-independent latent factors
Finn Schmidt, Suhas Shrinivasan, Polina Turishcheva, Fabian H. Sinz
Oct 23 2024 q-bio.NC cs.AI arXiv:2410.16136v1

@misc{2410.16136, author = {Finn Schmidt and Suhas Shrinivasan and Polina Turishcheva and Fabian H.~Sinz}, title = {{M}odeling dynamic neural activity by combining naturalistic video stimuli and stimulus-independent latent factors}, year = {2024}, eprint = {2410.16136}, note = {arXiv:2410.16136v1} }
Understanding how the brain processes dynamic natural stimuli remains a fundamental challenge in neuroscience. Current dynamic neural encoding models either take stimuli as input but ignore shared variability in neural responses, or they model this variability by deriving latent embeddings from neural responses or behavior while ignoring the visual input. To address this gap, we propose a probabilistic model that incorporates video inputs along with stimulus-independent latent factors to capture variability in neuronal responses, predicting a joint distribution for the entire population. After training and testing our model on mouse V1 neuronal responses, we found that it outperforms video-only models in terms of log-likelihood and achieves further improvements when conditioned on responses from other neurons. Furthermore, we find that the learned latent factors strongly correlate with mouse behavior, although the model was trained without behavior data.
The role of spike-timing-dependent plasticity and random inputs in neurodegenerative diseases and neuromorphic systems
Thoa Thieu, Roderick Melnik
Oct 23 2024 q-bio.NC arXiv:2410.16123v1

@misc{2410.16123, author = {Thoa Thieu and Roderick Melnik}, title = {{T}he role of spike-timing-dependent plasticity and random inputs in neurodegenerative diseases and neuromorphic systems}, year = {2024}, eprint = {2410.16123}, note = {arXiv:2410.16123v1} }
Neuronal oscillations are related to symptoms of Parkinson's disease (PD). Random inputs could affect such oscillations in the brain states that translate collective activities of neurons interconnected via synaptic connections. In this paper, we study the coupled effects of channels and synaptic dynamics under stochastic influence, together with spike-timing-dependent plasticity (STDP) of healthy brain cells, with applications to PD. In particular, we investigate the effects of random inputs and input correlations in a subthalamic nucleus (STN) cell membrane potential model. Our numerical results show that random inputs strongly affect the spiking activities of the STN neuron not only in the case of healthy cells but also in the case of PD cells in the presence of deep brain stimulation (DBS) treatment. STDP increases the interspike interval (ISI) regularity of the spike trains of the output neurons. However, a random refractory period and random input current in the system may substantially increase the irregularity of the spike trains of the output neurons. Furthermore, stochastic influence together with STDP could increase the correlation of the neurons. These effects could potentially contribute to the management of PD symptoms.
Computational design of target-specific linear peptide binders with TransformerBeta
Haowen Zhao, Francesco A. Aprile, Barbara Bravi
Oct 23 2024 q-bio.BM cs.LG arXiv:2410.16302v1

@misc{2410.16302, author = {Haowen Zhao and Francesco A.~Aprile and Barbara Bravi}, title = {{C}omputational design of target-specific linear peptide binders with {T}ransformer{B}eta}, year = {2024}, eprint = {2410.16302}, note = {arXiv:2410.16302v1} }
The computational prediction and design of peptide binders targeting specific linear epitopes is crucial in biological and biomedical research, yet it remains challenging due to their highly dynamic nature and the scarcity of experimentally solved binding data. To address this problem, we built an unprecedentedly large-scale library of peptide pairs within stable secondary structures (beta sheets), leveraging newly available AlphaFold predicted structures. We then developed a machine learning method based on the Transformer architecture for the design of specific linear binders, in analogy to a language translation task. Our method, TransformerBeta, accurately predicts specific beta strand interactions and samples sequences with beta sheet-like molecular properties, while capturing interpretable physico-chemical interaction patterns. As such, it can propose specific candidate binders targeting linear epitopes for experimental validation to inform protein design.
