Quantitative Biology
Showing new listings for Friday, 25 October 2024
- [1] arXiv:2410.18094 [pdf, other]
Title: Self-supervised inter-intra period-aware ECG representation learning for detecting atrial fibrillation
Authors: Xiangqian Zhu, Mengnan Shi, Xuexin Yu, Chang Liu, Xiaocong Lian, Jintao Fei, Jiangying Luo, Xin Jin, Ping Zhang, Xiangyang Ji
Comments: Preprint submitted to Biomedical Signal Processing and Control
Subjects: Quantitative Methods (q-bio.QM); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Signal Processing (eess.SP)
Atrial fibrillation is a commonly encountered clinical arrhythmia associated with stroke and increased mortality. Since professional medical knowledge is required for annotation, exploiting a large corpus of ECGs to develop accurate supervised learning-based atrial fibrillation algorithms remains challenging. Self-supervised learning (SSL) is a promising recipe for generalized ECG representation learning, eliminating the dependence on expensive labeling. However, without well-designed incorporation of knowledge related to atrial fibrillation, existing SSL approaches typically fail to capture robust ECG representations. In this paper, we propose an inter-intra period-aware ECG representation learning approach. Considering that ECGs of atrial fibrillation patients exhibit irregularity in RR intervals and the absence of P-waves, we develop specific pre-training tasks for interperiod and intraperiod representations, aiming to learn the single-period stable morphology representation while retaining crucial interperiod features. After further fine-tuning, our approach achieves AUCs of 0.953 and 0.996 on the BTCH dataset for paroxysmal and persistent atrial fibrillation detection, respectively. On the commonly used benchmarks CinC2017 and CPSC2021, the generalization capability and effectiveness of our methodology are substantiated with competitive results.
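The RR-interval irregularity mentioned in the abstract can be illustrated with a simple summary statistic. The sketch below is not the authors' pre-training task; it only computes the coefficient of variation of successive RR intervals, and the function names and R-peak times are hypothetical values chosen for illustration.

```python
from statistics import mean, stdev


def rr_intervals(r_peak_times):
    """Return successive R-R intervals (seconds) from sorted R-peak times."""
    return [later - earlier for earlier, later in zip(r_peak_times, r_peak_times[1:])]


def rr_coefficient_of_variation(r_peak_times):
    """Sample standard deviation of the RR intervals divided by their mean."""
    rr = rr_intervals(r_peak_times)
    if len(rr) < 2:
        raise ValueError("need at least three R peaks")
    return stdev(rr) / mean(rr)


# Hypothetical R-peak times in seconds, made up for illustration only.
regular_peaks = [0.0, 0.8, 1.6, 2.4, 3.2, 4.0]
irregular_peaks = [0.0, 0.6, 1.7, 2.1, 3.3, 3.8]

print(rr_coefficient_of_variation(regular_peaks))
print(rr_coefficient_of_variation(irregular_peaks))
```

A steady rhythm gives a value near zero, while an irregular rhythm gives a clearly larger one; a hand-crafted feature like this is the kind of interperiod information a learned representation would need to retain.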
- [2] arXiv:2410.18110 [pdf, html, other]
Title: Learning Image Derived PDE-Phenotypes from fMRI Data
Comments: The study demonstrates a novel approach to extracting meaningful PDE-features from fMRI data for neurological disorder analysis to understand the role of oxygen transport (delivery & consumption) in the brain during neural activity relevant for studying intracranial pathologies
Subjects: Neurons and Cognition (q-bio.NC)
Partial Differential Equations (PDEs) model various physical phenomena, such as electromagnetic fields and fluid mechanics. Methods like Sparse Identification of Nonlinear Dynamics (SINDy) and PDE-Net 2.0 have been developed to identify and model PDEs based on data using sparse optimization and deep neural networks, respectively. While PDE models are less commonly applied to fMRI data, they hold the potential for uncovering hidden connections and essential components in brain activity. Using the ADHD200 dataset, we applied Canonical Independent Component Analysis (CanICA) and Uniform Manifold Approximation and Projection (UMAP) for dimensionality reduction of fMRI data. We then used Sparse Ridge Regression to identify PDEs from the reduced data, achieving high accuracy in classifying attention deficit hyperactivity disorder (ADHD). The study demonstrates a novel approach to extracting meaningful features from fMRI data for neurological disorder analysis to understand the role of oxygen transport (delivery & consumption) in the brain during neural activity relevant for studying intracranial pathologies.
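To make the idea of sparse regression for equation discovery concrete, here is a minimal sketch of sequentially thresholded ridge regression on a toy problem. It assumes that "Sparse Ridge Regression" refers to this general family of methods, which the abstract does not spell out; the toy ODE du/dt = -2u, the three-term candidate library, and all function names are illustrative assumptions unrelated to the fMRI pipeline.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ridge(theta, y, lam):
    """Ridge solution of (theta^T theta + lam I) w = theta^T y."""
    k = len(theta[0])
    gram = [[sum(row[i] * row[j] for row in theta) + (lam if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * yi for row, yi in zip(theta, y)) for i in range(k)]
    return solve(gram, rhs)


def sparse_ridge(theta, y, lam=1e-6, tol=0.1, iters=10):
    """Refit ridge on the surviving columns, dropping coefficients below tol."""
    k = len(theta[0])
    active = list(range(k))
    coef = [0.0] * k
    for _ in range(iters):
        if not active:
            break
        sub = [[row[j] for j in active] for row in theta]
        weights = ridge(sub, y, lam)
        coef = [0.0] * k
        for j, wj in zip(active, weights):
            coef[j] = wj
        survivors = [j for j in active if abs(coef[j]) >= tol]
        if survivors == active:
            break
        active = survivors
    return [c if abs(c) >= tol else 0.0 for c in coef]


# Toy data: u(t) = exp(-2 t), so the true dynamics are du/dt = -2 u.
dt = 0.01
t = [i * dt for i in range(200)]
u = [math.exp(-2.0 * ti) for ti in t]
# Central finite differences approximate du/dt at the interior points.
du_dt = [(u[i + 1] - u[i - 1]) / (2 * dt) for i in range(1, len(u) - 1)]
# Candidate library of terms: [1, u, u^2].
library = [[1.0, ui, ui * ui] for ui in u[1:-1]]

coef = sparse_ridge(library, du_dt)
print(coef)
```

The thresholding step is what makes the result interpretable: only the library terms that survive are read off as terms of the identified equation. The threshold `tol` and penalty `lam` are tuning choices, not values from the paper.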
- [3] arXiv:2410.18143 [pdf, html, other]
Title: Optimizing information transmission in neural induction constrains cell surface contacts of ascidian embryos
Subjects: Tissues and Organs (q-bio.TO)
The onset of neural induction in the anterior ectoderm of ascidian embryos is regulated at the extracellular level by FGF signaling molecules, which control the acquisition of neural fate through the activation of the ERK pathway. Among the anterior ectoderm cells exposed to FGF, only a fraction will acquire neural fate. The selection of neural precursors depends on the quasi-invariant geometry of the embryo, which imposes upon each ectoderm cell a precise area of cell surface contact with underlying FGF-expressing (mesendoderm) cells. Here, we investigate information transmission between FGF and activated ERK and how this depends on the geometry of the system. Optimizing information transmission with the constraint that the total FGF-emitting surface area is restricted, as in the embryo, we find that the surface contacts with FGF that maximize information transmission are close to those observed experimentally. This information optimal solution is compatible with the anterior ectoderm cells having different areas of cell surface exposure to FGF, allowing the embryo to use cell surface areas as a regulatory mechanism for differentiating the outcome of cells that sense a constant FGF concentration.
- [4] arXiv:2410.18222 [pdf, html, other]
Title: KMS states of Information Flow in Directed Brain Synaptic Networks
Comments: 6 pages, 4 figures
Subjects: Neurons and Cognition (q-bio.NC); Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech); Quantum Physics (quant-ph)
The brain's synaptic network, characterized by parallel connections and feedback loops, drives information flow between neurons through a large system with infinitely many degrees of freedom. This system is best modeled by the graph $C^*$-algebra of the underlying directed graph, the Toeplitz-Cuntz-Krieger algebra, which captures the diversity of potential information pathways. Coupled with the gauge action, this graph algebra defines an algebraic quantum system, and here we demonstrate that its thermodynamic properties provide a natural framework for describing the dynamic mappings of information flow within the network. Specifically, we show that the KMS states of this system yield global statistical measures of neuronal interactions, with computational illustrations based on the C. elegans synaptic network.
- [5] arXiv:2410.18394 [pdf, html, other]
Title: Simultaneously Infer Cell Pseudotime, Velocity Field and Gene Interaction from Multi-Branch scRNA-seq Data with scPN
Subjects: Molecular Networks (q-bio.MN); Genomics (q-bio.GN); Quantitative Methods (q-bio.QM)
Modeling cellular dynamics from single-cell RNA sequencing (scRNA-seq) data is critical for understanding cell development and underlying gene regulatory relationships. Many current methods rely on single-cell velocity to obtain pseudotime, which can lead to inconsistencies between pseudotime and velocity. It is challenging to simultaneously infer cell pseudotime and gene interaction networks, especially in multi-branch differentiation scenarios. We present single-cell Piecewise Network (scPN), a novel high-dimensional dynamical modeling approach that iteratively extracts temporal patterns and inter-gene relationships from scRNA-seq data. To tackle multi-branch differentiation challenges, scPN models gene regulatory dynamics using piecewise gene-gene interaction networks, offering an interpretable framework for deciphering complex gene regulation patterns over time. Results on synthetic data and multiple scRNA-seq datasets demonstrate the superior performance of scPN in reconstructing cellular dynamics and identifying key transcription factors involved in development compared to existing methods. To the best of our knowledge, scPN is the first attempt at modeling that can recover pseudotime, velocity fields, and gene interactions all at once on multi-branch datasets.
- [6] arXiv:2410.18403 [pdf, html, other]
Title: Structure Language Models for Protein Conformation Generation
Comments: Preprint. Under Review
Subjects: Biomolecules (q-bio.BM); Machine Learning (cs.LG)
Proteins adopt multiple structural conformations to perform their diverse biological functions, and understanding these conformations is crucial for advancing drug discovery. Traditional physics-based simulation methods often struggle with sampling equilibrium conformations and are computationally expensive. Recently, deep generative models have shown promise in generating protein conformations as a more efficient alternative. However, these methods predominantly rely on the diffusion process within a 3D geometric space, which typically centers around the vicinity of metastable states and is often inefficient in terms of runtime. In this paper, we introduce Structure Language Modeling (SLM) as a novel framework for efficient protein conformation generation. Specifically, the protein structures are first encoded into a compact latent space using a discrete variational auto-encoder, followed by conditional language modeling that effectively captures sequence-specific conformation distributions. This enables a more efficient and interpretable exploration of diverse ensemble modes compared to existing methods. Based on this general framework, we instantiate SLM with various popular LM architectures and propose ESMDiff, a novel BERT-like structure language model fine-tuned from ESM3 with masked diffusion. We verify our approach in various scenarios, including the equilibrium dynamics of BPTI, conformational change pairs, and intrinsically disordered proteins. SLM provides a highly efficient solution, offering a 20-100x speedup over existing methods in generating diverse conformations, shedding light on promising avenues for future research.
- [7] arXiv:2410.18621 [pdf, html, other]
Title: Evolutionary Dispersal of Ecological Species via Multi-Agent Deep Reinforcement Learning
Subjects: Populations and Evolution (q-bio.PE); Machine Learning (cs.LG); Dynamical Systems (math.DS)
Understanding species dynamics in heterogeneous environments is essential for ecosystem studies. Traditional models assumed homogeneous habitats, but recent approaches include spatial and temporal variability, highlighting species migration. We adopt starvation-driven diffusion (SDD) models as nonlinear diffusion to describe species dispersal based on local resource conditions, showing advantages for species survival. However, accurate prediction remains challenging due to model simplifications. This study uses multi-agent reinforcement learning (MARL) with deep Q-networks (DQN) to simulate single species and predator-prey interactions, incorporating SDD-type rewards. Our simulations reveal evolutionary dispersal strategies, providing insights into species dispersal mechanisms and validating traditional mathematical models.
- [8] arXiv:2410.18710 [pdf, other]
Title: Uncovering the Genetic Basis of Glioblastoma Heterogeneity through Multimodal Analysis of Whole Slide Images and RNA Sequencing Data
Authors: Ahmad Berjaoui, Louis Roussel, Eduardo Hugo Sanchez, Elizabeth Cohen-Jonathan Moyal (CRCT, IUCT Oncopole)
Subjects: Quantitative Methods (q-bio.QM); Artificial Intelligence (cs.AI)
Glioblastoma is a highly aggressive form of brain cancer characterized by rapid progression and poor prognosis. Despite advances in treatment, the underlying genetic mechanisms driving this aggressiveness remain poorly understood. In this study, we employed multimodal deep learning approaches to investigate glioblastoma heterogeneity using joint image/RNA-seq analysis. Our results reveal novel genes associated with glioblastoma. By leveraging a combination of whole-slide images and RNA-seq, as well as introducing novel methods to encode RNA-seq data, we identified specific genetic profiles that may explain different patterns of glioblastoma progression. These findings provide new insights into the genetic mechanisms underlying glioblastoma heterogeneity and highlight potential targets for therapeutic intervention.
- [9] arXiv:2410.18732 [pdf, other]
Title: Natural selection at multiple scales
Subjects: Populations and Evolution (q-bio.PE)
Natural selection acts on traits at different scales, often with opposing consequences. This article identifies the particular forces that act at each scale and how those forces combine to determine the overall evolutionary outcome. A series of extended models derive from the tragedy of the commons, illustrating opposing forces at different scales. Examples include the primary tension between conflict and cooperation and the evolution of virulence, sex ratios, dispersal, and evolvability. The unified analysis subsumes interactions within and between species by generalizing multitrait interactions. Expanded notions of recombination and cotransmission arise. The core theoretical approach isolates the fundamental forces of selection, including marginal valuation, correlation between interacting entities, and reproductive value. Those fundamental forces act as partial causes that combine at different temporal and spatial scales. Modeling focuses on statics, in the sense of how different forces at various scales tend to oppose each other, ultimately combining to shape traits. That type of static analysis emphasizes explanation rather than the calculation of dynamics. The article builds on the duality between explanation versus calculation in terms of statics versus dynamics. The literature often poses that duality as a controversy, whereas this article develops the pair as complementary tools that together provide deeper understanding. Along the way, the unified approach clarifies the subtle distinctions between kin selection, multilevel selection, and inclusive fitness, subsuming these topics into the broader perspectives of fundamental forces and multiple scales.
- [10] arXiv:2410.18849 [pdf, html, other]
Title: Bioenergetic trophic trade-offs determine mass-dependent extinction thresholds across the Cenozoic
Authors: Justin D. Yeakel, Matthew C. Hutchinson, Christopher P. Kempes, Paul L. Koch, Jacquelyn L. Gill, Mathias M. Pires
Comments: 14 pages, 3 figures, SI Appendices
Subjects: Populations and Evolution (q-bio.PE)
Body size drives the energetic demands of organisms, constraining trophic interactions between species and playing a significant role in shaping the feasibility of species' populations in a community. On macroevolutionary timescales, these demands feed back to shape the selective landscape driving the evolution of body size and diet. We develop a theoretical framework for a three-level trophic food chain -- typical for terrestrial mammalian ecosystems -- premised on bioenergetic trade-offs to explore mammalian population dynamics. Our results show that interactions between predators, prey, and external subsidies generate instabilities linked to body size extrema, corresponding to observed limits of predator size and diet. These instabilities generate size-dependent constraints on coexistence and highlight a feasibility range for carnivore size between 40 and 110 kg, encompassing the mean body size of terrestrial Cenozoic hypercarnivores. Finally, we show that predator dietary generalization confers a selective advantage to larger carnivores, which then declines at megapredator body sizes, aligning with diet breadth estimates for contemporary and Pleistocene species. Our framework underscores the importance of understanding macroevolutionary constraints through the lens of ecological pressures, where the selective forces shaping and reshaping the dynamics of communities can be explored.
- [11] arXiv:2410.18864 [pdf, html, other]
Title: Omics-driven hybrid dynamic modeling of bioprocesses with uncertainty estimation
Subjects: Quantitative Methods (q-bio.QM); Machine Learning (cs.LG)
This work presents an omics-driven modeling pipeline that integrates machine-learning tools to facilitate the dynamic modeling of multiscale biological systems. Random forests and permutation feature importance are proposed to mine omics datasets, guiding feature selection and dimensionality reduction for dynamic modeling. Continuous and differentiable machine-learning functions can be trained to link the reduced omics feature set to key components of the dynamic model, resulting in a hybrid model. As proof of concept, we apply this framework to a high-dimensional proteomics dataset of Saccharomyces cerevisiae. After identifying key intracellular proteins that correlate with cell growth, targeted dynamic experiments are designed, and key model parameters are captured as functions of the selected proteins using Gaussian processes. This approach captures the dynamic behavior of yeast strains under varying proteome profiles while estimating the uncertainty in the hybrid model's predictions. The outlined modeling framework is adaptable to other scenarios, such as integrating additional layers of omics data for more advanced multiscale biological systems, or employing alternative machine-learning methods to handle larger datasets. Overall, this study outlines a strategy for leveraging omics data to inform multiscale dynamic modeling in systems biology and bioprocess engineering.
- [12] arXiv:2410.18933 [pdf, html, other]
Title: Confidence is detection-like in high-dimensional spaces
Subjects: Neurons and Cognition (q-bio.NC)
Confidence estimates are often "detection-like" - driven by positive evidence in favour of a decision. This empirical observation has been interpreted as showing human metacognition is limited by biases or heuristics. Here we show that Bayesian confidence estimates also exhibit heightened sensitivity to decision-congruent evidence in higher-dimensional signal detection theoretic spaces, leading to detection-like confidence criteria. This effect is due to a nonlinearity induced by normalisation of confidence by a large number of unchosen alternatives. Our analysis suggests that detection-like confidence is rational when computing confidence in a higher-dimensional evidence space than that assumed by the experimenter.
New submissions (showing 12 of 12 entries)
- [13] arXiv:2410.18118 (cross-list from physics.chem-ph) [pdf, other]
Title: OWPCP: A Deep Learning Model to Predict Octanol-Water Partition Coefficient
Comments: Preprint
Subjects: Chemical Physics (physics.chem-ph); Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
The physicochemical properties of chemical compounds are important in several areas, including pharmaceuticals and environmental and separation science. Among these properties, the octanol-water partition coefficient (logP) is an important index of lipophilicity and hydrophilicity; it affects drug absorption and membrane permeability. Following Lipinski's rule of five, logP was identified as one of the key determinants of the stability of chemical entities, motivating state-of-the-art methods for measuring lipophilicity. This paper presents a deep-learning model, OWPCP, developed to compute logP using Morgan fingerprints and MACCS keys as input features. It learns the relationship between these molecular representations and logP values from 26,254 compounds. The dataset was prepared to contain a wide range of chemical structures with differing molecular weights and polar surface areas. Hyperparameter optimization was conducted using Keras Tuner with the Hyperband algorithm. Compared to current computational methods, OWPCP achieves an MAE of 0.247 on the test set, outperforming all previous DL models. Notably, while one of the most accurate recent models relies on experimental retention-time data to make predictions, OWPCP computes logP without any experimental data and still outperforms that model, which makes it useful during early-stage drug discovery. Further validation across different functional groups showed very high accuracy, especially for compounds that contain aliphatic OH groups. The results indicate that OWPCP provides a reliable prediction of logP.
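For readers unfamiliar with how logP enters Lipinski's rule of five, the following sketch counts rule violations from four descriptors, using the commonly stated thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). It is unrelated to the OWPCP model itself; the class name, function name, and descriptor values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    molecular_weight: float  # daltons
    log_p: float             # octanol-water partition coefficient (log10)
    h_bond_donors: int
    h_bond_acceptors: int


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    checks = [
        d.molecular_weight > 500.0,
        d.log_p > 5.0,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ]
    return sum(checks)


# Hypothetical descriptor values, made up for illustration only.
small_polar = Descriptors(molecular_weight=180.0, log_p=1.2,
                          h_bond_donors=1, h_bond_acceptors=4)
large_greasy = Descriptors(molecular_weight=650.0, log_p=6.3,
                           h_bond_donors=2, h_bond_acceptors=8)

print(rule_of_five_violations(small_polar))
print(rule_of_five_violations(large_greasy))
```

In a screening workflow, the `log_p` field is where a predicted value from a model such as the one described above could be substituted for a measured one.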
Cross submissions (showing 1 of 1 entries)
- [14] arXiv:2211.15963 (replaced) [pdf, other]
Title: Simulation and assimilation of the digital human brain
Authors: Wenlian Lu, Xin Du, Jiexiang Wang, Longbin Zeng, Leijun Ye, Shitong Xiang, Qibao Zheng, Jie Zhang, Ningsheng Xu, Jianfeng Feng (on behalf of the DTB Consortium)
Comments: 74 pages, 13 figures, 9 tables
Subjects: Neurons and Cognition (q-bio.NC)
Here, we present the Digital Brain (DB), a platform for simulating spiking neuronal networks at the large neuron scale of the human brain based on personalized magnetic-resonance-imaging data and biological constraints. The DB aims to reproduce both the resting state and certain aspects of the action of the human brain. An architecture with up to 86 billion neurons and 14,012 GPUs, including a two-level routing scheme between GPUs to accelerate spike transmission up to 47.8 trillion neuronal synapses, was implemented as part of the simulations. We show that the DB can reproduce blood-oxygen-level-dependent signals of the resting-state of the human brain with a high correlation coefficient, as well as interact with its perceptual input, as demonstrated in a visual task. These results indicate the feasibility of implementing a digital representation of the human brain, which can open the door to a broad range of potential applications.
- [15] arXiv:2302.09445 (replaced) [pdf, html, other]
Title: Inference of weak-form partial differential equations describing migration and proliferation mechanisms in wound healing experiments on cancer cells
Authors: Patrick C. Kinnunen, Siddhartha Srivastava, Zhenlin Wang, Kenneth K.Y. Ho, Brock A. Humphries, Siyi Chen, Jennifer J. Linderman, Gary D. Luker, Kathryn E. Luker, Krishna Garikipati
Subjects: Cell Behavior (q-bio.CB)
Targeting signaling pathways that drive cancer cell migration or proliferation is a common therapeutic approach. A popular experimental technique, the scratch assay, measures the migration and proliferation-driven cell closure of a defect in a confluent cell monolayer. These assays do not measure dynamic effects. To improve analysis of scratch assays, we combine high-throughput scratch assays, video microscopy, and system identification to infer partial differential equation (PDE) models of cell migration and proliferation. We capture the evolution of cell density fields over time using live cell microscopy and automated image processing. We employ weak form-based system identification techniques for cell density dynamics modeled with first-order kinetics of advection-diffusion-reaction systems. We present a comparison of our methods to results obtained using traditional inference approaches on previously analyzed 1-dimensional scratch assay data. We demonstrate the application of this pipeline on high throughput 2-dimensional scratch assays and find that low levels of trametinib inhibit wound closure primarily by decreasing random cell migration by approximately 20%. Our integrated experimental and computational pipeline can be adapted for quantitatively inferring the effect of biological perturbations on cell migration and proliferation in various cell lines.
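As background for the weak-form identification described above, a generic advection-diffusion-reaction model for a cell density field and its weak form can be written as follows. This is a textbook form given under stated assumptions, not the specific model inferred in the paper: the symbols $c$ (cell density), $D$ (diffusivity), $\mathbf{v}$ (advection velocity), $k$ (first-order rate), and $w$ (test function) are chosen here for illustration, and the total flux is assumed to vanish on the boundary of the domain $\Omega$.

```latex
\begin{align}
\frac{\partial c}{\partial t}
  &= \nabla \cdot (D \nabla c) - \nabla \cdot (\mathbf{v}\, c) + k\, c, \\
\int_\Omega w \, \frac{\partial c}{\partial t} \, d\Omega
  &= -\int_\Omega \nabla w \cdot (D \nabla c) \, d\Omega
     + \int_\Omega \nabla w \cdot (\mathbf{v}\, c) \, d\Omega
     + \int_\Omega w \, k\, c \, d\Omega .
\end{align}
```

The second line follows from multiplying the first by $w$ and integrating by parts. Because one derivative moves from $c$ onto the test function, only first derivatives of the measured density are needed, which is a common motivation for weak-form approaches when working with noisy image data.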
- [16] arXiv:2410.00532 (replaced) [pdf, html, other]
Title: smICA: an open source repository for mapping the concentration of fluorescently labeled molecules in living cells on the basis of confocal imaging combined with time-correlated single-photon counting
Authors: Tomasz Kalwarczyk, Grzegorz Bubak, Jarosław Michalski, Karina Kwapiszewska, Marta Pilz, Adam Mamot, Jacek Jemielity, Robert Hołyst
Comments: 14 pages, 6 figures, 18 references
Subjects: Quantitative Methods (q-bio.QM)
Advanced microscopy techniques are essential for visualizing and tracking cellular components and molecules in biomedical research. However, conventional fluorescence microscopy methods often struggle with accurately measuring molecule concentrations in cells. To overcome these limitations, we introduce a novel approach that integrates laser scanning confocal microscopy with time-correlated single photon counting (TCSPC), supported by an open-source analysis tool called smICA (single-molecule Image to Concentration Analyzer). Our method, validated against traditional fluorescence correlation spectroscopy (FCS), offers enhanced accuracy in determining fluorescent molecule concentrations, particularly in cases where molecules are immobile or unevenly distributed. This is demonstrated using fluorescently labeled mRNA in living cells, highlighting the approach's effectiveness.
- [17] arXiv:2209.13371 (replaced) [pdf, other]
Title: Considerations and recommendations from the ISMRM Diffusion Study Group for preclinical diffusion MRI: Part 2 -- Ex vivo imaging: added value and acquisition
Authors: Kurt G Schilling, Francesco Grussu, Andrada Ianus, Brian Hansen, Amy FD Howard, Rachel L C Barrett, Manisha Aggarwal, Stijn Michielse, Fatima Nasrallah, Warda Syeda, Nian Wang, Jelle Veraart, Alard Roebroeck, Andrew F Bagdasarian, Cornelius Eichner, Farshid Sepehrband, Jan Zimmermann, Lucas Soustelle, Christien Bowman, Benjamin C Tendler, Andreea Hertanu, Ben Jeurissen, Lucio Frydman, Yohan van de Looij, David Hike, Jeff F Dunn, Karla Miller, Bennett A Landman, Noam Shemesh, Adam Anderson, Emilie McKinnon, Shawna Farquharson, Flavio Dell'Acqua, Carlo Pierpaoli, Ivana Drobnjak, Alexander Leemans, Kevin D Harkins, Maxime Descoteaux, Duan Xu, Hao Huang, Mathieu D Santin, Samuel C. Grant, Andre Obenaus, Gene S Kim, Dan Wu, Denis Le Bihan, Stephen J Blackband, Luisa Ciobanu, Els Fieremans, Ruiliang Bai, Trygve Leergaard, Jiangyang Zhang, Tim B Dyrby, G Allan Johnson, Julien Cohen-Adad, Matthew D Budde, Ileana O Jelescu
Comments: Part 2 of 3 in "Considerations and recommendations for preclinical diffusion MRI"
Subjects: Medical Physics (physics.med-ph); Tissues and Organs (q-bio.TO)
The value of preclinical diffusion MRI (dMRI) is substantial. While dMRI enables in vivo non-invasive characterization of tissue, ex vivo dMRI is increasingly used to probe tissue microstructure and brain connectivity. Ex vivo dMRI has several experimental advantages including higher signal-to-noise ratio and spatial resolution compared to in vivo studies, and enabling more advanced diffusion contrasts. Another major advantage of ex vivo dMRI is the direct comparison with histological data as a methodological validation. However, there are a number of considerations that must be made when performing ex vivo experiments. The steps from tissue preparation, image acquisition and processing, and interpretation of results are complex, with decisions that not only differ dramatically from in vivo imaging of small animals, but ultimately affect what questions can be answered using the data. This work represents "Part 2" of a 3-part series of recommendations and considerations for preclinical dMRI. We describe best practices for dMRI of ex vivo tissue, with a focus on the value that ex vivo imaging adds to the field of dMRI and considerations in ex vivo image acquisition. We give general considerations and foundational knowledge that must be considered when designing experiments. We describe differences in specimens and models and discuss why some may be more or less appropriate for different studies. We then give guidelines for ex vivo protocols, including tissue fixation, sample preparation, and MR scanning. In each section, we attempt to provide guidelines and recommendations, but also highlight areas for which no guidelines exist (and why), and where future work should lie. An overarching goal herein is to enhance the rigor and reproducibility of ex vivo dMRI acquisitions and analyses, and thereby advance biomedical knowledge.