-
Single-Layer Learnable Activation for Implicit Neural Representation (SL$^{2}$A-INR)
Authors:
Moein Heidari,
Reza Rezaeian,
Reza Azad,
Dorit Merhof,
Hamid Soltanian-Zadeh,
Ilker Hacihaliloglu
Abstract:
Implicit Neural Representation (INR), leveraging a neural network to transform coordinate input into corresponding attributes, has recently driven significant advances in several vision-related domains. However, the performance of INR is heavily influenced by the choice of the nonlinear activation function used in its multilayer perceptron (MLP) architecture. Multiple nonlinearities have been investigated; yet, current INRs face limitations in capturing high-frequency components, representing diverse signal types, and handling inverse problems. We have identified that these problems can be greatly alleviated by introducing a paradigm shift in INRs. We find that an architecture with learnable activations in initial layers can represent fine details in the underlying signals. Specifically, we propose SL$^{2}$A-INR, a hybrid network for INR with a single-layer learnable activation function, promoting the effectiveness of traditional ReLU-based MLPs. Our method achieves superior performance across diverse tasks, including image representation, 3D shape reconstruction, inpainting, single image super-resolution, CT reconstruction, and novel view synthesis. Through comprehensive experiments, SL$^{2}$A-INR sets new benchmarks in accuracy, quality, and convergence rates for INR.
Submitted 18 September, 2024; v1 submitted 16 September, 2024;
originally announced September 2024.
-
Implicit Neural Representations with Fourier Kolmogorov-Arnold Networks
Authors:
Ali Mehrabian,
Parsa Mojarad Adi,
Moein Heidari,
Ilker Hacihaliloglu
Abstract:
Implicit neural representations (INRs) use neural networks to provide continuous and resolution-independent representations of complex signals with a small number of parameters. However, existing INR models often fail to capture important frequency components specific to each task. To address this issue, in this paper, we propose a Fourier Kolmogorov-Arnold network (FKAN) for INRs. The proposed FKAN utilizes learnable activation functions modeled as Fourier series in the first layer to effectively control and learn the task-specific frequency components. In addition, the activation functions with learnable Fourier coefficients improve the ability of the network to capture complex patterns and details, which is beneficial for high-resolution and high-dimensional data. Experimental results show that our proposed FKAN model outperforms three state-of-the-art baseline schemes, improving the peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) for the image representation task and the intersection over union (IoU) for the 3D occupancy volume representation task.
Submitted 20 September, 2024; v1 submitted 14 September, 2024;
originally announced September 2024.
-
New Bounds on Quantum Sample Complexity of Measurement Classes
Authors:
Mohsen Heidari,
Wojciech Szpankowski
Abstract:
This paper studies quantum supervised learning for classical inference from quantum states. In this model, a learner has access to a set of labeled quantum samples as the training set. The objective is to find a quantum measurement that predicts the label of the unseen samples. The hardness of learning is measured via sample complexity under a quantum counterpart of the well-known probably approximately correct (PAC) framework. Quantum sample complexity is expected to be higher than the classical one because of measurement incompatibility and state collapse. Recent efforts showed that the sample complexity of learning a finite quantum concept class $\mathcal{C}$ scales as $O(|\mathcal{C}|)$. This is significantly higher than the classical sample complexity, which grows logarithmically with the class size. This work improves the sample complexity bound to $O(V_{\mathcal{C}^*} \log |\mathcal{C}^*|)$, where $\mathcal{C}^*$ is the set of extreme points of the convex closure of $\mathcal{C}$ and $V_{\mathcal{C}^*}$ is the shadow-norm of this set. We show the tightness of our bound for the class of bounded Hilbert-Schmidt norm, scaling as $O(\log |\mathcal{C}^*|)$. Our approach is based on a new quantum empirical risk minimization (ERM) algorithm equipped with a shadow tomography method.
Submitted 22 August, 2024;
originally announced August 2024.
-
MSA$^2$Net: Multi-scale Adaptive Attention-guided Network for Medical Image Segmentation
Authors:
Sina Ghorbani Kolahi,
Seyed Kamal Chaharsooghi,
Toktam Khatibi,
Afshin Bozorgpour,
Reza Azad,
Moein Heidari,
Ilker Hacihaliloglu,
Dorit Merhof
Abstract:
Medical image segmentation involves identifying and separating object instances in a medical image to delineate various tissues and structures, a task complicated by the significant variations in size, shape, and density of these features. Convolutional neural networks (CNNs) have traditionally been used for this task but have limitations in capturing long-range dependencies. Transformers, equipped with self-attention mechanisms, aim to address this problem. However, in medical image segmentation it is beneficial to merge both local and global features to effectively integrate feature maps across various scales, capturing both detailed features and broader semantic elements for dealing with variations in structures. In this paper, we introduce MSA$^2$Net, a new deep segmentation framework featuring an expedient design of skip-connections. These connections facilitate feature fusion by dynamically weighting and combining coarse-grained encoder features with fine-grained decoder feature maps. Specifically, we propose a Multi-Scale Adaptive Spatial Attention Gate (MASAG), which dynamically adjusts the receptive field (local and global contextual information) to ensure that spatially relevant features are selectively highlighted while minimizing background distractions. Extensive evaluations on dermatological and radiological datasets demonstrate that our MSA$^2$Net outperforms state-of-the-art (SOTA) works or matches their performance. The source code is publicly available at https://github.com/xmindflow/MSA-2Net.
Submitted 3 August, 2024; v1 submitted 31 July, 2024;
originally announced July 2024.
-
Alternative views on fuzzy numbers and their application to fuzzy differential equations
Authors:
Akbar H. Borzabadi,
Mohammad Heidari,
Delfim F. M. Torres
Abstract:
We consider fuzzy-valued functions from two parametric representations of $α$-level sets. New concepts are introduced and compared with available notions. Following the two proposed approaches, we study fuzzy differential equations. Their relation to Zadeh's extension principle and the generalized Hukuhara derivative is discussed. Moreover, we prove existence and uniqueness theorems for fuzzy differential equations. Illustrative examples are given.
Submitted 29 June, 2024;
originally announced July 2024.
-
Computation-Efficient Era: A Comprehensive Survey of State Space Models in Medical Image Analysis
Authors:
Moein Heidari,
Sina Ghorbani Kolahi,
Sanaz Karimijafarbigloo,
Bobby Azad,
Afshin Bozorgpour,
Soheila Hatami,
Reza Azad,
Ali Diba,
Ulas Bagci,
Dorit Merhof,
Ilker Hacihaliloglu
Abstract:
Sequence modeling plays a vital role across various domains, with recurrent neural networks being historically the predominant method of performing these tasks. However, the emergence of transformers has altered this paradigm due to their superior performance. Built upon these advances, transformers have joined CNNs as the two leading foundational models for learning visual representations. However, transformers are hindered by the $\mathcal{O}(N^2)$ complexity of their attention mechanisms, while CNNs lack global receptive fields and dynamic weight allocation. State Space Models (SSMs), specifically the Mamba model with selection mechanisms and hardware-aware architecture, have garnered immense interest lately in sequential modeling and visual representation learning, challenging the dominance of transformers by providing infinite context lengths and offering substantial efficiency by maintaining linear complexity in the input sequence. Capitalizing on the advances in computer vision, medical imaging has heralded a new epoch with Mamba models. Intending to help researchers navigate the surge, this survey seeks to offer an encyclopedic review of Mamba models in medical imaging. Specifically, we start with a comprehensive theoretical review forming the basis of SSMs, including the Mamba architecture and its alternatives for sequence modeling paradigms in this context. Next, we offer a structured classification of Mamba models in the medical field and introduce a diverse categorization scheme based on their application, imaging modalities, and targeted organs. Finally, we summarize key challenges, discuss different future research directions of SSMs in the medical domain, and propose several directions to fulfill the demands of this field. In addition, we have compiled the studies discussed in this paper along with their open-source implementations on our GitHub repository.
Submitted 5 June, 2024;
originally announced June 2024.
-
Reinforcement Learning-Guided Semi-Supervised Learning
Authors:
Marzi Heidari,
Hanping Zhang,
Yuhong Guo
Abstract:
In recent years, semi-supervised learning (SSL) has gained significant attention due to its ability to leverage both labeled and unlabeled data to improve model performance, especially when labeled data is scarce. However, most current SSL methods rely on heuristics or predefined rules for generating pseudo-labels and leveraging unlabeled data. They are limited to exploiting loss functions and regularization methods within the standard norm. In this paper, we propose a novel Reinforcement Learning (RL) Guided SSL method, RLGSSL, that formulates SSL as a one-armed bandit problem and deploys an innovative RL loss based on weighted reward to adaptively guide the learning process of the prediction model. RLGSSL incorporates a carefully designed reward function that balances the use of labeled and unlabeled data to enhance generalization performance. A semi-supervised teacher-student framework is further deployed to increase the learning stability. We demonstrate the effectiveness of RLGSSL through extensive experiments on several benchmark datasets and show that our approach achieves consistently superior performance compared to state-of-the-art SSL methods.
Submitted 2 May, 2024;
originally announced May 2024.
-
ECOR: Explainable CLIP for Object Recognition
Authors:
Ali Rasekh,
Sepehr Kazemi Ranjbar,
Milad Heidari,
Wolfgang Nejdl
Abstract:
Large Vision Language Models (VLMs), such as CLIP, have significantly contributed to various computer vision tasks, including object recognition and object detection. Their open vocabulary feature enhances their value. However, their black-box nature and lack of explainability in predictions make them less trustworthy in critical domains. Recently, some work has been done to force VLMs to provide reasonable rationales for object recognition, but this often comes at the expense of classification accuracy. In this paper, we first propose a mathematical definition of explainability in the object recognition task based on the joint probability distribution of categories and rationales, then leverage this definition to fine-tune CLIP in an explainable manner. In evaluations on different datasets, our method demonstrates state-of-the-art performance in explainable classification. Notably, it excels in zero-shot settings, showcasing its adaptability. This advancement improves explainable object recognition, enhancing trust across diverse applications. The code will be made available online upon publication.
Submitted 19 April, 2024;
originally announced April 2024.
-
Prompt-Driven Feature Diffusion for Open-World Semi-Supervised Learning
Authors:
Marzi Heidari,
Hanping Zhang,
Yuhong Guo
Abstract:
In this paper, we present a novel approach termed Prompt-Driven Feature Diffusion (PDFD) within a semi-supervised learning framework for Open World Semi-Supervised Learning (OW-SSL). At its core, PDFD deploys an efficient feature-level diffusion model with the guidance of class-specific prompts to support discriminative feature representation learning and feature generation, tackling the challenge of the unavailability of labeled data for unseen classes in OW-SSL. In particular, PDFD utilizes class prototypes as prompts in the diffusion model, leveraging their class-discriminative and semantic generalization ability to condition and guide the diffusion process across all the seen and unseen classes. Furthermore, PDFD incorporates a class-conditional adversarial loss for diffusion model training, ensuring that the features generated via the diffusion process can be discriminatively aligned with the class-conditional features of the real data. Additionally, the class prototypes of the unseen classes are computed using only unlabeled instances with confident predictions within a semi-supervised learning framework. We conduct extensive experiments to evaluate the proposed PDFD. The empirical results show that PDFD exhibits remarkable performance enhancements over many existing state-of-the-art methods.
Submitted 17 April, 2024;
originally announced April 2024.
-
Efficient Gradient Estimation of Variational Quantum Circuits with Lie Algebraic Symmetries
Authors:
Mohsen Heidari,
Masih Mozakka,
Wojciech Szpankowski
Abstract:
Hybrid quantum-classical optimization and learning strategies are among the most promising approaches to harnessing quantum information or gaining a quantum advantage over classical methods. However, efficient estimation of the gradient of the objective function in such models remains a challenge due to several factors, including the exponential dimensionality of the Hilbert spaces and the information loss of quantum measurements. In this work, we develop a framework that makes the Hadamard test efficiently applicable to gradient estimation for a broad range of quantum systems, an advance that had been wanting from the outset. Under certain mild structural assumptions, the gradient is estimated with a number of measurement shots that scales logarithmically with the number of parameters and with polynomial classical and quantum time. This is an exponential reduction in the measurement cost and a polynomial speedup in time compared to existing works. The structural assumptions are (1) the dimension of the dynamical Lie algebra is polynomial in the number of qubits, and (2) the observable has a bounded Hilbert-Schmidt norm.
Submitted 7 October, 2024; v1 submitted 7 April, 2024;
originally announced April 2024.
-
Enhancing Efficiency in Vision Transformer Networks: Design Techniques and Insights
Authors:
Moein Heidari,
Reza Azad,
Sina Ghorbani Kolahi,
René Arimond,
Leon Niggemeier,
Alaa Sulaiman,
Afshin Bozorgpour,
Ehsan Khodapanah Aghdam,
Amirhossein Kazerouni,
Ilker Hacihaliloglu,
Dorit Merhof
Abstract:
Intrigued by the inherent ability of the human visual system to identify salient regions in complex scenes, attention mechanisms have been seamlessly integrated into various Computer Vision (CV) tasks. Building upon this paradigm, Vision Transformer (ViT) networks exploit attention mechanisms for improved efficiency. This review navigates the landscape of redesigned attention mechanisms within ViTs, aiming to enhance their performance. This paper provides a comprehensive exploration of techniques and insights for designing attention mechanisms, systematically reviewing recent literature in the field of CV. This survey begins with an introduction to the theoretical foundations and fundamental concepts underlying attention mechanisms. We then present a systematic taxonomy of various attention mechanisms within ViTs, employing redesigned approaches. A multi-perspective categorization is proposed based on their application, objectives, and the type of attention applied. The analysis includes an exploration of the novelty, strengths, weaknesses, and an in-depth evaluation of the different proposed strategies. This culminates in the development of taxonomies that highlight key properties and contributions. Finally, we gather the reviewed studies along with their available open-source implementations at our GitHub repository (https://github.com/mindflow-institue/Awesome-Attention-Mechanism-in-Medical-Imaging, https://github.com/xmindflow/Awesome-Attention-Mechanism-in-Medical-Imaging). We aim to regularly update it with the most recent relevant papers.
Submitted 28 March, 2024;
originally announced March 2024.
-
Vision-Language Synthetic Data Enhances Echocardiography Downstream Tasks
Authors:
Pooria Ashrafian,
Milad Yazdani,
Moein Heidari,
Dena Shahriari,
Ilker Hacihaliloglu
Abstract:
High-quality, large-scale data is essential for robust deep learning models in medical applications, particularly ultrasound image analysis. Diffusion models facilitate high-fidelity medical image generation, reducing the costs associated with acquiring and annotating new images. This paper utilizes recent vision-language models to produce diverse and realistic synthetic echocardiography image data, preserving key features of the original images guided by textual and semantic label maps. Specifically, we investigate three potential avenues: unconditional generation, generation guided by text, and a hybrid approach incorporating both textual and semantic supervision. We show that the rich contextual information present in the synthesized data potentially enhances the accuracy and interpretability of downstream tasks, such as echocardiography segmentation and classification with improved metrics and faster convergence. Our implementation with checkpoints, prompts, and the created synthetic dataset will be publicly available on GitHub (https://github.com/Pooria90/DiffEcho).
Submitted 28 March, 2024;
originally announced March 2024.
-
Adaptive Weighted Co-Learning for Cross-Domain Few-Shot Learning
Authors:
Abdullah Alchihabi,
Marzi Heidari,
Yuhong Guo
Abstract:
Due to the availability of only a few labeled instances for the novel target prediction task and the significant domain shift between the well-annotated source domain and the target domain, cross-domain few-shot learning (CDFSL) induces a very challenging adaptation problem. In this paper, we propose a simple Adaptive Weighted Co-Learning (AWCoL) method to address the CDFSL challenge by adapting two independently trained source prototypical classification models to the target task in a weighted co-learning manner. The proposed method deploys a weighted moving average prediction strategy to generate probabilistic predictions from each model, and then conducts adaptive co-learning by jointly fine-tuning the two models in an alternating manner based on the pseudo-labels and instance weights produced from the predictions. Moreover, a negative pseudo-labeling regularizer is deployed to improve the fine-tuning process by penalizing false predictions. Comprehensive experiments are conducted on multiple benchmark datasets, and the empirical results demonstrate that the proposed method produces state-of-the-art CDFSL performance.
Submitted 6 December, 2023;
originally announced December 2023.
-
DiffGANPaint: Fast Inpainting Using Denoising Diffusion GANs
Authors:
Moein Heidari,
Alireza Morsali,
Tohid Abedini,
Samin Heydarian
Abstract:
Free-form image inpainting is the task of reconstructing parts of an image specified by an arbitrary binary mask. In this task, it is typically desired to generalize model capabilities to unseen mask types, rather than learning certain mask distributions. Capitalizing on the advances in diffusion models, in this paper, we propose a Denoising Diffusion Probabilistic Model (DDPM) based model capable of filling missing pixels quickly, as it models the backward diffusion process using the generator of a generative adversarial network (GAN) to reduce the sampling cost of diffusion models. Experiments on general-purpose image inpainting datasets verify that our approach performs better than or on par with most contemporary works.
Submitted 3 August, 2023;
originally announced November 2023.
-
Quantum Shadow Gradient Descent for Variational Quantum Algorithms
Authors:
Mohsen Heidari,
Mobasshir A Naved,
Zahra Honjani,
Wenbo Xie,
Arjun Jacob Grama,
Wojciech Szpankowski
Abstract:
Gradient-based optimizers have been proposed for training variational quantum circuits in settings such as quantum neural networks (QNNs). The task of gradient estimation, however, has proven to be challenging, primarily due to distinctive quantum features such as state collapse and measurement incompatibility. Conventional techniques, such as the parameter-shift rule, necessitate several fresh samples in each iteration to estimate the gradient due to the stochastic nature of state measurement. Owing to state collapse from measurement, the inability to reuse samples in subsequent iterations motivates a crucial inquiry into whether fundamentally more efficient approaches to sample utilization exist. In this paper, we affirm the feasibility of such efficiency enhancements through a novel procedure called quantum shadow gradient descent (QSGD), which uses a single sample per iteration to estimate all components of the gradient. Our approach is based on an adaptation of shadow tomography that significantly enhances sample efficiency. Through detailed theoretical analysis, we show that QSGD has a significantly faster convergence rate than existing methods under locality conditions. We present detailed numerical experiments supporting all of our theoretical claims.
Submitted 22 August, 2024; v1 submitted 10 October, 2023;
originally announced October 2023.
-
SA2-Net: Scale-aware Attention Network for Microscopic Image Segmentation
Authors:
Mustansar Fiaz,
Moein Heidari,
Rao Muhammad Anwer,
Hisham Cholakkal
Abstract:
Microscopic image segmentation is a challenging task, wherein the objective is to assign semantic labels to each pixel in a given microscopic image. While convolutional neural networks (CNNs) form the foundation of many existing frameworks, they often struggle to explicitly capture long-range dependencies. Although transformers were initially devised to address this issue using self-attention, it has been proven that both local and global features are crucial for addressing diverse challenges in microscopic images, including variations in shape, size, appearance, and target region density. In this paper, we introduce SA2-Net, an attention-guided method that leverages multi-scale feature learning to effectively handle diverse structures within microscopic images. Specifically, we propose a scale-aware attention (SA2) module designed to capture inherent variations in scales and shapes of microscopic regions, such as cells, for accurate segmentation. This module incorporates local attention at each level of multi-stage features, as well as global attention across multiple resolutions. Furthermore, we address the issue of blurred region boundaries (e.g., cell boundaries) by introducing a novel upsampling strategy called the Adaptive Up-Attention (AuA) module. This module enhances the discriminative ability for improved localization of microscopic regions using an explicit attention mechanism. Extensive experiments on five challenging datasets demonstrate the benefits of our SA2-Net model. Our source code is publicly available at https://github.com/mustansarfiaz/SA2-Net.
Submitted 19 November, 2023; v1 submitted 28 September, 2023;
originally announced September 2023.
-
Adaptive Parametric Prototype Learning for Cross-Domain Few-Shot Classification
Authors:
Marzi Heidari,
Abdullah Alchihabi,
Qing En,
Yuhong Guo
Abstract:
Cross-domain few-shot classification induces a much more challenging problem than its in-domain counterpart due to the existence of domain shifts between the training and test tasks. In this paper, we develop a novel Adaptive Parametric Prototype Learning (APPL) method under the meta-learning convention for cross-domain few-shot classification. Different from existing prototypical few-shot methods that use the averages of support instances to calculate the class prototypes, we propose to learn class prototypes from the concatenated features of the support set in a parametric fashion and meta-learn the model by enforcing prototype-based regularization on the query set. In addition, we fine-tune the model in the target domain in a transductive manner using a weighted-moving-average self-training approach on the query instances. We conduct experiments on multiple cross-domain few-shot benchmark datasets. The empirical results demonstrate that APPL outperforms many state-of-the-art cross-domain few-shot learning methods.
Submitted 3 September, 2023;
originally announced September 2023.
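For context, the baseline that the abstract above contrasts APPL with (class prototypes computed as averages of support instances) can be sketched as follows. This is a minimal illustration of the standard averaging approach, not the APPL method itself; the function names `class_prototypes` and `classify` and the toy feature vectors are hypothetical and do not come from the paper.

```python
def class_prototypes(features, labels):
    """Average the support feature vectors of each class (the prototypical baseline)."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def classify(query, prototypes):
    """Assign a query vector to the class with the nearest prototype (squared Euclidean distance)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(prototypes, key=lambda lab: sq_dist(query, prototypes[lab]))


# Toy support set: two classes with two 2-D feature vectors each (illustrative values only).
support = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
support_labels = ["a", "a", "b", "b"]
protos = class_prototypes(support, support_labels)
prediction = classify([1.0, 1.0], protos)
```

According to the abstract, APPL replaces the averaging step with prototypes learned parametrically from the concatenated support features; that learned component is not shown here.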
-
On The Reliability Function of Discrete Memoryless Multiple-Access Channel with Feedback
Authors:
Mohsen Heidari,
Achilleas Anastasopoulos,
S. Sandeep Pradhan
Abstract:
The reliability function of a channel is the maximum achievable exponential rate of decay of the error probability as a function of the transmission rate. In this work, we derive bounds on the reliability function of discrete memoryless multiple-access channels (MAC) with noiseless feedback. We show that our bounds are tight for a variety of MACs, such as $m$-ary additive and two independent point-to-point channels. The bounds are expressed in terms of a new information measure called "variable-length directed information". The upper bound is proved by analyzing stochastic processes defined based on the entropy of the message, given the past channel's outputs. Our method relies on tools from the theory of martingales, variable-length information measures, and a new technique called time pruning. We further propose a variable-length achievable scheme consisting of three phases: (i) data transmission, (ii) hybrid data-confirmation, and (iii) full confirmation. We show that two-phase-type schemes are strictly suboptimal in achieving the MAC's reliability function. Moreover, we study the shape of the lower bound and show that it increases linearly with respect to a specific Euclidean distance measure defined between the transmission rate pair and the capacity boundary. As side results, we derive an upper bound on the capacity of MAC with noiseless feedback and study a new problem involving a hybrid of hypothesis testing and data transmission.
Submitted 11 June, 2023;
originally announced June 2023.
-
Unwrapping NPT simulations to calculate diffusion coefficients
Authors:
Jakob Tómas Bullerjahn,
Sören von Bülow,
Maziar Heidari,
Jérôme Hénin,
Gerhard Hummer
Abstract:
In molecular dynamics simulations in the NPT ensemble at constant pressure, the size and shape of the periodic simulation box fluctuate with time. For particle images far from the origin, the rescaling of the box by the barostat results in unbounded position displacements. Special care is thus required when a particle trajectory is unwrapped from a projection into the central box under periodic boundary conditions to a trajectory in full three-dimensional space, e.g., for the calculation of diffusion coefficients. Here, we review and compare different schemes in use for trajectory unwrapping. We also specify the corresponding rewrapping schemes to put an unwrapped trajectory back into the central box. On this basis, we then identify a scheme for the calculation of meaningful diffusion coefficients, which is a primary application of trajectory unwrapping. In this scheme, the wrapped and unwrapped trajectory are mutually consistent and their statistical properties are preserved. We conclude with advice on best practice for the consistent unwrapping of constant-pressure simulation trajectories and the calculation of accurate translational diffusion coefficients.
Submitted 16 March, 2023;
originally announced March 2023.
-
Mechanical activation of reversible bonds by low amplitude high frequencies excitations
Authors:
Maziar Heidari,
Théophile Gaichies,
Ludwik Leibler,
Matthieu Labousse
Abstract:
Reversible covalent or supramolecular bonds play an important role in materials science and in biological systems. The equilibrium between open and closed bonds and the association rate can be controlled thermally, chemically, by mechanical pulling, ultrasound or catalysts. In practice, these intrinsic equilibrium methods either suffer from a limited range of tunability or may damage the system. Here, we present a non-equilibrium strategy that exploits the dissipative properties of the system to control and change the dynamic properties of sacrificial and reversible networks. We show theoretically and numerically how high-frequency mechanical oscillations of very low amplitude can open or close bonds. This mechanism indicates how reversible bonds could alleviate mechanical fatigue of materials especially at low temperatures where they are fragile. In another area, it suggests that the system can be actively modified by the application of ultrasound to induce gel-fluid transitions and to activate or deactivate adhesion properties.
Submitted 14 March, 2023;
originally announced March 2023.
-
Agnostic PAC Learning of k-juntas Using L2-Polynomial Regression
Authors:
Mohsen Heidari,
Wojciech Szpankowski
Abstract:
Many conventional learning algorithms rely on loss functions other than the natural 0-1 loss for computational efficiency and theoretical tractability. Among them are approaches based on absolute loss (L1 regression) and square loss (L2 regression). The first is proved to be an agnostic PAC learner for various important concept classes such as juntas and half-spaces. On the other hand, the second is preferable because of its computational efficiency, which is linear in the sample size. However, its PAC learnability is still unknown as guarantees have been proved only under distributional restrictions. The question of whether L2 regression is an agnostic PAC learner for 0-1 loss has been open since 1993 and has yet to be answered.
This paper resolves this problem for the junta class on the Boolean cube -- proving agnostic PAC learning of k-juntas using L2 polynomial regression. Moreover, we present a new PAC learning algorithm based on the Boolean Fourier expansion with lower computational complexity. Fourier-based algorithms, such as Linial et al. (1993), have been used under distributional restrictions, such as uniform distribution. We show that with an appropriate change, one can apply those algorithms in agnostic settings without any distributional assumption. We prove our results by connecting the PAC learning with 0-1 loss to the minimum mean square estimation (MMSE) problem. We derive an elegant upper bound on the 0-1 loss in terms of the MMSE error and show that the sign of the MMSE is a PAC learner for any concept class containing it.
Submitted 8 March, 2023;
originally announced March 2023.
-
Advances in Medical Image Analysis with Vision Transformers: A Comprehensive Review
Authors:
Reza Azad,
Amirhossein Kazerouni,
Moein Heidari,
Ehsan Khodapanah Aghdam,
Amirali Molaei,
Yiwei Jia,
Abin Jose,
Rijo Roy,
Dorit Merhof
Abstract:
The remarkable performance of the Transformer architecture in natural language processing has recently also triggered broad interest in Computer Vision. Among other merits, Transformers are witnessed as capable of learning long-range dependencies and spatial correlations, which is a clear advantage over convolutional neural networks (CNNs), which have been the de facto standard in Computer Vision problems so far. Thus, Transformers have become an integral part of modern medical image analysis. In this review, we provide an encyclopedic review of the applications of Transformers in medical imaging. Specifically, we present a systematic and thorough review of relevant recent Transformer literature for different medical image analysis tasks, including classification, segmentation, detection, registration, synthesis, and clinical report generation. For each of these applications, we investigate the novelty, strengths and weaknesses of the different proposed strategies and develop taxonomies highlighting key properties and contributions. Further, if applicable, we outline current benchmarks on different datasets. Finally, we summarize key challenges and discuss different future research directions. In addition, we have provided cited papers with their corresponding implementations in https://github.com/mindflow-institue/Awesome-Transformer.
Submitted 5 November, 2023; v1 submitted 9 January, 2023;
originally announced January 2023.
-
On Non-Interactive Source Simulation via Fourier Transform
Authors:
Farhad Shirani,
Mohsen Heidari
Abstract:
The non-interactive source simulation (NISS) scenario is considered. In this scenario, a pair of distributed agents, Alice and Bob, observe a distributed binary memoryless source $(X^d,Y^d)$ generated based on joint distribution $P_{X,Y}$. The agents wish to produce a pair of discrete random variables $(U_d,V_d)$ with joint distribution $P_{U_d,V_d}$, such that $P_{U_d,V_d}$ converges in total variation distance to a target distribution $Q_{U,V}$ as the input blocklength $d$ is taken to be asymptotically large. Inner and outer bounds are obtained on the set of distributions $Q_{U,V}$ which can be produced given an input distribution $P_{X,Y}$. To this end, a bijective mapping from the set of distributions $Q_{U,V}$ to a union of star-convex sets is provided. By leveraging proof techniques from discrete Fourier analysis along with a novel randomized rounding technique, inner and outer bounds are derived for each of these star-convex sets, and by inverting the aforementioned bijective mapping, necessary and sufficient conditions on $Q_{U,V}$ and $P_{X,Y}$ are provided under which $Q_{U,V}$ can be produced from $P_{X,Y}$. The bounds are applicable in NISS scenarios where the output alphabets $\mathcal{U}$ and $\mathcal{V}$ have arbitrary finite size. In case of binary output alphabets, the outer-bound recovers the previously best-known outer-bound.
Submitted 18 December, 2022;
originally announced December 2022.
-
Diffusion Models for Medical Image Analysis: A Comprehensive Survey
Authors:
Amirhossein Kazerouni,
Ehsan Khodapanah Aghdam,
Moein Heidari,
Reza Azad,
Mohsen Fayyaz,
Ilker Hacihaliloglu,
Dorit Merhof
Abstract:
Denoising diffusion models, a class of generative models, have garnered immense interest lately in various deep-learning problems. A diffusion probabilistic model defines a forward diffusion stage where the input data is gradually perturbed over several steps by adding Gaussian noise and then learns to reverse the diffusion process to retrieve the desired noise-free data from noisy data samples. Diffusion models are widely appreciated for their strong mode coverage and quality of the generated samples despite their known computational burdens. Capitalizing on the advances in computer vision, the field of medical imaging has also observed a growing interest in diffusion models. To help the researcher navigate this profusion, this survey intends to provide a comprehensive overview of diffusion models in the discipline of medical image analysis. Specifically, we introduce the solid theoretical foundation and fundamental concepts behind diffusion models and the three generic diffusion modelling frameworks: diffusion probabilistic models, noise-conditioned score networks, and stochastic differential equations. Then, we provide a systematic taxonomy of diffusion models in the medical domain and propose a multi-perspective categorization based on their application, imaging modality, organ of interest, and algorithms. To this end, we cover extensive applications of diffusion models in the medical domain. Furthermore, we emphasize the practical use case of some selected approaches, and then we discuss the limitations of the diffusion models in the medical domain and propose several directions to fulfill the demands of this field. Finally, we gather the overviewed studies with their available open-source implementations at https://github.com/amirhossein-kz/Awesome-Diffusion-Models-in-Medical-Imaging.
Submitted 3 June, 2023; v1 submitted 14 November, 2022;
originally announced November 2022.
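The forward diffusion stage mentioned in the abstract above (gradually perturbing the input by adding Gaussian noise over several steps) can be illustrated with a short sketch. It assumes the commonly used closed form $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$ and a linear variance schedule; the schedule endpoints, the step count, and all function names are assumptions made for illustration and are not taken from the survey.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed schedule, for illustration)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_s), i.e. alpha_bar_t for every step t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_diffuse(x0, t, alphas_bar, rng):
    """Draw x_t given x_0 in one shot using the closed-form forward process."""
    a_bar = alphas_bar[t]
    return [
        math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * rng.gauss(0.0, 1.0)
        for x in x0
    ]


rng = random.Random(0)
alphas_bar = cumulative_alphas(linear_beta_schedule(1000))
x0 = [1.0, 1.0, 1.0, 1.0]                           # toy "image" with four pixels
x_early = forward_diffuse(x0, 0, alphas_bar, rng)   # still close to x0
x_late = forward_diffuse(x0, 999, alphas_bar, rng)  # dominated by noise
```

Only the forward (noising) direction is shown; the learned reverse process that recovers noise-free data, and the other two frameworks named in the abstract, are outside the scope of this sketch.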
-
Post trade allocation: how much are bunched orders costing your performance?
Authors:
Ali Hirsa,
Massoud Heidari
Abstract:
Individual trade orders are often bunched into a block order for processing efficiency, where in post execution, they are allocated into individual accounts. Since Regulators have not mandated any specific post trade allocation practice or methodology, entities try to rigorously follow internal policies and procedures to meet the minimum Regulatory ask of being procedurally fair and equitable. However, as many have found over the years, there is no simple solution for post trade allocation between accounts that results in a uniform distribution of returns. Furthermore, in many instances, the divergences between returns do not dissipate with more transactions, and tend to increase in some cases. This paper is the first systematic treatment of trade allocation risk. We shed light on the reasons for return divergence among accounts, and we present a solution that supports uniform allocation of return irrespective of number of accounts and trade sizes.
Submitted 13 October, 2022;
originally announced October 2022.
-
A unit-based symbolic execution method for detecting memory corruption vulnerabilities in executable codes
Authors:
Sara Baradaran,
Mahdi Heidari,
Ali Kamali,
Maryam Mouzarani
Abstract:
Memory corruption is a serious class of software vulnerabilities, which requires careful attention to be detected and removed from applications before getting exploited and harming the system users. Symbolic execution is a well-known method for analyzing programs and detecting various vulnerabilities, e.g., memory corruption. Although this method is sound and complete in theory, it faces some challenges, such as path explosion, when applied to real-world complex programs. In this paper, we present a method for improving the efficiency of symbolic execution and detecting four classes of memory corruption vulnerabilities in executable codes, i.e., heap-based buffer overflow, stack-based buffer overflow, use-after-free, and double-free. We perform symbolic execution only on test units rather than the whole program to avoid path explosion. In our method, test units are considered parts of the program's code, which might contain vulnerable statements and are statically identified based on the specifications of memory corruption vulnerabilities. Then, each test unit is symbolically executed to calculate path and vulnerability constraints of each statement of the unit, which determine the conditions on unit input data for executing that statement or activating vulnerabilities in it, respectively. Solving these constraints gives us input values for the test unit, which execute the desired statements and reveal vulnerabilities in them. Finally, we use machine learning to approximate the correlation between system and unit input data. Thereby, we generate system inputs that enter the program, reach vulnerable instructions in the desired test unit, and reveal vulnerabilities in them. This method is implemented as a plugin for angr framework and evaluated using a group of benchmark programs. The experiments show its superiority over similar tools in accuracy and performance.
Submitted 22 December, 2022; v1 submitted 9 October, 2022;
originally announced October 2022.
-
Expected Worst Case Regret via Stochastic Sequential Covering
Authors:
Changlong Wu,
Mohsen Heidari,
Ananth Grama,
Wojciech Szpankowski
Abstract:
We study the problem of sequential prediction and online minimax regret with stochastically generated features under a general loss function. We introduce a notion of expected worst case minimax regret that generalizes and encompasses prior known minimax regrets. For such minimax regrets we establish tight upper bounds via a novel concept of stochastic global sequential covering. We show that for a hypothesis class of VC-dimension $\mathsf{VC}$ and $i.i.d.$ generated features of length $T$, the cardinality of the stochastic global sequential covering can be upper bounded with high probability (whp) by $e^{O(\mathsf{VC} \cdot \log^2 T)}$. We then improve this bound by introducing a new complexity measure called the Star-Littlestone dimension, and show that classes with Star-Littlestone dimension $\mathsf{SL}$ admit a stochastic global sequential covering of order $e^{O(\mathsf{SL} \cdot \log T)}$. We further establish upper bounds for real valued classes with finite fat-shattering numbers. Finally, by applying information-theoretic tools of the fixed design minimax regrets, we provide lower bounds for the expected worst case minimax regret. We demonstrate the effectiveness of our approach by establishing tight bounds on the expected worst case minimax regrets for logarithmic loss and general mixable losses.
Submitted 17 September, 2022; v1 submitted 9 September, 2022;
originally announced September 2022.
-
TransDeepLab: Convolution-Free Transformer-based DeepLab v3+ for Medical Image Segmentation
Authors:
Reza Azad,
Moein Heidari,
Moein Shariatnia,
Ehsan Khodapanah Aghdam,
Sanaz Karimijafarbigloo,
Ehsan Adeli,
Dorit Merhof
Abstract:
Convolutional neural networks (CNNs) have been the de facto standard in a diverse set of computer vision tasks for many years. Especially, deep neural networks based on seminal architectures such as U-shaped models with skip-connections or atrous convolution with pyramid pooling have been tailored to a wide range of medical image analysis tasks. The main advantage of such architectures is that they are prone to detaining versatile local features. However, as a general consensus, CNNs fail to capture long-range dependencies and spatial correlations due to the intrinsic property of confined receptive field size of convolution operations. Alternatively, Transformer, profiting from global information modelling that stems from the self-attention mechanism, has recently attained remarkable performance in natural language processing and computer vision. Nevertheless, previous studies prove that both local and global features are critical for a deep model in dense prediction, such as segmenting complicated structures with disparate shapes and configurations. To this end, this paper proposes TransDeepLab, a novel DeepLab-like pure Transformer for medical image segmentation. Specifically, we exploit hierarchical Swin-Transformer with shifted windows to extend the DeepLabv3 and model the Atrous Spatial Pyramid Pooling (ASPP) module. A thorough search of the relevant literature yielded that we are the first to model the seminal DeepLab model with a pure Transformer-based model. Extensive experiments on various medical image segmentation tasks verify that our approach performs superior or on par with most contemporary works on an amalgamation of Vision Transformer and CNN-based methods, along with a significant reduction of model complexity. The codes and trained models are publicly available at https://github.com/rezazad68/transdeeplab
Submitted 1 August, 2022;
originally announced August 2022.
-
TransNorm: Transformer Provides a Strong Spatial Normalization Mechanism for a Deep Segmentation Model
Authors:
Reza Azad,
Mohammad T. AL-Antary,
Moein Heidari,
Dorit Merhof
Abstract:
In the past few years, convolutional neural networks (CNNs), particularly U-Net, have been the prevailing technique in the medical image processing era. Specifically, the seminal U-Net, as well as its alternatives, have successfully managed to address a wide variety of medical image segmentation tasks. However, these architectures are intrinsically imperfect as they fail to exhibit long-range interactions and spatial dependencies leading to a severe performance drop in the segmentation of medical images with variable shapes and structures. Transformers, preliminary proposed for sequence-to-sequence prediction, have arisen as surrogate architectures to precisely model global information assisted by the self-attention mechanism. Despite being feasibly designed, utilizing a pure Transformer for image segmentation purposes can result in limited localization capacity stemming from inadequate low-level features. Thus, a line of research strives to design robust variants of Transformer-based U-Net. In this paper, we propose Trans-Norm, a novel deep segmentation framework which concomitantly consolidates a Transformer module into both encoder and skip-connections of the standard U-Net. We argue that the expedient design of skip-connections can be crucial for accurate segmentation as it can assist in feature fusion between the expanding and contracting paths. In this respect, we derive a Spatial Normalization mechanism from the Transformer module to adaptively recalibrate the skip connection path. Extensive experiments across three typical tasks for medical image segmentation demonstrate the effectiveness of TransNorm. The codes and trained models are publicly available at https://github.com/rezazad68/transnorm.
Submitted 27 July, 2022;
originally announced July 2022.
-
HiFormer: Hierarchical Multi-scale Representations Using Transformers for Medical Image Segmentation
Authors:
Moein Heidari,
Amirhossein Kazerouni,
Milad Soltany,
Reza Azad,
Ehsan Khodapanah Aghdam,
Julien Cohen-Adad,
Dorit Merhof
Abstract:
Convolutional neural networks (CNNs) have been the consensus for medical image segmentation tasks. However, they suffer from the limitation in modeling long-range dependencies and spatial correlations due to the nature of convolution operation. Although transformers were first developed to address this issue, they fail to capture low-level features. In contrast, it is demonstrated that both local and global features are crucial for dense prediction, such as segmenting in challenging contexts. In this paper, we propose HiFormer, a novel method that efficiently bridges a CNN and a transformer for medical image segmentation. Specifically, we design two multi-scale feature representations using the seminal Swin Transformer module and a CNN-based encoder. To secure a fine fusion of global and local features obtained from the two aforementioned representations, we propose a Double-Level Fusion (DLF) module in the skip connection of the encoder-decoder structure. Extensive experiments on various medical image segmentation datasets demonstrate the effectiveness of HiFormer over other CNN-based, transformer-based, and hybrid methods in terms of computational complexity, and quantitative and qualitative results. Our code is publicly available at: https://github.com/amirhossein-kz/HiFormer
Submitted 9 January, 2023; v1 submitted 18 July, 2022;
originally announced July 2022.
-
Precise Regret Bounds for Log-loss via a Truncated Bayesian Algorithm
Authors:
Changlong Wu,
Mohsen Heidari,
Ananth Grama,
Wojciech Szpankowski
Abstract:
We study the sequential general online regression, known also as the sequential probability assignments, under logarithmic loss when compared against a broad class of experts. We focus on obtaining tight, often matching, lower and upper bounds for the sequential minimax regret that are defined as the excess loss it incurs over a class of experts. After proving a general upper bound, we consider some specific classes of experts from Lipschitz class to bounded Hessian class and derive matching lower and upper bounds with provably optimal constants. Our bounds work for a wide range of values of the data dimension and the number of rounds. To derive lower bounds, we use tools from information theory (e.g., Shtarkov sum) and for upper bounds, we resort to new "smooth truncated covering" of the class of experts. This allows us to find constructive proofs by applying a simple and novel truncated Bayesian algorithm. Our proofs are substantially simpler than the existing ones and yet provide tighter (and often optimal) bounds.
Submitted 7 May, 2022;
originally announced May 2022.
-
Intervertebral Disc Labeling With Learning Shape Information, A Look Once Approach
Authors:
Reza Azad,
Moein Heidari,
Julien Cohen-Adad,
Ehsan Adeli,
Dorit Merhof
Abstract:
Accurate and automatic segmentation of intervertebral discs from medical images is a critical task for the assessment of spine-related diseases such as osteoporosis, vertebral fractures, and intervertebral disc herniation. To date, various approaches have been developed in the literature which routinely rely on detecting the discs as the primary step. A disadvantage of many cohort studies is that the localization algorithm also yields false-positive detections. In this study, we aim to alleviate this problem by proposing a novel U-Net-based structure to predict a set of candidates for intervertebral disc locations. In our design, we integrate the image shape information (image gradients) to encourage the model to learn rich and generic geometrical information. This additional signal guides the model to selectively emphasize the contextual representation and suppress the less discriminative features. On the post-processing side, to further decrease the false positive rate, we propose a permutation invariant 'look once' model, which accelerates the candidate recovery procedure. In comparison with previous studies, our proposed approach does not need to perform the selection in an iterative fashion. The proposed method was evaluated on the spine generic public multi-center dataset and demonstrated superior performance compared to previous work. We have provided the implementation code at https://github.com/rezazad68/intervertebral-lookonce
Submitted 6 April, 2022;
originally announced April 2022.
-
Toward Physically Realizable Quantum Neural Networks
Authors:
Mohsen Heidari,
Ananth Grama,
Wojciech Szpankowski
Abstract:
There has been significant recent interest in quantum neural networks (QNNs), along with their applications in diverse domains. Current solutions for QNNs pose significant challenges concerning their scalability, ensuring that the postulates of quantum mechanics are satisfied and that the networks are physically realizable. The exponential state space of QNNs poses challenges for the scalability of training procedures. The no-cloning principle prohibits making multiple copies of training samples, and the measurement postulates lead to non-deterministic loss functions. Consequently, the physical realizability and efficiency of existing approaches that rely on repeated measurement of several copies of each sample for training QNNs are unclear. This paper presents a new model for QNNs that relies on band-limited Fourier expansions of transfer functions of quantum perceptrons (QPs) to design scalable training procedures. This training procedure is augmented with a randomized quantum stochastic gradient descent technique that eliminates the need for sample replication. We show that this training procedure converges to the true minima in expectation, even in the presence of non-determinism due to quantum measurement. Our solution has a number of important benefits: (i) using QPs with concentrated Fourier power spectrum, we show that the training procedure for QNNs can be made scalable; (ii) it eliminates the need for resampling, thus staying consistent with the no-cloning rule; and (iii) enhanced data efficiency for the overall training process since each data sample is processed once per epoch. We present a detailed theoretical foundation for our models and methods' scalability, accuracy, and data efficiency. We also validate the utility of our approach through a series of numerical experiments.
Submitted 22 March, 2022;
originally announced March 2022.
-
Online User Profiling to Detect Social Bots on Twitter
Authors:
Maryam Heidari,
James H Jr Jones,
Ozlem Uzuner
Abstract:
Social media platforms can expose influential trends in many aspects of everyday life. However, the movements they represent can be contaminated by disinformation. Social bots are one of the significant sources of disinformation in social media. Social bots can pose serious cyber threats to society and public opinion. This research aims to develop machine learning models to detect bots based on the user's profile extracted from a Tweet's text. An online user's profile shows the user's personal information, such as age, gender, education, and personality. In this work, the user's profile is constructed based on the user's online posts. This work's main contribution is three-fold: First, we aim to improve bot detection through machine learning models based on the user's personal information generated from the user's online comments. When comparing two online posts, the similarity of personal information makes it difficult to differentiate a bot from a human user. However, this research turns personal information similarity between two online posts into an advantage for the new bot detection model. The proposed model for bot detection creates user profiles based on personal information such as age, personality, gender, and education from users' online posts and introduces a machine learning model to detect social bots with high prediction accuracy based on personal information. Second, we create a new public data set that shows the user's profile for more than 6900 Twitter accounts in the Cresci 2017 data set.
Submitted 9 March, 2022;
originally announced March 2022.
-
Contextual Attention Network: Transformer Meets U-Net
Authors:
Reza Azad,
Moein Heidari,
Yuli Wu,
Dorit Merhof
Abstract:
Currently, convolutional neural networks (CNN) (e.g., U-Net) have become the de facto standard and attained immense success in medical image segmentation. However, as a downside, CNN based methods are a double-edged sword as they fail to build long-range dependencies and global context connections due to the limited receptive field that stems from the intrinsic characteristics of the convolution operation. Hence, recent articles have exploited Transformer variants for medical image segmentation tasks which open up great opportunities due to their innate capability of capturing long-range correlations through the attention mechanism. Although being feasibly designed, most of the cohort studies incur prohibitive performance in capturing local information, thereby resulting in less lucidness of boundary areas. In this paper, we propose a contextual attention network to tackle the aforementioned limitations. The proposed method uses the strength of the Transformer module to model the long-range contextual dependency. Simultaneously, it utilizes the CNN encoder to capture local semantic information. In addition, an object-level representation is included to model the regional interaction map. The extracted hierarchical features are then fed to the contextual attention module to adaptively recalibrate the representation space using the local information. Then, they emphasize the informative regions while taking into account the long-range contextual dependency derived by the Transformer module. We validate our method on several large-scale public medical image segmentation datasets and achieve state-of-the-art performance. We have provided the implementation code in https://github.com/rezazad68/TMUnet.
Submitted 31 March, 2022; v1 submitted 2 March, 2022;
originally announced March 2022.
-
Upper Bounds on the Feedback Error Exponent of Channels With States and Memory
Authors:
Mohsen Heidari,
Achilleas Anastasopoulos,
S. Sandeep Pradhan
Abstract:
As a class of state-dependent channels, Markov channels have been long studied in information theory for characterizing the feedback capacity and error exponent. This paper studies a more general variant of such channels where the state evolves via a general stochastic process, not necessarily Markov or ergodic. The states are assumed to be unknown to the transmitter and the receiver, but the underlying probability distributions are known. For this setup, we derive an upper bound on the feedback error exponent and the feedback capacity with variable-length codes. The bounds are expressed in terms of the directed mutual information and directed relative entropy. The bounds on the error exponent are simplified to Burnashev's expression for discrete memoryless channels. Our method relies on tools from the theory of martingales to analyze a stochastic process defined based on the entropy of the message given the past channel's outputs.
Submitted 20 February, 2022;
originally announced February 2022.
-
A Theoretical Framework for Learning from Quantum Data
Authors:
Mohsen Heidari,
Arun Padakandla,
Wojciech Szpankowski
Abstract:
Over decades, the traditional information theory of source and channel coding has advanced toward learning and effective extraction of information from data. We propose to go one step further and offer a theoretical foundation for learning classical patterns from quantum data. However, there are several roadblocks to laying the groundwork for such a generalization. First, classical data must be replaced by a density operator over a Hilbert space. Hence, in contrast to problems such as state tomography, our samples are i.i.d. density operators. The second challenge is even more profound, since our only interaction with a quantum state is through a measurement which, due to the no-cloning postulate, loses information after measuring it. With this in mind, we present a quantum counterpart of the well-known PAC framework. Based on that, we propose a quantum analogue of the ERM algorithm for learning measurement hypothesis classes. Then, we establish upper bounds on the quantum sample complexity of quantum concept classes.
Submitted 13 July, 2021;
originally announced July 2021.
-
A shape preserving quasi-interpolation operator based on a new transcendental RBF
Authors:
Mohammad Heidari,
Maryam Mohammadi,
Stefano De Marchi
Abstract:
It is well-known that the univariate Multiquadric quasi-interpolation operator is constructed based on the piecewise linear interpolation by |x|. In this paper, we first introduce a new transcendental RBF based on the hyperbolic tangent function as a smooth approximant to f(r)=r with higher accuracy and better convergence properties than the multiquadric. Then Wu-Schaback's quasi-interpolation formula is rewritten using the proposed RBF. It preserves convexity and monotonicity. We prove that the proposed scheme converges with a rate of O(h^2). So it has a higher degree of smoothness. Some numerical experiments are given in order to demonstrate the efficiency and accuracy of the method.
Submitted 10 June, 2021;
originally announced June 2021.
-
D-optimal designs for the Mitscherlich non-linear regression function
Authors:
Maliheh Heidari,
Md Abu Manju,
Pieta C. IJzerman-Boon,
Edwin R. van den Heuvel
Abstract:
Mitscherlich's function is a well-known three-parameter non-linear regression function that quantifies the relation between a stimulus or a time variable and a response. Optimal designs for this function have been constructed only for normally distributed responses with homoscedastic variances. In this paper, we construct D-optimal designs for discrete and continuous responses having their distribution function in the exponential family. We also demonstrate the connection with D-optimality for weighted linear regression.
Submitted 4 April, 2021;
originally announced April 2021.
-
On Agnostic PAC Learning using $\mathcal{L}_2$-polynomial Regression and Fourier-based Algorithms
Authors:
Mohsen Heidari,
Wojciech Szpankowski
Abstract:
We develop a framework using Hilbert spaces as a proxy to analyze PAC learning problems with structural properties. We consider a joint Hilbert space incorporating the relation between the true label and the predictor under a joint distribution $D$. We demonstrate that agnostic PAC learning with 0-1 loss is equivalent to an optimization in the Hilbert space domain. With our model, we revisit the PAC learning problem using methods based on least-squares such as $\mathcal{L}_2$ polynomial regression and Linial's low-degree algorithm. We study learning with respect to several hypothesis classes such as half-spaces and polynomial-approximated classes (i.e., functions approximated by a fixed-degree polynomial). We prove that (under some distributional assumptions) such methods obtain generalization error up to $2opt$ with $opt$ being the optimal error of the class. Hence, we show the tightest bound on generalization error when $opt\leq 0.2$.
Submitted 11 February, 2021;
originally announced February 2021.
-
Learning k-qubit Quantum Operators via Pauli Decomposition
Authors:
Mohsen Heidari,
Wojciech Szpankowski
Abstract:
Motivated by the limited qubit capacity of current quantum systems, we study the quantum sample complexity of $k$-qubit quantum operators, i.e., operations applicable on only $k$ out of $d$ qubits. The problem is studied according to the quantum probably approximately correct (QPAC) model abiding by quantum mechanical laws such as no-cloning, state collapse, and measurement incompatibility. With the delicacy of quantum samples and the richness of quantum operations, one expects a significantly larger quantum sample complexity.
This paper proves the contrary. We show that the quantum sample complexity of $k$-qubit quantum operations is comparable to the classical sample complexity of their counterparts (juntas), at least when $\frac{k}{d}\ll 1$. This is surprising, especially since sample duplication is prohibited, and measurement incompatibility would lead to an exponentially larger sample complexity with standard methods. Our approach is based on the Pauli decomposition of quantum operators and a technique that we name Quantum Shadow Sampling (QSS) to reduce the sample complexity exponentially. The results are proved by developing (i) a connection between the learning loss and the Pauli decomposition; (ii) a scalable QSS circuit for estimating the Pauli coefficients; and (iii) a quantum algorithm for learning $k$-qubit operators with sample complexity $O(\frac{k4^k}{ε^2}\log d)$.
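To get a feel for the stated bound, the following minimal sketch evaluates $\frac{k4^k}{ε^2}\log d$ numerically. The function name `qss_sample_bound` and the constant `c` are illustrative assumptions; the abstract gives only the asymptotic order, so the values indicate scaling behavior rather than actual sample counts.

```python
import math


def qss_sample_bound(k: int, d: int, eps: float, c: float = 1.0) -> float:
    """Evaluate c * k * 4**k / eps**2 * log(d).

    `c` stands in for the constant hidden by the O(.) notation
    (an assumption for illustration, not a value from the paper).
    """
    if k < 1 or d < 2 or k > d:
        raise ValueError("need 1 <= k <= d and d >= 2")
    if eps <= 0:
        raise ValueError("eps must be positive")
    return c * k * 4**k / eps**2 * math.log(d)


if __name__ == "__main__":
    # The expression grows exponentially in k but only logarithmically in d.
    for d in (10, 100, 10_000):
        print(d, round(qss_sample_bound(k=2, d=d, eps=0.1)))
```

Squaring the number of qubits $d$ only doubles the value, which is the sense in which the dependence on $d$ is mild when $\frac{k}{d}\ll 1$.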
Submitted 24 April, 2023; v1 submitted 9 February, 2021;
originally announced February 2021.
-
A practical method for pupil segmentation in challenging conditions
Authors:
Donya Khaledyan,
Mohammad Eshghi,
Morteza Heidari,
Abolfazl Zargari Khuzani,
Najmeh Mashhadi
Abstract:
Various methods have been proposed for authentication, including passwords or pattern drawing, which are clearly visible on personal electronic devices. However, these methods of authentication are vulnerable, as passwords and cards can be forgotten, lost, or stolen. Therefore, great interest has developed in individual authentication using biometric methods, which are based on physical and behavioral features that cannot be forgotten or stolen. Authentication methods are used widely in portable devices, where battery lifetime and response time are essential concerns. Because these systems need to be fast and low power, designing efficient methods is still critical. In this paper, we propose a new low-power and fast method for pupil segmentation based on approximate computing. By trading a minor level of accuracy, it obtains significant improvements in power consumption and time savings, which makes the algorithm suitable for hardware implementation. Furthermore, the experimental PSNR and SSIM results show that the error rate of this method is negligible.
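For reference, PSNR follows the standard definition $10\log_{10}(\mathrm{MAX}^2/\mathrm{MSE})$. The sketch below is a minimal pure-Python version for grayscale images stored as lists of rows; the function name `psnr` and the 8-bit default `max_value` are assumptions for illustration, not details taken from the paper.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-sized grayscale images.

    Each image is a list of rows of pixel intensities.
    """
    ref_flat = [p for row in reference for p in row]
    test_flat = [p for row in test for p in row]
    if not ref_flat or len(ref_flat) != len(test_flat):
        raise ValueError("images must be non-empty and the same size")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, test_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the approximate output is closer to the reference, so a small accuracy loss from approximate computing shows up as a small drop in PSNR.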
Submitted 26 September, 2020;
originally announced September 2020.
-
Image quality enhancement in wireless capsule endoscopy with adaptive fraction gamma transformation and unsharp masking filter
Authors:
Rezvan Ezatian,
Donya Khaledyan,
Kian Jafari,
Morteza Heidari,
Abolfazl Zargari Khuzani,
Najmeh Mashhadi
Abstract:
Wireless Capsule Endoscopy (WCE) was introduced in 2001 as one of the key approaches to observing the entire gastrointestinal (GI) tract, particularly the small bowel. It has been used to detect diseases in the gastrointestinal tract. Endoscopic image analysis is still a field with many open problems. The quality of many of the images WCE produces is rather unacceptable due to the nature of this imaging system, which makes assessment by physicians and computer-aided diagnosis difficult. In this paper, a novel technique is proposed to improve the quality of images captured by WCE. More specifically, it enhances the brightness and contrast and preserves the color information while reducing computational complexity. Furthermore, the experimental PSNR and SSIM results confirm that the error rate of this method is negligible. Moreover, the proposed method improves intensity restricted average local entropy (IRMLE) by 22% and color enhancement factor (CEF) by 10%, and can keep the lightness of the image effectively. Our method achieves better visual quality and objective assessments compared to state-of-the-art methods.
Submitted 26 September, 2020;
originally announced September 2020.
-
Applying a random projection algorithm to optimize machine learning model for breast lesion classification
Authors:
Morteza Heidari,
Sivaramakrishnan Lakshmivarahan,
Seyedehnafiseh Mirniaharikandehei,
Gopichandh Danala,
Sai Kiran R. Maryada,
Hong Liu,
Bin Zheng
Abstract:
Machine learning is widely used in developing computer-aided diagnosis (CAD) schemes of medical images. However, CAD usually computes a large number of image features from the targeted regions, which creates the challenge of how to identify a small and optimal feature vector to build robust machine learning models. In this study, we investigate the feasibility of applying a random projection algorithm to build an optimal feature vector from the initially CAD-generated large feature pool and improve the performance of the machine learning model. We assemble a retrospective dataset involving 1,487 cases of mammograms in which 644 cases have confirmed malignant mass lesions and 843 have benign lesions. A CAD scheme is first applied to segment mass regions and initially compute 181 features. Then, support vector machine (SVM) models embedded with several feature dimensionality reduction methods are built to predict the likelihood of lesions being malignant. All SVM models are trained and tested using a leave-one-case-out cross-validation method. SVM generates a likelihood score for each segmented mass region depicted on a one-view mammogram. By fusing the two scores of the same mass depicted on two-view mammograms, a case-based likelihood score is also evaluated. Compared with principal component analysis, nonnegative matrix factorization, and Chi-squared methods, SVM embedded with the random projection algorithm yielded a significantly higher case-based lesion classification performance, with an area under the ROC curve of 0.84±0.01 (p<0.02). The study demonstrates that the random projection algorithm is a promising method to generate optimal feature vectors to help improve the performance of machine learning models of medical images.
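As a rough illustration of the dimensionality-reduction step, the sketch below applies a Gaussian random projection to a feature matrix using only the standard library. The abstract does not say which random projection variant was used, so the Gaussian matrix, the function name `gaussian_random_projection`, and the target dimension in the usage note are all assumptions.

```python
import math
import random


def gaussian_random_projection(features, out_dim, seed=0):
    """Project each row of `features` (n x d) down to `out_dim` dimensions.

    Uses a d x out_dim matrix with i.i.d. N(0, 1/out_dim) entries, a common
    Johnson-Lindenstrauss-style choice (assumed here, not taken from the paper).
    """
    if not features or out_dim < 1:
        raise ValueError("need at least one row and out_dim >= 1")
    in_dim = len(features[0])
    if any(len(row) != in_dim for row in features):
        raise ValueError("all rows must have the same length")
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)  # standard deviation of each entry
    matrix = [[rng.gauss(0.0, scale) for _ in range(out_dim)]
              for _ in range(in_dim)]
    # Plain matrix product: (n x d) times (d x out_dim).
    return [
        [sum(row[i] * matrix[i][j] for i in range(in_dim))
         for j in range(out_dim)]
        for row in features
    ]
```

For example, `gaussian_random_projection(rows, out_dim=20)` would map 181-feature rows to 20 features. The projected rows could then be passed to any classifier; unlike PCA, the projection matrix does not depend on the training data.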
Submitted 9 September, 2020;
originally announced September 2020.
-
Deep learning denoising for EOG artifacts removal from EEG signals
Authors:
Najmeh Mashhadi,
Abolfazl Zargari Khuzani,
Morteza Heidari,
Donya Khaledyan
Abstract:
There are many sources of interference encountered in electroencephalogram (EEG) recordings, specifically ocular, muscular, and cardiac artifacts. Rejection of EEG artifacts is an essential process in EEG analysis since such artifacts cause many problems in EEG signal analysis. One of the most challenging issues in EEG denoising is removing ocular artifacts, where electrooculographic (EOG) and EEG signals overlap in both the frequency and time domains. In this paper, we build and train a deep learning model to deal with this challenge and remove the ocular artifacts effectively. In the proposed scheme, we convert each EEG signal to an image to be fed to a U-NET model, which is a deep learning model usually used in image segmentation tasks. We propose three different schemes and train our U-NET based models to purify contaminated EEG signals in a manner similar to image segmentation. The results confirm that one of our schemes can achieve reliable and promising accuracy in reducing the mean square error between the target signal (pure EEGs) and the predicted signal (purified EEGs).
Submitted 12 September, 2020;
originally announced September 2020.
-
An approach to human iris recognition using quantitative analysis of image features and machine learning
Authors:
Abolfazl Zargari Khuzani,
Najmeh Mashhadi,
Morteza Heidari,
Donya Khaledyan
Abstract:
The Iris pattern is a unique biological feature for each individual, making it a valuable and powerful tool for human identification. In this paper, an efficient framework for iris recognition is proposed in four steps. (1) Iris segmentation (using a relative total variation combined with Coarse Iris Localization), (2) feature extraction (using Shape&density, FFT, GLCM, GLDM, and Wavelet), (3) feature reduction (employing Kernel-PCA) and (4) classification (applying multi-layer neural network) to classify 2000 iris images of CASIA-Iris-Interval dataset obtained from 200 volunteers. The results confirm that the proposed scheme can provide a reliable prediction with an accuracy of up to 99.64%.
Submitted 12 September, 2020;
originally announced September 2020.
-
Applying a random projection algorithm to optimize machine learning model for predicting peritoneal metastasis in gastric cancer patients using CT images
Authors:
Seyedehnafiseh Mirniaharikandehei,
Morteza Heidari,
Gopichandh Danala,
Sivaramakrishnan Lakshmivarahan,
Bin Zheng
Abstract:
Background and Objective: Non-invasively predicting the risk of cancer metastasis before surgery plays an essential role in determining optimal treatment methods for cancer patients (including who can benefit from neoadjuvant chemotherapy). Although developing radiomics based machine learning (ML) models has attracted broad research interest for this purpose, it often faces a challenge of how to build a highly performed and robust ML model using small and imbalanced image datasets. Methods: In this study, we explore a new approach to build an optimal ML model. A retrospective dataset involving abdominal computed tomography (CT) images acquired from 159 patients diagnosed with gastric cancer is assembled. Among them, 121 cases have peritoneal metastasis (PM), while 38 cases do not have PM. A computer-aided detection (CAD) scheme is first applied to segment primary gastric tumor volumes and initially computes 315 image features. Then, two Gradient Boosting Machine (GBM) models embedded with two different feature dimensionality reduction methods, namely, the principal component analysis (PCA) and a random projection algorithm (RPA) and a synthetic minority oversampling technique, are built to predict the risk of the patients having PM. All GBM models are trained and tested using a leave-one-case-out cross-validation method. Results: Results show that the GBM embedded with RPA yielded a significantly higher prediction accuracy (71.2%) than using PCA (65.2%) (p<0.05). Conclusions: The study demonstrated that CT images of the primary gastric tumors contain discriminatory information to predict the risk of PM, and RPA is a promising method to generate optimal feature vector, improving the performance of ML models of medical images.
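The synthetic minority oversampling technique mentioned above can be sketched as follows: each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours. This is a minimal standard-library sketch of the general technique, not the authors' implementation; the function name `smote`, the default `k=5`, and the seeding are assumptions.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate `n_synthetic` samples by interpolating between minority
    samples and their k nearest minority-class neighbours."""
    if len(minority) < 2 or k < 1:
        raise ValueError("need at least two minority samples and k >= 1")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

With the class counts in the abstract (38 cases without PM versus 121 with PM), fully balancing the classes would take 83 synthetic minority samples; whether the study balanced the classes completely is not stated. Oversampling should be applied only to the training fold inside cross-validation so that synthetic points never leak into the held-out case.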
Submitted 1 September, 2020;
originally announced September 2020.
-
Diverse and Styled Image Captioning Using SVD-Based Mixture of Recurrent Experts
Authors:
Marzieh Heidari,
Mehdi Ghatee,
Ahmad Nickabadi,
Arash Pourhasan Nezhad
Abstract:
With great advances in vision and natural language processing, the generation of image captions has become a need. In a recent paper, Mathews, Xie and He [1] developed a new model to generate styled captions by separating semantics and style. In continuation of this work, a new captioning model is developed here, including an image encoder to extract the features, a mixture of recurrent networks to embed the set of extracted features into a set of words, and a sentence generator that combines the obtained words into a stylized sentence. The resulting system, entitled Mixture of Recurrent Experts (MoRE), uses a new training algorithm that derives the singular value decomposition (SVD) of the weight matrices of recurrent neural networks (RNNs) to increase the diversity of captions. Each decomposition step depends on a distinctive factor based on the number of RNNs in MoRE. Since the sentence generator used gives a stylized language corpus without paired images, our captioning model can do the same. Besides, the styled and diverse captions are extracted without training on a densely labeled or styled dataset. To validate this captioning model, we use Microsoft COCO, which is a standard factual image caption dataset. We show that the proposed captioning model can generate diverse and stylized image captions without the need for extra labeling. The results also show better descriptions in terms of content accuracy.
Submitted 7 July, 2020;
originally announced July 2020.
-
Improving performance of CNN to predict likelihood of COVID-19 using chest X-ray images with preprocessing algorithms
Authors:
Morteza Heidari,
Seyedehnafiseh Mirniaharikandehei,
Abolfazl Zargari Khuzani,
Gopichandh Danala,
Yuchen Qiu,
Bin Zheng
Abstract:
With the rapid spread of coronavirus disease (COVID-19) worldwide, chest X-ray radiography has also been used in hospitals to detect COVID-19 infected pneumonia and assess its severity or monitor its prognosis, due to its low cost, low radiation dose, and wide accessibility. However, how to more accurately and efficiently detect COVID-19 infected pneumonia and distinguish it from other community-acquired pneumonia remains a challenge. To address this challenge, in this study we develop and test a new computer-aided diagnosis (CAD) scheme. It includes several image pre-processing algorithms to remove diaphragms, normalize the image contrast-to-noise ratio, and generate three input images, and then links to a transfer learning based convolutional neural network (a VGG16 based CNN model) to classify chest X-ray images into three classes: COVID-19 infected pneumonia, other community-acquired pneumonia, and normal (non-pneumonia) cases. For this purpose, a publicly available dataset of 8,474 chest X-ray images is used, which includes 415 confirmed COVID-19 infected pneumonia, 5,179 community-acquired pneumonia, and 2,880 non-pneumonia cases. The dataset is divided into two subsets with 90% and 10% of the images to train and test the CNN-based CAD scheme, respectively. The testing results show 94.0% overall accuracy in classifying the three classes and 98.6% accuracy in detecting COVID-19 infected cases. Thus, the study demonstrates the feasibility of developing a CAD scheme of chest X-ray images and providing radiologists with useful decision-making support tools for detecting and diagnosing COVID-19 infected pneumonia.
Submitted 11 June, 2020;
originally announced June 2020.
-
Structure of Constrained Systems in Lagrangian Formalism and Degree of Freedom Count
Authors:
Mohammad Javad Heidari,
Ahmad Shirzad
Abstract:
A detailed program is proposed in the Lagrangian formalism to investigate the dynamical behavior of a theory with a singular Lagrangian. This program proceeds, at different levels, in parallel to the Hamiltonian analysis. In particular, we introduce the notions of first class and second class Lagrangian constraints. We show that each sequence of first class constraints leads to a Noether identity and consequently to a gauge transformation. We give a general formula for counting the dynamical variables in the Lagrangian formalism. As the main advantage of the Lagrangian approach, we show that the whole procedure can also be performed covariantly. Several examples are given to make our Lagrangian approach clear.
Submitted 30 March, 2020;
originally announced March 2020.