Exploring Power Side-Channel Challenges in Embedded Systems Security

Authors: Pouya Narimani, Meng Wang, Ulysse Planta, Ali Abbasi

Abstract: Power side-channel (PSC) attacks are widely used in embedded microcontrollers, particularly in cryptographic applications, to extract sensitive information. However, expanding the applications of PSC attacks to broader security contexts in the embedded systems domain faces significant challenges. These include the need for specialized hardware setups to manage high noise levels in real-world targets and assumptions regarding the attacker's knowledge and capabilities. This paper systematically analyzes these challenges and introduces a novel signal-processing method that addresses key limitations, enabling effective PSC attacks in real-world embedded systems without requiring hardware modifications. We validate the proposed approach through experiments on real-world black-box embedded devices, verifying its potential to expand its usage in various embedded systems security applications beyond traditional cryptographic applications.

Submitted 15 October, 2024; originally announced October 2024.

arXiv:2409.03131 [pdf]

Well, that escalated quickly: The Single-Turn Crescendo Attack (STCA)

Authors: Alan Aqrawi, Arian Abbasi

Abstract: This paper introduces a new method for adversarial attacks on large language models (LLMs) called the Single-Turn Crescendo Attack (STCA). Building on the multi-turn crescendo attack method introduced by Russinovich, Salem, and Eldan (2024), which gradually escalates the context to provoke harmful responses, the STCA achieves similar outcomes in a single interaction. By condensing the escalation into a single, well-crafted prompt, the STCA bypasses typical moderation filters that LLMs use to prevent inappropriate outputs. This technique reveals vulnerabilities in current LLMs and emphasizes the importance of stronger safeguards in responsible AI (RAI). The STCA offers a novel method that has not been previously explored.

Submitted 10 September, 2024; v1 submitted 4 September, 2024; originally announced September 2024.

arXiv:2408.12762 [pdf]

Visual Verity in AI-Generated Imagery: Computational Metrics and Human-Centric Analysis

Authors: Memoona Aziz, Umair Rehman, Syed Ali Safi, Amir Zaib Abbasi

Abstract: The rapid advancements in AI technologies have revolutionized the production of graphical content across various sectors, including entertainment, advertising, and e-commerce. These developments have spurred the need for robust evaluation methods to assess the quality and realism of AI-generated images. To address this, we conducted three studies. First, we introduced and validated a questionnaire called Visual Verity, which measures photorealism, image quality, and text-image alignment. Second, we applied this questionnaire to assess images from AI models (DALL-E2, DALL-E3, GLIDE, Stable Diffusion) and camera-generated images, revealing that camera-generated images excelled in photorealism and text-image alignment, while AI models led in image quality. We also analyzed statistical properties, finding that camera-generated images scored lower in hue, saturation, and brightness. Third, we evaluated computational metrics' alignment with human judgments, identifying MS-SSIM and CLIP as the most consistent with human assessments. Additionally, we proposed the Neural Feature Similarity Score (NFSS) for assessing image quality. Our findings highlight the need for refining computational metrics to better capture human visual perception, thereby enhancing AI-generated content evaluation.

Submitted 1 September, 2024; v1 submitted 22 August, 2024; originally announced August 2024.

arXiv:2408.08848 [pdf]

PsychoLex: Unveiling the Psychological Mind of Large Language Models

Authors: Mohammad Amin Abbasi, Farnaz Sadat Mirnezami, Hassan Naderi

Abstract: This paper explores the intersection of psychology and artificial intelligence through the development and evaluation of specialized Large Language Models (LLMs). We introduce PsychoLex, a suite of resources designed to enhance LLMs' proficiency in psychological tasks in both Persian and English. Key contributions include the PsychoLexQA dataset for instructional content and the PsychoLexEval dataset for rigorous evaluation of LLMs in complex psychological scenarios. Additionally, we present the PsychoLexLLaMA model, optimized specifically for psychological applications, demonstrating superior performance compared to general-purpose models. The findings underscore the potential of tailored LLMs for advancing psychological research and applications, while also highlighting areas for further refinement. This research offers a foundational step towards integrating LLMs into specialized psychological domains, with implications for future advancements in AI-driven psychological practice.

Submitted 16 August, 2024; originally announced August 2024.

arXiv:2406.19301 [pdf, other]

MCNC: Manifold Constrained Network Compression

Authors: Chayne Thrash, Ali Abbasi, Parsa Nooralinejad, Soroush Abbasi Koohpayegani, Reed Andreas, Hamed Pirsiavash, Soheil Kolouri

Abstract: The outstanding performance of large foundational models across diverse tasks, from computer vision to speech and natural language processing, has significantly increased their demand. However, storing and transmitting these models pose significant challenges due to their massive size (e.g., 350GB for GPT-3). Recent literature has focused on compressing the original weights or reducing the number of parameters required for fine-tuning these models. These compression methods typically involve constraining the parameter space, for example, through low-rank reparametrization (e.g., LoRA) or quantization (e.g., QLoRA) during model training. In this paper, we present MCNC as a novel model compression method that constrains the parameter space to low-dimensional pre-defined and frozen nonlinear manifolds, which effectively cover this space. Given the prevalence of good solutions in over-parameterized deep neural networks, we show that by constraining the parameter space to our proposed manifold, we can identify high-quality solutions while achieving unprecedented compression rates across a wide variety of tasks. Through extensive experiments in computer vision and natural language processing tasks, we demonstrate that our method, MCNC, significantly outperforms state-of-the-art baselines in terms of compression, accuracy, and/or model reconstruction time.

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.03777 [pdf, other]

Empirical Guidelines for Deploying LLMs onto Resource-constrained Edge Devices

Authors: Ruiyang Qin, Dancheng Liu, Chenhui Xu, Zheyu Yan, Zhaoxuan Tan, Zhenge Jia, Amir Nassereldine, Jiajie Li, Meng Jiang, Ahmed Abbasi, Jinjun Xiong, Yiyu Shi

Abstract: The scaling laws have become the de facto guidelines for designing large language models (LLMs), but they were studied under the assumption of unlimited computing resources for both training and inference. As LLMs are increasingly used as personalized intelligent assistants, their customization (i.e., learning through fine-tuning) and deployment onto resource-constrained edge devices will become more and more prevalent. An urgent but open question is how a resource-constrained computing environment would affect the design choices for a personalized LLM. We study this problem empirically in this work. In particular, we consider the tradeoffs among a number of key design factors and their intertwined impacts on learning efficiency and accuracy. The factors include the learning methods for LLM customization, the amount of personalized data used for learning customization, the types and sizes of LLMs, the compression methods of LLMs, the amount of time afforded to learn, and the difficulty levels of the target use cases. Through extensive experimentation and benchmarking, we draw a number of surprisingly insightful guidelines for deploying LLMs onto resource-constrained devices. For example, the optimal choice between parameter learning and RAG may vary depending on the difficulty of the downstream task, a longer fine-tuning time does not necessarily help the model, and a compressed LLM may be a better choice than an uncompressed LLM to learn from limited personalized data.

Submitted 2 October, 2024; v1 submitted 6 June, 2024; originally announced June 2024.

Comments: Benchmarking paper

arXiv:2405.06569 [pdf, other]

Efficient Federated Low Rank Matrix Completion

Authors: Ahmed Ali Abbasi, Namrata Vaswani

Abstract: In this work, we develop and analyze a Gradient Descent (GD) based solution, called Alternating GD and Minimization (AltGDmin), for efficiently solving the low rank matrix completion (LRMC) in a federated setting. LRMC involves recovering an $n \times q$ rank-$r$ matrix $X^\star$ from a subset of its entries when $r \ll \min(n,q)$. Our theoretical guarantees (iteration and sample complexity bounds) imply that AltGDmin is the most communication-efficient solution in a federated setting, is one of the fastest, and has the second best sample complexity among all iterative solutions to LRMC. In addition, we also prove two important corollaries. (a) We provide a guarantee for AltGDmin for solving the noisy LRMC problem. (b) We show how our lemmas can be used to provide an improved sample complexity guarantee for AltMin, which is the fastest centralized solution.

Submitted 30 September, 2024; v1 submitted 10 May, 2024; originally announced May 2024.
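
Note on the entry above: the LRMC problem stated in the abstract can be written out as below. This is only a reading aid. The observed index set $\Omega$ is notation assumed here rather than taken from the abstract, and the target matrix is written $X^\star$.

$$
\text{find } X \in \mathbb{R}^{n \times q} \quad \text{such that} \quad \operatorname{rank}(X) \le r, \qquad X_{jk} = X^\star_{jk} \ \text{ for all } (j,k) \in \Omega .
$$

A rank-$r$ matrix of this size has $r(n + q - r)$ degrees of freedom, far fewer than its $nq$ entries when $r \ll \min(n,q)$, which is why recovery from a subset of the entries can be possible at all.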

arXiv:2404.15503 [pdf, other]

FedGreen: Carbon-aware Federated Learning with Model Size Adaptation

Authors: Ali Abbasi, Fan Dong, Xin Wang, Henry Leung, Jiayu Zhou, Steve Drew

Abstract: Federated learning (FL) provides a promising collaborative framework to build a model from distributed clients, and this work investigates the carbon emission of the FL process. Cloud and edge servers hosting FL clients may exhibit diverse carbon footprints influenced by their geographical locations with varying power sources, offering opportunities to reduce carbon emissions by training local models with adaptive computations and communications. In this paper, we propose FedGreen, a carbon-aware FL approach to efficiently train models by adopting adaptive model sizes shared with clients based on their carbon profiles and locations using ordered dropout as a model compression technique. We theoretically analyze the trade-offs between the produced carbon emissions and the convergence accuracy, considering the carbon intensity discrepancy across countries to choose the parameters optimally. Empirical studies show that FedGreen can substantially reduce the carbon footprints of FL compared to the state-of-the-art while maintaining competitive model accuracy.

Submitted 23 April, 2024; originally announced April 2024.

arXiv:2404.04127 [pdf, other]

doi 10.14722/spacesec.2024.23033

On the Feasibility of CubeSats Application Sandboxing for Space Missions

Authors: Gabriele Marra, Ulysse Planta, Philipp Wüstenberg, Ali Abbasi

Abstract: This paper details our journey in designing and selecting a suitable application sandboxing mechanism for a satellite under development, with a focus on small satellites. Central to our study is the development of selection criteria for sandboxing and assessing its appropriateness for our satellite payload. We also test our approach on two already operational satellites, Suchai and SALSAT, to validate its effectiveness. These experiments highlight the practicality and efficiency of our chosen sandboxing method for real-world space systems. Our results provide insights and highlight the challenges involved in integrating application sandboxing in the space sector.

Submitted 5 April, 2024; originally announced April 2024.

Comments: 8 pages, 5 figures, accepted to SpaceSec Workshop 2024 and to be published as post-conference proceedings with NDSS 2024

arXiv:2403.07142 [pdf, other]

One Category One Prompt: Dataset Distillation using Diffusion Models

Authors: Ali Abbasi, Ashkan Shahbazi, Hamed Pirsiavash, Soheil Kolouri

Abstract: The extensive amounts of data required for training deep neural networks pose significant challenges on storage and transmission fronts. Dataset distillation has emerged as a promising technique to condense the information of massive datasets into a much smaller yet representative set of synthetic samples. However, traditional dataset distillation approaches often struggle to scale effectively with high-resolution images and more complex architectures due to the limitations in bi-level optimization. Recently, several works have proposed exploiting knowledge distillation with decoupled optimization schemes to scale up dataset distillation. Although these methods effectively address the scalability issue, they rely on extensive image augmentations requiring the storage of soft labels for augmented images. In this paper, we introduce Dataset Distillation using Diffusion Models (D3M) as a novel paradigm for dataset distillation, leveraging recent advancements in generative text-to-image foundation models. Our approach utilizes textual inversion, a technique for fine-tuning text-to-image generative models, to create concise and informative representations for large datasets. By employing these learned text prompts, we can efficiently store and infer new samples for introducing data variability within a fixed memory budget. We show the effectiveness of our method through extensive experiments across various computer vision benchmark datasets with different memory budgets.

Submitted 11 March, 2024; originally announced March 2024.

arXiv:2403.00280 [pdf]

SoK: Security of Programmable Logic Controllers

Authors: Efrén López-Morales, Ulysse Planta, Carlos Rubio-Medrano, Ali Abbasi, Alvaro A. Cardenas

Abstract: Billions of people rely on essential utility and manufacturing infrastructures such as water treatment plants, energy management, and food production. Our dependence on reliable infrastructures makes them valuable targets for cyberattacks. One of the prime targets for adversaries attacking physical infrastructures is the Programmable Logic Controller (PLC), because PLCs connect the cyber and physical worlds. In this study, we conduct the first comprehensive systematization of knowledge that explores the security of PLCs: We present an in-depth analysis of PLC attacks and defenses and discover trends in the security of PLCs from the last 17 years of research. We introduce a novel threat taxonomy for PLCs and Industrial Control Systems (ICS). Finally, we identify and point out research gaps that, if left ignored, could lead to new catastrophic attacks against critical infrastructures.

Submitted 29 February, 2024; originally announced March 2024.

Comments: 25 pages, 13 figures, Extended version February 2024, A shortened version is to be published in the 33rd USENIX Security Symposium, for more information, see https://efrenlopez.org/

arXiv:2402.06696 [pdf, other]

FL-NAS: Towards Fairness of NAS for Resource Constrained Devices via Large Language Models

Authors: Ruiyang Qin, Yuting Hu, Zheyu Yan, Jinjun Xiong, Ahmed Abbasi, Yiyu Shi

Abstract: Neural Architecture Search (NAS) has become the de facto tool in the industry for automating the design of deep neural networks for various applications, especially those driven by mobile and edge devices with limited computing resources. The emerging large language models (LLMs), due to their prowess, have also been incorporated into NAS recently and show some promising results. This paper conducts further exploration in this direction by considering three important design metrics simultaneously, i.e., model accuracy, fairness, and hardware deployment efficiency. We propose a novel LLM-based NAS framework, FL-NAS, in this paper, and show experimentally that FL-NAS can indeed find high-performing DNNs, beating state-of-the-art DNN models by orders of magnitude across almost all design considerations.

Submitted 8 February, 2024; originally announced February 2024.

Comments: ASP-DAC 2024

arXiv:2312.15713 [pdf]

PersianLLaMA: Towards Building First Persian Large Language Model

Authors: Mohammad Amin Abbasi, Arash Ghafouri, Mahdi Firouzmandi, Hassan Naderi, Behrouz Minaei Bidgoli

Abstract: Despite the widespread use of the Persian language by millions globally, limited efforts have been made in natural language processing for this language. The use of large language models as effective tools in various natural language processing tasks typically requires extensive textual data and robust hardware resources. Consequently, the scarcity of Persian textual data and the unavailability of powerful hardware resources have hindered the development of large language models for Persian. This paper introduces the first large Persian language model, named PersianLLaMA, trained on a collection of Persian texts and datasets. This foundational model comes in two versions, with 7 and 13 billion parameters, trained on formal and colloquial Persian texts using two different approaches. PersianLLaMA has been evaluated for natural language generation tasks based on the latest evaluation methods, namely using larger language models, and for natural language understanding tasks based on automated machine metrics. The results indicate that PersianLLaMA significantly outperforms its competitors in both understanding and generating Persian text. PersianLLaMA marks an important step in the development of Persian natural language processing and can be a valuable resource for the Persian-speaking community. This large language model can be used for various natural language processing tasks, especially text generation tasks such as chatbots, question-answering, machine translation, and text summarization.

Submitted 25 December, 2023; originally announced December 2023.

arXiv:2312.04718 [pdf, other]

Dynamic Online Modulation Recognition using Incremental Learning

Authors: Ali Owfi, Ali Abbasi, Fatemeh Afghah, Jonathan Ashdown, Kurt Turck

Abstract: Modulation recognition is a fundamental task in communication systems as the accurate identification of modulation schemes is essential for reliable signal processing, interference mitigation for coexistent communication technologies, and network optimization. Incorporating deep learning (DL) models into modulation recognition has demonstrated promising results in various scenarios. However, conventional DL models often fall short in online dynamic contexts, particularly in class incremental scenarios where new modulation schemes are encountered during online deployment. Retraining these models on all previously seen modulation schemes is not only time-consuming but may also not be feasible due to storage limitations. On the other hand, training solely on new modulation schemes often results in catastrophic forgetting of previously learned classes. This issue renders DL-based modulation recognition models inapplicable in real-world scenarios because the dynamic nature of communication systems necessitates effective adaptability to new modulation schemes. This paper addresses this challenge by evaluating the performance of multiple Incremental Learning (IL) algorithms in dynamic modulation recognition scenarios, comparing them against conventional DL-based modulation recognition. Our results demonstrate that modulation recognition frameworks based on IL effectively prevent catastrophic forgetting, enabling models to perform robustly in dynamic scenarios.

Submitted 7 December, 2023; originally announced December 2023.

Comments: To be published in International Workshop on Computing, Networking and Communications (CNC) 2024

arXiv:2311.14683 [pdf]

Data Science for Social Good

Authors: Ahmed Abbasi, Roger H. L. Chiang, Jennifer J. Xu

Abstract: Data science has been described as the fourth paradigm for scientific discovery. The latest wave of data science research, pertaining to machine learning and artificial intelligence (AI), is growing exponentially and garnering millions of annual citations. However, this growth has been accompanied by a diminishing emphasis on social good challenges - our analysis reveals that the proportion of data science research focusing on social good is less than it has ever been. At the same time, the proliferation of machine learning and generative AI has sparked debates about the socio-technical prospects and challenges associated with data science for human flourishing, organizations, and society. Against this backdrop, we present a framework for "data science for social good" (DSSG) research that considers the interplay between relevant data science research genres, social good challenges, and different levels of socio-technical abstraction. We perform an analysis of the literature to empirically demonstrate the paucity of work on DSSG in information systems (and other related disciplines) and highlight current impediments. We then use our proposed framework to introduce the articles appearing in the special issue. We hope that this article and the special issue will spur future DSSG research and help reverse the alarming trend across data science research over the past 30-plus years in which social good challenges are garnering proportionately less attention with each passing day.

Submitted 2 November, 2023; originally announced November 2023.

arXiv:2311.12999 [pdf, other]

CovarNav: Machine Unlearning via Model Inversion and Covariance Navigation

Authors: Ali Abbasi, Chayne Thrash, Elaheh Akbari, Daniel Zhang, Soheil Kolouri

Abstract: The rapid progress of AI, combined with its unprecedented public adoption and the propensity of large neural networks to memorize training data, has given rise to significant data privacy concerns. To address these concerns, machine unlearning has emerged as an essential technique to selectively remove the influence of specific training data points on trained models. In this paper, we approach the machine unlearning problem through the lens of continual learning. Given a trained model and a subset of training data designated to be forgotten (i.e., the "forget set"), we introduce a three-step process, named CovarNav, to facilitate this forgetting. Firstly, we derive a proxy for the model's training data using a model inversion attack. Secondly, we mislabel the forget set by selecting the most probable class that deviates from the actual ground truth. Lastly, we deploy a gradient projection method to minimize the cross-entropy loss on the modified forget set (i.e., learn incorrect labels for this set) while preventing forgetting of the inverted samples. We rigorously evaluate CovarNav on the CIFAR-10 and VGGFace2 datasets, comparing our results with recent benchmarks in the field and demonstrating the efficacy of our proposed approach.

Submitted 21 November, 2023; originally announced November 2023.
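
Note on the entry above: the second step of the abstract (relabel each forget-set sample with the most probable class other than its ground truth) can be illustrated with a minimal sketch. This is not the authors' code. The function name `mislabel_forget_set`, the list-of-probabilities input format, and the toy numbers are all assumptions made for illustration.

```python
def mislabel_forget_set(probs, true_labels):
    """Return, per sample, the most probable class that is not the true class.

    probs: list of per-class probability lists (hypothetical model outputs).
    true_labels: list of ground-truth class indices, one per sample.
    """
    new_labels = []
    for p, y in zip(probs, true_labels):
        # Exclude the ground-truth class, then take the highest-scoring remainder.
        candidates = [c for c in range(len(p)) if c != y]
        new_labels.append(max(candidates, key=lambda c: p[c]))
    return new_labels


# Toy forget set: three samples, three classes (made-up numbers).
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.5, 0.1, 0.4],
]
true_labels = [0, 1, 2]
wrong_labels = mislabel_forget_set(probs, true_labels)
print(wrong_labels)
```

Each returned label differs from the corresponding ground-truth label by construction. The model inversion and gradient projection steps named in the abstract are not shown here.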

arXiv:2311.12275 [pdf, other]

doi 10.1145/3649329.3655665

Enabling On-Device Large Language Model Personalization with Self-Supervised Data Selection and Synthesis

Authors: Ruiyang Qin, Jun Xia, Zhenge Jia, Meng Jiang, Ahmed Abbasi, Peipei Zhou, Jingtong Hu, Yiyu Shi

Abstract: After a large language model (LLM) is deployed on edge devices, it is desirable for these devices to learn from user-generated conversation data to generate user-specific and personalized responses in real-time. However, user-generated data usually contains sensitive and private information, and uploading such data to the cloud for annotation is not preferred if not prohibited. While it is possible to obtain annotation locally by directly asking users to provide preferred responses, such annotations have to be sparse to not affect user experience. In addition, the storage of edge devices is usually too limited to enable large-scale fine-tuning with full user-generated data. It remains an open question how to enable on-device LLM personalization, considering sparse annotation and limited on-device storage. In this paper, we propose a novel framework to select and store the most representative data online in a self-supervised way. Such data has a small memory footprint and allows infrequent requests of user annotations for further fine-tuning. To enhance fine-tuning quality, multiple semantically similar pairs of question texts and expected responses are generated using the LLM. Our experiments show that the proposed framework achieves the best user-specific content-generating capability (accuracy) and fine-tuning speed (performance) compared with vanilla baselines. To the best of our knowledge, this is the very first on-device LLM personalization framework.

Submitted 16 April, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

Comments: Accepted by the 2024 61st ACM/IEEE Design Automation Conference (DAC)

arXiv:2311.11995 [pdf, other]

BrainWash: A Poisoning Attack to Forget in Continual Learning

Authors: Ali Abbasi, Parsa Nooralinejad, Hamed Pirsiavash, Soheil Kolouri

Abstract: Continual learning has gained substantial attention within the deep learning community, offering promising solutions to the challenging problem of sequential learning. Yet, a largely unexplored facet of this paradigm is its susceptibility to adversarial attacks, especially with the aim of inducing forgetting. In this paper, we introduce "BrainWash," a novel data poisoning method tailored to impose forgetting on a continual learner. By adding the BrainWash noise to a variety of baselines, we demonstrate how a trained continual learner can be induced to forget its previously learned tasks catastrophically, even when using these continual learning baselines. An important feature of our approach is that the attacker requires no access to previous tasks' data and is armed merely with the model's current parameters and the data belonging to the most recent task. Our extensive experiments highlight the efficacy of BrainWash, showcasing degradation in performance across various regularization-based continual learning methods.

Submitted 23 November, 2023; v1 submitted 20 November, 2023; originally announced November 2023.

arXiv:2311.10395 [pdf, other]

Bias A-head? Analyzing Bias in Transformer-Based Language Model Attention Heads

Authors: Yi Yang, Hanyu Duan, Ahmed Abbasi, John P. Lalor, Kar Yan Tam

Abstract: Transformer-based pretrained large language models (PLM) such as BERT and GPT have achieved remarkable success in NLP tasks. However, PLMs are prone to encoding stereotypical biases. Although a burgeoning literature has emerged on stereotypical bias mitigation in PLMs, such as work on debiasing gender and racial stereotyping, how such biases manifest and behave internally within PLMs remains largely unknown. Understanding the internal stereotyping mechanisms may allow better assessment of model fairness and guide the development of effective mitigation strategies. In this work, we focus on attention heads, a major component of the Transformer architecture, and propose a bias analysis framework to explore and identify a small set of biased heads that are found to contribute to a PLM's stereotypical bias. We conduct extensive experiments to validate the existence of these biased heads and to better understand how they behave. We investigate gender and racial bias in the English language in two types of Transformer-based PLMs: the encoder-based BERT model and the decoder-based autoregressive GPT model. Overall, the results shed light on understanding the bias behavior in pretrained language models.

Submitted 15 June, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

Comments: 14 pages, 7 figures, 3 tables including references and appendices

arXiv:2311.10367 [pdf, other]

Exploring the Relationship between In-Context Learning and Instruction Tuning

Authors: Hanyu Duan, Yixuan Tang, Yi Yang, Ahmed Abbasi, Kar Yan Tam

Abstract: In-Context Learning (ICL) and Instruction Tuning (IT) are two primary paradigms of adopting Large Language Models (LLMs) to downstream applications. However, they are significantly different. In ICL, a set of demonstrations are provided at inference time but the LLM's parameters are not updated. In IT, a set of demonstrations are used to tune LLM's parameters in training time but no demonstrations are used at inference time. Although a growing body of literature has explored ICL and IT, studies on these topics have largely been conducted in isolation, leading to a disconnect between these two paradigms. In this work, we explore the relationship between ICL and IT by examining how the hidden states of LLMs change in these two paradigms. Through carefully designed experiments conducted with LLaMA-2 (7B and 13B), we find that ICL is implicit IT. In other words, ICL changes an LLM's hidden states as if the demonstrations were used to instructionally tune the model. Furthermore, the convergence between ICL and IT is largely contingent upon several factors related to the provided demonstrations. Overall, this work offers a unique perspective to explore the connection between ICL and IT and sheds light on understanding the behaviors of LLM.

Submitted 17 November, 2023; originally announced November 2023.

arXiv:2309.14488 [pdf, other]

When Automated Assessment Meets Automated Content Generation: Examining Text Quality in the Era of GPTs

Authors: Marialena Bevilacqua, Kezia Oketch, Ruiyang Qin, Will Stamey, Xinyuan Zhang, Yi Gan, Kai Yang, Ahmed Abbasi

Abstract: The use of machine learning (ML) models to assess and score textual data has become increasingly pervasive in an array of contexts including natural language processing, information retrieval, search and recommendation, and credibility assessment of online content. A significant disruption at the intersection of ML and text are text-generating large-language models such as generative pre-trained transformers (GPTs). We empirically assess the differences in how ML-based scoring models trained on human content assess the quality of content generated by humans versus GPTs. To do so, we propose an analysis framework that encompasses essay scoring ML-models, human and ML-generated essays, and a statistical model that parsimoniously considers the impact of type of respondent, prompt genre, and the ML model used for assessment model. A rich testbed is utilized that encompasses 18,460 human-generated and GPT-based essays. Results of our benchmark analysis reveal that transformer pretrained language models (PLMs) more accurately score human essay quality as compared to CNN/RNN and feature-based ML methods. Interestingly, we find that the transformer PLMs tend to score GPT-generated text 10-15% higher on average, relative to human-authored documents. Conversely, traditional deep learning and feature-based ML models score human text considerably higher. Further analysis reveals that although the transformer PLMs are exclusively fine-tuned on human text, they more prominently attend to certain tokens appearing only in GPT-generated text, possibly due to familiarity/overlap in pre-training. Our framework and results have implications for text classification settings where automated scoring of text is likely to be disrupted by generative AI.

Submitted 25 September, 2023; originally announced September 2023.

Comments: Data available at: https://github.com/nd-hal/automated-ML-scoring-versus-generation

arXiv:2305.16351 [pdf, other]

Federated Learning Model Aggregation in Heterogenous Aerial and Space Networks

Authors: Fan Dong, Ali Abbasi, Henry Leung, Xin Wang, Jiayu Zhou, Steve Drew

Abstract: Federated learning offers a promising approach under the constraints of networking and data privacy constraints in aerial and space networks (ASNs), utilizing large-scale private edge data from drones, balloons, and satellites. Existing research has extensively studied the optimization of the learning process, computing efficiency, and communication overhead. An important yet often overlooked aspect is that participants contribute predictive knowledge with varying diversity of knowledge, affecting the quality of the learned federated models. In this paper, we propose a novel approach to address this issue by introducing a Weighted Averaging and Client Selection (WeiAvgCS) framework that emphasizes updates from high-diversity clients and diminishes the influence of those from low-diversity clients. Direct sharing of the data distribution may be prohibitive due to the additional private information that is sent from the clients. As such, we introduce an estimation for the diversity using a projection-based method. Extensive experiments have been performed to show WeiAvgCS's effectiveness. WeiAvgCS could converge 46% faster on FashionMNIST and 38% faster on CIFAR10 than its benchmarks on average in our experiments.

Submitted 16 April, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: 6 pages, 7 figures, accepted by IEEE ICC workshop on emerging technologies in aerial and space networks 2024

ACM Class: I.2.11; C.2.4

arXiv:2303.03706 [pdf]

Classifying Text-Based Conspiracy Tweets related to COVID-19 using Contextualized Word Embeddings

Authors: Abdul Rehman, Rabeeh Ayaz Abbasi, Irfan ul Haq Qureshi, Akmal Saeed Khattak

Abstract: The FakeNews task in MediaEval 2022 investigates the challenge of finding accurate and high-performance models for the classification of conspiracy tweets related to COVID-19. In this paper, we used BERT, ELMO, and their combination for feature extraction and RandomForest as classifier. The results show that ELMO performs slightly better than BERT, however their combination at feature level reduces the performance.

Submitted 7 March, 2023; originally announced March 2023.

Comments: Published in Multimedia Benchmark Workshop 2022, Bergen, Norway and Online, 12-13 January 2023: https://2022.multimediaeval.com/

MSC Class: 68T01

ACM Class: I.2.7

Journal ref: Multimedia Benchmark Workshop, Bergen, Norway and Online, 12-13 January 2023

arXiv:2303.03704 [pdf, other]

Identifying Misinformation Spreaders: A Graph-Based Semi-Supervised Learning Approach

Authors: Atta Ullah, Rabeeh Ayaz Abbasi, Akmal Saeed Khattak, Anwar Said

Abstract: In this paper we proposed a Graph-Based conspiracy source detection method for the MediaEval task 2022 FakeNews: Corona Virus and Conspiracies Multimedia Analysis Task. The goal of this study was to apply SOTA graph neural network methods to the problem of misinformation spreading in online social networks. We explore three different Graph Neural Network models: GCN, GraphSAGE and DGCNN. Experimental results demonstrate that DGCNN outperforms in terms of accuracy.

Submitted 7 March, 2023; originally announced March 2023.

Comments: Published in Multimedia Benchmark Workshop Proceedings 2022: https://2022.multimediaeval.com/

MSC Class: 91D30

ACM Class: I.2.1

arXiv:2211.07621 [pdf, other]

Alternating minimization algorithm with initialization analysis for r-local and k-sparse unlabeled sensing

Authors: Ahmed Abbasi, Abiy Tasissa, Shuchin Aeron

Abstract: The unlabeled sensing problem is to recover an unknown signal from permuted linear measurements. We propose an alternating minimization algorithm with a suitable initialization for the widely considered k-sparse permutation model. Assuming either a Gaussian measurement matrix or a sub-Gaussian signal, we upper bound the initialization error for the r-local and k-sparse permutation models in terms of the block size r and the number of shuffles k, respectively. Our algorithm is computationally scalable and, compared to baseline methods, achieves superior performance on real and synthetic datasets.

Submitted 14 November, 2022; originally announced November 2022.

arXiv:2208.07441 [pdf, ps, other]

WatchPed: Pedestrian Crossing Intention Prediction Using Embedded Sensors of Smartwatch

Authors: Jibran Ali Abbasi, Navid Mohammad Imran, Lokesh Chandra Das, Myounggyu Won

Abstract: The pedestrian crossing intention prediction problem is to estimate whether or not the target pedestrian will cross the street. State-of-the-art techniques heavily depend on visual data acquired through the front camera of the ego-vehicle to make a prediction of the pedestrian's crossing intention. Hence, the efficiency of current methodologies tends to decrease notably in situations where visual input is imprecise, for instance, when the distance between the pedestrian and ego-vehicle is considerable or the illumination levels are inadequate. To address the limitation, in this paper, we present the design, implementation, and evaluation of the first-of-its-kind pedestrian crossing intention prediction model based on integration of motion sensor data gathered through the smartwatch (or smartphone) of the pedestrian. We propose an innovative machine learning framework that effectively integrates motion sensor data with visual input to enhance the predictive accuracy significantly, particularly in scenarios where visual data may be unreliable. Moreover, we perform an extensive data collection process and introduce the first pedestrian intention prediction dataset that features synchronized motion sensor data. The dataset comprises 255 video clips that encompass diverse distances and lighting conditions. We trained our model using the widely-used JAAD and our own datasets and compare the performance with a state-of-the-art model. The results demonstrate that our model outperforms the current state-of-the-art method, particularly in cases where the distance between the pedestrian and the observer is considerable (more than 70 meters) and the lighting conditions are inadequate.

Submitted 15 March, 2023; v1 submitted 15 August, 2022; originally announced August 2022.

arXiv:2206.09424 [pdf]

doi 10.32604/cmc.2022.027655

Construction and Optimization of TRNG Based Substitution Boxes for Block Encryption Algorithms

Authors: Muhammad Fahad Khan, Khalid Saleem, Mohammed Alotaibi, Mohammad Mazyad Hazzazi, Eid Rehman, Aaqif Afzaal Abbasi, Muhammad Asif Gondal

Abstract: Internet of Things is an ecosystem of interconnected devices that are accessible through the internet. The recent research focuses on adding more smartness and intelligence to these edge devices. This makes them susceptible to various kinds of security threats. These edge devices rely on cryptographic techniques to encrypt the pre-processed data collected from the sensors deployed in the field. In this regard, block cipher has been one of the most reliable options through which data security is accomplished. The strength of block encryption algorithms against different attacks is dependent on its nonlinear primitive which is called Substitution Boxes. For the design of S-boxes mainly algebraic and chaos-based techniques are used but researchers also found various weaknesses in these techniques. On the other side, literature endorse the true random numbers for information security due to the reason that, true random numbers are purely non-deterministic. In this paper firstly a natural dynamical phenomenon is utilized for the generation of true random numbers based S-boxes. Secondly, a systematic literature review was conducted to know which metaheuristic optimization technique is highly adopted in the current decade for the optimization of S-boxes. Based on the outcome of Systematic Literature Review (SLR), genetic algorithm is chosen for the optimization of s-boxes. The results of our method validate that the proposed dynamic S-boxes are effective for the block ciphers. Moreover, our results showed that the proposed substitution boxes achieve better

Submitted 19 June, 2022; originally announced June 2022.

Comments: 15 pages, 3 figures, Journal Paper

Report number: https://www.techscience.com/cmc/v73n2/48337

MSC Class: 68P25

ACM Class: E.3

Journal ref: Computers, Materials & Continua, 2022

arXiv:2206.08464 [pdf, other]

PRANC: Pseudo RAndom Networks for Compacting deep models

Authors: Parsa Nooralinejad, Ali Abbasi, Soroush Abbasi Koohpayegani, Kossar Pourahmadi Meibodi, Rana Muhammad Shahroz Khan, Soheil Kolouri, Hamed Pirsiavash

Abstract: We demonstrate that a deep model can be reparametrized as a linear combination of several randomly initialized and frozen deep models in the weight space. During training, we seek local minima that reside within the subspace spanned by these random models (i.e., 'basis' networks). Our framework, PRANC, enables significant compaction of a deep model. The model can be reconstructed using a single scalar 'seed,' employed to generate the pseudo-random 'basis' networks, together with the learned linear mixture coefficients. In practical applications, PRANC addresses the challenge of efficiently storing and communicating deep models, a common bottleneck in several scenarios, including multi-agent learning, continual learners, federated systems, and edge devices, among others. In this study, we employ PRANC to condense image classification models and compress images by compacting their associated implicit neural networks. PRANC outperforms baselines with a large margin on image classification when compressing a deep model almost 100 times. Moreover, we show that PRANC enables memory-efficient inference by generating layer-wise weights on the fly. The source code of PRANC is here: https://github.com/UCDvision/PRANC

Submitted 28 August, 2023; v1 submitted 16 June, 2022; originally announced June 2022.

arXiv:2205.14029 [pdf]

Lesion classification by model-based feature extraction: A differential affine invariant model of soft tissue elasticity

Authors: Weiguo Cao, Marc J. Pomeroy, Zhengrong Liang, Yongfeng Gao, Yongyi Shi, Jiaxing Tan, Fangfang Han, Jing Wang, Jianhua Ma, Hongbin Lu, Almas F. Abbasi, Perry J. Pickhardt

Abstract: The elasticity of soft tissues has been widely considered as a characteristic property to differentiate between healthy and vicious tissues and, therefore, motivated several elasticity imaging modalities, such as Ultrasound Elastography, Magnetic Resonance Elastography, and Optical Coherence Elastography. This paper proposes an alternative approach of modeling the elasticity using Computed Tomography (CT) imaging modality for model-based feature extraction machine learning (ML) differentiation of lesions. The model describes a dynamic non-rigid (or elastic) deformation in differential manifold to mimic the soft tissues elasticity under wave fluctuation in vivo. Based on the model, three local deformation invariants are constructed by two tensors defined by the first and second order derivatives from the CT images and used to generate elastic feature maps after normalization via a novel signal suppression method. The model-based elastic image features are extracted from the feature maps and fed to machine learning to perform lesion classifications. Two pathologically proven image datasets of colon polyps (44 malignant and 43 benign) and lung nodules (46 malignant and 20 benign) were used to evaluate the proposed model-based lesion classification. The outcomes of this modeling approach reached the score of area under the curve of the receiver operating characteristics of 94.2% for the polyps and 87.4% for the nodules, resulting in an average gain of 5% to 30% over ten existing state-of-the-art lesion classification methods. The gains by modeling tissue elasticity for ML differentiation of lesions are striking, indicating the great potential of exploring the modeling strategy to other tissue properties for ML differentiation of lesions.

Submitted 27 May, 2022; originally announced May 2022.

Comments: 12 pages, 4 figures, 3 tables

arXiv:2203.06514 [pdf, other]

Sparsity and Heterogeneous Dropout for Continual Learning in the Null Space of Neural Activations

Authors: Ali Abbasi, Parsa Nooralinejad, Vladimir Braverman, Hamed Pirsiavash, Soheil Kolouri

Abstract: Continual/lifelong learning from a non-stationary input data stream is a cornerstone of intelligence. Despite their phenomenal performance in a wide variety of applications, deep neural networks are prone to forgetting their previously learned information upon learning new ones. This phenomenon is called "catastrophic forgetting" and is deeply rooted in the stability-plasticity dilemma. Overcoming catastrophic forgetting in deep neural networks has become an active field of research in recent years. In particular, gradient projection-based methods have recently shown exceptional performance at overcoming catastrophic forgetting. This paper proposes two biologically-inspired mechanisms based on sparsity and heterogeneous dropout that significantly increase a continual learner's performance over a long sequence of tasks. Our proposed approach builds on the Gradient Projection Memory (GPM) framework. We leverage k-winner activations in each layer of a neural network to enforce layer-wise sparse activations for each task, together with a between-task heterogeneous dropout that encourages the network to use non-overlapping activation patterns between different tasks. In addition, we introduce two new benchmarks for continual learning under distributional shift, namely Continual Swiss Roll and ImageNet SuperDog-40. Lastly, we provide an in-depth analysis of our proposed method and demonstrate a significant performance boost on various benchmark continual learning problems.

Submitted 8 July, 2022; v1 submitted 12 March, 2022; originally announced March 2022.

arXiv:2202.04104 [pdf, other]

Teaching Networks to Solve Optimization Problems

Authors: Xinran Liu, Yuzhe Lu, Ali Abbasi, Meiyi Li, Javad Mohammadi, Soheil Kolouri

Abstract: Leveraging machine learning to facilitate the optimization process is an emerging field that holds the promise to bypass the fundamental computational bottleneck caused by classic iterative solvers in critical applications requiring near-real-time optimization. The majority of existing approaches focus on learning data-driven optimizers that lead to fewer iterations in solving an optimization. In this paper, we take a different approach and propose to replace the iterative solvers altogether with a trainable parametric set function, that outputs the optimal arguments/parameters of an optimization problem in a single feed forward. We denote our method as Learning to Optimize the Optimization Process (LOOP). We show the feasibility of learning such parametric (set) functions to solve various classic optimization problems including linear/nonlinear regression, principal component analysis, transport-based coreset, and quadratic programming in supply management applications. In addition, we propose two alternative approaches for learning such parametric functions, with and without a solver in the LOOP. Finally, through various numerical experiments, we show that the trained solvers could be orders of magnitude faster than the classic iterative solvers while providing near optimal solutions.

Submitted 15 July, 2022; v1 submitted 8 February, 2022; originally announced February 2022.

arXiv:2111.10634 [pdf, other]

Identity-Preserving Pose-Robust Face Hallucination Through Face Subspace Prior

Authors: Ali Abbasi, Mohammad Rahmati

Abstract: Over the past few decades, numerous attempts have been made to address the problem of recovering a high-resolution (HR) facial image from its corresponding low-resolution (LR) counterpart, a task commonly referred to as face hallucination. Despite the impressive performance achieved by position-patch and deep learning-based methods, most of these techniques are still unable to recover identity-specific features of faces. The former group of algorithms often produces blurry and oversmoothed outputs particularly in the presence of higher levels of degradation, whereas the latter generates faces which sometimes by no means resemble the individuals in the input images. In this paper, a novel face super-resolution approach will be introduced, in which the hallucinated face is forced to lie in a subspace spanned by the available training faces. Therefore, in contrast to the majority of existing face hallucination techniques and thanks to this face subspace prior, the reconstruction is performed in favor of recovering person-specific facial features, rather than merely increasing image quantitative scores. Furthermore, inspired by recent advances in the area of 3D face reconstruction, an efficient 3D dictionary alignment scheme is also presented, through which the algorithm becomes capable of dealing with low-resolution faces taken in uncontrolled conditions. In extensive experiments carried out on several well-known face datasets, the proposed algorithm shows remarkable performance by generating detailed and close to ground truth results which outperform the state-of-the-art face hallucination algorithms by significant margins both in quantitative and qualitative evaluations.

Submitted 20 November, 2021; originally announced November 2021.

Comments: A shorter version of this paper has been submitted to IEEE Transactions on Image Processing

arXiv:2111.04522 [pdf, other]

Terahertz Wireless Channels: A Holistic Survey on Measurement, Modeling, and Analysis

Authors: Chong Han, Yiqin Wang, Yuanbo Li, Yi Chen, Naveed A. Abbasi, Thomas Kürner, Andreas F. Molisch

Abstract: Terahertz (0.1-10 THz) communications are envisioned as a key technology for sixth generation (6G) wireless systems. The study of underlying THz wireless propagation channels provides the foundations for the development of reliable THz communication systems and their applications. This article provides a comprehensive overview of the study of THz wireless channels. First, the three most popular THz channel measurement methodologies, namely, frequency-domain channel measurement based on a vector network analyzer (VNA), time-domain channel measurement based on sliding correlation, and time-domain channel measurement based on THz pulses from time-domain spectroscopy (THz-TDS), are introduced and compared. Current channel measurement systems and measurement campaigns are reviewed. Then, existing channel modeling methodologies are categorized into deterministic, stochastic, and hybrid approaches. State-of-the-art THz channel models are analyzed, and the channel simulators that are based on them are introduced. Next, an in-depth review of channel characteristics in the THz band is presented. Finally, open problems and future research directions for research studies on THz wireless channels for 6G are elaborated.

Submitted 9 June, 2022; v1 submitted 8 November, 2021; originally announced November 2021.

Comments: to appear in IEEE Communications Surveys and Tutorials

arXiv:2111.03013 [pdf, other]

doi 10.1145/3492321.3519591

Nyx-Net: Network Fuzzing with Incremental Snapshots

Authors: Sergej Schumilo, Cornelius Aschermann, Andrea Jemmett, Ali Abbasi, Thorsten Holz

Abstract: Coverage-guided fuzz testing ("fuzzing") has become mainstream and we have observed lots of progress in this research area recently. However, it is still challenging to efficiently test network services with existing coverage-guided fuzzing methods. In this paper, we introduce the design and implementation of Nyx-Net, a novel snapshot-based fuzzing approach that can successfully fuzz a wide range of targets spanning servers, clients, games, and even Firefox's Inter-Process Communication (IPC) interface. Compared to state-of-the-art methods, Nyx-Net improves test throughput by up to 300x and coverage found by up to 70%. Additionally, Nyx-Net is able to find crashes in two of ProFuzzBench's targets that no other fuzzer found previously. When using Nyx-Net to play the game Super Mario, Nyx-Net shows speedups of 10-30x compared to existing work. Under some circumstances, Nyx-Net is even able to play "faster than light": solving the level takes less wall-clock time than playing the level perfectly even once. Nyx-Net is able to find previously unknown bugs in servers such as Lighttpd, clients such as MySQL client, and even Firefox's IPC mechanism - demonstrating the strength and versatility of the proposed approach. Lastly, our prototype implementation was awarded a $20.000 bug bounty for enabling fuzzing on previously unfuzzable code in Firefox and solving a long-standing problem at Mozilla.

Submitted 4 November, 2021; originally announced November 2021.

Journal ref: EuroSys '22, Proceedings of the Seventeenth European Conference on Computer Systems, March 2022, Pages 166-180

arXiv:2110.14034 [pdf, other]

doi 10.1109/ICASSP43922.2022.9746201

r-local sensing: Improved algorithm and applications

Authors: Ahmed Ali Abbasi, Abiy Tasissa, Shuchin Aeron

Abstract: The unlabeled sensing problem is to solve a noisy linear system of equations under unknown permutation of the measurements. We study a particular case of the problem where the permutations are restricted to be r-local, i.e. the permutation matrix is block diagonal with r x r blocks. Assuming a Gaussian measurement matrix, we argue that the r-local permutation model is more challenging compared to a recent sparse permutation model. We propose a proximal alternating minimization algorithm for the general unlabeled sensing problem that provably converges to a first order stationary point. Applied to the r-local model, we show that the resulting algorithm is efficient. We validate the algorithm on synthetic and real datasets. We also formulate the 1-d unassigned distance geometry problem as an unlabeled sensing problem with a structured measurement matrix.

Submitted 14 February, 2022; v1 submitted 26 October, 2021; originally announced October 2021.

arXiv:2106.08913 [pdf, other]

Technical Report: Hardening Code Obfuscation Against Automated Attacks

Authors: Moritz Schloegel, Tim Blazytko, Moritz Contag, Cornelius Aschermann, Julius Basler, Thorsten Holz, Ali Abbasi

Abstract: Software obfuscation is a crucial technology to protect intellectual property and manage digital rights within our society. Despite its huge practical importance, both commercial and academic state-of-the-art obfuscation methods are vulnerable to a plethora of automated deobfuscation attacks, such as symbolic execution, taint analysis, or program synthesis. While several enhanced obfuscation techniques were recently proposed to thwart taint analysis or symbolic execution, they either impose a prohibitive runtime overhead or can be removed in an automated way (e.g., via compiler optimizations). In general, these techniques suffer from focusing on a single attack vector, allowing an attacker to switch to other, more effective techniques, such as program synthesis. In this work, we present Loki, an approach for software obfuscation that is resilient against all known automated deobfuscation attacks. To this end, we use and efficiently combine multiple techniques, including a generic approach to synthesize formally verified expressions of arbitrary complexity. Contrary to state-of-the-art approaches that rely on a few hardcoded generation rules, our expressions are more diverse and harder to pattern match against. Even the most recent state-of-the-art research on Mixed-Boolean Arithmetic (MBA) deobfuscation fails to simplify them. Moreover, Loki protects against previously unaccounted attack vectors such as program synthesis, for which it reduces the success rate to merely 19%. In a comprehensive evaluation, we show that our design incurs significantly less overhead while providing a much stronger protection level compared to existing works.

Submitted 17 June, 2022; v1 submitted 16 June, 2021; originally announced June 2021.

arXiv:2103.13546 [pdf, other]

Benchmarking Modern Named Entity Recognition Techniques for Free-text Health Record De-identification

Authors: Abdullah Ahmed, Adeel Abbasi, Carsten Eickhoff

Abstract: Electronic Health Records (EHRs) have become the primary form of medical data-keeping across the United States. Federal law restricts the sharing of any EHR data that contains protected health information (PHI). De-identification, the process of identifying and removing all PHI, is crucial for making EHR data publicly available for scientific research. This project explores several deep learning-based named entity recognition (NER) methods to determine which method(s) perform better on the de-identification task. We trained and tested our models on the i2b2 training dataset, and qualitatively assessed their performance using EHR data collected from a local hospital. We found that 1) BiLSTM-CRF represents the best-performing encoder/decoder combination, 2) character-embeddings and CRFs tend to improve precision at the price of recall, and 3) transformers alone under-perform as context encoders. Future work focused on structuring medical text may improve the extraction of semantic and syntactic information for the purposes of EHR de-identification.

Submitted 24 March, 2021; originally announced March 2021.

Comments: Presented at AMIA Informatics Summit 2021

arXiv:2101.03696 [pdf]

doi 10.1109/TASE.2020.3044155

A Cooperative Dynamic Task Assignment Framework for COTSBot AUVs

Authors: Amin Abbasi, Somaiyeh MahmoudZadeh, Amirmehdi Yazdani

Abstract: This paper presents a cooperative dynamic task assignment framework for a certain class of Autonomous Underwater Vehicles (AUVs) employed to control outbreak of Crown-Of-Thorns Starfish (COTS) in Australia's Great Barrier Reef. The problem of monitoring and controlling the COTS is transcribed into a constrained task assignment problem in which eradicating clusters of COTS, by the injection system of COTSbot AUVs, is considered as a task. A probabilistic map of the operating environment including seabed terrain, clusters of COTS, and coastlines is constructed. Then, a novel heuristic algorithm called Heuristic Fleet Cooperation (HFC) is developed to provide a cooperative injection of the COTSbot AUVs to the maximum possible COTS in an assigned mission time. Extensive simulation studies together with quantitative performance analysis are conducted to demonstrate the effectiveness and robustness of the proposed cooperative task assignment algorithm in eradicating the COTS in the Great Barrier Reef.

Submitted 10 January, 2021; originally announced January 2021.

Journal ref: IEEE Transactions on Automation Science and Engineering 2020

arXiv:2101.03693 [pdf]

Exploiting a Fleet of UAVs for Monitoring and Data Acquisition of a Distributed Sensor Network

Authors: S. MahmoudZadeh, A. Yazdani, A. Elmi, A. Abbasi, P. Ghanooni

Abstract: This study proposes an efficient data collection strategy exploiting a team of Unmanned Aerial Vehicles (UAVs) to monitor and collect the data of a large distributed sensor network usually used for environmental monitoring, meteorology, agriculture, and renewable energy applications. The study develops a collaborative mission planning system that enables a team of UAVs to conduct and complete the mission of sensors' data collection collaboratively while considering existing constraints of the UAV payload and battery capacity. The proposed mission planner system employs the Differential Evolution (DE) optimization algorithm enabling UAVs to maximize the number of visited sensor nodes given the priority of the sensors and avoiding the redundant collection of sensors' data. The proposed mission planner is evaluated through extensive simulation and comparative analysis. The simulation results confirm the effectiveness and fidelity of the proposed mission planner to be used for the distributed sensor network monitoring and data collection.

Submitted 10 January, 2021; originally announced January 2021.

arXiv:2012.13605 [pdf]

doi 10.15302/J-QB-021-0278

COVIDX: Computer-aided diagnosis of Covid-19 and its severity prediction with raw digital chest X-ray images

Authors: Wajid Arshad Abbasi, Syed Ali Abbas, Saiqa Andleeb

Abstract: Coronavirus disease (COVID-19) is a contagious infection caused by severe acute respiratory syndrome coronavirus-2 (SARS-COV-2) and it has infected and killed millions of people across the globe. In the absence of specific drugs or vaccines for the treatment of COVID-19 and the limitation of prevailing diagnostic techniques, there is a requirement for some alternate automatic screening systems that can be used by the physicians to quickly identify and isolate the infected patients. A chest X-ray (CXR) image can be used as an alternative modality to detect and diagnose COVID-19. In this study, we present an automatic COVID-19 diagnostic and severity prediction (COVIDX) system that uses deep feature maps from CXR images to diagnose COVID-19 and predict its severity. The proposed system uses a three-phase classification approach (healthy vs unhealthy, COVID-19 vs Pneumonia, and COVID-19 severity) using different shallow supervised classification algorithms. We evaluated COVIDX not only through 10-fold cross-validation and by using an external validation dataset but also in real settings by involving an experienced radiologist. In all the evaluation settings, COVIDX outperforms all the existing state-of-the-art methods designed for this purpose. We made COVIDX easily accessible through a cloud-based webserver and python code available at https://sites.google.com/view/wajidarshad/software and https://github.com/wajidarshad/covidx, respectively.

Submitted 25 December, 2020; originally announced December 2020.

Comments: 19 pages, 3 figures, 5 tables

Journal ref: Quantitative Biology, 10(2), 208-220, 2022

arXiv:2012.05391 [pdf]

Feasibility Assessment of a Cost-Effective Two-Wheel Kian-I Mobile Robot for Autonomous Navigation

Authors: Amin Abbasi, Somaiyeh MahmoudZadeh, Amirmehdi Yazdani, Ata Jahangir Moshayedi

Abstract: A two-wheeled mobile robot, namely Kian-I, is designed and prototyped in this research. The Kian-I is comparable with Khepera-IV in terms of dimensional specifications, mounted sensors, and performance capabilities and can be used for educational purposes and cost-effective experimental tests. A motion control architecture is designed for Kian-I in this study to facilitate accurate navigation for the robot in an immersive environment. The implemented control structure consists of two main components of the path recommender system and trajectory tracking controller. Given partial knowledge about the operation field, the path recommender system adopts B-spline curves and Particle Swarm Optimization (PSO) algorithm to determine a collision-free path curve with translational velocity constraint. The provided optimal reference path feeds into the trajectory tracking controller enabling Kian-I to navigate autonomously in the operating field. The trajectory tracking module eliminates the error between the desired path and the followed trajectory through controlling the wheels' velocity. To assess the feasibility of the proposed control architecture, the performance of Kian-I robot in autonomous navigation from any arbitrary initial pose to a target of interest is evaluated through numerous simulation and experimental studies. The experimental results demonstrate the functional capacities and performance of the prototyped robot to be used as a benchmark for investigation and verification of various mobile robot algorithms in the laboratory environment.

Submitted 9 December, 2020; originally announced December 2020.

Journal ref: Robotics and Autonomous System 2020

arXiv:2009.08869 [pdf]

doi 10.1142/S0219720021500153

PANDA: Predicting the change in proteins binding affinity upon mutations using sequence information

Authors: Wajid Arshad Abbasi, Syed Ali Abbas, Saiqa Andleeb

Abstract: Accurately determining a change in protein binding affinity upon mutations is important for the discovery and design of novel therapeutics and to assist mutagenesis studies. Determination of change in binding affinity upon mutations requires sophisticated, expensive, and time-consuming wet-lab experiments that can be aided with computational methods. Most of the computational prediction techniques require protein structures that limit their applicability to protein complexes with known structures. In this work, we explore the sequence-based prediction of change in protein binding affinity upon mutation. We have used protein sequence information instead of protein structures along with machine learning techniques to accurately predict the change in protein binding affinity upon mutation. Our proposed sequence-based novel change in protein binding affinity predictor called PANDA gives better accuracy than existing methods over the same validation set as well as on an external independent test dataset. On an external test dataset, our proposed method gives a maximum Pearson correlation coefficient of 0.52 in comparison to the state-of-the-art existing protein structure-based method called MutaBind which gives a maximum Pearson correlation coefficient of 0.59. Our proposed protein sequence-based method, to predict a change in binding affinity upon mutations, has wide applicability and comparable performance in comparison to existing protein structure-based methods. A cloud-based webserver implementation of PANDA and its python code is available at https://sites.google.com/view/wajidarshad/software and https://github.com/wajidarshad/panda.

Submitted 16 September, 2020; originally announced September 2020.

Journal ref: Journal of Bioinformatics and Computational Biology, 2021

arXiv:2008.06940 [pdf, other]

TempNodeEmb: Temporal Node Embedding considering temporal edge influence matrix

Authors: Khushnood Abbas, Alireza Abbasi, Dong Shi, Niu Ling, Mingsheng Shang, Chen Liong, Bolun Chen

Abstract: Understanding the evolutionary patterns of real-world evolving complex systems such as human interactions, transport networks, biological interactions, and computer networks has important implications in our daily lives. Predicting future links among the nodes in such networks reveals an important aspect of the evolution of temporal networks. To analyse networks, they are mapped to adjacency matrices; however, a single adjacency matrix cannot represent complex relationships (e.g. temporal pattern), and therefore, some approaches consider a simplified representation of temporal networks but in high-dimensional and generally sparse matrices. As a result, adjacency matrices cannot be directly used by machine learning models for making network or node level predictions. To overcome this problem, automated frameworks are proposed for learning low-dimensional vectors for nodes or edges, as state-of-the-art techniques in predicting temporal patterns in networks such as link prediction. However, these models fail to consider temporal dimensions of the networks. This gap motivated us to propose in this research a new node embedding technique which exploits the evolving nature of the networks considering a simple three-layer graph neural network at each time step, and extracting node orientation by Given's angle method. To assess its efficiency, we evaluated our proposed algorithm against six state-of-the-art benchmark network embedding models on four real temporal network datasets, and the results show that our model outperforms the other methods in predicting future links in temporal networks.

Submitted 16 August, 2020; originally announced August 2020.

Comments: IEEE double column 6 pages

arXiv:2007.12969 [pdf]

Constructing a Testbed for Psychometric Natural Language Processing

Authors: Ahmed Abbasi, David G. Dobolyi, Richard G. Netemeyer

Abstract: Psychometric measures of ability, attitudes, perceptions, and beliefs are crucial for understanding user behaviors in various contexts including health, security, e-commerce, and finance. Traditionally, psychometric dimensions have been measured and collected using survey-based methods. Inferring such constructs from user-generated text could afford opportunities for timely, unobtrusive collection and analysis. In this paper, we describe our efforts to construct a corpus for psychometric natural language processing (NLP). We discuss our multi-step process to align user text with their survey-based response items and provide an overview of the resulting testbed which encompasses survey-based psychometric measures and accompanying user-generated text from over 8,500 respondents. We report preliminary results on the use of the text to categorize/predict users' survey response labels. We also discuss the important implications of our work and resulting testbed for future psychometric NLP research.

Submitted 25 July, 2020; originally announced July 2020.

Comments: 7 pages, 9 figures

arXiv:2007.02307 [pdf, ps, other]

Challenges in Designing Exploit Mitigations for Deeply Embedded Systems

Authors: Ali Abbasi, Jos Wetzels, Thorsten Holz, Sandro Etalle

Abstract: Memory corruption vulnerabilities have been around for decades and rank among the most prevalent vulnerabilities in embedded systems. Yet this constrained environment poses unique design and implementation challenges that significantly complicate the adoption of common hardening techniques. Combined with the irregular and involved nature of embedded patch management, this results in prolonged vulnerability exposure windows and vulnerabilities that are relatively easy to exploit. Considering the sensitive and critical nature of many embedded systems, this situation merits significant improvement. In this work, we present the first quantitative study of exploit mitigation adoption in 42 embedded operating systems, showing the embedded world to significantly lag behind the general-purpose world. To improve the security of deeply embedded systems, we subsequently present μArmor, an approach to address some of the key gaps identified in our quantitative analysis. μArmor raises the bar for exploitation of embedded memory corruption vulnerabilities, while being adoptable in the short term without incurring prohibitive extra performance or storage costs.

Submitted 5 July, 2020; originally announced July 2020.

Comments: Published in 4th IEEE European Symposium on Security and Privacy (EuroS&P'19)

arXiv:2005.12855 [pdf, other]

COVID-Net S: Towards computer-aided severity assessment via training and validation of deep neural networks for geographic extent and opacity extent scoring of chest X-rays for SARS-CoV-2 lung disease severity

Authors: Alexander Wong, Zhong Qiu Lin, Linda Wang, Audrey G. Chung, Beiyi Shen, Almas Abbasi, Mahsa Hoshmand-Kochi, Timothy Q. Duong

Abstract: Background: A critical step in effective care and treatment planning for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of the COVID-19 pandemic, is the assessment of the severity of disease progression. Chest x-rays (CXRs) are often used to assess SARS-CoV-2 severity, with two important assessment metrics being extent of lung involvement and degree of opacity. In this proof-of-concept study, we assess the feasibility of computer-aided scoring of CXRs of SARS-CoV-2 lung disease severity using a deep learning system. Materials and Methods: Data consisted of 396 CXRs from SARS-CoV-2 positive patient cases. Geographic extent and opacity extent were scored by two board-certified expert chest radiologists (with 20+ years of experience) and a 2nd-year radiology resident. The deep neural networks used in this study, which we name COVID-Net S, are based on a COVID-Net network architecture. 100 versions of the network were independently learned (50 to perform geographic extent scoring and 50 to perform opacity extent scoring) using random subsets of CXRs from the study, and we evaluated the networks using stratified Monte Carlo cross-validation experiments. Findings: The COVID-Net S deep neural networks yielded R$^2$ of 0.664 $\pm$ 0.032 and 0.635 $\pm$ 0.044 between predicted scores and radiologist scores for geographic extent and opacity extent, respectively, in stratified Monte Carlo cross-validation experiments. The best performing networks achieved R$^2$ of 0.739 and 0.741 between predicted scores and radiologist scores for geographic extent and opacity extent, respectively. Interpretation: The results are promising and suggest that the use of deep neural networks on CXRs could be an effective tool for computer-aided assessment of SARS-CoV-2 lung disease severity, although additional studies are needed before adoption for routine clinical use.

Submitted 16 April, 2021; v1 submitted 26 May, 2020; originally announced May 2020.

Comments: 8 pages

arXiv:2005.11856 [pdf, other]

Predicting COVID-19 Pneumonia Severity on Chest X-ray with Deep Learning

Authors: Joseph Paul Cohen, Lan Dao, Paul Morrison, Karsten Roth, Yoshua Bengio, Beiyi Shen, Almas Abbasi, Mahsa Hoshmand-Kochi, Marzyeh Ghassemi, Haifang Li, Tim Q Duong

Abstract: Purpose: The need to streamline patient management for COVID-19 has become more pressing than ever. Chest X-rays provide a non-invasive (potentially bedside) tool to monitor the progression of the disease. In this study, we present a severity score prediction model for COVID-19 pneumonia for frontal chest X-ray images. Such a tool can gauge severity of COVID-19 lung infections (and pneumonia in general) that can be used for escalation or de-escalation of care as well as monitoring treatment efficacy, especially in the ICU. Methods: Images from a public COVID-19 database were scored retrospectively by three blinded experts in terms of the extent of lung involvement as well as the degree of opacity. A neural network model that was pre-trained on large (non-COVID-19) chest X-ray datasets is used to construct features for COVID-19 images which are predictive for our task. Results: This study finds that training a regression model on a subset of the outputs from this pre-trained chest X-ray model predicts our geographic extent score (range 0-8) with 1.14 mean absolute error (MAE) and our lung opacity score (range 0-6) with 0.78 MAE. Conclusions: These results indicate that our model's ability to gauge severity of COVID-19 lung infections could be used for escalation or de-escalation of care as well as monitoring treatment efficacy, especially in the intensive care unit (ICU). A proper clinical trial is needed to evaluate efficacy. To enable this we make our code, labels, and data available online at https://github.com/mlmed/torchxrayvision/tree/master/scripts/covid-severity and https://github.com/ieee8023/covid-chestxray-dataset

Submitted 30 June, 2020; v1 submitted 24 May, 2020; originally announced May 2020.

arXiv:2004.11842 [pdf]

doi 10.32604/cmc.2020.07708

A Mobile Cloud-Based eHealth Scheme

Authors: Yihe Liu, Aaqif Afzaal Abbasi, Atefeh Aghaei, Almas Abbasi, Amir Mosavi, Shahab Shamshirband, Mohammed A. A. Al-qaness

Abstract: Mobile cloud computing is an emerging field that is gaining popularity across borders at a rapid pace. Similarly, the field of health informatics is also considered as an extremely important field. This work observes the collaboration between these two fields to solve the traditional problem of extracting Electrocardiogram signals from trace reports and then performing analysis. The developed system has two front ends, the first dedicated for the user to perform the photographing of the trace report. Once the photographing is complete, mobile computing is used to extract the signal. Once the signal is extracted, it is uploaded into the server and further analysis is performed on the signal in the cloud. Once this is done, the second interface, intended for the use of the physician, can download and view the trace from the cloud. The data is securely held using a password-based authentication method. The system presented here is one of the first attempts at delivering the total solution, and after further upgrades, it will be possible to deploy the system in a commercial setting.

Submitted 15 April, 2020; originally announced April 2020.

Comments: 9 pages, 3 figures

MSC Class: 68T05

arXiv:2003.11864 [pdf, other]

doi 10.1109/RBME.2021.3056455

Information and Communication Theoretical Understanding and Treatment of Spinal Cord Injuries: State-of-the-art and Research Challenges

Authors: Ozgur B. Akan, Hamideh Ramezani, Meltem Civas, Oktay Cetinkaya, Bilgesu A. Bilgin, Naveed A. Abbasi

Abstract: Among the various key networks in the human body, the nervous system occupies central importance. The debilitating effects of spinal cord injuries (SCI) impact a significant number of people throughout the world, and to date, there is no satisfactory method to treat them. In this paper, we review the major treatment techniques for SCI that include promising solutions based on information and communication technology (ICT) and identify the key characteristics of such systems. We then introduce two novel ICT-based treatment approaches for SCI. The first proposal is based on neural interface systems (NIS) with enhanced feedback, where the external machines are interfaced with the brain and the spinal cord such that the brain signals are directly routed to the limbs for movement. The second proposal relates to the design of self-organizing artificial neurons (ANs) that can be used to replace the injured or dead biological neurons. Apart from SCI treatment, the proposed methods may also be utilized as enabling technologies for neural interface applications by acting as bio-cyber interfaces between the nervous system and machines. Furthermore, under the framework of Internet of Bio-Nano Things (IoBNT), experience gained from SCI treatment techniques can be transferred to nano communication research.

Submitted 11 March, 2021; v1 submitted 26 March, 2020; originally announced March 2020.

Comments: IEEE Reviews in Biomedical Engineering

arXiv:2003.11265 [pdf, other]

Multiscale Sparsifying Transform Learning for Image Denoising

Authors: Ashkan Abbasi, Amirhassan Monadjemi, Leyuan Fang, Hossein Rabbani, Neda Noormohammadi, Yi Zhang

Abstract: The data-driven sparse methods such as synthesis dictionary learning (e.g., K-SVD) and sparsifying transform learning have been proven effective in image denoising. However, they are intrinsically single-scale which can lead to suboptimal results. We propose two methods developed based on wavelet subbands mixing to efficiently combine the merits of both single and multiscale methods. We show that an efficient multiscale method can be devised without the need for denoising detail subbands which substantially reduces the runtime. The proposed methods are initially derived within the framework of sparsifying transform learning denoising, and then, they are generalized to propose our multiscale extensions for the well-known K-SVD and SAIST image denoising methods. We analyze and assess the studied methods thoroughly and compare them with the well-known and state-of-the-art methods. The experiments show that our methods are able to offer good trade-offs between performance and complexity.

Submitted 25 July, 2021; v1 submitted 25 March, 2020; originally announced March 2020.
