-
Bahasa Harmony: A Comprehensive Dataset for Bahasa Text-to-Speech Synthesis with Discrete Codec Modeling of EnGen-TTS
Authors:
Onkar Kishor Susladkar,
Vishesh Tripathi,
Biddwan Ahmed
Abstract:
This research introduces a comprehensive Bahasa text-to-speech (TTS) dataset and a novel TTS model, EnGen-TTS, designed to enhance the quality and versatility of synthetic speech in the Bahasa language. The dataset, spanning approximately 55.0 hours and 52K audio recordings, integrates diverse textual sources, ensuring linguistic richness. A meticulous recording setup captures the nuances of Bahasa phonetics, employing professional equipment to ensure high-fidelity audio samples. Statistical analysis reveals the dataset's scale and diversity, laying the foundation for model training and evaluation. The proposed EnGen-TTS model performs better than established baselines, achieving a Mean Opinion Score (MOS) of 4.45 $\pm$ 0.13. Additionally, our investigation on real-time factor and model size highlights EnGen-TTS as a compelling choice, with efficient performance. This research marks a significant advancement in Bahasa TTS technology, with implications for diverse language applications. Link to Generated Samples: https://bahasa-harmony-comp.vercel.app/
Submitted 9 October, 2024;
originally announced October 2024.
-
A Joint Spectro-Temporal Relational Thinking Based Acoustic Modeling Framework
Authors:
Zheng Nan,
Ting Dang,
Vidhyasaharan Sethu,
Beena Ahmed
Abstract:
Relational thinking refers to the inherent ability of humans to form mental impressions about relations between sensory signals and prior knowledge, and subsequently incorporate them into their model of their world. Despite the crucial role relational thinking plays in human understanding of speech, it has yet to be leveraged in any artificial speech recognition systems. Recently, there have been some attempts to correct this oversight, but these have been limited to coarse utterance-level models that operate exclusively in the time domain. In an attempt to narrow the gap between artificial systems and human abilities, this paper presents a novel spectro-temporal relational thinking based acoustic modeling framework. Specifically, it first generates numerous probabilistic graphs to model the relationships among speech segments across both time and frequency domains. The relational information rooted in every pair of nodes within these graphs is then aggregated and embedded into latent representations that can be utilized by downstream tasks. Models built upon this framework outperform state-of-the-art systems with a 7.82% improvement in phoneme recognition tasks over the TIMIT dataset. In-depth analyses further reveal that our proposed relational thinking modeling mainly improves the model's ability to recognize vowels, which are the most likely to be confused by phoneme recognizers.
Submitted 17 September, 2024;
originally announced September 2024.
-
Evaluating Optimal Safe Flows Decomposition for RNA Assembly
Authors:
Bashar Ahmed,
Siddharth Singh Rana,
Ujjwal,
Shahbaz Khan
Abstract:
In Bioinformatics, the applications of flow decomposition in directed acyclic graphs are highlighted in the RNA Assembly problem. However, the problem admits multiple solutions, of which exactly one correctly represents the underlying transcripts. This was addressed by the Safe and Complete framework [RECOMB16], which reports all the parts of the solution that are present in every possible solution. Khan et al. [RECOMB22] first studied flow decomposition in the safe and complete framework. Their algorithm showed superior performance ($\approx20\%$) over the popular heuristic (greedy-width) on sufficiently complex graphs for a unified metric of precision and coverage (F-score). They presented the solution in multiple representations using simple but suboptimal algorithms, which were later optimized by Khan and Tomescu [ESA22], who also presented an optimal representation.
In this paper, we evaluate the practical significance of the optimal algorithms by Khan and Tomescu [ESA22]. Our work highlights the significance of the theoretically optimal algorithms, which improve time (up to $60-70\%$) and memory (up to $76-85\%$), and of the optimal representations, which improve output size (up to $135-170\%$) significantly. However, the impact of the optimal algorithms was limited due to a large number of extremely short safe paths. We propose heuristics to improve these representations further, resulting in further improvement in time (up to $10\%$) and output size ($10-25\%$). However, in absolute terms, these improvements were limited to a few seconds on the real datasets involved, due to the smaller size of the graphs. We thus generated large random graphs to demonstrate the scalability of the above results. The older algorithms [RECOMB22] were not practical on moderately large graphs ($\geq 1M$ nodes), while the optimal algorithms [ESA22] were linearly scalable for much larger graphs ($\geq 100M$ nodes).
Submitted 20 September, 2024;
originally announced September 2024.
-
Auto-Landmark: Acoustic Landmark Dataset and Open-Source Toolkit for Landmark Extraction
Authors:
Xiangyu Zhang,
Daijiao Liu,
Tianyi Xiao,
Cihan Xiao,
Tuende Szalay,
Mostafa Shahin,
Beena Ahmed,
Julien Epps
Abstract:
In the speech signal, acoustic landmarks identify times when the acoustic manifestations of the linguistically motivated distinctive features are most salient. Acoustic landmarks have been widely applied in various domains, including speech recognition, speech depression detection, clinical analysis of speech abnormalities, and the detection of disordered speech. However, there is currently no dataset available that provides precise timing information for landmarks, which has been proven to be crucial for downstream applications involving landmarks. In this paper, we selected the most useful acoustic landmarks based on previous research and annotated the TIMIT dataset with them, based on a combination of phoneme boundary information and manual inspection. Moreover, previous landmark extraction tools were not open source or benchmarked; to address this, we developed an open source Python-based landmark extraction tool and established a series of landmark detection baselines. The first of their kind, the dataset with precise landmark timing information, the landmark extraction tool, and the baselines are designed to support a wide variety of future research.
Submitted 12 September, 2024;
originally announced September 2024.
-
Rethinking Mamba in Speech Processing by Self-Supervised Models
Authors:
Xiangyu Zhang,
Jianbo Ma,
Mostafa Shahin,
Beena Ahmed,
Julien Epps
Abstract:
The Mamba-based model has demonstrated outstanding performance across tasks in computer vision, natural language processing, and speech processing. However, in the realm of speech processing, the Mamba-based model's performance varies across different tasks. For instance, in tasks such as speech enhancement and spectrum reconstruction, the Mamba model performs well when used independently. However, for tasks like speech recognition, additional modules are required to surpass the performance of attention-based models. We propose the hypothesis that the Mamba-based model excels in "reconstruction" tasks within speech processing. However, for "classification tasks" such as Speech Recognition, additional modules are necessary to accomplish the "reconstruction" step. To validate our hypothesis, we analyze the previous Mamba-based Speech Models from an information theory perspective. Furthermore, we leveraged the properties of HuBERT in our study. We trained a Mamba-based HuBERT model, and the mutual information patterns, along with the model's performance metrics, confirmed our assumptions.
Submitted 11 September, 2024;
originally announced September 2024.
-
Physics-informed DeepONet with stiffness-based loss functions for structural response prediction
Authors:
Bilal Ahmed,
Yuqing Qiu,
Diab W. Abueidda,
Waleed El-Sekelly,
Borja Garcia de Soto,
Tarek Abdoun,
Mostafa E. Mobasher
Abstract:
Finite element modeling is a well-established tool for structural analysis, yet modeling complex structures often requires extensive pre-processing, significant analysis effort, and considerable time. This study addresses this challenge by introducing an innovative method for real-time prediction of structural static responses using DeepONet, which relies on a novel approach to physics-informed networks driven by structural balance laws. This approach offers the flexibility to accurately predict responses under various load classes and magnitudes. The trained DeepONet can generate solutions for the entire domain within a fraction of a second. This capability effectively eliminates the need for extensive remodeling and analysis typically required for each new case in FE modeling. We apply the proposed method to two structures: a simple 2D beam structure and a comprehensive 3D model of a real bridge. To predict multiple variables with DeepONet, we utilize two strategies: a split branch/trunk and multiple DeepONets combined into a single DeepONet. In addition to data-driven training, we introduce a novel physics-informed training approach. This method leverages structural stiffness matrices to enforce fundamental equilibrium and energy conservation principles, resulting in two novel physics-informed loss functions: energy conservation and static equilibrium using the Schur complement. We use various combinations of loss functions to achieve an error rate of less than 5% with significantly reduced training time. This study shows that DeepONet, enhanced with hybrid loss functions, can accurately and efficiently predict displacements and rotations at each mesh point, with reduced training time.
Submitted 2 September, 2024;
originally announced September 2024.
-
Adaptive Data Quality Scoring Operations Framework using Drift-Aware Mechanism for Industrial Applications
Authors:
Firas Bayram,
Bestoun S. Ahmed,
Erik Hallin
Abstract:
Within data-driven artificial intelligence (AI) systems for industrial applications, ensuring the reliability of the incoming data streams is an integral part of trustworthy decision-making. An approach to assess data validity is data quality scoring, which assigns a score to each data point or stream based on various quality dimensions. However, certain dimensions exhibit dynamic qualities, which require adaptation on the basis of the system's current conditions. Existing methods often overlook this aspect, making them inefficient in dynamic production environments. In this paper, we introduce the Adaptive Data Quality Scoring Operations Framework, a novel framework developed to address the challenges posed by dynamic quality dimensions in industrial data streams. The framework integrates a dynamic change detector mechanism that actively monitors and adapts to changes in data quality, ensuring the relevance of quality scores. We evaluate the proposed framework's performance in a real-world industrial use case. The experimental results reveal high predictive performance and efficient processing time, highlighting its effectiveness in practical quality-driven AI applications.
Submitted 13 August, 2024;
originally announced August 2024.
-
Damage identification for bridges using machine learning: Development and application to KW51 bridge
Authors:
Yuqing Qiu,
Bilal Ahmed,
Diab W. Abueidda,
Waleed El-Sekelly,
Borja Garcia de Soto,
Tarek Abdoun,
Hongli Ji,
Jinhao Qiu,
Mostafa E. Mobasher
Abstract:
The available tools for damage identification in civil engineering structures are known to be computationally expensive and data-demanding. This paper proposes a comprehensive machine learning based damage identification (CMLDI) method that integrates modal analysis and dynamic analysis strategies. The proposed approach is applied to a real structure, the KW51 railway bridge in Leuven. CMLDI combines signal processing, machine learning (ML), and structural analysis techniques to achieve a fast damage identification solver that relies on minimal monitoring data. CMLDI considers modal analysis inputs and extracted features from acceleration responses to inform the damage identification based on the long-term and short-term monitoring data. Results of operational modal analysis, through the analysis of long-term monitoring data, are analyzed using pre-trained k-nearest neighbor (kNN) classifiers to identify damage existence, location, and magnitude. A well-crafted assembly of signal processing and ML methods is used to analyze acceleration time histories. Stacked gated recurrent unit (Stacked GRU) networks are used to identify damage existence, kNN classifiers are used to identify damage magnitude, and convolutional neural networks (CNNs) are used to identify damage location. The damage identification results for the KW51 bridge demonstrate this approach's high accuracy, efficiency, and robustness. In this work, the training data is retrieved from the sensor of the KW51 bridge as well as the numerical finite element model (FEM). The proposed approach presents a systematic path to the generation of training data using a validated FEM. The data generation relies on modeling combinations of damage locations and magnitudes along the bridge.
Submitted 25 September, 2024; v1 submitted 6 August, 2024;
originally announced August 2024.
-
Mamba in Speech: Towards an Alternative to Self-Attention
Authors:
Xiangyu Zhang,
Qiquan Zhang,
Hexin Liu,
Tianyi Xiao,
Xinyuan Qian,
Beena Ahmed,
Eliathamby Ambikairajah,
Haizhou Li,
Julien Epps
Abstract:
Transformer and its derivatives have achieved success in diverse tasks across computer vision, natural language processing, and speech processing. To reduce the complexity of computations within the multi-head self-attention mechanism in Transformer, Selective State Space Models (i.e., Mamba) were proposed as an alternative. Mamba exhibited its effectiveness in natural language processing and computer vision tasks, but its superiority has rarely been investigated in speech signal processing. This paper explores solutions for applying Mamba to speech processing by discussing two typical speech processing tasks: speech recognition, which requires semantic and sequential information, and speech enhancement, which focuses primarily on sequential patterns. The experimental results show the superiority of bidirectional Mamba (BiMamba) over vanilla Mamba for speech processing. Moreover, experiments demonstrate the effectiveness of BiMamba as an alternative to the self-attention module in Transformer and its derivatives, particularly for the semantic-aware task. The crucial technologies for transferring Mamba to speech are then summarized in ablation studies and the discussion section to offer insights for future research.
Submitted 4 October, 2024; v1 submitted 21 May, 2024;
originally announced May 2024.
-
An Adaptive Metaheuristic Framework for Changing Environments
Authors:
Bestoun S. Ahmed
Abstract:
The rapidly changing landscapes of modern optimization problems require algorithms that can be adapted in real-time. This paper introduces an Adaptive Metaheuristic Framework (AMF) designed for dynamic environments. It is capable of intelligently adapting to changes in the problem parameters. The AMF combines a dynamic representation of problems, a real-time sensing system, and adaptive techniques to navigate continuously changing optimization environments. Through a simulated dynamic optimization problem, the AMF's capability is demonstrated to detect environmental changes and proactively adjust its search strategy. This framework utilizes a differential evolution algorithm that is improved with an adaptation module that adjusts solutions in response to detected changes. The capability of the AMF to adjust is tested through a series of iterations, demonstrating its resilience and robustness in sustaining solution quality despite the problem's development. The effectiveness of AMF is demonstrated through a series of simulations on a dynamic optimization problem. Robustness and agility characterize the algorithm's performance, as evidenced by the presented fitness evolution and solution path visualizations. The findings show that AMF is a practical solution to dynamic optimization and a major step forward in the creation of algorithms that can handle the unpredictability of real-world problems.
Submitted 18 April, 2024;
originally announced April 2024.
-
Optimizing Service Placement in Edge-to-Cloud AR/VR Systems using a Multi-Objective Genetic Algorithm
Authors:
Mohammadsadeq Garshasbi Herabad,
Javid Taheri,
Bestoun S. Ahmed,
Calin Curescu
Abstract:
Augmented Reality (AR) and Virtual Reality (VR) systems involve computationally intensive image processing algorithms that can burden end-devices with limited resources, leading to poor performance in providing low latency services. Edge-to-cloud computing overcomes the limitations of end-devices by offloading their computations to nearby edge devices or remote cloud servers. Although this proves to be sufficient for many applications, optimal placement of latency sensitive AR/VR services in edge-to-cloud infrastructures (to provide desirable service response times and reliability) remains a formidable challenge. To address this challenge, this paper develops a Multi-Objective Genetic Algorithm (MOGA) to optimize the placement of AR/VR-based services in multi-tier edge-to-cloud environments. The primary objective of the proposed MOGA is to minimize the response time of all running services, while maximizing the reliability of the underlying system from both software and hardware perspectives. To evaluate its performance, we mathematically modeled all components and developed a tailor-made simulator to assess its effectiveness on various scales. MOGA was compared with several heuristics to prove that intuitive solutions, which are usually assumed sufficient, are not efficient enough for the stated problem. The experimental results indicated that MOGA can significantly reduce the response time of deployed services by an average of 67% on different scales, compared to other heuristic methods. MOGA also ensures reliability of 97% for the infrastructure (hardware) and 95% for the services (software).
Submitted 19 March, 2024;
originally announced March 2024.
-
Komodo: A Linguistic Expedition into Indonesia's Regional Languages
Authors:
Louis Owen,
Vishesh Tripathi,
Abhay Kumar,
Biddwan Ahmed
Abstract:
The recent breakthroughs in Large Language Models (LLMs) have mostly focused on languages with easily available and sufficient resources, such as English. However, there remains a significant gap for languages that lack sufficient linguistic resources in the public domain. Our work introduces Komodo-7B, a family of 7-billion-parameter Large Language Models designed to address this gap by seamlessly operating across Indonesian, English, and 11 regional languages in Indonesia. The Komodo-7B family consists of Komodo-7B-Base and Komodo-7B-Instruct. Komodo-7B-Instruct stands out by achieving state-of-the-art performance in various tasks and languages, outperforming the benchmarks set by OpenAI's GPT-3.5, Cohere's Aya-101, Llama-2-Chat-13B, Mixtral-8x7B-Instruct-v0.1, Gemma-7B-it, and many more. This model not only demonstrates superior performance in both language-specific and overall assessments but also highlights its capability to excel in linguistic diversity. Our commitment to advancing language models extends beyond well-resourced languages, aiming to bridge the gap for those with limited linguistic assets. Additionally, Komodo-7B-Instruct's better cross-language understanding contributes to addressing educational disparities in Indonesia, offering direct translations from English to 11 regional languages, a significant improvement compared to existing language translation services. Komodo-7B represents a crucial step towards inclusivity and effectiveness in language models, catering to the linguistic needs of diverse communities.
Submitted 19 March, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
When LLMs Meets Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection
Authors:
Xiangyu Zhang,
Hexin Liu,
Kaishuai Xu,
Qiquan Zhang,
Daijiao Liu,
Beena Ahmed,
Julien Epps
Abstract:
Depression is a critical concern in global mental health, prompting extensive research into AI-based detection methods. Among various AI technologies, Large Language Models (LLMs) stand out for their versatility in mental healthcare applications. However, their primary limitation arises from their exclusive dependence on textual input, which constrains their overall capabilities. Furthermore, the utilization of LLMs in identifying and analyzing depressive states is still relatively untapped. In this paper, we present an innovative approach to integrating acoustic speech information into the LLMs framework for multimodal depression detection. We investigate an efficient method for depression detection by integrating speech signals into LLMs utilizing Acoustic Landmarks. By incorporating acoustic landmarks, which are specific to the pronunciation of spoken words, our method adds critical dimensions to text transcripts. This integration also provides insights into the unique speech patterns of individuals, revealing the potential mental states of individuals. Evaluations of the proposed approach on the DAIC-WOZ dataset reveal state-of-the-art results when compared with existing Audio-Text baselines. In addition, this approach is not only valuable for the detection of depression but also represents a new perspective in enhancing the ability of LLMs to comprehend and process speech signals.
Submitted 23 September, 2024; v1 submitted 17 February, 2024;
originally announced February 2024.
-
Beyond Extraction: Contextualising Tabular Data for Efficient Summarisation by Language Models
Authors:
Uday Allu,
Biddwan Ahmed,
Vishesh Tripathi
Abstract:
The conventional use of the Retrieval-Augmented Generation (RAG) architecture has proven effective for retrieving information from diverse documents. However, challenges arise in handling complex table queries, especially within PDF documents containing intricate tabular structures. This research introduces an innovative approach to enhance the accuracy of complex table queries in RAG-based systems. Our methodology involves storing PDFs in the retrieval database and extracting tabular content separately. The extracted tables undergo a process of context enrichment, concatenating headers with corresponding values. To ensure a comprehensive understanding of the enriched data, we employ a fine-tuned version of the Llama-2-chat language model for summarisation within the RAG architecture. Furthermore, we augment the tabular data with contextual sense using the ChatGPT 3.5 API through a one-shot prompt. This enriched data is then fed into the retrieval database alongside other PDFs. Our approach aims to significantly improve the precision of complex table queries, offering a promising solution to a longstanding challenge in information retrieval.
Submitted 10 February, 2024; v1 submitted 4 January, 2024;
originally announced January 2024.
-
Compact Electrochromic Optical Recording of Bioelectric Potentials
Authors:
Kenneth Nakasone,
Chris Zavik,
Erica Liu,
Burhan Ahmed,
Dana Griffith,
Lothar Maisenbacher,
Ashwin Singh,
Yuecheng Zhou,
Bianxiao Cui,
Holger Müller
Abstract:
Electrochromic optical recording (ECORE) is a label-free method that utilizes electrochromism to optically detect electrical signals in biological cells with a high signal-to-noise ratio and is suitable for long-term recording. However, ECORE usually requires a large and intricate optical setup, making it relatively difficult to transport and to study specimens on a large scale. Here, we present a compact ECORE apparatus that drastically reduces the spatial footprint and complexity of the ECORE setup whilst maintaining high sensitivity. An autobalancing differential photodetector automates common-mode noise rejection, removing the need for manually adjustable optics, and a compact laser module conserves space compared to a typical laser mount. The result is a simple, easy-to-use, and relatively low cost system that achieves a sensitivity of 16.7 μV (within a factor of 5 of the shot noise limit), and reliably detects action potentials from Human-induced pluripotent stem cell (HiPSC) derived cardiomyocytes. This setup can be further improved to within 1.5 dB of the shot noise limit by filtering out power-line interference.
Submitted 26 November, 2023;
originally announced November 2023.
-
Phonological Level wav2vec2-based Mispronunciation Detection and Diagnosis Method
Authors:
Mostafa Shahin,
Julien Epps,
Beena Ahmed
Abstract:
The automatic identification and analysis of pronunciation errors, known as Mispronunciation Detection and Diagnosis (MDD), plays a crucial role in Computer Aided Pronunciation Learning (CAPL) tools such as Second-Language (L2) learning or speech therapy applications. Existing MDD methods relying on analysing phonemes can only detect categorical errors of phonemes that have an adequate amount of training data to be modelled. With the unpredictable nature of the pronunciation errors of non-native or disordered speakers and the scarcity of training datasets, it is infeasible to model all types of mispronunciations. Moreover, phoneme-level MDD approaches have a limited ability to provide detailed diagnostic information about the error made. In this paper, we propose a low-level MDD approach based on the detection of speech attribute features. Speech attribute features break down phoneme production into elementary components that are directly related to the articulatory system, leading to more formative feedback to the learner. We further propose a multi-label variant of the Connectionist Temporal Classification (CTC) approach to jointly model the non-mutually exclusive speech attributes using a single model. The pre-trained wav2vec2 model was employed as a core model for the speech attribute detector. The proposed method was applied to L2 speech corpora collected from English learners from different native languages. The proposed speech attribute MDD method was further compared to the traditional phoneme-level MDD and achieved a significantly lower False Acceptance Rate (FAR), False Rejection Rate (FRR), and Diagnostic Error Rate (DER) over all speech attributes compared to the phoneme-level equivalent.
Submitted 12 November, 2023;
originally announced November 2023.
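The three error rates named in the abstract above can be illustrated with a small sketch. It assumes the definitions commonly used in MDD evaluation (FAR over mispronounced units, FRR over correctly pronounced units, DER over correctly detected mispronunciations); the paper's exact definitions may differ. The function name and the `(canonical, spoken, predicted)` record layout are hypothetical and chosen only for this example.

```python
def mdd_error_rates(records):
    """Compute FAR, FRR and DER from (canonical, spoken, predicted) label triples.

    Assumed definitions (common in MDD work, not taken from the paper):
      FAR = mispronunciations accepted as correct / all mispronunciations
      FRR = correct pronunciations flagged as errors / all correct pronunciations
      DER = detected mispronunciations given the wrong diagnosis / all detected
    """
    false_accepts = false_rejects = diagnostic_errors = 0
    n_correct = n_mispronounced = n_detected = 0

    for canonical, spoken, predicted in records:
        if spoken == canonical:
            # The speaker was right; any other prediction is a false rejection.
            n_correct += 1
            if predicted != canonical:
                false_rejects += 1
        else:
            # The speaker made an error.
            n_mispronounced += 1
            if predicted == canonical:
                false_accepts += 1
            else:
                n_detected += 1
                if predicted != spoken:
                    diagnostic_errors += 1

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    return {
        "FAR": ratio(false_accepts, n_mispronounced),
        "FRR": ratio(false_rejects, n_correct),
        "DER": ratio(diagnostic_errors, n_detected),
    }


if __name__ == "__main__":
    # Toy data invented for illustration only.
    toy = [
        ("t", "t", "t"),    # correct, accepted
        ("t", "t", "d"),    # correct, falsely rejected
        ("th", "s", "th"),  # mispronounced, falsely accepted
        ("th", "s", "s"),   # mispronounced, detected, right diagnosis
        ("th", "s", "f"),   # mispronounced, detected, wrong diagnosis
        ("r", "r", "r"),    # correct, accepted
    ]
    print(mdd_error_rates(toy))
```

The same counting applies whether the units are phonemes or individual speech attributes; only the label inventory changes.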
-
Spatial HuBERT: Self-supervised Spatial Speech Representation Learning for a Single Talker from Multi-channel Audio
Authors:
Antoni Dimitriadis,
Siqi Pan,
Vidhyasaharan Sethu,
Beena Ahmed
Abstract:
Self-supervised learning has been used to leverage unlabelled data, improving accuracy and generalisation of speech systems through the training of representation models. While many recent works have sought to produce effective representations across a variety of acoustic domains, languages, modalities and even simultaneous speakers, these studies have all been limited to single-channel audio recordings. This paper presents Spatial HuBERT, a self-supervised speech representation model that learns both acoustic and spatial information pertaining to a single speaker in a potentially noisy environment by using multi-channel audio inputs. Spatial HuBERT learns representations that outperform state-of-the-art single-channel speech representations on a variety of spatial downstream tasks, particularly in reverberant and noisy environments. We also demonstrate the utility of the representations learned by Spatial HuBERT on a speech localisation downstream task. Along with this paper, we publicly release a new dataset of 100 000 simulated first-order ambisonics room impulse responses.
Submitted 16 October, 2023;
originally announced October 2023.
-
Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling
Authors:
Zheng Nan,
Ting Dang,
Vidhyasaharan Sethu,
Beena Ahmed
Abstract:
Connectionist temporal classification (CTC) is commonly adopted for sequence modeling tasks like speech recognition, where it is necessary to preserve order between the input and target sequences. However, CTC is only applied to deterministic sequence models, where the latent space is discontinuous and sparse, which in turn makes them less capable of handling data variability when compared to variational models. In this paper, we integrate CTC with a variational model and derive loss functions that can be used to train more generalizable sequence models that preserve order. Specifically, we derive two versions of the novel variational CTC based on two reasonable assumptions, the first being that the variational latent variables at each time step are conditionally independent; and the second being that these latent variables are Markovian. We show that both loss functions allow direct optimization of the variational lower bound for the model log-likelihood, and present computationally tractable forms for implementing them.
Submitted 14 December, 2023; v1 submitted 21 September, 2023;
originally announced September 2023.
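The order preservation mentioned in the abstract above comes from the standard CTC mapping between frame-level paths and label sequences: consecutive repeats are merged, then blanks are dropped. The sketch below shows only that plain CTC collapse step, not the variational losses derived in the paper. The blank symbol `"_"` and the function name are assumptions made for this example.

```python
from itertools import groupby

BLANK = "_"  # hypothetical blank symbol chosen for this sketch


def ctc_collapse(path, blank=BLANK):
    """Map a frame-level CTC path to its label sequence.

    Step 1: merge runs of identical consecutive symbols.
    Step 2: remove blank symbols.
    The relative order of the surviving labels is never changed.
    """
    merged = [symbol for symbol, _ in groupby(path)]
    return [symbol for symbol in merged if symbol != blank]


if __name__ == "__main__":
    # A blank between the two "a" runs keeps them as separate labels.
    print(ctc_collapse(list("__cc_aa_a_t_")))  # → ['c', 'a', 'a', 't']
```

Because many paths collapse to the same label sequence, the CTC loss sums the probabilities of all such paths, which is what the dynamic-programming forward-backward recursion computes efficiently.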
-
Machine Learning Data Suitability and Performance Testing Using Fault Injection Testing Framework
Authors:
Manal Rahal,
Bestoun S. Ahmed,
Jorgen Samuelsson
Abstract:
Creating resilient machine learning (ML) systems has become necessary to ensure production-ready ML systems that acquire user confidence seamlessly. The quality of the input data and the model highly influence the successful end-to-end testing in data-sensitive systems. However, approaches for testing input data are fewer and less systematic than those for model testing. To address this gap, this paper presents the Fault Injection for Undesirable Learning in input Data (FIUL-Data) testing framework that tests the resilience of ML models to multiple intentionally-triggered data faults. Data mutators explore vulnerabilities of ML systems against the effects of different fault injections. The proposed framework is designed based on three main ideas: the mutators are not random; one data mutator is applied at an instance of time; and the selected ML models are optimized beforehand. This paper evaluates the FIUL-Data framework using data from analytical chemistry, comprising retention time measurements of anti-sense oligonucleotide. Empirical evaluation is carried out in a two-step process in which the responses of selected ML models to data mutation are analyzed individually and then compared with each other. The results show that the FIUL-Data framework allows the evaluation of the resilience of ML models. In most experimental cases, ML models show higher resilience at larger training datasets, while gradient boost performed better than support vector regression in smaller training sets. Overall, the mean squared error metric is useful in evaluating the resilience of models due to its higher sensitivity to data mutation.
Submitted 20 September, 2023;
originally announced September 2023.
-
METICULOUS: An FPGA-based Main Memory Emulator for System Software Studies
Authors:
Takahiro Hirofuchi,
Takaaki Fukai,
Akram Ben Ahmed,
Ryousei Takano,
Kento Sato
Abstract:
Due to the scaling problem of the DRAM technology, non-volatile memory devices, which are based on a different principle of operation than DRAM, are now being intensively developed to expand the main memory of computers. Disaggregated memory is also drawing attention as an emerging technology to scale up the main memory. Although system software studies need to discuss management mechanisms for the new main memory designs incorporating such emerging memory systems, there are no feasible memory emulation mechanisms that efficiently work for large-scale, privileged programs such as operating systems and hypervisors. In this paper, we propose an FPGA-based main memory emulator for system software studies on new main memory systems. It can emulate the main memory incorporating multiple memory regions with different performance characteristics. For the address region of each memory device, it emulates the latencies, bandwidths and bit-flip error rates of read/write operations, respectively. The emulator is implemented in the hardware module of an off-the-shelf FPGA System-on-Chip board. Any privileged/unprivileged software programs running on its powerful 64-bit CPU cores can access emulated main memory devices at a practical speed through exactly the same interface as normal DRAM main memory. We confirmed that the emulator transparently worked for CPU cores and successfully changed the performance of a memory region according to given emulation parameters; for example, the latencies measured by CPU cores were exactly proportional to the latencies inserted by the emulator, involving the minimum overhead of approximately 240 ns. As a preliminary use case, we confirmed that the emulator allows us to change the bandwidth limit and the inserted latency individually for unmodified software programs, making discussions on latency sensitivity much easier.
Submitted 7 September, 2023;
originally announced September 2023.
-
BED: Bi-Encoder-Based Detectors for Out-of-Distribution Detection
Authors:
Louis Owen,
Biddwan Ahmed,
Abhay Kumar
Abstract:
This paper introduces a novel method leveraging bi-encoder-based detectors along with a comprehensive study comparing different out-of-distribution (OOD) detection methods in NLP using different feature extractors. The feature extraction stage employs popular methods such as Universal Sentence Encoder (USE), BERT, MPNET, and GLOVE to extract informative representations from textual data. The evaluation is conducted on several datasets, including CLINC150, ROSTD-Coarse, SNIPS, and YELLOW. Performance is assessed using metrics such as F1-Score, MCC, FPR@90, FPR@95, AUPR, and AUROC. The experimental results demonstrate that the proposed bi-encoder-based detectors outperform other methods, both those that require OOD labels in training and those that do not, across all datasets, showing great potential for OOD detection in NLP. The simplicity of the training process and the superior detection performance make them applicable to real-world scenarios. The presented methods and benchmarking metrics serve as a valuable resource for future research in OOD detection, enabling further advancements in this field. The code and implementation details can be found on our GitHub repository: https://github.com/yellowmessenger/ood-detection.
Submitted 13 March, 2024; v1 submitted 15 June, 2023;
originally announced June 2023.
-
DA-LSTM: A Dynamic Drift-Adaptive Learning Framework for Interval Load Forecasting with LSTM Networks
Authors:
Firas Bayram,
Phil Aupke,
Bestoun S. Ahmed,
Andreas Kassler,
Andreas Theocharis,
Jonas Forsman
Abstract:
Load forecasting is a crucial topic in energy management systems (EMS) due to its vital role in optimizing energy scheduling and enabling more flexible and intelligent power grid systems. As a result, these systems allow power utility companies to respond promptly to demands in the electricity market. Deep learning (DL) models have been commonly employed in load forecasting problems supported by adaptation mechanisms to cope with the changing pattern of consumption by customers, known as concept drift. A drift magnitude threshold should be defined to design change detection methods to identify drifts. While the drift magnitude in load forecasting problems can vary significantly over time, existing literature often assumes a fixed drift magnitude threshold, which should be dynamically adjusted rather than fixed during system evolution. To address this gap, in this paper, we propose a dynamic drift-adaptive Long Short-Term Memory (DA-LSTM) framework that can improve the performance of load forecasting models without requiring a drift threshold setting. We integrate several strategies into the framework based on active and passive adaptation approaches. To evaluate DA-LSTM in real-life settings, we thoroughly analyze the proposed framework and deploy it in a real-world problem through a cloud-based environment. Efficiency is evaluated in terms of the prediction performance of each approach and computational cost. The experiments show performance improvements on multiple evaluation metrics achieved by our framework compared to baseline methods from the literature. Finally, we present a trade-off analysis between prediction performance and computational costs.
Submitted 15 May, 2023;
originally announced May 2023.
-
Holographic CFT Phase Transitions and Criticality for Rotating AdS Black Holes
Authors:
Moaathe Belhaj Ahmed,
Wan Cong,
David Kubiznak,
Robert B. Mann,
Manus R. Visser
Abstract:
Employing the novel exact dictionary between the laws of extended black hole thermodynamics and the laws of the dual CFT, we study the extended thermodynamics for CFT states that are dual to neutral singly-spinning asymptotically AdS black holes in $d$ bulk spacetime dimensions. On the field theory side we include two independent pairs of thermodynamic conjugate variables: the central charge-chemical potential term and the pressure-volume term. In this setting we uncover various phase transitions and critical behaviour in the CFT, focusing on three different thermodynamic ensembles. Namely, for fixed angular momentum and central charge, we show there is a Van der Waals-like criticality for $d=4,5$ and reentrant phase transitions for $d\ge 6$. At fixed angular velocity and central charge, there is a first-order (de)confinement phase transition in all dimensions $d \ge 3$. Finally, at fixed angular momentum and chemical potential we find a plethora of zero-order phase transitions and unstable phases in both $d=4$ and $d=6$.
Submitted 25 October, 2023; v1 submitted 4 May, 2023;
originally announced May 2023.
-
A Domain-Region Based Evaluation of ML Performance Robustness to Covariate Shift
Authors:
Firas Bayram,
Bestoun S. Ahmed
Abstract:
Most machine learning methods assume that the input data distribution is the same in the training and testing phases. However, in practice, this stationarity is usually not met and the distribution of inputs differs, leading to unexpected performance of the learned model in deployment. The issue in which the training and test data inputs follow different probability distributions while the input-output relationship remains unchanged is referred to as covariate shift. In this paper, the performance of conventional machine learning models was experimentally evaluated in the presence of covariate shift. Furthermore, a region-based evaluation was performed by decomposing the domain of the probability density function of the input data to assess the classifier's performance per domain region. Distributional changes were simulated in a two-dimensional classification problem. Subsequently, higher-dimensional (four-dimensional) experiments were conducted. Based on the experimental analysis, the Random Forests algorithm is the most robust classifier in the two-dimensional case, showing the lowest degradation rate for accuracy and F1-score metrics, with a range between 0.1% and 2.08%. Moreover, the results reveal that in higher-dimensional experiments, the performance of the models is predominantly influenced by the complexity of the classification function, leading to degradation rates exceeding 25% in most cases. It is also concluded that the models exhibit high bias towards the region with high density in the input space domain of the training samples.
Submitted 18 April, 2023;
originally announced April 2023.
-
Designing Fair, Cost-optimal Auctions based on Deep Learning for Procuring Agricultural Inputs through Farmer Collectives
Authors:
Mayank Ratan Bhardwaj,
Bazil Ahmed,
Prathik Diwakar,
Ganesh Ghalme,
Y. Narahari
Abstract:
Procuring agricultural inputs (agri-inputs for short) such as seeds, fertilizers, and pesticides, at desired quality levels and at affordable cost, forms a critical component of agricultural input operations. This is a particularly challenging problem being faced by small and marginal farmers in any emerging economy. Farmer collectives (FCs), which are cooperative societies of farmers, offer an excellent prospect for enabling cost-effective procurement of inputs with assured quality to the farmers. In this paper, our objective is to design sound, explainable mechanisms by which an FC will be able to procure agri-inputs in bulk and distribute the inputs procured to the individual farmers who are members of the FC. In the methodology proposed here, an FC engages qualified suppliers in a competitive, volume discount procurement auction in which the suppliers specify price discounts based on volumes supplied. The desiderata of properties for such an auction include: minimization of the total cost of procurement; incentive compatibility; individual rationality; fairness; and other business constraints. An auction satisfying all these properties is analytically infeasible and a key contribution of this paper is to develop a deep learning based approach to design such an auction. We use two realistic, stylized case studies from chili seeds procurement and a popular pesticide procurement to demonstrate the efficacy of these auctions.
Submitted 14 April, 2023;
originally announced April 2023.
-
DQSOps: Data Quality Scoring Operations Framework for Data-Driven Applications
Authors:
Firas Bayram,
Bestoun S. Ahmed,
Erik Hallin,
Anton Engman
Abstract:
Data quality assessment has become a prominent component in the successful execution of complex data-driven artificial intelligence (AI) software systems. In practice, real-world applications generate huge volumes of data at high speeds. These data streams require analysis and preprocessing before being permanently stored or used in a learning task. Therefore, significant attention has been paid to the systematic management and construction of high-quality datasets. Nevertheless, managing voluminous and high-velocity data streams is usually performed manually (i.e. offline), making it an impractical strategy in production environments. To address this challenge, DataOps has emerged to achieve life-cycle automation of data processes using DevOps principles. However, determining the data quality based on a fitness scale constitutes a complex task within the framework of DataOps. This paper presents a novel Data Quality Scoring Operations (DQSOps) framework that yields a quality score for production data in DataOps workflows. The framework incorporates two scoring approaches: an ML prediction-based approach that predicts the data quality score, and a standard-based approach that periodically produces the ground-truth scores based on assessing several data quality dimensions. We deploy the DQSOps framework in a real-world industrial use case. The results show that DQSOps achieves significant computational speedup rates compared to the conventional approach of data quality scoring while maintaining high prediction performance.
Submitted 27 March, 2023;
originally announced March 2023.
-
Automated and Systematic Digital Twins Testing for Industrial Processes
Authors:
Yunpeng Ma,
Khalil Younis,
Bestoun S. Ahmed,
Andreas Kassler,
Pavel Krakhmalev,
Andreas Thore,
Hans Lindback
Abstract:
Digital twins (DT) of industrial processes have become increasingly important. They aim to digitally represent the physical world to help evaluate, optimize, and predict physical processes and behaviors. Therefore, DT is a vital tool to improve production automation through digitalization and becomes more sophisticated due to rapidly evolving simulation and modeling capabilities, integration of IoT sensors with DT, and high-capacity cloud/edge computing infrastructure. However, the fidelity and reliability of DT software are essential to represent the physical world. This paper shows an automated and systematic test architecture for DT that correlates DT states with real-time sensor data from a production line in the forging industry. Our evaluation shows that the architecture can significantly accelerate the automatic DT testing process and improve its reliability. A systematic online DT testing method can significantly detect the performance shift and continuously improve the DT's fidelity. The snapshot creation methodology and testing agent architecture can be an inspiration and can be generally applicable to other industrial processes that use DT to generalize their automated testing.
Submitted 25 February, 2023;
originally announced February 2023.
-
Holographic dual of extended black hole thermodynamics
Authors:
Moaathe Belhaj Ahmed,
Wan Cong,
David Kubizňák,
Robert B. Mann,
Manus R. Visser
Abstract:
By respecting the conformal symmetry of the dual CFT, and treating the conformal factor of the AdS boundary as a thermodynamic parameter, we formulate the holographic first law that is exactly dual to the first law of extended black hole thermodynamics with variable cosmological constant but fixed Newton's constant.
Submitted 20 February, 2023; v1 submitted 16 February, 2023;
originally announced February 2023.
-
Patient-specific Finite Element Modeling of Aneurysmal dilatation after chronic type B aortic dissection
Authors:
Shaojie Zhang,
Joan D Laubrie,
S. Jamaleddin Mousavi,
Stéphane Avril,
Sabrina Ben Ahmed
Abstract:
Progressive aneurysmal dilatation is a well-recognized complication in patients with chronic type B aortic dissection (cTBAD), which may lead to a delayed rupture and create a life-threatening condition. However, our understanding of such aortic expansion in cTBAD remains weak. In the present paper, we propose to use numerical simulations to study the role of growth and remodeling (G&R) in aneurysmal dilatation after cTBAD. We set up a 3D finite-element model of G&R for aortic dissection within an open-source code. Constitutive equations, momentum balance equations, and equations related to the mechanobiology of the artery were formulated based on the homogenized constrained mixture theory. The model was first applied to idealized aortic geometries with cylindrical and toric shapes to demonstrate its feasibility and efficiency. The model was then applied to a patient-specific aortic segment to show its potential in more relevant and complex patient-specific clinical applications. It was found that the G&R tends to naturally trigger the aneurysmal dilatation after dissection, in order to restore its tensional equilibrium. Our results indicated that the value of the gain parameter, related to collagen G&R, plays an important role in the stability of aortic expansion after cTBAD. A small gain parameter will induce an excessive aneurysmal degeneration whilst a large gain parameter helps to recover a stabilized state of the artery after dissection. Finally, it was found that other mechanobiology-related parameters, such as the circumferential length of the dissection, as well as the pressure in the false lumen, may also be determinant for the stability of aneurysmal dilatation after cTBAD. Both a wide tear and an elevated false lumen pressure favor an unstable development of aortic expansion after cTBAD.
As future work, the present model will be validated through predictions of aneurysmal dilatation in patient-specific clinical cases, in comparison with datasets followed over a significant period of time.
Submitted 6 January, 2023;
originally announced January 2023.
-
Quality Assurance in MLOps Setting: An Industrial Perspective
Authors:
Ayan Chatterjee,
Bestoun S. Ahmed,
Erik Hallin,
Anton Engman
Abstract:
Today, machine learning (ML) is widely used in industry to provide the core functionality of production systems. However, it is practically always used in production systems as part of a larger end-to-end software system that is made up of several other components in addition to the ML model. Due to production demand and time constraints, automated software engineering practices are highly applicable. The increased use of automated ML software engineering practices in industries such as manufacturing and utilities requires an automated Quality Assurance (QA) approach as an integral part of ML software. Here, QA helps reduce risk by offering an objective perspective on the software task. Although conventional software engineering has automated tools for QA data analysis for data-driven ML, the use of QA practices for ML in operation (MLOps) is lacking. This paper examines the QA challenges that arise in industrial MLOps and conceptualizes modular strategies to deal with data integrity and Data Quality (DQ). The paper is accompanied by real industrial use-cases from industrial partners. The paper also presents several challenges that may serve as a basis for future studies.
Submitted 24 November, 2022; v1 submitted 23 November, 2022;
originally announced November 2022.
-
Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations
Authors:
Renee Lu,
Mostafa Shahin,
Beena Ahmed
Abstract:
Children's speech recognition is a vital, yet largely overlooked domain when building inclusive speech technologies. The major challenge impeding progress in this domain is the lack of adequate child speech corpora; however, recent advances in self-supervised learning have created a new opportunity for overcoming this problem of data scarcity. In this paper, we leverage self-supervised adult speech representations and use three well-known child speech corpora to build models for children's speech recognition. We assess the performance of fine-tuning on both native and non-native children's speech, examine the effect of cross-domain child corpora, and investigate the minimum amount of child speech required to fine-tune a model which outperforms a state-of-the-art adult model. We also analyze speech recognition performance across children's ages. Our results demonstrate that fine-tuning with cross-domain child corpora leads to relative improvements of up to 46.08% and 45.53% for native and non-native child speech respectively, and absolute improvements of 14.70% and 31.10%. We also show that with as little as 5 hours of transcribed children's speech, it is possible to fine-tune a children's speech recognition system that outperforms a state-of-the-art adult model fine-tuned on 960 hours of adult speech.
Submitted 14 November, 2022;
originally announced November 2022.
-
Few-Shot Learning for Biometric Verification
Authors:
Saad Bin Ahmed,
Umaid M. Zaffar,
Marium Aslam,
Muhammad Imran Malik
Abstract:
In machine learning applications, it is common practice to feed as much information as possible. In most cases, the model can handle large data sets that allow it to predict more accurately. In the presence of data scarcity, a Few-Shot learning (FSL) approach aims to build more accurate algorithms with limited training data. We propose a novel end-to-end lightweight architecture that verifies biometric data and produces results competitive with state-of-the-art accuracies through Few-Shot learning methods. The dense layers add to the complexity of state-of-the-art deep learning models, which inhibits their use in low-power applications. In the presented approach, a shallow network is coupled with a conventional machine learning technique that exploits hand-crafted features to verify biometric images from multi-modal sources such as signatures, periocular region, iris, face, fingerprints, etc. We introduce a self-estimated threshold that strictly monitors the False Acceptance Rate (FAR) while generalizing its results, hence eliminating user-defined thresholds from ROC curves that are likely to be biased by the local data distribution. This hybrid model benefits from few-shot learning to make up for the scarcity of data in biometric use-cases. We have conducted extensive experimentation with commonly used biometric datasets. The obtained results provided an effective solution for biometric verification systems.
Submitted 3 May, 2023; v1 submitted 12 November, 2022;
originally announced November 2022.
-
Vision-Based Robust Lane Detection and Tracking under Different Challenging Environmental Conditions
Authors:
Samia Sultana,
Boshir Ahmed,
Manoranjan Paul,
Muhammad Rafiqul Islam,
Shamim Ahmad
Abstract:
Lane marking detection is fundamental for advanced driving assistance systems. However, detecting lanes is highly challenging when the visibility of a road lane marking is low due to challenging real-life environments and adverse weather. Most lane detection methods suffer from four types of challenges: (i) light effects, i.e., shadow, glare of light, reflection, etc.; (ii) obscured visibility of eroded, blurred, colored and cracked lanes caused by natural disasters and adverse weather; (iii) lane marking occlusion by different objects from the surroundings (wiper, vehicles, etc.); and (iv) presence of confusing lane-like lines inside the lane view, e.g., guardrails, pavement marking, road divider, etc. Here, we propose a robust lane detection and tracking method with three key technologies. First, we introduce a comprehensive intensity threshold range (CITR) to improve the performance of the Canny operator in detecting low-intensity lane edges. Second, we propose a two-step lane verification technique, the angle-based geometric constraint (AGC) and length-based geometric constraint (LGC) followed by Hough Transform, to verify the characteristics of lane marking and to prevent incorrect lane detection. Finally, we propose a novel lane tracking technique, defining a range of horizontal lane position (RHLP) along the x axis which is updated with respect to the lane position of the previous frame. It can keep track of the lane position when either the left or right or both lane markings are partially or fully invisible. To evaluate the performance of the proposed method we used the DSDLDE [1] and SLD [2] datasets with 1080x1920 and 480x720 resolutions at 24 and 25 frames/sec respectively. Experimental results show that the average detection rate is 97.55%, and the average processing time is 22.33 msec/frame, which outperform the state-of-the-art method.
Submitted 14 June, 2023; v1 submitted 18 October, 2022;
originally announced October 2022.
-
Speaker- and Age-Invariant Training for Child Acoustic Modeling Using Adversarial Multi-Task Learning
Authors:
Mostafa Shahin,
Beena Ahmed,
Julien Epps
Abstract:
One of the major challenges in acoustic modelling of child speech is the rapid changes that occur in the children's articulators as they grow up, their differing growth rates and the subsequent high variability in the same age group. These high acoustic variations along with the scarcity of child speech corpora have impeded the development of a reliable speech recognition system for children. In this paper, a speaker- and age-invariant training approach based on adversarial multi-task learning is proposed. The system consists of one generator shared network that learns to generate speaker- and age-invariant features connected to three discrimination networks, for phoneme, age, and speaker. The generator network is trained to minimize the phoneme-discrimination loss and maximize the speaker- and age-discrimination losses in an adversarial multi-task learning fashion. The generator network is a Time Delay Neural Network (TDNN) architecture while the three discriminators are feed-forward networks. The system was applied to the OGI speech corpora and achieved a 13% reduction in the WER of the ASR.
Submitted 6 November, 2022; v1 submitted 18 October, 2022;
originally announced October 2022.
-
Heart Attack Classification System using Neural Network Trained with Particle Swarm Optimization
Authors:
Askandar H. Amin,
Botan K. Ahmed,
Bestan B. Maaroof,
Tarik A. Rashid
Abstract:
The prior detection of a heart attack could lead to the saving of one's life. Putting specific criteria into a system that provides an early warning of an imminent attack will be advantageous to a better prevention plan for an upcoming heart attack. Some studies have been conducted for this purpose, but the goal of preventing a patient from getting such a disease has not yet been reached. In this paper, a Neural Network trained with Particle Swarm Optimization (PSONN) is used to analyze the input criteria and enhance heart attack anticipation. A real and novel dataset that has been recorded on the disease is used. After preprocessing the data, the features are fed into the system. As a result, the outcomes from PSONN have been evaluated against those from other algorithms. Decision Tree, Random Forest, Neural network trained with Backpropagation (BPNN), and Naive Bayes were among those employed. Then the results of 100%, 99.2424%, 99.2323%, 81.3131%, and 66.4141% are produced concerning the mentioned algorithms, which show that PSONN has recorded the highest accuracy rate among all other tested algorithms.
Submitted 21 August, 2022;
originally announced September 2022.
-
A Drift Handling Approach for Self-Adaptive ML Software in Scalable Industrial Processes
Authors:
Firas Bayram,
Bestoun S. Ahmed,
Erik Hallin,
Anton Engman
Abstract:
Most industrial processes in real-world manufacturing applications are characterized by the scalability property, which requires an automated strategy to self-adapt machine learning (ML) software systems to the new conditions. In this paper, we investigate an Electroslag Remelting (ESR) use case process from the Uddeholms AB steel company. The use case involves predicting the minimum pressure value for a vacuum pumping event. Taking into account the long time required to collect new records and efficiently integrate the new machines with the built ML software system, and to accommodate the changes and satisfy the non-functional requirement of the software system, namely adaptability, we propose an automated and adaptive approach based on a drift handling technique called importance weighting. The aim is to address the problem of adding a new furnace to production and enable the adaptability attribute of the ML software. The overall results demonstrate the improvements in ML software performance achieved by implementing the proposed approach over the classical non-adaptive approach.
Submitted 15 August, 2022;
originally announced August 2022.
-
Testing of Machine Learning Models with Limited Samples: An Industrial Vacuum Pumping Application
Authors:
Ayan Chatterjee,
Bestoun S. Ahmed,
Erik Hallin,
Anton Engman
Abstract:
There is often a scarcity of training data for machine learning (ML) classification and regression models in industrial production, especially for time-consuming or sparsely run manufacturing processes. A majority of the limited ground-truth data is used for training, while a handful of samples are left for testing. Here, the number of test samples is inadequate to properly evaluate the robustness of the ML models under test for classification and regression. Furthermore, the output of these ML models may be inaccurate or even fail if the input data differ from the expected. This is the case for ML models used in the Electroslag Remelting (ESR) process in the refined steel industry to predict the pressure in a vacuum chamber. A vacuum pumping event that occurs once a workday generates a few hundred samples in a year of pumping for training and testing. In the absence of adequate training and test samples, this paper first presents a method to generate a fresh set of augmented samples based on vacuum pumping principles. Based on the generated augmented samples, three test scenarios and one test oracle are presented to assess the robustness of an ML model used for production on an industrial scale. Experiments are conducted with real industrial production data obtained from Uddeholms AB steel company. The evaluations indicate that Ensemble and Neural Network are the most robust when trained on augmented data using the proposed testing strategy. The evaluation also demonstrates the proposed method's effectiveness in checking and improving ML algorithms' robustness in such situations. The work improves software testing's state-of-the-art robustness testing in similar settings. Finally, the paper presents an MLOps implementation of the proposed approach for real-time ML model prediction and action on the edge node and automated continuous delivery of ML software from the cloud.
Submitted 8 August, 2022;
originally announced August 2022.
-
Novel Strategy Generating Variable-length State Machine Test Paths
Authors:
Vaclav Rechtberger,
Miroslav Bures,
Bestoun S. Ahmed,
Hynek Schvach
Abstract:
Finite State Machine is a popular modeling notation for various systems, especially software and electronic. Test paths can be automatically generated from the system model to test such systems using a suitable algorithm. This paper presents a strategy that generates test paths and allows test paths to start and end only in defined states of the finite state machine. The strategy also simultaneously supports generating test paths only of length in a given range. For this purpose, alternative system models, test coverage criteria, and a set of algorithms are developed. The strategy is compared with the best alternative based on the reduction of the test set generated by the established N-switch coverage approach on a mix of 171 industrial and artificially generated problem instances. The proposed strategy outperforms the compared variant in a smaller number of test path steps. The extent varies with the used test coverage criterion and preferred test path length range from none to a two-and-a-half-fold difference. Moreover, the proposed technique detected up to 30% more simple artificial defects inserted into experimental SUT models per one test step than the compared alternative technique. The proposed strategy is well applicable in situations where the possible test path starts and ends in a state machine need to be reflected and, concurrently, the length of the test paths has to be in a defined range.
Submitted 25 July, 2022;
originally announced July 2022.
-
IoT Anomaly Detection Methods and Applications: A Survey
Authors:
Ayan Chatterjee,
Bestoun S. Ahmed
Abstract:
Ongoing research on anomaly detection for the Internet of Things (IoT) is a rapidly expanding field. This growth necessitates an examination of application trends and current gaps. The vast majority of those publications are in areas such as network and infrastructure security, sensor monitoring, smart home, and smart city applications and are extending into even more sectors. Recent advancements in the field have increased the necessity to study the many IoT anomaly detection applications. This paper begins with a summary of the detection methods and applications, accompanied by a discussion of the categorization of IoT anomaly detection algorithms. We then discuss the current publications to identify distinct application domains, examining papers chosen based on our search criteria. The survey considers 64 papers among recent publications published between January 2019 and July 2021. In recent publications, we observed a shortage of IoT anomaly detection methodologies, for example, when dealing with the integration of systems with various sensors, data and concept drifts, and data augmentation where there is a shortage of Ground Truth data. Finally, we discuss these challenges and offer new perspectives where further research is required.
Submitted 19 July, 2022;
originally announced July 2022.
-
Vortex/anti-vortex pair creation in black hole thermodynamics
Authors:
Moaathe Belhaj Ahmed,
David Kubiznak,
Robert B. Mann
Abstract:
An isolated critical point is a peculiar thermodynamic critical point that occurs in the phase diagram of hyperbolic black holes in Kth-order Lovelock gravity in higher dimensions (with K odd) for special tuned Lovelock coupling constants. It corresponds to a "merger" of two swallowtails and is characterized by non-standard critical exponents. Upon employing a recent proposal for assigning a topological charge to thermodynamic critical points, we argue that the isolated critical point offers an interpretation corresponding to the onset of a topological phase transition of a "vortex/anti-vortex pair".
Submitted 5 July, 2022;
originally announced July 2022.
-
Inverse Laplace transform based on Widder's method for Tsallis exponential
Authors:
S. S. Naina Mohammed,
K. Jeevanandham,
A. Basherrudin Mahmud Ahmed,
Md. Manirul Ali,
R. Chandrashekar
Abstract:
A generalization of the Laplace transform based on the generalized Tsallis $q$-exponential is given in the present work for a new type of kernel. We also define the inverse transform for this generalized transform based on the complex integration method. We prove identities corresponding to the Laplace transform and inverse transform like the $q$-convolution theorem, the action of generalized derivative and generalized integration on the Laplace transform. We then derive a $q$-generalization of the inverse Laplace transform based on the Post-Widder's method which bypasses the necessity for a complex contour integration. We demonstrate the usefulness of this in computing the Laplace and inverse Laplace transform of some elementary functions. Finally we use the Post-Widder's method based inverse Laplace transform to compute the density of states from the partition function for the case of a generalized classical ideal gas and linear harmonic oscillator in $D$-dimensions.
Submitted 11 May, 2022; v1 submitted 7 May, 2022;
originally announced May 2022.
-
Quantifiable Assurance: From IPs to Platforms
Authors:
Bulbul Ahmed,
Md Kawser Bepary,
Nitin Pundir,
Mike Borza,
Oleg Raikhman,
Amit Garg,
Dale Donchin,
Adam Cron,
Mohamed A Abdel-moneum,
Farimah Farahmandi,
Fahim Rahman,
Mark Tehranipoor
Abstract:
Hardware vulnerabilities are generally considered more difficult to fix than software ones because they are persistent after fabrication. Thus, it is crucial to assess the security and fix the vulnerabilities at earlier design phases, such as Register Transfer Level (RTL) and gate level. The focus of the existing security assessment techniques is mainly twofold. First, they check the security of Intellectual Property (IP) blocks separately. Second, they aim to assess the security against individual threats considering the threats are orthogonal. We argue that IP-level security assessment is not sufficient. Eventually, the IPs are placed in a platform, such as a system-on-chip (SoC), where each IP is surrounded by other IPs connected through glue logic and shared/private buses. Hence, we must develop a methodology to assess the platform-level security by considering both the IP-level security and the impact of the additional parameters introduced during platform integration. Another important factor to consider is that the threats are not always orthogonal. Improving security against one threat may affect the security against other threats. Hence, to build a secure platform, we must first answer the following questions: What additional parameters are introduced during the platform integration? How do we define and characterize the impact of these parameters on security? How do the mitigation techniques of one threat impact others? This paper aims to answer these important questions and proposes techniques for quantifiable assurance by quantitatively estimating and measuring the security of a platform at the pre-silicon stages. We also touch upon the term security optimization and present the challenges for future research directions.
Submitted 16 April, 2022;
originally announced April 2022.
-
From Concept Drift to Model Degradation: An Overview on Performance-Aware Drift Detectors
Authors:
Firas Bayram,
Bestoun S. Ahmed,
Andreas Kassler
Abstract:
The dynamicity of real-world systems poses a significant challenge to deployed predictive machine learning (ML) models. Changes in the system on which the ML model has been trained may lead to performance degradation during the system's life cycle. Recent advances that study non-stationary environments have mainly focused on identifying and addressing such changes caused by a phenomenon called concept drift. Different terms have been used in the literature to refer to the same type of concept drift and the same term for various types. This lack of unified terminology is set out to create confusion on distinguishing between different concept drift variants. In this paper, we start by grouping concept drift types by their mathematical definitions and survey the different terms used in the literature to build a consolidated taxonomy of the field. We also review and classify performance-based concept drift detection methods proposed in the last decade. These methods utilize the predictive model's performance degradation to signal substantial changes in the systems. The classification is outlined in a hierarchical diagram to provide an orderly navigation between the methods. We present a comprehensive analysis of the main attributes and strategies for tracking and evaluating the model's performance in the predictive system. The paper concludes by discussing open research challenges and possible research directions.
Submitted 21 March, 2022;
originally announced March 2022.
-
Overview of Test Coverage Criteria for Test Case Generation from Finite State Machines Modelled as Directed Graphs
Authors:
Vaclav Rechtberger,
Miroslav Bures,
Bestoun S. Ahmed
Abstract:
Test Coverage criteria are an essential concept for test engineers when generating the test cases from a System Under Test model. They are routinely used in test case generation for user interfaces, middleware, and back-end system parts for software, electronics, or Internet of Things (IoT) systems. Test Coverage criteria define the number of actions or combinations by which a system is tested, informally determining a potential "strength" of a test set. As no previous study summarized all commonly used test coverage criteria for Finite State Machines and comprehensively discussed them regarding their subsumption, equivalence, or non-comparability, this paper provides this overview. In this study, 14 most common test coverage criteria and seven of their synonyms for Finite State Machines defined via a directed graph are summarized and compared. The results give researchers and industry testing engineers a helpful overview when setting a software-based or IoT system test strategy.
Submitted 17 March, 2022;
originally announced March 2022.
-
Prioritized Variable-length Test Cases Generation for Finite State Machines
Authors:
Vaclav Rechtberger,
Miroslav Bures,
Bestoun S. Ahmed,
Youcef Belkhier,
Jiri Nema,
Hynek Schvach
Abstract:
Model-based Testing (MBT) is an effective approach for testing when parts of a system-under-test have the characteristics of a finite state machine (FSM). Despite various strategies in the literature on this topic, little work exists to handle special testing situations. More specifically, when concurrently: (1) the test paths can start and end only in defined states of the FSM, (2) a prioritization mechanism that requires only defined states and transitions of the FSM to be visited by test cases is required, and (3) the test paths must be in a given length range, not necessarily of explicit uniform length. This paper presents a test generation strategy that satisfies all these requirements. A concurrent combination of these requirements is highly practical for real industrial testing. Six variants of possible algorithms to implement this strategy are described. Using a mixture of 180 problem instances from real automotive and defense projects and artificially generated FSMs, all variants are compared with a baseline strategy based on an established N-switch coverage concept modification. Various properties of the generated test paths and their potential to activate fictional defects defined in FSMs are evaluated. The presented strategy outperforms the baseline in most problem configurations. Out of the six analyzed variants, three give the best results even though a universal best performer is hard to identify. Depending on the application of the FSM, the strategy and evaluation presented in this paper are applicable both in testing functional and non-functional software requirements.
Submitted 3 April, 2022; v1 submitted 17 March, 2022;
originally announced March 2022.
-
Combinatorial results for order-preserving partial injective contraction mappings
Authors:
Bayo Musa Ahmed,
Nadia Aldhamri,
Fatma Al-Kharousi,
Georg Klein,
Abdullahi Umar
Abstract:
Let $ \mathcal{I}_n$ be the symmetric inverse semigroup on $X_n = \{1, 2, \ldots , n\}$. Let $\mathcal{OCI}_n$ be the subsemigroup of $\mathcal{I}_n$ consisting of all order-preserving injective partial contraction mappings, and let $\mathcal{ODCI}_n$ be the subsemigroup of $\mathcal{I}_n$ consisting of all order-preserving and order-decreasing injective partial contraction mappings of $X_n$. In this paper, we investigate the cardinalities of some equivalences on $\mathcal{OCI}_n$ and $\mathcal{ODCI}_n$ which lead naturally to obtaining the order of these semigroups. Then, we relate the formulae obtained to Fibonacci numbers. Similar results about $\mathcal{ORCI}_n$, the semigroup of order-preserving or order-reversing injective partial contraction mappings, are deduced.
Submitted 12 March, 2022;
originally announced March 2022.
-
Using Deep Reinforcement Learning for Zero Defect Smart Forging
Authors:
Yunpeng Ma,
Andreas Kassler,
Bestoun S. Ahmed,
Pavel Krakhmalev,
Andreas Thore,
Arash Toyser,
Hans Lindback
Abstract:
Defects during production may lead to material waste, which is a significant challenge for many companies as it reduces revenue and negatively impacts sustainability and the environment. An essential reason for material waste is a low degree of automation, especially in industries that currently have a low degree of digitalization, such as steel forging. Those industries typically rely on heavy and old machinery such as large induction ovens that are mostly controlled manually or using well-known recipes created by experts. However, standard recipes may fail when unforeseen events happen, such as an unplanned stop in production, which may lead to overheating and thus material degradation during the forging process. In this paper, we develop a digital twin-based optimization strategy for the heating process for a forging line to automate the development of an optimal control policy that adjusts the power for the heating coils in an induction oven based on temperature data observed from pyrometers. We design a digital twin-based deep reinforcement learning (DTRL) framework and train two different deep reinforcement learning (DRL) models for the heating phase using a digital twin of the forging line. The twin is based on a simulator that contains a heating transfer and movement model, which is used as an environment for the DRL training. Our evaluation shows that both models significantly reduce the temperature unevenness and can help to automate the traditional heating process.
Submitted 25 January, 2022;
originally announced January 2022.
-
BIOPAK Flasher: Epidemic disease monitoring and detection in Pakistan using text mining
Authors:
Muhammad Nasir,
Maheen Bakhtyar,
Junaid Baber,
Sadia Lakho,
Bilal Ahmed,
Waheed Noor
Abstract:
Infectious disease outbreaks have a significant impact on morbidity and mortality and can cause economic instability in many countries. As global trade is growing, goods and individuals are expected to travel across the border, and an infected epidemic area carrier can pose a great danger to his hostile. If a disease outbreak is recognized promptly, then commercial products and travelers (traders/visitors) will be effectively vaccinated, and therefore the disease stopped. Early detection of outbreaks plays an important role here, and beware of the rapid implementation of control measures by citizens, public health organizations, and government. Many indicators, such as online news sources (RSS) and social media sources (Twitter, Facebook), hold valuable information that can be used to extract information about disease outbreaks, but they are unstructured and bulky. Few early warning outbreak systems exist, and those have limitations in language (Urdu) and coverage area (Pakistan). In Pakistan, few channels publish outbreak news in Urdu or English. The aim is to procure information from Pakistan's English and Urdu news channels and then investigate, process, integrate, and visualize the disease epidemic. No Urdu ontology previously existed to match extracted diseases, so we also build that ontology of disease.
Submitted 12 June, 2021;
originally announced June 2021.
-
Hybrid Henry Gas Solubility Optimization Algorithm with Dynamic Cluster-to-Algorithm Mapping for Search-based Software Engineering Problems
Authors:
Kamal Z. Zamli,
Md. Abdul Kader,
Saiful Azad,
Bestoun S. Ahmed
Abstract:
This paper discusses a new variant of the Henry Gas Solubility Optimization (HGSO) Algorithm, called Hybrid HGSO (HHGSO). Unlike its predecessor, HHGSO allows multiple clusters, each serving a different meta-heuristic algorithm (i.e., with its own defined parameters and local best), to coexist within the same population. Exploiting dynamic cluster-to-algorithm mapping via a penalize-and-reward model with an adaptive switching factor, HHGSO offers a novel approach to meta-heuristic hybridization consisting of the Jaya Algorithm, Sooty Tern Optimization Algorithm, Butterfly Optimization Algorithm, and Owl Search Algorithm. Results from two selected case studies (the team formation problem and combinatorial test suite generation) indicate that the hybridization notably improves the performance of HGSO and outperforms other competing meta-heuristic and hyper-heuristic algorithms.
Submitted 31 May, 2021;
originally announced May 2021.
-
Data-driven Design of Context-aware Monitors for Hazard Prediction in Artificial Pancreas Systems
Authors:
Xugui Zhou,
Bulbul Ahmed,
James H. Aylor,
Philip Asare,
Homa Alemzadeh
Abstract:
Medical Cyber-physical Systems (MCPS) are vulnerable to accidental or malicious faults that can target their controllers and cause safety hazards and harm to patients. This paper proposes a combined model and data-driven approach for designing context-aware monitors that can detect early signs of hazards and mitigate them in MCPS. We present a framework for formal specification of unsafe system context using Signal Temporal Logic (STL) combined with an optimization method for patient-specific refinement of STL formulas based on real or simulated faulty data from the closed-loop system for the generation of monitor logic. We evaluate our approach in simulation using two state-of-the-art closed-loop Artificial Pancreas Systems (APS). The results show the context-aware monitor achieves up to 1.4 times increase in average hazard prediction accuracy (F1-score) over several baseline monitors, reduces false-positive and false-negative rates, and enables hazard mitigation with a 54% success rate while decreasing the average risk for patients.
Submitted 13 April, 2021; v1 submitted 6 April, 2021;
originally announced April 2021.