-
On the Design Space Between Transformers and Recursive Neural Nets
Authors:
Jishnu Ray Chowdhury,
Cornelia Caragea
Abstract:
In this paper, we study two classes of models, Recursive Neural Networks (RvNNs) and Transformers, and show that a tight connection between them emerges from the development of two recent models - Continuous Recursive Neural Networks (CRvNN) and Neural Data Routers (NDR). On one hand, CRvNN pushes the boundaries of traditional RvNN, relaxing its discrete structure-wise composition and ending up with a Transformer-like structure. On the other hand, NDR constrains the original Transformer to induce better structural inductive bias, ending up with a model that is close to CRvNN. Both models, CRvNN and NDR, show strong performance in algorithmic tasks and generalization in which simpler forms of RvNNs and Transformers fail. We explore these "bridge" models in the design space between RvNNs and Transformers, formalize their tight connections, discuss their limitations, and propose ideas for future research.
Submitted 2 September, 2024;
originally announced September 2024.
-
A Systematic Literature Review on the Use of Blockchain Technology in Transition to a Circular Economy
Authors:
Ishmam Abid,
S. M. Zuhayer Anzum Fuad,
Mohammad Jabed Morshed Chowdhury,
Mehruba Sharmin Chowdhury,
Md Sadek Ferdous
Abstract:
The circular economy has the potential to increase resource efficiency and minimize waste through the 4R framework of reducing, reusing, recycling, and recovering. Blockchain technology is currently considered a valuable aid in the transition to a circular economy. Its decentralized and tamper-resistant nature enables the construction of transparent and secure supply chain management systems, thereby improving product accountability and traceability. However, the full potential of blockchain technology in circular economy models will not be realized until a number of concerns, including scalability, interoperability, data protection, and regulatory and legal issues, are addressed. More research and stakeholder participation are required to overcome these limitations and achieve the benefits of blockchain technology in promoting a circular economy. This article presents a systematic literature review (SLR) that identified industry use cases for blockchain-driven circular economy models and offered architectures to minimize resource consumption, prices, and inefficiencies while encouraging the reuse, recycling, and recovery of end-of-life products. Three main outcomes emerged from our review of 41 documents, which included scholarly publications, Twitter-linked information, and Google results: the relationship between blockchain and the 4R framework for circular economy; a discussion of the terminology and various forms of blockchain and circular economy; and identification of the challenges and obstacles that blockchain technology may face in enabling a circular economy. This research shows how blockchain technology can help with the transition to a circular economy. Yet, it emphasizes the importance of additional study and stakeholder participation to overcome potential hurdles and obstacles in implementing blockchain-driven circular economy models.
Submitted 21 August, 2024;
originally announced August 2024.
-
Prototype Smart Home Environment With Biofeedback
Authors:
Azmyin Md. Kamal,
Mushfiqul Azad,
Sumayia Jerin Chowdhury
Abstract:
In this paper we present a prototype of a smart home system which can actuate different peripherals based on the emotional "arousal" level of a user. The system comprises two embedded subsystems named "Wearable" and "Benchtop" which communicate with one another over the UDP/IP protocol. The Wearable unit can differentiate the emotional arousal into three distinct classes (Normal, Medium and High) based on physiological data whilst the Benchtop unit can display different colors on a 16 digit NEOPIXEL ring and play tones to emulate actuation of peripheral devices in the smart home environment. Experiments with three video clips were performed which showed that the system can classify emotional arousal with an average accuracy of 41%. An FSM model of the Benchtop unit was created using Ptolemy II which showed the model to be fully deterministic and robust to communication disruption between the two units. The proposed project will add a new paradigm in smart home and IoT research by incorporating emotional feedback to automatically adjust the indoor environment for greater comfort, ease of living and in-home assisted ambulatory care for the residents.
Submitted 28 July, 2024;
originally announced July 2024.
-
EarlyMalDetect: A Novel Approach for Early Windows Malware Detection Based on Sequences of API Calls
Authors:
Pascal Maniriho,
Abdun Naser Mahmood,
Mohammad Jabed Morshed Chowdhury
Abstract:
In this work, we propose EarlyMalDetect, a novel approach for early Windows malware detection based on sequences of API calls. Our approach leverages generative transformer models and attention-guided deep recurrent neural networks to accurately identify and detect patterns of malicious behaviors in the early stage of malware execution. By analyzing the sequences of API calls invoked during execution, the proposed approach can classify executable files (programs) as malware or benign by predicting their behaviors based on a few shots (initial API calls) invoked during execution. EarlyMalDetect can predict and reveal what a malware program is going to perform on the target system before it occurs, which can help to stop it before executing its malicious payload and infecting the system. Specifically, EarlyMalDetect relies on a fine-tuned transformer model based on API calls which has the potential to predict the next API call functions to be used by a malware or benign executable program. Our extensive experimental evaluations show that the proposed approach is highly effective in predicting malware behaviors and can be used as a preventive measure against zero-day threats in Windows systems.
Submitted 18 July, 2024;
originally announced July 2024.
-
Deep Attention Driven Reinforcement Learning (DAD-RL) for Autonomous Decision-Making in Dynamic Environment
Authors:
Jayabrata Chowdhury,
Venkataramanan Shivaraman,
Sumit Dangi,
Suresh Sundaram,
P. B. Sujit
Abstract:
Autonomous Vehicle (AV) decision making in urban environments is inherently challenging due to the dynamic interactions with surrounding vehicles. For safe planning, the AV must understand the weightage of various spatiotemporal interactions in a scene. Contemporary works use colossal transformer architectures to encode interactions mainly for trajectory prediction, resulting in increased computational complexity. To address this issue without compromising spatiotemporal understanding and performance, we propose the simple Deep Attention Driven Reinforcement Learning (DADRL) framework, which dynamically assigns and incorporates the significance of surrounding vehicles into the ego's RL driven decision making process. We introduce an AV centric spatiotemporal attention encoding (STAE) mechanism for learning the dynamic interactions with different surrounding vehicles. To understand map and route context, we employ a context encoder to extract features from context maps. The spatiotemporal representations combined with contextual encoding provide a comprehensive state representation. The resulting model is trained using the Soft Actor Critic (SAC) algorithm. We evaluate the proposed framework on the SMARTS urban benchmarking scenarios without traffic signals to demonstrate that DADRL outperforms recent state of the art methods. Furthermore, an ablation study underscores the importance of the context encoder and spatiotemporal attention encoder in achieving superior performance.
Submitted 28 September, 2024; v1 submitted 11 July, 2024;
originally announced July 2024.
-
Sample Rate Independent Recurrent Neural Networks for Audio Effects Processing
Authors:
Alistair Carson,
Alec Wright,
Jatin Chowdhury,
Vesa Välimäki,
Stefan Bilbao
Abstract:
In recent years, machine learning approaches to modelling guitar amplifiers and effects pedals have been widely investigated and have become standard practice in some consumer products. In particular, recurrent neural networks (RNNs) are a popular choice for modelling non-linear devices such as vacuum tube amplifiers and distortion circuitry. One limitation of such models is that they are trained on audio at a specific sample rate and therefore give unreliable results when operating at another rate. Here, we investigate several methods of modifying RNN structures to make them approximately sample rate independent, with a focus on oversampling. In the case of integer oversampling, we demonstrate that a previously proposed delay-based approach provides high fidelity sample rate conversion whilst additionally reducing aliasing. For non-integer sample rate adjustment, we propose two novel methods and show that one of these, based on cubic Lagrange interpolation of a delay-line, provides a significant improvement over existing methods. To our knowledge, this work provides the first in-depth study into this problem.
Submitted 10 June, 2024;
originally announced June 2024.
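As a rough illustration of the last technique named in the abstract above, the following Python sketch reads a delay line at a non-integer delay using four-point (cubic) Lagrange interpolation. The function names, the newest-first buffer layout, and the tap-centering choice are assumptions made for this example. The abstract does not say how the interpolated delay line is wired into the RNN, so this shows only a generic fractional-delay read, not the authors' method.

```python
def lagrange_coeffs(delay, order=3):
    """Lagrange fractional-delay FIR coefficients for taps 0..order."""
    coeffs = []
    for k in range(order + 1):
        c = 1.0
        for m in range(order + 1):
            if m != k:
                c *= (delay - m) / (k - m)
        coeffs.append(c)
    return coeffs


def read_fractional_delay(history, delay):
    """Estimate x[n - delay] from history, where history[i] = x[n - i].

    Assumes 1 <= delay <= len(history) - 3 so that four taps are available.
    """
    base = int(delay) - 1      # first tap; keeps the local delay in [1, 2)
    local = delay - base       # delay measured from the first tap
    coeffs = lagrange_coeffs(local)
    return sum(c * history[base + k] for k, c in enumerate(coeffs))
```

Because a cubic interpolator is exact for cubic polynomials, a quick sanity check is to fill `history` with samples of `x[n] = n**3` and confirm that a read at delay 2.5 matches the polynomial evaluated 2.5 samples in the past, up to rounding error.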
-
Lens-Type Redirective Intelligent Surfaces for Multi-User MIMO Communication
Authors:
Bamelak Tadele,
Faouzi Bellili,
Amine Mezghani,
Md Jawwad Chowdhury,
Haseeb Ur Rehman
Abstract:
This paper explores the idea of using redirective reconfigurable intelligent surfaces (RedRIS) to overcome many of the challenges associated with the conventional reflective RIS. We develop a framework for jointly optimizing the switching matrix of the lens-type RedRIS ports along with the active precoding matrix at the base station (BS) and the receive scaling factor. A joint non-convex optimization problem is formulated under the minimum mean-square error (MMSE) criterion with the aim to maximize the spectral efficiency of each user. In the single-cell scenario, the optimum active precoding matrix at the multi-antenna BS and the receive scaling factor are found in closed-form by applying Lagrange optimization, while the optimal switching matrix of the lens-type RedRIS is obtained by means of a newly developed alternating optimization algorithm. We then extend the framework to the multi-cell scenario with single-antenna base stations that are aided by the same lens-type RedRIS. We further present two methods for reducing the number of effective connections of the RedRIS ports that result in appreciable overhead savings while enhancing the robustness of the system. The proposed RedRIS-based schemes are gauged against conventional reflective RIS-aided systems under both perfect and imperfect channel state information (CSI). The simulation results show the superiority of the proposed schemes in terms of overall throughput while incurring much less control overhead.
Submitted 1 June, 2024;
originally announced June 2024.
-
Blockchain-enabled Circular Economy -- Collaborative Responsibility in Solar Panel Recycling
Authors:
Mohammad Jabed Morshed Chowdhury,
Naveed Ul Hassan,
Wayes Tushar,
Dustin Niyato,
Tapan Saha,
H Vincent Poor,
Chau Yuen
Abstract:
The adoption of renewable energy resources, such as solar power, is on the rise. However, the excessive installation and lack of recycling facilities pose environmental risks. This paper suggests a circular economy approach to address the issue. By implementing blockchain technology, the end-of-life (EOL) of solar panels can be tracked, and responsibilities can be assigned to relevant stakeholders. The degradation of panels can be monetized by tracking users' energy-related activities, and these funds can be used for future recycling. A new coin, the recycling coin (RC-Coin), incentivizes solar panel recycling and utilizes decentralized finance to stabilize the coin price and supply issue.
Submitted 14 March, 2024;
originally announced March 2024.
-
Investigating Recurrent Transformers with Dynamic Halt
Authors:
Jishnu Ray Chowdhury,
Cornelia Caragea
Abstract:
In this paper, we comprehensively study the inductive biases of two major approaches to augmenting Transformers with a recurrent mechanism: (1) the approach of incorporating a depth-wise recurrence similar to Universal Transformers; and (2) the approach of incorporating a chunk-wise temporal recurrence like Temporal Latent Bottleneck. Furthermore, we propose and investigate novel ways to extend and combine the above methods - for example, we propose a global mean-based dynamic halting mechanism for Universal Transformers and an augmentation of Temporal Latent Bottleneck with elements from Universal Transformer. We compare the models and probe their inductive biases in several diagnostic tasks, such as Long Range Arena (LRA), flip-flop language modeling, ListOps, and Logical Inference. The code is released in: https://github.com/JRC1995/InvestigatingRecurrentTransformers/tree/main
Submitted 2 September, 2024; v1 submitted 1 February, 2024;
originally announced February 2024.
-
A review-based study on different Text-to-Speech technologies
Authors:
Md. Jalal Uddin Chowdhury,
Ashab Hussan
Abstract:
This research paper presents a comprehensive review-based study on various Text-to-Speech (TTS) technologies. TTS technology is an important aspect of human-computer interaction, enabling machines to convert written text into audible speech. The paper examines the different TTS technologies available, including concatenative TTS, formant synthesis TTS, and statistical parametric TTS. The study focuses on comparing the advantages and limitations of these technologies in terms of their naturalness of voice, the level of complexity of the system, and their suitability for different applications. In addition, the paper explores the latest advancements in TTS technology, including neural TTS and hybrid TTS. The findings of this research will provide valuable insights for researchers, developers, and users who want to understand the different TTS technologies and their suitability for specific applications.
Submitted 17 December, 2023;
originally announced December 2023.
-
Graph-based Prediction and Planning Policy Network (GP3Net) for scalable self-driving in dynamic environments using Deep Reinforcement Learning
Authors:
Jayabrata Chowdhury,
Venkataramanan Shivaraman,
Suresh Sundaram,
P B Sujit
Abstract:
Recent advancements in motion planning for Autonomous Vehicles (AVs) show great promise in using expert driver behaviors in non-stationary driving environments. However, learning only through expert drivers needs more generalizability to recover from domain shifts and near-failure scenarios due to the dynamic behavior of traffic participants and weather conditions. A deep Graph-based Prediction and Planning Policy Network (GP3Net) framework is proposed for non-stationary environments that encodes the interactions between traffic participants with contextual information and provides a decision for a safe maneuver for the AV. A spatio-temporal graph models the interactions between traffic participants for predicting the future trajectories of those participants. The predicted trajectories are utilized to generate a future occupancy map around the AV with uncertainties embedded to anticipate the evolving non-stationary driving environments. Then the contextual information and future occupancy maps are input to the policy network of the GP3Net framework, which is trained using the Proximal Policy Optimization (PPO) algorithm. The proposed GP3Net performance is evaluated on standard CARLA benchmarking scenarios with domain shifts of traffic patterns (urban, highway, and mixed). The results show that the GP3Net outperforms previous state-of-the-art imitation learning-based planning models for different towns. Further, in unseen new weather conditions, GP3Net completes the desired route with fewer traffic infractions. Finally, the results emphasize the advantage of including the prediction module to enhance safety measures in non-stationary environments.
Submitted 10 December, 2023;
originally announced December 2023.
-
Recursion in Recursion: Two-Level Nested Recursion for Length Generalization with Scalability
Authors:
Jishnu Ray Chowdhury,
Cornelia Caragea
Abstract:
Binary Balanced Tree RvNNs (BBT-RvNNs) enforce sequence composition according to a preset balanced binary tree structure. Thus, their non-linear recursion depth is just $\log_2 n$ ($n$ being the sequence length). Such logarithmic scaling makes BBT-RvNNs efficient and scalable on long sequence tasks such as Long Range Arena (LRA). However, such computational efficiency comes at a cost because BBT-RvNNs cannot solve simple arithmetic tasks like ListOps. On the flip side, RvNNs (e.g., Beam Tree RvNN) that do succeed on ListOps (and other structure-sensitive tasks like formal logical inference) are generally several times more expensive than even RNNs. In this paper, we introduce a novel framework, Recursion in Recursion (RIR), to strike a balance between the two sides - getting some of the benefits from both worlds. In RIR, we use a form of two-level nested recursion - where the outer recursion is a $k$-ary balanced tree model with another recursive model (inner recursion) implementing its cell function. For the inner recursion, we choose Beam Tree RvNNs (BT-RvNN). To adjust BT-RvNNs within RIR we also propose a novel strategy of beam alignment. Overall, this entails that the total recursive depth in RIR is upper-bounded by $k \log_k n$. Our best RIR-based model is the first model that demonstrates high ($\geq 90\%$) length-generalization performance on ListOps while at the same time being scalable enough to be trainable on long sequence inputs from LRA. Moreover, in terms of accuracy in the LRA language tasks, it performs competitively with Structured State Space Models (SSMs) without any special initialization - outperforming Transformers by a large margin. On the other hand, while SSMs can marginally outperform RIR on LRA, they (SSMs) fail to length-generalize on ListOps. Our code is available at: https://github.com/JRC1995/BeamRecursionFamily/.
Submitted 7 November, 2023;
originally announced November 2023.
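The depth figures quoted in the abstract above can be made concrete with a small Python sketch. It counts the levels of a balanced $k$-ary reduction over $n$ items and multiplies by $k$ to get the stated $k \log_k n$ bound. The function names are hypothetical, and rounding the logarithm up to whole levels is an assumption of this example, not something the abstract specifies.

```python
def balanced_tree_depth(n, k=2):
    """Number of levels needed to reduce n items to one, merging k at a time."""
    if k < 2:
        raise ValueError("k must be at least 2")
    depth = 0
    while n > 1:
        n = -(-n // k)   # ceiling division: groups of up to k items
        depth += 1
    return depth


def rir_depth_bound(n, k):
    """Upper bound k * log_k(n) on total recursion depth, per the abstract."""
    return k * balanced_tree_depth(n, k)
```

For a sequence of length 1024, `balanced_tree_depth(1024)` gives 10 levels for the binary balanced tree, and `rir_depth_bound(1024, 4)` gives 20 for a 4-ary outer recursion under these assumptions.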
-
Efficient Beam Tree Recursion
Authors:
Jishnu Ray Chowdhury,
Cornelia Caragea
Abstract:
Beam Tree Recursive Neural Network (BT-RvNN) was recently proposed as a simple extension of Gumbel Tree RvNN and it was shown to achieve state-of-the-art length generalization performance in ListOps while maintaining comparable performance on other tasks. However, although not the worst of its kind, BT-RvNN can still be exorbitantly expensive in memory usage. In this paper, we identify the main bottleneck in BT-RvNN's memory usage to be the entanglement of the scorer function and the recursive cell function. We propose strategies to remove this bottleneck and further simplify its memory usage. Overall, our strategies not only reduce the memory usage of BT-RvNN by $10$-$16$ times but also create a new state-of-the-art in ListOps while maintaining similar performance in other tasks. In addition, we also propose a strategy to utilize the induced latent-tree node representations produced by BT-RvNN to turn BT-RvNN from a sentence encoder of the form $f:\mathbb{R}^{n \times d} \rightarrow \mathbb{R}^{d}$ into a sequence contextualizer of the form $f:\mathbb{R}^{n \times d} \rightarrow \mathbb{R}^{n \times d}$. Thus, our proposals not only open up a path for further scalability of RvNNs but also standardize a way to use BT-RvNNs as another building block in the deep learning toolkit that can be easily stacked or interfaced with other popular models such as Transformers and Structured State Space models.
Submitted 7 November, 2023; v1 submitted 20 July, 2023;
originally announced July 2023.
-
Managing health insurance using blockchain technology
Authors:
Tajkia Nuri Ananna,
Munshi Saifuzzaman,
Mohammad Jabed Morshed Chowdhury,
Md Sadek Ferdous
Abstract:
Health insurance plays a significant role in ensuring quality healthcare. In response to the escalating costs of the medical industry, the demand for health insurance is soaring. Additionally, those with health insurance are more likely to receive preventative care than those without health insurance. However, from granting health insurance to delivering services to insured individuals, the health insurance industry faces numerous obstacles. Fraudulent actions, false claims, a lack of transparency and data privacy, reliance on human effort and dishonesty from consumers, healthcare professionals, or even the insurer party itself, are the most common and important hurdles towards success. Given these constraints, this chapter briefly covers the most immediate concerns in the health insurance industry and provides insight into how blockchain technology integration can contribute to resolving these issues. This chapter finishes by highlighting existing limitations as well as potential future directions.
Submitted 17 June, 2023;
originally announced June 2023.
-
Predictive Maneuver Planning with Deep Reinforcement Learning (PMP-DRL) for comfortable and safe autonomous driving
Authors:
Jayabrata Chowdhury,
Vishruth Veerendranath,
Suresh Sundaram,
Narasimhan Sundararajan
Abstract:
This paper presents a Predictive Maneuver Planning with Deep Reinforcement Learning (PMP-DRL) model for maneuver planning. Traditional rule-based maneuver planning approaches often struggle to handle the variabilities of real-world driving scenarios. By learning from its experience, a Reinforcement Learning (RL)-based driving agent can adapt to changing driving conditions and improve its performance over time. Our proposed approach combines a predictive model and an RL agent to plan for comfortable and safe maneuvers. The predictive model is trained using historical driving data to predict the future positions of other surrounding vehicles. The surrounding vehicles' past and predicted future positions are embedded in context-aware grid maps. At the same time, the RL agent learns to make maneuvers based on this spatio-temporal context information. Performance evaluation of PMP-DRL has been carried out using simulated environments generated from publicly available NGSIM US101 and I80 datasets. The training sequence shows the continuous improvement in the driving experiences. It shows that the proposed PMP-DRL can learn the trade-off between safety and comfortability. The decisions generated by the recent imitation learning-based model are compared with the proposed PMP-DRL for unseen scenarios. The results clearly show that PMP-DRL can handle complex real-world scenarios and make more comfortable and safer maneuver decisions than rule-based and imitative models.
Submitted 15 June, 2023;
originally announced June 2023.
-
Monotonic Location Attention for Length Generalization
Authors:
Jishnu Ray Chowdhury,
Cornelia Caragea
Abstract:
We explore different ways to utilize position-based cross-attention in seq2seq networks to enable length generalization in algorithmic tasks. We show that a simple approach of interpolating the original and reversed encoded representations combined with relative attention allows near-perfect length generalization for both forward and reverse lookup tasks or copy tasks that had been generally hard to tackle. We also devise harder diagnostic tasks where the relative distance of the ideal attention position varies with timestep. In such settings, the simple interpolation trick with relative attention is not sufficient. We introduce novel variants of location attention building on top of Dubois et al. (2020) to address the new diagnostic tasks. We also show the benefits of our approaches for length generalization in SCAN (Lake & Baroni, 2018) and CFQ (Keysers et al., 2020). Our code is available on GitHub.
Submitted 31 May, 2023;
originally announced May 2023.
-
Beam Tree Recursive Cells
Authors:
Jishnu Ray Chowdhury,
Cornelia Caragea
Abstract:
We propose Beam Tree Recursive Cell (BT-Cell) - a backpropagation-friendly framework to extend Recursive Neural Networks (RvNNs) with beam search for latent structure induction. We further extend this framework by proposing a relaxation of the hard top-k operators in beam search for better propagation of gradient signals. We evaluate our proposed models in different out-of-distribution splits in both synthetic and realistic data. Our experiments show that BT-Cell achieves near-perfect performance on several challenging structure-sensitive synthetic tasks like ListOps and logical inference while maintaining comparable performance in realistic data against other RvNN-based models. Additionally, we identify a previously unknown failure case for neural models in generalization to an unseen number of arguments in ListOps. The code is available at: https://github.com/JRC1995/BeamTreeRecursiveCells.
Submitted 20 June, 2023; v1 submitted 31 May, 2023;
originally announced May 2023.
-
Data Augmentation for Low-Resource Keyphrase Generation
Authors:
Krishna Garg,
Jishnu Ray Chowdhury,
Cornelia Caragea
Abstract:
Keyphrase generation is the task of summarizing the contents of any given article into a few salient phrases (or keyphrases). Existing works for the task mostly rely on large-scale annotated datasets, which are not easy to acquire. Very few works address the problem of keyphrase generation in low-resource settings, but they still rely on a lot of additional unlabeled data for pretraining and on automatic methods for pseudo-annotations. In this paper, we present data augmentation strategies specifically to address keyphrase generation in purely resource-constrained domains. We design techniques that use the full text of the articles to improve both present and absent keyphrase generation. We test our approach comprehensively on three datasets and show that the data augmentation strategies consistently improve the state-of-the-art performance. We release our source code at https://github.com/kgarg8/kpgen-lowres-data-aug.
Submitted 29 May, 2023;
originally announced May 2023.
-
Neural Keyphrase Generation: Analysis and Evaluation
Authors:
Tuhin Kundu,
Jishnu Ray Chowdhury,
Cornelia Caragea
Abstract:
Keyphrase generation aims at generating topical phrases from a given text either by copying from the original text (present keyphrases) or by producing new keyphrases (absent keyphrases) that capture the semantic meaning of the text. Encoder-decoder models are most widely used for this task because of their capabilities for absent keyphrase generation. However, there has been little to no analysis on the performance and behavior of such models for keyphrase generation. In this paper, we study various tendencies exhibited by three strong models: T5 (based on a pre-trained transformer), CatSeq-Transformer (a non-pretrained Transformer), and ExHiRD (based on a recurrent neural network). We analyze prediction confidence scores, model calibration, and the effect of token position on keyphrase generation. Moreover, we motivate and propose a novel metric framework, SoftKeyScore, to evaluate the similarity between two sets of keyphrases by using soft scores to account for partial matching and semantic similarity. We find that SoftKeyScore is more suitable than the standard F1 metric for evaluating two sets of given keyphrases.
Submitted 26 April, 2023;
originally announced April 2023.
-
Machine Fault Classification using Hamiltonian Neural Networks
Authors:
Jeremy Shen,
Jawad Chowdhury,
Sourav Banerjee,
Gabriel Terejanu
Abstract:
A new approach is introduced to classify faults in rotating machinery based on the total energy signature estimated from sensor measurements. The overall goal is to go beyond using black-box models and incorporate additional physical constraints that govern the behavior of mechanical systems. Observational data is used to train Hamiltonian neural networks that describe the conserved energy of the system for normal and various abnormal regimes. The estimated total energy function, in the form of the weights of the Hamiltonian neural network, serves as the new feature vector to discriminate between the faults using off-the-shelf classification models. The experimental results are obtained using the MaFaulDa database, where the proposed model yields a promising area under the curve (AUC) of $0.78$ for the binary classification (normal vs abnormal) and $0.84$ for the multi-class problem (normal, and $5$ different abnormal regimes).
Submitted 4 January, 2023;
originally announced January 2023.
-
Evaluation of Induced Expert Knowledge in Causal Structure Learning by NOTEARS
Authors:
Jawad Chowdhury,
Rezaur Rashid,
Gabriel Terejanu
Abstract:
Causal modeling provides us with powerful counterfactual reasoning and interventional mechanisms to generate predictions and reason under various what-if scenarios. However, causal discovery using observational data remains a nontrivial task due to unobserved confounding factors, finite sampling, and changes in the data distribution. These can lead to spurious cause-effect relationships. To mitigate these challenges in practice, researchers augment causal learning with known causal relations. The goal of the paper is to study the impact of expert knowledge on causal relations in the form of additional constraints used in the formulation of the nonparametric NOTEARS. We provide a comprehensive set of comparative analyses of biasing the model using different types of knowledge. We found that (i) knowledge that corrects the mistakes of the NOTEARS model can lead to statistically significant improvements, (ii) constraints on active edges have a larger positive impact on causal discovery than inactive edges, and surprisingly, (iii) the induced knowledge does not correct on average more incorrect active and/or inactive edges than expected. We also demonstrate the behavior of the model and the effectiveness of domain knowledge on a real-world dataset.
Submitted 4 January, 2023;
originally announced January 2023.
-
From Causal Pairs to Causal Graphs
Authors:
Rezaur Rashid,
Jawad Chowdhury,
Gabriel Terejanu
Abstract:
Causal structure learning from observational data remains a non-trivial task due to various factors such as finite sampling, unobserved confounding factors, and measurement errors. Constraint-based and score-based methods tend to suffer from high computational complexity due to the combinatorial nature of estimating the directed acyclic graph (DAG). Motivated by the 'Cause-Effect Pair' NIPS 2013 Workshop on Causality Challenge, in this paper, we take a different approach and generate a probability distribution over all possible graphs informed by the cause-effect pair features proposed in response to the workshop challenge. The goal of the paper is to propose new methods based on this probabilistic information and compare their performance with traditional and state-of-the-art approaches. Our experiments, on both synthetic and real datasets, show that our proposed methods not only have statistically similar or better performances than some traditional approaches but also are computationally faster.
Submitted 8 November, 2022;
originally announced November 2022.
-
DatChain -- Blockchain implementation in Data transfer for IoT Devices
Authors:
Om Rajput,
Suyash Nigam,
M. J. Chowdhury,
Kayalvizhi Jayavel
Abstract:
Currently, the IoT ecosystem is comprised of fully connected smart devices that exchange data to provide more automated, precise, and fast decisions. This idealised situation can only be accomplished if data transactions are processed efficiently and security is ensured with high scalability and practicability. The integrity of data must be maintained during the exchange or transfer of data between entities. We propose an application called DatChain that responds to the above situation. The application encrypts the data sensed by the IoT sensors and stores it in the backend; when the data is required for any purpose, it can be exchanged in a secure environment over a suitable blockchain network that can keep up with the transfer rate even under high traffic.
Submitted 3 November, 2022;
originally announced November 2022.
-
Deep Learning Models for Detecting Malware Attacks
Authors:
Pascal Maniriho,
Abdun Naser Mahmood,
Mohammad Jabed Morshed Chowdhury
Abstract:
Malware is one of the most common and severe cyber-attacks today. Malware infects millions of devices and can perform several malicious activities including mining sensitive data, encrypting data, crippling system performance, and many more. Hence, malware detection is crucial to protect our computers and mobile devices from malware attacks. Deep learning (DL) is one of the emerging and promising technologies for detecting malware. The recent high production of malware variants against desktop and mobile platforms makes DL algorithms powerful approaches for building scalable and advanced malware detection models as they can handle big datasets. This work explores current deep learning technologies for detecting malware attacks on the Windows, Linux, and Android platforms. Specifically, we present different categories of DL algorithms, network optimizers, and regularization methods. Different loss functions, activation functions, and frameworks for implementing DL models are presented. We also present feature extraction approaches and a review of recent DL-based models for detecting malware attacks on the above platforms. Furthermore, this work presents major research issues on malware detection including future directions to further advance knowledge and research in this field.
Submitted 29 January, 2024; v1 submitted 8 September, 2022;
originally announced September 2022.
-
MalDetConv: Automated Behaviour-based Malware Detection Framework Based on Natural Language Processing and Deep Learning Techniques
Authors:
Pascal Maniriho,
Abdun Naser Mahmood,
Mohammad Jabed Morshed Chowdhury
Abstract:
The popularity of Windows attracts the attention of hackers/cyber-attackers, making Windows devices the primary target of malware attacks in recent years. Several sophisticated malware variants and anti-detection methods have been significantly enhanced and as a result, traditional malware detection techniques have become less effective. This work presents MalBehavD-V1, a new behavioural dataset of Windows Application Programming Interface (API) calls extracted from benign and malware executable files using the dynamic analysis approach. In addition, we present MalDetConv, a new automated behaviour-based framework for detecting both existing and zero-day malware attacks. MalDetConv uses a text processing-based encoder to transform features of API calls into a suitable format supported by deep learning models. It then uses a hybrid of convolutional neural network (CNN) and bidirectional gated recurrent unit (CNN-BiGRU) automatic feature extractor to select high-level features of the API calls which are then fed to a fully connected neural network module for malware classification. MalDetConv also uses an explainable component that reveals features that contributed to the final classification outcome, helping the decision-making process for security analysts. The performance of the proposed framework is evaluated using our MalBehavD-V1 dataset and other benchmark datasets. The detection results demonstrate the effectiveness of MalDetConv over the state-of-the-art techniques with detection accuracy of 96.10%, 95.73%, 98.18%, and 99.93% achieved while detecting unseen malware from MalBehavD-V1, Allan and John, Brazilian, and Ki-D datasets, respectively. The experimental results show that MalDetConv is highly accurate in detecting both known and zero-day malware attacks on Windows devices.
Submitted 7 September, 2022;
originally announced September 2022.
-
An efficient Deep Spatio-Temporal Context Aware decision Network (DST-CAN) for Predictive Manoeuvre Planning
Authors:
Jayabrata Chowdhury,
Suresh Sundaram,
Nishanth Rao,
Narasimhan Sundararajan
Abstract:
To ensure the safety and efficiency of its maneuvers, an Autonomous Vehicle (AV) should anticipate the future intentions of surrounding vehicles using its sensor information. If an AV can predict its surrounding vehicles' future trajectories, it can make safe and efficient manoeuvre decisions. In this paper, we present such a Deep Spatio-Temporal Context-Aware decision Network (DST-CAN) model for predictive manoeuvre planning of AVs. A memory neuron network is used to predict future trajectories of its surrounding vehicles. The driving environment's spatio-temporal information (past, present, and predicted future trajectories) is embedded into a context-aware grid. The proposed DST-CAN model employs these context-aware grids as inputs to a convolutional neural network to understand the spatial relationships between the vehicles and determine a safe and efficient manoeuvre decision. The DST-CAN model also uses information on human driving behavior on a highway. Performance evaluation of DST-CAN has been carried out using two publicly available datasets, NGSIM US-101 and I-80. Also, rule-based ground truth decisions have been compared with those generated by DST-CAN. The results clearly show that DST-CAN can make much better decisions with 3 seconds of predicted trajectories of neighboring vehicles compared to currently existing methods that do not use this prediction.
Submitted 8 July, 2024; v1 submitted 20 May, 2022;
originally announced May 2022.
-
On the Evaluation of Answer-Agnostic Paragraph-level Multi-Question Generation
Authors:
Jishnu Ray Chowdhury,
Debanjan Mahata,
Cornelia Caragea
Abstract:
We study the task of predicting a set of salient questions from a given paragraph without any prior knowledge of the precise answer. We make two main contributions. First, we propose a new method to evaluate a set of predicted questions against the set of references by using the Hungarian algorithm to assign predicted questions to references before scoring the assigned pairs. We show that our proposed evaluation strategy has better theoretical and practical properties compared to prior methods because it can properly account for the coverage of references. Second, we compare different strategies to utilize a pre-trained seq2seq model to generate and select a set of questions related to a given paragraph. The code is available.
Submitted 11 March, 2022; v1 submitted 8 March, 2022;
originally announced March 2022.
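The assignment-then-score evaluation described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the `jaccard` similarity stands in for whatever pairwise scorer is actually used, the function names are hypothetical, and an exhaustive search over one-to-one assignments replaces the Hungarian algorithm (it finds the same optimum, but is only practical for small sets).

```python
from itertools import permutations


def jaccard(a: str, b: str) -> float:
    """Token-set overlap; a hypothetical stand-in for the pairwise scorer."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def assignment_scores(preds, refs, sim=jaccard):
    """Return (precision, recall, f1) under the best one-to-one assignment.

    Assumes `sim` is symmetric. Brute force is used here instead of the
    Hungarian algorithm, so keep the sets small.
    """
    if not preds or not refs:
        return 0.0, 0.0, 0.0
    # Assign every item of the smaller set to a distinct item of the larger one.
    small, large = (preds, refs) if len(preds) <= len(refs) else (refs, preds)
    best = max(
        sum(sim(small[i], large[j]) for i, j in enumerate(perm))
        for perm in permutations(range(len(large)), len(small))
    )
    # Unmatched predictions lower precision; unmatched references lower recall.
    precision = best / len(preds)
    recall = best / len(refs)
    f1 = 2 * precision * recall / (precision + recall) if best else 0.0
    return precision, recall, f1
```

In this sketch a reference that receives no prediction contributes nothing to the matched total, so recall drops, which is one way to make the score sensitive to reference coverage.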
-
BlockMeter: An Application Agnostic Performance Measurement Framework For Private Blockchain Platforms
Authors:
Ifteher Alom,
Md Sadek Ferdous,
Mohammad Jabed Morshed Chowdhury
Abstract:
Blockchain Technology is an emerging technology with the potential to disrupt a number of application domains. Though blockchain platforms like Bitcoin and Ethereum have seen immense success and acceptability, their public and anonymous nature makes them unsuitable for many enterprise-level use-cases. To address this issue, the Linux Foundation has started an open source umbrella initiative, known as the Hyperledger Platforms. Under this initiative, a number of private blockchain platforms have been developed which can be used for different enterprise-level applications. However, the scalability and performance of these private blockchains must be examined to understand their suitability for different use-cases. Recent research and projects on performance benchmarking for private blockchain systems are very specific to use-cases and are generally tied to a blockchain platform. In this article, we present BlockMeter, an application agnostic performance benchmarking framework for private blockchain platforms. This framework can be utilised to measure the key performance metrics of any application deployed on top of an external private blockchain application in real-time. In this article, we present the architecture of the framework and discuss its different implementation aspects. Then, to showcase the applicability of the framework, we use BlockMeter to evaluate the two most widely used Hyperledger platforms, Hyperledger Fabric and Hyperledger Sawtooth, against a number of use-cases.
Submitted 11 February, 2022;
originally announced February 2022.
-
Novelty Controlled Paraphrase Generation with Retrieval Augmented Conditional Prompt Tuning
Authors:
Jishnu Ray Chowdhury,
Yong Zhuang,
Shuyi Wang
Abstract:
Paraphrase generation is a fundamental and long-standing task in natural language processing. In this paper, we concentrate on two contributions to the task: (1) we propose Retrieval Augmented Prompt Tuning (RAPT) as a parameter-efficient method to adapt large pre-trained language models for paraphrase generation; (2) we propose Novelty Conditioned RAPT (NC-RAPT) as a simple model-agnostic method of using specialized prompt tokens for controlled paraphrase generation with varying levels of lexical novelty. By conducting extensive experiments on four datasets, we demonstrate the effectiveness of the proposed approaches for retaining the semantic content of the original text while inducing lexical novelty in the generation.
Submitted 12 March, 2022; v1 submitted 1 February, 2022;
originally announced February 2022.
-
Keyphrase Generation Beyond the Boundaries of Title and Abstract
Authors:
Krishna Garg,
Jishnu Ray Chowdhury,
Cornelia Caragea
Abstract:
Keyphrase generation aims at generating important phrases (keyphrases) that best describe a given document. In scholarly domains, current approaches have largely used only the title and abstract of the articles to generate keyphrases. In this paper, we comprehensively explore whether the integration of additional information from the full text of a given article or from semantically similar articles can be helpful for a neural keyphrase generation model or not. We discover that adding sentences from the full text, particularly in the form of the extractive summary of the article, can significantly improve the generation of both types of keyphrases, those present in the text and those absent from it. Experimental results with three widely used models for keyphrase generation, along with one of the latest transformer models suitable for longer documents, Longformer Encoder-Decoder (LED), validate the observation. We also present a new large-scale scholarly dataset FullTextKP for keyphrase generation. Unlike prior large-scale datasets, FullTextKP includes the full text of the articles along with the title and abstract. We release the source code at https://github.com/kgarg8/FullTextKP.
Submitted 20 October, 2022; v1 submitted 13 December, 2021;
originally announced December 2021.
-
KPDrop: Improving Absent Keyphrase Generation
Authors:
Jishnu Ray Chowdhury,
Seoyeon Park,
Tuhin Kundu,
Cornelia Caragea
Abstract:
Keyphrase generation is the task of generating phrases (keyphrases) that summarize the main topics of a given document. Keyphrases can be either present or absent from the given document. While the extraction of present keyphrases has received much attention in the past, only recently a stronger focus has been placed on the generation of absent keyphrases. However, generating absent keyphrases is challenging; even the best methods show only a modest degree of success. In this paper, we propose a model-agnostic approach called keyphrase dropout (or KPDrop) to improve absent keyphrase generation. In this approach, we randomly drop present keyphrases from the document and turn them into artificial absent keyphrases during training. We test our approach extensively and show that it consistently improves the absent performance of strong baselines in both supervised and resource-constrained semi-supervised settings.
Submitted 24 October, 2022; v1 submitted 2 December, 2021;
originally announced December 2021.
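The keyphrase dropout idea in this abstract (randomly dropping present keyphrases from the document so that they become artificial absent keyphrases during training) can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the function name `kp_drop`, the `drop_rate` default, the `<mask>` placeholder, and the use of case-sensitive substring matching to decide presence are all hypothetical choices.

```python
import random


def kp_drop(document, keyphrases, drop_rate=0.7, mask="<mask>", rng=None):
    """Return (new_document, present, absent) after random keyphrase dropout.

    Assumptions for this sketch: presence is a plain substring check, and a
    dropped keyphrase is replaced by `mask` rather than deleted outright.
    """
    rng = rng or random.Random()
    present, absent = [], []
    for kp in keyphrases:
        if kp in document:
            if rng.random() < drop_rate:
                # Remove every occurrence, turning kp into an artificial absent target.
                document = document.replace(kp, mask)
                absent.append(kp)
            else:
                present.append(kp)
        else:
            # Keyphrases that were never in the text stay absent.
            absent.append(kp)
    return document, present, absent
```

Because the dropout is re-sampled on every call, applying it per training example per epoch would expose the model to different present/absent splits of the same document.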
-
Blockchain-based Covid Vaccination Registration and Monitoring
Authors:
Shirajus Salekin Nabil,
Md. Sabbir Alam Pran,
Ali Abrar Al Haque,
Narayan Ranjan Chakraborty,
Mohammad Jabed Morshed Chowdhury,
Md Sadek Ferdous
Abstract:
Covid-19 (SARS-CoV-2) has changed almost all aspects of our lives. Governments around the world have imposed lockdowns to slow down transmission. In the meantime, researchers worked hard to find a vaccine. Fortunately, we have found one, in fact a good number of them. However, managing the testing and vaccination process for the total population is a mammoth job. Multiple government and private sector organisations are working together to ensure proper testing and vaccination. However, there are always delays or data silo problems in multi-organisational work. Therefore, streamlining this process is vital to improve efficiency and save more lives. It has already been shown that technology, including blockchain, has a significant impact on the health sector. Blockchain provides a distributed system along with greater privacy, transparency and authenticity. In this article, we present a blockchain-based system that seamlessly integrates testing and vaccination systems, allowing the overall system to be transparent. The instant verification of any tamper-proof result and a transparent and efficient vaccination system have been exhibited and implemented in the research. We have also implemented the system as "Digital Vaccine Passport" (DVP) and analysed its performance.
Submitted 20 September, 2021;
originally announced September 2021.
-
A Systematic Literature Review on Wearable Health Data Publishing under Differential Privacy
Authors:
Munshi Saifuzzaman,
Tajkia Nuri Ananna,
Mohammad Jabed Morshed Chowdhury,
Md Sadek Ferdous,
Farida Chowdhury
Abstract:
Wearable devices generate different types of physiological data about individuals. These data can provide valuable insights for medical researchers and clinicians that cannot be obtained through traditional measures. Researchers have historically relied on survey responses or observed behavior. Interestingly, physiological data can provide richer information about user cognition than that obtained from any other source, including the users themselves. Therefore, inexpensive consumer-grade wearable devices have become a point of interest for health researchers. In addition, they are also used in continuous remote health monitoring and sometimes by insurance companies. However, the biggest concern for such use cases is the privacy of the individuals. A few privacy mechanisms, such as abstraction and k-anonymity, are widely used in information systems. Recently, Differential Privacy (DP) has emerged as a proficient technique to publish privacy-sensitive data, including data from wearable devices. In this paper, we have conducted a Systematic Literature Review (SLR) to identify, select and critically appraise research on DP as well as to understand different techniques and existing uses of DP in wearable data publishing. Based on our study, we have identified the limitations of proposed solutions and provided future directions.
Submitted 15 September, 2021;
originally announced September 2021.
-
Modeling Hierarchical Structures with Continuous Recursive Neural Networks
Authors:
Jishnu Ray Chowdhury,
Cornelia Caragea
Abstract:
Recursive Neural Networks (RvNNs), which compose sequences according to their underlying hierarchical syntactic structure, have performed well in several natural language processing tasks compared to similar models without structural biases. However, traditional RvNNs are incapable of inducing the latent structure in a plain text sequence on their own. Several extensions have been proposed to overcome this limitation. Nevertheless, these extensions tend to rely on surrogate gradients or reinforcement learning at the cost of higher bias or variance. In this work, we propose Continuous Recursive Neural Network (CRvNN) as a backpropagation-friendly alternative to address the aforementioned limitations. This is done by incorporating a continuous relaxation to the induced structure. We demonstrate that CRvNN achieves strong performance in challenging synthetic tasks such as logical inference and ListOps. We also show that CRvNN performs comparably or better than prior latent structure models on real-world tasks such as sentiment analysis and natural language inference.
Submitted 10 June, 2021;
originally announced June 2021.
-
BONIK: A Blockchain Empowered Chatbot for Financial Transactions
Authors:
Md. Saiful Islam Bhuiyan,
Abdur Razzak,
Md Sadek Ferdous,
Mohammad Jabed M. Chowdhury,
Mohammad A. Hoque,
Sasu Tarkoma
Abstract:
A chatbot is a popular platform that enables users to interact with software or a website to gather information or execute actions in an automated fashion. In recent years, chatbots have been used for executing financial transactions; however, there are a number of security issues, such as secure authentication, data integrity, system availability and transparency, that must be carefully handled for their wide-scale adoption. Recently, blockchain technology, with a number of security advantages, has emerged as one of the foundational technologies with the potential to disrupt a number of application domains, particularly in the financial sector. In this paper, we put forward the idea of integrating a chatbot with blockchain technology with a view to addressing the security issues in financial chatbots. More specifically, we present BONIK, a blockchain empowered chatbot for financial transactions, and discuss its architecture and design choices. Furthermore, we explore the developed Proof-of-Concept (PoC), evaluate its performance, and analyse how different security and privacy issues are mitigated using BONIK.
Submitted 17 November, 2020;
originally announced November 2020.
-
Modelling Attacks in Blockchain Systems using Petri Nets
Authors:
Md. Atik Shahriar,
Faisal Haque Bappy,
A. K. M. Fakhrul Hossain,
Dayamoy Datta Saikat,
Md Sadek Ferdous,
Mohammad Jabed M. Chowdhury,
Md Zakirul Alam Bhuiyan
Abstract:
Blockchain technology has evolved through many changes and modifications, such as smart contracts, since its inception in 2008. The popularity of blockchain systems is due to the significant security advantage they offer over traditional systems. However, there have been many attacks on various blockchain systems, exploiting different vulnerabilities and bugs, which have caused significant financial losses. Therefore, it is essential to understand how these attacks occur, which vulnerabilities they exploit, and what threats they expose. Another concern in this domain is the recent advancement in quantum computing, which poses a significant threat to the security of many existing secure systems, including blockchain, as it would invalidate many widely used cryptographic algorithms. Thus, it is important to examine how quantum computing will affect these or other new attacks in the future. In this paper, we explore different vulnerabilities in current blockchain systems and analyse the threats that various theoretical and practical attacks on blockchains expose. We then model those attacks using Petri nets, considering both current systems and future quantum computers.
Submitted 14 November, 2020;
originally announced November 2020.
-
A Comparison of Virtual Analog Modelling Techniques for Desktop and Embedded Implementations
Authors:
Jatin Chowdhury
Abstract:
We develop a virtual analog model of the Klon Centaur guitar pedal circuit, comparing various circuit modelling techniques. The techniques analyzed include traditional modelling techniques such as nodal analysis and Wave Digital Filters, as well as a machine learning technique using recurrent neural networks. We examine these techniques in the contexts of two use cases: an audio plug-in designed to be run on a consumer-grade desktop computer, and a guitar pedal-style effect running on an embedded device. Finally, we discuss the advantages and disadvantages of each technique for modelling different circuits and targeting different platforms.
Submitted 6 September, 2020;
originally announced September 2020.
-
Explainable Deep Modeling of Tabular Data using TableGraphNet
Authors:
Gabriel Terejanu,
Jawad Chowdhury,
Rezaur Rashid,
Asif Chowdhury
Abstract:
The vast majority of research on explainability focuses on post-explainability rather than explainable modeling. Namely, an explanation model is derived to explain a complex black box model built with the sole purpose of achieving the highest performance possible. In part, this trend might be driven by the misconception that there is a trade-off between explainability and accuracy. Furthermore, the consequential work on Shapley values, grounded in game theory, has also contributed to a new wave of post-explainability research on better approximations for various machine learning models, including deep learning models. We propose a new architecture that inherently produces explainable predictions in the form of additive feature attributions. Our approach learns a graph representation for each record in the dataset. Attribute-centric features are then derived from the graph and fed into a contribution deep set model to produce the final predictions. We show that our explainable model attains the same level of performance as black box models. Finally, we provide an augmented model training approach that leverages the missingness property and yields high levels of consistency (as required for the Shapley values) without loss of accuracy.
Submitted 12 February, 2020;
originally announced February 2020.
-
Blockchain Consensus Algorithms: A Survey
Authors:
Md Sadek Ferdous,
Mohammad Jabed Morshed Chowdhury,
Mohammad A. Hoque,
Alan Colman
Abstract:
In recent years, blockchain technology has received unparalleled attention from academia, industry, and governments all around the world. It is considered a technological breakthrough anticipated to disrupt several application domains. This has resulted in a plethora of blockchain systems for various purposes. However, many of these blockchain systems suffer from serious shortcomings related to their performance and security, which need to be addressed before any wide-scale adoption can be achieved. A crucial component of any blockchain system is its underlying consensus algorithm, which, in many ways, determines its performance and security. Therefore, to address the limitations of different blockchain systems, several existing as well as novel consensus algorithms have been introduced. A systematic analysis of these algorithms will help to understand how and why any particular blockchain performs the way it does. However, the existing studies of consensus algorithms are not comprehensive. Those studies have incomplete discussions on the properties of the algorithms and fail to analyse several major blockchain consensus algorithms in terms of their scopes. This article fills this gap by analysing a wide range of consensus algorithms using a comprehensive taxonomy of properties and by examining in detail the implications of different issues still prevalent in consensus algorithms. The results of the analysis are presented in tabular formats, which provide a visual illustration of these algorithms in a meaningful way. We have also analysed more than a hundred top crypto-currencies belonging to different categories of consensus algorithms to understand their properties and to identify different trends in these crypto-currencies. Finally, we have presented a decision tree of algorithms to be used as a tool to test the suitability of consensus algorithms under different criteria.
Submitted 7 February, 2020; v1 submitted 20 January, 2020;
originally announced January 2020.
-
On Identifying Hashtags in Disaster Twitter Data
Authors:
Jishnu Ray Chowdhury,
Cornelia Caragea,
Doina Caragea
Abstract:
Tweet hashtags have the potential to improve the search for information during disaster events. However, a large number of disaster-related tweets do not have any user-provided hashtags. Moreover, only a small number of tweets that contain actionable hashtags are useful for disaster response. To facilitate progress on automatic identification (or extraction) of disaster hashtags for Twitter data, we construct a unique dataset of disaster-related tweets annotated with hashtags useful for filtering actionable information. Using this dataset, we further investigate Long Short-Term Memory-based models within a Multi-Task Learning framework. The best performing model achieves an F1-score as high as 92.22%. The dataset, code, and other resources are available on GitHub.
Submitted 5 January, 2020;
originally announced January 2020.
-
Approximate Sampling using an Accelerated Metropolis-Hastings based on Bayesian Optimization and Gaussian Processes
Authors:
Asif J. Chowdhury,
Gabriel Terejanu
Abstract:
Markov Chain Monte Carlo (MCMC) methods have a drawback when working with a target distribution or likelihood function that is computationally expensive to evaluate, especially when working with big data. This paper focuses on the Metropolis-Hastings (MH) algorithm for unimodal distributions. Here, an enhanced MH algorithm is proposed that requires fewer expensive function evaluations, has a shorter burn-in period, and uses a better proposal distribution. The main innovations include the use of Bayesian optimization to reach the high probability region quickly, emulating the target distribution using Gaussian processes (GP), and using a Laplace approximation of the GP to build a proposal distribution that captures the underlying correlation better. The experiments show significant improvement over the regular MH. A statistical comparison between the results of the two algorithms is presented.
Submitted 21 October, 2019;
originally announced October 2019.
-
Keyphrase Extraction from Disaster-related Tweets
Authors:
Jishnu Ray Chowdhury,
Cornelia Caragea,
Doina Caragea
Abstract:
While keyphrase extraction has received considerable attention in recent years, relatively few studies exist on extracting keyphrases from social media platforms such as Twitter, and even fewer on extracting disaster-related keyphrases from such sources. During a disaster, keyphrases can be extremely useful for filtering relevant tweets that can enhance situational awareness. Previously, joint training of two different layers of a stacked Recurrent Neural Network for keyword discovery and keyphrase extraction had been shown to be effective in extracting keyphrases from general Twitter data. We improve the model's performance on both general Twitter data and disaster-related Twitter data by incorporating contextual word embeddings, POS-tags, phonetics, and phonological features. Moreover, we discuss the shortcomings of the often used F1-measure for evaluating the quality of predicted keyphrases with respect to the ground truth annotations. Instead of the F1-measure, we propose the use of embedding-based metrics to better capture the correctness of the predicted keyphrases. In addition, we present a novel extension of an embedding-based metric. The extension allows one to better control the penalty for the difference in the number of ground-truth and predicted keyphrases.
Submitted 17 October, 2019;
originally announced October 2019.
-
A Multiple Filter Based Neural Network Approach to the Extrapolation of Adsorption Energies on Metal Surfaces for Catalysis Applications
Authors:
Asif J. Chowdhury,
Wenqiang Yang,
Kareem E. Abdelfatah,
Mehdi Zare,
Andreas Heyden,
Gabriel Terejanu
Abstract:
Computational catalyst discovery involves the development of microkinetic reactor models based on estimated parameters determined from density functional theory (DFT). For complex surface chemistries, the cost of calculating the adsorption energies by DFT for a large number of reaction intermediates can become prohibitive. Here, we have identified appropriate descriptors and machine learning models that can be used to predict part of these adsorption energies given data on the rest of them. Our investigations also included the case in which the species used to train the predictive model differ in size from the species the model tries to predict - an extrapolation in the data space which is typically difficult with regular machine learning models. We have developed a neural network based predictive model that combines an established model with the concepts of a convolutional neural network and that, when extrapolating, achieves significant improvement over the previous models.
Submitted 1 October, 2019;
originally announced October 2019.
-
Immutable Autobiography of Smart Cars
Authors:
Md Sadek Ferdous,
Mohammad Jabed Morshed Chowdhury,
Kamanashis Biswas,
Niaz Chowdhury
Abstract:
The popularity of smart cars is increasing around the world as they offer a wide range of services and conveniences. These smart cars are equipped with a variety of sensors generating a large amount of data, much of which is sensitive. Besides, there are multiple parties involved in the lifespan of a smart car, such as manufacturers, car owners, government agencies, and third-party service providers, who also produce data about the vehicle. In addition to managing and sharing data amongst these entities in a secure and privacy-friendly way, which is a great challenge in itself, there exists a trust deficit about some types of data, as they remain under the custody of the car owner (e.g. satellite navigation and mileage data) and can easily be manipulated. In this paper, we propose a blockchain supported architecture enabling the owner of a smart car to create an immutable record of all data generated within its lifespan, called the autobiography of a car. We also explain how trust in this record is guaranteed by the immutability characteristic of the blockchain. Furthermore, the paper describes how the proposed architecture enables secure and privacy-friendly sharing of smart car data between different parties.
Submitted 19 October, 2018;
originally announced October 2018.
-
CAPTCHA Based on Human Cognitive Factor
Authors:
Mohammad Jabed Morshed Chowdhury,
Narayan Ranjan Chakraborty
Abstract:
A CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is an automatic security mechanism used to determine whether the user is a human or a malicious computer program. It is a program that generates and grades tests that are solvable by humans but intended to be beyond the capabilities of current computer programs. A CAPTCHA should be designed to be very easy for humans but very hard for machines. Unfortunately, existing CAPTCHA systems, while trying to maximize the difficulty for automated programs by increasing distortion or noise, have consequently made the tests very difficult for legitimate users as well. To address this issue, this paper proposes an alternative form of CAPTCHA that provides a variety of questions drawn from mathematical, logical, and general problems which only humans can understand and answer correctly in a given time. The proposed framework supports diversity in choosing the questions to be answered and offers a user-friendly interface. A user study is also conducted to judge the performance of the developed system with users of different backgrounds. The study shows the efficacy of the implemented system, with a good level of user satisfaction compared with the traditional CAPTCHAs available today.
Submitted 28 December, 2013;
originally announced December 2013.