-
Enhancing Long Context Performance in LLMs Through Inner Loop Query Mechanism
Authors:
Yimin Tang,
Yurong Xu,
Ning Yan,
Masood Mortazavi
Abstract:
Transformers have a quadratic scaling of computational complexity with input size, which limits the input context window size of large language models (LLMs) in both training and inference. Meanwhile, retrieval-augmented generation (RAG) based models can better handle longer contexts by using a retrieval system to filter out unnecessary information. However, most RAG methods only perform retrieval based on the initial query, which may not work well with complex questions that require deeper reasoning. We introduce a novel approach, Inner Loop Memory Augmented Tree Retrieval (ILM-TR), involving inner-loop queries, based not only on the query question itself but also on intermediate findings. At inference time, our model retrieves information from the RAG system, integrating data from lengthy documents at various levels of abstraction. Based on the information retrieved, the LLM generates text stored in an area named Short-Term Memory (STM), which is then used to formulate the next query. This retrieval process is repeated until the text in STM converges. Our experiments demonstrate that retrieval with STM offers improvements over traditional retrieval-augmented LLMs, particularly in long context tests such as Multi-Needle In A Haystack (M-NIAH) and BABILong.
Submitted 11 October, 2024;
originally announced October 2024.
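The inner-loop idea described in this abstract (retrieve, write to STM, form the next query from STM, stop once STM no longer changes) can be sketched as below. This is a toy illustration under stated assumptions, not the authors' implementation: `keywords`, `retrieve`, `update_stm`, and `inner_loop` are hypothetical names, retrieval is a keyword-overlap filter rather than a tree-based RAG system, and the LLM step is replaced by simply appending new facts.

```python
import re

# Hypothetical stopword list for the toy overlap score.
STOPWORDS = {"the", "is", "in", "where", "a", "all"}


def keywords(text):
    """Lowercased content words of a text (toy tokenizer)."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(query, chunks):
    """Toy retriever: every chunk sharing at least one keyword with the query."""
    q = keywords(query)
    return [c for c in chunks if q & keywords(c)]


def update_stm(stm, retrieved):
    """Stand-in for the LLM step: add newly retrieved facts to short-term memory."""
    return stm + [c for c in retrieved if c not in stm]


def inner_loop(question, chunks, max_steps=10):
    """Re-query with question + STM until STM stops changing (or max_steps is hit)."""
    stm = []
    for _ in range(max_steps):
        query = " ".join([question] + stm)
        new_stm = update_stm(stm, retrieve(query, chunks))
        if new_stm == stm:  # STM converged
            break
        stm = new_stm
    return stm


chunks = [
    "The key is in the box.",
    "The box is in the garage.",
    "The cat sleeps all day.",
]
question = "Where is the key?"
print(inner_loop(question, chunks))
```

In this toy run, a single `retrieve(question, chunks)` call finds only the first chunk; the second chunk is reached only because the first one, once in STM, becomes part of the next query.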
-
4D Metric-Semantic Mapping for Persistent Orchard Monitoring: Method and Dataset
Authors:
Jiuzhou Lei,
Ankit Prabhu,
Xu Liu,
Fernando Cladera,
Mehrad Mortazavi,
Reza Ehsani,
Pratik Chaudhari,
Vijay Kumar
Abstract:
Automated persistent and fine-grained monitoring of orchards at the individual tree or fruit level helps maximize crop yield and optimize resources such as water, fertilizers, and pesticides while preventing agricultural waste. Towards this goal, we present a 4D spatio-temporal metric-semantic mapping method that fuses data from multiple sensors, including LiDAR, RGB camera, and IMU, to monitor the fruits in an orchard across their growth season. A LiDAR-RGB fusion module is designed for 3D fruit tracking and localization, which first segments fruits using a deep neural network and then tracks them using the Hungarian Assignment algorithm. Additionally, the 4D data association module aligns data from different growth stages into a common reference frame and tracks fruits spatio-temporally, providing information such as fruit counts, sizes, and positions. We demonstrate our method's accuracy in 4D metric-semantic mapping using data collected from a real orchard under natural, uncontrolled conditions with seasonal variations. We achieve a 3.1 percent error in total fruit count estimation for over 1790 fruits across 60 apple trees, along with accurate size estimation results with a mean error of 1.1 cm. The datasets, consisting of LiDAR, RGB, and IMU data of five fruit species captured across their growth seasons, along with corresponding ground truth data, will be made publicly available at: https://4d-metric-semantic-mapping.org/
Submitted 29 September, 2024;
originally announced September 2024.
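The tracking step in this abstract matches fruit detections across frames by minimum-cost assignment, which is the problem the Hungarian algorithm solves. The sketch below solves the same problem by brute force over permutations, so it is only practical for a handful of fruits; it is not the Hungarian algorithm itself. The cost (Euclidean distance between 3D centroids) and the name `match_tracks` are assumptions, since the abstract does not state them.

```python
import itertools
import math


def match_tracks(tracks, detections):
    """Return `assignment`, where assignment[i] is the detection index matched
    to track i, minimizing the summed Euclidean distance.

    Brute-force stand-in for the Hungarian algorithm; assumes
    len(tracks) <= len(detections) and only a few items.
    """
    best, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(detections)), len(tracks)):
        cost = sum(math.dist(tracks[i], detections[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best, best_cost = list(perm), cost
    return best


# Two tracked fruit centroids and three new detections (made-up coordinates).
tracks = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
detections = [(1.1, 0.0, 0.0), (0.1, 0.0, 0.0), (5.0, 5.0, 5.0)]
print(match_tracks(tracks, detections))
```

Here track 0 is paired with detection 1 and track 1 with detection 0; the far-away third detection stays unmatched and could start a new track.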
-
Medium-entropy Engineering of magnetism in layered antiferromagnet CuxNi2(1-x)CrxP2S6
Authors:
Dinesh Upreti,
Rabindra Basnet,
M. M. Sharma,
Santosh Karki Chhetri,
Gokul Acharya,
Md Rafique Un Nabi,
Josh Sakon,
Mansour Mortazavi,
Jin Hu
Abstract:
Engineering magnetism in layered magnets could result in novel phenomena related to two-dimensional (2D) magnetism, which can be useful for fundamental research and practical applications. Extensive doping efforts such as substitution and intercalation have been adopted to tune antiferromagnetic (AFM) properties in M2P2X6 compounds. The substitutional doping in this material family has mainly focused on bimetallic substitution. Recently, metal substitution has also been extended to more than two metal elements, leading to medium- and high-entropy alloys (MEAs and HEAs), which are fairly underexplored in layered magnetic systems including M2P2X6. In this work, we explored the magnetic properties of the previously unreported Cu- and Cr-substituted Ni2P2S6, i.e., CuxNi2(1-x)CrxP2S6. Our study reveals a more systematic evolution of AFM phases with substitution than that observed in traditional bimetallic substitution in M2P2X6. Furthermore, the Cu and Cr substitutions in Ni2P2S6 are found to enhance the ferromagnetic (FM) correlation, which is also accompanied by a possible weak FM phase at low temperatures for the intermediate compositions from 0.32 to 0.80. Our work provides a strategy to establish ferromagnetism in AFM M2P2X6 that can also be used for property tuning in other layered magnets.
Submitted 5 August, 2024;
originally announced August 2024.
-
Tuning Magnetism in Ising-type van der Waals Magnet FePS3 by Lithium Intercalation
Authors:
Dinesh Upreti,
Rabindra Basnet,
M. M. Sharma,
Santosh Karki Chhetri,
Gokul Acharya,
Md Rafique Un Nabi,
Josh Sakon,
Bo Da,
Mansour Mortazavi,
Jin Hu
Abstract:
Recently, the layered transition metal thiophosphates MPX3 (M = transition metals, X = S or Se) have gained significant attention because of their rich magnetic, optical, and electronic properties. Specifically, the diverse magnetic structures and the robustness of magnetism in the two-dimensional limit have made them prominent candidates to study two-dimensional magnetism. Numerous efforts such as substitutions and interlayer intercalations have been made to tune the properties of these materials, which have greatly deepened the understanding of the underlying mechanisms that govern the properties. In this work, we focus on modifying the magnetism of the Ising-type antiferromagnet FePS3 using electrochemical lithium intercalation. Our work unveils the effectiveness of electrochemical intercalation as a controllable tool for modulating magnetism, including tuning the magnetic ordering temperature and inducing a low-temperature spin-glass state, offering an approach for implementing this material into applications.
Submitted 17 July, 2024;
originally announced July 2024.
-
Fusion of regional and sparse attention in Vision Transformers
Authors:
Nabil Ibtehaz,
Ning Yan,
Masood Mortazavi,
Daisuke Kihara
Abstract:
Modern vision transformers leverage visually inspired local interaction between pixels through attention computed within window or grid regions, in contrast to the global attention employed in the original ViT. Regional attention restricts pixel interactions within specific regions, while sparse attention disperses them across sparse grids. These differing approaches pose a challenge between maintaining hierarchical relationships vs. capturing a global context. In this study, drawing inspiration from atrous convolution, we propose Atrous Attention, a blend of regional and sparse attention that dynamically integrates both local and global information while preserving hierarchical structures. Based on this, we introduce a versatile, hybrid vision transformer backbone called ACC-ViT, tailored for standard vision tasks. Our compact model achieves approximately 84% accuracy on ImageNet-1K with fewer than 28.5 million parameters, outperforming the state-of-the-art MaxViT by 0.42% while requiring 8.4% fewer parameters.
Submitted 13 June, 2024;
originally announced June 2024.
-
Modally Reduced Representation Learning of Multi-Lead ECG Signals through Simultaneous Alignment and Reconstruction
Authors:
Nabil Ibtehaz,
Masood Mortazavi
Abstract:
Electrocardiogram (ECG) signals, profiling the electrical activities of the heart, are used for a plethora of diagnostic applications. However, ECG systems require multiple leads or channels of signals to capture the complete view of the cardiac system, which limits their application in smartwatches and wearables. In this work, we propose a modally reduced representation learning method for ECG signals that is capable of generating channel-agnostic, unified representations for ECG signals. Through joint optimization of reconstruction and alignment, we ensure that the embeddings of the different channels contain an amalgamation of the overall information across channels while also retaining their specific information. On an independent test dataset, we generated highly correlated channel embeddings from different ECG channels, leading to a moderate approximation of the 12-lead signals from a single-channel embedding. Our generated embeddings can work as competent features for ECG signals for downstream tasks.
Submitted 24 May, 2024;
originally announced May 2024.
-
Evolution of Magnetism in Magnetic Topological Semimetal NdSb$_x$Te$_{2-x+δ}$
Authors:
Santosh Karki Chhetri,
Rabindra Basnet,
Jian Wang,
Krishna Pandey,
Gokul Acharya,
Md Rafique Un Nabi,
Dinesh Upreti,
Josh Sakon,
Mansour Mortazavi,
Jin Hu
Abstract:
Magnetic topological semimetals LnSbTe (Ln = Lanthanide) have attracted intensive attention because of the interplay between magnetism, topology, and electron correlations, which depends on the choice of magnetic Ln element. Recently, varying the Sb-Te composition has been found to effectively control the electronic and magnetic states in LnSb$_x$Te$_{2-x}$. With this motivation, we report the evolution of magnetic properties with Sb-Te substitution in NdSb$_x$Te$_{2-x+δ}$. Our work reveals an interesting non-monotonic change in magnetic ordering temperature with varying composition stoichiometry. In addition, reducing the Sb content $x$ drives the reorientation of moments from the in-plane (ab-plane) to the out-of-plane (c-axis) direction, which results in distinct magnetic structures for the two end compounds NdTe$_2$ ($x = 0$) and NdSbTe ($x = 1$). Furthermore, the moment orientation in NdSb$_x$Te$_{2-x+δ}$ is also found to be strongly tunable upon application of a weak magnetic field, leading to rich magnetic phases depending on the composition stoichiometry, temperature, and magnetic field. Such strong tuning of magnetism in this material establishes it as a promising platform for investigating tunable topological states and correlated topological physics.
Submitted 23 April, 2024;
originally announced April 2024.
-
Field-induced spin polarization in lightly Cr-substituted layered antiferromagnet NiPS3
Authors:
Rabindra Basnet,
Dinesh Upreti,
Taksh Patel,
Santosh Karki Chhetri,
Gokul Acharya,
Md Rafique Un Nabi,
Manish Mani Sharma,
Josh Sakon,
Mansour Mortazavi,
Jin Hu
Abstract:
Tuning magnetic properties in layered magnets is an important route to realize novel phenomena related to two-dimensional (2D) magnetism. Recently, tuning antiferromagnetic (AFM) properties through substitution and intercalation techniques has been widely studied in MPX3 compounds. Interesting phenomena, such as diverse AFM structures and even the signatures of ferrimagnetism, have been reported. However, long-range ferromagnetic (FM) ordering has remained elusive. In this work, we explored the magnetic properties of the previously unreported Cr-substituted NiPS3. We found that Cr substitution is extremely efficient in controlling spin orientation in NiPS3. Our study reveals a field-induced spin polarization in lightly (9%) Cr-substituted NiPS3, which is likely attributed to the attenuation of AFM interactions and magnetic anisotropy due to Cr doping. Our work provides a possible strategy to achieve an FM phase in AFM MPX3, which could be useful for investigating 2D magnetism as well as potential device applications.
Submitted 2 April, 2024;
originally announced April 2024.
-
ACC-ViT : Atrous Convolution's Comeback in Vision Transformers
Authors:
Nabil Ibtehaz,
Ning Yan,
Masood Mortazavi,
Daisuke Kihara
Abstract:
Transformers have risen to become state-of-the-art vision architectures through innovations in attention mechanisms inspired by visual perception. At present, two classes of attention prevail in vision transformers: regional and sparse attention. The former bounds the pixel interactions within a region; the latter spreads them across sparse grids. Their opposing natures have resulted in a dilemma between either preserving hierarchical relations or attaining a global context. In this work, taking inspiration from atrous convolution, we introduce Atrous Attention, a fusion of regional and sparse attention, which can adaptively consolidate both local and global information while maintaining hierarchical relations. As a further tribute to atrous convolution, we redesign the ubiquitous inverted residual convolution blocks with atrous convolution. Finally, we propose a generalized, hybrid vision transformer backbone, named ACC-ViT, following conventional practices for standard vision tasks. Our tiny version model achieves $\sim 84 \%$ accuracy on ImageNet-1K with fewer than $28.5$ million parameters, which is a $0.42\%$ improvement over the state-of-the-art MaxViT while having $8.4\%$ fewer parameters. In addition, we have investigated the efficacy of the ACC-ViT backbone under different evaluation settings, such as finetuning, linear probing, and zero-shot learning on tasks involving medical image analysis, object detection, and language-image contrastive learning. ACC-ViT is therefore a strong vision backbone, which is also competitive in mobile-scale versions, ideal for niche applications with small datasets.
Submitted 6 March, 2024;
originally announced March 2024.
-
HetGPT: Harnessing the Power of Prompt Tuning in Pre-Trained Heterogeneous Graph Neural Networks
Authors:
Yihong Ma,
Ning Yan,
Jiayu Li,
Masood Mortazavi,
Nitesh V. Chawla
Abstract:
Graphs have emerged as a natural choice to represent and analyze the intricate patterns and rich information of the Web, enabling applications such as online page classification and social recommendation. The prevailing "pre-train, fine-tune" paradigm has been widely adopted in graph machine learning tasks, particularly in scenarios with limited labeled nodes. However, this approach often exhibits a misalignment between the training objectives of pretext tasks and those of downstream tasks. This gap can result in the "negative transfer" problem, wherein the knowledge gained from pre-training adversely affects performance in the downstream tasks. The surge in prompt-based learning within Natural Language Processing (NLP) suggests the potential of adapting a "pre-train, prompt" paradigm to graphs as an alternative. However, existing graph prompting techniques are tailored to homogeneous graphs, neglecting the inherent heterogeneity of Web graphs. To bridge this gap, we propose HetGPT, a general post-training prompting framework to improve the predictive performance of pre-trained heterogeneous graph neural networks (HGNNs). The key is the design of a novel prompting function that integrates a virtual class prompt and a heterogeneous feature prompt, with the aim to reformulate downstream tasks to mirror pretext tasks. Moreover, HetGPT introduces a multi-view neighborhood aggregation mechanism, capturing the complex neighborhood structure in heterogeneous graphs. Extensive experiments on three benchmark datasets demonstrate HetGPT's capability to enhance the performance of state-of-the-art HGNNs on semi-supervised node classification.
Submitted 23 January, 2024; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Envisioning a Next Generation Extended Reality Conferencing System with Efficient Photorealistic Human Rendering
Authors:
Chuanyue Shen,
Letian Zhang,
Zhangsihao Yang,
Masood Mortazavi,
Xiyun Song,
Liang Peng,
Heather Yu
Abstract:
Meeting online is becoming the new normal. Creating an immersive experience for online meetings is a necessity towards more diverse and seamless environments. Efficient photorealistic rendering of human 3D dynamics is the core of immersive meetings. Current popular applications achieve real-time conferencing but fall short in delivering photorealistic human dynamics, either due to limited 2D space or the use of avatars that lack realistic interactions between participants. Recent advances in neural rendering, such as the Neural Radiance Field (NeRF), offer the potential for greater realism in metaverse meetings. However, the slow rendering speed of NeRF poses challenges for real-time conferencing. We envision a pipeline for a future extended reality metaverse conferencing system that leverages monocular video acquisition and free-viewpoint synthesis to enhance data and hardware efficiency. Towards an immersive conferencing experience, we explore an accelerated NeRF-based free-viewpoint synthesis algorithm for rendering photorealistic human dynamics more efficiently. We show that our algorithm achieves comparable rendering quality while performing training and inference 44.5% and 213% faster than state-of-the-art methods, respectively. Our exploration provides a design basis for constructing metaverse conferencing systems that can handle complex application scenarios, including dynamic scene relighting with customized themes and multi-user conferencing that harmonizes real-world people into an extended world.
Submitted 28 June, 2023;
originally announced June 2023.
-
Selecting Sustainable Optimal Stock by Using Multi-Criteria Fuzzy Decision-Making Approaches Based on the Development of the Gordon Model: A case study of the Toronto Stock Exchange
Authors:
Mohsen Mortazavi
Abstract:
Investors have always been concerned about the accuracy and legitimacy of choosing the right stock portfolio with high efficiency. Therefore, this paper aims to determine the criteria for selecting an optimal stock portfolio with a high-efficiency ratio in the Toronto Stock Exchange using the integrated evaluation and decision-making trial laboratory (DEMATEL) model and Multi-Criteria Fuzzy decision-making approaches regarding the development of the Gordon model. In the current study, the practical factors and the relative weights of dividends, discount rate, and dividend growth rate are comprehensively illustrated using combined multi-criteria fuzzy decision-making approaches. A group of 10 experts, each with at least ten years of experience in the stock exchange field, was formed to review the different and new aspects of the subject (portfolio selection) and, through interaction between the group members and the exchange of attitudes and ideas, to decide on the criteria. The sequence of influence and effectiveness of the main criteria with DEMATEL has shown that the profitability criterion interacts most with other criteria. The managing methods and operations (MPO), market, risk, and growth criteria are ranked next in terms of interaction with other criteria. This study concludes that, given the model's appropriate and reliable validity in choosing the optimal stock portfolio, portfolio managers in companies, investment funds, and capital owners are recommended to use the model to select stocks in the Toronto Stock Exchange optimally.
Submitted 26 April, 2023;
originally announced April 2023.
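The dividend, discount rate, and dividend growth rate named in this abstract are the inputs of the textbook Gordon growth model, P0 = D1 / (r - g). The sketch below implements only that standard formula; the paper's developed version of the model and its fuzzy weighting are not reproduced, and `gordon_price` is a hypothetical name.

```python
def gordon_price(next_dividend, discount_rate, growth_rate):
    """Textbook Gordon growth model: P0 = D1 / (r - g).

    next_dividend: expected dividend over the next period (D1)
    discount_rate: required rate of return (r), as a fraction
    growth_rate:   constant dividend growth rate (g), as a fraction
    """
    if discount_rate <= growth_rate:
        # The formula is only defined when r exceeds g.
        raise ValueError("discount_rate must exceed growth_rate")
    return next_dividend / (discount_rate - growth_rate)


# Illustrative numbers only: a 2.00 dividend, 8% required return, 3% growth.
price = gordon_price(2.0, 0.08, 0.03)
print(round(price, 2))
```

The guard on `discount_rate <= growth_rate` matters in practice: as g approaches r the price grows without bound, so small errors in either estimate can dominate the result.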
-
Theta-Resonance: A Single-Step Reinforcement Learning Method for Design Space Exploration
Authors:
Masood S. Mortazavi,
Tiancheng Qin,
Ning Yan
Abstract:
Given an environment (e.g., a simulator) for evaluating samples in a specified design space and a set of weighted evaluation metrics -- one can use Theta-Resonance, a single-step Markov Decision Process (MDP), to train an intelligent agent producing progressively more optimal samples. In Theta-Resonance, a neural network consumes a constant input tensor and produces a policy as a set of conditional probability density functions (PDFs) for sampling each design dimension. We specialize existing policy gradient algorithms in deep reinforcement learning (D-RL) in order to use evaluation feedback (in terms of cost, penalty or reward) to update our policy network with robust algorithmic stability and minimal design evaluations. We study multiple neural architectures (for our policy network) within the context of a simple SoC design space and propose a method of constructing synthetic space exploration problems to compare and improve design space exploration (DSE) algorithms. Although we only present categorical design spaces, we also outline how to use Theta-Resonance in order to explore continuous and mixed continuous-discrete design spaces.
Submitted 17 November, 2022; v1 submitted 3 November, 2022;
originally announced November 2022.
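A single-step policy-gradient loop of the kind this abstract describes can be sketched as follows. It is a simplified illustration under several assumptions, not the Theta-Resonance algorithm itself: because the policy network's input is constant, the network is replaced here by a plain table of logits; the design dimensions are sampled independently rather than through conditional PDFs; and the update is vanilla REINFORCE with a moving-average baseline. All function names and the toy reward are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_design(logits_per_dim, rng):
    """Sample one categorical choice per design dimension."""
    return [rng.choices(range(len(l)), weights=softmax(l))[0] for l in logits_per_dim]


def train(evaluate, sizes, steps=2000, lr=0.1, seed=0):
    """Single-step REINFORCE: sample a design, evaluate it, update the logits."""
    rng = random.Random(seed)
    logits = [[0.0] * n for n in sizes]
    baseline = 0.0
    for _ in range(steps):
        design = sample_design(logits, rng)
        reward = evaluate(design)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)  # moving-average baseline
        for dim, choice in enumerate(design):
            probs = softmax(logits[dim])
            for k, p in enumerate(probs):
                # Gradient of log pi(choice) with respect to logit k.
                grad = (1.0 if k == choice else 0.0) - p
                logits[dim][k] += lr * advantage * grad
    return logits


def best_design(logits_per_dim):
    """Most probable choice in each dimension."""
    return [max(range(len(l)), key=l.__getitem__) for l in logits_per_dim]


# Toy "simulator": reward is highest (zero) at a hidden target design.
target = [2, 0, 1]
reward_fn = lambda d: -sum(abs(a - b) for a, b in zip(d, target))
learned = train(reward_fn, sizes=[3, 2, 3])
print(best_design(learned))
```

Each episode has exactly one step (one sampled design, one evaluation), which is what distinguishes this setting from multi-step reinforcement learning.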
-
Fully Convolutional Scene Graph Generation
Authors:
Hengyue Liu,
Ning Yan,
Masood S. Mortazavi,
Bir Bhanu
Abstract:
This paper presents a fully convolutional scene graph generation (FCSGG) model that detects objects and relations simultaneously. Most scene graph generation frameworks use a pre-trained two-stage object detector, like Faster R-CNN, and build scene graphs using bounding box features. Such a pipeline usually has a large number of parameters and low inference speed. Unlike these approaches, FCSGG is a conceptually elegant and efficient bottom-up approach that encodes objects as bounding box center points, and relationships as 2D vector fields named Relation Affinity Fields (RAFs). RAFs encode both semantic and spatial features, and explicitly represent the relationship between a pair of objects by the integral on a sub-region that points from subject to object. FCSGG only utilizes visual features and still generates strong results for scene graph generation. Comprehensive experiments on the Visual Genome dataset demonstrate the efficacy, efficiency, and generalizability of the proposed method. FCSGG achieves highly competitive results on recall and zero-shot recall with significantly reduced inference time.
Submitted 30 March, 2021;
originally announced March 2021.
-
Speech-Image Semantic Alignment Does Not Depend on Any Prior Classification Tasks
Authors:
Masood S. Mortazavi
Abstract:
Semantically-aligned $(speech, image)$ datasets can be used to explore "visually-grounded speech". In a majority of existing investigations, features of an image signal are extracted using neural networks "pre-trained" on other tasks (e.g., classification on ImageNet). In still others, pre-trained networks are used to extract audio features prior to semantic embedding. Without "transfer learning" through pre-trained initialization or pre-trained feature extraction, previous results have tended to show low rates of recall in $speech \rightarrow image$ and $image \rightarrow speech$ queries.
Choosing appropriate neural architectures for encoders in the speech and image branches and using large datasets, one can obtain competitive recall rates without any reliance on any pre-trained initialization or feature extraction: $(speech,image)$ semantic alignment and $speech \rightarrow image$ and $image \rightarrow speech$ retrieval are canonical tasks worthy of independent investigation of their own and allow one to explore other questions---e.g., the size of the audio embedder can be reduced significantly with little loss of recall rates in $speech \rightarrow image$ and $image \rightarrow speech$ queries.
Submitted 28 October, 2020;
originally announced October 2020.
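The recall rates mentioned in this abstract are typically reported as recall@K over a query-by-candidate similarity matrix. The sketch below shows that common metric under the assumption that each query has exactly one correct candidate at the same index; the abstract does not spell out its exact evaluation protocol, and `recall_at_k` is a hypothetical name.

```python
def recall_at_k(similarity, k):
    """Fraction of queries whose correct candidate is among the top-k scores.

    similarity[i][j] is the score of query i (e.g. a speech embedding)
    against candidate j (e.g. an image embedding); the correct candidate
    for query i is assumed to be candidate i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        hits += i in top_k
    return hits / len(similarity)


# Made-up similarity scores for three speech queries against three images.
sim = [
    [0.9, 0.1, 0.0],
    [0.2, 0.3, 0.8],
    [0.1, 0.7, 0.6],
]
print(recall_at_k(sim, 1), recall_at_k(sim, 2))
```

Transposing `sim` gives the reverse direction (image to speech) with the same function.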
-
FoCL: Feature-Oriented Continual Learning for Generative Models
Authors:
Qicheng Lao,
Mehrzad Mortazavi,
Marzieh Tahaei,
Francis Dutil,
Thomas Fevens,
Mohammad Havaei
Abstract:
In this paper, we propose a general framework in continual learning for generative models: Feature-oriented Continual Learning (FoCL). Unlike previous works that aim to solve the catastrophic forgetting problem by introducing regularization in the parameter space or image space, FoCL imposes regularization in the feature space. We show in our experiments that FoCL has faster adaptation to distributional changes in sequentially arriving tasks, and achieves the state-of-the-art performance for generative models in task incremental learning. We discuss choices of combined regularization spaces towards different use case scenarios for boosted performance, e.g., tasks that have high variability in the background. Finally, we introduce a forgetfulness measure that fairly evaluates the degree to which a model suffers from forgetting. Interestingly, the analysis of our proposed forgetfulness score also implies that FoCL tends to have a mitigated forgetting for future tasks.
Submitted 8 March, 2020;
originally announced March 2020.
-
The Impact of Hole Geometry on Relative Robustness of In-Painting Networks: An Empirical Study
Authors:
Masood S. Mortazavi,
Ning Yan
Abstract:
In-painting networks use existing pixels to generate appropriate pixels to fill "holes" placed on parts of an image. A 2-D in-painting network's input usually consists of (1) a three-channel 2-D image, and (2) an additional channel for the "holes" to be in-painted in that image. In this paper, we study the robustness of a given in-painting neural network against variations in hole geometry distributions. We observe that the robustness of an in-painting network is dependent on the probability distribution function (PDF) of the hole geometry presented to it during its training even if the underlying image dataset used (in training and testing) does not alter. We develop an experimental methodology for testing and evaluating relative robustness of in-painting networks against four different kinds of hole geometry PDFs. We examine a number of hypotheses regarding (1) the natural bias of in-painting networks to the hole distribution used for their training, (2) the underlying dataset's ability to differentiate relative robustness as hole distributions vary in a train-test (cross-comparison) grid, and (3) the impact of the directional distribution of edges in the holes and in the image dataset. We present results for L1, PSNR and SSIM quality metrics and develop a specific measure of relative in-painting robustness to be used in cross-comparison grids based on these quality metrics. (One can incorporate other quality metrics in this relative measure.) The empirical work reported here is an initial step in a broader and deeper investigation of "filling the blank" neural networks' sensitivity, robustness and regularization with respect to hole "geometry" PDFs, and it suggests further research in this domain.
Submitted 4 March, 2020;
originally announced March 2020.
-
Si-based GeSn photodetectors towards mid-infrared imaging applications
Authors:
Huong Tran,
Thach Pham,
Joe Margetis,
Yiyin Zhou,
Wei Dou,
Perry C. Grant,
Joshua M. Grant,
Sattar Alkabi,
Greg Sun,
Richard A. Soref,
John Tolle,
Yong-Hang Zhang,
Wei Du,
Baohua Li,
Mansour Mortazavi,
Shui-Qing Yu
Abstract:
This paper reports a comprehensive study of Si-based GeSn mid-infrared photodetectors, which includes: 1) the demonstration of a set of photoconductors with Sn compositions ranging from 10.5% to 22.3%, showing that the cut-off wavelength has been extended to 3.65 μm. The measured maximum D* of 1.1x10^10 cm Hz^(1/2) W^(-1) is comparable to that of commercial extended-InGaAs detectors; 2) the development of a surface passivation technique for photodiodes, based on in-depth analysis of the dark current mechanism, which effectively reduces the dark current. Moreover, mid-infrared images were obtained using GeSn photodetectors, showing image quality comparable to that acquired using commercial PbSe detectors.
Submitted 6 June, 2019;
originally announced June 2019.
-
UHV-CVD Growth of High Quality GeSn Using SnCl4: From Growth Optimization to Prototype Devices
Authors:
P. C. Grant,
W. Dou,
B. Alharthi,
J. M. Grant,
H. Tran,
G. Abernathy,
A. Mosleh,
W. Du,
B. Li,
M. Mortazavi,
H. A. Naseem,
S. Q. Yu
Abstract:
The persistent interest in the epitaxy of the group IV alloy GeSn is mainly driven by the demand for an efficient light source that could be monolithically integrated on Si for mid-infrared Si photonics. For chemical vapor deposition of GeSn, the exploration of the parameter window is difficult from the beginning due to its non-equilibrium growth condition. In this work, we demonstrated an effective pathway to achieve high-quality GeSn with high Sn incorporation. The GeSn films were grown on Ge-buffered Si via ultra-high vacuum chemical vapor deposition using GeH4 and SnCl4 as precursor gases. The influence of both SnCl4 flow fraction and growth temperature on the Sn incorporation and material quality was investigated. The key to achieving effective Sn incorporation and high material quality is to find the proper parameter match between SnCl4 supply and growth temperature, which is also called the optimized growth regime. Sn precipitation is significantly suppressed in the optimized growth regime, leading to more Sn incorporation into Ge and enhanced material quality. Prototype GeSn photoconductors were fabricated with typical samples, showing promising device applications towards mid-infrared optoelectronics.
Submitted 5 October, 2018;
originally announced October 2018.
-
Si-based GeSn lasers with wavelength coverage of 2 to 3 μm and operating temperatures up to 180 K
Authors:
Joe Margetis,
Sattar Al-Kabi,
Wei Du,
Wei Dou,
Yiyin Zhou,
Thach Pham,
Perry Grant,
Seyed Ghetmiri,
Aboozar Mosleh,
Baohua Li,
Jifeng Liu,
Greg Sun,
Richard Soref,
John Tolle,
Mansour Mortazavi,
Shui-Qing Yu
Abstract:
A Si-based monolithic laser is highly desirable for full integration of Si-photonics. Lasing from direct bandgap group-IV GeSn alloy has opened a completely new avenue, distinct from the traditional III-V integration approach. We demonstrated optically pumped GeSn lasers on Si with broad wavelength coverage from 2 to 3 μm. The GeSn alloys were grown using newly developed approaches with an industry standard chemical vapor deposition reactor and low-cost commercially available precursors. The achieved maximum Sn composition of 17.5% exceeded the generally acknowledged Sn incorporation limits for using similar deposition chemistries. The highest lasing temperature was measured as 180 K with the active layer thickness as thin as 260 nm. The unprecedented lasing performance is mainly due to the unique growth approaches, which offer high-quality epitaxial materials. The results reported in this work show a major advance towards Si-based mid-infrared laser sources for integrated photonics.
Submitted 19 August, 2017;
originally announced August 2017.
-
Quantum Tunneling of Thermal Protons Through Pristine Graphene
Authors:
Igor Poltavsky,
Limin Zheng,
Majid Mortazavi,
Alexandre Tkatchenko
Abstract:
Atomically thin two-dimensional materials such as graphene and hexagonal boron nitride have recently been found to exhibit appreciable permeability to thermal protons, making these materials emerging candidates for separation technologies [S. Hu et al., Nature 516, 227 (2014); M. Lozada-Hidalgo et al., Science 351, 68 (2016).]. These remarkable findings remain unexplained by density-functional electronic structure calculations, which instead yield barriers that exceed by 1.0 eV those found in experiments. Here we resolve this puzzle by demonstrating that the proton transfer through pristine graphene is driven by quantum nuclear effects, which substantially reduce the transport barrier by up to 1.4 eV compared to the results of classical molecular dynamics (MD). Our Feynman-Kac path-integral MD simulations unambiguously reveal the quantum tunneling mechanism of strongly interacting hydrogen ions through two-dimensional materials. In addition, we predict a strong isotope effect of 1 eV on the transport barrier for graphene in vacuum and at room temperature. These findings not only shed light on the graphene permeability to protons and deuterons, but also offer new insights for controlling the underlying quantum ion transport mechanisms in nanostructured separation membranes.
Submitted 12 April, 2017; v1 submitted 20 May, 2016;
originally announced May 2016.
-
Parallel variable-density particle-laden turbulence simulation
Authors:
Hadi Pouransari,
Milad Mortazavi,
Ali Mani
Abstract:
We have developed a fully parallel C++/MPI based simulation code for variable-density particle-laden turbulent flows. The fluid is represented through a uniform Eulerian staggered grid, while particles are modeled using a Lagrangian point-particle framework. Spatial discretization is second-order accurate, and time integration has a fourth-order accuracy. Two-way coupling of the particles with the background flow is considered in both momentum and energy equations. The code is fully modular and abstracted, and can easily be extended or modified. We have considered two different boundary conditions. We have also developed a novel parallel linear solver for the variable density Poisson equation that arises in the calculation.
Submitted 20 January, 2016;
originally announced January 2016.
-
On the Optimization of non-Dense Metabolic Networks in non-Equilibrium State Utilizing 2D-Lattice Simulation
Authors:
Erfan Khaji,
Mahsa Mortazavi
Abstract:
Modeling and optimization of metabolic networks has been one of the hottest topics in computational systems biology in recent years. However, the complexity and uncertainty of these networks, in addition to the lack of necessary data, have led to greater efforts to design and use more capable models that fit realistic conditions. In this paper, instead of optimizing networks in equilibrium condition, the optimization of dynamic networks in non-equilibrium states, including those with a low number of molecules, has been studied using a 2-D lattice simulation. A prototype network has been simulated with this approach and optimized using Swarm Particle Algorithm; the results are presented along with the relevant plots.
Submitted 29 June, 2014;
originally announced June 2014.
-
A new class of $f$-deformed charge coherent states and their nonclassical properties
Authors:
M Mortazavi,
M K Tavassoly
Abstract:
Two-mode charge (pair) coherent states have been introduced previously by using the $<η|$ representation. In the present paper we reobtain these states by a rather different method. Then, using the nonlinear coherent states approach and based on the simple manner by which the representation of two-mode charge coherent states is introduced, we generalize the bosonic creation and annihilation operators to the $f$-deformed ladder operators and construct a new class of $f$-deformed charge coherent states. Unlike the (linear) pair coherent states, our presented structure has the potential to generate a large class of pair coherent states with various nonclassicality signs and physical properties which are of interest. To this end, we use a few well-known nonlinearity functions associated with particular quantum systems as physical realizations of our presented formalism. After introducing the explicit form of the above correlated states in two-mode Fock space, several nonclassicality features of the corresponding states (as well as the two-mode linear charge coherent states) are numerically investigated by calculating quadrature squeezing, the Mandel parameter, the second-order correlation function, the second-order correlation function between the two modes and the Cauchy-Schwarz inequality. Also, the oscillatory behaviour of the photon count and the quasi-probability (Husimi) function of the associated states will be discussed.
Submitted 12 April, 2012;
originally announced April 2012.