-
High-dimensional multiple imputation (HDMI) for partially observed confounders including natural language processing-derived auxiliary covariates
Authors:
Janick Weberpals,
Pamela A. Shaw,
Kueiyu Joshua Lin,
Richard Wyss,
Joseph M Plasek,
Li Zhou,
Kerry Ngan,
Thomas DeRamus,
Sudha R. Raman,
Bradley G. Hammill,
Hana Lee,
Sengwee Toh,
John G. Connolly,
Kimberly J. Dandreo,
Fang Tian,
Wei Liu,
Jie Li,
José J. Hernández-Muñoz,
Sebastian Schneeweiss,
Rishi J. Desai
Abstract:
Multiple imputation (MI) models can be improved by including auxiliary covariates (AC), but their performance in high-dimensional data is not well understood. We aimed to develop and compare high-dimensional MI (HDMI) approaches using structured and natural language processing (NLP)-derived AC in studies with partially observed confounders. We conducted a plasmode simulation study using data from opioid vs. non-steroidal anti-inflammatory drug (NSAID) initiators (X) with observed serum creatinine labs (Z2) and time-to-acute kidney injury as outcome. We simulated 100 cohorts with a null treatment effect, including X, Z2, atrial fibrillation (U), and 13 other investigator-derived confounders (Z1) in the outcome generation. We then imposed missingness (MZ2) on 50% of Z2 measurements as a function of Z2 and U and created different HDMI candidate AC using structured and NLP-derived features. We mimicked scenarios where U was unobserved by omitting it from all AC candidate sets. Using LASSO, we data-adaptively selected HDMI covariates associated with Z2 and MZ2 for MI, and with U to include in propensity score models. The treatment effect was estimated following propensity score matching in MI datasets and we benchmarked HDMI approaches against a baseline imputation and complete case analysis with Z1 only. HDMI using claims data showed the lowest bias (0.072). Combining claims and sentence embeddings led to an improvement in the efficiency displaying the lowest root-mean-squared-error (0.173) and coverage (94%). NLP-derived AC alone did not perform better than baseline MI. HDMI approaches may decrease bias in studies with partially observed confounders where missingness depends on unobserved factors.
Submitted 17 May, 2024;
originally announced May 2024.
-
On the Adversarial Robustness of Learning-based Image Compression Against Rate-Distortion Attacks
Authors:
Chenhao Wu,
Qingbo Wu,
Haoran Wei,
Shuai Chen,
Lei Wang,
King Ngi Ngan,
Fanman Meng,
Hongliang Li
Abstract:
Despite demonstrating superior rate-distortion (RD) performance, learning-based image compression (LIC) algorithms have been found to be vulnerable to malicious perturbations in recent studies. However, the adversarial attacks considered in existing literature remain divergent from real-world scenarios, both in terms of the attack direction and bitrate. Additionally, existing methods focus solely on empirical observations of the model vulnerability, neglecting to identify the origin of it. These limitations hinder the comprehensive investigation and in-depth understanding of the adversarial robustness of LIC algorithms. To address the aforementioned issues, this paper considers the arbitrary nature of the attack direction and the uncontrollable compression ratio faced by adversaries, and presents two practical rate-distortion attack paradigms, i.e., Specific-ratio Rate-Distortion Attack (SRDA) and Agnostic-ratio Rate-Distortion Attack (ARDA). Using the performance variations as indicators, we evaluate the adversarial robustness of eight predominant LIC algorithms against diverse attacks. Furthermore, we propose two novel analytical tools for in-depth analysis, i.e., Entropy Causal Intervention and Layer-wise Distance Magnify Ratio, and reveal that hyperprior significantly increases the bitrate and Inverse Generalized Divisive Normalization (IGDN) significantly amplifies input perturbations when under attack. Lastly, we examine the efficacy of adversarial training and introduce the use of online updating for defense. By comparing their advantages and disadvantages, we provide a reference for constructing more robust LIC algorithms against the rate-distortion attacks.
Submitted 4 July, 2024; v1 submitted 13 May, 2024;
originally announced May 2024.
-
Decays of Standard Model like Higgs boson $h \rightarrowγγ, Z γ$ in a minimal left-right symmetric model
Authors:
T. T. Hong,
V. K. Le,
L. T. T. Phuong,
N. C. Hoi,
N. T. K. Ngan,
N. H. T. Nha
Abstract:
Two decay channels $h\rightarrow γγ, Zγ$ of the Standard Model-like Higgs in a left-right symmetry model are investigated under recent experimental data. We show that there exist one-loop contributions that affect the $h\rightarrow Zγ$ amplitude, but not the $h\rightarrow γγ$ amplitude. From numerical investigations, we show that the signal strength $μ_{Z γ}$ of the decay $h\rightarrow Zγ$ is still strictly constrained by that of $h\rightarrow γγ$, namely $|Δμ_{γγ}|<38\%$ results in max $|Δμ_{Z γ}|<46\%$. On the other hand, the future experimental sensitivity $|Δμ_{γγ}|=4\%$ still allows $|Δμ_{Z γ}|$ to reach values larger than the expected sensitivity $|Δμ_{Z γ}|=23\%$.
Submitted 11 March, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Learning with Noisy Low-Cost MOS for Image Quality Assessment via Dual-Bias Calibration
Authors:
Lei Wang,
Qingbo Wu,
Desen Yuan,
King Ngi Ngan,
Hongliang Li,
Fanman Meng,
Linfeng Xu
Abstract:
Learning based image quality assessment (IQA) models have obtained impressive performance with the help of reliable subjective quality labels, where mean opinion score (MOS) is the most popular choice. However, in view of the subjective bias of individual annotators, the labor-abundant MOS (LA-MOS) typically requires a large collection of opinion scores from multiple annotators for each image, which significantly increases the learning cost. In this paper, we aim to learn robust IQA models from low-cost MOS (LC-MOS), which only requires very few opinion scores or even a single opinion score for each image. More specifically, we consider the LC-MOS as the noisy observation of LA-MOS and enforce the IQA model learned from LC-MOS to approach the unbiased estimation of LA-MOS. In this way, we represent the subjective bias between LC-MOS and LA-MOS, and the model bias between IQA predictions learned from LC-MOS and LA-MOS (i.e., dual-bias) as two latent variables with unknown parameters. By means of the expectation-maximization based alternating optimization, we can jointly estimate the parameters of the dual-bias, which suppresses the misleading of LC-MOS via a gated dual-bias calibration (GDBC) module. To the best of our knowledge, this is the first exploration of robust IQA model learning from noisy low-cost labels. Theoretical analysis and extensive experiments on four popular IQA datasets show that the proposed method is robust toward different bias rates and annotation numbers and significantly outperforms the other learning based IQA models when only LC-MOS is available. Furthermore, we also achieve comparable performance with respect to the other models learned with LA-MOS.
Submitted 27 November, 2023;
originally announced November 2023.
-
Quantum Photonic Circuits Integrated with Color Centers in Designer Nanodiamonds
Authors:
Kinfung Ngan,
Yuan Zhan,
Constantin Dory,
Jelena Vučković,
Shuo Sun
Abstract:
Diamond has emerged as a leading host material for solid-state quantum emitters, quantum memories, and quantum sensors. However, the challenges in fabricating photonic devices in diamond have limited its potential for use in quantum technologies. While various hybrid integration approaches have been developed for coupling diamond color centers with photonic devices defined in a heterogeneous material, these methods suffer from either large insertion loss at the material interface or evanescent light-matter coupling. Here, we present a new technique that enables deterministic assembly of diamond color centers in a silicon nitride photonic circuit. Using this technique, we observe Purcell enhancement of silicon vacancy centers coupled to a silicon nitride ring resonator. Our hybrid integration approach has the potential for achieving the maximum possible light-matter interaction strength while maintaining low insertion loss, and paves the way towards scalable manufacturing of large-scale quantum photonic circuits integrated with high-quality quantum emitters and spins.
Submitted 4 October, 2023; v1 submitted 25 July, 2023;
originally announced July 2023.
-
Forgetting to Remember: A Scalable Incremental Learning Framework for Cross-Task Blind Image Quality Assessment
Authors:
Rui Ma,
Qingbo Wu,
King Ngi Ngan,
Hongliang Li,
Fanman Meng,
Linfeng Xu
Abstract:
Recent years have witnessed the great success of blind image quality assessment (BIQA) in various task-specific scenarios, which present invariable distortion types and evaluation criteria. However, due to the rigid structure and learning framework, they cannot apply to the cross-task BIQA scenario, where the distortion types and evaluation criteria keep changing in practical applications. This paper proposes a scalable incremental learning framework (SILF) that could sequentially conduct BIQA across multiple evaluation tasks with limited memory capacity. More specifically, we develop a dynamic parameter isolation strategy to sequentially update the task-specific parameter subsets, which are non-overlapped with each other. Each parameter subset is temporarily settled to Remember one evaluation preference toward its corresponding task, and the previously settled parameter subsets can be adaptively reused in the following BIQA to achieve better performance based on the task relevance. To suppress the unrestrained expansion of memory capacity in sequential tasks learning, we develop a scalable memory unit by gradually and selectively pruning unimportant neurons from previously settled parameter subsets, which enable us to Forget part of previous experiences and free the limited memory capacity for adapting to the emerging new tasks. Extensive experiments on eleven IQA datasets demonstrate that our proposed method significantly outperforms the other state-of-the-art methods in cross-task BIQA. The source code of the proposed method is available at https://github.com/maruiperfect/SILF.
Submitted 6 February, 2023; v1 submitted 15 September, 2022;
originally announced September 2022.
-
Tailoring Drug Mobility by Photothermal Heating of Graphene Plasmons
Authors:
Anh D. Phan,
Nguyen K. Ngan,
Do T. Nga,
Nam B. Le,
Chu Viet Ha
Abstract:
We propose a theoretical approach to quantitatively determine the photothermally driven enhancement of molecular mobility of graphene-indomethacin mixtures under infrared laser irradiation. Graphene plasmons absorb incident electromagnetic energy and dissipate it as heat. The absorbed energy depends on the optical properties of graphene plasmons, which are sensitive to structural parameters, and on the concentration of plasmonic nanostructures. Using a theoretical model, we calculate temperature gradients of the bulk drug with different concentrations of graphene plasmons. From these, we determine the temperature dependence of structural molecular relaxation and diffusion of indomethacin and find how the heating process significantly enhances the drug mobility.
Submitted 28 December, 2021;
originally announced December 2021.
-
Toward a better understanding of activation volume and dynamic decoupling of glass-forming liquids under compression
Authors:
Anh D. Phan,
Nguyen K. Ngan,
Nam B. Le,
Le T. M. Thanh
Abstract:
We theoretically investigate physical properties of the pressure-induced activation volume and dynamic decoupling of ternidazole, glycerol, and probucol by the Elastically Collective Nonlinear Langevin Equation theory. Based on the predicted temperature dependence of activated relaxation under various compression, the activation volume is determined to characterize effects of pressure on molecular dynamics of materials. We find that the decoupling of the structural relaxation time of compressed systems from their bulk uncompressed value is governed by the power-law rule. The decoupling exponent exponentially grows with pressure below 2 GPa. The decoupling exponent and activation volume are intercorrelated and have a connection with the differential activation free energy. We numerically and mathematically analyze relationships among these quantities to explain many results in previous experiments and simulations.
Submitted 28 July, 2021;
originally announced July 2021.
-
Contrastive Counterfactual Visual Explanations With Overdetermination
Authors:
Adam White,
Kwun Ho Ngan,
James Phelan,
Saman Sadeghi Afgeh,
Kevin Ryan,
Constantino Carlos Reyes-Aldasoro,
Artur d'Avila Garcez
Abstract:
A novel explainable AI method called CLEAR Image is introduced in this paper. CLEAR Image is based on the view that a satisfactory explanation should be contrastive, counterfactual and measurable. CLEAR Image explains an image's classification probability by contrasting the image with a corresponding image generated automatically via adversarial learning. This enables both salient segmentation and perturbations that faithfully determine each segment's importance. CLEAR Image was successfully applied to a medical imaging case study where it outperformed methods such as Grad-CAM and LIME by an average of 27% using a novel pointing game metric. CLEAR Image excels in identifying cases of "causal overdetermination" where there are multiple patches in an image, any one of which is sufficient by itself to cause the classification probability to be close to one.
Submitted 9 June, 2022; v1 submitted 28 June, 2021;
originally announced June 2021.
-
Impact of high pressure on reversible structural relaxation of metallic glass
Authors:
Nguyen K. Ngan,
Anh D. Phan,
Alessio Zaccone
Abstract:
We theoretically investigate the temperature dependence of the reversible structural relaxation time and diffusion constant of metallic glasses under pressure. The compression not only changes the glassy dynamics, but also generates a metastable state along with a higher-energy state where the system can rejuvenate. The relaxation times for forward and backward transitions in this two-state system are nearly identical and much faster than the relaxation time without accounting for barrier-recrossing. At ambient pressure, the expected irreversible relaxation process is recovered, and our numerical results agree well with prior experimental results. An increase of pressure has a minor effect on the relaxation time and diffusion constant that one computes without considering the influence of the metastable state, but it leads to a large reduction of the reversible relaxation time computed upon taking the metastable state into account. The presence of external compression is also shown to trigger a fragile-to-strong crossover in metallic glasses.
Submitted 27 May, 2021;
originally announced May 2021.
-
Non-Homogeneous Haze Removal via Artificial Scene Prior and Bidimensional Graph Reasoning
Authors:
Haoran Wei,
Qingbo Wu,
Hui Li,
King Ngi Ngan,
Hongliang Li,
Fanman Meng,
Linfeng Xu
Abstract:
Due to the lack of natural scene and haze prior information, it is greatly challenging to completely remove the haze from a single image without distorting its visual content. Fortunately, the real-world haze usually presents non-homogeneous distribution, which provides us with many valuable clues in partial well-preserved regions. In this paper, we propose a Non-Homogeneous Haze Removal Network (NHRN) via artificial scene prior and bidimensional graph reasoning. Firstly, we employ the gamma correction iteratively to simulate artificial multiple shots under different exposure conditions, whose haze degrees are different and enrich the underlying scene prior. Secondly, beyond utilizing the local neighboring relationship, we build a bidimensional graph reasoning module to conduct non-local filtering in the spatial and channel dimensions of feature maps, which models their long-range dependency and propagates the natural scene prior between the well-preserved nodes and the nodes contaminated by haze. To the best of our knowledge, this is the first exploration to remove non-homogeneous haze via the graph reasoning based framework. We evaluate our method on different benchmark datasets. The results demonstrate that our method achieves superior performance over many state-of-the-art algorithms for both the single image dehazing and hazy image understanding tasks. The source code of the proposed NHRN is available on https://github.com/whrws/NHRNet.
Submitted 15 November, 2022; v1 submitted 5 April, 2021;
originally announced April 2021.
-
BA^2M: A Batch Aware Attention Module for Image Classification
Authors:
Qishang Cheng,
Hongliang Li,
Qingbo Wu,
King Ngi Ngan
Abstract:
The attention mechanisms have been employed in Convolutional Neural Network (CNN) to enhance the feature representation. However, existing attention mechanisms only concentrate on refining the features inside each sample and neglect the discrimination between different samples. In this paper, we propose a batch aware attention module (BA2M) for feature enrichment from a distinctive perspective. More specifically, we first get the sample-wise attention representation (SAR) by fusing the channel, local spatial and global spatial attention maps within each sample. Then, we feed the SARs of the whole batch to a normalization function to get the weights for each sample. The weights serve to distinguish the features' importance between samples in a training batch with different complexity of content. The BA2M could be embedded into different parts of CNN and optimized with the network in an end-to-end manner. The design of BA2M is lightweight with few extra parameters and calculations. We validate BA2M through extensive experiments on CIFAR-100 and ImageNet-1K for the image recognition task. The results show that BA2M can boost the performance of various network architectures and outperforms many classical attention methods. Besides, BA2M exceeds traditional methods of re-weighting samples based on the loss value.
Submitted 28 March, 2021;
originally announced March 2021.
-
Advanced Geometry Surface Coding for Dynamic Point Cloud Compression
Authors:
Jian Xiong,
Hao Gao,
Miaohui Wang,
Hongliang Li,
King Ngi Ngan,
Weisi Lin
Abstract:
In video-based dynamic point cloud compression (V-PCC), 3D point clouds are projected onto 2D images for compressing with the existing video codecs. However, the existing video codecs are originally designed for natural visual signals, and it fails to account for the characteristics of point clouds. Thus, there are still problems in the compression of geometry information generated from the point clouds. Firstly, the distortion model in the existing rate-distortion optimization (RDO) is not consistent with the geometry quality assessment metrics. Secondly, the prediction methods in video codecs fail to account for the fact that the highest depth values of a far layer is greater than or equal to the corresponding lowest depth values of a near layer. This paper proposes an advanced geometry surface coding (AGSC) method for dynamic point clouds (DPC) compression. The proposed method consists of two modules, including an error projection model-based (EPM-based) RDO and an occupancy map-based (OM-based) merge prediction. Firstly, the EPM model is proposed to describe the relationship between the distortion model in the existing video codec and the geometry quality metric. Secondly, the EPM-based RDO method is presented to project the existing distortion model on the plane normal and is simplified to estimate the average normal vectors of coding units (CUs). Finally, we propose the OM-based merge prediction approach, in which the prediction pixels of merge modes are refined based on the occupancy map. Experiments tested on the standard point clouds show that the proposed method achieves an average 9.84\% bitrate saving for geometry compression.
Submitted 11 March, 2021;
originally announced March 2021.
-
Counterdiabatic control of transport in a synthetic tight-binding lattice
Authors:
Eric J. Meier,
Kinfung Ngan,
Dries Sels,
Bryce Gadway
Abstract:
Quantum state transformations that are robust to experimental imperfections are important for applications in quantum information science and quantum sensing. Counterdiabatic (CD) approaches, which use knowledge of the underlying system Hamiltonian to actively correct for diabatic effects, are powerful tools for achieving simultaneously fast and stable state transformations. Protocols for CD driving have thus far been limited in their experimental implementation to discrete systems with just two or three levels, as well as bulk systems with scaling symmetries. Here, we extend the tool of CD control to a discrete synthetic lattice system composed of as many as nine sites. Although this system has a vanishing gap and thus no adiabatic support in the thermodynamic limit, we show that CD approaches can still give a substantial, several order-of-magnitude, improvement in fidelity over naive, fast adiabatic protocols.
Submitted 14 May, 2020;
originally announced May 2020.
-
Subjective and Objective De-raining Quality Assessment Towards Authentic Rain Image
Authors:
Qingbo Wu,
Lei Wang,
King N. Ngan,
Hongliang Li,
Fanman Meng,
Linfeng Xu
Abstract:
Images acquired by outdoor vision systems easily suffer poor visibility and annoying interference due to the rainy weather, which brings great challenge for accurately understanding and describing the visual contents. Recent researches have devoted great efforts on the task of rain removal for improving the image visibility. However, there is very few exploration about the quality assessment of de-rained image, even it is crucial for accurately measuring the performance of various de-raining algorithms. In this paper, we first create a de-raining quality assessment (DQA) database that collects 206 authentic rain images and their de-rained versions produced by 6 representative single image rain removal algorithms. Then, a subjective study is conducted on our DQA database, which collects the subject-rated scores of all de-rained images. To quantitatively measure the quality of de-rained image with non-uniform artifacts, we propose a bi-directional feature embedding network (B-FEN) which integrates the features of global perception and local difference together. Experiments confirm that the proposed method significantly outperforms many existing universal blind image quality assessment models. To help the research towards perceptually preferred de-raining algorithm, we will publicly release our DQA database and B-FEN source code on https://github.com/wqb-uestc.
Submitted 5 October, 2019; v1 submitted 26 September, 2019;
originally announced September 2019.
-
Class Activation Map generation by Multiple Level Class Grouping and Orthogonal Constraint
Authors:
Kaixu Huang,
Fanman Meng,
Hongliang Li,
Shuai Chen,
Qingbo Wu,
King N. Ngan
Abstract:
Class activation map (CAM) highlights regions of classes based on classification network, which is widely used in weakly supervised tasks. However, it faces the problem that the class activation regions are usually small and local. Although several efforts paid to the second step (the CAM generation step) have partially enhanced the generation, we believe such problem is also caused by the first step (training step), because single classification model trained on the entire classes contains finite discriminate information that limits the object region extraction. To this end, this paper solves CAM generation by using multiple classification models. To form multiple classification networks that carry different discriminative information, we try to capture the semantic relationships between classes to form different semantic levels of classification models. Specifically, hierarchical clustering based on class relationships is used to form hierarchical clustering results, where the clustering levels are treated as semantic levels to form the classification models. Moreover, a new orthogonal module and a two-branch based CAM generation method are proposed to generate class regions that are orthogonal and complementary. We use the PASCAL VOC 2012 dataset to verify the proposed method. Experimental results show that our approach improves the CAM generation.
Submitted 21 September, 2019;
originally announced September 2019.
-
A New Few-shot Segmentation Network Based on Class Representation
Authors:
Yuwei Yang,
Fanman Meng,
Hongliang Li,
King N. Ngan,
Qingbo Wu
Abstract:
This paper studies few-shot segmentation, which is a task of predicting foreground mask of unseen classes by a few of annotations only, aided by a set of rich annotations already existed. The existing methods mainly focus the task on "\textit{how to transfer segmentation cues from support images (labeled images) to query images (unlabeled images)}", and try to learn efficient and general transfer module that can be easily extended to unseen classes. However, it is proved to be a challenging task to learn the transfer module that is general to various classes. This paper solves few-shot segmentation in a new perspective of "\textit{how to represent unseen classes by existing classes}", and formulates few-shot segmentation as the representation process that represents unseen classes (in terms of forming the foreground prior) by existing classes precisely. Based on such idea, we propose a new class representation based few-shot segmentation framework, which firstly generates class activation map of unseen class based on the knowledge of existing classes, and then uses the map as foreground probability map to extract the foregrounds from query image. A new two-branch based few-shot segmentation network is proposed. Moreover, a new CAM generation module that extracts the CAM of unseen classes rather than the classical training classes is raised. We validate the effectiveness of our method on Pascal VOC 2012 dataset, the value FB-IoU of one-shot and five-shot arrives at 69.2\% and 70.1\% respectively, which outperforms the state-of-the-art method.
Submitted 18 September, 2019;
originally announced September 2019.
-
Visibility Constrained Generative Model for Depth-based 3D Facial Pose Tracking
Authors:
Lu Sheng,
Jianfei Cai,
Tat-Jen Cham,
Vladimir Pavlovic,
King Ngi Ngan
Abstract:
In this paper, we propose a generative framework that unifies depth-based 3D facial pose tracking and on-the-fly face model adaptation in unconstrained scenarios with heavy occlusions and arbitrary facial expression variations. Specifically, we introduce a statistical 3D morphable model that flexibly describes the distribution of points on the surface of the face model, with an efficient switchable online adaptation that gradually captures the identity of the tracked subject and rapidly constructs a suitable face model when the subject changes. Moreover, unlike prior art that employed ICP-based facial pose estimation, we improve robustness to occlusions by proposing a ray visibility constraint that regularizes the pose based on the face model's visibility with respect to the input point cloud. Ablation studies and experimental results on the Biwi and ICT-3DHP datasets demonstrate that the proposed framework is effective and outperforms competing state-of-the-art depth-based methods.
Submitted 6 May, 2019;
originally announced May 2019.
-
MVF-Net: Multi-View 3D Face Morphable Model Regression
Authors:
Fanzi Wu,
Linchao Bao,
Yajing Chen,
Yonggen Ling,
Yibing Song,
Songnan Li,
King Ngi Ngan,
Wei Liu
Abstract:
We address the problem of recovering the 3D geometry of a human face from a set of facial images in multiple views. While recent studies have shown impressive progress in 3D Morphable Model (3DMM) based facial reconstruction, the settings are mostly restricted to a single view. There is an inherent drawback in the single-view setting: the lack of reliable 3D constraints can cause unresolvable ambiguities. In this paper, we explore 3DMM-based shape recovery in a different setting, where a set of multi-view facial images is given as input. A novel approach is proposed to regress 3DMM parameters from multi-view inputs with an end-to-end trainable Convolutional Neural Network (CNN). Multi-view geometric constraints are incorporated into the network by establishing dense correspondences between different views using a novel self-supervised view alignment loss. The main ingredient of the view alignment loss is a differentiable dense optical flow estimator that can backpropagate the alignment errors between an input view and a synthetic rendering from another input view, which is projected to the target view through the 3D shape to be inferred. By minimizing the view alignment loss, better 3D shapes can be recovered such that the synthetic projections from one view to another align better with the observed image. Extensive experiments demonstrate the superiority of the proposed method over other 3DMM methods.
Submitted 9 April, 2019;
originally announced April 2019.
-
Hierarchy Neighborhood Discriminative Hashing for An Unified View of Single-Label and Multi-Label Image retrieval
Authors:
Lei Ma,
Hongliang Li,
Qingbo Wu,
Fanman Meng,
King Ngi Ngan
Abstract:
Recently, deep supervised hashing methods have become popular for large-scale image retrieval tasks. To preserve the notion of semantic similarity between examples, they typically use pairwise or triplet supervision for hash learning. However, these methods usually ignore the semantic class information, which can help improve the semantic discriminative ability of hash codes. In this paper, we propose a novel hierarchy neighborhood discriminative hashing method. Specifically, we construct a bipartite graph to build a coarse semantic neighbourhood relationship between the sub-class feature centers and the embedding features. Moreover, we use the pairwise supervised information to construct a fine semantic neighbourhood relationship between embedding features. Finally, we propose a hierarchy neighborhood discriminative hashing loss to unify the single-label and multi-label image retrieval problems with a one-stream deep neural network architecture. Experimental results on two large-scale datasets demonstrate that the proposed method can outperform the state-of-the-art hashing methods.
Submitted 11 January, 2019; v1 submitted 10 January, 2019;
originally announced January 2019.
-
3D Facial Expression Reconstruction using Cascaded Regression
Authors:
Fanzi Wu,
Songnan Li,
Tianhao Zhao,
King Ngi Ngan,
Lv Sheng
Abstract:
This paper proposes a novel model fitting algorithm for 3D facial expression reconstruction from a single image. Facial expression reconstruction from a single image is a challenging task in computer vision. Most state-of-the-art methods fit the input image to a 3D Morphable Model (3DMM). These methods need to solve a stochastic problem and cannot deal with expression and pose variations. To solve this problem, we adopt a 3D face expression model and use a combined feature that is robust to scale, rotation and different lighting conditions. The proposed method applies a cascaded regression framework to estimate parameters for the 3DMM. 2D landmarks are detected and used to initialize the 3D shape and mapping matrices. In each iteration, residuals between the current 3DMM parameters and the ground truth are estimated and then used to update the 3D shapes. The mapping matrices are also calculated based on the updated shapes and 2D landmarks. HOG features of the local patches and displacements between 3D landmark projections and 2D landmarks are exploited. Compared with existing methods, the proposed method is robust to expression and pose changes and can reconstruct higher-fidelity 3D face shapes.
Submitted 17 August, 2018; v1 submitted 10 December, 2017;
originally announced December 2017.
-
A Perceptually Weighted Rank Correlation Indicator for Objective Image Quality Assessment
Authors:
Qingbo Wu,
Hongliang Li,
Fanman Meng,
King N. Ngan
Abstract:
In the field of objective image quality assessment (IQA), Spearman's $ρ$ and Kendall's $τ$ are the two most popular rank correlation indicators; they straightforwardly assign uniform weight to all quality levels and assume that each pair of images is sortable. They are successful in measuring the average accuracy of an IQA metric in ranking multiple processed images. However, they ignore two important perceptual properties. Firstly, the sorting accuracy (SA) of high-quality images is usually more important than that of poor-quality ones in many real-world applications, where only the top-ranked images are pushed to users. Secondly, due to the subjective uncertainty in making judgements, two perceptually similar images are usually hardly sortable, and their ranks do not contribute to the evaluation of an IQA metric. To compare different IQA algorithms more accurately, we explore a perceptually weighted rank correlation indicator in this paper, which rewards the capability of correctly ranking high-quality images and suppresses the attention paid to insensitive rank mistakes. More specifically, we focus on activating `valid' pairwise comparisons of image quality, whose difference exceeds a given sensory threshold (ST). Meanwhile, each image pair is assigned a unique weight, which is determined by both the quality level and rank deviation. By modifying the perception threshold, we can illustrate the sorting accuracy with a more sophisticated SA-ST curve, rather than a single rank correlation coefficient. The proposed indicator offers new insight for interpreting visual perception behaviors. Furthermore, the applicability of our indicator is validated in recommending robust IQA metrics for both degraded and enhanced image data.
Submitted 15 May, 2017;
originally announced May 2017.
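The idea of counting only `valid' pairs, those whose subjective quality difference exceeds a sensory threshold, can be sketched in a few lines. The following is a minimal illustration and not the authors' indicator: the function names, the unweighted counting, and the strict-inequality convention are all assumptions made here, and the per-pair weighting by quality level and rank deviation described in the abstract is deliberately left out.

```python
from itertools import combinations


def thresholded_pair_accuracy(subjective, predicted, threshold):
    """Fraction of 'valid' pairs that the predicted scores order correctly.

    A pair is treated as valid only if its subjective scores differ by more
    than `threshold` (an assumed convention). Pairs below the threshold are
    skipped, so they neither reward nor penalise the metric.
    """
    valid = 0
    correct = 0
    for i, j in combinations(range(len(subjective)), 2):
        gap = subjective[i] - subjective[j]
        if abs(gap) <= threshold:
            continue  # perceptually similar pair: not counted
        valid += 1
        # Same sign means the metric ranks the pair the same way as observers.
        if gap * (predicted[i] - predicted[j]) > 0:
            correct += 1
    return correct / valid if valid else float("nan")


def sa_st_curve(subjective, predicted, thresholds):
    """Sorting accuracy evaluated at each sensory threshold (an SA-ST style curve)."""
    return [(t, thresholded_pair_accuracy(subjective, predicted, t)) for t in thresholds]


if __name__ == "__main__":
    # Hypothetical scores: the last two images are nearly indistinguishable.
    subjective = [1.0, 2.0, 3.0, 3.1]
    predicted = [0.1, 0.2, 0.4, 0.3]
    for t, acc in sa_st_curve(subjective, predicted, [0.0, 0.5]):
        print(f"threshold={t}: accuracy={acc:.3f}")
```

Raising the threshold drops the near-tied pair from the count, so a metric is no longer penalised for mis-ordering images that observers could hardly tell apart.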
-
Probing neutrino and Higgs sectors in $SU(2)_1 \times SU(2)_2 \times U(1)_Y$ model with lepton-flavor non-universality
Authors:
L. T. Hue,
A. B. Arbuzov,
N. T. K. Ngan,
H. N. Long
Abstract:
The neutrino and Higgs sectors in the $\mbox{SU(2)}_1 \times \mbox{SU(2)}_2 \times \mbox{U(1)}_Y $ model with lepton-flavor non-universality are discussed. We show that active neutrinos can get Majorana masses from radiative corrections after adding only new singly charged Higgs bosons. The mechanism for generating neutrino masses is the same as in the Zee models. This also gives a hint for solving the dark matter problem in ways similar to those discussed recently in many radiative neutrino mass models with dark matter. Except for the active neutrinos, the appearance of singly charged Higgs bosons and dark matter does not significantly affect the physical spectrum of the particles in the original model. We demonstrate this point by investigating the Higgs sector both before and after the singly charged scalars are added. Many interesting properties of physical Higgs bosons, which were not shown previously, are explored. In particular, the mass matrices of charged and CP-odd Higgs fields are proportional to the coefficient of the triple Higgs coupling $μ$. The mass eigenstates and eigenvalues in the CP-even Higgs sector are also presented. All couplings of the SM-like Higgs boson to normal fermions and gauge bosons differ from the SM predictions by a factor $c_h$, which must satisfy the recent global fit of experimental data, namely $0.995<|c_h|<1$. We have analyzed a more general diagonalization of the gauge boson mass matrices and show that the ratio of the tangents of the $W-W'$ and $Z-Z'$ mixing angles is exactly the cosine of the Weinberg angle, implying that the number of parameters is reduced by 1. Signals of new physics from decays of new heavy fermions and Higgs bosons at the LHC and constraints on their masses are also discussed.
Submitted 25 May, 2017; v1 submitted 21 November, 2016;
originally announced November 2016.
-
Phenomenology of the SU(3)_C \otimes SU(2)_L \otimes SU(3)_R \otimes U(1)_X gauge model
Authors:
P. V. Dong,
D. T. Huong,
D. V. Loi,
N. T. Nhuan,
N. T. K. Ngan
Abstract:
We study the left-right asymmetric model based on SU(3)_C \otimes SU(2)_L \otimes SU(3)_R \otimes U(1)_X gauge group, which improves the theoretical and phenomenological aspects of the known left-right symmetric model. This new gauge symmetry yields that the fermion generation number is three, and the tree-level flavor-changing neutral currents arise in both gauge and scalar sectors. Also, it can provide the observed neutrino masses as well as dark matter automatically. Further, we investigate the mass spectrum of the gauge and scalar fields. All the gauge interactions of the fermions and scalars are derived. We examine the tree-level contributions of the new neutral vector, Z'_R, and new neutral scalar, H_2, to flavor-violating neutral meson mixings, say K-\bar{K}, B_d-\bar{B}_d, and B_s-\bar{B}_s, which strongly constrain the new physics scale as well as the elements of the right-handed quark mixing matrices. The bounds for the new physics scale are in agreement with those coming from the ρ-parameter as well as the mixing parameters between W, Z bosons and new gauge bosons.
Submitted 12 April, 2017; v1 submitted 12 September, 2016;
originally announced September 2016.
-
Phenomenology of the simple 3-3-1 model with inert scalars
Authors:
Phung Van Dong,
N. T. K. Ngan,
T. D. Tham,
L. D. Thien,
N. T. Thuy
Abstract:
The simple 3-3-1 model that contains the minimal lepton and minimal scalar contents is studied in detail. The impact of the inert scalars (i.e., the extra fundamental fields that provide realistic dark matter candidates) on the model is discussed. All the interactions of the model are derived, in which the standard model ones are identified. We constrain the standard-model-like Higgs particle at the LHC. We search for the new particles including the inert ones, which contribute to the $B_s$-$\bar{B}_s$ mixing, the rare $B_s\rightarrow μ^+μ^-$ decay, the CKM unitarity violation, as well as producing the dilepton, dijet, diboson, diphoton, and monojet final states at the LHC.
Submitted 25 March, 2019; v1 submitted 30 December, 2015;
originally announced December 2015.
-
Simple 3-3-1 model and implication for dark matter
Authors:
P. V. Dong,
N. T. K. Ngan,
D. V. Soa
Abstract:
We propose a new and realistic 3-3-1 model with the minimal lepton and scalar contents, named the simple 3-3-1 model. The scalar sector contains two new heavy Higgs bosons, one neutral H and another singly-charged H^\pm, besides the standard model Higgs boson. There is a mixing between the Z boson and the new neutral gauge boson (Z'). The ρ parameter constrains the 3-3-1 breaking scale (w) to be w>460 GeV. The quarks get consistent masses via five-dimensional effective interactions, while the leptons get theirs via interactions up to six dimensions. In particular, the small neutrino masses are generated as a consequence of the approximate lepton-number symmetry of the model. The proton is stabilized due to the lepton-parity conservation (-1)^L. The hadronic FCNCs are calculated; they give a bound w>3.6 TeV and imply that the third quark generation is different from the first two. The correct mass generation for the top quark implies that the minimal scalar sector as proposed is unique. In the simple 3-3-1 model, the other scalars besides the minimal ones can behave as inert fields responsible for dark matter. Triplet, doublet and singlet dark matter are respectively recognized. Our proposals provide solutions for the long-standing dark matter issue in the minimal 3-3-1 model.
Submitted 6 October, 2014; v1 submitted 14 July, 2014;
originally announced July 2014.
-
Scalar decay in a three-dimensional chaotic flow
Authors:
Keith Ngan,
Jacques Vanneste
Abstract:
The decay of a passive scalar in a three-dimensional chaotic flow is studied using high-resolution numerical simulations. The (volume-preserving) flow considered is a three-dimensional extension of the randomised alternating sine flow employed extensively in studies of mixing in two dimensions. It is used to show that theoretical predictions for two-dimensional flows with small diffusivity carry over to three dimensions even though the stretching properties differ significantly. The variance decay rate, scalar field structure, and time evolution of statistical moments confirm that there are two distinct regimes of scalar decay: a locally controlled regime, which applies when the domain size is comparable to the characteristic lengthscale of the velocity field, and a globally controlled regime, which applies when the domain is larger. Asymptotic predictions for the variance decay rate in both regimes show excellent agreement with the numerical results. Consideration of both the forward flow and its time reverse makes it possible to compare the scalar evolution in flows with one or two expanding directions; simulations confirm the theoretical prediction that the decay rate of the scalar is the same in both flows, despite the very different scalar field structures.
Submitted 15 March, 2011; v1 submitted 31 January, 2011;
originally announced January 2011.