
A unified fourth-order Bhatnagar-Gross-Krook lattice Boltzmann model for high-dimensional linear hyperbolic equations

Authors: Ying Chen, Zhenhua Chai, Baochang Shi

Abstract: In this work, we first develop a unified Bhatnagar-Gross-Krook lattice Boltzmann (BGK-LB) model for the $d$($d\geq 1$)-dimensional linear hyperbolic equation (L-HE), where the natural moments and the D$d$Q$(2d^2+1)$ [($2d^2+1$) discrete velocities in $d$-dimensional space] lattice structure are considered. Subsequently, at the acoustic scaling, we conduct an accuracy analysis on the developed BGK-LB model by the direct Taylor expansion (DTE) method, and present the second- and third-order moments of the equilibrium distribution functions (EDFs) to ensure that the BGK-LB model can be fourth-order consistent with the L-HE. And on this basis, when considering the Dirichlet boundary condition, the fourth-order full-way and half-way boundary schemes are proposed to approximate the unknown distribution functions to ensure that the BGK-LB model can be overall fourth-order accurate. Thereafter, based on the kinetic entropy theory, we derive the conditions that the fourth-order moments of the EDFs should satisfy to ensure the microscopic entropy stability of the BGK-LB model. In addition, with the aid of the von Neumann stability analysis, we also discuss the $L^2$ stability of the BGK-LB model and numerically plot the stability regions. In particular, from a numerical perspective, we find that the region of microscopic entropy stability is identical to that of $L^2$ stability. Finally, we carry out some numerical experiments to test the accuracy and stability of the BGK-LB model, and the numerical results are in agreement with our theoretical analysis. In addition, we compare the developed full-way and half-way boundary schemes for the Dirichlet boundary condition, which shows that the full-way boundary scheme is more stable.

Submitted 16 October, 2024; originally announced October 2024.

arXiv:2410.06657 [pdf, ps, other]

Freezing dynamics of wetting droplet under a uniform electric field

Authors: Jiangxu Huang, Hanqing Li, Jiaqi Che, Zhenhua Chai, Lei Wang, Baochang Shi

Abstract: Electrofreezing is a powerful technique that employs the electric field to control and enhance the freezing process. In this work, a phase-field-based lattice Boltzmann (LB) method is developed to study the electrofreezing process of a sessile droplet on a cooled substrate. The accuracy of the present LB method is first validated through performing some simulations of the three-phase Stefan problem, the droplet freezing on a cold wall, and the droplet deformation under a uniform electric field. Then it is used to investigate the effect of an electric field on the freezing of a wetting droplet on a cold substrate, and the numerical results show that the electric field has a significant influence on the freezing time of the droplet mainly through changing the morphology of the droplet. In particular, under the effect of the electric field, the freezing time is increased for the droplet with a prolate pattern, while the freezing time of the droplet with an oblate pattern is decreased. These numerical results bring some new insights into electrofreezing and provide valuable guidance for the precise regulation of droplet freezing.

Submitted 21 October, 2024; v1 submitted 9 October, 2024; originally announced October 2024.

Comments: 19 pages, 14 figures

arXiv:2410.00773 [pdf, other]

BabelBench: An Omni Benchmark for Code-Driven Analysis of Multimodal and Multistructured Data

Authors: Xuwu Wang, Qiwen Cui, Yunzhe Tao, Yiran Wang, Ziwei Chai, Xiaotian Han, Boyi Liu, Jianbo Yuan, Jing Su, Guoyin Wang, Tingkai Liu, Liyu Chen, Tianyi Liu, Tao Sun, Yufeng Zhang, Sirui Zheng, Quanzeng You, Yang Yang, Hongxia Yang

Abstract: Large language models (LLMs) have become increasingly pivotal across various domains, especially in handling complex data types. This includes structured data processing, as exemplified by ChartQA and ChatGPT-Ada, and multimodal unstructured data processing as seen in Visual Question Answering (VQA). These areas have attracted significant attention from both industry and academia. Despite this, there remains a lack of unified evaluation methodologies for these diverse data handling scenarios. In response, we introduce BabelBench, an innovative benchmark framework that evaluates the proficiency of LLMs in managing multimodal multistructured data with code execution. BabelBench incorporates a dataset comprising 247 meticulously curated problems that challenge the models with tasks in perception, commonsense reasoning, logical reasoning, and so on. Besides the basic capabilities of multimodal understanding, structured data processing as well as code generation, these tasks demand advanced capabilities in exploration, planning, reasoning and debugging. Our experimental findings on BabelBench indicate that even cutting-edge models like ChatGPT 4 exhibit substantial room for improvement. The insights derived from our comprehensive analysis offer valuable guidance for future research within the community. The benchmark data can be found at https://github.com/FFD8FFE/babelbench.

Submitted 1 October, 2024; originally announced October 2024.

arXiv:2408.15654 [pdf, other]

Phase-field-based lattice Boltzmann method for the transport of insoluble surfactant in two-phase flows

Authors: Chengjie Zhan, Hong Liang, Zhenhua Chai, Baochang Shi

Abstract: In this work, we present a general second-order phase-field model for the transport of insoluble surfactant in incompressible two-phase flows. In this model, the second-order local Allen-Cahn equation is applied for interface capturing, a general form of the simple scalar transport equation [S. S. Jain, J. Comput. Phys. 515, 113277 (2024)] is adopted for interface-confined surfactant, and the consistent and conservative Navier-Stokes equations with the Marangoni force are used for fluid flows. To solve this model, we further developed a mesoscopic lattice Boltzmann (LB) method, in which the LB model for the surfactant transport equation is proposed under the general LB framework for the convection-diffusion type equation, and it can correctly recover the governing equation for surfactant transport. The accuracy of the present LB method is tested by several benchmark problems, and the numerical results show it has a good performance for the transport of the insoluble surfactant in two-phase flows.

Submitted 31 August, 2024; v1 submitted 28 August, 2024; originally announced August 2024.

arXiv:2407.20404 [pdf, other]

Functional Analytic Derivation and CP2K Implementation of the SCCS Model Based on the Solvent-Aware Interface

Authors: Ziwei Chai, Sandra Luber

Abstract: In the self-consistent continuum solvation (SCCS) approach ($\textit{J. Chem. Phys.}$ 136, 064102 (2012)), the analytical expressions of the local solute-solvent interface functions determine the interface function and dielectric function values at a given real space position based solely on the electron density at that position, completely disregarding the surrounding electron density distribution. Therefore, the low electron density areas inside the solute will be identified by the algorithm as regions where implicit solvent exists, resulting in the emergence of non-physical implicit solvent regions within the solute and even potentially leading to the divergence catastrophe of Kohn-Sham SCF calculations. We present a new and efficient SCCS implementation based on the solvent-aware interface ($\textit{J. Chem. Theory Comput.}$ 15, 3, 1996-2009 (2019)) which addresses this issue by utilizing a solute-solvent interface function based on convolution of electron density in the CP2K software package, which is based on the mixed Gaussian and plane waves (GPW) approach. Starting with the foundational formulas of SCCS, we have rigorously and meticulously derived the contributions of the newly defined electrostatic energy to the Kohn-Sham potential and the analytical forces. This comprehensive derivation utilizes the updated versions of the solute-solvent interface function and the dielectric function, tailored to align with the specifics of the GPW implementation. Our implementation has been tested to successfully eliminate non-physical implicit solvent regions within the solute and achieve good SCF convergence, as demonstrated by test results for both bulk and surface models, namely liquid H$_2$O, titanium dioxide, and platinum.

Submitted 29 July, 2024; originally announced July 2024.

arXiv:2407.13256 [pdf]

Minimum tracking linear response Hubbard and Hund corrected Density Functional Theory in CP2K

Authors: Ziwei Chai, Rutong Si, Mingyang Chen, Gilberto Teobaldi, David D. O'Regan, Li-Min Liu

Abstract: We present the implementation of the Hubbard ($U$) and Hund ($J$) corrected Density Functional Theory (DFT+$U$+$J$) functionality in the Quickstep program, which is part of the CP2K suite. The tensorial and Löwdin subspace representations are implemented and compared. Full analytical DFT+$U$+$J$ forces are implemented and benchmarked for the tensorial and Löwdin representations. We also present the implementation of the recently proposed minimum-tracking linear-response method that enables the $U$ and $J$ parameters to be calculated on a first-principles basis without reference to the Kohn-Sham eigensystem. These implementations are benchmarked against recent results for different materials properties including DFT+$U$ band gap opening in NiO, the relative stability of various polaron distributions in TiO$_2$, the dependence of the calculated TiO$_2$ band gap on +$J$ corrections, and, finally, the role of the +$U$ and +$J$ corrections for the computed properties of a series of the hexahydrated transition metals. Our implementation provides results consistent with those already reported in the literature from comparable methods. We conclude the contribution with tests on the influence of the Löwdin orthonormalization on the occupancies, calculated parameters, and derived properties.

Submitted 24 July, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

arXiv:2407.08386 [pdf, other]

Improved Model and Analysis for RIS-Assisted Indoor Terahertz Wireless Networks

Authors: Zhi Chai, Jiajie Xu, Mohamed-Slim Alouini, Justin P. Coon

Abstract: In this paper, we propose a new model for indoor THz communication assisted by RIS. We conduct a realistic modeling of indoor obstacles and analyze their impact on performance. Order statistics are applied to calculate the cumulative distribution functions (CDFs) of distances from the transmitter to the selected RIS, i.e., the nearest RIS in the bounded indoor environment to the transmitter, and from the selected RIS to the receiver. We calculate the coverage probability (CP) as a function of RIS number, obstacle density, room size, and the transmitter's location. By comparing the numerical results obtained from the analytical expressions with Monte Carlo simulations, we verify the accuracy of our analysis. Through numerical results, it is observed that room size and obstacle density affect the CP in a significant way. However, by optimizing the transmitter's location and increasing the RIS number deployed in the room, the CP can be significantly improved (e.g., an increase of around 15% by optimizing the transmitter's location, and an increase of around 30% by increasing the RIS number deployed in the room).

Submitted 11 July, 2024; originally announced July 2024.

Comments: 11 pages, 11 figures, submitted to IEEE Transactions on Wireless Communications

arXiv:2407.05010 [pdf, other]

PRANCE: Joint Token-Optimization and Structural Channel-Pruning for Adaptive ViT Inference

Authors: Ye Li, Chen Tang, Yuan Meng, Jiajun Fan, Zenghao Chai, Xinzhu Ma, Zhi Wang, Wenwu Zhu

Abstract: We introduce PRANCE, a Vision Transformer compression framework that jointly optimizes the activated channels and reduces tokens, based on the characteristics of inputs. Specifically, PRANCE leverages adaptive token optimization strategies for a certain computational budget, aiming to accelerate ViTs' inference from a unified data and architectural perspective. However, the joint framework poses challenges to both architectural and decision-making aspects. First, while ViTs inherently support variable-token inference, they do not facilitate dynamic computations for variable channels. To overcome this limitation, we propose a meta-network using weight-sharing techniques to support arbitrary channels of the Multi-head Self-Attention and Multi-layer Perceptron layers, serving as a foundational model for architectural decision-making. Second, simultaneously optimizing the structure of the meta-network and input data constitutes a combinatorial optimization problem with an extremely large decision space, reaching up to around $10^{14}$, making supervised learning infeasible. To this end, we design a lightweight selector employing Proximal Policy Optimization for efficient decision-making. Furthermore, we introduce a novel "Result-to-Go" training mechanism that models ViTs' inference process as a Markov decision process, significantly reducing action space and mitigating delayed-reward issues during training. Extensive experiments demonstrate the effectiveness of PRANCE in reducing FLOPs by approximately 50%, retaining only about 10% of tokens while achieving lossless Top-1 accuracy. Additionally, our framework is shown to be compatible with various token optimization techniques such as pruning, merging, and sequential pruning-merging strategies. The code is available at https://github.com/ChildTang/PRANCE.

Submitted 6 July, 2024; originally announced July 2024.

arXiv:2407.01651 [pdf, other]

Phase-field modeling of dendritic growth with gas bubbles in the solidification of binary alloys

Authors: Chengjie Zhan, Zhenhua Chai, Dongke Sun, Baochang Shi, Shaoning Geng, Ping Jiang

Abstract: In this work, a phase-field model is developed for the dendritic growth with gas bubbles in the solidification of binary alloys. In this model, a total free energy for the complex gas-liquid-dendrite system is proposed through considering the interactions of gas bubbles, liquid melt and solid dendrites, and it can reduce to the energy for gas-liquid flows in the region far from the solid phase, while degenerating to the energy for thermosolutal dendritic growth when the gas bubble disappears. The governing equations are usually obtained by minimizing the total free energy, but here some modifications are made to improve the capacity of the conservative phase-field equation for gas bubbles and convection-diffusion equation for solute transfer. Additionally, through the asymptotic analysis of the thin-interface limit, the present general phase-field model for alloy solidification can match the corresponding free boundary problem, and it is identical to the commonly used models under a specific choice of model parameters. Furthermore, to describe the fluid flow, the incompressible Navier-Stokes equations are adopted in the entire domain including gas, liquid, and solid regions, where the fluid-structure interaction is considered by a simple diffuse-interface method. To test the present phase-field model, the lattice Boltzmann method is used to study several problems of gas-liquid flows, dendritic growth as well as the solidification in the presence of gas bubbles, and a good performance of the present model for such complex problems is observed.

Submitted 1 July, 2024; originally announced July 2024.

Comments: 29 pages, 23 figures

arXiv:2406.04629 [pdf, other]

STAR: Skeleton-aware Text-based 4D Avatar Generation with In-Network Motion Retargeting

Authors: Zenghao Chai, Chen Tang, Yongkang Wong, Mohan Kankanhalli

Abstract: The creation of 4D avatars (i.e., animated 3D avatars) from text description typically uses text-to-image (T2I) diffusion models to synthesize 3D avatars in the canonical space and subsequently applies animation with target motions. However, such an optimization-by-animation paradigm has several drawbacks. (1) For pose-agnostic optimization, the rendered images in canonical pose for naive Score Distillation Sampling (SDS) exhibit a domain gap and cannot preserve view-consistency using only T2I priors, and (2) for post hoc animation, simply applying the source motions to target 3D avatars yields translation artifacts and misalignment. To address these issues, we propose Skeleton-aware Text-based 4D Avatar generation with in-network motion Retargeting (STAR). STAR considers the geometry and skeleton differences between the template mesh and target avatar, and corrects the mismatched source motion by resorting to the pretrained motion retargeting techniques. With the informatively retargeted and occlusion-aware skeleton, we embrace the skeleton-conditioned T2I and text-to-video (T2V) priors, and propose a hybrid SDS module to coherently provide multi-view and frame-consistent supervision signals. Hence, STAR can progressively optimize the geometry, texture, and motion in an end-to-end manner. The quantitative and qualitative experiments demonstrate our proposed STAR can synthesize high-quality 4D avatars with vivid animations that align well with the text description. Additional ablation studies show the contributions of each component in STAR. The source code and demos are available at: https://star-avatar.github.io.

Submitted 7 June, 2024; originally announced June 2024.

Comments: Tech report

arXiv:2403.18569 [pdf, other]

PDNNet: PDN-Aware GNN-CNN Heterogeneous Network for Dynamic IR Drop Prediction

Authors: Yuxiang Zhao, Zhuomin Chai, Xun Jiang, Yibo Lin, Runsheng Wang, Ru Huang

Abstract: IR drop on the power delivery network (PDN) is closely related to PDN's configuration and cell current consumption. As the integrated circuit (IC) design is growing larger, dynamic IR drop simulation becomes computationally unaffordable and machine learning based IR drop prediction has been explored as a promising solution. Although CNN-based methods have been adapted to the IR drop prediction task in several works, the shortcoming of overlooking PDN configuration is non-negligible. In this paper, we consider not only how to properly represent cell-PDN relation, but also how to model IR drop following its physical nature in the feature aggregation procedure. Thus, we propose a novel graph structure, PDNGraph, to unify the representations of the PDN structure and the fine-grained cell-PDN relation. We further propose a dual-branch heterogeneous network, PDNNet, incorporating two parallel GNN-CNN branches to favorably capture the above features during the learning process. Several key designs are presented to make the dynamic IR drop prediction highly effective and interpretable. This is the first work to apply graph structure to a deep-learning based dynamic IR drop prediction method. Experiments show that PDNNet outperforms the state-of-the-art CNN-based methods by up to 39.3% reduction in prediction error and achieves 545x speedup compared to the commercial tool, which demonstrates the superiority of our method.

Submitted 27 March, 2024; originally announced March 2024.

arXiv:2403.16854 [pdf, other]

An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing

Authors: Ziwei Chai, Guoyin Wang, Jing Su, Tianjie Zhang, Xuanwen Huang, Xuwu Wang, Jingjing Xu, Jianbo Yuan, Hongxia Yang, Fei Wu, Yang Yang

Abstract: We present Expert-Token-Routing, a unified generalist framework that facilitates seamless integration of multiple expert LLMs. Our framework represents expert LLMs as special expert tokens within the vocabulary of a meta LLM. The meta LLM can route to an expert LLM like generating new tokens. Expert-Token-Routing not only supports learning the implicit expertise of expert LLMs from existing instruction datasets but also allows for dynamic extension of new expert LLMs in a plug-and-play manner. It also conceals the detailed collaboration process from the user's perspective, facilitating interaction as though it were a singular LLM. Our framework outperforms various existing multi-LLM collaboration paradigms across benchmarks that incorporate six diverse expert domains, demonstrating effectiveness and robustness in building a generalist LLM system via synergizing multiple expert LLMs.

Submitted 11 June, 2024; v1 submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.16122 [pdf, ps, other]

Phase-field based lattice Boltzmann method for containerless freezing

Authors: Jiangxu Huang, Lei Wang, Zhenhua Chai, Baochang Shi

Abstract: In this paper, a lattice Boltzmann model is proposed to simulate solid-liquid phase change phenomena in multiphase systems. The model couples the thermal properties of the solidification front with the dynamics of the liquid droplet interface, which enables the description of the complex interfacial changes during the solid-liquid phase change process. The model treats the interfaces of gas, liquid, and solid phases using the phase field order parameter and the solid fraction. The volume expansion or contraction caused by the change of properties such as density during phase change is represented by adding a mass source term to the continuum equation. The proposed model is first validated by the three-phase Stefan problem and the droplet solidification on a cold surface, and the numerical results are in good agreement with the analytical and experimental results. Then it is used to model the solidification problem with bubbles. The results show that the model is able to accurately capture the effect of bubbles on the solidification process, which is in good agreement with previous work. In addition, a parametric study is carried out to examine the dependence of the sessile droplet solidification on different physical and numerical parameters. The results show that the droplet solidification time increases with increasing droplet volume and contact angle.

Submitted 3 April, 2024; v1 submitted 24 March, 2024; originally announced March 2024.

Comments: 18 pages, 11 figures

arXiv:2402.15752 [pdf, ps, other]

A phase-field-based lattice Boltzmann method for two-phase flows with the interfacial mass/heat transfer

Authors: Baihui Chen, Chengjie Zhan, Zhenhua Chai, Baochang Shi

Abstract: In this work, we develop a phase-field-based lattice Boltzmann (LB) method for a two-scalar model of the two-phase flows with interfacial mass/heat transfer. Through the Chapman-Enskog analysis, we show that the present LB method can correctly recover the governing equations for phase field, flow field and concentration/temperature field. In particular, to derive the two-scalar equations for the mass/heat transfer, we propose a new LB model with an auxiliary source distribution function to describe the extra flux terms, and the discretizations of some derivative terms can be avoided. The accuracy and efficiency of the present method are also tested through several benchmark problems, and the influence of mass/heat transfer on the fluid viscosity is further considered by introducing an exponential relation. The numerical results show that the present LB method is suitable for the two-phase flows with interfacial mass/heat transfer.

Submitted 24 February, 2024; originally announced February 2024.

Comments: 30 pages, 12 figures

arXiv:2402.12984 [pdf, other]

Can GNN be Good Adapter for LLMs?

Authors: Xuanwen Huang, Kaiqiao Han, Yang Yang, Dezheng Bao, Quanjin Tao, Ziwei Chai, Qi Zhu

Abstract: Recently, large language models (LLMs) have demonstrated superior capabilities in understanding and zero-shot learning on textual data, promising significant advances for many text-related domains. In the graph domain, various real-world scenarios also involve textual data, where tasks and node features can be described by text. These text-attributed graphs (TAGs) have broad applications in social media, recommendation systems, etc. Thus, this paper explores how to utilize LLMs to model TAGs. Previous methods for TAG modeling are based on million-scale LMs. When scaled up to billion-scale LLMs, they face huge challenges in computational costs. Additionally, they also ignore the zero-shot inference capabilities of LLMs. Therefore, we propose GraphAdapter, which uses a graph neural network (GNN) as an efficient adapter in collaboration with LLMs to tackle TAGs. In terms of efficiency, the GNN adapter introduces only a few trainable parameters and can be trained with low computation costs. The entire framework is trained using auto-regression on node text (next token prediction). Once trained, GraphAdapter can be seamlessly fine-tuned with task-specific prompts for various downstream tasks. Through extensive experiments across multiple real-world TAGs, GraphAdapter based on Llama 2 gains an average improvement of approximately 5% in terms of node classification. Furthermore, GraphAdapter can also adapt to other language models, including RoBERTa and GPT-2. The promising results demonstrate that GNNs can serve as effective adapters for LLMs in TAG modeling.

Submitted 20 February, 2024; originally announced February 2024.

Comments: Accepted by WWW'24

arXiv:2402.09372 [pdf, other]

Deep Rib Fracture Instance Segmentation and Classification from CT on the RibFrac Challenge

Authors: Jiancheng Yang, Rui Shi, Liang Jin, Xiaoyang Huang, Kaiming Kuang, Donglai Wei, Shixuan Gu, Jianying Liu, Pengfei Liu, Zhizhong Chai, Yongjie Xiao, Hao Chen, Liming Xu, Bang Du, Xiangyi Yan, Hao Tang, Adam Alessio, Gregory Holste, Jiapeng Zhang, Xiaoming Wang, Jianye He, Lixuan Che, Hanspeter Pfister, Ming Li, Bingbing Ni

Abstract: Rib fractures are a common and potentially severe injury that can be challenging and labor-intensive to detect in CT scans. While there have been efforts to address this field, the lack of large-scale annotated datasets and evaluation benchmarks has hindered the development and validation of deep learning algorithms. To address this issue, the RibFrac Challenge was introduced, providing a benchmark dataset of over 5,000 rib fractures from 660 CT scans, with voxel-level instance mask annotations and diagnosis labels for four clinical categories (buckle, nondisplaced, displaced, or segmental). The challenge includes two tracks: a detection (instance segmentation) track evaluated by an FROC-style metric and a classification track evaluated by an F1-style metric. During the MICCAI 2020 challenge period, 243 results were evaluated, and seven teams were invited to participate in the challenge summary. The analysis revealed that several top rib fracture detection solutions achieved performance comparable to or even better than that of human experts. Nevertheless, the current rib fracture classification solutions are hardly clinically applicable, which can be an interesting area in the future. As an active benchmark and research resource, the data and online evaluation of the RibFrac Challenge are available at the challenge website. As an independent contribution, we have also extended our previous internal baseline by incorporating recent advancements in large-scale pretrained networks and point-based rib segmentation techniques. The resulting FracNet+ demonstrates competitive performance in rib fracture detection, which lays a foundation for further research and development in AI-assisted rib fracture detection and diagnosis.

Submitted 14 February, 2024; originally announced February 2024.

Comments: Challenge paper for MICCAI RibFrac Challenge (https://ribfrac.grand-challenge.org/)

arXiv:2402.02168 [pdf, other]

One Graph Model for Cross-domain Dynamic Link Prediction

Authors: Xuanwen Huang, Wei Chow, Yang Wang, Ziwei Chai, Chunping Wang, Lei Chen, Yang Yang

Abstract: This work proposes DyExpert, a dynamic graph model for cross-domain link prediction. It can explicitly model historical evolving processes to learn the evolution pattern of a specific downstream graph and subsequently make pattern-specific link predictions. DyExpert adopts a decoder-only transformer and is capable of efficient parallel training and inference through conditioned link generation, which integrates both evolution modeling and link prediction. DyExpert is trained on extensive dynamic graphs across diverse domains, comprising 6M dynamic edges. Extensive experiments on eight untrained graphs demonstrate that DyExpert achieves state-of-the-art performance in cross-domain link prediction. Compared to the advanced baseline under the same setting, DyExpert achieves an average improvement of 11.40% in Average Precision across eight graphs. More impressively, it surpasses the fully supervised performance of 8 advanced baselines on 6 untrained graphs.

Submitted 3 February, 2024; originally announced February 2024.

Comments: Under review

arXiv:2401.07314 [pdf, other]

MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation

Authors: Jiaqi Chen, Bingqian Lin, Ran Xu, Zhenhua Chai, Xiaodan Liang, Kwan-Yee K. Wong

Abstract: Embodied agents equipped with GPT as their brains have exhibited extraordinary decision-making and generalization abilities across various tasks. However, existing zero-shot agents for vision-and-language navigation (VLN) only prompt GPT-4 to select potential locations within localized environments, without constructing an effective "global-view" for the agent to understand the overall environment. In this work, we present a novel map-guided GPT-based agent, dubbed MapGPT, which introduces an online linguistic-formed map to encourage global exploration. Specifically, we build an online map and incorporate it into the prompts that include node information and topological relationships, to help GPT understand the spatial environment. Benefiting from this design, we further propose an adaptive planning mechanism to assist the agent in performing multi-step path planning based on a map, systematically exploring multiple candidate nodes or sub-goals step by step. Extensive experiments demonstrate that our MapGPT is applicable to both GPT-4 and GPT-4V, achieving state-of-the-art zero-shot performance on R2R and REVERIE simultaneously (~10% and ~12% improvements in SR), and showcasing the newly emergent global thinking and path planning abilities of the GPT.

Submitted 20 June, 2024; v1 submitted 14 January, 2024; originally announced January 2024.

Comments: LLM/VLM-based VLN Agents. Accepted to ACL 2024. Project: https://chen-judge.github.io/MapGPT/

arXiv:2401.05507 [pdf, other]

InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks

Authors: Xueyu Hu, Ziyu Zhao, Shuang Wei, Ziwei Chai, Qianli Ma, Guoyin Wang, Xuwu Wang, Jing Su, Jingjing Xu, Ming Zhu, Yao Cheng, Jianbo Yuan, Jiwei Li, Kun Kuang, Yang Yang, Hongxia Yang, Fei Wu

Abstract: In this paper, we introduce InfiAgent-DABench, the first benchmark specifically designed to evaluate LLM-based agents on data analysis tasks. These tasks require agents to solve complex tasks end-to-end by interacting with an execution environment. This benchmark contains DAEval, a dataset consisting of 257 data analysis questions derived from 52 CSV files, and an agent framework which incorporates LLMs to serve as data analysis agents for both serving and evaluation. Since data analysis questions are often open-ended and hard to evaluate without human supervision, we adopt a format-prompting technique to convert each question into a closed-form format so that it can be automatically evaluated. Our extensive benchmarking of 34 LLMs uncovers the current challenges encountered in data analysis tasks. In addition, building on top of our agent framework, we develop a specialized agent, DAAgent, which surpasses GPT-3.5 by 3.9% on DABench. Evaluation datasets and toolkits for InfiAgent-DABench are released at https://github.com/InfiAgent/InfiAgent.

Submitted 11 March, 2024; v1 submitted 10 January, 2024; originally announced January 2024.

Comments: 27 pages, 7 figures, work in progress

arXiv:2401.00625 [pdf, ps, other]

Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models

Authors: Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang Zhao

Abstract: The burgeoning field of Large Language Models (LLMs), exemplified by sophisticated models like OpenAI's ChatGPT, represents a significant advancement in artificial intelligence. These models, however, bring forth substantial challenges in the high consumption of computational, memory, energy, and financial resources, especially in environments with limited resource capabilities. This survey aims to systematically address these challenges by reviewing a broad spectrum of techniques designed to enhance the resource efficiency of LLMs. We categorize methods based on their optimization focus (computational, memory, energy, financial, and network resources) and their applicability across various stages of an LLM's lifecycle, including architecture design, pretraining, finetuning, and system design. Additionally, the survey introduces a nuanced categorization of resource efficiency techniques by their specific resource types, which uncovers the intricate relationships and mappings between various resources and corresponding optimization techniques. A standardized set of evaluation metrics and datasets is also presented to facilitate consistent and fair comparisons across different models and techniques. By offering a comprehensive overview of the current state of the art and identifying open research avenues, this survey serves as a foundational reference for researchers and practitioners, aiding them in developing more sustainable and efficient LLMs in a rapidly evolving landscape.

Submitted 3 January, 2024; v1 submitted 31 December, 2023; originally announced January 2024.

Comments: Preprint. GitHub repo: https://github.com/tiingweii-shii/Awesome-Resource-Efficient-LLM-Papers

arXiv:2312.07010 [pdf, ps, other]

Regularized lattice Boltzmann method based maximum principle and energy stability preserving finite-difference scheme for the Allen-Cahn equation

Authors: Ying Chen, Xi Liu, Zhenhua Chai, Baochang Shi

Abstract: The Allen-Cahn equation (ACE) inherently possesses two crucial properties: the maximum principle and the energy dissipation law. Preserving these two properties at the discrete level is also necessary in the numerical methods for the ACE. In this paper, unlike the traditional top-down macroscopic numerical schemes which discretize the ACE directly, we first propose a novel bottom-up mesoscopic regularized lattice Boltzmann method based macroscopic numerical scheme for the d (=1, 2, 3)-dimensional ACE, where the DdQ(2d+1) [(2d+1) discrete velocities in d-dimensional space] lattice structure is adopted. In particular, the proposed macroscopic numerical scheme has second-order accuracy in space, and can also be viewed as an implicit-explicit finite-difference scheme for the ACE, in which the nonlinear term is discretized semi-implicitly, and the temporal derivative and dissipation term of the ACE are discretized by using the explicit Euler method and second-order central difference method, respectively. Then we also demonstrate that the proposed scheme can preserve the maximum bound principle and the original energy dissipation law at the discrete level under some conditions. Finally, some numerical experiments are conducted to validate our theoretical analysis.

Submitted 3 January, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

arXiv:2311.10314 [pdf, other]

A consistent and conservative diffuse-domain lattice Boltzmann method for multiphase flows in complex geometries

Authors: Xi Liu, Chengjie Zhan, Yin Chen, Zhenhua Chai, Baochang Shi

Abstract: Modeling and simulation of multiphase flows in complex geometries are challenging due to the complexity in describing the interface topology changes among different phases and the difficulty in implementing the boundary conditions on the irregular solid surface. In this work, we first developed a diffuse-domain (DD) based phase-field model for multiphase flows in complex geometries. In this model, the irregular fluid region is embedded into a larger and regular domain by introducing a smooth characteristic function. Then, the reduction-consistent and conservative phase-field equation for the multiphase field and the consistent and conservative Navier-Stokes equations for the flow field are reformulated as the diffuse-domain based consistent and conservative (DD-CC) equations where some additional source terms are added to reflect the effects of boundary conditions. In this case, there is no need to directly treat the complex boundary conditions on the irregular solid surface, and additionally, based on a matched asymptotic analysis, it is also shown that the DD-CC equations can converge to the original governing equations as the interface width parameter tends to zero. Furthermore, to solve the DD-CC equations, we proposed a novel and simple lattice Boltzmann (LB) method with a Hermite-moment-based collision matrix which can not only preserve the consistency and conservation properties, but also improve the numerical stability with a flexible parameter. With the help of the direct Taylor expansion, the macroscopic DD-CC equations can be recovered correctly from the present LB method. Finally, to test the capacity of the LB method, several benchmarks and complex problems are considered, and the numerical results show that the present LB method is accurate and efficient for the multiphase flows in complex geometries.

Submitted 16 November, 2023; originally announced November 2023.

Comments: 22 pages, 9 figures

arXiv:2311.05097 [pdf, other]

A thermodynamically consistent and conservative diffuse-interface model for gas-liquid-solid multiphase flows

Authors: Chengjie Zhan, Xi Liu, Zhenhua Chai, Baochang Shi

Abstract: In this work, a thermodynamically consistent and conservative diffuse-interface model for gas-liquid-solid multiphase flows is proposed. In this model, a novel free energy for the gas-liquid-solid multiphase flows is established according to a ternary phase-field model, and it not only contains the standard bulk and interface free energies for two-phase flows, but also includes some additional terms to reflect the penalty in the solid phase and the wettability on the solid surface. Furthermore, a smooth indicator function of the solid phase is also introduced in the consistent Navier-Stokes equations to achieve a high viscosity in the solid phase and preserve the velocity boundary conditions on the solid surface. Based on the proposed diffuse-interface model, the fluid interface dynamics, the fluid-structure interaction, and the wetting property of the solid surface can be described simply and efficiently. Additionally, the total energy is also proved to be dissipative for the two-phase flows in the stationary geometries. To test the present diffuse-interface model, we develop a consistent and conservative lattice Boltzmann method and conduct some simulations. The numerical results also confirm the energy dissipation and good capability of the proposed diffuse-interface model in the study of two-phase flows in complex geometries and gas-liquid-particle multiphase flows.

Submitted 8 November, 2023; originally announced November 2023.

Comments: 27 pages, 23 figures

arXiv:2311.05091 [pdf, other]

doi 10.1016/j.physd.2024.134087

A ternary phase-field model for two-phase flows in complex geometries

Authors: Chengjie Zhan, Zhenhua Chai, Baochang Shi

Abstract: In this work, a ternary phase-field model for two-phase flows in complex geometries is proposed. In this model, one of the three components in the classical ternary Cahn-Hilliard model is considered as the solid phase, and only one Cahn-Hilliard equation with degenerate mobility needs to be solved due to the condition of volume conservation, which is consistent with the standard phase-field model with a single-scalar variable for two-phase flows. To depict different wetting properties at the complex fluid-solid boundaries, the spreading parameters in the ternary phase-field model are determined based on Young's law, in which the liquid-solid surface tension coefficient is assumed to be a linear function of the gas-liquid surface tension coefficient and related to the contact angle and the minimum curvature of the solid surface. In addition, to achieve a high viscosity in the solid phase and preserve the velocity boundary conditions on the solid surface, the phase-field variable of the solid phase is also used to derive the modified Navier-Stokes equations. To test the present model, we further develop a consistent and conservative Hermite-moment based lattice Boltzmann method where an adjustable scale factor is introduced to improve the numerical stability, and conduct the numerical simulations of several benchmark problems. The results illustrate that the present model performs well in the study of two-phase flows in complex geometries.

Submitted 8 November, 2023; originally announced November 2023.

Comments: 23 pages, 25 figures

arXiv:2310.05845 [pdf, other]

GraphLLM: Boosting Graph Reasoning Ability of Large Language Model

Authors: Ziwei Chai, Tianjie Zhang, Liang Wu, Kaiqiao Han, Xiaohai Hu, Xuanwen Huang, Yang Yang

Abstract: The advancement of Large Language Models (LLMs) has remarkably pushed the boundaries towards artificial general intelligence (AGI), with their exceptional ability in understanding diverse types of information, including but not limited to images and audio. Despite this progress, a critical gap remains in empowering LLMs to proficiently understand and reason on graph data. Recent studies underscore LLMs' underwhelming performance on fundamental graph reasoning tasks. In this paper, we endeavor to unearth the obstacles that impede LLMs in graph reasoning, pinpointing the common practice of converting graphs into natural language descriptions (Graph2Text) as a fundamental bottleneck. To overcome this impediment, we introduce GraphLLM, a pioneering end-to-end approach that synergistically integrates graph learning models with LLMs. This synergy equips LLMs with the ability to proficiently interpret and reason on graph data, harnessing the superior expressive power of graph learning models. Our empirical evaluations across four fundamental graph reasoning tasks validate the effectiveness of GraphLLM. The results exhibit a substantial average accuracy enhancement of 54.44%, alongside a noteworthy context reduction of 96.45% across various graph reasoning tasks.

Submitted 9 October, 2023; originally announced October 2023.

arXiv:2309.02825 [pdf, ps, other]

A Cole-Hopf transformation based fourth-order multiple-relaxation-time lattice Boltzmann model for the coupled Burgers' equations

Authors: Ying Chen, Xi Liu, Zhenhua Chai, Baochang Shi

Abstract: In this work, a Cole-Hopf transformation based fourth-order multiple-relaxation-time lattice Boltzmann (MRT-LB) model for d-dimensional coupled Burgers' equations is developed. We first adopt the Cole-Hopf transformation where an intermediate variable θ is introduced to eliminate the nonlinear convection terms in the Burgers' equations on the velocity u=(u_1,u_2,...,u_d). In this case, a diffusion equation on the variable θ can be obtained, and particularly, the velocity u in the coupled Burgers' equations is determined by the variable θ and its gradient term ∇θ. Then we develop a general MRT-LB model with the natural moments for the d-dimensional transformed diffusion equation and present the corresponding macroscopic finite-difference scheme. At the diffusive scaling, the fourth-order modified equation of the developed MRT-LB model is derived through the Maxwell iteration method. With the aid of the free parameters in the MRT-LB model, we find that not only can the consistent fourth-order modified equation be obtained, but the gradient term ∇θ can also be calculated locally by the non-equilibrium distribution function with fourth-order accuracy. This indicates that, theoretically, the MRT-LB model for d-dimensional coupled Burgers' equations can achieve fourth-order accuracy in space. Finally, some simulations are conducted to test the MRT-LB model, and the numerical results show that the proposed MRT-LB model has a fourth-order convergence rate, which is consistent with our theoretical analysis.

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2308.13466 [pdf, other]

Staleness-Alleviated Distributed GNN Training via Online Dynamic-Embedding Prediction

Authors: Guangji Bai, Ziyang Yu, Zheng Chai, Yue Cheng, Liang Zhao

Abstract: Despite the recent success of Graph Neural Networks (GNNs), it remains challenging to train GNNs on large-scale graphs due to neighbor explosions. As a remedy, distributed computing becomes a promising solution by leveraging abundant computing resources (e.g., GPUs). However, the node dependency of graph data increases the difficulty of achieving high concurrency in distributed GNN training, which suffers from massive communication overhead. To address this, historical value approximation is deemed a promising class of distributed training techniques. It utilizes an offline memory to cache historical information (e.g., node embedding) as an affordable approximation of the exact value and achieves high concurrency. However, such benefits come at the cost of involving dated training information, leading to staleness, imprecision, and convergence issues. To overcome these challenges, this paper proposes SAT (Staleness-Alleviated Training), a novel and scalable distributed GNN training framework that reduces the embedding staleness adaptively. The key idea of SAT is to model the GNN's embedding evolution as a temporal graph and build a model upon it to predict future embedding, which effectively alleviates the staleness of the cached historical embedding. We propose an online algorithm to train the embedding predictor and the distributed GNN alternately and further provide a convergence analysis. Empirically, we demonstrate that SAT can effectively reduce embedding staleness and thus achieve better performance and convergence speed on multiple large-scale graph datasets.

Submitted 10 December, 2023; v1 submitted 25 August, 2023; originally announced August 2023.

Comments: Preprint. Do not distribute

arXiv:2308.11882 [pdf, other]

The macroscopic finite-difference scheme and modified equations of the general propagation multiple-relaxation-time lattice Boltzmann model

Authors: Ying Chen, Xi Liu, Zhenhua Chai, Baochang Shi

Abstract: In this paper, we first present the general propagation multiple-relaxation-time lattice Boltzmann (GPMRT-LB) model and obtain the corresponding macroscopic finite-difference (GPMFD) scheme on conservative moments. Then based on the Maxwell iteration method, we conduct the analysis on the truncation errors and modified equations (MEs) of the GPMRT-LB model and GPMFD scheme at both diffusive and acoustic scalings. For the nonlinear anisotropic convection-diffusion equation (NACDE) and Navier-Stokes equations (NSEs), we also derive the first- and second-order MEs of the GPMRT-LB model and GPMFD scheme. In particular, for the one-dimensional convection-diffusion equation (CDE) with constant velocity and diffusion coefficient, we can develop a fourth-order GPMRT-LB (F-GPMRT-LB) model and the corresponding fourth-order GPMFD (F-GPMFD) scheme at the diffusive scaling. Finally, two benchmark problems, the Gauss hill problem and Poiseuille flow in two-dimensional space, are used to test the GPMRT-LB model and GPMFD scheme, and it is found that the numerical results are not only in good agreement with the corresponding analytical solutions, but also have a second-order convergence rate in space. Additionally, a numerical study on the one-dimensional CDE also demonstrates that the F-GPMRT-LB model and F-GPMFD scheme can achieve fourth-order accuracy in space, which is consistent with our theoretical analysis.

Submitted 22 August, 2023; originally announced August 2023.

arXiv:2308.05280 [pdf, ps, other]

A general fourth-order mesoscopic multiple-relaxation-time lattice Boltzmann model and equivalent macroscopic finite-difference scheme for two-dimensional diffusion equations

Authors: Ying Chen, Zhenhua Chai, Baochang Shi

Abstract: In this work, we first develop a general mesoscopic multiple-relaxation-time lattice Boltzmann (MRT-LB) model for the two-dimensional diffusion equation with the constant diffusion coefficient and source term, where the D2Q5 (five discrete velocities in two-dimensional space) lattice structure is considered. Then we exactly derive the equivalent macroscopic finite-difference scheme of the MRT-LB model. Additionally, we also propose a proper MRT-LB model for the diffusion equation with a linear source term, and obtain an equivalent macroscopic six-level finite-difference scheme. After that, we conduct the accuracy and stability analysis of the finite-difference scheme and the mesoscopic MRT-LB model. It is found that at the diffusive scaling, both of them can achieve a fourth-order accuracy in space based on the Taylor expansion. The stability analysis also shows that they are both unconditionally stable. Finally, some numerical experiments are conducted, and the numerical results are also consistent with our theoretical analysis.

Submitted 9 August, 2023; originally announced August 2023.

arXiv:2307.11618 [pdf, other]

Divide and Adapt: Active Domain Adaptation via Customized Learning

Authors: Duojun Huang, Jichang Li, Weikai Chen, Junshi Huang, Zhenhua Chai, Guanbin Li

Abstract: Active domain adaptation (ADA) aims to improve the model adaptation performance by incorporating active learning (AL) techniques to label a maximally-informative subset of target samples. Conventional AL methods do not consider the existence of domain shift, and hence, fail to identify the truly valuable samples in the context of domain adaptation. To accommodate active learning and domain adaptation, two naturally different tasks, in a collaborative framework, we advocate that a customized learning strategy for the target data is the key to the success of ADA solutions. We present Divide-and-Adapt (DiaNA), a new ADA framework that partitions the target instances into four categories with stratified transferable properties. With a novel data subdivision protocol based on uncertainty and domainness, DiaNA can accurately recognize the most gainful samples. While sending the informative instances for annotation, DiaNA employs tailored learning strategies for the remaining categories. Furthermore, we propose an informativeness score that unifies the data partitioning criteria. This enables the use of a Gaussian mixture model (GMM) to automatically sample unlabeled data into the proposed four categories. Thanks to the "divide-and-adapt" spirit, DiaNA can handle data with large variations of domain gap. In addition, we show that DiaNA can generalize to different domain adaptation settings, such as unsupervised domain adaptation (UDA), semi-supervised domain adaptation (SSDA), source-free domain adaptation (SFDA), etc.

Submitted 21 July, 2023; originally announced July 2023.

Comments: CVPR2023, Highlight paper

arXiv:2306.17699 [pdf, other]

Exploration and Exploitation of Unlabeled Data for Open-Set Semi-Supervised Learning

Authors: Ganlong Zhao, Guanbin Li, Yipeng Qin, Jinjin Zhang, Zhenhua Chai, Xiaolin Wei, Liang Lin, Yizhou Yu

Abstract: In this paper, we address a complex but practical scenario in semi-supervised learning (SSL) named open-set SSL, where unlabeled data contain both in-distribution (ID) and out-of-distribution (OOD) samples. Unlike previous methods that only consider ID samples to be useful and aim to filter out OOD ones completely during training, we argue that the exploration and exploitation of both ID and OOD samples can benefit SSL. To support our claim, i) we propose a prototype-based clustering and identification algorithm that explores the inherent similarity and difference among samples at feature level and effectively clusters them around several predefined ID and OOD prototypes, thereby enhancing feature learning and facilitating ID/OOD identification; ii) we propose an importance-based sampling method that exploits the difference in importance of each ID and OOD sample to SSL, thereby reducing the sampling bias and improving the training. Our proposed method achieves state-of-the-art results on several challenging benchmarks, and improves upon existing SSL methods even when ID samples are totally absent in unlabeled data.

Submitted 30 June, 2023; originally announced June 2023.

arXiv:2306.13301 [pdf, other]

Deep Omni-supervised Learning for Rib Fracture Detection from Chest Radiology Images

Authors: Zhizhong Chai, Luyang Luo, Huangjing Lin, Pheng-Ann Heng, Hao Chen

Abstract: Deep learning (DL)-based rib fracture detection has shown promise of playing an important role in preventing mortality and improving patient outcome. Normally, developing DL-based object detection models requires a huge amount of bounding box annotation. However, annotating medical data is time-consuming and expertise-demanding, making obtaining a large amount of fine-grained annotations extremely infeasible. This poses a pressing need for developing label-efficient detection models to alleviate radiologists' labeling burden. To tackle this challenge, the literature on object detection has witnessed an increase of weakly-supervised and semi-supervised approaches, yet still lacks a unified framework that leverages various forms of fully-labeled, weakly-labeled, and unlabeled data. In this paper, we present a novel omni-supervised object detection network, ORF-Netv2, to leverage as much available supervision as possible. Specifically, a multi-branch omni-supervised detection head is introduced with each branch trained with a specific type of supervision. A co-training-based dynamic label assignment strategy is then proposed to enable flexible and robust learning from the weakly-labeled and unlabeled data. Extensive evaluation was conducted for the proposed framework with three rib fracture datasets on both chest CT and X-ray. By leveraging all forms of supervision, ORF-Netv2 achieves mAPs of 34.7, 44.7, and 19.4 on the three datasets, respectively, surpassing the baseline detector which uses only box annotations by mAP gains of 3.8, 4.8, and 5.0, respectively. Furthermore, ORF-Netv2 consistently outperforms other competitive label-efficient methods over various scenarios, showing a promising framework for label-efficient fracture detection. The code is available at: https://github.com/zhizhongchai/ORF-Net.

Submitted 19 January, 2024; v1 submitted 23 June, 2023; originally announced June 2023.

Comments: TMI 2024. Zhizhong Chai and Luyang Luo contributed equally. Code is available via: https://github.com/zhizhongchai/ORF-Net/tree/main

arXiv:2306.08216 [pdf, ps, other]

Multiple-distribution-function finite-difference lattice Boltzmann method for incompressible Navier-Stokes equation

Authors: Xinmeng Chen, Zhenhua Chai, Yong Zhao, Baochang Shi

Abstract: In this paper, a multiple-distribution-function finite-difference lattice Boltzmann method (MDF-FDLBM) is proposed for the convection-diffusion system based incompressible Navier-Stokes equations (NSEs). By Chapman-Enskog analysis, the convection-diffusion system based incompressible NSEs can be recovered from MDF-FDLBM. Some quantities, including the velocity gradient, velocity divergence, strain rate tensor, shear stress and vorticity, can be computed locally by the first-order moment of the non-equilibrium distribution function. Through the von Neumann analysis, we conduct the stability analysis for the MDF-FDLBM and incompressible finite-difference lattice Boltzmann method (IFDLBM). It is found that the IFDLBM is more stable than the MDF-FDLBM at small kinematic viscosity, and the MDF-FDLBM is more stable than the IFDLBM at large Courant-Friedrichs-Lewy condition number. Finally, some simulations are conducted to validate the MDF-FDLBM. The results agree well with the analytical solutions and previous results. Through the numerical testing, we find that the MDF-FDLBM has a second-order convergence rate in space and time. The MDF-FDLBM combined with non-uniform grid also works well. Meanwhile, compared with IFDLBM, it can be found that MDF-FDLBM offers higher accuracy and computational efficiency, reducing computation time by more than 36%.

Submitted 13 June, 2023; originally announced June 2023.

arXiv:2306.07603 [pdf, ps, other]

Numerical Simulation of Power-Law Fluid Flow in a Trapezoidal Cavity using the Incompressible Finite-Difference Lattice Boltzmann Method

Authors: Xinmeng Chen, Zhenhua Chai, Yong Zhao, Baochang Shi

Abstract: In this paper, a numerical investigation of power-law fluid flow in the trapezoidal cavity has been conducted by incompressible finite-difference lattice Boltzmann method (IFDLBM). By designing the equilibrium distribution function, the Navier-Stokes equations (NSEs) can be recovered exactly. Through the coordinate transformation method, the body-fitted grid in physical region is transformed into a uniform grid in computational region. The effects of the Reynolds (Re) number, the power-law index $n$ and the vertical angle θ on the flow in the trapezoidal cavity are investigated. According to the numerical results, we come to some conclusions. For low Re number Re=100, it can be found that the behavior of power-law fluid flow becomes more complicated with the increase of $n$. And as vertical angle θ decreases, the flow becomes smooth and the number of vortices decreases. For high Re numbers, the flow development becomes more complex, and the number and strength of vortices increase. If the Reynolds number increases further, the power-law fluid will change from steady flow to periodic flow and then to turbulent flow. For the steady flow, the larger the θ, the more complicated the vortices. And the critical Re number from steady to periodic state decreases with the decrease of power-law index $n$.

Submitted 13 June, 2023; originally announced June 2023.

arXiv:2306.06983 [pdf, other]

A comparative study of two Allen-Cahn models for immiscible $N$-phase flows by using a consistent and conservative lattice Boltzmann method

Authors: Chengjie Zhan, Xi Liu, Zhenhua Chai, Baochang Shi

Abstract: In this work, we conduct a detailed comparison between two second-order conservative Allen-Cahn (AC) models [\emph{Model A}: Zheng \emph{et al.}, Phys. Rev. E 101, 0433202 (2020) and \emph{Model B}: Mirjalili and Mani, (2023)] for the immiscible $N$-phase flows. Mathematically, these two AC equations can be proved to be equivalent under some approximate conditions. However, the effects of these approximations are unclear from the theoretical point of view, and would be considered numerically. To this end, we propose a consistent and conservative lattice Boltzmann method for the AC models for $N$-phase flows, and present some numerical comparisons of accuracy and stability between these two AC models. The results show that both AC models perform well in accuracy, but the \emph{Model B} is more stable for the realistic complex $N$-phase flows, although there is an adjustable parameter in the \emph{Model A}.

Submitted 12 June, 2023; originally announced June 2023.

Comments: 21 pages, 18 figures

arXiv:2305.16616 [pdf, other]

Channel Measurement, Modeling, and Simulation for 6G: A Survey and Tutorial

Authors: Jianhua Zhang, Jiaxin Lin, Pan Tang, Yuxiang Zhang, Huixin Xu, Tianyang Gao, Haiyang Miao, Zeyong Chai, Zhengfu Zhou, Yi Li, Huiwen Gong, Yameng Liu, Zhiqiang Yuan, Lei Tian, Shaoshi Yang, Liang Xia, Guangyi Liu, Ping Zhang

Abstract: The sixth generation (6G) mobile communications have attracted substantial attention in the global research community of information and communication technologies (ICT). 6G systems are expected to support not only extended 5G usage scenarios, but also new usage scenarios, such as integrated sensing and communication (ISAC), integrated artificial intelligence (AI) and communication, and communication and ubiquitous connectivity. To realize this goal, channel characteristics must be comprehensively studied and properly exploited, so as to promote the design, standardization, and optimization of 6G systems. In this paper, we first summarize the requirements and challenges in 6G channel research. Our focus is on channels for five promising technologies enabling 6G, including terahertz (THz), extreme MIMO (E-MIMO), ISAC, reconfigurable intelligent surface (RIS), and space-air-ground integrated network (SAGIN). Then, a survey of the progress of the 6G channel research regarding the above five promising technologies is presented in terms of the latest measurement campaigns, new characteristics, modeling methods, and research prospects. Moreover, a tutorial on the 6G channel simulations is presented. We introduce the BUPTCMG-6G, a 6G link-level channel simulator, developed based on the ITU/3GPP 3D geometry-based stochastic model (GBSM) methodology. The simulator supports the channel simulation of the aforementioned 6G potential technologies. To facilitate the use of the simulator, the tutorial encompasses the design framework, user guidelines, and application examples. This paper offers in-depth, hands-on insights into the best practices of channel measurements, modeling, and simulations for the evaluation of 6G technologies, the development of 6G standards, and the implementation and optimization of 6G systems.

Submitted 28 March, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: 41 pages, 52 figures

arXiv:2305.05374 [pdf, other]

HybridNet: Dual-Branch Fusion of Geometrical and Topological Views for VLSI Congestion Prediction

Authors: Yuxiang Zhao, Zhuomin Chai, Yibo Lin, Runsheng Wang, Ru Huang

Abstract: Accurate early congestion prediction can prevent unpleasant surprises at the routing stage, playing a crucial role in assisting designers to iterate faster in VLSI design cycles. In this paper, we introduce a novel strategy to fully incorporate topological and geometrical features of circuits by making several key designs in our network architecture. To be more specific, we construct two individual graphs (geometry-graph, topology-graph) with distinct edge construction schemes according to their unique properties. We then propose a dual-branch network with different encoder layers in each pathway and aggregate representations with a sophisticated fusion strategy. Our network, named HybridNet, not only provides a simple yet effective way to capture the geometric interactions of cells, but also preserves the original topological relationships in the netlist. Experimental results on the ISPD2015 benchmarks show that we achieve an improvement of 10.9% compared to previous methods.

Submitted 12 June, 2023; v1 submitted 7 May, 2023; originally announced May 2023.

Journal ref: 2023 IEEE International Symposium of EDA

arXiv:2305.03378 [pdf, other]

Towards Effective Collaborative Learning in Long-Tailed Recognition

Authors: Zhengzhuo Xu, Zenghao Chai, Chengyin Xu, Chun Yuan, Haiqin Yang

Abstract: Real-world data usually suffers from severe class imbalance and long-tailed distributions, where minority classes are significantly underrepresented compared to the majority ones. Recent research prefers to utilize multi-expert architectures to mitigate the model uncertainty on the minority, where collaborative learning is employed to aggregate the knowledge of experts, i.e., online distillation. In this paper, we observe that the knowledge transfer between experts is imbalanced in terms of class distribution, which results in limited performance improvement of the minority classes. To address it, we propose a re-weighted distillation loss by comparing two classifiers' predictions, which are supervised by online distillation and label annotations, respectively. We also emphasize that feature-level distillation will significantly improve model performance and increase feature robustness. Finally, we propose an Effective Collaborative Learning (ECL) framework that integrates a contrastive proxy task branch to further improve feature quality. Quantitative and qualitative experiments on four standard datasets demonstrate that ECL achieves state-of-the-art performance and the detailed ablation studies manifest the effectiveness of each component in ECL.

Submitted 5 May, 2023; originally announced May 2023.

arXiv:2304.03994 [pdf, other]

RIDCP: Revitalizing Real Image Dehazing via High-Quality Codebook Priors

Authors: Rui-Qi Wu, Zheng-Peng Duan, Chun-Le Guo, Zhi Chai, Chong-Yi Li

Abstract: Existing dehazing approaches struggle to process real-world hazy images owing to the lack of paired real data and robust priors. In this work, we present a new paradigm for real image dehazing from the perspectives of synthesizing more realistic hazy data and introducing more robust priors into the network. Specifically, (1) instead of adopting the de facto physical scattering model, we rethink the degradation of real hazy images and propose a phenomenological pipeline considering diverse degradation types. (2) We propose a Real Image Dehazing network via high-quality Codebook Priors (RIDCP). Firstly, a VQGAN is pre-trained on a large-scale high-quality dataset to obtain the discrete codebook, encapsulating high-quality priors (HQPs). After replacing the negative effects brought by haze with HQPs, the decoder equipped with a novel normalized feature alignment module can effectively utilize high-quality features and produce clean results. However, although our degradation pipeline drastically mitigates the domain gap between synthetic and real data, it is still intractable to avoid it, which challenges HQPs matching in the wild. Thus, we re-calculate the distance when matching the features to the HQPs by a controllable matching operation, which facilitates finding better counterparts. We provide a recommendation to control the matching based on an explainable solution. Users can also flexibly adjust the enhancement degree as per their preference. Extensive experiments verify the effectiveness of our data synthesis pipeline and the superior performance of RIDCP in real image dehazing.

Submitted 8 April, 2023; originally announced April 2023.

Comments: Accepted by CVPR 2023

arXiv:2303.14341 [pdf, other]

doi 10.1145/3503161.3547826

Towards Accurate Post-Training Quantization for Vision Transformer

Authors: Yifu Ding, Haotong Qin, Qinghua Yan, Zhenhua Chai, Junjie Liu, Xiaolin Wei, Xianglong Liu

Abstract: Vision transformer emerges as a potential architecture for vision tasks. However, the intense computation and non-negligible delay hinder its application in the real world. As a widespread model compression technique, existing post-training quantization methods still cause severe performance drops. We find the main reasons lie in (1) the existing calibration metric is inaccurate in measuring the quantization influence for extremely low-bit representation, and (2) the existing quantization paradigm is unfriendly to the power-law distribution of Softmax. Based on these observations, we propose a novel Accurate Post-training Quantization framework for Vision Transformer, namely APQ-ViT. We first present a unified Bottom-elimination Blockwise Calibration scheme to optimize the calibration metric to perceive the overall quantization disturbance in a blockwise manner and prioritize the crucial quantization errors that influence more on the final output. Then, we design a Matthew-effect Preserving Quantization for Softmax to maintain the power-law character and keep the function of the attention mechanism. Comprehensive experiments on large-scale classification and detection datasets demonstrate that our APQ-ViT surpasses the existing post-training quantization methods by convincing margins, especially in lower bit-width settings (e.g., up to 5.17% improvement on average for classification and 24.43% for detection on W4A4). We also highlight that APQ-ViT enjoys versatility and works well on diverse transformer variants.

Submitted 24 March, 2023; originally announced March 2023.

Comments: 9 pages, 5 figures, accepted by ACM Multimedia 2022

arXiv:2303.11225 [pdf, other]

HiFace: High-Fidelity 3D Face Reconstruction by Learning Static and Dynamic Details

Authors: Zenghao Chai, Tianke Zhang, Tianyu He, Xu Tan, Tadas Baltrušaitis, HsiangTao Wu, Runnan Li, Sheng Zhao, Chun Yuan, Jiang Bian

Abstract: 3D Morphable Models (3DMMs) demonstrate great potential for reconstructing faithful and animatable 3D facial surfaces from a single image. The facial surface is influenced by the coarse shape, as well as the static detail (e.g., person-specific appearance) and dynamic detail (e.g., expression-driven wrinkles). Previous work struggles to decouple the static and dynamic details through image-level supervision, leading to reconstructions that are not realistic. In this paper, we aim at high-fidelity 3D face reconstruction and propose HiFace to explicitly model the static and dynamic details. Specifically, the static detail is modeled as the linear combination of a displacement basis, while the dynamic detail is modeled as the linear interpolation of two displacement maps with polarized expressions. We exploit several loss functions to jointly learn the coarse shape and fine details with both synthetic and real-world datasets, which enable HiFace to reconstruct high-fidelity 3D shapes with animatable details. Extensive quantitative and qualitative experiments demonstrate that HiFace presents state-of-the-art reconstruction quality and faithfully recovers both the static and dynamic details. Our project page can be found at https://project-hiface.github.io.

Submitted 23 August, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

Comments: Accepted to ICCV 2023, camera-ready version; Project page: https://project-hiface.github.io/

arXiv:2302.06845 [pdf, other]

SEAM: Searching Transferable Mixed-Precision Quantization Policy through Large Margin Regularization

Authors: Chen Tang, Kai Ouyang, Zenghao Chai, Yunpeng Bai, Yuan Meng, Zhi Wang, Wenwu Zhu

Abstract: Mixed-precision quantization (MPQ) suffers from the time-consuming process of searching the optimal bit-width allocation (i.e., the policy) for each layer, especially when using large-scale datasets such as ILSVRC-2012. This limits the practicality of MPQ in real-world deployment scenarios. To address this issue, this paper proposes a novel method for efficiently searching for effective MPQ policies using a small proxy dataset instead of the large-scale dataset used for training the model. Deviating from the established norm of employing a consistent dataset for both model training and MPQ policy search stages, our approach, therefore, yields a substantial enhancement in the efficiency of MPQ exploration. Nonetheless, using discrepant datasets poses challenges in searching for a transferable MPQ policy. Driven by the observation that quantization noise of sub-optimal policy exerts a detrimental influence on the discriminability of feature representations -- manifesting as diminished class margins and ambiguous decision boundaries -- our method aims to identify policies that uphold the discriminative nature of feature representations, i.e., intra-class compactness and inter-class separation. This general and dataset-independent property makes us search for the MPQ policy over a rather small-scale proxy dataset and then the policy can be directly used to quantize the model trained on a large-scale dataset. Our method offers several advantages, including high proxy data utilization, no excessive hyper-parameter tuning, and high searching efficiency. We search high-quality MPQ policies with the proxy dataset that has only 4% of the data scale compared to the large-scale target dataset, achieving the same accuracy as searching directly on the latter, improving MPQ searching efficiency by up to 300 times.

Submitted 22 August, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

arXiv:2301.12935 [pdf, other]

ERA-Solver: Error-Robust Adams Solver for Fast Sampling of Diffusion Probabilistic Models

Authors: Shengmeng Li, Luping Liu, Zenghao Chai, Runnan Li, Xu Tan

Abstract: Though denoising diffusion probabilistic models (DDPMs) have achieved remarkable generation results, the low sampling efficiency of DDPMs still limits further applications. Since DDPMs can be formulated as diffusion ordinary differential equations (ODEs), various fast sampling methods can be derived from solving diffusion ODEs. However, we notice that previous sampling methods with fixed analytical form are not robust to the error in the noise estimated from pretrained diffusion models. In this work, we construct an error-robust Adams solver (ERA-Solver), which utilizes the implicit Adams numerical method that consists of a predictor and a corrector. Different from the traditional predictor based on explicit Adams methods, we leverage a Lagrange interpolation function as the predictor, which is further enhanced with an error-robust strategy to adaptively select the Lagrange bases with lower error in the estimated noise. Experiments on Cifar10, LSUN-Church, and LSUN-Bedroom datasets demonstrate that our proposed ERA-Solver achieves 5.14, 9.42, and 9.69 Fréchet Inception Distance (FID) for image generation, with only 10 network evaluations.

Submitted 6 February, 2023; v1 submitted 30 January, 2023; originally announced January 2023.

Comments: 16 pages, 12 figures

arXiv:2212.02015 [pdf, other]

Learning Imbalanced Data with Vision Transformers

Authors: Zhengzhuo Xu, Ruikang Liu, Shuo Yang, Zenghao Chai, Chun Yuan

Abstract: Real-world data tends to be heavily imbalanced and severely skews data-driven deep neural networks, which makes Long-Tailed Recognition (LTR) a massively challenging task. Existing LTR methods seldom train Vision Transformers (ViTs) with Long-Tailed (LT) data, while the off-the-shelf pretrain weight of ViTs always leads to unfair comparisons. In this paper, we systematically investigate the ViTs' performance in LTR and propose LiVT to train ViTs from scratch only with LT data. With the observation that ViTs suffer more severe LTR problems, we conduct Masked Generative Pretraining (MGP) to learn generalized features. With ample and solid evidence, we show that MGP is more robust than supervised manners. In addition, Binary Cross Entropy (BCE) loss, which shows conspicuous performance with ViTs, encounters predicaments in LTR. We further propose the balanced BCE to ameliorate it with strong theoretical groundings. Specifically, we derive the unbiased extension of Sigmoid and compensate extra logit margins to deploy it. Our Bal-BCE contributes to the quick convergence of ViTs in just a few epochs. Extensive experiments demonstrate that with MGP and Bal-BCE, LiVT successfully trains ViTs well without any additional data and outperforms comparable state-of-the-art methods significantly, e.g., our ViT-B achieves 81.0% Top-1 accuracy in iNaturalist 2018 without bells and whistles. Code is available at https://github.com/XuZhengzhuo/LiVT.

Submitted 8 March, 2023; v1 submitted 4 December, 2022; originally announced December 2022.

Comments: Accepted to CVPR 2023, camera-ready version; Code: https://github.com/XuZhengzhuo/LiVT

arXiv:2210.15405 [pdf, ps, other]

doi 10.1063/5.0149182

The optimal displacement of immiscible two-phase fluids in a pore doublet

Authors: Fang Shan, Zhenhua Chai, Baochang Shi, Meng Zhao

Abstract: The displacement of multiphase fluid flow in a pore doublet is a fundamental problem, and is also of importance in understanding the transport mechanisms of multiphase flows in porous media. During the displacement of immiscible two-phase fluids in the pore doublet, the transport process is not only influenced by the capillary and viscous forces, but also affected by the channel geometry. In this paper, we first present a mathematical model to describe the two-phase fluid displacement in the pore doublet where the effects of capillary force, viscous force and the geometric structure are included. Then we derive an analytical solution of the model for the first time, and find that the displacement process is dominated by the capillary number, the viscosity ratio and the radius ratio. Furthermore, we define the optimal displacement as the case in which the wetting fluids in the two daughter channels break through the branches simultaneously (both of them have the same breakthrough time), and also obtain the critical capillary number corresponding to the optimal displacement, which is related to the radius ratio of the two daughter channels and the viscosity ratio of the two immiscible fluids. Finally, it is worth noting that the present analytical results on the displacement in the pore doublet can be used to explain and understand the phenomenon of preferential imbibition or preferential flow in porous media.

Submitted 27 October, 2022; originally announced October 2022.

arXiv:2209.10907 [pdf, other]

DRKF: Distilled Rotated Kernel Fusion for Efficient Rotation Invariant Descriptors in Local Feature Matching

Authors: Ranran Huang, Jiancheng Cai, Chao Li, Zhuoyuan Wu, Xinmin Liu, Zhenhua Chai

Abstract: The performance of local feature descriptors degrades in the presence of large rotation variations. To address this issue, we present an efficient approach to learning rotation invariant descriptors. Specifically, we propose Rotated Kernel Fusion (RKF) which imposes rotations on the convolution kernel to improve the inherent nature of CNN. Since RKF can be processed by the subsequent re-parameterization, no extra computational costs will be introduced in the inference stage. Moreover, we present Multi-oriented Feature Aggregation (MOFA) which aggregates features extracted from multiple rotated versions of the input image and can provide auxiliary knowledge for the training of RKF by leveraging the distillation strategy. We refer to the distilled RKF model as DRKF. Besides the evaluation on a rotation-augmented version of the public dataset HPatches, we also contribute a new dataset named DiverseBEV which is collected during the drone's flight and consists of bird's eye view images with large viewpoint changes and camera rotations. Extensive experiments show that our method can outperform other state-of-the-art techniques when exposed to large rotation variations.

Submitted 5 January, 2024; v1 submitted 22 September, 2022; originally announced September 2022.

Comments: 8 pages, 7 figures

arXiv:2209.01444 [pdf, other]

doi 10.4208/cicp.OA-2022-0294

A diffuse-interface lattice Boltzmann method for the dendritic growth with thermosolutal convection

Authors: Chengjie Zhan, Zhenhua Chai, Baochang Shi, Ping Jiang, Shaoning Geng, Dongke Sun

Abstract: In this work, we proposed a diffuse interface model for the dendritic growth with thermosolutal convection. In this model, the sharp boundary between the fluid and solid dendrite is replaced by a thin but nonzero thickness diffuse interface, which is described by the order parameter governed by the phase-field equation for the dendritic growth. The governing equations for solute and heat transfer are modified such that the previous special treatments for the source term can be avoided. To solve the model for the dendritic growth with thermosolutal convection, we also developed a diffuse-interface multi-relaxation-time lattice Boltzmann (LB) method. In this method, the order parameter in the phase-field equation is combined into the force caused by the fluid-solid interaction, and the treatment of the complex fluid-solid interface can be avoided. In addition, four LB models are developed for the phase-field equation, concentration equation, temperature equation and the Navier-Stokes equations in a unified framework. Finally, to test the present diffuse-interface LB method, we performed some simulations of the dendritic growth, and found that the numerical results are in good agreement with some previous works.

Submitted 3 September, 2022; originally announced September 2022.

Comments: 20 pages, 14 figures

arXiv:2208.06866 [pdf, other]

HyP$^2$ Loss: Beyond Hypersphere Metric Space for Multi-label Image Retrieval

Authors: Chengyin Xu, Zenghao Chai, Zhengzhuo Xu, Chun Yuan, Yanbo Fan, Jue Wang

Abstract: Image retrieval has become an increasingly appealing technique with broad multimedia application prospects, where deep hashing serves as the dominant branch towards low storage and efficient retrieval. In this paper, we carried out in-depth investigations on metric learning in deep hashing for establishing a powerful metric space in multi-label scenarios, where the pair loss suffers from high computational overhead and convergence difficulty, while the proxy loss is theoretically incapable of expressing the profound label dependencies and exhibits conflicts in the constructed hypersphere space. To address these problems, we propose a novel metric learning framework with Hybrid Proxy-Pair Loss (HyP$^2$ Loss) that constructs an expressive metric space with efficient training complexity w.r.t. the whole dataset. The proposed HyP$^2$ Loss focuses on optimizing the hypersphere space by learnable proxies and excavating data-to-data correlations of irrelevant pairs, which integrates the sufficient data correspondence of pair-based methods and the high efficiency of proxy-based methods. Extensive experiments on four standard multi-label benchmarks show that the proposed method outperforms the state-of-the-art, is robust among different hash bits and achieves significant performance gains with a faster, more stable convergence speed. Our code is available at https://github.com/JerryXu0129/HyP2-Loss.

Submitted 14 August, 2022; originally announced August 2022.

Comments: Accepted by ACM International Conference on Multimedia (ACM MM) 2022

arXiv:2208.06857 [pdf, other]

Underwater Ranker: Learn Which Is Better and How to Be Better

Authors: Chunle Guo, Ruiqi Wu, Xin Jin, Linghao Han, Zhi Chai, Weidong Zhang, Chongyi Li

Abstract: In this paper, we present a ranking-based underwater image quality assessment (UIQA) method, abbreviated as URanker. The URanker is built on the efficient conv-attentional image Transformer. In terms of underwater images, we specially devise (1) the histogram prior that embeds the color distribution of an underwater image as histogram token to attend global degradation and (2) the dynamic cross-scale correspondence to model local degradation. The final prediction depends on the class tokens from different scales, which comprehensively considers multi-scale dependencies. With the margin ranking loss, our URanker can accurately rank the order of underwater images of the same scene enhanced by different underwater image enhancement (UIE) algorithms according to their visual quality. To achieve that, we also contribute a dataset, URankerSet, containing sufficient results enhanced by different UIE algorithms and the corresponding perceptual rankings, to train our URanker. Apart from the good performance of URanker, we found that a simple U-shape UIE network can obtain promising performance when it is coupled with our pre-trained URanker as additional supervision. In addition, we also propose a normalization tail that can significantly improve the performance of UIE networks. Extensive experiments demonstrate the state-of-the-art performance of our method. The key designs of our method are discussed. We will release our dataset and code.

Submitted 26 November, 2022; v1 submitted 14 August, 2022; originally announced August 2022.

Comments: 9 pages, 10 figures

arXiv:2208.01040 [pdf, other]

doi 10.1007/s11432-022-3571-8

CircuitNet: An Open-Source Dataset for Machine Learning Applications in Electronic Design Automation (EDA)

Authors: Zhuomin Chai, Yuxiang Zhao, Yibo Lin, Wei Liu, Runsheng Wang, Ru Huang

Abstract: The electronic design automation (EDA) community has been actively exploring machine learning (ML) for very large-scale integrated computer-aided design (VLSI CAD). Many studies explored learning-based techniques for cross-stage prediction tasks in the design flow to achieve faster design convergence. Although building ML models usually requires a large amount of data, most studies can only generate small internal datasets for validation because of the lack of large public datasets. In this essay, we present the first open-source dataset called CircuitNet for ML tasks in VLSI CAD.

Submitted 31 August, 2022; v1 submitted 31 July, 2022; originally announced August 2022.

Journal ref: SCIENCE CHINA Information Sciences 2022

Showing 1–50 of 150 results for author: Chai, Z