TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling

Authors: Jiahao Qiu, Yifu Lu, Yifan Zeng, Jiacheng Guo, Jiayi Geng, Huazheng Wang, Kaixuan Huang, Yue Wu, Mengdi Wang

Abstract: Inference-time alignment enhances the performance of large language models without requiring additional training or fine-tuning but presents challenges due to balancing computational efficiency with high-quality output. Best-of-N (BoN) sampling, as a simple yet powerful approach, generates multiple responses and selects the best one, achieving improved performance but with a high computational cost. We propose TreeBoN, a novel framework that integrates a speculative tree-search strategy into Best-of-N (BoN) Sampling. TreeBoN maintains a set of parent nodes, iteratively branching and pruning low-quality responses, thereby reducing computational overhead while maintaining high output quality. Our approach also leverages token-level rewards from Direct Preference Optimization (DPO) to guide tree expansion and prune low-quality paths. We evaluate TreeBoN using AlpacaFarm, UltraFeedback, GSM8K, HH-RLHF, and TutorEval datasets, demonstrating consistent improvements. Specifically, TreeBoN achieves a 65% win rate at maximum lengths of 192 and 384 tokens, outperforming standard BoN with the same computational cost. Furthermore, TreeBoN achieves around a 60% win rate across longer responses, showcasing its scalability and alignment efficacy.

Submitted 18 October, 2024; originally announced October 2024.

arXiv:2409.17983 [pdf, other]

GRB 240529A: A Tale of Two Shocks

Authors: Tian-Rui Sun, Jin-Jun Geng, Jing-Zhi Yan, You-Dong Hu, Xue-Feng Wu, Alberto J. Castro-Tirado, Chao Yang, Yi-Ding Ping, Chen-Ran Hu, Fan Xu, Hao-Xuan Gao, Ji-An Jiang, Yan-Tian Zhu, Yongquan Xue, Ignacio Pérez-García, Si-Yu Wu, Emilio Fernández-García, María D. Caballero-García, Rubén Sánchez-Ramírez, Sergiy Guziy, Ignacio Olivares, Carlos Jesus Pérez del Pulgar, A. Castellón, Sebastián Castillo, Ding-Rong Xiong , et al. (44 additional authors not shown)

Abstract: Thanks to the rapidly increasing time-domain facilities, we are entering a golden era of research on gamma-ray bursts (GRBs). In this Letter, we report our observations of GRB 240529A with the Burst Optical Observer and Transient Exploring System, the 1.5-meter telescope at Observatorio Sierra Nevada, the 2.5-meter Wide Field Survey Telescope of China, the Large Binocular Telescope, and the Telescopio Nazionale Galileo. The prompt emission of GRB 240529A shows two comparable energetic episodes separated by a quiescence time of roughly 400 s. Combining all available data on the GRB Coordinates Network, we reveal the simultaneous apparent X-ray plateau and optical re-brightening around $10^3-10^4$ s after the burst. Rather than the energy injection from the magnetar as widely invoked for similar GRBs, the multi-wavelength emissions could be better explained as two shocks launched from the central engine separately. The optical peak time and our numerical modeling suggest that the initial bulk Lorentz factor of the later shock is roughly 50, which indicates that the later jet should be accretion-driven and have a higher mass loading than a typical one. The quiescence time between the two prompt emission episodes may be caused by the transition between different accretion states of a central magnetar or black hole, or the fall-back accretion process. A sample of similar bursts with multiple emission episodes in the prompt phase and sufficient follow-up could help to probe the underlying physics of GRB central engines.

Submitted 26 September, 2024; originally announced September 2024.

Comments: Resubmitted to ApJL after addressing the referee's comments; comments are welcome

arXiv:2409.17572 [pdf, ps, other]

Dr. GPT in Campus Counseling: Understanding Higher Education Students' Opinions on LLM-assisted Mental Health Services

Authors: Owen Xingjian Zhang, Shuyao Zhou, Jiayi Geng, Yuhan Liu, Sunny Xun Liu

Abstract: In response to the increasing mental health challenges faced by college students, we sought to understand their perspectives on how AI applications, particularly Large Language Models (LLMs), can be leveraged to enhance their mental well-being. Through pilot interviews with ten diverse students, we explored their opinions on the use of LLMs across five fictional scenarios: General Information Inquiry, Initial Screening, Reshaping Patient-Expert Dynamics, Long-term Care, and Follow-up Care. Our findings revealed that students' acceptance of LLMs varied by scenario, with participants highlighting both potential benefits, such as proactive engagement and personalized follow-up care, and concerns, including limitations in training data and emotional support. These insights inform how AI technology should be designed and implemented to effectively support and enhance students' mental well-being, particularly in scenarios where LLMs can complement traditional methods, while maintaining empathy and respecting individual preferences.

Submitted 26 September, 2024; originally announced September 2024.

Comments: 5 pages

arXiv:2409.12938 [pdf, other]

Hybrid spin-phonon architecture for scalable solid-state quantum nodes

Authors: Ruoming Peng, Xuntao Wu, Yang Wang, Jixing Zhang, Jianpei Geng, Durga Bhaktavatsala Rao Dasari, Andrew N. Cleland, Jörg Wrachtrup

Abstract: Solid-state spin systems hold great promise for quantum information processing and the construction of quantum networks. However, the considerable inhomogeneity of spins in solids poses a significant challenge to the scaling of solid-state quantum systems. A practical protocol to individually control and entangle spins remains elusive. To this end, we propose a hybrid spin-phonon architecture based on spin-embedded SiC optomechanical crystal (OMC) cavities, which integrates photonic and phononic channels allowing for interactions between multiple spins. With a Raman-facilitated process, the OMC cavities support coupling between the spin and the zero-point motion of the OMC cavity mode reaching 0.57 MHz, facilitating phonon preparation and spin Rabi swap processes. Based on this, we develop a spin-phonon interface that achieves a two-qubit controlled-Z gate with a simulated fidelity of $96.80\%$ and efficiently generates highly entangled Dicke states with over $99\%$ fidelity, by engineering the strongly coupled spin-phonon dark state which is robust against loss from excited state relaxation as well as spectral inhomogeneity of the defect centers. This provides a hybrid platform for exploring spin entanglement with potential scalability and full connectivity in addition to an optical link, and offers a pathway to investigate quantum acoustics in solid-state systems.

Submitted 19 September, 2024; originally announced September 2024.

arXiv:2409.09287 [pdf, other]

Panoramic Direct LiDAR-assisted Visual Odometry

Authors: Zikang Yuan, Tianle Xu, Xiaoxiang Wang, Jinni Geng, Xin Yang

Abstract: Enhancing visual odometry by exploiting sparse depth measurements from LiDAR is a promising solution for improving tracking accuracy of an odometry. Most existing works utilize a monocular pinhole camera, yet could suffer from poor robustness due to less available information from limited field-of-view (FOV). This paper proposes a panoramic direct LiDAR-assisted visual odometry, which fully associates the 360-degree FOV LiDAR points with the 360-degree FOV panoramic image data. 360-degree FOV panoramic images can provide more available information, which can compensate for inaccurate pose estimation caused by insufficient texture or motion blur from a single view. In addition to constraints between a specific view at different times, constraints can also be built between different views at the same moment. Experimental results on public datasets demonstrate the benefit of large FOV of our panoramic direct LiDAR-assisted visual odometry to state-of-the-art approaches.

Submitted 13 September, 2024; originally announced September 2024.

Comments: 6 pages, 6 figures

ACM Class: C.5

Journal ref: in submission 2024

arXiv:2409.00327 [pdf, other]

Demo: FedCampus: A Real-world Privacy-preserving Mobile Application for Smart Campus via Federated Learning & Analytics

Authors: Jiaxiang Geng, Beilong Tang, Boyan Zhang, Jiaqi Shao, Bing Luo

Abstract: In this demo, we introduce FedCampus, a privacy-preserving mobile application for smart \underline{campus} with \underline{fed}erated learning (FL) and federated analytics (FA). FedCampus enables cross-platform on-device FL/FA for both iOS and Android, supporting continuously models and algorithms deployment (MLOps). Our app integrates privacy-preserving processed data via differential privacy (DP) from smartwatches, where the processed parameters are used for FL/FA through the FedCampus backend platform. We distributed 100 smartwatches to volunteers at Duke Kunshan University and have successfully completed a series of smart campus tasks featuring capabilities such as sleep tracking, physical activity monitoring, personalized recommendations, and heavy hitters. Our project is opensourced at https://github.com/FedCampus/FedCampus_Flutter. See the FedCampus video at https://youtu.be/k5iu46IjA38.

Submitted 30 August, 2024; originally announced September 2024.

Comments: 2 pages, 3 figures, accepted for publication in ACM Mobihoc 2024

arXiv:2408.11832 [pdf, other]

OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs

Authors: Hasan Iqbal, Yuxia Wang, Minghan Wang, Georgi Georgiev, Jiahui Geng, Iryna Gurevych, Preslav Nakov

Abstract: The increased use of large language models (LLMs) across a variety of real-world applications calls for automatic tools to check the factual accuracy of their outputs, as LLMs often hallucinate. This is difficult as it requires assessing the factuality of free-form open-domain responses. While there has been a lot of research on this topic, different papers use different evaluation benchmarks and measures, which makes them hard to compare and hampers future progress. To mitigate these issues, we developed OpenFactCheck, a unified framework, with three modules: (i) RESPONSEEVAL, which allows users to easily customize an automatic fact-checking system and to assess the factuality of all claims in an input document using that system, (ii) LLMEVAL, which assesses the overall factuality of an LLM, and (iii) CHECKEREVAL, a module to evaluate automatic fact-checking systems. OpenFactCheck is open-sourced (https://github.com/hasaniqbal777/openfactcheck) and publicly released as a Python library (https://pypi.org/project/openfactcheck/) and also as a web service (https://huggingface.co/spaces/hasaniqbal777/OpenFactCheck). A video describing the system is available at https://youtu.be/-i9VKL0HleI.

Submitted 6 August, 2024; originally announced August 2024.

Comments: 10 pages, 4 Figures, 3 Tables, Submitted to EMNLP 2024 System Demonstration. arXiv admin note: substantial text overlap with arXiv:2405.05583

ACM Class: I.2.7

arXiv:2408.05767 [pdf, other]

Reference-free Hallucination Detection for Large Vision-Language Models

Authors: Qing Li, Chenyang Lyu, Jiahui Geng, Derui Zhu, Maxim Panov, Fakhri Karray

Abstract: Large vision-language models (LVLMs) have made significant progress in recent years. While LVLMs exhibit excellent ability in language understanding, question answering, and conversations of visual inputs, they are prone to producing hallucinations. While several methods are proposed to evaluate the hallucinations in LVLMs, most are reference-based and depend on external tools, which complicates their practical application. To assess the viability of alternative methods, it is critical to understand whether the reference-free approaches, which do not rely on any external tools, can efficiently detect hallucinations. Therefore, we initiate an exploratory study to demonstrate the effectiveness of different reference-free solutions in detecting hallucinations in LVLMs. In particular, we conduct an extensive study on three kinds of techniques: uncertainty-based, consistency-based, and supervised uncertainty quantification methods on four representative LVLMs across two different tasks. The empirical results show that the reference-free approaches are capable of effectively detecting non-factual responses in LVLMs, with the supervised uncertainty quantification method outperforming the others, achieving the best performance across different settings.

Submitted 11 August, 2024; originally announced August 2024.

arXiv:2408.04284 [pdf, other]

LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection

Authors: Mervat Abassy, Kareem Elozeiri, Alexander Aziz, Minh Ngoc Ta, Raj Vardhan Tomar, Bimarsha Adhikari, Saad El Dine Ahmed, Yuxia Wang, Osama Mohammed Afzal, Zhuohan Xie, Jonibek Mansurov, Ekaterina Artemova, Vladislav Mikhailov, Rui Xing, Jiahui Geng, Hasan Iqbal, Zain Muhammad Mujahid, Tarek Mahmoud, Akim Tsvigun, Alham Fikri Aji, Artem Shelmanov, Nizar Habash, Iryna Gurevych, Preslav Nakov

Abstract: The ease of access to large language models (LLMs) has enabled a widespread of machine-generated texts, and now it is often hard to tell whether a piece of text was human-written or machine-generated. This raises concerns about potential misuse, particularly within educational and academic domains. Thus, it is important to develop practical systems that can automate the process. Here, we present one such system, LLM-DetectAIve, designed for fine-grained detection. Unlike most previous work on machine-generated text detection, which focused on binary classification, LLM-DetectAIve supports four categories: (i) human-written, (ii) machine-generated, (iii) machine-written, then machine-humanized, and (iv) human-written, then machine-polished. Category (iii) aims to detect attempts to obfuscate the fact that a text was machine-generated, while category (iv) looks for cases where the LLM was used to polish a human-written text, which is typically acceptable in academic writing, but not in education. Our experiments show that LLM-DetectAIve can effectively identify the above four categories, which makes it a potentially useful tool in education, academia, and other domains. LLM-DetectAIve is publicly accessible at https://github.com/mbzuai-nlp/LLM-DetectAIve. The video describing our system is available at https://youtu.be/E8eT_bE7k8c.

Submitted 21 October, 2024; v1 submitted 8 August, 2024; originally announced August 2024.

arXiv:2408.03844 [pdf, ps, other]

Resolvent Estimates in $L^\infty$ for the Stokes Operator in Nonsmooth Domains

Authors: Jun Geng, Zhongwei Shen

Abstract: We establish resolvent estimates in spaces of bounded solenoidal functions for the Stokes operator in a bounded domain $Ω$ in $R^d$ under the assumptions that $Ω$ is $C^1$ for $d\ge 3$ and Lipschitz for $d=2$. As a corollary, it follows that the Stokes operator generates a uniformly bounded analytic semigroup in the spaces of bounded solenoidal functions in $Ω$. The smoothness conditions on $Ω$ are sharp. The case of exterior domains with nonsmooth boundaries is also studied. The key step in the proof involves new estimates which connect the pressure to the velocity in the $L^q$ average, but only on scales above certain level.

Submitted 7 August, 2024; originally announced August 2024.

Comments: 40 pages

arXiv:2407.09876 [pdf, other]

Detection of hidden emissions in two rotating radio transients with high surface magnetic fields

Authors: S. B. Zhang, X. Yang, J. J. Geng, Y. P. Yang, X. F. Wu

Abstract: Rotating Radio Transients (RRATs) are neutron stars emitting sporadic radio pulses. The unique emission of RRATs has been proposed to resemble those of known pulsar types, such as extreme nulling pulsars or pulsars with giant pulses. However, the presence of additional radiation beyond these sporadic pulses remains unclear. Through high-sensitivity observations and extended tracking, we detected the sequential weak emissions in two RRATs with relatively high surface magnetic fields (Bs > 10^13 G): J1846-0257 and J1854+0306. These emissions show peak flux densities of 0.15 and 0.41 mJy, up to 687 and 512 times weaker than our detected RRAT single pulses, respectively. The weak emissions contribute small fractions (~ 16% and 5%) to the total radio pulse energy releases, contrasting significantly with giant-pulse pulsars where normal pulses dominate. Polarization analysis of J1854+0306 suggests that its sporadic RRAT pulses may originate from intermittent enhanced sparking processes due to magnetospheric evolution. Our findings indicate that some RRATs may represent a novel class of pulsars, distinct from any previously known subclass. Further observations of sources with similar rotational properties using high-sensitivity instruments could validate the generality of these hidden emissions.

Submitted 13 July, 2024; originally announced July 2024.

Comments: 10 pages, 1 table, 6 figures

arXiv:2407.05587 [pdf, other]

Flying Calligrapher: Contact-Aware Motion and Force Planning and Control for Aerial Manipulation

Authors: Xiaofeng Guo, Guanqi He, Jiahe Xu, Mohammadreza Mousaei, Junyi Geng, Sebastian Scherer, Guanya Shi

Abstract: Aerial manipulation has gained interest in completing high-altitude tasks that are challenging for human workers, such as contact inspection and defect detection, etc. Previous research has focused on maintaining static contact points or forces. This letter addresses a more general and dynamic task: simultaneously tracking time-varying contact force in the surface normal direction and motion trajectories on tangential surfaces. We propose a pipeline that includes a contact-aware trajectory planner to generate dynamically feasible trajectories, and a hybrid motion-force controller to track such trajectories. We demonstrate the approach in an aerial calligraphy task using a novel sponge pen design as the end-effector, whose stroke width is proportional to the contact force. Additionally, we develop a touchscreen interface for flexible user input. Experiments show our method can effectively draw diverse letters, achieving an IoU of 0.59 and an end-effector position (force) tracking RMSE of 2.9 cm (0.7 N). Website: https://xiaofeng-guo.github.io/flying-calligrapher/

Submitted 7 July, 2024; originally announced July 2024.

Comments: 8 pages, 9 figures, 1 table

arXiv:2407.02803 [pdf, other]

KnobCF: Uncertainty-aware Knob Tuning

Authors: Yu Yan, Junfang Huang, Hongzhi Wang, Jian Geng, Kaixin Zhang, Tao Yu

Abstract: The knob tuning aims to optimize database performance by searching for the most effective knob configuration under a certain workload. Existing works suffer two significant problems. On the one hand, there exist multiple similar even useless evaluations of knob tuning even with the diverse searching methods because of the different sensitivities of knobs on a certain workload. On the other hand, the single evaluation of knob configurations may bring overestimation or underestimation because of the query uncertainty performance. To solve the above problems, we propose a decoupled query uncertainty-aware knob classifier, called KnobCF, to enhance the knob tuning. Our method has three significant contributions: (1) We propose a novel concept of the uncertainty-aware knob configuration estimation to enhance the knob tuning process. (2) We provide an effective few-shot uncertainty knob estimator without extra time consumption in training data collection, which has a high time efficiency in practical tuning tasks. (3) Our method provides a general framework that could be easily deployed in any knob tuning task because we make no changes to the knob tuners and the database management system. Our experiments on four open-source benchmarks demonstrate that our method effectively reduces useless evaluations and improves the tuning results. Especially in TPCC, our method achieves competitive tuning results with only 60% to 70% time consumption compared to the full workload evaluations.

Submitted 3 July, 2024; originally announced July 2024.

arXiv:2407.00943 [pdf, other]

FedEx: Expediting Federated Learning over Heterogeneous Mobile Devices by Overlapping and Participant Selection

Authors: Jiaxiang Geng, Boyu Li, Xiaoqi Qin, Yixuan Li, Liang Li, Yanzhao Hou, Miao Pan

Abstract: Training latency is critical for the success of numerous intrigued applications ignited by federated learning (FL) over heterogeneous mobile devices. By revolutionarily overlapping local gradient transmission with continuous local computing, FL can remarkably reduce its training latency over homogeneous clients, yet encounter severe model staleness, model drifts, memory cost and straggler issues in heterogeneous environments. To unleash the full potential of overlapping, we propose, FedEx, a novel \underline{fed}erated learning approach to \underline{ex}pedite FL training over mobile devices under data, computing and wireless heterogeneity. FedEx redefines the overlapping procedure with staleness ceilings to constrain memory consumption and make overlapping compatible with participation selection (PS) designs. Then, FedEx characterizes the PS utility function by considering the latency reduced by overlapping, and provides a holistic PS solution to address the straggler issue. FedEx also introduces a simple but effective metric to trigger overlapping, in order to avoid model drifts. Experimental results show that compared with its peer designs, FedEx demonstrates substantial reductions in FL training latency over heterogeneous mobile devices with limited memory cost.

Submitted 2 July, 2024; v1 submitted 30 June, 2024; originally announced July 2024.

Comments: 21 pages, 10 figures, Submitted to Sensys2024

arXiv:2406.17055 [pdf, other]

Large Language Models Assume People are More Rational than We Really are

Authors: Ryan Liu, Jiayi Geng, Joshua C. Peterson, Ilia Sucholutsky, Thomas L. Griffiths

Abstract: In order for AI systems to communicate effectively with people, they must understand how we make decisions. However, people's decisions are not always rational, so the implicit internal models of human decision-making in Large Language Models (LLMs) must account for this. Previous empirical evidence seems to suggest that these implicit models are accurate -- LLMs offer believable proxies of human behavior, acting how we expect humans would in everyday interactions. However, by comparing LLM behavior and predictions to a large dataset of human decisions, we find that this is actually not the case: when both simulating and predicting people's choices, a suite of cutting-edge LLMs (GPT-4o & 4-Turbo, Llama-3-8B & 70B, Claude 3 Opus) assume that people are more rational than we really are. Specifically, these models deviate from human behavior and align more closely with a classic model of rational choice -- expected value theory. Interestingly, people also tend to assume that other people are rational when interpreting their behavior. As a consequence, when we compare the inferences that LLMs and people draw from the decisions of others using another psychological dataset, we find that these inferences are highly correlated. Thus, the implicit decision-making models of LLMs appear to be aligned with the human expectation that other people will act rationally, rather than with how people actually act.

Submitted 30 July, 2024; v1 submitted 24 June, 2024; originally announced June 2024.

arXiv:2406.16087 [pdf, other]

Imperative Learning: A Self-supervised Neural-Symbolic Learning Framework for Robot Autonomy

Authors: Chen Wang, Kaiyi Ji, Junyi Geng, Zhongqiang Ren, Taimeng Fu, Fan Yang, Yifan Guo, Haonan He, Xiangyu Chen, Zitong Zhan, Qiwei Du, Shaoshu Su, Bowen Li, Yuheng Qiu, Yi Du, Qihang Li, Yifan Yang, Xiao Lin, Zhipeng Zhao

Abstract: Data-driven methods such as reinforcement and imitation learning have achieved remarkable success in robot autonomy. However, their data-centric nature still hinders them from generalizing well to ever-changing environments. Moreover, collecting large datasets for robotic tasks is often impractical and expensive. To overcome these challenges, we introduce a new self-supervised neural-symbolic (NeSy) computational framework, imperative learning (IL), for robot autonomy, leveraging the generalization abilities of symbolic reasoning. The framework of IL consists of three primary components: a neural module, a reasoning engine, and a memory system. We formulate IL as a special bilevel optimization (BLO), which enables reciprocal learning over the three modules. This overcomes the label-intensive obstacles associated with data-driven approaches and takes advantage of symbolic reasoning concerning logical reasoning, physical principles, geometric analysis, etc. We discuss several optimization techniques for IL and verify their effectiveness in five distinct robot autonomy tasks including path planning, rule induction, optimal control, visual odometry, and multi-robot routing. Through various experiments, we show that IL can significantly enhance robot autonomy capabilities and we anticipate that it will catalyze further research across diverse domains.

Submitted 6 August, 2024; v1 submitted 23 June, 2024; originally announced June 2024.

arXiv:2406.09181 [pdf, other]

A Large-scale Universal Evaluation Benchmark For Face Forgery Detection

Authors: Yijun Bei, Hengrui Lou, Jinsong Geng, Erteng Liu, Lechao Cheng, Jie Song, Mingli Song, Zunlei Feng

Abstract: With the rapid development of AI-generated content (AIGC) technology, the production of realistic fake facial images and videos that deceive human visual perception has become possible. Consequently, various face forgery detection techniques have been proposed to identify such fake facial content. However, evaluating the effectiveness and generalizability of these detection techniques remains a significant challenge. To address this, we have constructed a large-scale evaluation benchmark called DeepFaceGen, aimed at quantitatively assessing the effectiveness of face forgery detection and facilitating the iterative development of forgery detection technology. DeepFaceGen consists of 776,990 real face image/video samples and 773,812 face forgery image/video samples, generated using 34 mainstream face generation techniques. During the construction process, we carefully consider important factors such as content diversity, fairness across ethnicities, and availability of comprehensive labels, in order to ensure the versatility and convenience of DeepFaceGen. Subsequently, DeepFaceGen is employed in this study to evaluate and analyze the performance of 13 mainstream face forgery detection techniques from various perspectives. Through extensive experimental analysis, we derive significant findings and propose potential directions for future research. The code and dataset for DeepFaceGen are available at https://github.com/HengruiLou/DeepFaceGen.

Submitted 13 June, 2024; v1 submitted 13 June, 2024; originally announced June 2024.

Comments: This is a paper about constructing a large-scale universal evaluation benchmark for face forgery detection. The full text is 30 pages.

arXiv:2406.06846 [pdf, other]

Relativistic Hartree-Fock model for axial-symmetric nuclei with quadruple and octupole deformations

Authors: Yong Peng, Jing Geng, Wen Hui Long

Abstract: Background: The initial observation of a negative-parity state in proximity to the ground state in the 1950s marked the advent of extensive research into octupole deformed nuclei. Since then, the physics of octupole deformed nuclei has consistently held a special interest within the field of nuclear physics. In the present era, with the advent of sophisticated radioactive ion beam (RIB) facilities and advanced detectors, coupled with the remarkable capabilities of high-performance computing, extensive and intensive explorations are being conducted from both experimental and theoretical perspectives to elucidate the physics of octupole deformed nuclei. Results: This work establishes the OD-RHF model, which provides a reliable tool for studying octupole nuclei over a fairly wide range. The reliability of the newly developed OD-RHF model is illustrated by taking the octupole nucleus $^{144}$Ba as an example. Furthermore, the octupole deformation effects in $^{144}$Ba are verified by using the RHF Lagrangians PKO$i$ ($i=1,2,3$) and the RMF one DD-ME2. The intrusion of the neutron $1i_{13/2}$ and proton $1h_{11/2}$ components is demonstrated to play an essential role in determining the notable octupole deformation of $^{144}$Ba using PKO$i$ ($i=1,2,3$) and DD-ME2. It is indicated that the Fock terms play an important role in stabilizing the octupole deformation. More specifically, due to the repulsive tensor coupling between the intruder components and the core of $^{144}$Ba, the tensor force component carried by the $π$-PV coupling, which contributes only via the Fock terms, plays an opposing role in the formation of the octupole deformation of $^{144}$Ba.

Submitted 10 June, 2024; originally announced June 2024.

Comments: 20 pages, 3 figures

arXiv:2406.06845 [pdf, other]

Time-dependent Relativistic Hartree-Fock model with spherical symmetry

Authors: Jing Geng, Zhi Heng Wang, Peng Wei Zhao, Yi Fei Niu, Haozhao Liang, Wen Hui Long

Abstract: This work establishes the time-dependent relativistic Hartree-Fock (TD-RHF) model with spherical symmetry for the first time. The time-dependent integro-differential Dirac equations are solved by expanding Dirac spinors on the spherical Dirac Woods-Saxon (DWS) basis. The numerical verification demonstrates the high conservation qualities for both the total binding energy and the particle number, as well as the time-reversal invariance of the system, which ensures the precision and reliability of the newly developed TD-RHF model. Subsequently, the isoscalar giant monopole resonance (ISGMR) mode of $^{208}$Pb is investigated using the RHF Lagrangian PKO1. The constrained energy of the ISGMR calculated by PKO1 is found to be in close agreement with the experimental data, and the strength function is similar to the results given by the relativistic Hartree-Fock plus random phase approximation. Based on the advantage of the TD-RHF model in avoiding complicated calculations of the residual interactions, the ISGMR mode of $^{208}$Pb is calculated by twelve relativistic effective Lagrangians. The results indicate that the value of the incompressibility of nuclear matter $K_\infty$ constrained by relativistic effective Lagrangians is in the range of $237\sim246$ MeV, which is lower than the previous investigations based on the relativistic models.

Submitted 12 June, 2024; v1 submitted 10 June, 2024; originally announced June 2024.

Comments: 9 pages, 5 figures

arXiv:2406.05967 [pdf, other]

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark

Authors: David Romero, Chenyang Lyu, Haryo Akbarianto Wibowo, Teresa Lynn, Injy Hamed, Aditya Nanda Kishore, Aishik Mandal, Alina Dragonetti, Artem Abzaliev, Atnafu Lambebo Tonja, Bontu Fufa Balcha, Chenxi Whitehouse, Christian Salamea, Dan John Velasco, David Ifeoluwa Adelani, David Le Meur, Emilio Villa-Cueva, Fajri Koto, Fauzan Farooqui, Frederico Belcavello, Ganzorig Batnasan, Gisela Vallejo, Grainne Caulfield, Guido Ivetta, Haiyue Song, et al. (50 additional authors not shown)

Abstract: Visual Question Answering (VQA) is an important task in multimodal AI, and it is often used to test the ability of vision-language models to understand and reason on knowledge present in both visual and textual data. However, most of the current VQA models use datasets that are primarily focused on English and a few major world languages, with images that are typically Western-centric. While recent efforts have tried to increase the number of languages covered on VQA datasets, they still lack diversity in low-resource languages. More importantly, although these datasets often extend their linguistic range via translation or some other approaches, they usually keep images the same, resulting in narrow cultural representation. To address these limitations, we construct CVQA, a new Culturally-diverse multilingual Visual Question Answering benchmark, designed to cover a rich set of languages and cultures, where we engage native speakers and cultural experts in the data collection process. As a result, CVQA includes culturally-driven images and questions from across 28 countries on four continents, covering 26 languages with 11 scripts, providing a total of 9k questions. We then benchmark several Multimodal Large Language Models (MLLMs) on CVQA, and show that the dataset is challenging for the current state-of-the-art models. This benchmark can serve as a probing evaluation suite for assessing the cultural capability and bias of multimodal models and hopefully encourage more research efforts toward increasing cultural awareness and linguistic diversity in this field.

Submitted 9 June, 2024; originally announced June 2024.

arXiv:2406.00616 [pdf, other]

EMIT: Micro-Invasive Database Configuration Tuning

Authors: Jian Geng, Hongzhi Wang, Yu Yan

Abstract: The process of database knob tuning has always been a challenging task. Recently, database knob tuning methods have emerged as a promising solution to mitigate these issues. However, these methods still face certain limitations. On one hand, applying knob tuning algorithms to optimize databases in practice either requires frequent updates to the database or necessitates acquiring the database workload and optimizing through workload replay. The former approach involves constant exploration and updating of database configurations, inevitably leading to a decline in database performance during optimization. The latter, on the other hand, requires the acquisition of workload data, which could lead to data leakage issues. Moreover, the hyperparameter configuration space for database knobs is vast, making it challenging for optimizers to converge. These factors significantly hinder the practical implementation of database tuning. To address these concerns, we propose an efficient and micro-invasive knob tuning method. This method relies on workload synthesis on cloned databases to simulate the workload that needs tuning, thus minimizing the intrusion on the database. We also use a configuration replacement strategy to filter configuration candidates that perform well under the synthesized workload to find the best configuration. During the tuning process, we employ a knowledge transfer method to extract a common high-performance space to boost the convergence of the optimizer.

Submitted 2 June, 2024; originally announced June 2024.

arXiv:2405.09120 [pdf]

doi 10.1016/j.epsl.2024.119047

Martian seismic anisotropy underneath Elysium Planitia revealed by direct S wave splitting

Authors: Jing Shi, Cunrui Han, Tao Wang, Chao Qi, Han Chen, Zhihan Yu, Jiaqi Geng, Minghan Yang, Xu Wang, Ling Chen, Hejiu Hui

Abstract: Seismic anisotropy, arising from the crystallographic or lattice-preferred orientation of anisotropic minerals or the shape-preferred orientation of melts or cracks, can establish a critical link between Mars's past evolution and its current state. So far, although seismic anisotropy in Mars has been proposed due to different velocities of vertically and horizontally polarized shear waves in the Martian crust, obtained from crustal converted waves, multiples, and surface waves recorded by the InSight seismometer, the evidence is plausible. Notably, the shear wave splitting, which stands out as a straight indicator of seismic anisotropy, has not been reported using marsquake records. In this study, we employ low-frequency marsquakes detected by the InSight seismometer to reveal shear wave splitting in Mars. We find that the direct S waves of three marsquake recordings (S0173a, S0235b, and S1133c) with high signal-to-noise ratios exhibit the splitting phenomenon. We rule out the possibility of apparent anisotropy through synthetic tests, affirming the presence of seismic anisotropy in Mars. The delay time (about 1.33 s on average) measured from the direct S wave splitting is too large to be solely attributed to the seismic anisotropy in the upper crust (0 - 10 km) beneath the InSight. Thus, seismic anisotropy in the deeper region of Mars is indispensable. Combined with other geophysical evidence near the InSight landing site, the strong seismic anisotropy observed in this study implies the porous crust with aligned cracks being greater than 10 km beneath the InSight and/or the presence of an active mantle plume underneath the Elysium Planitia of Mars.

Submitted 15 May, 2024; originally announced May 2024.

Comments: Manuscript has been submitted to Earth and Planetary Science Letters; 9 figures; 33 pages

Journal ref: Earth and Planetary Science Letters, 2024

arXiv:2405.05583 [pdf, other]

OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs

Authors: Yuxia Wang, Minghan Wang, Hasan Iqbal, Georgi Georgiev, Jiahui Geng, Preslav Nakov

Abstract: The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs. Difficulties lie in assessing the factuality of free-form responses in open domains. Also, different papers use disparate evaluation benchmarks and measurements, which renders them hard to compare and hampers future progress. To mitigate these issues, we propose OpenFactCheck, a unified factuality evaluation framework for LLMs. OpenFactCheck consists of three modules: (i) CUSTCHECKER allows users to easily customize an automatic fact-checker and verify the factual correctness of documents and claims, (ii) LLMEVAL is a unified evaluation framework that assesses an LLM's factuality ability from various perspectives fairly, and (iii) CHECKEREVAL is an extensible solution for gauging the reliability of automatic fact-checkers' verification results using human-annotated datasets. OpenFactCheck is publicly released at https://github.com/yuxiaw/OpenFactCheck.

Submitted 9 May, 2024; originally announced May 2024.

Comments: 19 pages, 8 tables, 8 figures

arXiv:2405.00885 [pdf, other]

WHALE-FL: Wireless and Heterogeneity Aware Latency Efficient Federated Learning over Mobile Devices via Adaptive Subnetwork Scheduling

Authors: Huai-an Su, Jiaxiang Geng, Liang Li, Xiaoqi Qin, Yanzhao Hou, Hao Wang, Xin Fu, Miao Pan

Abstract: As a popular distributed learning paradigm, federated learning (FL) over mobile devices fosters numerous applications, while their practical deployment is hindered by participating devices' computing and communication heterogeneity. Some pioneering research efforts proposed to extract subnetworks from the global model, and assign as large a subnetwork as possible to the device for local training based on its full computing and communications capacity. Although such fixed size subnetwork assignment enables FL training over heterogeneous mobile devices, it is unaware of (i) the dynamic changes of devices' communication and computing conditions and (ii) FL training progress and its dynamic requirements of local training contributions, both of which may cause very long FL training delay. Motivated by those dynamics, in this paper, we develop a wireless and heterogeneity aware latency efficient FL (WHALE-FL) approach to accelerate FL training through adaptive subnetwork scheduling. Instead of sticking to the fixed size subnetwork, WHALE-FL introduces a novel subnetwork selection utility function to capture device and FL training dynamics, and guides the mobile device to adaptively select the subnetwork size for local training based on (a) its computing and communication capacity, (b) its dynamic computing and/or communication conditions, and (c) FL training status and its corresponding requirements for local training contributions. Our evaluation shows that, compared with peer designs, WHALE-FL effectively accelerates FL training without sacrificing learning accuracy.

Submitted 19 August, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

arXiv:2404.16425 [pdf, other]

Soft X-ray prompt emission from a high-redshift gamma-ray burst EP240315a

Authors: Y. Liu, H. Sun, D. Xu, D. S. Svinkin, J. Delaunay, N. R. Tanvir, H. Gao, C. Zhang, Y. Chen, X. -F. Wu, B. Zhang, W. Yuan, J. An, G. Bruni, D. D. Frederiks, G. Ghirlanda, J. -W. Hu, A. Li, C. -K. Li, J. -D. Li, D. B. Malesani, L. Piro, G. Raman, R. Ricci, E. Troja, et al. (170 additional authors not shown)

Abstract: Long gamma-ray bursts (GRBs) are believed to originate from core collapse of massive stars. High-redshift GRBs can probe the star formation and reionization history of the early universe, but their detection remains rare. Here we report the detection of a GRB triggered in the 0.5--4 keV band by the Wide-field X-ray Telescope (WXT) on board the Einstein Probe (EP) mission, designated as EP240315a, whose bright peak was also detected by the Swift Burst Alert Telescope and Konus-Wind through off-line analyses. At a redshift of $z=4.859$, EP240315a showed a much longer and more complicated light curve in the soft X-ray band than in gamma-rays. Benefiting from a large field-of-view ($\sim$3600 deg$^2$) and a high sensitivity, EP-WXT captured the earlier engine activation and extended late engine activity through a continuous detection. With a peak X-ray flux at the faint end of previously known high-$z$ GRBs, the detection of EP240315a demonstrates the great potential for EP to study the early universe via GRBs.

Submitted 25 April, 2024; originally announced April 2024.

Comments: 41 pages, 8 figures, 7 tables

arXiv:2404.04848 [pdf, other]

Task-Aware Encoder Control for Deep Video Compression

Authors: Xingtong Ge, Jixiang Luo, Xinjie Zhang, Tongda Xu, Guo Lu, Dailan He, Jing Geng, Yan Wang, Jun Zhang, Hongwei Qin

Abstract: Prior research on deep video compression (DVC) for machine tasks typically necessitates training a unique codec for each specific task, mandating a dedicated decoder per task. In contrast, traditional video codecs employ a flexible encoder controller, enabling the adaptation of a single codec to different tasks through mechanisms like mode prediction. Drawing inspiration from this, we introduce an innovative encoder controller for deep video compression for machines. This controller features a mode prediction and a Group of Pictures (GoP) selection module. Our approach centralizes control at the encoding stage, allowing for adaptable encoder adjustments across different tasks, such as detection and tracking, while maintaining compatibility with a standard pre-trained DVC decoder. Empirical evidence demonstrates that our method is applicable across multiple tasks with various existing pre-trained DVCs. Moreover, extensive experiments demonstrate that our method outperforms previous DVC by about 25% bitrate for different tasks, with only one pre-trained decoder.

Submitted 20 April, 2024; v1 submitted 7 April, 2024; originally announced April 2024.

Comments: Accepted by CVPR 2024

arXiv:2403.19724 [pdf]

Towards Reverse-Engineering the Brain: Brain-Derived Neuromorphic Computing Approach with Photonic, Electronic, and Ionic Dynamicity in 3D integrated circuits

Authors: S. J. Ben Yoo, Luis El-Srouji, Suman Datta, Shimeng Yu, Jean Anne Incorvia, Alberto Salleo, Volker Sorger, Juejun Hu, Lionel C Kimerling, Kristofer Bouchard, Joy Geng, Rishidev Chaudhuri, Charan Ranganath, Randall O'Reilly

Abstract: The human brain has immense learning capabilities at extreme energy efficiencies and scale that no artificial system has been able to match. For decades, reverse engineering the brain has been one of the top priorities of science and technology research. Despite numerous efforts, conventional electronics-based methods have failed to match the scalability, energy efficiency, and self-supervised learning capabilities of the human brain. On the other hand, very recent progress in the development of new generations of photonic and electronic memristive materials, device technologies, and 3D electronic-photonic integrated circuits (3D EPIC) promises to realize new brain-derived neuromorphic systems with comparable connectivity, density, energy-efficiency, and scalability. When combined with bio-realistic learning algorithms and architectures, it may be possible to realize an 'artificial brain' prototype with general self-learning capabilities. This paper argues the possibility of reverse-engineering the brain through architecting a prototype of a brain-derived neuromorphic computing system consisting of artificial electronic, ionic, photonic materials, devices, and circuits with dynamicity resembling the bio-plausible molecular, neuro/synaptic, neuro-circuit, and multi-structural hierarchical macro-circuits of the brain based on well-tested computational models. We further argue the importance of bio-plausible local learning algorithms applicable to the neuromorphic computing system that capture the flexible and adaptive unsupervised and self-supervised learning mechanisms central to human intelligence. Most importantly, we emphasize that the unique capabilities in brain-derived neuromorphic computing prototype systems will enable us to understand links between specific neuronal and network-level properties with system-level functioning and behavior.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 15 pages, 12 figures

arXiv:2403.19168 [pdf]

Tunable Superconducting Magnetic Levitation with Self-Stability

Authors: Qi Xu, Yi Lin, Yunfei Tan, Jianzhao Geng

Abstract: Magnetic levitation based on the flux pinning nature of type II superconductors has the merit of self-stability, making it appealing for applications such as high speed bearings, maglev trains, space generators, etc. However, such levitation systems physically rely on the superconductor pre-capturing magnetic flux (i.e. field cooling process) before establishing the levitation state, which is nonadjustable afterwards. Moreover, practical type II superconductors in the levitation system inevitably suffer from various sources of energy losses, leading to continuous levitation force decay. These intrinsic drawbacks make superconducting maglev inflexible and impractical for long term operation. Here we propose and demonstrate a new form of superconducting maglev which is tunable and self-stable. The maglev system uses a closed-loop type II superconducting coil to lock flux of a magnet, establishing self-stable levitation between the two objects. A flux pump is used to modulate the total magnetic flux of the coil without breaking its superconductivity, thus flexibly tuning levitation force and height while maintaining self-stability. For the first time, we experimentally demonstrate a self-stable type II superconducting maglev system which is able to: counteract long term levitation force decay, adjust levitation force and equilibrium position, and establish levitation under zero field cooling condition. These breakthroughs may bridge the gap between demonstrations and practical applications of type II superconducting maglevs.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 15 pages, 5 figures

arXiv:2403.08551 [pdf, other]

GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting

Authors: Xinjie Zhang, Xingtong Ge, Tongda Xu, Dailan He, Yan Wang, Hongwei Qin, Guo Lu, Jing Geng, Jun Zhang

Abstract: Implicit neural representations (INRs) recently achieved great success in image representation and compression, offering high visual quality and fast rendering speeds with 10-1000 FPS, assuming sufficient GPU resources are available. However, this requirement often hinders their use on low-end devices with limited memory. In response, we propose a groundbreaking paradigm of image representation and compression by 2D Gaussian Splatting, named GaussianImage. We first introduce 2D Gaussian to represent the image, where each Gaussian has 8 parameters including position, covariance and color. Subsequently, we unveil a novel rendering algorithm based on accumulated summation. Remarkably, our method with a minimum of 3$\times$ lower GPU memory usage and 5$\times$ faster fitting time not only rivals INRs (e.g., WIRE, I-NGP) in representation performance, but also delivers a faster rendering speed of 1500-2000 FPS regardless of parameter size. Furthermore, we integrate an existing vector quantization technique to build an image codec. Experimental results demonstrate that our codec attains rate-distortion performance comparable to compression-based INRs such as COIN and COIN++, while facilitating decoding speeds of approximately 2000 FPS. Additionally, preliminary proof of concept shows that our codec surpasses COIN and COIN++ in performance when using partial bits-back coding. Code is available at https://github.com/Xinjie-Q/GaussianImage.

Submitted 9 July, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

Comments: Accepted by ECCV 2024. Project Page: https://xingtongge.github.io/GaussianImage-page/ Code: https://github.com/Xinjie-Q/GaussianImage

arXiv:2403.07672 [pdf, ps, other]

Quantitative estimates in almost periodic homogenization of parabolic systems

Authors: Jun Geng, Bojing Shi

Abstract: We consider a family of second-order parabolic operators $\partial_t+\mathcal{L}_\varepsilon$ in divergence form with rapidly oscillating, time-dependent and almost-periodic coefficients. We establish uniform interior and boundary Hölder and Lipschitz estimates as well as convergence rate. The estimates of fundamental solution and Green's function are also established. In contrast to the periodic case, the main difficulty is that the corrector equation $ (\partial_s+\mathcal{L}_1)(χ^β_{j})=-\mathcal{L}_1(P^β_j) $ in $\mathbb{R}^{d+1}$ may not be solvable in the almost periodic setting for linear functions $P(y)$ and $\partial_t χ_S$ may not be in $B^2(\mathbb{R}^{d+1})$. Our results are new even in the case of time-independent coefficients.

Submitted 12 March, 2024; originally announced March 2024.

Comments: 61 pages

arXiv:2403.05768 [pdf, other]

Deep Contrastive Multi-view Clustering under Semantic Feature Guidance

Authors: Siwen Liu, Jinyan Liu, Hanning Yuan, Qi Li, Jing Geng, Ziqiang Yuan, Huaxu Han

Abstract: Contrastive learning has achieved promising performance in the field of multi-view clustering recently. However, the positive and negative sample construction mechanisms ignoring semantic consistency lead to false negative pairs, limiting the performance of existing algorithms from further improvement. To solve this problem, we propose a multi-view clustering framework named Deep Contrastive Multi-view Clustering under Semantic feature guidance (DCMCS) to alleviate the influence of false negative pairs. Specifically, view-specific features are firstly extracted from raw features and fused to obtain fusion view features according to view importance. To mitigate the interference of view-private information, specific view and fusion view semantic features are learned by cluster-level contrastive learning and concatenated to measure the semantic similarity of instances. By minimizing instance-level contrastive loss weighted by semantic similarity, DCMCS adaptively weakens contrastive learning between false negative pairs. Experimental results on several public datasets demonstrate the proposed framework outperforms the state-of-the-art methods.

Submitted 8 March, 2024; originally announced March 2024.

arXiv:2403.03627 [pdf, other]

Multimodal Large Language Models to Support Real-World Fact-Checking

Authors: Jiahui Geng, Yova Kementchedjhieva, Preslav Nakov, Iryna Gurevych

Abstract: Multimodal large language models (MLLMs) carry the potential to support humans in processing vast amounts of information. While MLLMs are already being used as a fact-checking tool, their abilities and limitations in this regard are understudied. Here we aim to bridge this gap. In particular, we propose a framework for systematically assessing the capacity of current multimodal models to facilitate real-world fact-checking. Our methodology is evidence-free, leveraging only these models' intrinsic knowledge and reasoning capabilities. By designing prompts that extract models' predictions, explanations, and confidence levels, we delve into research questions concerning model accuracy, robustness, and reasons for failure. We empirically find that (1) GPT-4V exhibits superior performance in identifying malicious and misleading multimodal claims, with the ability to explain the unreasonable aspects and underlying motives, and (2) existing open-source models exhibit strong biases and are highly sensitive to the prompt. Our study offers insights into combating false multimodal information and building secure, trustworthy multimodal models. To the best of our knowledge, we are the first to evaluate MLLMs for real-world fact-checking.

Submitted 26 April, 2024; v1 submitted 6 March, 2024; originally announced March 2024.

arXiv:2402.19132 [pdf, ps, other]

Weighted least $\ell_p$ approximation on compact Riemannian manifolds

Authors: Jiansong Li, Yun Ling, Jiaxin Geng, Heping Wang

Abstract: Given a sequence of Marcinkiewicz-Zygmund inequalities in $L_2$ on a compact space, Gröchenig in \cite{G} discussed weighted least squares approximation and least squares quadrature. Inspired by this work, for all $1\le p\le\infty$, we develop weighted least $\ell_p$ approximation induced by a sequence of Marcinkiewicz-Zygmund inequalities in $L_p$ on a compact smooth Riemannian manifold $\Bbb M$ with normalized Riemannian measure (typical examples are the torus and the sphere). In this paper we derive corresponding approximation theorems with the error measured in $L_q,\,1\le q\le\infty$, and least quadrature errors for both Sobolev spaces $H_p^r(\Bbb M), \, r>d/p$ generated by eigenfunctions associated with the Laplace-Beltrami operator and Besov spaces $B_{p,τ}^r(\Bbb M),\, 0<τ\le \infty, r>d/p $ defined by best polynomial approximation. Finally, we discuss the optimality of the obtained results by giving sharp estimates of sampling numbers and optimal quadrature errors for the aforementioned spaces.

Submitted 29 February, 2024; originally announced February 2024.

Comments: 23 pages

MSC Class: 41A17; 41A55; 41A63; 65D15; 65D30; 65D32

arXiv:2402.11111 [pdf, other]

Language Models as Science Tutors

Authors: Alexis Chevalier, Jiayi Geng, Alexander Wettig, Howard Chen, Sebastian Mizera, Toni Annala, Max Jameson Aragon, Arturo Rodríguez Fanlo, Simon Frieder, Simon Machado, Akshara Prabhakar, Ellie Thieu, Jiachen T. Wang, Zirui Wang, Xindi Wu, Mengzhou Xia, Wenhan Xia, Jiatong Yu, Jun-Jie Zhu, Zhiyong Jason Ren, Sanjeev Arora, Danqi Chen

Abstract: NLP has recently made exciting progress toward training language models (LMs) with strong scientific problem-solving skills. However, model development has not focused on real-life use-cases of LMs for science, including applications in education that require processing long scientific documents. To address this, we introduce TutorEval and TutorChat. TutorEval is a diverse question-answering benchmark consisting of questions about long chapters from STEM textbooks, written by experts. TutorEval helps measure real-life usability of LMs as scientific assistants, and it is the first benchmark combining long contexts, free-form generation, and multi-disciplinary scientific knowledge. Moreover, we show that fine-tuning base models with existing dialogue datasets leads to poor performance on TutorEval. Therefore, we create TutorChat, a dataset of 80,000 long synthetic dialogues about textbooks. We use TutorChat to fine-tune Llemma models with 7B and 34B parameters. These LM tutors specialized in math have a 32K-token context window, and they excel at TutorEval while performing strongly on GSM8K and MATH. Our datasets build on open-source materials, and we release our models, data, and evaluations.

Submitted 21 July, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

Comments: 8 pages without bibliography and appendix, 26 pages total

arXiv:2402.10097 [pdf, other]

Adaptive Federated Learning in Heterogeneous Wireless Networks with Independent Sampling

Authors: Jiaxiang Geng, Yanzhao Hou, Xiaofeng Tao, Juncheng Wang, Bing Luo

Abstract: Federated Learning (FL) algorithms commonly sample a random subset of clients to address the straggler issue and improve communication efficiency. While recent works have proposed various client sampling methods, they have limitations in joint system and data heterogeneity design, which may not align with practical heterogeneous wireless networks. In this work, we advocate a new independent client sampling strategy to minimize the wall-clock training time of FL, while considering data heterogeneity and system heterogeneity in both communication and computation. We first derive a new convergence bound for non-convex loss functions with independent client sampling and then propose an adaptive bandwidth allocation scheme. Furthermore, we propose an efficient independent client sampling algorithm based on the upper bounds on the convergence rounds and the expected per-round training time, to minimize the wall-clock time of FL, while considering both the data and system heterogeneity. Experimental results under practical wireless network settings with a real-world prototype demonstrate that the proposed independent sampling scheme substantially outperforms the current best sampling schemes under various training models and datasets.

Submitted 13 May, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: 6 pages, 5 figures, accepted for publication in IEEE International Conference on Communications (ICC)

arXiv:2402.09527 [pdf, other]

Design and Implementation of a Scalable Financial Exchange in the Public Cloud

Authors: Muhammad Haseeb, Jinkun Geng, Ulysses Butler, Xiyu Hao, Daniel Duclos-Cavalcanti, Anirudh Sivaraman, Srinivas Narayana

Abstract: Financial exchanges are migrating to the cloud, but the best-effort nature of the public cloud is at odds with the stringent latency requirements of exchanges. We present Jasper, a system for meeting the networking requirements of financial exchanges on the public cloud. Jasper uses an overlay tree to scalably multicast market data from an exchange to ~1000 participants with low latency (250 microseconds) and a 1-microsecond difference in data reception time between any two participants. Jasper reuses the same tree for scalable inbound communication (participants to exchange), augmenting it with order pacing and a new priority queue, Limit Order Queue (LOQ), to efficiently handle bursts of market orders. Jasper achieves better scalability and 50% lower latency than the AWS multicast service. During bursty market activity, LOQ nearly doubles the order processing rate.

Submitted 30 September, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

arXiv:2402.07375 [pdf, other]

doi 10.2514/6.2024-2878

A Unified MPC Strategy for a Tilt-rotor VTOL UAV Towards Seamless Mode Transitioning

Authors: Qizhao Chen, Ziqi Hu, Junyi Geng, Dongwei Bai, Mohammad Mousaei, Sebastian Scherer

Abstract: Capabilities of long-range flight and vertical take-off and landing (VTOL) are essential for Urban Air Mobility (UAM). Tiltrotor VTOLs have the advantage of balancing control simplicity and system complexity due to their redundant control authority. Prior work on controlling these aircraft either requires separate controllers and switching modes for different vehicle configurations or performs the control allocation on separate actuator sets, which cannot fully use the potential of the redundancy of tiltrotor. This paper introduces a unified MPC-based control strategy for a customized tiltrotor VTOL Unmanned Aerial Vehicle (UAV), which does not require mode-switching and can perform the control allocation in a consistent way. The incorporation of four independently controllable rotors in VTOL design offers an extra level of redundancy, allowing the VTOL to accommodate actuator failures. The result shows that our approach outperforms PID controllers while maintaining unified control. It allows the VTOL to perform smooth acceleration/deceleration, and precise coordinated turns. In addition, the independently controlled tilts enable the vehicle to handle actuator failures, ensuring that the aircraft remains operational even in the event of a servo or motor malfunction.

Submitted 11 February, 2024; originally announced February 2024.

Comments: In proceedings of the 2024 AIAA SciTech Forum, Session: Guidance, Navigation, and Control GNC-49

Journal ref: AIAA SCITECH 2024 Forum, p. 2878. January 2024

arXiv:2402.02360 [pdf, other]

On the Broadening of the Pulse Width of FRB 20121102A due to Propagation and Instrumental Effects

Authors: Jia-Peng Wei, Yong-Feng Huang, Lang Cui, Xiang Liu, Jin-Jun Geng, Xue-Feng Wu

Abstract: The pulse widths of fast radio bursts are always broadened due to the scattering of the plasma medium through which the electromagnetic wave passes. The recorded pulse width will be further affected by the radio telescopes since the sampling time and the bandwidth cannot be infinitely small. In this study, we focus on the pulse widths of the 3287 bursts detected from FRB 20121102A as of October 2023. Various effects such as the scattering broadening, the redshift broadening and the instrumental broadening are examined. It is found that the instrumental broadening only contributes a fraction of $10^{-3}$--$10^{-1}$ to the observed pulse width. The scattering broadening is even smaller, which constitutes a tiny fraction of $10^{-5}$--$10^{-2}$ in the observed pulse width. After correcting for these broadenings, the intrinsic pulse width is derived for each burst. The maximum and minimum pulse widths at different frequencies are highlighted. Interestingly, both the mean value and the dispersion range of intrinsic pulse width are found to be inversely proportional to the square of the central frequency. The intrinsic widths of most bursts are in a narrow range of 1--10 ms, which leads to a quasi-linear correlation between the fluence and the peak flux.

Submitted 4 February, 2024; originally announced February 2024.

arXiv:2401.16199 [pdf, ps, other]

Optimal quadrature errors and sampling numbers for Sobolev spaces with logarithmic perturbation on spheres

Authors: Jiaxin Geng, Yun Ling, Jiansong Li, Heping Wang

Abstract: In this paper, we study optimal quadrature errors, approximation numbers, and sampling numbers in $L_2(\Bbb S^d)$ for Sobolev spaces ${\rm H}^{α,β}(\Bbb S^d)$ with logarithmic perturbation on the unit sphere $\Bbb S^d$ in $\Bbb R^{d+1}$. First we obtain strong equivalences of the approximation numbers for ${\rm H}^{α,β}(\Bbb S^d)$ with $α>0$, which gives a clue to Open problem 3 as posed by Krieg and Vybíral in \cite{KV}. Second, for the optimal quadrature errors for ${\rm H}^{α,β}(\Bbb S^d)$, we use the "fooling" function technique to get lower bounds in the case $α>d/2$, and apply Hilbert space structure and Vybíral's theorem about Schur product theory to obtain lower bounds in the case $α=d/2,\,β>1/2$ of small smoothness, which confirms the conjecture as posed by Grabner and Stepanyuk in \cite{GS} and solves Open problem 2 in \cite{KV}. Finally, we employ the weighted least squares operators and the least squares quadrature rules to obtain approximation theorems and quadrature errors for ${\rm H}^{α,β}(\Bbb S^d)$ with $α>d/2$ or $α=d/2,\,β>1/2$, which are order optimal.

Submitted 29 January, 2024; originally announced January 2024.

Comments: 24 pages

MSC Class: 41A63; 65C05; 65D15; 65Y20

arXiv:2401.03222 [pdf, ps, other]

Resolvent Estimates for the Stokes Operator in Bounded and Exterior $C^1$ Domains

Authors: Jun Geng, Zhongwei Shen

Abstract: We establish resolvent estimates in $L^q$ spaces for the Stokes operator in a bounded $C^1$ domain $Ω$ in $\mathbb{R}^d$. As a corollary, it follows that the Stokes operator generates a bounded analytic semigroup in $L^q(Ω; \mathbb{C}^d)$ for any $1< q< \infty$ and $d\ge 2$. The case of an exterior $C^1$ domain is also studied.

Submitted 1 August, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

Comments: Minor corrections. To appear in Math. Ann

MSC Class: 35Q30

arXiv:2311.09000 [pdf, other]

Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers

Authors: Yuxia Wang, Revanth Gangi Reddy, Zain Muhammad Mujahid, Arnav Arora, Aleksandr Rubashevskii, Jiahui Geng, Osama Mohammed Afzal, Liangming Pan, Nadav Borenstein, Aditya Pillai, Isabelle Augenstein, Iryna Gurevych, Preslav Nakov

Abstract: The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs. In this work, we present a holistic end-to-end solution for annotating the factuality of LLM-generated responses, which encompasses a multi-stage annotation scheme designed to yield detailed labels concerning the verifiability and factual inconsistencies found in LLM outputs. We further construct an open-domain document-level factuality benchmark in three-level granularity: claim, sentence and document, aiming to facilitate the evaluation of automatic fact-checking systems. Preliminary experiments show that FacTool, FactScore and Perplexity.ai are struggling to identify false claims, with the best F1=0.63 by this annotation solution based on GPT-4. Annotation tool, benchmark and code are available at https://github.com/yuxiaw/Factcheck-GPT.

Submitted 16 April, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

Comments: 30 pages, 13 figures

arXiv:2311.08298 [pdf, other]

A Survey of Confidence Estimation and Calibration in Large Language Models

Authors: Jiahui Geng, Fengyu Cai, Yuxia Wang, Heinz Koeppl, Preslav Nakov, Iryna Gurevych

Abstract: Large language models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks in various domains. Despite their impressive performance, they can be unreliable due to factual errors in their generations. Assessing their confidence and calibrating them across different tasks can help mitigate risks and enable LLMs to produce better generations. There has been a lot of recent research aiming to address this, but there has been no comprehensive overview to organize it and outline the main lessons learned. The present survey aims to bridge this gap. In particular, we outline the challenges and we summarize recent technical advancements for LLM confidence estimation and calibration. We further discuss their applications and suggest promising directions for future work.

Submitted 25 March, 2024; v1 submitted 14 November, 2023; originally announced November 2023.

Comments: 16 pages, 1 page, 1 table

arXiv:2311.06451 [pdf, other]

Probing Thermal Electrons in GRB Afterglows

Authors: Hao-Xuan Gao, Jin-Jun Geng, Tian-Rui Sun, Liang Li, Yong-Feng Huang, Xue-Feng Wu

Abstract: Particle-in-cell simulations have unveiled that shock-accelerated electrons do not follow a pure power-law distribution, but have an additional low-energy "thermal" part, which owns a considerable portion of the total energy of electrons. Investigating the effects of these thermal electrons on gamma-ray burst (GRB) afterglows may provide valuable insights into the particle acceleration mechanisms. We solve the continuity equation of electrons in the energy space, from which multi-wavelength afterglows are derived by incorporating processes including synchrotron radiation, synchrotron self-absorption, synchrotron self-Compton scattering, and gamma-gamma annihilation. First, there is an underlying positive correlation between temporal and spectral indices due to the cooling of electrons. Moreover, thermal electrons would result in the simultaneous non-monotonic variation in both spectral and temporal indices at multi-wavelength, which could be individually recorded by the 2.5-meter Wide Field Survey Telescope and Vera Rubin Observatory Legacy Survey of Space and Time (LSST). The thermal electrons could also be diagnosed from afterglow spectra by synergy observation in the optical (with LSST) and X-ray bands (with the Microchannel X-ray Telescope on board the Space Variable Objects Monitor). Finally, we use Monte Carlo simulations to obtain the distribution of peak flux ratio ($R_{\rm X}$) between soft and hard X-rays, and of the time delay ($Δt$) between peak times of soft X-ray and optical light curves. The thermal electrons significantly raise the upper limits of both $R_{\rm X}$ and $Δt$. Thus the distribution of GRB afterglows with thermal electrons is more dispersive in the $R_{\rm X} - Δt$ plane.

Submitted 10 June, 2024; v1 submitted 10 November, 2023; originally announced November 2023.

Comments: 19 pages, 18 figures, accepted by the Astrophysical Journal

arXiv:2310.19287 [pdf]

Enhancing Scalability and Reliability in Semi-Decentralized Federated Learning With Blockchain: Trust Penalization and Asynchronous Functionality

Authors: Ajay Kumar Shrestha, Faijan Ahamad Khan, Mohammed Afaan Shaikh, Amir Jaberzadeh, Jason Geng

Abstract: The paper presents an innovative approach to address the challenges of scalability and reliability in Distributed Federated Learning by leveraging the integration of blockchain technology. The paper focuses on enhancing the trustworthiness of participating nodes through a trust penalization mechanism while also enabling asynchronous functionality for efficient and robust model updates. By combining Semi-Decentralized Federated Learning with Blockchain (SDFL-B), the proposed system aims to create a fair, secure and transparent environment for collaborative machine learning without compromising data privacy. The research presents a comprehensive system architecture, methodologies, experimental results, and discussions that demonstrate the advantages of this novel approach in fostering scalable and reliable SDFL-B systems.

Submitted 30 October, 2023; originally announced October 2023.

Comments: To appear in 2023 IEEE Ubiquitous Computing, Electronics & Mobile Communication Conference (IEEE UEMCON)

arXiv:2310.08233 [pdf, other]

The Impact of Time Step Frequency on the Realism of Robotic Manipulation Simulation for Objects of Different Scales

Authors: Minh Q. Ta, Holly Dinkel, Hameed Abdul-Rashid, Yangfei Dai, Jessica Myers, Tan Chen, Junyi Geng, Timothy Bretl

Abstract: This work evaluates the impact of time step frequency and component scale on robotic manipulation simulation accuracy. Increasing the time step frequency for small-scale objects is shown to improve simulation accuracy. This simulation, demonstrating pre-assembly part picking for two object geometries, serves as a starting point for discussing how to improve Sim2Real transfer in robotic assembly processes.

Submitted 12 October, 2023; originally announced October 2023.

Comments: 3 pages, 3 figures, Best Poster Finalist at the 2023 Robotics and AI in Future Factory Workshop at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). Video presentation [https://www.youtube.com/watch?v=JOXrBpMmI0A]. Robotics and AI in Future Factory workshop [https://sites.google.com/view/robot-ai-future-factory/]

arXiv:2310.00908 [pdf, other]

A bright burst from FRB 20200120E in a globular cluster of the nearby galaxy M81

Authors: S. B. Zhang, J. S. Wang, X. Yang, Y. Li, J. J. Geng, Z. F. Tang, C. M. Chang, J. T. Luo, X. C. Wang, X. F. Wu, Z. G. Dai, B. Zhang

Abstract: Fast radio bursts (FRBs) are immensely energetic millisecond-duration radio pulses. Observations indicate that nearby FRBs can be produced by old stellar populations, as suggested by the localization of the repeating source FRB 20200120E in a globular cluster of M81. Nevertheless, the burst energies of FRB 20200120E are significantly smaller than those of other cosmological FRBs, even falling below the energy of the Galactic event FRB 20200428. Here, we report the detection of a bright burst from FRB 20200120E in 1.1 -- 1.7 GHz, with a fluence of about 30 Jy ms, which is more than 42 times larger than the previously detected bursts near 1.4 GHz frequency. It reaches one-third of the energy of the weakest burst from FRB 20121102A and is detectable at a distance exceeding 200 Mpc. Our finding bridges the gap between nearby and cosmological FRBs and indicates that FRBs hosted in globular clusters can be bright enough to be observable at cosmological distances.

Submitted 31 July, 2024; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: 23 pages, 4 figures, 1 table, accepted at Nature Communications

arXiv:2310.00142 [pdf, other]

Aerial Interaction with Tactile Sensing

Authors: Xiaofeng Guo, Guanqi He, Mohammadreza Mousaei, Junyi Geng, Guanya Shi, Sebastian Scherer

Abstract: While autonomous Uncrewed Aerial Vehicles (UAVs) have grown rapidly, most applications only focus on passive visual tasks. Aerial interaction aims to execute tasks involving physical interactions, which offers a way to assist humans in high-risk, high-altitude operations, thereby reducing cost, time, and potential hazards. The coupled dynamics between the aerial vehicle and manipulator, however, pose challenges for precision control. Previous research has typically employed either position control, which often fails to meet mission accuracy, or force control using expensive, heavy, and cumbersome force/torque sensors that also lack local semantic information. Conversely, tactile sensors, being both cost-effective and lightweight, are capable of sensing contact information including force distribution, as well as recognizing local textures. Existing work on tactile sensing mainly focuses on tabletop manipulation tasks within a quasi-static process. In this paper, we pioneer the use of vision-based tactile sensors on a fully-actuated UAV to improve the accuracy of the more dynamic aerial manipulation tasks. We introduce a pipeline utilizing tactile feedback for real-time force tracking via a hybrid motion-force controller and a method for wall texture detection during aerial interactions. Our experiments demonstrate that our system can effectively replace or complement traditional force/torque sensors, improving flight performance by approximately 16% in position tracking error when using the fused force estimate compared to relying on a single sensor. Our tactile sensor achieves 93.4% accuracy in real-time texture recognition and 100% post-contact. To the best of our knowledge, this is the first work to incorporate a vision-based tactile sensor into aerial interaction tasks.

Submitted 29 September, 2023; originally announced October 2023.

Comments: 7 pages, 5 figures

arXiv:2309.13643 [pdf, other]

REWAFL: Residual Energy and Wireless Aware Participant Selection for Efficient Federated Learning over Mobile Devices

Authors: Y. Li, X. Qin, J. Geng, R. Chen, Y. Hou, Y. Gong, M. Pan, P. Zhang

Abstract: Participant selection (PS) helps to accelerate federated learning (FL) convergence, which is essential for the practical deployment of FL over mobile devices. However, most existing PS approaches focus on improving training accuracy and efficiency rather than on the residual energy of mobile devices, which fundamentally determines whether the selected devices can participate. Meanwhile, the impacts of mobile devices' heterogeneous wireless transmission rates on PS and FL training efficiency are largely ignored. Moreover, PS causes the staleness issue. Prior research exploits isolated functions to force long-neglected devices to participate, which is decoupled from original PS designs. In this paper, we propose a residual energy and wireless aware PS design for efficient FL training over mobile devices (REWAFL). REWAFL introduces a novel PS utility function that jointly considers global FL training utilities and local energy utility, which integrates energy consumption and residual battery energy of candidate mobile devices. Under the proposed PS utility function framework, REWAFL further presents a residual energy and wireless aware local computing policy. Besides, REWAFL buries the staleness solution into its utility function and local computing policy. The experimental results show that REWAFL is effective in improving training accuracy and efficiency, while avoiding "flat battery" of mobile devices.

Submitted 24 September, 2023; originally announced September 2023.

arXiv:2309.13035 [pdf, other]

PyPose v0.6: The Imperative Programming Interface for Robotics

Authors: Zitong Zhan, Xiangfu Li, Qihang Li, Haonan He, Abhinav Pandey, Haitao Xiao, Yangmengfei Xu, Xiangyu Chen, Kuan Xu, Kun Cao, Zhipeng Zhao, Zihan Wang, Huan Xu, Zihang Fang, Yutian Chen, Wentao Wang, Xu Fang, Yi Du, Tianhao Wu, Xiao Lin, Yuheng Qiu, Fan Yang, Jingnan Shi, Shaoshu Su, Yiren Lu, et al. (11 additional authors not shown)

Abstract: PyPose is an open-source library for robot learning. It combines a learning-based approach with physics-based optimization, which enables seamless end-to-end robot learning. It has been used in many tasks due to its meticulously designed application programming interface (API) and efficient implementation. Since its initial launch in early 2022, PyPose has experienced significant enhancements, incorporating a wide variety of new features into its platform. To satisfy the growing demand for understanding and utilizing the library and to reduce the learning curve for new users, we present the fundamental design principle of the imperative programming interface, and showcase the flexible usage of diverse functionalities and modules using an extremely simple Dubins car example. We also demonstrate that PyPose can be easily used to navigate a real quadruped robot with a few lines of code.

Submitted 22 September, 2023; originally announced September 2023.

arXiv:2309.06050 [pdf, other]

Coherent Cherenkov Radiation by Bunches in Fast Radio Bursts

Authors: Ze-Nan Liu, Jin-Jun Geng, Yuan-Pei Yang, Wei-Yang Wang, Zi-Gao Dai

Abstract: Fast radio bursts (FRBs) are extragalactic radio transients with extremely high brightness temperature, which strongly suggests the presence of coherent emission mechanisms. In this study, we introduce a novel radiation mechanism for FRBs involving coherent Cherenkov radiation (ChR) emitted by bunched particles that may originate within the magnetosphere of a magnetar. We assume that some relativistic particles are emitted from the polar cap of a magnetar and move along magnetic field lines through a charge-separated magnetic plasma, emitting coherent ChR along their trajectory. The crucial condition for ChR to occur is that the refractive index of the plasma medium, denoted as $n_r$, must satisfy $n_r^2 > 1$. We conduct comprehensive calculations to determine various characteristics of ChR, including its characteristic frequency, emission power, required parallel electric field, and coherence factor. Notably, our proposed bunched coherent ChR mechanism has the remarkable advantage of generating a narrower-band spectrum. Furthermore, a downward frequency drifting pattern and $\sim100\%$ linearly polarized emission can be predicted within the framework of this emission mechanism.

Submitted 12 September, 2023; originally announced September 2023.

Comments: 15 pages, 5 figures, 1 table. Accepted for publication in ApJ

Showing 1–50 of 241 results for author: Geng, J