Large Language Models for Autonomous Driving (LLM4AD): Concept, Benchmark, Simulation, and Real-Vehicle Experiment

Authors: Can Cui, Yunsheng Ma, Zichong Yang, Yupeng Zhou, Peiran Liu, Juanwu Lu, Lingxi Li, Yaobin Chen, Jitesh H. Panchal, Amr Abdelraouf, Rohit Gupta, Kyungtae Han, Ziran Wang

Abstract: With the broader usage and highly successful development of Large Language Models (LLMs), there has been a growth of interest and demand for applying LLMs to autonomous driving technology. Driven by their natural language understanding and reasoning ability, LLMs have the potential to enhance various aspects of autonomous driving systems, from perception and scene understanding to language interaction and decision-making. In this paper, we first introduce novel concepts and approaches to designing LLMs for autonomous driving (LLM4AD). Then, we propose a comprehensive benchmark for evaluating the instruction-following abilities of LLMs within the autonomous driving domain. Furthermore, we conduct a series of experiments on both simulation and real-world vehicle platforms, thoroughly evaluating the performance and potential of our LLM4AD systems. Our research highlights the significant potential of LLMs to enhance various aspects of autonomous vehicle technology, from perception and scene understanding to language interaction and decision-making.

Submitted 20 October, 2024; originally announced October 2024.

arXiv:2409.11182 [pdf, other]

Video Token Sparsification for Efficient Multimodal LLMs in Autonomous Driving

Authors: Yunsheng Ma, Amr Abdelraouf, Rohit Gupta, Ziran Wang, Kyungtae Han

Abstract: Multimodal large language models (MLLMs) have demonstrated remarkable potential for enhancing scene understanding in autonomous driving systems through powerful logical reasoning capabilities. However, the deployment of these models faces significant challenges due to their substantial parameter sizes and computational demands, which often exceed the constraints of onboard computation. One major limitation arises from the large number of visual tokens required to capture fine-grained and long-context visual information, leading to increased latency and memory consumption. To address this issue, we propose Video Token Sparsification (VTS), a novel approach that leverages the inherent redundancy in consecutive video frames to significantly reduce the total number of visual tokens while preserving the most salient information. VTS employs a lightweight CNN-based proposal model to adaptively identify key frames and prune less informative tokens, effectively mitigating hallucinations and increasing inference throughput without compromising performance. We conduct comprehensive experiments on the DRAMA and LingoQA benchmarks, demonstrating the effectiveness of VTS in achieving up to a 33% improvement in inference throughput and a 28% reduction in memory usage compared to the baseline without compromising performance.

Submitted 16 September, 2024; originally announced September 2024.

Comments: 10 pages, 3 figures, 4 tables

arXiv:2409.07121 [pdf]

Next-Generation Multi-layer Metasurface Design: Hybrid Deep Learning Models for Beyond-RGB Reconfigurable Structural Colors

Authors: Omar A. M. Abdelraouf, Ahmed Mousa, Mohamed Ragab

Abstract: Metasurfaces are key to the development of flat optics and nanophotonic devices, offering significant advantages in creating structural colors and high-quality factor cavities. Multi-layer metasurfaces (MLMs) further amplify these benefits by enhancing light-matter interactions within individual nanopillars. However, the numerous design parameters involved make traditional simulation tools impractical and time-consuming for optimizing MLMs. This highlights the need for more efficient approaches to accelerate their design. In this work, we introduce NanoPhotoNet, an AI-driven design tool based on a hybrid deep neural network (DNN) model that combines convolutional neural networks (CNN) and Long Short-Term Memory (LSTM) networks. NanoPhotoNet enhances the design and optimization of MLMs, achieving a prediction accuracy of over 98.3% and a speed improvement of 50,000x compared to conventional methods. The tool enables MLMs to produce structural colors beyond the standard RGB region, expanding the RGB gamut area by 163%. Furthermore, we demonstrate the generation of tunable structural colors, extending the metasurface functionality to tunable color filters. These findings present a powerful method for applying NanoPhotoNet to MLMs, enabling strong light-matter interactions in applications such as tunable nanolasers and reconfigurable beam steering.

Submitted 11 September, 2024; originally announced September 2024.

arXiv:2405.03873 [pdf, other]

Investigating Personalized Driving Behaviors in Dilemma Zones: Analysis and Prediction of Stop-or-Go Decisions

Authors: Ziye Qin, Siyan Li, Guoyuan Wu, Matthew J. Barth, Amr Abdelraouf, Rohit Gupta, Kyungtae Han

Abstract: Dilemma zones at signalized intersections present a commonly occurring but unsolved challenge for both drivers and traffic operators. Onsets of the yellow lights prompt varied responses from different drivers: some may brake abruptly, compromising the ride comfort, while others may accelerate, increasing the risk of red-light violations and potential safety hazards. Such diversity in drivers' stop-or-go decisions may result from not only surrounding traffic conditions, but also personalized driving behaviors. To this end, identifying personalized driving behaviors and integrating them into advanced driver assistance systems (ADAS) to mitigate the dilemma zone problem presents an intriguing scientific question. In this study, we employ a game engine-based (i.e., CARLA-enabled) driving simulator to collect high-resolution vehicle trajectories, incoming traffic signal phase and timing information, and stop-or-go decisions from four subject drivers in various scenarios. This approach allows us to analyze personalized driving behaviors in dilemma zones and develop a Personalized Transformer Encoder to predict individual drivers' stop-or-go decisions. The results show that the Personalized Transformer Encoder improves the accuracy of predicting driver decision-making in the dilemma zone by 3.7% to 12.6% compared to the Generic Transformer Encoder, and by 16.8% to 21.6% over the binary logistic regression model.

Submitted 6 May, 2024; originally announced May 2024.

arXiv:2404.11181 [pdf, other]

KI-GAN: Knowledge-Informed Generative Adversarial Networks for Enhanced Multi-Vehicle Trajectory Forecasting at Signalized Intersections

Authors: Chuheng Wei, Guoyuan Wu, Matthew J. Barth, Amr Abdelraouf, Rohit Gupta, Kyungtae Han

Abstract: Reliable prediction of vehicle trajectories at signalized intersections is crucial to urban traffic management and autonomous driving systems. However, it presents unique challenges, due to the complex roadway layout at intersections, involvement of traffic signal controls, and interactions among different types of road users. To address these issues, we present in this paper a novel model called Knowledge-Informed Generative Adversarial Network (KI-GAN), which integrates both traffic signal information and multi-vehicle interactions to predict vehicle trajectories accurately. Additionally, we propose a specialized attention pooling method that accounts for vehicle orientation and proximity at intersections. Based on the SinD dataset, our KI-GAN model is able to achieve an Average Displacement Error (ADE) of 0.05 and a Final Displacement Error (FDE) of 0.12 for a 6-second observation and 6-second prediction cycle. When the prediction window is extended to 9 seconds, the ADE and FDE values are further reduced to 0.11 and 0.26, respectively. These results demonstrate the effectiveness of the proposed KI-GAN model in vehicle trajectory prediction under complex scenarios at signalized intersections, which represents a significant advancement in the target field.

Submitted 19 April, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

Comments: 2024 CVPR AICity Workshop
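
The ADE and FDE metrics quoted in the KI-GAN abstract above follow a standard definition: the mean and the final point-wise Euclidean distance between a predicted and a ground-truth trajectory. A minimal sketch of that definition follows; the function name and the sample coordinates are illustrative assumptions, not taken from the paper or its code.

```python
import math


def displacement_errors(predicted, actual):
    """Return (ADE, FDE) for two equal-length lists of (x, y) points.

    Hypothetical helper: ADE is the mean point-wise Euclidean distance,
    FDE is the distance at the last time step.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("trajectories must be non-empty and of equal length")
    distances = [
        math.hypot(px - ax, py - ay)
        for (px, py), (ax, ay) in zip(predicted, actual)
    ]
    ade = sum(distances) / len(distances)
    fde = distances[-1]
    return ade, fde


# Made-up two-step trajectory: the second point is off by a 3-4-5 triangle.
ade, fde = displacement_errors([(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (4.0, 5.0)])
print(ade, fde)
```

The units of the result are whatever units the coordinates use; the abstract does not state them, so none are assumed here.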

arXiv:2312.04372 [pdf, other]

LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

Authors: Yunsheng Ma, Can Cui, Xu Cao, Wenqian Ye, Peiran Liu, Juanwu Lu, Amr Abdelraouf, Rohit Gupta, Kyungtae Han, Aniket Bera, James M. Rehg, Ziran Wang

Abstract: Autonomous driving (AD) has made significant strides in recent years. However, existing frameworks struggle to interpret and execute spontaneous user instructions, such as "overtake the car ahead." Large Language Models (LLMs) have demonstrated impressive reasoning capabilities showing potential to bridge this gap. In this paper, we present LaMPilot, a novel framework that integrates LLMs into AD systems, enabling them to follow user instructions by generating code that leverages established functional primitives. We also introduce LaMPilot-Bench, the first benchmark dataset specifically designed to quantitatively evaluate the efficacy of language model programs in AD. Adopting the LaMPilot framework, we conduct extensive experiments to assess the performance of off-the-shelf LLMs on LaMPilot-Bench. Our results demonstrate the potential of LLMs in handling diverse driving scenarios and following user instructions in driving. To facilitate further research in this area, we release our code and data at https://github.com/PurdueDigitalTwin/LaMPilot.

Submitted 4 April, 2024; v1 submitted 7 December, 2023; originally announced December 2023.

Comments: CVPR 2024

arXiv:2310.16639 [pdf, other]

Driving through the Concept Gridlock: Unraveling Explainability Bottlenecks in Automated Driving

Authors: Jessica Echterhoff, An Yan, Kyungtae Han, Amr Abdelraouf, Rohit Gupta, Julian McAuley

Abstract: Concept bottleneck models have been successfully used for explainable machine learning by encoding information within the model with a set of human-defined concepts. In the context of human-assisted or autonomous driving, explainability models can help user acceptance and understanding of decisions made by the autonomous vehicle, which can be used to rationalize and explain driver or vehicle behavior. We propose a new approach using concept bottlenecks as visual features for control command predictions and explanations of user and vehicle behavior. We learn a human-understandable concept layer that we use to explain sequential driving scenes while learning vehicle control commands. This approach can then be used to determine whether a change in a preferred gap or steering commands from a human (or autonomous vehicle) is led by an external stimulus or change in preferences. We achieve competitive performance to latent visual features while gaining interpretability within our model setup.

Submitted 26 October, 2023; v1 submitted 25 October, 2023; originally announced October 2023.

arXiv:2309.05115 [pdf, other]

Real-time Learning of Driving Gap Preference for Personalized Adaptive Cruise Control

Authors: Zhouqiao Zhao, Xishun Liao, Amr Abdelraouf, Kyungtae Han, Rohit Gupta, Matthew J. Barth, Guoyuan Wu

Abstract: Advanced Driver Assistance Systems (ADAS) are increasingly important in improving driving safety and comfort, with Adaptive Cruise Control (ACC) being one of the most widely used. However, pre-defined ACC settings may not always align with driver's preferences and habits, leading to discomfort and potential safety issues. Personalized ACC (P-ACC) has been proposed to address this problem, but most existing research uses historical driving data to imitate behaviors that conform to driver preferences, neglecting real-time driver feedback. To bridge this gap, we propose a cloud-vehicle collaborative P-ACC framework that incorporates driver feedback adaptation in real time. The framework is divided into offline and online parts. The offline component records the driver's naturalistic car-following trajectory and uses inverse reinforcement learning (IRL) to train the model on the cloud. In the online component, driver feedback is used to update the driving gap preference in real time. The model is then retrained on the cloud with driver's takeover trajectories, achieving incremental learning to better match driver's preference. Human-in-the-loop (HuiL) simulation experiments demonstrate that our proposed method significantly reduces driver intervention in automatic control systems by up to 62.8%. By incorporating real-time driver feedback, our approach enhances the comfort and safety of P-ACC, providing a personalized and adaptable driving experience.

Submitted 10 September, 2023; originally announced September 2023.

arXiv:2308.07439 [pdf, other]

Interaction-Aware Personalized Vehicle Trajectory Prediction Using Temporal Graph Neural Networks

Authors: Amr Abdelraouf, Rohit Gupta, Kyungtae Han

Abstract: Accurate prediction of vehicle trajectories is vital for advanced driver assistance systems and autonomous vehicles. Existing methods mainly rely on generic trajectory predictions derived from large datasets, overlooking the personalized driving patterns of individual drivers. To address this gap, we propose an approach for interaction-aware personalized vehicle trajectory prediction that incorporates temporal graph neural networks. Our method utilizes Graph Convolution Networks (GCN) and Long Short-Term Memory (LSTM) to model the spatio-temporal interactions between target vehicles and their surrounding traffic. To personalize the predictions, we establish a pipeline that leverages transfer learning: the model is initially pre-trained on a large-scale trajectory dataset and then fine-tuned for each driver using their specific driving data. We employ human-in-the-loop simulation to collect personalized naturalistic driving trajectories and corresponding surrounding vehicle trajectories. Experimental results demonstrate the superior performance of our personalized GCN-LSTM model, particularly for longer prediction horizons, compared to its generic counterpart. Moreover, the personalized model outperforms individual models created without pre-training, emphasizing the significance of pre-training on a large dataset to avoid overfitting. By incorporating personalization, our approach enhances trajectory prediction accuracy.

Submitted 15 August, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

arXiv:2305.08877 [pdf, other]

M$^2$DAR: Multi-View Multi-Scale Driver Action Recognition with Vision Transformer

Authors: Yunsheng Ma, Liangqi Yuan, Amr Abdelraouf, Kyungtae Han, Rohit Gupta, Zihao Li, Ziran Wang

Abstract: Ensuring traffic safety and preventing accidents is a critical goal in daily driving, where the advancement of computer vision technologies can be leveraged to achieve this goal. In this paper, we present a multi-view, multi-scale framework for naturalistic driving action recognition and localization in untrimmed videos, namely M$^2$DAR, with a particular focus on detecting distracted driving behaviors. Our system features a weight-sharing, multi-scale Transformer-based action recognition network that learns robust hierarchical representations. Furthermore, we propose a new election algorithm consisting of aggregation, filtering, merging, and selection processes to refine the preliminary results from the action recognition module across multiple views. Extensive experiments conducted on the 7th AI City Challenge Track 3 dataset demonstrate the effectiveness of our approach, where we achieved an overlap score of 0.5921 on the A2 test set. Our source code is available at https://github.com/PurdueDigitalTwin/M2DAR.

Submitted 12 May, 2023; originally announced May 2023.

Comments: Accepted in the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

arXiv:2305.07840 [pdf, other]

CEMFormer: Learning to Predict Driver Intentions from In-Cabin and External Cameras via Spatial-Temporal Transformers

Authors: Yunsheng Ma, Wenqian Ye, Xu Cao, Amr Abdelraouf, Kyungtae Han, Rohit Gupta, Ziran Wang

Abstract: Driver intention prediction seeks to anticipate drivers' actions by analyzing their behaviors with respect to surrounding traffic environments. Existing approaches primarily focus on late-fusion techniques, and neglect the importance of maintaining consistency between predictions and prevailing driving contexts. In this paper, we introduce a new framework called Cross-View Episodic Memory Transformer (CEMFormer), which employs spatio-temporal transformers to learn unified memory representations for an improved driver intention prediction. Specifically, we develop a spatial-temporal encoder to integrate information from both in-cabin and external camera views, along with episodic memory representations to continuously fuse historical data. Furthermore, we propose a novel context-consistency loss that incorporates driving context as an auxiliary supervision signal to improve prediction performance. Comprehensive experiments on the Brain4Cars dataset demonstrate that CEMFormer consistently outperforms existing state-of-the-art methods in driver intention prediction.

Submitted 13 May, 2023; originally announced May 2023.

arXiv:2303.15231 [pdf]

doi 10.1016/j.aap.2023.107191

Advances and Applications of Computer Vision Techniques in Vehicle Trajectory Generation and Surrogate Traffic Safety Indicators

Authors: Mohamed Abdel-Aty, Zijin Wang, Ou Zheng, Amr Abdelraouf

Abstract: The application of Computer Vision (CV) techniques massively stimulates microscopic traffic safety analysis from the perspective of traffic conflicts and near misses, which is usually measured using Surrogate Safety Measures (SSM). However, as video processing and traffic safety modeling are two separate research domains and few studies have focused on systematically bridging the gap between them, it is necessary to provide transportation researchers and practitioners with corresponding guidance. With this aim in mind, this paper focuses on reviewing the applications of CV techniques in traffic safety modeling using SSM and suggesting the best way forward. The CV algorithms that are used for vehicle detection and tracking, from early approaches to the state-of-the-art models, are summarized at a high level. Then, the video pre-processing and post-processing techniques for vehicle trajectory extraction are introduced. A detailed review of SSMs for vehicle trajectory data along with their application to traffic safety analysis is presented. Finally, practical issues in traffic video processing and SSM-based safety analysis are discussed, and the available or potential solutions are provided. This review is expected to assist transportation researchers and engineers with the selection of suitable CV techniques for video processing, and the usage of SSMs for various traffic safety research objectives.

Submitted 29 June, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

arXiv:2210.08009 [pdf]

Trajectory Prediction for Vehicle Conflict Identification at Intersections Using Sequence-to-Sequence Recurrent Neural Networks

Authors: Amr Abdelraouf, Mohamed Abdel-Aty, Zijin Wang, Ou Zheng

Abstract: Surrogate safety measures in the form of conflict indicators are indispensable components of the proactive traffic safety toolbox. Conflict indicators can be classified into past-trajectory-based conflicts and predicted-trajectory-based conflicts. While the calculation of the former class of conflicts is deterministic and unambiguous, the latter category is computed using predicted vehicle trajectories and is thus more stochastic. Consequently, the accuracy of prediction-based conflicts is contingent on the accuracy of the utilized trajectory prediction algorithm. Trajectory prediction can be a challenging task, particularly at intersections where vehicle maneuvers are diverse. Furthermore, due to limitations relating to the road user trajectory extraction pipelines, accurate geometric representation of vehicles during conflict analysis is a challenging task. Misrepresented geometries distort the real distances between vehicles under observation. In this research, a prediction-based conflict identification methodology was proposed. A sequence-to-sequence Recurrent Neural Network was developed to sequentially predict future vehicle trajectories for up to 3 seconds ahead. Furthermore, the proposed network was trained using the CitySim Dataset to forecast both future vehicle positions and headings to facilitate the prediction of future bounding boxes, thus maintaining accurate vehicle geometric representations.
It was experimentally determined that the proposed method outperformed frequently used trajectory prediction models for conflict analysis at intersections. A comparison between Time-to-Collision (TTC) conflict identification using vehicle bounding boxes versus the commonly used vehicle center points for geometric representation was conducted. Compared to the bounding box method, the center point approach often failed to identify TTC conflicts or underestimated their severity.

Submitted 13 October, 2022; originally announced October 2022.
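
The abstract above reports that center-point geometry tends to miss TTC conflicts or underestimate their severity. A deliberately simplified one-dimensional car-following sketch illustrates why: the distance between vehicle centers is always larger than the true bumper-to-bumper gap, so the center-point TTC comes out longer. The function name, vehicle lengths, and speeds are illustrative assumptions; the paper itself works with predicted rotated bounding boxes at intersections, which this sketch does not reproduce.

```python
import math


def time_to_collision(gap_m, closing_speed_mps):
    """Return TTC in seconds, or infinity if the follower is not closing in."""
    if closing_speed_mps <= 0:
        return math.inf
    # A negative gap would mean the vehicles already overlap; clamp to zero.
    return max(gap_m, 0.0) / closing_speed_mps


# Assumed sample values for two vehicles in the same lane.
center_distance_m = 14.0   # distance between the two vehicle centers
leader_length_m = 4.0
follower_length_m = 4.0
closing_speed_mps = 5.0    # follower speed minus leader speed

# Bounding-box view: subtract half of each vehicle length from the center distance.
bumper_gap_m = center_distance_m - leader_length_m / 2 - follower_length_m / 2

ttc_center = time_to_collision(center_distance_m, closing_speed_mps)
ttc_box = time_to_collision(bumper_gap_m, closing_speed_mps)
print(ttc_center, ttc_box)
```

With these made-up numbers the center-point TTC is longer than the box-based one, so a fixed TTC threshold between the two values would flag the conflict only in the bounding-box case.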

arXiv:2210.05044 [pdf]

Modelling the Relationship Between Post Encroachment Time and Signal Timings Using UAV Video data

Authors: Zubayer Islam, Mohamed Abdel-Aty, Amrita Goswamy, Amr Abdelraouf, Ou Zheng

Abstract: Intersection safety often relies on the correct modelling of signal phasing and timing parameters. A slight increase in yellow time or red time can have a significant impact on rear-end crashes or conflicts. This paper aims to identify the relationship between surrogate safety measures and signal phasing. Unmanned Aerial Vehicle (UAV) video data has been used to study an intersection. Post Encroachment Time (PET) between vehicles was calculated from the video data as well as speed, heading and relevant signal timing parameters such as all red time, red clearance time, yellow time, etc. Random Parameter Ordered Logit Model was used to model the relationship between PET and these signal timing parameters. Overall, the results showed that yellow time and red clearance time are positively related to PETs. The model was also able to identify certain signal phases that could be a potential safety hazard and would need to be retimed by considering the PETs. The odds ratios from the models also indicate that increasing the yellow and red clearance times by one second can improve the PET levels by 16% and 3%, respectively.

Submitted 10 October, 2022; originally announced October 2022.
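
Post Encroachment Time, the surrogate measure used in the entry above, is conventionally defined as the time between the first road user leaving a shared conflict area and the second one entering it. A minimal sketch of that definition follows; the function name and timestamps are hypothetical, and the extraction of these times from UAV video is outside its scope.

```python
def post_encroachment_time(first_exit_s, second_entry_s):
    """Return PET in seconds for one pair of vehicles at a conflict area.

    first_exit_s: time the first vehicle leaves the conflict area.
    second_entry_s: time the second vehicle enters the same area.
    """
    if second_entry_s < first_exit_s:
        # Both vehicles would occupy the area at once, which is a
        # collision course rather than a PET event.
        raise ValueError("second vehicle entered before the first one left")
    return second_entry_s - first_exit_s


# Made-up timestamps in seconds from the start of a recording.
pet = post_encroachment_time(first_exit_s=12.5, second_entry_s=14.0)
print(pet)
```

Smaller values indicate a closer call, which is why the abstract treats a longer PET as an improvement.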

arXiv:2208.11036 [pdf]

doi 10.1177/03611981231185768

CitySim: A Drone-Based Vehicle Trajectory Dataset for Safety Oriented Research and Digital Twins

Authors: Ou Zheng, Mohamed Abdel-Aty, Lishengsa Yue, Amr Abdelraouf, Zijin Wang, Nada Mahmoud

Abstract: The development of safety-oriented research and applications requires fine-grain vehicle trajectories that not only have high accuracy, but also capture substantial safety-critical events. However, the available vehicle trajectory datasets do not have the capacity to satisfy both requirements. This paper introduces the CitySim dataset that has the core objective of facilitating safety-oriented research and applications. CitySim has vehicle trajectories extracted from 1140 minutes of drone videos recorded at 12 locations. It covers a variety of road geometries including freeway basic segments, signalized intersections, stop-controlled intersections, and control-free intersections. CitySim was generated through a five-step procedure that ensured trajectory accuracy. The five-step procedure included video stabilization, object filtering, multi-video stitching, object detection and tracking, and enhanced error filtering. Furthermore, CitySim provides the rotated bounding box information of a vehicle, which was demonstrated to improve safety evaluations. Compared with other video-based critical events, including cut-in, merge, and diverge events, which were validated by distributions of both minimum time-to-collision and minimum post-encroachment time. In addition, CitySim had the capability to facilitate digital-twin-related research by providing relevant assets, such as the recording locations' three-dimensional base maps and signal timings.

Submitted 31 July, 2023; v1 submitted 23 August, 2022; originally announced August 2022.

Comments: Transportation Research Record (2023)

arXiv:2201.08151 [pdf]

High-order Photonic Cavity Modes Enabled 3D Structural Color

Authors: Hailong Liu, Hongtao Wang, Hao Wang, Jie Deng, Qifeng Ruan, Wang Zhang, Omar A. M. Abdelraouf, Noman Soo Seng Ang, Zhaogang Dong, Joel K. W. Yang, Hong Liu

Abstract: It remains a challenge to directly print three-dimensional arbitrary shapes that exhibit structural colors at the micrometer scale. Woodpile photonic crystals (WPCs) fabricated via two-photon lithography (TPL) are promising as building blocks to produce 3D geometries that generate structural colors due to their ability to exhibit either omnidirectional or anisotropic photonic stopbands. However, existing approaches have focused on achieving structural colors when illuminating WPCs from the top, which necessitates print resolutions beyond the limit of commercial TPL and/or post-processing techniques. Here, we devised a new strategy to support high-order photonic cavity modes upon side-illumination on WPCs that surprisingly generate large reflectance peaks in the visible spectrum. Based on that, we demonstrate one-step printing of 3D photonic structural colors without requiring post-processing or subwavelength features. Vivid colors with reflectance peaks exhibiting a full width at half maximum of ~25 nm, a maximum reflectance of 50%, gamut of ~85% of sRGB, and large viewing angles, were achieved. In addition, we also demonstrated voxel-level manipulation and control of colors in arbitrary-shaped 3D objects constituted with WPCs as unit cells, which has great potential for applications in dynamic color displays, colorimetric sensing, anti-counterfeiting, and light-matter interaction platforms.

Submitted 24 January, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

arXiv:2201.07559 [pdf]

Recent Advances in Tunable Metasurfaces: Materials, Design and Applications

Authors: Omar A. M. Abdelraouf, Ziyu Wang, Hailong Liu, Zhaogang Dong, Qian Wang, Ming Ye, Xiao Renshaw Wang, Qi Jie Wang, Hong Liu

Abstract: Metasurfaces, a two-dimensional (2D) form of metamaterials constituted by planar meta-atoms, exhibit exotic abilities to freely tailor electromagnetic (EM) waves. Over the past decade, tunable metasurfaces have come to the frontier in the field of nanophotonics, with tremendous effort focused on developing and integrating various active materials into metasurfaces. As a result, tunable/reconfigurable metasurfaces with multi-functionalities triggered by various external stimuli have been successfully demonstrated, opening a new avenue to dynamically manipulate and control EM waves for photonic applications in demand. In this review, we first briefly survey the progress of tunable metasurface development in the last decade and highlight representative works from the perspectives of active materials development, design methodologies and application-driven exploration. Then, we elaborate on the active tuning mechanisms and relevant active materials. Next, we discuss recent achievements in theory as well as machine learning (ML) assisted design methodologies to sustain the development of this field. After that, we summarize and describe typical application areas of the tunable metasurfaces. We conclude this review by analyzing existing challenges and presenting our perspectives on future directions and opportunities in this vibrant and fast-developing field.

Submitted 19 January, 2022; originally announced January 2022.

arXiv:2105.04233 [pdf]

Multistate Tuning of Third Harmonic Generation in Fano-Resonant Hybrid Dielectric Metasurfaces

Authors: Omar A. M. Abdelraouf, Aravind P. Anthur, Zhaogang Dong, Hailong Liu, Qian Wang, Leonid Krivitsky, Xiao Renshaw Wang, Qi Jie Wang, Hong Liu

Abstract: Hybrid dielectric metasurfaces have emerged as a promising approach to enhancing near-field confinement and thus achieving high optical nonlinearity using low-loss dielectrics. Additional flexibility in design and fabrication of hybrid metasurfaces allows dynamic control of light, which is value-added for a wider range of applications. Here, we demonstrate tunable and efficient third harmonic generation (THG) via hybrid metasurfaces with phase change material Ge2Sb2Te5 (GST) deposited on top of amorphous silicon nanostructures. Fano resonance is excited to confine the incident light inside the hybrid metasurfaces, and an experimental quality factor ($Q$-factor) of 125 is achieved at the fundamental pump wavelength around 1210 nm. We demonstrate the switching between a turn-on state of Fano resonance in the amorphous state of GST and a turn-off state in its crystalline state and also gradual multistate tuning of THG emission at its intermediate state. We achieve a high THG conversion efficiency of $η = 2.9 \times 10^{-6}$ %, which is more than ~32 times that of a GST-based Fabry-Pérot cavity under a similar pump laser power, thanks to the enhanced field confinement due to the Fano resonance. Our results show the strong potential of GST-based hybrid dielectric metasurfaces for efficient and tunable nonlinear optical devices.

Submitted 10 May, 2021; originally announced May 2021.

Showing 1–18 of 18 results for author: Abdelraouf, A