
High fidelity TiN processing modes for multi-gate Ge-based quantum devices

Authors: Sinan Bugu, Sheshank Biradar, Alan Blake, CheWee Liu, Maksym Myronov, Ray Duffy, Giorgos Fagas, Nikolay Petkov

Abstract: Charge or spin-qubits can be realized by using gate-defined quantum dots (QDs) in semiconductors in a similar fashion to the processes used in CMOS for conventional field-effect transistors or more recent fin FET technology. However, to realize larger number of gate-defined qubits, multiples of gates with ultimately high resolution and fidelity is required. Electron beam lithography (EBL) offers flexible and tunable patterning of gate-defined spin-qubit devices for studying important quantum phenomena. While such devices are commonly realized by a positive resist process using metal lift-off, there are several clear limitations related to the resolution and the fidelity of patterning. Herein, we report a systematic study of an alternative TiN multi-gates definition approach based on the highest resolution hydrogen silsesquioxane (HSQ) EBL resist and all associated processing modes. The TiN gate arrays formed show excellent fidelity, dimensions down to 15 nm, various densities, and complexities. The processing modes developed were used to demonstrate applicability of this approach to forming multi-gate architectures for two types of spin-qubit devices prototypic to i) NW/fin-type FETs and ii) planar quantum well-type devices, both utilizing epi-grown Ge device layers on Si, where GeSn or Ge are the host materials for the QDs.

Submitted 27 August, 2024; originally announced August 2024.

arXiv:2401.16304 [pdf, other]

Regressing Transformers for Data-efficient Visual Place Recognition

Authors: María Leyva-Vallina, Nicola Strisciuglio, Nicolai Petkov

Abstract: Visual place recognition is a critical task in computer vision, especially for localization and navigation systems. Existing methods often rely on contrastive learning: image descriptors are trained to have small distance for similar images and larger distance for dissimilar ones in a latent space. However, this approach struggles to ensure accurate distance-based image similarity representation, particularly when training with binary pairwise labels, and complex re-ranking strategies are required. This work introduces a fresh perspective by framing place recognition as a regression problem, using camera field-of-view overlap as similarity ground truth for learning. By optimizing image descriptors to align directly with graded similarity labels, this approach enhances ranking capabilities without expensive re-ranking, offering data-efficient training and strong generalization across several benchmark datasets.

Submitted 29 January, 2024; originally announced January 2024.

Comments: Accepted for publication in ICRA 2024

arXiv:2303.11739 [pdf, other]

Data-efficient Large Scale Place Recognition with Graded Similarity Supervision

Authors: Maria Leyva-Vallina, Nicola Strisciuglio, Nicolai Petkov

Abstract: Visual place recognition (VPR) is a fundamental task of computer vision for visual localization. Existing methods are trained using image pairs that either depict the same place or not. Such a binary indication does not consider continuous relations of similarity between images of the same place taken from different positions, determined by the continuous nature of camera pose. The binary similarity induces a noisy supervision signal into the training of VPR methods, which stall in local minima and require expensive hard mining algorithms to guarantee convergence. Motivated by the fact that two images of the same place only partially share visual cues due to camera pose differences, we deploy an automatic re-annotation strategy to re-label VPR datasets. We compute graded similarity labels for image pairs based on available localization metadata. Furthermore, we propose a new Generalized Contrastive Loss (GCL) that uses graded similarity labels for training contrastive networks. We demonstrate that the use of the new labels and GCL allow to dispense from hard-pair mining, and to train image descriptors that perform better in VPR by nearest neighbor search, obtaining superior or comparable results than methods that require expensive hard-pair mining and re-ranking techniques. Code and models available at: https://github.com/marialeyvallina/generalized_contrastive_loss

Submitted 25 March, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

Comments: Accepted at CVPR 2023

arXiv:2112.05591 [pdf]

doi 10.1039/D1CP02339J

Transient absorption, femtosecond dynamics, vibrational coherence and molecular modelling of the photoisomerization of N-salicylidene-o-aminophenol in solution

Authors: Nikolai Petkov, Anela Ivanova, Anton Trifonov, Ivan Buchvarov, Stanislav Stanimirov

Abstract: This article presents a study of the excited state relaxation dynamics of N-salicylidene-o-aminophenol (SOAP) in ethanol solution. Femtosecond transient absorption (TA) spectroscopy and theoretical calculations are used in combination to establish the mechanism of the excited state relaxation and type of molecular species involved in the accompanying phototransformations. TA spectra show that upon photoexcitation two SOAP tautomers (E-enol and Z-keto) interconvert by ESIPT. The molecule can subsequently isomerize to the E-keto form of SOAP. An intriguing observation is that the TA spectra of this compound in ethanol show modulations of the signal at the stimulated emission spectral range. It is found that these modulations are due to the coherence of the excited ensemble of molecules whose evolution over time represents a moving wave packet. After Fourier transform of the modulations, two characteristic frequencies are identified. These frequencies are referred to the corresponding vibrational modes of the excited state and their nature is elucidated by DFT quantum chemical calculations. The obtained experimental and theoretical data reveal the nature of vibronic coupling between the ground and excited state and the type of molecular vibrations involved in the molecular dynamics along the potential surface of the first excited state at the initial moment right after excitation. These vibrations characterize the starting point in the excited state dynamics of the molecule toward Z-E isomerization of the keto form of SOAP. The study provides a comprehensive picture of the dynamic processes taking place upon photoexcitation of the compound, which might enable control over the various relaxation channels.

Submitted 10 December, 2021; originally announced December 2021.

Comments: 14 pages, 10 figures

Journal ref: Phys. Chem. Chem. Phys., 2021, 23, 20989-21000

arXiv:2103.06638 [pdf, other]

Generalized Contrastive Optimization of Siamese Networks for Place Recognition

Authors: María Leyva-Vallina, Nicola Strisciuglio, Nicolai Petkov

Abstract: Visual place recognition is a challenging task in computer vision and a key component of camera-based localization and navigation systems. Recently, Convolutional Neural Networks (CNNs) achieved high results and good generalization capabilities. They are usually trained using pairs or triplets of images labeled as either similar or dissimilar, in a binary fashion. In practice, the similarity between two images is not binary, but continuous. Furthermore, training these CNNs is computationally complex and involves costly pair and triplet mining strategies. We propose a Generalized Contrastive loss (GCL) function that relies on image similarity as a continuous measure, and use it to train a siamese CNN. Furthermore, we present three techniques for automatic annotation of image pairs with labels indicating their degree of similarity, and deploy them to re-annotate the MSLS, TB-Places, and 7Scenes datasets. We demonstrate that siamese CNNs trained using the GCL function and the improved annotations consistently outperform their binary counterparts. Our models trained on MSLS outperform the state-of-the-art methods, including NetVLAD, NetVLAD-SARE, AP-GeM and Patch-NetVLAD, and generalize well on the Pittsburgh30k, Tokyo 24/7, RobotCar Seasons v2 and Extended CMU Seasons datasets. Furthermore, training a siamese network using the GCL function does not require complex pair mining. We release the source code at https://github.com/marialeyvallina/generalized_contrastive_loss.

Submitted 20 April, 2023; v1 submitted 11 March, 2021; originally announced March 2021.

Comments: Published at CVPR2023 as arXiv:2303.11739

arXiv:2006.15373 [pdf, other]

MTStereo 2.0: improved accuracy of stereo depth estimation with Max-trees

Authors: Rafael Brandt, Nicola Strisciuglio, Nicolai Petkov

Abstract: Efficient yet accurate extraction of depth from stereo image pairs is required by systems with low power resources, such as robotics and embedded systems. State-of-the-art stereo matching methods based on convolutional neural networks require intensive computations on GPUs and are difficult to deploy on embedded systems. In this paper, we propose a stereo matching method, called MTStereo 2.0, for limited-resource systems that require efficient and accurate depth estimation. It is based on a Max-tree hierarchical representation of image pairs, which we use to identify matching regions along image scan-lines. The method includes a cost function that considers similarity of region contextual information based on the Max-trees and a disparity border preserving cost aggregation approach. MTStereo 2.0 improves on its predecessor MTStereo 1.0 as it a) deploys a more robust cost function, b) performs more thorough detection of incorrect matches, c) computes disparity maps with pixel-level rather than node-level precision. MTStereo provides accurate sparse and semi-dense depth estimation and does not require intensive GPU computations like methods based on CNNs. Thus it can run on embedded and robotics devices with low-power requirements. We tested the proposed approach on several benchmark data sets, namely KITTI 2015, Driving, FlyingThings3D, Middlebury 2014, Monkaa and the TrimBot2020 garden data sets, and achieved competitive accuracy and efficiency. The code is available at https://github.com/rbrandt1/MaxTreeS.

Submitted 27 June, 2020; originally announced June 2020.

arXiv:1906.12151 [pdf, other]

Place recognition in gardens by learning visual representations: data set and benchmark analysis

Authors: Maria Leyva-Vallina, Nicola Strisciuglio, Nicolai Petkov

Abstract: Visual place recognition is an important component of systems for camera localization and loop closure detection. It concerns the recognition of a previously visited place based on visual cues only. Although it is a widely studied problem for indoor and urban environments, the recent use of robots for automation of agricultural and gardening tasks has created new problems, due to the challenging appearance of garden-like environments. Garden scenes predominantly contain green colors, as well as repetitive patterns and textures. The lack of available data recorded in gardens and natural environments makes the improvement of visual localization algorithms difficult. In this paper we propose an extended version of the TB-Places data set, which is designed for testing algorithms for visual place recognition. It contains images with ground truth camera pose recorded in real gardens in different seasons, with varying light conditions. We constructed and released a ground truth for all possible pairs of images, indicating whether they depict the same place or not. We present the results of a benchmark analysis of methods based on convolutional neural networks for holistic image description and place recognition. We train existing networks (i.e. ResNet, DenseNet and VGG NetVLAD) as backbone of a two-way architecture with a contrastive loss function. The results that we obtained demonstrate that learning garden-tailored representations contribute to an improvement of performance, although the generalization capabilities are limited.

Submitted 28 June, 2019; originally announced June 2019.

Comments: Accepted for the 18th International Conference on Computer Analysis of Images and Patterns

arXiv:1905.04107 [pdf, other]

Towards Emotion Retrieval in Egocentric PhotoStream

Authors: Estefania Talavera, Petia Radeva, Nicolai Petkov

Abstract: The availability and use of egocentric data are rapidly increasing due to the growing use of wearable cameras. Our aim is to study the effect (positive, neutral or negative) of egocentric images or events on an observer. Given egocentric photostreams capturing the wearer's days, we propose a method that aims to assign sentiment to events extracted from egocentric photostreams. Such moments can be candidates to retrieve according to their possibility of representing a positive experience for the camera's wearer. The proposed approach obtained a classification accuracy of 75% on the test set, with a deviation of 8%. Our model makes a step forward opening the door to sentiment recognition in egocentric photostreams.