-
Comparison of different Artificial Neural Networks for Bitcoin price forecasting
Authors:
Silas Baumann,
Karl A. Busch,
Hamza A. A. Gardi
Abstract:
This study investigates the impact of varying sequence lengths on the accuracy of predicting cryptocurrency returns using Artificial Neural Networks (ANNs). Utilizing the Mean Absolute Error (MAE) as a threshold criterion, we aim to enhance prediction accuracy by excluding returns that are smaller than this threshold, thus mitigating errors associated with minor returns. The subsequent evaluation focuses on the accuracy of predicted returns that exceed this threshold. We compare four sequence lengths: 168 hours (7 days), 72 hours (3 days), 24 hours, and 12 hours, each with a return prediction interval of 2 hours. Our findings reveal the influence of sequence length on prediction accuracy and underscore the potential for optimized sequence configurations in financial forecasting models.
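The MAE-threshold idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' procedure: the function names are hypothetical, the MAE is computed on the same sample it filters (a real study would more likely take it from a held-out set), the filter is applied to the predicted return, and "accuracy" is read as agreement in sign between predicted and actual returns.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted returns."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def thresholded_direction_accuracy(actual, predicted):
    """Return (mae, accuracy) where accuracy only counts predictions above the MAE.

    Assumption: a prediction is "correct" if it has the same sign as the
    actual return. Predictions with |predicted| <= MAE are excluded.
    """
    mae = mean_absolute_error(actual, predicted)
    kept = [(a, p) for a, p in zip(actual, predicted) if abs(p) > mae]
    if not kept:
        return mae, None  # nothing exceeds the threshold
    hits = sum(1 for a, p in kept if (a > 0) == (p > 0))
    return mae, hits / len(kept)


# Made-up 2-hour returns, purely illustrative.
actual = [0.02, -0.01, 0.001, -0.03]
predicted = [0.03, -0.002, -0.001, -0.02]
mae, accuracy = thresholded_direction_accuracy(actual, predicted)
print(mae, accuracy)
```

In this toy sample the two small predictions fall below the threshold and are dropped, so only the two larger predictions are scored.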
Submitted 25 July, 2024;
originally announced July 2024.
-
CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models
Authors:
Nick Stracke,
Stefan Andreas Baumann,
Joshua M. Susskind,
Miguel Angel Bautista,
Björn Ommer
Abstract:
Text-to-image generative models have become a prominent and powerful tool that excels at generating high-resolution realistic images. However, guiding the generative process of these models to consider detailed forms of conditioning reflecting style and/or structure information remains an open problem. In this paper, we present LoRAdapter, an approach that unifies both style and structure conditioning under the same formulation using a novel conditional LoRA block that enables zero-shot control. LoRAdapter is an efficient, powerful, and architecture-agnostic approach to condition text-to-image diffusion models, which enables fine-grained control conditioning during generation and outperforms recent state-of-the-art approaches.
Submitted 8 October, 2024; v1 submitted 13 May, 2024;
originally announced May 2024.
-
Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions
Authors:
Stefan Andreas Baumann,
Felix Krause,
Michael Neumayr,
Nick Stracke,
Vincent Tao Hu,
Björn Ommer
Abstract:
In recent years, advances in text-to-image (T2I) diffusion models have substantially elevated the quality of their generated images. However, achieving fine-grained control over attributes remains a challenge due to the limitations of natural language prompts (such as no continuous set of intermediate descriptions existing between "person" and "old person"). Even though many methods were introduced that augment the model or generation process to enable such control, methods that do not require a fixed reference image are limited to either enabling global fine-grained attribute expression control or coarse attribute expression control localized to specific subjects, not both simultaneously. We show that there exist directions in the commonly used token-level CLIP text embeddings that enable fine-grained subject-specific control of high-level attributes in text-to-image models. Based on this observation, we introduce one efficient optimization-free and one robust optimization-based method to identify these directions for specific attributes from contrastive text prompts. We demonstrate that these directions can be used to augment the prompt text input with fine-grained control over attributes of specific subjects in a compositional manner (control over multiple attributes of a single subject) without having to adapt the diffusion model. Project page: https://compvis.github.io/attribute-control. Code is available at https://github.com/CompVis/attribute-control.
Submitted 25 March, 2024;
originally announced March 2024.
-
ZigMa: A DiT-style Zigzag Mamba Diffusion Model
Authors:
Vincent Tao Hu,
Stefan Andreas Baumann,
Ming Gui,
Olga Grebenkova,
Pingchuan Ma,
Johannes Fischer,
Björn Ommer
Abstract:
The diffusion model has long been plagued by scalability and quadratic complexity issues, especially within transformer-based structures. In this study, we aim to leverage the long sequence modeling capability of a State-Space Model called Mamba to extend its applicability to visual data generation. Firstly, we identify a critical oversight in most current Mamba-based vision methods, namely the lack of consideration for spatial continuity in the scan scheme of Mamba. Secondly, building upon this insight, we introduce a simple, plug-and-play, zero-parameter method named Zigzag Mamba, which outperforms Mamba-based baselines and demonstrates improved speed and memory utilization compared to transformer-based baselines. Lastly, we integrate Zigzag Mamba with the Stochastic Interpolant framework to investigate the scalability of the model on large-resolution visual datasets, such as FacesHQ $1024\times 1024$ and UCF101, MultiModal-CelebA-HQ, and MS COCO $256\times 256$. Code will be released at https://taohu.me/zigma/
Submitted 1 April, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.
-
DepthFM: Fast Monocular Depth Estimation with Flow Matching
Authors:
Ming Gui,
Johannes S. Fischer,
Ulrich Prestel,
Pingchuan Ma,
Dmytro Kotovenko,
Olga Grebenkova,
Stefan Andreas Baumann,
Vincent Tao Hu,
Björn Ommer
Abstract:
Monocular depth estimation is crucial for numerous downstream vision tasks and applications. Current discriminative approaches to this problem are limited due to blurry artifacts, while state-of-the-art generative methods suffer from slow sampling due to their SDE nature. Rather than starting from noise, we seek a direct mapping from input image to depth map. We observe that this can be effectively framed using flow matching, since its straight trajectories through solution space offer efficiency and high quality. Our study demonstrates that a pre-trained image diffusion model can serve as an adequate prior for a flow matching depth model, allowing efficient training on only synthetic data to generalize to real images. We find that an auxiliary surface normals loss further improves the depth estimates. Due to the generative nature of our approach, our model reliably predicts the confidence of its depth estimates. On standard benchmarks of complex natural scenes, our lightweight approach exhibits state-of-the-art performance at favorable low computational cost despite only being trained on little synthetic data.
Submitted 20 March, 2024;
originally announced March 2024.
-
X-maps: Direct Depth Lookup for Event-based Structured Light Systems
Authors:
Wieland Morgenstern,
Niklas Gard,
Simon Baumann,
Anna Hilsmann,
Peter Eisert
Abstract:
We present a new approach to direct depth estimation for Spatial Augmented Reality (SAR) applications using event cameras. These dynamic vision sensors are a great fit to be paired with laser projectors for depth estimation in a structured light approach. Our key contributions involve a conversion of the projector time map into a rectified X-map, capturing x-axis correspondences for incoming events and enabling direct disparity lookup without any additional search. Compared to previous implementations, this significantly simplifies depth estimation, making it more efficient, while the accuracy is similar to the time map-based process. Moreover, we compensate for the non-linear temporal behavior of cheap laser projectors by a simple time map calibration, resulting in improved performance and increased depth estimation accuracy. Since depth estimation is executed by two lookups only, it can be executed almost instantly (less than 3 ms per frame with a Python implementation) for incoming events. This allows for real-time interactivity and responsiveness, which makes our approach especially suitable for SAR experiences where low latency, high frame rates and direct feedback are crucial. We present valuable insights gained into data transformed into X-maps and evaluate our depth from disparity estimation against the state-of-the-art time map-based results. Additional results and code are available on our project page: https://fraunhoferhhi.github.io/X-maps/
Submitted 15 February, 2024;
originally announced February 2024.
-
They were robbed! Scoring by the middlemost to attenuate biased judging in boxing
Authors:
Stuart Baumann,
Carl Singleton
Abstract:
Boxing has a long-standing problem with biased judging, impacting both professional and Olympic bouts. "Robberies", where boxers are widely seen as being denied rightful victories, threaten to drive fans and athletes away from the sport. To tackle this problem, we propose a minimalist adjustment in how boxing is scored: the winner would be decided by the majority of round-by-round victories according to the judges, rather than relying on the judges' overall bout scores. This approach, rooted in social choice theory and utilising majority rule and middlemost aggregation functions, creates a coordination problem for partisan judges and attenuates their influence. Our model analysis and simulations demonstrate the potential to significantly decrease the likelihood of a partisan judge swaying the result of a bout.
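The two aggregation rules contrasted in this abstract can be sketched as follows. This is one simplified reading, not the authors' model: each judge is assumed to record only a per-round pick ("A", "B", or "D" for a drawn round) rather than points, and the function names are hypothetical. The sketch only shows that the two rules can disagree on the same scorecards; it does not reproduce the paper's analysis of partisan judges.

```python
from collections import Counter


def majority(picks):
    """Majority pick among 'A', 'B', 'D' votes; 'D' if A and B are tied."""
    counts = Counter(picks)
    if counts["A"] > counts["B"]:
        return "A"
    if counts["B"] > counts["A"]:
        return "B"
    return "D"


def bout_winner_by_rounds(scorecards):
    """Proposed rule (as read here): majority of judges decides each round,
    then the boxer with more rounds won takes the bout.

    scorecards[j][r] is judge j's pick for round r.
    """
    round_winners = [majority(round_picks) for round_picks in zip(*scorecards)]
    return majority(round_winners)


def bout_winner_by_judges(scorecards):
    """Conventional rule: each judge's overall card counts as one vote."""
    return majority([majority(card) for card in scorecards])


# Three judges, five rounds (made-up scorecards).
cards = [
    ["A", "A", "A", "B", "B"],  # judge 1: A overall, 3 rounds to 2
    ["B", "B", "A", "A", "A"],  # judge 2: A overall, 3 rounds to 2
    ["B", "B", "A", "B", "B"],  # judge 3: B overall, 4 rounds to 1
]
print(bout_winner_by_judges(cards), bout_winner_by_rounds(cards))
```

On these cards the per-judge rule gives the bout to A by two cards to one, while the round-by-round rule gives it to B, who wins four of the five rounds on a majority of judges.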
Submitted 27 June, 2024; v1 submitted 9 February, 2024;
originally announced February 2024.
-
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Authors:
Katherine Crowson,
Stefan Andreas Baumann,
Alex Birch,
Tanishq Mathew Abraham,
Daniel Z. Kaplan,
Enrico Shippole
Abstract:
We present the Hourglass Diffusion Transformer (HDiT), an image generative model that exhibits linear scaling with pixel count, supporting training at high-resolution (e.g. $1024 \times 1024$) directly in pixel-space. Building on the Transformer architecture, which is known to scale to billions of parameters, it bridges the gap between the efficiency of convolutional U-Nets and the scalability of Transformers. HDiT trains successfully without typical high-resolution training techniques such as multiscale architectures, latent autoencoders or self-conditioning. We demonstrate that HDiT performs competitively with existing models on ImageNet $256^2$, and sets a new state-of-the-art for diffusion models on FFHQ-$1024^2$.
Submitted 21 January, 2024;
originally announced January 2024.
-
arXiv:2401.04793
cond-mat.mtrl-sci
cond-mat.mes-hall
cond-mat.str-el
cond-mat.supr-con
quant-ph
2024 Roadmap on Magnetic Microscopy Techniques and Their Applications in Materials Science
Authors:
D. V. Christensen,
U. Staub,
T. R. Devidas,
B. Kalisky,
K. C. Nowack,
J. L. Webb,
U. L. Andersen,
A. Huck,
D. A. Broadway,
K. Wagner,
P. Maletinsky,
T. van der Sar,
C. R. Du,
A. Yacoby,
D. Collomb,
S. Bending,
A. Oral,
H. J. Hug,
A. -O. Mandru,
V. Neu,
H. W. Schumacher,
S. Sievers,
H. Saito,
A. A. Khajetoorians,
N. Hauptmann
, et al. (28 additional authors not shown)
Abstract:
Considering the growing interest in magnetic materials for unconventional computing, data storage, and sensor applications, there is active research not only on material synthesis but also characterisation of their properties. In addition to structural and integral magnetic characterisations, imaging of magnetization patterns, current distributions and magnetic fields at nano- and microscale is of major importance to understand the material responses and qualify them for specific applications. In this roadmap, we aim to cover a broad portfolio of techniques to perform nano- and microscale magnetic imaging using SQUIDs, spin center and Hall effect magnetometries, scanning probe microscopies, x-ray- and electron-based methods as well as magnetooptics and nanoMRI. The roadmap is aimed as a single access point of information for experts in the field as well as the young generation of students outlining prospects of the development of magnetic imaging technologies for the upcoming decade with a focus on physics, materials science, and chemistry of planar, 3D and geometrically curved objects of different material classes including 2D materials, complex oxides, semi-metals, multiferroics, skyrmions, antiferromagnets, frustrated magnets, magnetic molecules/nanoparticles, ionic conductors, superconductors, spintronic and spinorbitronic materials.
Submitted 9 January, 2024;
originally announced January 2024.
-
Boosting Latent Diffusion with Flow Matching
Authors:
Johannes S. Fischer,
Ming Gui,
Pingchuan Ma,
Nick Stracke,
Stefan A. Baumann,
Björn Ommer
Abstract:
Recently, there has been tremendous progress in visual synthesis and the underlying generative models. Here, diffusion models (DMs) stand out particularly, but lately, flow matching (FM) has also garnered considerable interest. While DMs excel in providing diverse images, they suffer from long training and slow generation. With latent diffusion, these issues are only partially alleviated. Conversely, FM offers faster training and inference but exhibits less diversity in synthesis. We demonstrate that introducing FM between the Diffusion model and the convolutional decoder offers high-resolution image synthesis with reduced computational cost and model size. Diffusion can then efficiently provide the necessary generation diversity. FM compensates for the lower resolution, mapping the small latent space to a high-dimensional one. Subsequently, the convolutional decoder of the LDM maps these latents to high-resolution images. By combining the diversity of DMs, the efficiency of FMs, and the effectiveness of convolutional decoders, we achieve state-of-the-art high-resolution image synthesis at $1024^2$ with minimal computational cost. Importantly, our approach is orthogonal to recent approximation and speed-up strategies for the underlying DMs, making it easily integrable into various DM frameworks.
Submitted 28 March, 2024; v1 submitted 12 December, 2023;
originally announced December 2023.
-
A Proof of the $(n,k,t)$ Conjectures
Authors:
Stacie Baumann,
Joseph Briggs
Abstract:
An $(n,k,t)$-graph is a graph on $n$ vertices in which every set of $k$ vertices contains a clique on $t$ vertices. Turán's Theorem (complemented) states that the unique minimum $(n,k,2)$-graph is a disjoint union of cliques. We prove that minimum $(n,k,t)$-graphs are always disjoint unions of cliques for any $t$ (despite nonuniqueness of extremal examples), thereby generalizing Turán's Theorem and confirming two conjectures of Hoffman et al.
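The definition of an $(n,k,t)$-graph can be checked directly on small examples. The sketch below is a brute-force test written for illustration only; the function names and the adjacency-dictionary representation are assumptions, and the exhaustive search is only practical for very small graphs. It plays no role in the proof.

```python
from itertools import combinations


def is_clique(adj, vertices):
    """True if every pair of the given vertices is adjacent."""
    return all(v in adj[u] for u, v in combinations(vertices, 2))


def is_nkt_graph(adj, k, t):
    """True if every k-subset of vertices contains a clique on t vertices.

    adj maps each vertex to the set of its neighbours.
    """
    for subset in combinations(adj, k):
        if not any(is_clique(adj, c) for c in combinations(subset, t)):
            return False
    return True


def disjoint_cliques(sizes):
    """Adjacency dict for a disjoint union of cliques with the given sizes."""
    adj, start = {}, 0
    for size in sizes:
        block = range(start, start + size)
        for v in block:
            adj[v] = set(block) - {v}
        start += size
    return adj


two_triangles = disjoint_cliques([3, 3])
print(is_nkt_graph(two_triangles, 3, 2))  # any 3 vertices share a triangle pair
print(is_nkt_graph(two_triangles, 4, 3))  # 2 + 2 vertices contain no triangle
```

For two disjoint triangles, any three vertices include two from the same triangle (pigeonhole), so the graph is a $(6,3,2)$-graph, but four vertices split two and two contain no triangle, so it is not a $(6,4,3)$-graph.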
Submitted 13 June, 2023; v1 submitted 15 October, 2022;
originally announced October 2022.
-
Updatable Materialization of Approximate Constraints
Authors:
Steffen Kläbe,
Kai-Uwe Sattler,
Stephan Baumann
Abstract:
Modern big data applications integrate data from various sources. As a result, these datasets may not satisfy perfect constraints, leading to sparse schema information and non-optimal query performance. The existing PatchIndex approach enables the definition of approximate constraints and improves query performance by exploiting the materialized constraint information. As real-world data warehouse workloads are often not limited to read-only queries, we enhance the PatchIndex structure towards an update-conscious design in this paper. To this end, we present a sharded bitmap as the underlying data structure, which offers efficient update operations, and describe approaches to maintain approximate constraints under updates, avoiding index recomputations and full table scans. In our evaluation, we show that PatchIndexes significantly impact query performance while achieving lightweight update support.
Submitted 12 February, 2021;
originally announced February 2021.
-
Imaging the breakdown of ohmic transport in graphene
Authors:
A. Jenkins,
S. Baumann,
H. Zhou,
S. A. Meynell,
D. Yang,
K. Watanabe,
T. Taniguchi,
A. Lucas,
A. F. Young,
A. C. Bleszynski Jayich
Abstract:
Ohm's law describes the proportionality of current density and electric field. In solid-state conductors, Ohm's law emerges due to electron scattering processes that relax the electrical current. Here, we use nitrogen-vacancy center magnetometry to directly image the local breakdown of Ohm's law in a narrow constriction fabricated in a high mobility graphene monolayer. Ohmic flow is visible at room temperature as current concentration on the constriction edges, with flow profiles entirely determined by sample geometry. However, as the temperature is lowered below 200 K, the current concentrates near the constriction center. The change in the flow pattern is consistent with a crossover from diffusive to viscous electron transport dominated by electron-electron scattering processes that do not relax current.
Submitted 12 March, 2020; v1 submitted 12 February, 2020;
originally announced February 2020.
-
High frequency impact ionization and nonlinearity of photocurrent induced by intense terahertz radiation in HgTe-based quantum well structures
Authors:
S. Hubmann,
G. V. Budkin,
A. P. Dmitriev,
S. Gebert,
V. V. Belkov,
E. L. Ivchenko,
S. Baumann,
M. Otteneder,
D. A. Kozlov,
N. N. Mikhailov,
S. A. Dvoretsky,
Z. D. Kvon,
S. D. Ganichev
Abstract:
We report on a strong nonlinear behavior of the photogalvanics and photoconductivity under excitation of HgTe quantum wells (QWs) by intense terahertz (THz) radiation. The increasing radiation intensity causes an inversion of the sign of the photocurrent and a transition to a superlinear dependence on the intensity. The photoconductivity also shows a superlinear rise with the intensity. We show that the observed photoresponse nonlinearities are caused by the band-to-band light impact ionization under conditions of a photon energy less than the forbidden gap. The signature of this kind of impact ionization is that the angular radiation frequency $\omega = 2\pi f$ is much higher than the reciprocal momentum relaxation time. Thus, the impact ionization takes place solely because of collisions in the presence of a high-frequency electric field. The effect has been measured on narrow HgTe/CdTe QWs of 5.7 nm width; the nonlinearity is detected for linearly and circularly polarized THz radiation with different frequencies ranging from $f=0.6$ to 1.07 THz and intensities up to hundreds of kW/cm$^2$. We demonstrate that the probability of the impact ionization is proportional to the exponential function, $\exp(-E_0^2/E^2)$, of the radiation electric field amplitude $E$ and the characteristic field parameter $E_0$. The effect is observable in a wide temperature range from 4.2 to 90 K, with the characteristic field increasing with rising temperature.
Submitted 4 December, 2018;
originally announced December 2018.
-
Interplay between Orbital Magnetic Moment and Crystal Field Symmetry: Fe atoms on MgO
Authors:
S. Baumann,
F. Donati,
S. Stepanow,
S. Rusponi,
W. Paul,
S. Gangopadhyay,
I. G. Rau,
G. E. Pacchioni,
L. Gragnaniello,
M. Pivetta,
J. Dreiser,
C. Piamonteze,
C. P. Lutz,
R. M. Macfarlane,
B. A. Jones,
P. Gambardella,
A. J. Heinrich,
H. Brune
Abstract:
We combine density functional theory, x-ray magnetic circular dichroism, multiplet calculations, and scanning tunneling spectroscopy to assess the magnetic properties of Fe atoms adsorbed on a thin layer of MgO(100) on Ag(100). Despite the strong axial field due to the O ligand, the weak cubic field induced by the four-fold coordination to Mg atoms entirely quenches the first order orbital moment. This is in marked contrast to Co, which has an out-of-plane orbital moment of $L_z = \pm 3$ that is protected from mixing in a cubic ligand field. The spin-orbit interaction restores a large fraction of the Fe orbital moment, leading to a zero-field splitting of $14.0 \pm 0.3$ meV, the largest value reported for surface adsorbed Fe atoms.
Submitted 25 June, 2015;
originally announced June 2015.
-
RRL: A Rich Representation Language for the Description of Agent Behaviour in NECA
Authors:
P. Piwek,
B. Krenn,
M. Schroeder,
M. Grice,
S. Baumann,
H. Pirker
Abstract:
In this paper, we describe the Rich Representation Language (RRL) which is used in the NECA system. The NECA system generates interactions between two or more animated characters. The RRL is an XML compliant framework for representing the information that is exchanged at the interfaces between the various NECA system modules. The full XML Schemas for the RRL are available at http://www.ai.univie.ac.at/NECA/RRL
Submitted 11 October, 2004;
originally announced October 2004.