Emerging Technologies
Showing new listings for Tuesday, 22 October 2024
- [1] arXiv:2410.15296 [pdf, html, other]
Title: A Remedy to Compute-in-Memory with Dynamic Random Access Memory: 1FeFET-1C Technology for Neuro-Symbolic AI
Authors: Xunzhao Yin, Hamza Errahmouni Barkam, Franz Müller, Yuxiao Jiang, Mohsen Imani, Sukhrob Abdulazhanov, Alptekin Vardar, Nellie Laleni, Zijian Zhao, Jiahui Duan, Zhiguo Shi, Siddharth Joshi, Michael Niemier, Xiaobo Sharon Hu, Cheng Zhuo, Thomas Kämpfe, Kai Ni
Subjects: Emerging Technologies (cs.ET); Neural and Evolutionary Computing (cs.NE); Symbolic Computation (cs.SC)
Neuro-symbolic artificial intelligence (AI) excels at learning from noisy and generalized patterns, conducting logical inferences, and providing interpretable reasoning. Comprising a 'neuro' component for feature extraction and a 'symbolic' component for decision-making, neuro-symbolic AI has yet to fully benefit from efficient hardware accelerators. Additionally, current hardware struggles to accommodate applications requiring dynamic resource allocation between these two components. To address these challenges, and to mitigate the typical data-transfer bottleneck of classical Von Neumann architectures, we propose a ferroelectric charge-domain compute-in-memory (CiM) array as the foundational processing element for neuro-symbolic AI. This array seamlessly handles both the critical multiply-accumulate (MAC) operations of the 'neuro' workload and the parallel associative search operations of the 'symbolic' workload. To enable this approach, we introduce an innovative 1FeFET-1C cell, combining a ferroelectric field-effect transistor (FeFET) with a capacitor. This design overcomes the destructive sensing limitations of DRAM in CiM applications while remaining capable of capitalizing on decades of DRAM expertise, since its cell structure is similar to that of DRAM. It also achieves high immunity against FeFET variation, which is crucial for neuro-symbolic AI, and demonstrates superior energy efficiency. The functionalities of our design have been successfully validated through SPICE simulations and prototype fabrication and testing. Our hardware platform has been benchmarked in executing typical neuro-symbolic AI reasoning tasks, showing over 2x improvement in latency and 1000x improvement in energy efficiency compared to GPU-based implementations.
- [2] arXiv:2410.15893 [pdf, html, other]
Title: ATOMIC: Automatic Tool for Memristive IMPLY-based Circuit-level Simulation and Validation
Comments: 4 pages, 5 figures, Submitted and Presented at the Embedded Systems Software Competition 2024 at ESWEEK
Subjects: Emerging Technologies (cs.ET)
Since performance improvements of computers are stagnating, new technologies and computing paradigms are hot research topics. Memristor-based In-Memory Computing, which comes in many flavors, is one of the promising candidates for the post-CMOS era. Processing In memory Array (PIA), or processing using memory, is one of these flavors; it is a relatively new approach and substantially different from traditional CMOS-based logic design. Consequently, there is a lack of publicly available CAD tools for memristive PIA design and evaluation. Here, we present ATOMIC: an Automatic Tool for Memristive IMPLY-based Circuit-level Simulation and Validation. Using our tool, a large portion of the simulation, evaluation, and validation process can be performed automatically, drastically reducing the development time for memristive PIA systems, in particular those using IMPLY logic. The code is available at this https URL.
- [3] arXiv:2410.15943 [pdf, html, other]
Title: Molecular Signal Reception in Complex Vessel Networks: The Role of the Network Topology
Comments: 6 pages, 4 figures
Subjects: Emerging Technologies (cs.ET); Signal Processing (eess.SP); Quantitative Methods (q-bio.QM)
The notion of synthetic molecular communication (MC) refers to the transmission of information via molecules and is largely foreseen for use within the human body, where traditional electromagnetic wave (EM)-based communication is impractical. MC is anticipated to enable innovative medical applications, such as early-stage tumor detection, targeted drug delivery, and holistic approaches like the Internet of Bio-Nano Things (IoBNT). Many of these applications involve parts of the human cardiovascular system (CVS), here referred to as networks, posing challenges for MC due to their complex, highly branched vessel structures. To gain a better understanding of how the topology of such branched vessel networks affects the reception of a molecular signal at a target location, e.g., the network outlet, we present a generic analytical end-to-end model that characterizes molecule propagation and reception in linear branched vessel networks (LBVNs). We specialize this generic model to any MC system employing superparamagnetic iron-oxide nanoparticles (SPIONs) as signaling molecules and a planar coil as receiver (RX). By considering components that have been previously established in testbeds, we effectively isolate the impact of the network topology and validate our theoretical model with testbed data. Additionally, we propose two metrics, namely the molecule delay and the multi-path spread, that relate the LBVN topology to the molecule dispersion induced by the network, thereby linking the network structure to the signal-to-noise ratio (SNR) at the target location. This allows the characterization of the SNR at any point in the network solely based on the network topology. Consequently, our framework can, e.g., be exploited for optimal sensor placement in the CVS or identification of suitable testbed topologies for given SNR requirements.
New submissions (showing 3 of 3 entries)
- [4] arXiv:2410.14766 (cross-list from cs.SE) [pdf, html, other]
Title: Evaluating Quantized Large Language Models for Code Generation on Low-Resource Language Benchmarks
Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Emerging Technologies (cs.ET); Machine Learning (cs.LG); Programming Languages (cs.PL)
Democratization of AI is an important topic within the broader topic of the digital divide. This issue is relevant to LLMs, which are becoming popular as AI co-pilots but suffer from a lack of accessibility due to high computational demand. In this study, we evaluate whether quantization is a viable approach toward enabling LLMs on generic consumer devices. The study assesses the performance of five quantized code LLMs in Lua code generation tasks. To evaluate the impact of quantization, the models with 7B parameters were tested on a consumer laptop at 2-, 4-, and 8-bit integer precisions and compared to non-quantized code LLMs with 1.3, 2, and 3 billion parameters. Lua is chosen as a low-resource language to avoid models' biases related to high-resource languages. The results suggest that the models quantized at the 4-bit integer precision offer the best trade-off between performance and model size. These models can be comfortably deployed on an average laptop without a dedicated GPU. The performance drops significantly at the 2-bit integer precision. The models at 8-bit integer precision require more inference time, which does not effectively translate to better performance. The 4-bit models with 7 billion parameters also considerably outperform non-quantized models with lower parameter numbers despite having comparable model sizes with respect to storage and memory demand. While quantization indeed increases the accessibility of smaller LLMs with 7 billion parameters, these LLMs demonstrate overall low performance (less than 50%) on high-precision and low-resource tasks such as Lua code generation. While accessibility is improved, usability is still not at the practical level comparable to foundational LLMs such as GPT-4o or Llama 3.1 405B.
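As a rough illustration of why a 4-bit 7B model can be comparable in size to much smaller non-quantized models, the sketch below estimates weight storage only. It assumes the non-quantized models are stored at 16 bits per weight, which the abstract does not state, and it ignores quantization metadata, activations, and the KV cache. The function name `weight_size_gb` is illustrative and not taken from the paper.

```python
def weight_size_gb(params_billions: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    total_bits = params_billions * 1e9 * bits_per_weight
    return total_bits / 8 / 1e9


# 7B-parameter model at the integer precisions named in the abstract.
quantized = {bits: weight_size_gb(7, bits) for bits in (2, 4, 8)}

# Smaller models, ASSUMED here to be stored at 16 bits per weight.
baseline = {p: weight_size_gb(p, 16) for p in (1.3, 2, 3)}

if __name__ == "__main__":
    for bits, gb in quantized.items():
        print(f"7B @ {bits}-bit: ~{gb:.2f} GB")
    for p, gb in baseline.items():
        print(f"{p}B @ 16-bit (assumed): ~{gb:.2f} GB")
```

Under these assumptions, the 4-bit 7B model needs about 3.5 GB for weights, which falls between the 1.3B model (about 2.6 GB) and the 2B model (about 4 GB) at 16 bits.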
- [5] arXiv:2410.14945 (cross-list from cs.SD) [pdf, html, other]
Title: ImmerseDiffusion: A Generative Spatial Audio Latent Diffusion Model
Comments: This work pioneers a Latent Diffusion Model for generating text-prompted ambisonic spatial audio
Subjects: Sound (cs.SD); Emerging Technologies (cs.ET); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
We introduce ImmerseDiffusion, an end-to-end generative audio model that produces 3D immersive soundscapes conditioned on the spatial, temporal, and environmental conditions of sound objects. ImmerseDiffusion is trained to generate first-order ambisonics (FOA) audio, which is a conventional spatial audio format comprising four channels that can be rendered to multichannel spatial output. The proposed generative system is composed of a spatial audio codec that maps FOA audio to latent components, a latent diffusion model trained based on various user input types, namely, text prompts, spatial, temporal and environmental acoustic parameters, and optionally a spatial audio and text encoder trained in a Contrastive Language and Audio Pretraining (CLAP) style. We propose metrics to evaluate the quality and spatial adherence of the generated spatial audio. Finally, we assess the model performance in terms of generation quality and spatial conformance, comparing the two proposed modes: "descriptive" (which uses spatial text prompts) and "parametric" (which uses non-spatial text prompts and spatial parameters). Our evaluations demonstrate promising results that are consistent with the user conditions and reflect reliable spatial fidelity.
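For readers unfamiliar with the four-channel FOA format mentioned above, the following sketch encodes a single mono sample for a source at a given direction. It assumes the AmbiX convention (ACN channel order W, Y, Z, X with SN3D gains); the abstract does not say which convention ImmerseDiffusion uses, and `encode_foa` is an illustrative name, not part of the paper's system.

```python
import math


def encode_foa(sample: float, azimuth_deg: float, elevation_deg: float):
    """Encode one mono sample as first-order ambisonics.

    ASSUMPTION: AmbiX convention (ACN order W, Y, Z, X; SN3D gains).
    Azimuth is measured counter-clockwise from straight ahead,
    elevation upward from the horizontal plane.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)  # left-right component
    z = sample * math.sin(el)                 # up-down component
    x = sample * math.cos(az) * math.cos(el)  # front-back component
    return (w, y, z, x)


if __name__ == "__main__":
    # A source directly to the left: azimuth 90 degrees, elevation 0.
    print(encode_foa(1.0, 90.0, 0.0))
```

A source directly to the left puts its energy in W and Y, while X and Z are numerically close to zero.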
- [6] arXiv:2410.15087 (cross-list from cs.DC) [pdf, html, other]
Title: The Sunk Carbon Fallacy: Rethinking Carbon Footprint Metrics for Effective Carbon-Aware Scheduling
Authors: Noman Bashir, Varun Gohil, Anagha Belavadi, Mohammad Shahrad, David Irwin, Elsa Olivetti, Christina Delimitrou
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computers and Society (cs.CY); Emerging Technologies (cs.ET); Performance (cs.PF)
The rapid increase in computing demand and its corresponding energy consumption have focused attention on computing's impact on the climate and sustainability. Prior work proposes metrics that quantify computing's carbon footprint across several lifecycle phases, including its supply chain, operation, and end-of-life. Industry uses these metrics to optimize the carbon footprint of manufacturing hardware and running computing applications. Unfortunately, prior work on optimizing datacenters' carbon footprint often succumbs to the sunk cost fallacy by considering embodied carbon emissions (a sunk cost) when making operational decisions (i.e., job scheduling and placement), which leads to operational decisions that do not always reduce the total carbon footprint.
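The effect described in the paragraph above can be shown with a toy placement decision. This is a minimal sketch with made-up numbers and hypothetical names (`Server`, `pick`, `total_footprint`), not the paper's model or metrics: two servers are already owned, so their embodied carbon is emitted regardless of where the job runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    embodied_g_per_hour: float     # amortized embodied carbon (already emitted)
    operational_g_per_hour: float  # emissions caused by running the job


def pick(servers, include_embodied: bool) -> Server:
    """Choose the server with the lowest score under the given metric."""
    def score(s: Server) -> float:
        extra = s.embodied_g_per_hour if include_embodied else 0.0
        return s.operational_g_per_hour + extra
    return min(servers, key=score)


def total_footprint(servers, chosen: Server, hours: float) -> float:
    """Total carbon: sunk embodied carbon of ALL owned servers plus the
    operational carbon of the one server that actually runs the job."""
    sunk = sum(s.embodied_g_per_hour for s in servers) * hours
    return sunk + chosen.operational_g_per_hour * hours


if __name__ == "__main__":
    # Hypothetical values chosen only to make the effect visible.
    fleet = [Server("new", 50.0, 100.0), Server("old", 10.0, 130.0)]
    for flag in (True, False):
        chosen = pick(fleet, include_embodied=flag)
        print(flag, chosen.name, total_footprint(fleet, chosen, hours=1.0))
```

With these hypothetical numbers, the metric that includes embodied carbon prefers the older server, yet the total footprint is lower when the job runs on the newer, more energy-efficient one.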
In this paper, we evaluate carbon-aware job scheduling and placement on a given set of servers for a number of carbon accounting metrics. Our analysis reveals that state-of-the-art carbon accounting metrics that include embodied carbon emissions when making operational decisions can actually increase the total carbon footprint of executing a set of jobs. We study the factors that affect the added carbon cost of such suboptimal decision-making. We then use a real-world case study from a datacenter to demonstrate how the sunk carbon fallacy manifests itself in practice. Finally, we discuss the implications of our findings in better guiding effective carbon-aware scheduling in on-premise and cloud datacenters.
- [7] arXiv:2410.15626 (cross-list from quant-ph) [pdf, html, other]
Title: Hybrid Quantum-HPC Solutions for Max-Cut: Bridging Classical and Quantum Algorithms
Comments: Submitted to IEEE PuneCon
Subjects: Quantum Physics (quant-ph); Distributed, Parallel, and Cluster Computing (cs.DC); Emerging Technologies (cs.ET)
This research explores the integration of the Quantum Approximate Optimization Algorithm (QAOA) into Hybrid Quantum-HPC systems for solving the Max-Cut problem, comparing its performance with classical algorithms like brute-force search and greedy heuristics. We develop a theoretical model to analyze the time complexity, scalability, and communication overhead in hybrid systems. Using simulations, we evaluate QAOA's performance on small-scale Max-Cut instances, benchmarking its runtime, solution accuracy, and resource utilization. The study also investigates the scalability of QAOA with increasing problem size, offering insights into its potential advantages over classical methods for large-scale combinatorial optimization problems, with implications for future Quantum computing applications in HPC environments.
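The two classical baselines named above can be sketched as follows. This is an illustrative implementation under simple assumptions (vertices numbered from 0, non-negative edge weights, one common greedy placement rule); it is not the paper's code, and the function names are hypothetical.

```python
from itertools import product


def cut_value(edges, side):
    """Total weight of edges whose endpoints lie on different sides."""
    return sum(w for u, v, w in edges if side[u] != side[v])


def brute_force_max_cut(n, edges):
    """Exact Max-Cut by enumerating partitions (exponential in n)."""
    best_value, best_side = float("-inf"), None
    # Vertex 0 is fixed on side 0; the mirrored partition has the same cut.
    for bits in product((0, 1), repeat=n - 1):
        side = (0,) + bits
        value = cut_value(edges, side)
        if value > best_value:
            best_value, best_side = value, side
    return best_value, best_side


def greedy_max_cut(n, edges):
    """Place vertices one by one on the side that cuts more weight
    to the neighbours that have already been placed."""
    side = {}
    for v in range(n):
        gain = [0.0, 0.0]  # gain[s]: weight cut if v is placed on side s
        for a, b, w in edges:
            other = b if a == v else a if b == v else None
            if other is not None and other in side:
                gain[1 - side[other]] += w
        side[v] = 0 if gain[0] >= gain[1] else 1
    assignment = tuple(side[v] for v in range(n))
    return cut_value(edges, assignment), assignment


if __name__ == "__main__":
    # Unweighted 5-cycle as a small test instance.
    cycle = [(i, (i + 1) % 5, 1) for i in range(5)]
    print(brute_force_max_cut(5, cycle))
    print(greedy_max_cut(5, cycle))
```

Brute force scales as 2^(n-1) partitions, which is why it only serves as a reference on small instances, whereas the greedy pass is polynomial but offers no optimality guarantee beyond cutting at least half of the total edge weight.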
- [8] arXiv:2410.15724 (cross-list from cs.CR) [pdf, html, other]
Title: Efficient and Universally Accessible Cross-Chain Options without Upfront Holder Collateral
Comments: 19 pages, 4 figures, 2 tables
Subjects: Cryptography and Security (cs.CR); Emerging Technologies (cs.ET)
Options are fundamental to blockchain-based financial markets, offering essential tools for risk management and price speculation, which enhance liquidity, flexibility, and market efficiency in decentralized finance (DeFi). Despite the growing interest in options for blockchain-resident assets, such as cryptocurrencies, current option mechanisms face significant challenges, including limited asset support, high trading delays, and the requirement for option holders to provide upfront collateral.
In this paper, we present a protocol that addresses the aforementioned issues by facilitating efficient and universally accessible option trading without requiring holders to post collateral when establishing options. Our protocol's universality allows for cross-chain options involving nearly any assets on any two different blockchains, provided the chains' programming languages can enforce and execute the necessary contract logic. A key innovation in our approach is the use of Double-Authentication-Preventing Signatures (DAPS), which significantly reduces trading latency. Additionally, by introducing a guarantee from the option writer, our protocol removes the need for upfront collateral from holders. Our evaluation demonstrates that the proposed scheme reduces option transfer latency to less than half of that in existing methods. Rigorous security analysis proves that our protocol achieves secure option trading, even when facing adversarial behaviors.
- [9] arXiv:2410.15736 (cross-list from cs.AR) [pdf, other]
Title: Design of a 64-bit SQRT-CSLA with Reduced Area and High-Speed Applications in Low Power VLSI Circuits
Subjects: Hardware Architecture (cs.AR); Emerging Technologies (cs.ET)
The main areas of research in VLSI system design include area, high speed, and power-efficient data route logic systems. The amount of time needed to send a carry through the adder limits the pace at which addition can occur in digital adders. One of the quickest adders, the Carry Select Adder (CSLA), is utilized by various data processing processors to carry out quick arithmetic operations. It is evident from the CSLA's structure that there is room to cut back on both the area and the delay. This work employs a straightforward and effective gate-level adjustment (in a regular structure) that significantly lowers the CSLA's area and delay. In light of this adjustment, Square-Root Carry Select Adder (SQRT CSLA) designs with bit lengths of 8, 16, 32, and 64 are developed. When compared to the standard SQRT CSLA, the suggested design significantly reduces both area and latency. The Xilinx ISE tool is used for simulation and synthesis. The performance of the recommended designs in terms of delay is estimated in this study using the standard designs. The study of the findings indicates that the suggested CSLA structure outperforms the standard SQRT CSLA.
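To make the carry-select idea concrete, here is a behavioural sketch of a regular (unmodified) carry select adder. It does not model the gate-level adjustment proposed in the abstract, whose details are not given here. The 16-bit group sizes (2, 2, 3, 4, 5) are an assumed square-root style grouping, and the function names are illustrative.

```python
def ripple_add(a_bits, b_bits, carry_in):
    """Ripple-carry add of two little-endian bit lists.

    Returns (sum_bits, carry_out).
    """
    out, c = [], carry_in
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ c)
        c = (a & b) | (c & (a ^ b))
    return out, c


def carry_select_add(a, b, width=16, groups=(2, 2, 3, 4, 5)):
    """Add two unsigned integers group by group.

    Each group computes its sum for carry-in 0 and carry-in 1, then the
    real incoming carry selects one result (the multiplexer step).
    Hardware evaluates the groups in parallel; this model is sequential,
    and it also duplicates the first group, which real designs do not need.
    """
    if sum(groups) != width:
        raise ValueError("group sizes must add up to the adder width")
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    result, carry, pos = [], 0, 0
    for size in groups:
        ga, gb = a_bits[pos:pos + size], b_bits[pos:pos + size]
        sum0, c0 = ripple_add(ga, gb, 0)  # result assuming carry-in = 0
        sum1, c1 = ripple_add(ga, gb, 1)  # result assuming carry-in = 1
        chosen, carry = (sum1, c1) if carry else (sum0, c0)
        result.extend(chosen)
        pos += size
    value = sum(bit << i for i, bit in enumerate(result))
    return value, carry


if __name__ == "__main__":
    print(carry_select_add(12345, 54321))
```

Because every group prepares both candidate results in advance, the critical path is set by the carry selection chain rather than by a full-width ripple, at the cost of duplicated adder hardware. That duplication is the area overhead the abstract says can be reduced.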
- [10] arXiv:2410.15854 (cross-list from cs.NE) [pdf, html, other]
Title: TEXEL: A neuromorphic processor with on-chip learning for beyond-CMOS device integration
Authors: Hugh Greatorex, Ole Richter, Michele Mastella, Madison Cotteret, Philipp Klein, Maxime Fabre, Arianna Rubino, Willian Soares Girão, Junren Chen, Martin Ziegler, Laura Bégon-Lours, Giacomo Indiveri, Elisabetta Chicca
Comments: 17 pages, 7 figures. Supplementary material: 8 pages, 4 figures
Subjects: Neural and Evolutionary Computing (cs.NE); Hardware Architecture (cs.AR); Emerging Technologies (cs.ET); Machine Learning (cs.LG)
Recent advances in memory technologies, devices and materials have shown great potential for integration into neuromorphic electronic systems. However, a significant gap remains between the development of these materials and the realization of large-scale, fully functional systems. One key challenge is determining which devices and materials are best suited for specific functions and how they can be paired with CMOS circuitry. To address this, we introduce TEXEL, a mixed-signal neuromorphic architecture designed to explore the integration of on-chip learning circuits and novel two- and three-terminal devices. TEXEL serves as an accessible platform to bridge the gap between CMOS-based neuromorphic computation and the latest advancements in emerging devices. In this paper, we demonstrate the readiness of TEXEL for device integration through comprehensive chip measurements and simulations. TEXEL provides a practical system for testing bio-inspired learning algorithms alongside emerging devices, establishing a tangible link between brain-inspired computation and cutting-edge device research.
- [11] arXiv:2410.16079 (cross-list from cs.AR) [pdf, html, other]
Title: SAIM: Scalable Analog Ising Machine for Solving Quadratic Binary Optimization Problems
Comments: 5 pages, 8 figures, prepared in IEEE format
Subjects: Hardware Architecture (cs.AR); Emerging Technologies (cs.ET)
This paper presents a CMOS-compatible Lechner-Hauke-Zoller (LHZ)-based analog tile structure as a fundamental unit for developing scalable analog Ising machines (IMs). In the designed LHZ tile, voltage-controlled oscillators are employed as the physical Ising spins, while for the ancillary spins, we introduce an oscillator-based circuit to emulate the constraint needed to ensure the correct functionality of the tile. We implement the proposed LHZ tile in 12 nm FinFET technology using Cadence Virtuoso. Simulation results show that the proposed tile could converge to the results in about 31 ns. Also, the designed spins could operate at approximately 13 GHz.
Cross submissions (showing 8 of 8 entries)
- [12] arXiv:2410.11295 (replaced) [pdf, html, other]
Title: BRC20 Pinning Attack
Subjects: Cryptography and Security (cs.CR); Computational Engineering, Finance, and Science (cs.CE); Emerging Technologies (cs.ET)
BRC20 tokens are a type of non-fungible asset on the Bitcoin network. They allow users to embed customized content within Bitcoin satoshis. The related token frenzy has reached a market size of US$2,650b over the past year (2023Q3-2024Q3). However, this intuitive design has not undergone serious security scrutiny.
We present the first in-depth analysis of the BRC20 transfer mechanism and identify a critical attack vector. A typical BRC20 transfer involves two bundled on-chain transactions with different fee levels: the first (i.e., Tx1) with a lower fee inscribes the transfer request, while the second (i.e., Tx2) with a higher fee finalizes the actual transfer. We find that an adversary can exploit this by sending a manipulated fee transaction (falling between the two fee levels), which allows Tx1 to be processed while Tx2 remains pinned in the mempool. This locks the BRC20 liquidity and disrupts normal transfers for users. We term this the BRC20 pinning attack.
Our attack exposes an inherent design flaw that applies to more than 90% of inscription-based tokens within the Bitcoin ecosystem.
We also conducted the attack on Binance's ORDI hot wallet (the most prevalent BRC20 token and the most active wallet), resulting in a temporary suspension of ORDI withdrawals on Binance for 3.5 hours; withdrawals resumed shortly after our communication.