Exposing Shadow Branches

Authors: Chrysanthos Pepi, Bhargav Reddy Godala, Krishnam Tibrewala, Gino Chacon, Paul V. Gratz, Daniel A. Jiménez, Gilles A. Pokam, David I. August

Abstract: Modern processors implement a decoupled front-end in the form of Fetch Directed Instruction Prefetching (FDIP) to avoid front-end stalls. FDIP is driven by the Branch Prediction Unit (BPU), relying on the BPU's accuracy and branch target tracking structures to speculatively fetch instructions into the Instruction Cache (L1I). As data center applications become more complex, their code footprints also grow, resulting in an increase in Branch Target Buffer (BTB) misses. FDIP can alleviate L1I cache misses, but when it encounters a BTB miss, the BPU may not identify the current instruction as a branch to FDIP. This can prevent FDIP from prefetching or cause it to speculate down the wrong path, further polluting the L1I cache. We observe that the vast majority, 75%, of BTB-missing, unidentified branches are actually present in instruction cache lines that FDIP has previously fetched, but these missing branches have not yet been decoded and inserted into the BTB. This is because the instruction line is decoded from an entry point (the target of the previous taken branch) to an exit point (the taken branch). We call branch instructions present in the ignored portion of the cache line "Shadow Branches". Here we present Skeia, a novel shadow branch decoding technique that identifies and decodes unused bytes in cache lines fetched by FDIP, inserting them into a Shadow Branch Buffer (SBB). The SBB is accessed in parallel with the BTB, allowing FDIP to speculate despite a BTB miss.
With a minimal storage state of 12.25KB, Skeia delivers a geomean speedup of ~5.7% over an 8K-entry BTB (78KB) and ~2% versus adding an equal amount of state to the BTB across 16 front-end bound applications. Since many branches stored in the SBB are unique compared to those in a similarly sized BTB, we consistently observe greater performance gains with Skeia across all examined sizes until saturation.

Submitted 22 August, 2024; originally announced August 2024.

Comments: 13 pages, 16 figures, Submitted to ASPLOS 2025

arXiv:2408.05912 [pdf, other]

Correct Wrong Path

Authors: Bhargav Reddy Godala, Sankara Prasad Ramesh, Krishnam Tibrewala, Chrysanthos Pepi, Gino Chacon, Svilen Kanev, Gilles A. Pokam, Daniel A. Jiménez, Paul V. Gratz, David I. August

Abstract: Modern OOO CPUs have very deep pipelines with large branch misprediction recovery penalties. Speculatively executed instructions on the wrong path can significantly change cache state, depending on speculation levels. Architects often employ trace-driven simulation models in the design exploration stage, which sacrifice precision for speed. Trace-driven simulators are orders of magnitude faster than execution-driven models, reducing the often hundreds of thousands of simulation hours needed to explore new micro-architectural ideas. Despite this strong benefit of trace-driven simulation, these simulators often fail to adequately model the consequences of the wrong path because obtaining them is nontrivial. Prior works consider either a positive or negative impact of the wrong path but not both. Here, we examine wrong-path execution in simulation results and design infrastructure for enabling wrong-path execution in a trace-driven simulator. Our analysis shows the wrong path affects structures on both the instruction and data sides extensively, resulting in performance variations ranging from -3.05% to 20.9% when ignoring the wrong path. To benefit the research community and enhance the accuracy of simulators, we opened our traces and tracing utility in the hopes that industry can provide wrong-path traces generated by their internal simulators, enabling academic simulation without exposing industry IP.

Submitted 11 August, 2024; originally announced August 2024.

Comments: 5 pages, 7 figures, Submitted to Computer Architecture Letters

arXiv:2210.14324 [pdf, other]

The Championship Simulator: Architectural Simulation for Education and Competition

Authors: Nathan Gober, Gino Chacon, Lei Wang, Paul V. Gratz, Daniel A. Jimenez, Elvira Teran, Seth Pugsley, Jinchun Kim

Abstract: Recent years have seen a dramatic increase in the microarchitectural complexity of processors. This increase in complexity presents a twofold challenge for the field of computer architecture. First, no individual architect can fully comprehend the complexity of the entire microarchitecture of the core. This leads to increasingly specialized architects, who treat parts of the core outside their particular expertise as black boxes. Second, with increasing complexity, the field becomes decreasingly accessible to new students of the field. When learning core microarchitecture, new students must first learn the big picture of how the system works in order to understand how the pieces all fit together. The tools used to study microarchitecture experience a similar struggle. As with the microarchitectures they simulate, an increase in complexity reduces accessibility to new users. In this work, we present ChampSim. ChampSim uses a modular design and configurable structure to achieve a low barrier to entry into the field of microarchitectural simulation. ChampSim has shown itself to be useful in multiple areas of research, competition, and education. In this way, we seek to promote access and inclusion despite the increasing complexity of the field of computer architecture.

Submitted 25 October, 2022; originally announced October 2022.

arXiv:2210.00058 [pdf, other]

Hardware Trojan Threats to Cache Coherence in Modern 2.5D Chiplet Systems

Authors: Gino A. Chacon, Charles Williams, Johann Knechtel, Ozgur Sinanoglu, Paul V. Gratz

Abstract: As industry moves toward chiplet-based designs, the insertion of hardware Trojans poses a significant threat to the security of these systems. These systems rely heavily on cache coherence for coherent data communication, making coherence an attractive target. Critically, unlike prior work, which focuses only on malicious packet modifications, a Trojan attack that exploits coherence can modify data in memory that was never touched and is not owned by the chiplet which contains the Trojan. Further, the Trojan need not even be physically between the victim and the memory controller to attack the victim's memory transactions. Here, we explore the fundamental attack vectors possible in chiplet-based systems and provide an example Trojan implementation capable of directly modifying victim data in memory. This work aims to highlight the need for developing mechanisms that can protect and secure the coherence scheme from these forms of attacks.

Submitted 30 September, 2022; originally announced October 2022.

arXiv:2105.02917 [pdf, other]

Coherence Attacks and Countermeasures in Interposer-Based Systems

Authors: Gino Chacon, Tapojyoti Mandal, Johann Knechtel, Ozgur Sinanoglu, Paul Gratz, Vassos Soteriou

Abstract: Industry is moving towards large-scale systems where processor cores, memories, accelerators, etc. are bundled via 2.5D integration. These various components are fabricated separately as chiplets and then integrated using an interconnect carrier, a so-called interposer. This new design style provides benefits in terms of yield as well as economies of scale, as chiplets may come from various third-party vendors, and be integrated into one sophisticated system. The benefits of this approach, however, come at the cost of new challenges for the system's security and integrity when many third-party component chiplets, some from not fully trusted vendors, are integrated. Here, we explore these challenges, but also promises, for modern interposer-based systems of cache-coherent, multi-core chiplets. First, we introduce a new, coherence-based attack, GETXspy, wherein a single compromised chiplet can expose a high-bandwidth side/covert-channel in an ostensibly secure system. We further show that prior art is insufficient to stop this new attack. Second, we propose using an active interposer as a generic, secure-by-construction platform that forms a physical root of trust for modern 2.5D systems. Our scheme has limited overhead, restricted to the active interposer, allowing the chiplets and the coherence system to remain untouched. We show that our scheme prevents a wide range of attacks, including but not limited to our GETXspy attack, with little overhead on system performance, ~4%.
This overhead reduces as workloads increase, ensuring scalability of the scheme.

Submitted 7 January, 2022; v1 submitted 6 May, 2021; originally announced May 2021.

arXiv:1810.13045 [pdf, ps, other]

Analytic Variable Exponent Hardy Spaces

Authors: Gerardo A. Chacón, Gerardo R. Chacón

Abstract: We introduce a variable exponent version of the Hardy space of analytic functions on the unit disk, we show some properties of the space, and give an example of a variable exponent $p(\cdot)$ that satisfies the $\log$-Hölder condition such that $H^{p(\cdot)}\neq H^q$ for any constant exponent $1<q<\infty$. We also consider the variable exponent version of the Hardy space on the upper-half plane.

Submitted 30 October, 2018; originally announced October 2018.

MSC Class: 30H10; 42B30

arXiv:1805.06864 [pdf, ps, other]

Resource allocation under uncertainty: an algebraic and qualitative treatment

Authors: Franklin Camacho, Gerardo Chacón, Ramón Pino Peréz

Abstract: We use an algebraic viewpoint, namely a matrix framework, to deal with the problem of resource allocation under uncertainty in the context of a qualitative approach. Our basic qualitative data are a plausibility relation over the resources, a hierarchical relation over the agents and of course the preference that the agents have over the resources. With this data we propose a qualitative binary relation $\unrhd$ between allocations such that $\mathcal{F}\unrhd \mathcal{G}$ has the following intended meaning: the allocation $\mathcal{F}$ produces more or equal social welfare than the allocation $\mathcal{G}$. We prove that there is a family of allocations which are maximal with respect to $\unrhd$. We prove also that there is a notion of simple deal such that optimal allocations can be reached by sequences of simple deals. Finally, we introduce some mechanism for discriminating optimal allocations.

Submitted 17 May, 2018; originally announced May 2018.

MSC Class: 90A80; 68T; 68T37; 68E; 90A06

arXiv:1709.00724 [pdf, ps, other]

Variable Exponent Fock Spaces

Authors: Gerardo A. Chacon, Gerardo R. Chacon

Abstract: In this article we introduce variable exponent Fock spaces and study some of their basic properties such as the boundedness of evaluation functionals, density of polynomials, boundedness of a Bergman-type projection and duality.

Submitted 3 September, 2017; originally announced September 2017.

MSC Class: Primary 30H20; Secondary 46E30

arXiv:1304.5958 [pdf, ps, other]

Characterizations of Dirichlet-type Spaces

Authors: Xiaosong Liu, Gerardo R. Chacón, Zengjian Lou

Abstract: We give three characterizations of the Dirichlet-type spaces $D(μ)$. First we characterize $D(μ)$ in terms of a double integral and in terms of the mean oscillation in the Bergman metric, neither of which involves the use of derivatives. Next, we obtain another characterization for $D(μ)$ in terms of higher order derivatives. Also, a decomposition theorem for $D(μ)$ is established.

Submitted 22 April, 2013; originally announced April 2013.

MSC Class: Primary 30D45; Secondary 30D50

arXiv:1208.2917 [pdf, ps, other]

Toeplitz Operators on Weighted Bergman Spaces

Authors: Gerardo R. Chacón

Abstract: In this article we characterize the boundedness and compactness of a Toeplitz-type operator on weighted Bergman spaces satisfying the so-called Bekolle-Bonami condition in terms of the Berezin transform.

Submitted 14 August, 2012; originally announced August 2012.

MSC Class: 47B35 (Primary) 32A36 (Secondary)

arXiv:1009.1801 [pdf, ps, other]

Carleson Measures and Reproducing Kernel Thesis in Dirichlet-type spaces

Authors: Gerardo Chacòn, Emmanuel Fricain, Mahmood Shabankhah

Abstract: In this paper, using a generalization of a Richter and Sundberg representation theorem, we give a new characterization of Carleson measures for the Dirichlet-type space $\mathcal D(μ)$ when $μ$ is a finite sum of point masses. A reproducing kernel thesis result is also established in this case.

Submitted 17 February, 2012; v1 submitted 9 September, 2010; originally announced September 2010.

Journal ref: Acta Universitatis Szegediensis. Acta Scientiarum Mathematicarum 78, 1-2 (2012) 315-329

arXiv:math/0504179 [pdf, ps, other]

Composition Operators on the Dirichlet Space and Related Problems

Authors: Gerardo A. Chacon, Gerardo R. Chacon, Jose Gimenez

Abstract: In this paper we investigate the following problem: when is a bounded analytic function $φ$ on the unit disk $\mathbb{D}$, fixing 0, such that $\{φ^n : n = 0, 1, 2, \ldots\}$ is orthogonal in $\mathbb{D}$? We also consider the problem of characterizing the univalent, full self-maps of $\mathbb{D}$ in terms of the norm of the induced composition operator. The first problem is analogous to a celebrated question asked by W. Rudin in the Hardy space setting that was answered recently ([3] and [15]). The second problem is analogous to a problem investigated by J. Shapiro in [14] about the characterization of inner functions in the setting of $H^2$.

Submitted 8 April, 2005; originally announced April 2005.

Comments: 8 pages, 1 figure. See also http://webdelprofesor.ula.ve/nucleotachira/gchacon or http://webdelprofesor.ula.ve/humanidades/grchacon

MSC Class: 47B33 (primary); 47B38; 47A16 (secondary)

Showing 1–12 of 12 results for author: Chacon, G