-
Exposing Shadow Branches
Authors:
Chrysanthos Pepi,
Bhargav Reddy Godala,
Krishnam Tibrewala,
Gino Chacon,
Paul V. Gratz,
Daniel A. Jiménez,
Gilles A. Pokam,
David I. August
Abstract:
Modern processors implement a decoupled front-end in the form of Fetch Directed Instruction Prefetching (FDIP) to avoid front-end stalls. FDIP is driven by the Branch Prediction Unit (BPU), relying on the BPU's accuracy and branch target tracking structures to speculatively fetch instructions into the Instruction Cache (L1I). As data center applications become more complex, their code footprints also grow, resulting in an increase in Branch Target Buffer (BTB) misses. FDIP can alleviate L1I cache misses, but when it encounters a BTB miss, the BPU may not identify the current instruction as a branch to FDIP. This can prevent FDIP from prefetching or cause it to speculate down the wrong path, further polluting the L1I cache. We observe that the vast majority (75%) of BTB-missing, unidentified branches are actually present in instruction cache lines that FDIP has previously fetched, but these missing branches have not yet been decoded and inserted into the BTB. This is because the instruction line is decoded only from an entry point (the target of the previous taken branch) to an exit point (the taken branch). We call branch instructions present in the ignored portion of the cache line "Shadow Branches". Here we present Skeia, a novel shadow branch decoding technique that identifies and decodes unused bytes in cache lines fetched by FDIP, inserting the branches it finds into a Shadow Branch Buffer (SBB). The SBB is accessed in parallel with the BTB, allowing FDIP to speculate despite a BTB miss. With a minimal storage state of 12.25KB, Skeia delivers a geomean speedup of ~5.7% over an 8K-entry BTB (78KB) and ~2% versus adding an equal amount of state to the BTB across 16 front-end bound applications. Since many branches stored in the SBB are unique compared to those in a similarly sized BTB, we consistently observe greater performance gains with Skeia across all examined sizes until saturation.
Submitted 22 August, 2024;
originally announced August 2024.
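The shadow-branch idea described in the abstract above can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the authors' Skeia implementation: the class `FrontEnd`, its methods, the 64-byte line size, the dictionary-based buffers, and the choice to prefer a BTB hit over an SBB hit are all hypothetical. Branches inside the decoded region (entry point to exit point) go to the BTB, while branches in the ignored bytes of the same line go to the SBB.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

LINE_BYTES = 64  # assumed cache-line size; not stated in the abstract


@dataclass
class FrontEnd:
    """Toy model: unbounded dictionaries stand in for the BTB and SBB."""
    btb: Dict[int, int] = field(default_factory=dict)  # branch PC -> target
    sbb: Dict[int, int] = field(default_factory=dict)  # shadow branch PC -> target

    def decode_line(self, line_base: int, branches: Dict[int, int],
                    entry: int, exit_: int) -> None:
        """Record the branches found in one fetched cache line.

        `branches` maps byte offset -> target for every branch physically
        present in the line. Offsets in [entry, exit_] form the normally
        decoded region; everything else is the "shadow" portion.
        """
        for offset, target in branches.items():
            if not 0 <= offset < LINE_BYTES:
                raise ValueError("branch offset outside the cache line")
            pc = line_base + offset
            if entry <= offset <= exit_:
                self.btb[pc] = target   # normal decode path
            else:
                self.sbb[pc] = target   # shadow branch

    def lookup(self, pc: int) -> Optional[int]:
        """Model a parallel BTB/SBB lookup (assumption: a BTB hit wins)."""
        target = self.btb.get(pc)
        if target is None:
            target = self.sbb.get(pc)
        return target


# Usage: one line at 0x1000 with branches at offsets 8 and 40;
# only bytes 0..16 lie between the entry and exit points.
fe = FrontEnd()
fe.decode_line(0x1000, {8: 0x2000, 40: 0x3000}, entry=0, exit_=16)
```

In this model, a later lookup of the branch at offset 40 misses in the BTB but hits in the SBB, which is the situation in which the abstract says FDIP can continue to speculate.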
-
Correct Wrong Path
Authors:
Bhargav Reddy Godala,
Sankara Prasad Ramesh,
Krishnam Tibrewala,
Chrysanthos Pepi,
Gino Chacon,
Svilen Kanev,
Gilles A. Pokam,
Daniel A. Jiménez,
Paul V. Gratz,
David I. August
Abstract:
Modern OOO CPUs have very deep pipelines with large branch misprediction recovery penalties. Speculatively executed instructions on the wrong path can significantly change cache state, depending on speculation levels. Architects often employ trace-driven simulation models in the design exploration stage, which sacrifice precision for speed. Trace-driven simulators are orders of magnitude faster than execution-driven models, reducing the often hundreds of thousands of simulation hours needed to explore new micro-architectural ideas. Despite this strong benefit, trace-driven simulators often fail to adequately model the consequences of wrong-path execution because wrong-path information is nontrivial to obtain. Prior works consider either a positive or a negative impact of the wrong path, but not both. Here, we examine wrong-path execution in simulation results and design infrastructure for enabling wrong-path execution in a trace-driven simulator. Our analysis shows that the wrong path extensively affects structures on both the instruction and data sides, resulting in performance variations ranging from -3.05% to 20.9% when the wrong path is ignored. To benefit the research community and enhance the accuracy of simulators, we have opened our traces and tracing utility in the hope that industry can provide wrong-path traces generated by their internal simulators, enabling academic simulation without exposing industry IP.
Submitted 11 August, 2024;
originally announced August 2024.
-
The Championship Simulator: Architectural Simulation for Education and Competition
Authors:
Nathan Gober,
Gino Chacon,
Lei Wang,
Paul V. Gratz,
Daniel A. Jimenez,
Elvira Teran,
Seth Pugsley,
Jinchun Kim
Abstract:
Recent years have seen a dramatic increase in the microarchitectural complexity of processors. This increase in complexity presents a twofold challenge for the field of computer architecture. First, no individual architect can fully comprehend the complexity of the entire microarchitecture of the core. This leads to increasingly specialized architects, who treat parts of the core outside their particular expertise as black boxes. Second, with increasing complexity, the field becomes decreasingly accessible to new students of the field. When learning core microarchitecture, new students must first learn the big picture of how the system works in order to understand how the pieces all fit together. The tools used to study microarchitecture experience a similar struggle. As with the microarchitectures they simulate, an increase in complexity reduces accessibility to new users.
In this work, we present ChampSim. ChampSim uses a modular design and configurable structure to achieve a low barrier to entry into the field of microarchitectural simulation. ChampSim has shown itself to be useful in multiple areas of research, competition, and education. In this way, we seek to promote access and inclusion despite the increasing complexity of the field of computer architecture.
Submitted 25 October, 2022;
originally announced October 2022.
-
Hardware Trojan Threats to Cache Coherence in Modern 2.5D Chiplet Systems
Authors:
Gino A. Chacon,
Charles Williams,
Johann Knechtel,
Ozgur Sinanoglu,
Paul V. Gratz
Abstract:
As industry moves toward chiplet-based designs, the insertion of hardware Trojans poses a significant threat to the security of these systems. These systems rely heavily on cache coherence for coherent data communication, making coherence an attractive target. Critically, unlike prior work, which focuses only on malicious packet modifications, a Trojan attack that exploits coherence can modify data in memory that was never touched and is not owned by the chiplet which contains the Trojan. Further, the Trojan need not even be physically between the victim and the memory controller to attack the victim's memory transactions. Here, we explore the fundamental attack vectors possible in chiplet-based systems and provide an example Trojan implementation capable of directly modifying victim data in memory. This work aims to highlight the need for developing mechanisms that can protect and secure the coherence scheme from these forms of attacks.
Submitted 30 September, 2022;
originally announced October 2022.
-
Coherence Attacks and Countermeasures in Interposer-Based Systems
Authors:
Gino Chacon,
Tapojyoti Mandal,
Johann Knechtel,
Ozgur Sinanoglu,
Paul Gratz,
Vassos Soteriou
Abstract:
Industry is moving towards large-scale systems where processor cores, memories, accelerators, etc. are bundled via 2.5D integration. These various components are fabricated separately as chiplets and then integrated using an interconnect carrier, a so-called interposer. This new design style provides benefits in terms of yield as well as economies of scale, as chiplets may come from various third-party vendors and be integrated into one sophisticated system. The benefits of this approach, however, come at the cost of new challenges for the system's security and integrity when many third-party component chiplets, some from not fully trusted vendors, are integrated.
Here, we explore these challenges, but also promises, for modern interposer-based systems of cache-coherent, multi-core chiplets. First, we introduce a new, coherence-based attack, GETXspy, wherein a single compromised chiplet can expose a high-bandwidth side/covert-channel in an ostensibly secure system. We further show that prior art is insufficient to stop this new attack. Second, we propose using an active interposer as a generic, secure-by-construction platform that forms a physical root of trust for modern 2.5D systems. Our scheme has limited overhead, restricted to the active interposer, allowing the chiplets and the coherence system to remain untouched. We show that our scheme prevents a wide range of attacks, including but not limited to our GETXspy attack, with little overhead on system performance, ~4%. This overhead reduces as workloads increase, ensuring scalability of the scheme.
Submitted 7 January, 2022; v1 submitted 6 May, 2021;
originally announced May 2021.
-
Analytic Variable Exponent Hardy Spaces
Authors:
Gerardo A. Chacón,
Gerardo R. Chacón
Abstract:
We introduce a variable exponent version of the Hardy space of analytic functions on the unit disk, show some properties of the space, and give an example of a variable exponent $p(\cdot)$ that satisfies the $\log$-Hölder condition such that $H^{p(\cdot)}\neq H^q$ for any constant exponent $1<q<\infty$. We also consider the variable exponent version of the Hardy space on the upper-half plane.
Submitted 30 October, 2018;
originally announced October 2018.
-
Resource allocation under uncertainty: an algebraic and qualitative treatment
Authors:
Franklin Camacho,
Gerardo Chacón,
Ramón Pino Peréz
Abstract:
We use an algebraic viewpoint, namely a matrix framework, to deal with the problem of resource allocation under uncertainty in the context of a qualitative approach. Our basic qualitative data are a plausibility relation over the resources, a hierarchical relation over the agents, and of course the preferences that the agents have over the resources. With these data we propose a qualitative binary relation $\unrhd$ between allocations such that $\mathcal{F}\unrhd \mathcal{G}$ has the following intended meaning: the allocation $\mathcal{F}$ produces at least as much social welfare as the allocation $\mathcal{G}$. We prove that there is a family of allocations which are maximal with respect to $\unrhd$. We also prove that there is a notion of simple deal such that optimal allocations can be reached by sequences of simple deals. Finally, we introduce some mechanisms for discriminating among optimal allocations.
Submitted 17 May, 2018;
originally announced May 2018.
-
Variable Exponent Fock Spaces
Authors:
Gerardo A. Chacon,
Gerardo R. Chacon
Abstract:
In this article we introduce variable exponent Fock spaces and study some of their basic properties, such as the boundedness of evaluation functionals, density of polynomials, boundedness of a Bergman-type projection, and duality.
Submitted 3 September, 2017;
originally announced September 2017.
-
Characterizations of Dirichlet-type Spaces
Authors:
Xiaosong Liu,
Gerardo R. Chacón,
Zengjian Lou
Abstract:
We give three characterizations of the Dirichlet-type spaces $D(μ)$. First we characterize $D(μ)$ in terms of a double integral and in terms of the mean oscillation in the Bergman metric; neither characterization involves derivatives. Next, we obtain another characterization of $D(μ)$ in terms of higher order derivatives. A decomposition theorem for $D(μ)$ is also established.
Submitted 22 April, 2013;
originally announced April 2013.
-
Toeplitz Operators on Weighted Bergman Spaces
Authors:
Gerardo R. Chacón
Abstract:
In this article we characterize the boundedness and compactness of a Toeplitz-type operator on weighted Bergman spaces satisfying the so-called Bekolle-Bonami condition in terms of the Berezin transform.
Submitted 14 August, 2012;
originally announced August 2012.
-
Carleson Measures and Reproducing Kernel Thesis in Dirichlet-type spaces
Authors:
Gerardo Chacòn,
Emmanuel Fricain,
Mahmood Shabankhah
Abstract:
In this paper, using a generalization of a Richter and Sundberg representation theorem, we give a new characterization of Carleson measures for the Dirichlet-type space $\mathcal D(μ)$ when $μ$ is a finite sum of point masses. A reproducing kernel thesis result is also established in this case.
Submitted 17 February, 2012; v1 submitted 9 September, 2010;
originally announced September 2010.
-
Composition Operators on the Dirichlet Space and Related Problems
Authors:
Gerardo A. Chacon,
Gerardo R. Chacon,
Jose Gimenez
Abstract:
In this paper we investigate the following problem: when is a bounded analytic function $φ$ on the unit disk $\mathbb{D}$, fixing 0, such that $\{φ^n : n = 0, 1, 2, \dots\}$ is orthogonal in $\mathbb{D}$? We also consider the problem of characterizing the univalent, full self-maps of $\mathbb{D}$ in terms of the norm of the induced composition operator. The first problem is analogous to a celebrated question asked by W. Rudin in the Hardy space setting that was answered recently ([3] and [15]). The second problem is analogous to a problem investigated by J. Shapiro in [14] about the characterization of inner functions in the setting of $H^2$.
Submitted 8 April, 2005;
originally announced April 2005.