-
Object Graph Programming
Authors:
Aditya Thimmaiah,
Leonidas Lampropoulos,
Christopher J. Rossbach,
Milos Gligoric
Abstract:
We introduce Object Graph Programming (OGO), which enables reading and modifying an object graph (i.e., the entire state of the object heap) via declarative queries. OGO models the objects and their relations in the heap as an object graph, thereby treating the heap as a graph database: each node in the graph is an object (e.g., an instance of a class or an instance of a metadata class) and each edge is a relation between objects (e.g., a field of one object references another object). We leverage Cypher, the most popular query language for graph databases, as OGO's query language. Unlike LINQ, which uses collections (e.g., List) as a source of data, OGO views the entire object graph as a single "collection". OGO is ideal for querying collections (just like LINQ), introspecting the runtime system state (e.g., finding all instances of a given class or accessing fields via reflection), and writing assertions that have access to the entire program state. We prototyped OGO for Java in two ways: (a) by translating an object graph into a Neo4j database on which we run Cypher queries, and (b) by implementing our own in-memory graph query engine that directly queries the object heap. We used OGO to rewrite hundreds of statements in large open-source projects into OGO queries. We report our experience and the performance of our prototypes.
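The node-and-edge model described in this abstract can be illustrated with a minimal sketch. It is written in Python rather than Java, it does not use Cypher or Neo4j, and it is not OGO's API: the helpers `build_object_graph`, `match`, and `neighbors`, as well as the `Customer`, `Order`, and `Shop` classes, are hypothetical names introduced only for illustration. The sketch walks references from a root object, records each field reference as an edge, and then answers "find all instances of a given class" as a query over the resulting graph.

```python
from collections import deque


def is_node(value):
    """Treat ordinary class instances as graph nodes; skip classes and primitives."""
    return hasattr(value, "__dict__") and not isinstance(value, type)


def build_object_graph(roots):
    """Walk references reachable from roots (hypothetical helper, not OGO's API).

    Returns (nodes, edges): nodes maps id(obj) -> obj, and edges is a list of
    (source id, field name, target id) triples, one per field reference.
    """
    nodes, edges = {}, []
    queue = deque(roots)
    while queue:
        obj = queue.popleft()
        if id(obj) in nodes:
            continue  # already visited; also handles cycles
        nodes[id(obj)] = obj
        for field, value in vars(obj).items():
            # A list or tuple field contributes one edge per element.
            targets = value if isinstance(value, (list, tuple)) else [value]
            for target in targets:
                if is_node(target):
                    edges.append((id(obj), field, id(target)))
                    queue.append(target)
    return nodes, edges


def match(nodes, cls, **field_equals):
    """Return every node that is an instance of cls and has the given field values."""
    return [
        obj for obj in nodes.values()
        if isinstance(obj, cls)
        and all(getattr(obj, name, None) == want for name, want in field_equals.items())
    ]


def neighbors(nodes, edges, source, field):
    """Follow the edges labeled field that leave source."""
    return [nodes[dst] for src, name, dst in edges if src == id(source) and name == field]


# Illustrative classes and data (assumptions, not taken from the paper).
class Customer:
    def __init__(self, name):
        self.name = name


class Order:
    def __init__(self, customer, total):
        self.customer = customer
        self.total = total


class Shop:
    def __init__(self, orders):
        self.orders = orders


alice, bob = Customer("alice"), Customer("bob")
shop = Shop([Order(alice, 30), Order(bob, 5), Order(alice, 12)])
nodes, edges = build_object_graph([shop])

all_customers = match(nodes, Customer)               # every Customer reachable from shop
alice_orders = match(nodes, Order, customer=alice)   # orders whose customer field is alice
shop_orders = neighbors(nodes, edges, shop, "orders")
```

Unlike a query over a single list, `match` here sees every object reachable from the root, which is the sense in which the abstract calls the whole object graph a single "collection".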
Submitted 4 February, 2024;
originally announced February 2024.
-
On a Foundation Model for Operating Systems
Authors:
Divyanshu Saxena,
Nihal Sharma,
Donghyun Kim,
Rohit Dwivedula,
Jiayi Chen,
Chenxi Yang,
Sriram Ravula,
Zichao Hu,
Aditya Akella,
Sebastian Angel,
Joydeep Biswas,
Swarat Chaudhuri,
Isil Dillig,
Alex Dimakis,
P. Brighten Godfrey,
Daehyeok Kim,
Chris Rossbach,
Gang Wang
Abstract:
This paper lays down the research agenda for a domain-specific foundation model for operating systems (OSes). Our case for a foundation model revolves around the observations that several OS components such as CPU, memory, and network subsystems are interrelated and that OS traces offer the ideal dataset for a foundation model to grasp the intricacies of diverse OS components and their behavior in varying environments and workloads. We discuss a wide range of possibilities that then arise, from employing foundation models as policy agents to utilizing them as generators and predictors to assist traditional OS control algorithms. Our hope is that this paper spurs further research into OS foundation models and into creating the next generation of operating systems for the evolving computing landscape.
Submitted 12 December, 2023;
originally announced December 2023.
-
Compiler-Driven FPGA Virtualization with SYNERGY
Authors:
Joshua Landgraf,
Tiffany Yang,
Will Lin,
Christopher J. Rossbach,
Eric Schkufza
Abstract:
FPGAs are increasingly common in modern applications, and cloud providers now support on-demand FPGA acceleration in data centers. Applications in data centers run on virtual infrastructure, where consolidation, multi-tenancy, and workload migration enable economies of scale that are fundamental to the provider's business. However, a general strategy for virtualizing FPGAs has yet to emerge. While manufacturers struggle with hardware-based approaches, we propose a compiler/runtime-based solution called Synergy. We show a compiler transformation for Verilog programs that produces code able to yield control to software at sub-clock-tick granularity according to the semantics of the original program. Synergy uses this property to efficiently support core virtualization primitives: suspend and resume, program migration, and spatial/temporal multiplexing, on hardware which is available today. We use Synergy to virtualize FPGA workloads across a cluster of Altera SoCs and Xilinx FPGAs on Amazon F1. The workloads require no modification, run within 3-4x of unvirtualized performance, and incur a modest increase in FPGA fabric utilization.
Submitted 27 August, 2021;
originally announced September 2021.
-
ALTIS: Modernizing GPGPU Benchmarking
Authors:
Bodun Hu,
Christopher J. Rossbach
Abstract:
This paper presents Altis, a benchmark suite for modern GPGPU computing. Previous benchmark suites such as Rodinia and SHOC have served the research community well, but were developed years ago when hardware was more limited, software supported fewer features, and production hardware-accelerated workloads were scarce. Since that time, GPU compute density and memory capacity have grown exponentially, programmability features such as unified memory, demand paging, and HyperQ have matured, and new workloads such as deep neural networks (DNNs), graph analytics, and crypto-currencies have emerged in production environments, stressing the hardware and software in ways that previous benchmarks did not anticipate. Drawing inspiration from Rodinia and SHOC, Altis is a benchmark suite designed for modern GPU architectures and modern GPU runtimes, representing a diverse set of application domains. By adopting and extending applications from Rodinia and SHOC, adding new applications, and focusing on CUDA platforms, Altis better represents modern GPGPU workloads to support GPGPU research in both architecture and system software.
Submitted 27 August, 2020; v1 submitted 25 June, 2019;
originally announced June 2019.
-
Mosaic: An Application-Transparent Hardware-Software Cooperative Memory Manager for GPUs
Authors:
Rachata Ausavarungnirun,
Joshua Landgraf,
Vance Miller,
Saugata Ghose,
Jayneel Gandhi,
Christopher J. Rossbach,
Onur Mutlu
Abstract:
Modern GPUs face a trade-off on how the page size used for memory management affects address translation and demand paging. Support for multiple page sizes can help relax the page size trade-off so that address translation and demand paging optimizations work together synergistically. However, existing page coalescing and splintering policies require costly base page migrations that undermine the benefits multiple page sizes provide. In this paper, we observe that GPGPU applications present an opportunity to support multiple page sizes without costly data migration, as the applications perform most of their memory allocation en masse (i.e., they allocate a large number of base pages at once). We show that this en masse allocation allows us to create intelligent memory allocation policies which ensure that base pages that are contiguous in virtual memory are allocated to contiguous physical memory pages. As a result, coalescing and splintering operations no longer need to migrate base pages.
We introduce Mosaic, a GPU memory manager that provides application-transparent support for multiple page sizes. Mosaic uses base pages to transfer data over the system I/O bus, and allocates physical memory in a way that (1) preserves base page contiguity and (2) ensures that a large page frame contains pages from only a single memory protection domain. This mechanism allows the TLB to use large pages, reducing address translation overhead. During data transfer, this mechanism enables the GPU to transfer only the base pages that are needed by the application over the system I/O bus, keeping demand paging overhead low.
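The allocation idea in this abstract can be sketched in a few lines. The following is an illustrative Python model, not Mosaic's implementation: the class `ContiguityPreservingAllocator`, its methods, and the page sizes (4 KiB base pages, 2 MiB large pages, so 512 base pages per large frame) are assumptions made for the example. It maps an en masse allocation so that virtually contiguous base pages land in physically contiguous slots of a large page frame owned by a single application, which lets a coalescing check succeed without moving any page.

```python
BASE_PAGES_PER_LARGE = 512  # assumed: 2 MiB large page / 4 KiB base page


class ContiguityPreservingAllocator:
    """Hypothetical model of contiguity-preserving allocation (not Mosaic's code)."""

    def __init__(self, num_large_frames):
        self.free_frames = list(range(num_large_frames))
        self.owner = {}       # large frame number -> application id
        self.page_table = {}  # (app, virtual base page) -> physical base page

    def allocate(self, app, first_vpn, num_pages):
        """Map num_pages base pages starting at first_vpn (large-page aligned here)."""
        if first_vpn % BASE_PAGES_PER_LARGE:
            raise ValueError("this sketch only handles large-page-aligned requests")
        frames_needed = -(-num_pages // BASE_PAGES_PER_LARGE)  # ceiling division
        if frames_needed > len(self.free_frames):
            raise MemoryError("not enough free large page frames")
        for i in range(frames_needed):
            frame = self.free_frames.pop(0)
            self.owner[frame] = app  # one protection domain per large frame
            start = i * BASE_PAGES_PER_LARGE
            count = min(BASE_PAGES_PER_LARGE, num_pages - start)
            for offset in range(count):
                # The virtual offset inside the large page equals the physical offset.
                self.page_table[(app, first_vpn + start + offset)] = (
                    frame * BASE_PAGES_PER_LARGE + offset
                )

    def can_coalesce(self, app, large_vpn):
        """True if the virtual large page is fully mapped, aligned, and contiguous."""
        base = large_vpn * BASE_PAGES_PER_LARGE
        first = self.page_table.get((app, base))
        if first is None or first % BASE_PAGES_PER_LARGE:
            return False
        return all(
            self.page_table.get((app, base + offset)) == first + offset
            for offset in range(BASE_PAGES_PER_LARGE)
        )


alloc = ContiguityPreservingAllocator(num_large_frames=4)
alloc.allocate("A", first_vpn=0, num_pages=1024)  # fills two large frames
alloc.allocate("B", first_vpn=0, num_pages=100)   # partially fills a third
```

In this model, application "A" can promote both of its virtual large pages in place, while "B" cannot yet, because only 100 of its 512 base pages are mapped.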
Submitted 30 April, 2018;
originally announced April 2018.
-
Improving Multi-Application Concurrency Support Within the GPU Memory System
Authors:
Rachata Ausavarungnirun,
Christopher J. Rossbach,
Vance Miller,
Joshua Landgraf,
Saugata Ghose,
Jayneel Gandhi,
Adwait Jog,
Onur Mutlu
Abstract:
GPUs exploit a high degree of thread-level parallelism to hide long-latency stalls. Due to the heterogeneous compute requirements of different applications, there is a growing need to share the GPU across multiple applications in large-scale computing environments. However, while CPUs offer relatively seamless multi-application concurrency, and are an excellent fit for multitasking and for virtualized environments, GPUs currently offer only primitive support for multi-application concurrency. Much of the problem in a contemporary GPU lies within the memory system, where multi-application execution requires virtual memory support to manage the address spaces of each application and to provide memory protection. In this work, we perform a detailed analysis of the major problems in state-of-the-art GPU virtual memory management that hinder multi-application execution. Existing GPUs are designed to share memory between the CPU and GPU, but do not handle multi-application support within the GPU well. We find that when multiple applications spatially share the GPU, there is a significant amount of inter-core thrashing on the shared TLB within the GPU. The TLB contention is high enough to prevent the GPU from successfully hiding stall latencies, thus becoming a first-order performance concern. We introduce MASK, a memory hierarchy design that provides low-overhead virtual memory support for the concurrent execution of multiple applications. MASK extends the GPU memory hierarchy to efficiently support address translation through the use of multi-level TLBs, and uses translation-aware memory and cache management to maximize throughput in the presence of inter-application contention.
Submitted 16 August, 2017;
originally announced August 2017.
-
The Future of Computing Research: Industry-Academic Collaborations
Authors:
Nady Boules,
Khari Douglas,
Stuart Feldman,
Limor Fix,
Gregory Hager,
Brent Hailpern,
Martial Hebert,
Dan Lopresti,
Beth Mynatt,
Chris Rossbach,
Helen Wright
Abstract:
IT-driven innovation is an enormous factor in the worldwide economic leadership of the United States. It is larger than finance, construction, or transportation, and it employs nearly 6% of the US workforce. The top three companies, as measured by market capitalization, are IT companies - Apple, Google (now Alphabet), and Microsoft. Facebook, a relatively recent entry in the top 10 list by market capitalization, has surpassed Walmart, the nation's largest retailer, and the largest employer in the world. The net income of just the top three exceeds $80 billion - roughly 100 times the total budget of the NSF CISE directorate, which funds 87% of computing research. In short, the direct return on federal research investments in IT research has been enormously profitable to the nation.
The IT industry ecosystem is also evolving. The time from conception to market of successful products has been cut from years to months. Product life cycles are increasingly a year or less. This change has pressured companies to focus industrial R&D on a pipeline or portfolio of technologies that bring immediate, or almost immediate, value to the companies. To defeat the competition and stay ahead of the pack, a company must devote resources to realizing gains that are shorter term, and must remain agile to respond quickly to market changes driven by new technologies, new startups, evolving user experience expectations, and the continuous consumer demand for new and exciting products.
Amidst this landscape, the Computing Community Consortium convened a round-table of industry and academic participants to better understand the landscape of industry-academic interaction, and to discuss possible actions that might be taken to enhance those interactions. We close with some recommendations for actions that could expand the lively conversation we experienced at the round-table to a national scale.
Submitted 29 June, 2016;
originally announced June 2016.