Showing 1–8 of 8 results for author: Novo, D

  1. arXiv:2310.10168

    cs.AR

    DaPPA: A Data-Parallel Framework for Processing-in-Memory Architectures

    Authors: Geraldo F. Oliveira, Alain Kohli, David Novo, Juan Gómez-Luna, Onur Mutlu

    Abstract: To ease the programmability of PIM architectures, we propose DaPPA (data-parallel processing-in-memory architecture), a framework that can, for a given application, automatically distribute input and gather output data, handle memory management, and parallelize work across the DPUs. The key idea behind DaPPA is to remove the responsibility of managing hardware resources from the programmer by provi…

    Submitted 16 October, 2023; originally announced October 2023.

  2. Approximations in Deep Learning

    Authors: Etienne Dupuis, Silviu-Ioan Filip, Olivier Sentieys, David Novo, Ian O'Connor, Alberto Bosio

    Abstract: The design and implementation of Deep Learning (DL) models is currently receiving a lot of attention from both industry and academia. However, the computational workload associated with DL is often out of reach for low-power embedded devices and is still costly when run on datacenters. By relaxing the need for fully precise operations, Approximate Computing (AxC) substantially improves perform…

    Submitted 8 December, 2022; originally announced December 2022.

    Comments: Approximate Computing Techniques - From Component- to Application-Level, pp.467-512, 2022, 978-3-030-94704-0

  3. arXiv:2209.05566

    cs.AR cs.DC

    Flash-Cosmos: In-Flash Bulk Bitwise Operations Using Inherent Computation Capability of NAND Flash Memory

    Authors: Jisung Park, Roknoddin Azizi, Geraldo F. Oliveira, Mohammad Sadrosadati, Rakesh Nadig, David Novo, Juan Gómez-Luna, Myungsuk Kim, Onur Mutlu

    Abstract: Bulk bitwise operations, i.e., bitwise operations on large bit vectors, are prevalent in a wide range of important application domains, including databases, graph processing, genome analysis, cryptography, and hyper-dimensional computing. In conventional systems, the performance and energy efficiency of bulk bitwise operations are bottlenecked by data movement between the compute units and the mem…

    Submitted 12 September, 2022; originally announced September 2022.

    Comments: To appear in 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), 2022

  4. arXiv:2209.00188

    cs.AR cs.LG

    Hermes: Accelerating Long-Latency Load Requests via Perceptron-Based Off-Chip Load Prediction

    Authors: Rahul Bera, Konstantinos Kanellopoulos, Shankar Balachandran, David Novo, Ataberk Olgun, Mohammad Sadrosadati, Onur Mutlu

    Abstract: Long-latency load requests continue to limit the performance of high-performance processors. To increase the latency tolerance of a processor, architects have primarily relied on two key techniques: sophisticated data prefetchers and large on-chip caches. In this work, we show that: 1) even a sophisticated state-of-the-art prefetcher can only predict half of the off-chip load requests on average a…

    Submitted 30 September, 2022; v1 submitted 31 August, 2022; originally announced September 2022.

    Comments: To appear in 55th IEEE/ACM International Symposium on Microarchitecture (MICRO), 2022

    ACM Class: B.3.2; C.0

  5. arXiv:2205.07394

    cs.AR cs.AI cs.DC cs.LG

    Sibyl: Adaptive and Extensible Data Placement in Hybrid Storage Systems Using Online Reinforcement Learning

    Authors: Gagandeep Singh, Rakesh Nadig, Jisung Park, Rahul Bera, Nastaran Hajinazar, David Novo, Juan Gómez-Luna, Sander Stuijk, Henk Corporaal, Onur Mutlu

    Abstract: Hybrid storage systems (HSS) use multiple different storage devices to provide high and scalable storage capacity at high performance. Recent research proposes various techniques that aim to accurately identify performance-critical data to place it in a "best-fit" storage device. Unfortunately, most of these techniques are rigid, which (1) limits their adaptivity to perform well for a wide range o…

    Submitted 16 November, 2023; v1 submitted 15 May, 2022; originally announced May 2022.

  6. arXiv:2102.01345

    cs.LG cs.AI cs.CV cs.NE

    Fast Exploration of Weight Sharing Opportunities for CNN Compression

    Authors: Etienne Dupuis, David Novo, Ian O'Connor, Alberto Bosio

    Abstract: The computational workload involved in Convolutional Neural Networks (CNNs) is typically out of reach for low-power embedded devices. There are a large number of approximation techniques to address this problem. These methods have hyper-parameters that need to be optimized for each CNN using design space exploration (DSE). The goal of this work is to demonstrate that the DSE phase time can easily…

    Submitted 2 February, 2021; originally announced February 2021.

    Comments: Presented at DATE Friday Workshop on System-level Design Methods for Deep Learning on Heterogeneous Architectures (SLOHA 2021) (arXiv:2102.00818)

    Report number: SLOHA/2021/05

  7. arXiv:1902.02343

    cs.DC

    Exploration of Performance and Energy Trade-offs for Heterogeneous Multicore Architectures

    Authors: Anastasiia Butko, Florent Bruguier, David Novo, Abdoulaye Gamatié, Gilles Sassatelli

    Abstract: Energy efficiency has become a major challenge in modern computer systems. To address this challenge, candidate systems increasingly integrate heterogeneous cores in order to satisfy diverse computation requirements by selecting cores with suitable features. In particular, single-ISA heterogeneous multicore processors such as ARM big.LITTLE have become very attractive since they offer good opportu…

    Submitted 6 February, 2019; originally announced February 2019.

    Comments: 11 pages, 6 figures, 2 tables

  8. arXiv:1601.07420

    cs.DC

    A Workflow for Fast Evaluation of Mapping Heuristics Targeting Cloud Infrastructures

    Authors: Roman Ursu, Khalid Latif, David Novo, Manuel Selva, Abdoulaye Gamatie, Gilles Sassatelli, Dmitry Khabi, Alexey Cheptsov

    Abstract: Resource allocation is today an integral part of cloud infrastructure management to efficiently exploit resources. Cloud infrastructure centers generally use custom-built heuristics to define the resource allocations. It is an immediate requirement for the management tools of these centers to have a fast yet reasonably accurate simulation and evaluation platform to define the resource allocation…

    Submitted 27 January, 2016; originally announced January 2016.

    Comments: 2nd International Workshop on Dynamic Resource Allocation and Management in Embedded, High Performance and Cloud Computing DREAMCloud 2016 (arXiv:cs/1601.04675)

    Report number: DREAMCloud/2016/03