Showing 1–6 of 6 results for author: Wasti, B

  1. arXiv:2407.21783  [pdf, other]

    cs.AI cs.CL cs.CV

    The Llama 3 Herd of Models

    Authors: Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, et al. (510 additional authors not shown)

    Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical…

    Submitted 15 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

  2. LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

    Authors: Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed A Aly, Beidi Chen, Carole-Jean Wu

    Abstract: We present LayerSkip, an end-to-end solution to speed-up inference of large language models (LLMs). First, during training we apply layer dropout, with low dropout rates for earlier layers and higher dropout rates for later layers, and an early exit loss where all transformer layers share the same exit. Second, during inference, we show that this training recipe increases the accuracy of early exi…

    Submitted 18 October, 2024; v1 submitted 25 April, 2024; originally announced April 2024.

    Comments: ACL 2024

  3. arXiv:2309.01825  [pdf, other]

    cs.LG cs.PL

    LoopTune: Optimizing Tensor Computations with Reinforcement Learning

    Authors: Dejan Grubisic, Bram Wasti, Chris Cummins, John Mellor-Crummey, Aleksandar Zlateski

    Abstract: Advanced compiler technology is crucial for enabling machine learning applications to run on novel hardware, but traditional compilers fail to deliver performance, popular auto-tuners have long search times and expert-optimized libraries introduce unsustainable costs. To address this, we developed LoopTune, a deep reinforcement learning compiler that optimizes tensor computations in deep learning…

    Submitted 8 November, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

  4. arXiv:2205.00618  [pdf, other]

    cs.LG cs.PF cs.SC

    LoopStack: a Lightweight Tensor Algebra Compiler Stack

    Authors: Bram Wasti, José Pablo Cambronero, Benoit Steiner, Hugh Leather, Aleksandar Zlateski

    Abstract: We present LoopStack, a domain specific compiler stack for tensor operations, composed of a frontend, LoopTool, and an efficient optimizing code generator, LoopNest. This stack enables us to compile entire neural networks and generate code targeting the AVX2, AVX512, NEON, and NEONfp16 instruction sets while incorporating optimizations often missing from other machine learning compiler backends. W…

    Submitted 1 May, 2022; originally announced May 2022.

  5. arXiv:2109.08267  [pdf, other]

    cs.PL cs.AI cs.LG cs.PF

    CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research

    Authors: Chris Cummins, Bram Wasti, Jiadong Guo, Brandon Cui, Jason Ansel, Sahir Gomez, Somya Jain, Jia Liu, Olivier Teytaud, Benoit Steiner, Yuandong Tian, Hugh Leather

    Abstract: Interest in applying Artificial Intelligence (AI) techniques to compiler optimizations is increasing rapidly, but compiler research has a high entry barrier. Unlike in other domains, compiler and AI researchers do not have access to the datasets and frameworks that enable fast iteration and development of ideas, and getting started requires a significant engineering investment. What is needed is a…

    Submitted 22 December, 2021; v1 submitted 16 September, 2021; originally announced September 2021.

    Comments: 12 pages. Source code available at https://github.com/facebookresearch/CompilerGym

  6. arXiv:1805.07479  [pdf, other]

    cs.SI cs.IR cs.LG cs.MM

    Semisupervised Learning on Heterogeneous Graphs and its Applications to Facebook News Feed

    Authors: Cheng Ju, James Li, Bram Wasti, Shengbo Guo

    Abstract: Graph-based semi-supervised learning is a fundamental machine learning problem, and has been well studied. Most studies focus on homogeneous networks (e.g. citation network, friend network). In the present paper, we propose the Heterogeneous Embedding Label Propagation (HELP) algorithm, a graph-based semi-supervised deep learning algorithm, for graphs that are characterized by heterogeneous node t…

    Submitted 4 July, 2018; v1 submitted 18 May, 2018; originally announced May 2018.