Showing 1–27 of 27 results for author: John, L

  1. arXiv:2410.11112  [pdf, other]

    cs.LG cs.AI

    Differentiable Weightless Neural Networks

    Authors: Alan T. L. Bacellar, Zachary Susskind, Mauricio Breternitz Jr., Eugene John, Lizy K. John, Priscila M. V. Lima, Felipe M. G. França

    Abstract: We introduce the Differentiable Weightless Neural Network (DWN), a model based on interconnected lookup tables. Training of DWNs is enabled by a novel Extended Finite Difference technique for approximate differentiation of binary values. We propose Learnable Mapping, Learnable Reduction, and Spectral Regularization to further improve the accuracy and efficiency of these models. We evaluate DWNs in…

    Submitted 14 October, 2024; originally announced October 2024.

    Journal ref: International Conference on Machine Learning (ICML) 2024

  2. arXiv:2410.10505  [pdf]

    cs.LG

    Comparison of deep learning and conventional methods for disease onset prediction

    Authors: Luis H. John, Chungsoo Kim, Jan A. Kors, Junhyuk Chang, Hannah Morgan-Cooper, Priya Desai, Chao Pang, Peter R. Rijnbeek, Jenna M. Reps, Egill A. Fridgeirsson

    Abstract: Background: Conventional prediction methods such as logistic regression and gradient boosting have been widely utilized for disease onset prediction for their reliability and interpretability. Deep learning methods promise enhanced prediction performance by extracting complex patterns from clinical data, but face challenges like data sparsity and high dimensionality. Methods: This study compares…

    Submitted 14 October, 2024; originally announced October 2024.

  3. arXiv:2405.00820  [pdf, other]

    cs.AR cs.LG

    HLSFactory: A Framework Empowering High-Level Synthesis Datasets for Machine Learning and Beyond

    Authors: Stefan Abi-Karam, Rishov Sarkar, Allison Seigler, Sean Lowe, Zhigang Wei, Hanqiu Chen, Nanditha Rao, Lizy John, Aman Arora, Cong Hao

    Abstract: Machine learning (ML) techniques have been applied to high-level synthesis (HLS) flows for quality-of-result (QoR) prediction and design space exploration (DSE). Nevertheless, the scarcity of accessible high-quality HLS datasets and the complexity of building such datasets present challenges. Existing datasets have limitations in terms of benchmark coverage, design space enumeration, vendor extens…

    Submitted 17 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

    Comments: Edit to "Section V.E" for proper attribution of open-source HLSyn, AutoDSE, and the Merlin compiler

  4. arXiv:2311.11384  [pdf, other]

    cs.AR

    PIMSAB: A Processing-In-Memory System with Spatially-Aware Communication and Bit-Serial-Aware Computation

    Authors: Aman Arora, Jian Weng, Siyuan Ma, Tony Nowatzki, Lizy K. John

    Abstract: Bit-serial Processing-In-Memory (PIM) is an attractive paradigm for accelerator architectures, for parallel workloads such as Deep Learning (DL), because of its capability to achieve massive data parallelism at a low area overhead and provide orders-of-magnitude data movement savings by moving computational resources closer to the data. While many PIM architectures have been proposed, improvements…

    Submitted 19 November, 2023; originally announced November 2023.

    Comments: Aman Arora and Jian Weng are co-first authors with equal contribution

  5. arXiv:2304.10618  [pdf, other]

    cs.AR eess.SP

    ULEEN: A Novel Architecture for Ultra Low-Energy Edge Neural Networks

    Authors: Zachary Susskind, Aman Arora, Igor D. S. Miranda, Alan T. L. Bacellar, Luis A. Q. Villon, Rafael F. Katopodis, Leandro S. de Araujo, Diego L. C. Dutra, Priscila M. V. Lima, Felipe M. G. Franca, Mauricio Breternitz Jr., Lizy K. John

    Abstract: The deployment of AI models on low-power, real-time edge devices requires accelerators for which energy, latency, and area are all first-order concerns. There are many approaches to enabling deep neural networks (DNNs) in this domain, including pruning, quantization, compression, and binary neural networks (BNNs), but with the emergence of the "extreme edge", there is now a demand for even more ef…

    Submitted 20 April, 2023; originally announced April 2023.

    Comments: 14 pages, 14 figures. Portions of this article draw heavily from arXiv:2203.01479, most notably sections 5E and 5F.2

  6. arXiv:2302.10977  [pdf, other]

    cs.AR cs.LG

    HLSDataset: Open-Source Dataset for ML-Assisted FPGA Design using High Level Synthesis

    Authors: Zhigang Wei, Aman Arora, Ruihao Li, Lizy K. John

    Abstract: Machine Learning (ML) has been widely adopted in design exploration using high level synthesis (HLS) to give a better and faster performance, and resource and power estimation at very early stages for FPGA-based design. To perform prediction accurately, high-quality and large-volume datasets are required for training ML models. This paper presents a dataset for ML-assisted FPGA design using HLS, ca…

    Submitted 21 August, 2023; v1 submitted 17 February, 2023; originally announced February 2023.

    Comments: 8 pages, 5 figures

  7. arXiv:2203.12521  [pdf, other]

    cs.AR

    CoMeFa: Compute-in-Memory Blocks for FPGAs

    Authors: Aman Arora, Tanmay Anand, Aatman Borda, Rishabh Sehgal, Bagus Hanindhito, Jaydeep Kulkarni, Lizy K. John

    Abstract: Block RAMs (BRAMs) are the storage houses of FPGAs, providing extensive on-chip memory bandwidth to the compute units implemented using Logic Blocks (LBs) and Digital Signal Processing (DSP) slices. We propose modifying BRAMs to convert them to CoMeFa (Compute-In-Memory Blocks for FPGAs) RAMs. These RAMs provide highly-parallel compute-in-memory by combining computation and storage capabilities in…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: 10 pages, 12 figures, 4 tables, FCCM conference

  8. arXiv:2203.01479  [pdf, other]

    cs.AR cs.LG

    Weightless Neural Networks for Efficient Edge Inference

    Authors: Zachary Susskind, Aman Arora, Igor Dantas Dos Santos Miranda, Luis Armando Quintanilla Villon, Rafael Fontella Katopodis, Leandro Santiago de Araujo, Diego Leonel Cadette Dutra, Priscila Machado Vieira Lima, Felipe Maia Galvao Franca, Mauricio Breternitz Jr., Lizy K. John

    Abstract: Weightless Neural Networks (WNNs) are a class of machine learning model which use table lookups to perform inference. This is in contrast with Deep Neural Networks (DNNs), which use multiply-accumulate operations. State-of-the-art WNN architectures have a fraction of the implementation cost of DNNs, but still lag behind them on accuracy for common image recognition tasks. Additionally, many existi…

    Submitted 2 March, 2022; originally announced March 2022.

  9. arXiv:2202.00492  [pdf]

    physics.bio-ph

    Specific interactions of peripheral membrane proteins with lipids: what can molecular simulations show us?

    Authors: Andreas Haahr Larsen, Laura H. John, Mark S. P. Sansom, Robin A. Corey

    Abstract: Peripheral membrane proteins can reversibly and specifically bind to biological membranes to carry out functions such as cell signalling, enzymatic activity, or membrane remodelling. Structures of these proteins and of their lipid-binding domains are typically solved in a soluble form, sometimes with a lipid or lipid headgroup at the binding site. To provide a detailed molecular view of peripheral…

    Submitted 1 February, 2022; originally announced February 2022.

    Comments: Review

    Journal ref: Biosci Rep (2022) 42 (4): BSR20211406

  10. arXiv:2109.06133  [pdf, other]

    cs.AI cs.LG cs.NE cs.PF

    Neuro-Symbolic AI: An Emerging Class of AI Workloads and their Characterization

    Authors: Zachary Susskind, Bryce Arden, Lizy K. John, Patrick Stockton, Eugene B. John

    Abstract: Neuro-symbolic artificial intelligence is a novel area of AI research which seeks to combine traditional rules-based AI approaches with modern deep learning techniques. Neuro-symbolic models have already demonstrated the capability to outperform state-of-the-art deep learning models in domains such as image and video reasoning. They have also been shown to obtain high accuracy with significantly l…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: 11 pages, 7 figures

    ACM Class: C.4; I.2.m

  11. arXiv:2107.09178  [pdf, other]

    cs.AR eess.SP

    Compute RAMs: Adaptable Compute and Storage Blocks for DL-Optimized FPGAs

    Authors: Aman Arora, Bagus Hanindhito, Lizy K. John

    Abstract: The configurable building blocks of current FPGAs -- Logic blocks (LBs), Digital Signal Processing (DSP) slices, and Block RAMs (BRAMs) -- make them efficient hardware accelerators for the rapid-changing world of Deep Learning (DL). Communication between these blocks happens through an interconnect fabric consisting of switching elements spread throughout the FPGA. In this paper, a new block, Comp…

    Submitted 30 September, 2021; v1 submitted 19 July, 2021; originally announced July 2021.

    Comments: 8 pages, IEEE Signal Processing Society's ASILOMAR Conference on Signals, Systems and Computers

  12. arXiv:2106.07087  [pdf, other]

    cs.AR

    Koios: A Deep Learning Benchmark Suite for FPGA Architecture and CAD Research

    Authors: Aman Arora, Andrew Boutros, Daniel Rauch, Aishwarya Rajen, Aatman Borda, Seyed Alireza Damghani, Samidh Mehta, Sangram Kate, Pragnesh Patel, Kenneth B. Kent, Vaughn Betz, Lizy K. John

    Abstract: With the prevalence of deep learning (DL) in many applications, researchers are investigating different ways of optimizing FPGA architecture and CAD to achieve better quality-of-results (QoR) on DL-based workloads. In this optimization process, benchmark circuits are an essential component; the QoR achieved on a set of benchmarks is the main driver for architecture and CAD design choices. However,…

    Submitted 13 June, 2021; originally announced June 2021.

  13. arXiv:2012.05181  [pdf, ps, other]

    cs.AR

    Virtual-Link: A Scalable Multi-Producer, Multi-Consumer Message Queue Architecture for Cross-Core Communication

    Authors: Qinzhe Wu, Jonathan Beard, Ashen Ekanayake, Andreas Gerstlauer, Lizy K. John

    Abstract: Cross-core communication is increasingly a bottleneck as the number of processing elements per system-on-chip increases. Typical hardware solutions to cross-core communication are often inflexible; while software solutions are flexible, they have performance scaling limitations. A key problem, as we will show, is that of shared state in software-based message queue mechanisms. This paper proposes V…

    Submitted 19 January, 2021; v1 submitted 9 December, 2020; originally announced December 2020.

  14. arXiv:2008.07361  [pdf]

    stat.AP cs.LG stat.ME stat.ML

    Logistic regression models for patient-level prediction based on massive observational data: Do we need all data?

    Authors: Luis H. John, Jan A. Kors, Jenna M. Reps, Patrick B. Ryan, Peter R. Rijnbeek

    Abstract: Objective: Provide guidance on sample size considerations for developing predictive models by empirically establishing the adequate sample size, which balances the competing objectives of improving model performance and reducing model complexity as well as computational requirements. Materials and Methods: We empirically assess the effect of sample size on prediction performance and model comple…

    Submitted 24 July, 2024; v1 submitted 14 August, 2020; originally announced August 2020.

    Journal ref: International Journal of Medical Informatics, Volume 163, July 2022, Article number 104762

  15. arXiv:1908.09207  [pdf, ps, other]

    cs.LG stat.ML

    Demystifying the MLPerf Benchmark Suite

    Authors: Snehil Verma, Qinzhe Wu, Bagus Hanindhito, Gunjan Jha, Eugene B. John, Ramesh Radhakrishnan, Lizy K. John

    Abstract: MLPerf, an emerging machine learning benchmark suite, strives to cover a broad range of applications of machine learning. We present a study on its characteristics and how the MLPerf benchmarks differ from some of the previous deep learning benchmarks like DAWNBench and DeepBench. We find that application benchmarks such as MLPerf (although rich in kernels) exhibit different features compared to ke…

    Submitted 24 August, 2019; originally announced August 2019.

  16. arXiv:1805.12305  [pdf, other]

    cs.DC

    Start Late or Finish Early: A Distributed Graph Processing System with Redundancy Reduction

    Authors: Shuang Song, Xu Liu, Qinzhe Wu, Andreas Gerstlauer, Tao Li, Lizy K. John

    Abstract: Graph processing systems are important in the big data domain. However, processing graphs in parallel often introduces redundant computations in existing algorithms and models. Prior work has proposed techniques to optimize redundancies for the out-of-core graph systems, rather than the distributed graph systems. In this paper, we study various state-of-the-art distributed graph systems and observ…

    Submitted 30 May, 2018; originally announced May 2018.

    Comments: 11 pages, 10 figures

  17. arXiv:1612.01333  [pdf, ps, other]

    math.NA

    On the analysis of block smoothers for saddle point problems

    Authors: Lorenz John, Ulrich Rüde, Barbara Wohlmuth, Walter Zulehner

    Abstract: In this article, we discuss several classes of Uzawa smoothers for the application in multigrid methods in the context of saddle point problems. Besides commonly used variants, such as the inexact and block factorization version, we also introduce a new symmetric method, belonging to the class of Uzawa smoothers. For these variants we unify the analysis of the smoothing properties, which is an impo…

    Submitted 5 December, 2016; originally announced December 2016.

  18. Stable discontinuous Galerkin FEM without penalty parameters

    Authors: Lorenz John, Michael Neilan, Iain Smears

    Abstract: We propose a modified local discontinuous Galerkin (LDG) method for second-order elliptic problems that does not require extrinsic penalization to ensure stability. Stability is instead achieved by showing a discrete Poincaré–Friedrichs inequality for the discrete gradient that employs a lifting of the jumps with one polynomial degree higher than the scalar approximation space. Our analysis cove…

    Submitted 9 March, 2016; v1 submitted 15 February, 2016; originally announced February 2016.

    Comments: Accepted for publication in the conference proceedings of Numerical Mathematics and Advanced Applications ENUMATH 2015. Typo corrected

  19. arXiv:1511.05759  [pdf, other]

    math.NA

    Solution Techniques for the Stokes System: A priori and a posteriori modifications, resilient algorithms

    Authors: Markus Huber, Lorenz John, Petra Pustejovska, Ulrich Rüde, Christian Waluga, Barbara Wohlmuth

    Abstract: This article proposes modifications to standard low order finite element approximations of the Stokes system with the goal of improving both the approximation quality and the parallel algebraic solution process. Different from standard finite element techniques, we do not modify or enrich the approximation spaces but modify the operator itself to ensure fundamental physical properties such as mass…

    Submitted 18 November, 2015; originally announced November 2015.

    Comments: in Proceedings of the ICIAM, Beijing, China, 2015

    MSC Class: 65N30; 65N12; 65N55; 65Y05; 76D07

  20. arXiv:1511.02134  [pdf, other]

    cs.CE cs.MS math.NA

    A quantitative performance analysis for Stokes solvers at the extreme scale

    Authors: Björn Gmeiner, Markus Huber, Lorenz John, Ulrich Rüde, Barbara Wohlmuth

    Abstract: This article presents a systematic quantitative performance analysis for large finite element computations on extreme scale computing systems. Three parallel iterative solvers for the Stokes system, discretized by low order tetrahedral elements, are compared with respect to their numerical efficiency and their scalability running on up to $786\,432$ parallel threads. A genuine multigrid method for…

    Submitted 6 November, 2015; originally announced November 2015.

    MSC Class: 65N55; 65Y05; 68Q25

  21. arXiv:1504.02205  [pdf, other]

    cs.DC

    BigDataBench-MT: A Benchmark Tool for Generating Realistic Mixed Data Center Workloads

    Authors: Rui Han, Shulin Zhan, Chenrong Shao, Junwei Wang, Lizy K. John, Jiangtao Xu, Gang Lu, Lei Wang

    Abstract: Long-running service workloads (e.g. web search engine) and short-term data analysis workloads (e.g. Hadoop MapReduce jobs) co-locate in today's data centers. Developing realistic benchmarks to reflect such practical scenario of mixed workload is a key problem to produce trustworthy results when evaluating and comparing data center systems. This requires using actual workloads as well as guarantee…

    Submitted 4 December, 2015; v1 submitted 9 April, 2015; originally announced April 2015.

    Comments: 12 pages, 5 figures

  22. arXiv:1308.3281  [pdf, other]

    math.CO

    Hyperbanana Graphs

    Authors: Christopher Clement, Audrey Lee-St. John, Jessica Sidman

    Abstract: A bar-and-joint framework is a finite set of points together with specified distances between selected pairs. In rigidity theory we seek to understand when the remaining pairwise distances are also fixed. If there exists a pair of points which move relative to one another while maintaining the given distance constraints, the framework is flexible; otherwise, it is rigid. Counting conditions due…

    Submitted 14 August, 2013; originally announced August 2013.

    MSC Class: 05c50

    Journal ref: Proceedings of 25th Canadian Conference on Computational Geometry, pages 199-204, 2013

  23. arXiv:1306.1572  [pdf, other]

    cs.CG math.CO

    Algorithms for detecting dependencies and rigid subsystems for CAD

    Authors: James Farre, Helena Kleinschmidt, Jessica Sidman, Audrey Lee-St. John, Stephanie Stark, Louis Theran, Xilin Yu

    Abstract: Geometric constraint systems underlie popular Computer Aided Design software. Automated approaches for detecting dependencies in a design are critical for developing robust solvers and providing informative user feedback, and we provide algorithms for two types of dependencies. First, we give a pebble game algorithm for detecting generic dependencies. Then, we focus on identifying the "special po…

    Submitted 1 October, 2015; v1 submitted 6 June, 2013; originally announced June 2013.

    Comments: 37 pages, 14 figures (v2 is an expanded version of an AGD'14 abstract based on v1)

  24. arXiv:1210.0451  [pdf, other]

    cs.DM math.CO

    Combinatorics and the Rigidity of CAD Systems

    Authors: Audrey Lee-St. John, Jessica Sidman

    Abstract: We study the rigidity of body-and-cad frameworks which capture the majority of the geometric constraints used in 3D mechanical engineering CAD software. We present a combinatorial characterization of the generic minimal rigidity of a subset of body-and-cad frameworks in which we treat 20 of the 21 body-and-cad constraints, omitting only point-point coincidences. While the handful of classical comb…

    Submitted 17 October, 2012; v1 submitted 1 October, 2012; originally announced October 2012.

    Comments: 17 pages, 7 figures, version to appear in Symposium on Solid and Physical Modeling '12 and associated special issue of Computer Aided Design

    MSC Class: 68R10; 05C50

    ACM Class: G.2.1; J.6; I.3.5

  25. Single-trial EEG Discrimination between Wrist and Finger Movement Imagery and Execution in a Sensorimotor BCI

    Authors: A. K. Mohamed, T. Marwala, L. R. John

    Abstract: A brain-computer interface (BCI) may be used to control a prosthetic or orthotic hand using neural activity from the brain. The core of this sensorimotor BCI lies in the interpretation of the neural information extracted from electroencephalogram (EEG). It is desired to improve on the interpretation of EEG to allow people with neuromuscular disorders to perform daily activities. This paper investi…

    Submitted 26 August, 2011; originally announced August 2011.

    Comments: 33rd Annual International IEEE EMBS Conference 2011

  26. arXiv:1006.1126  [pdf, other]

    cs.CG

    Body-and-cad Geometric Constraint Systems

    Authors: Kirk Haller, Audrey Lee-St. John, Meera Sitharam, Ileana Streinu, Neil White

    Abstract: Motivated by constraint-based CAD software, we develop the foundation for the rigidity theory of a very general model: the body-and-cad structure, composed of rigid bodies in 3D constrained by pairwise coincidence, angular and distance constraints. We identify 21 relevant geometric constraints and develop the corresponding infinitesimal rigidity theory for these structures. The classical body-and-…

    Submitted 6 June, 2010; originally announced June 2010.

    Comments: 33 pages, to appear in Computational Geometry: Theory and Applications (an abbreviated version appeared in: 24th Annual ACM Symposium on Applied Computing, Technical Track on Geometric Constraints and Reasoning GCR'09, Honolulu, HI, 2009)

    MSC Class: 68R10; 05C85; 05C50

    ACM Class: I.3.5; J.6; G.2.1

  27. Archer: A Community Distributed Computing Infrastructure for Computer Architecture Research and Education

    Authors: Renato Figueiredo, P. Oscar Boykin, Jose A. B. Fortes, Tao Li, Jie-Kwon Peir, David Wolinsky, Lizy John, David Kaeli, David Lilja, Sally McKee, Gokhan Memik, Alain Roy, Gary Tyson

    Abstract: This paper introduces Archer, a community-based computing resource for computer architecture research and education. The Archer infrastructure integrates virtualization and batch scheduling middleware to deliver high-throughput computing resources aggregated from resources distributed across wide-area networks and owned by different participating entities in a seamless manner. The paper discusse…

    Submitted 10 July, 2008; originally announced July 2008.

    Comments: 11 pages, 2 figures. Describes the Archer project, http://archer-project.org

    ACM Class: C.0; I.6.3; C.2.4