Showing 1–29 of 29 results for author: Ogras, U Y

  1. arXiv:2410.09188 [pdf, other]

    cs.AR

    MFIT: Multi-Fidelity Thermal Modeling for 2.5D and 3D Multi-Chiplet Architectures

    Authors: Lukas Pfromm, Alish Kanani, Harsh Sharma, Parth Solanki, Eric Tervo, Jaehyun Park, Janardhan Rao Doppa, Partha Pratim Pande, Umit Y. Ogras

    Abstract: Rapidly evolving artificial intelligence and machine learning applications require ever-increasing computational capabilities, while monolithic 2D design technologies approach their limits. Heterogeneous integration of smaller chiplets using a 2.5D silicon interposer and 3D packaging has emerged as a promising paradigm to address this limit and meet performance demands. These approaches offer a si…

    Submitted 11 October, 2024; originally announced October 2024.

    Comments: Preprint for MFIT: Multi-Fidelity Thermal Modeling for 2.5D and 3D Multi-Chiplet Architectures

  2. arXiv:2302.12779 [pdf, other]

    cs.AR

    Machine Learning-based Low Overhead Congestion Control Algorithm for Industrial NoCs

    Authors: Shruti Yadav Narayana, Sumit K. Mandal, Raid Ayoub, Michael Kishinevsky, Umit Y. Ogras

    Abstract: Network-on-Chip (NoC) congestion builds up during heavy traffic load and cripples the system performance by stalling the cores. Moreover, congestion leads to wasted link bandwidth due to blocked buffers and bouncing packets. Existing approaches throttle the cores after congestion is detected, reducing efficiency and wasting line bandwidth unnecessarily. In contrast, we propose a lightweight machin…

    Submitted 24 February, 2023; originally announced February 2023.

    Comments: The short version of the paper has been accepted in DATE'23

  3. arXiv:2208.07914 [pdf, other]

    cs.LG cs.AI

    PD-MORL: Preference-Driven Multi-Objective Reinforcement Learning Algorithm

    Authors: Toygun Basaklar, Suat Gumussoy, Umit Y. Ogras

    Abstract: Multi-objective reinforcement learning (MORL) approaches have emerged to tackle many real-world problems with multiple conflicting objectives by maximizing a joint objective function weighted by a preference vector. These approaches find fixed customized policies corresponding to preference vectors specified during training. However, the design constraints and objectives typically change dynamical…

    Submitted 29 May, 2023; v1 submitted 16 August, 2022; originally announced August 2022.

    Comments: 24 pages, 8 Figures, 9 Tables, Published as a conference paper at ICLR 2023, https://openreview.net/forum?id=zS9sRyaPFlJ

  4. COIN: Communication-Aware In-Memory Acceleration for Graph Convolutional Networks

    Authors: Sumit K. Mandal, Gokul Krishnan, A. Alper Goksoy, Gopikrishnan Ravindran Nair, Yu Cao, Umit Y. Ogras

    Abstract: Graph convolutional networks (GCNs) have shown remarkable learning capabilities when processing graph-structured data found inherently in many application areas. GCNs distribute the outputs of neural networks embedded in each vertex over multiple iterations to take advantage of the relations captured by the underlying graphs. Consequently, they incur a significant amount of computation and irregul…

    Submitted 15 May, 2022; originally announced May 2022.

    Comments: IEEE Journal on Emerging and Selected Topics in Circuits and Systems (2022)

  5. Fast and Scalable Human Pose Estimation using mmWave Point Cloud

    Authors: Sizhe An, Umit Y. Ogras

    Abstract: Millimeter-Wave (mmWave) radar can enable high-resolution human pose estimation with low cost and computational requirements. However, mmWave data point cloud, the primary input to processing algorithms, is highly sparse and carries significantly less information than other alternatives such as video frames. Furthermore, the scarce labeled mmWave data impedes the development of machine learning (M…

    Submitted 29 April, 2022; originally announced May 2022.

    Comments: Accepted for Design Automation Conference (DAC) 2022

  6. arXiv:2202.09297 [pdf, other]

    eess.SP cs.LG

    tinyMAN: Lightweight Energy Manager using Reinforcement Learning for Energy Harvesting Wearable IoT Devices

    Authors: Toygun Basaklar, Yigit Tuncel, Umit Y. Ogras

    Abstract: Advances in low-power electronics and machine learning techniques lead to many novel wearable IoT devices. These devices have limited battery capacity and computational power. Thus, energy harvesting from ambient sources is a promising solution to power these low-energy wearable devices. They need to manage the harvested energy optimally to achieve energy-neutral operation, which eliminates rechar…

    Submitted 18 March, 2022; v1 submitted 18 February, 2022; originally announced February 2022.

    Comments: 7 pages, 4 figures, accepted as "Full Paper" for the 2022 tinyML Research Symposium

  7. arXiv:2112.08980 [pdf, other]

    cs.DC

    Performant, Multi-objective Scheduling of Highly Interleaved Task Graphs on Heterogeneous System on Chip Devices

    Authors: Joshua Mack, Samet E. Arda, Umit Y. Ogras, Ali Akoglu

    Abstract: Performance-, power-, and energy-aware scheduling techniques play an essential role in optimally utilizing processing elements (PEs) of heterogeneous systems. List schedulers, a class of low-complexity static schedulers, have commonly been used in static execution scenarios. However, list schedulers are not suitable for runtime decision making, particularly when multiple concurrent applications ar…

    Submitted 16 December, 2021; originally announced December 2021.

    Comments: 15 pages, 2 pages of appendix, 14 figures including appendix. Accepted for publication in IEEE Transactions on Parallel and Distributed Systems

  8. DAS: Dynamic Adaptive Scheduling for Energy-Efficient Heterogeneous SoCs

    Authors: A. Alper Goksoy, Anish Krishnakumar, Md Sahil Hassan, Allen J. Farcas, Ali Akoglu, Radu Marculescu, Umit Y. Ogras

    Abstract: Domain-specific systems-on-chip (DSSoCs) aim at bridging the gap between application-specific integrated circuits (ASICs) and general-purpose processors. Traditional operating system (OS) schedulers can undermine the potential of DSSoCs since their execution times can be orders of magnitude larger than the execution time of the task itself. To address this problem, we propose a dynamic adaptive sc…

    Submitted 22 September, 2021; originally announced September 2021.

    Comments: 4 pages, 2 tables, 3 figures, 1 algorithm, Accepted for publication in IEEE Embedded Systems Letters

  9. arXiv:2108.09534 [pdf, other]

    cs.PF math.PR

    Theoretical Analysis and Evaluation of NoCs with Weighted Round-Robin Arbitration

    Authors: Sumit K. Mandal, Jie Tong, Raid Ayoub, Michael Kishinevsky, Ahmed Abousamra, Umit Y. Ogras

    Abstract: Fast and accurate performance analysis techniques are essential in early design space exploration and pre-silicon evaluations, including software eco-system development. In particular, on-chip communication continues to play an increasingly important role as the many-core processors scale up. This paper presents the first performance analysis technique that targets networks-on-chip (NoCs) that emp…

    Submitted 11 August, 2023; v1 submitted 21 August, 2021; originally announced August 2021.

    Comments: This paper is accepted in International Conference on Computer Aided Design (ICCAD), 2021

  10. arXiv:2108.08903 [pdf, other]

    cs.LG cs.AR

    SIAM: Chiplet-based Scalable In-Memory Acceleration with Mesh for Deep Neural Networks

    Authors: Gokul Krishnan, Sumit K. Mandal, Manvitha Pannala, Chaitali Chakrabarti, Jae-sun Seo, Umit Y. Ogras, Yu Cao

    Abstract: In-memory computing (IMC) on a monolithic chip for deep learning faces dramatic challenges on area, yield, and on-chip interconnection cost due to the ever-increasing model sizes. 2.5D integration or chiplet-based architectures interconnect multiple small chips (i.e., chiplets) to form a large computing system, presenting a feasible solution beyond a monolithic IMC architecture to accelerate large…

    Submitted 14 August, 2021; originally announced August 2021.

  11. arXiv:2108.00568 [pdf, other]

    cs.CV cs.LG

    FLASH: Fast Neural Architecture Search with Hardware Optimization

    Authors: Guihong Li, Sumit K. Mandal, Umit Y. Ogras, Radu Marculescu

    Abstract: Neural architecture search (NAS) is a promising technique to design efficient and high-performance deep neural networks (DNNs). As the performance requirements of ML applications grow continuously, the hardware accelerators start playing a central role in DNN design. This trend makes NAS even more complicated and time-consuming for most real applications. This paper proposes FLASH, a very fast NAS…

    Submitted 1 August, 2021; originally announced August 2021.

    Comments: Published at ACM CODES+ISSS 2021

  12. arXiv:2107.02358 [pdf, other]

    cs.AR cs.AI cs.LG

    Impact of On-Chip Interconnect on In-Memory Acceleration of Deep Neural Networks

    Authors: Gokul Krishnan, Sumit K. Mandal, Chaitali Chakrabarti, Jae-sun Seo, Umit Y. Ogras, Yu Cao

    Abstract: With the widespread use of Deep Neural Networks (DNNs), machine learning algorithms have evolved in two diverse directions -- one with ever-increasing connection density for better accuracy and the other with more compact sizing for energy efficiency. The increase in connection density increases on-chip data movement, which makes efficient on-chip communication a critical function of the DNN accel…

    Submitted 5 July, 2021; originally announced July 2021.

  13. arXiv:2103.06709 [pdf, other]

    cs.LG cs.AR

    Hypervector Design for Efficient Hyperdimensional Computing on Edge Devices

    Authors: Toygun Basaklar, Yigit Tuncel, Shruti Yadav Narayana, Suat Gumussoy, Umit Y. Ogras

    Abstract: Hyperdimensional computing (HDC) has emerged as a new light-weight learning algorithm with smaller computation and energy requirements compared to conventional techniques. In HDC, data points are represented by high-dimensional vectors (hypervectors), which are mapped to high-dimensional space (hyperspace). Typically, a large hypervector dimension ($\geq1000$) is required to achieve accuracies com…

    Submitted 8 March, 2021; originally announced March 2021.

    Comments: 9 pages, 6 figures, accepted to tinyML 2021 Research Symposium

  14. arXiv:2008.09728 [pdf, other]

    cs.DC cs.AI cs.LG eess.SY

    Online Adaptive Learning for Runtime Resource Management of Heterogeneous SoCs

    Authors: Sumit K. Mandal, Umit Y. Ogras, Janardhan Rao Doppa, Raid Z. Ayoub, Michael Kishinevsky, Partha P. Pande

    Abstract: Dynamic resource management has become one of the major areas of research in modern computer and communication system design due to lower power consumption and higher performance demands. The number of integrated cores, level of heterogeneity and amount of control knobs increase steadily. As a result, the system complexity is increasing faster than our ability to optimize and dynamically manage th…

    Submitted 21 August, 2020; originally announced August 2020.

    Comments: This paper appeared in the Proceedings of Design Automation Conference 2020

  15. arXiv:2008.03904 [pdf, other]

    cs.PF

    Performance Analysis of Priority-Aware NoCs with Deflection Routing under Traffic Congestion

    Authors: Sumit K. Mandal, Anish Krishnakumar, Raid Ayoub, Michael Kishinevsky, Umit Y. Ogras

    Abstract: Priority-aware networks-on-chip (NoCs) are used in industry to achieve predictable latency under different workload conditions. These NoCs incorporate deflection routing to minimize queuing resources within routers and achieve low latency during low traffic load. However, deflected packets can exacerbate congestion during high traffic load since they consume the NoC bandwidth. State-of-the-art ana…

    Submitted 8 November, 2020; v1 submitted 10 August, 2020; originally announced August 2020.

    Comments: This article is in the Proceedings of ICCAD 2020

  16. arXiv:2007.13951 [pdf, other]

    cs.PF

    Analytical Performance Modeling of NoCs under Priority Arbitration and Bursty Traffic

    Authors: Sumit K. Mandal, Raid Ayoub, Michael Kishinevsky, Mohammad M. Islam, Umit Y. Ogras

    Abstract: Networks-on-Chip (NoCs) used in commercial many-core processors typically incorporate priority arbitration. Moreover, they experience bursty traffic due to application workloads. However, most state-of-the-art NoC analytical performance analysis techniques assume fair arbitration and simple traffic models. To address these limitations, we propose an analytical modeling technique for priority-aware…

    Submitted 27 July, 2020; originally announced July 2020.

    Comments: This paper will appear in a future issue of IEEE Embedded Systems Letters

  17. Runtime Task Scheduling using Imitation Learning for Heterogeneous Many-Core Systems

    Authors: Anish Krishnakumar, Samet E. Arda, A. Alper Goksoy, Sumit K. Mandal, Umit Y. Ogras, Anderson L. Sartor, Radu Marculescu

    Abstract: Domain-specific systems-on-chip, a class of heterogeneous many-core systems, are recognized as a key approach to narrow down the performance and energy-efficiency gap between custom hardware accelerators and programmable processors. Reaching the full potential of these architectures depends critically on optimally scheduling the applications to available resources at runtime. Existing optimization…

    Submitted 6 August, 2020; v1 submitted 18 July, 2020; originally announced July 2020.

    Comments: 14 pages, 12 figures, 8 tables. Accepted for publication in Embedded Systems Week CODES+ISSS 2020 (Special Issue in IEEE TCAD)

  18. arXiv:2004.01636 [pdf, other]

    cs.DC

    User-Space Emulation Framework for Domain-Specific SoC Design

    Authors: Joshua Mack, Nirmal Kumbhare, Anish NK, Umit Y. Ogras, Ali Akoglu

    Abstract: In this work, we propose a portable, Linux-based emulation framework to provide an ecosystem for hardware-software co-design of Domain-specific SoCs (DSSoCs) and enable their rapid evaluation during the pre-silicon design phase. This framework holistically targets three key challenges of DSSoC design: accelerator integration, resource management, and application development. We address these chall…

    Submitted 11 April, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

    Comments: 11 pages, 11 figures. To be published in proceedings of 2020 Heterogeneity in Computing Workshop http://hcw.oucreate.com/ held in conjunction with IPDPS 2020 http://www.ipdps.org/

  19. Analysis and Control of Power-Temperature Dynamics in Heterogeneous Multiprocessors

    Authors: Ganapati Bhat, Suat Gumussoy, Umit Y. Ogras

    Abstract: Virtually all electronic systems try to optimize a fundamental trade-off between higher performance and lower power consumption. The latter becomes critical in mobile computing systems, such as smartphones, which rely on passive cooling. Otherwise, the heat concentrated in a small area drives both the junction and skin temperatures up. High junction temperatures degrade the reliability, while skin…

    Submitted 24 March, 2020; originally announced March 2020.

    Comments: Will appear in a future issue of IEEE Transactions on Control Systems Technology

  20. arXiv:2003.09526 [pdf, other]

    cs.DC cs.LG eess.SY

    An Energy-Aware Online Learning Framework for Resource Management in Heterogeneous Platforms

    Authors: Sumit K. Mandal, Ganapati Bhat, Janardhan Rao Doppa, Partha Pratim Pande, Umit Y. Ogras

    Abstract: Mobile platforms must satisfy the contradictory requirements of fast response time and minimum energy consumption as a function of dynamically changing applications. To address this need, system-on-chips (SoC) that are at the heart of these devices provide a variety of control knobs, such as the number of active cores and their voltage/frequency levels. Controlling these knobs optimally at runtime…

    Submitted 20 March, 2020; originally announced March 2020.

    Comments: This paper has been accepted to be published in a future issue of ACM TODAES

  21. arXiv:2003.09016 [pdf, other]

    cs.AR

    DS3: A System-Level Domain-Specific System-on-Chip Simulation Framework

    Authors: Samet E. Arda, Anish NK, A. Alper Goksoy, Nirmal Kumbhare, Joshua Mack, Anderson L. Sartor, Ali Akoglu, Radu Marculescu, Umit Y. Ogras

    Abstract: Heterogeneous systems-on-chip (SoCs) are highly favorable computing platforms due to their superior performance and energy efficiency potential compared to homogeneous architectures. They can be further tailored to a specific domain of applications by incorporating processing elements (PEs) that accelerate frequently used kernels in these applications. However, this potential is contingent upon op…

    Submitted 19 March, 2020; originally announced March 2020.

    Comments: 14 pages, 20 figures

  22. arXiv:1908.03664 [pdf, other]

    cs.AR

    Work-in-Progress: A Simulation Framework for Domain-Specific System-on-Chips

    Authors: Samet E. Arda, Anish NK, A. Alper Goksoy, Joshua Mack, Nirmal Kumbhare, Anderson L. Sartor, Ali Akoglu, Radu Marculescu, Umit Y. Ogras

    Abstract: Heterogeneous system-on-chips (SoCs) have become the standard embedded computing platforms due to their potential to deliver superior performance and energy efficiency compared to homogeneous architectures. They can be particularly suited to target a specific domain of applications. However, this potential is contingent upon optimizing the SoC for the target domain and utilizing its resources effe…

    Submitted 9 August, 2019; originally announced August 2019.

  23. arXiv:1908.02408 [pdf, other]

    cs.PF eess.SY

    Analytical Performance Models for NoCs with Multiple Priority Traffic Classes

    Authors: Sumit K. Mandal, Raid Ayoub, Michael Kishinevsky, Umit Y. Ogras

    Abstract: Networks-on-chip (NoCs) have become the standard for interconnect solutions in industrial designs ranging from client CPUs to many-core chip-multiprocessors. Since NoCs play a vital role in system performance and power consumption, pre-silicon evaluation environments include cycle-accurate NoC simulators. Long simulations increase the execution time of evaluation frameworks, which are already noto…

    Submitted 3 January, 2020; v1 submitted 6 August, 2019; originally announced August 2019.

    Comments: This article will appear as part of the ESWEEK-TECS special issue and will be presented in the International Conference on Compilers, Architecture, and Synthesis for Embedded Systems (CASES) 2019

  24. arXiv:1904.09814 [pdf, other]

    cs.OH cs.DC

    Power and Thermal Analysis of Commercial Mobile Platforms: Experiments and Case Studies

    Authors: Ganapati Bhat, Suat Gumussoy, Umit Y. Ogras

    Abstract: State-of-the-art mobile processors can deliver fast response time and high throughput to maximize the user experience. However, high performance comes at the expense of larger power density, which leads to higher skin temperatures. Since this can degrade the user experience, there is a strong need for power consumption and thermal analysis in mobile processors. In this paper, we first perform expe…

    Submitted 19 March, 2019; originally announced April 2019.

    Comments: To appear in proceedings of IEEE DATE 2019

  25. arXiv:1903.03168 [pdf, other]

    cs.HC

    OpenHealth: Open Source Platform for Wearable Health Monitoring

    Authors: Ganapati Bhat, Ranadeep Deb, Umit Y. Ogras

    Abstract: Movement disorders are becoming one of the leading causes of functional disability due to aging populations and extended life expectancy. Wearable health monitoring is emerging as an effective way to augment clinical care for movement disorders. However, wearable devices face a number of adaptation and technical challenges that hinder their widespread adoption. To address these challenges, we intr…

    Submitted 16 March, 2019; v1 submitted 18 February, 2019; originally announced March 2019.

    Comments: To appear in a future issue of IEEE Design & Test

  26. arXiv:1902.02639 [pdf, other]

    eess.SP eess.SY

    REAP: Runtime Energy-Accuracy Optimization for Energy Harvesting IoT Devices

    Authors: Ganapati Bhat, Kunal Bagewadi, Hyung Gyu Lee, Umit Y. Ogras

    Abstract: The use of wearable and mobile devices for health monitoring and activity recognition applications is increasing rapidly. These devices need to maximize their accuracy and active time under a tight energy budget imposed by battery and small form-factor constraints. This paper considers energy harvesting devices that run on a limited energy budget to recognize user activities over a given period. W…

    Submitted 22 February, 2019; v1 submitted 2 February, 2019; originally announced February 2019.

    Comments: To appear in Proceedings of DAC 2019. Datasets are available at https://github.com/gmbhat/human-activity-recognition

  27. Online Human Activity Recognition using Low-Power Wearable Devices

    Authors: Ganapati Bhat, Ranadeep Deb, Vatika Vardhan Chaurasia, Holly Shill, Umit Y. Ogras

    Abstract: Human activity recognition (HAR) has attracted significant research interest due to its applications in health monitoring and patient rehabilitation. Recent research on HAR focuses on using smartphones due to their widespread use. However, this leads to inconvenient use, limited choice of sensors and inefficient use of resources, since smartphones are not designed for HAR. This paper presents the…

    Submitted 4 February, 2019; v1 submitted 26 August, 2018; originally announced August 2018.

    Comments: This is in proceedings of ICCAD 2018. The datasets are available at https://github.com/gmbhat/human-activity-recognition

  28. Power-Temperature Stability and Safety Analysis for Multiprocessor Systems

    Authors: Ganapati Bhat, Suat Gumussoy, Umit Y. Ogras

    Abstract: Modern multiprocessor system-on-chips (SoCs) integrate multiple heterogeneous cores to achieve high energy efficiency. The power consumption of each core contributes to an increase in the temperature across the chip floorplan. In turn, higher temperature increases the leakage power exponentially, and leads to a positive feedback with nonlinear dynamics. This paper presents a power-temperature stab…

    Submitted 16 June, 2018; originally announced June 2018.

    Comments: Published in ACM TECS

    Journal ref: ACM Trans. Embed. Comput. Syst. 16, 5s, Article 145 (September 2017), 19 pages

  29. arXiv:0710.4707 [pdf]

    cs.AR

    Energy- and Performance-Driven NoC Communication Architecture Synthesis Using a Decomposition Approach

    Authors: Umit Y. Ogras, Radu Marculescu

    Abstract: In this paper, we present a methodology for customized communication architecture synthesis that matches the communication requirements of the target application. This is an important problem, particularly for network-based implementations of complex applications. Our approach is based on using frequently encountered generic communication primitives as an alphabet capable of characterizing any g…

    Submitted 25 October, 2007; originally announced October 2007.

    Comments: Submitted on behalf of EDAA (http://www.edaa.com/)

    Journal ref: In Design, Automation and Test in Europe - DATE'05, Munich, Germany (2005)