Quantizing deep convolutional networks for efficient inference: A whitepaper

R Krishnamoorthi - arXiv preprint arXiv:1806.08342, 2018 - arxiv.org
We present an overview of techniques for quantizing convolutional neural networks for
inference with integer weights and activations. Per-channel quantization of weights and per-layer …

Deep learning recommendation model for personalization and recommendation systems

…, I Cherniavskii, Y Lu, R Krishnamoorthi… - arXiv preprint arXiv …, 2019 - arxiv.org
With the advent of deep learning, neural network-based recommendation models have
emerged as an important tool for tackling personalization and recommendation tasks. These …

MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning

…, X Shen, X Li, Z Liu, P Zhang, R Krishnamoorthi… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models have shown their remarkable capabilities as a general interface for
various language-related applications. Motivated by this, we target to build a unified …

LLM-QAT: Data-free quantization aware training for large language models

…, P Stock, Y Mehdad, Y Shi, R Krishnamoorthi… - arXiv preprint arXiv …, 2023 - arxiv.org
Several post-training quantization methods have been applied to large language models (LLMs),
and have been shown to perform well down to 8-bits. We find that these methods break …

EfficientSAM: Leveraged masked image pretraining for efficient segment anything

…, F Sun, F Iandola, R Krishnamoorthi… - Proceedings of the …, 2024 - openaccess.thecvf.com
Segment Anything Model (SAM) has emerged as a powerful tool for numerous vision
applications. A key component that drives the impressive performance for zero-shot transfer and …

Identifying user behavior by analyzing web server access log file

KR Suneetha, R Krishnamoorthi - IJCSNS International Journal of …, 2009 - researchgate.net
Web usage mining is the application of data mining techniques to discover usage patterns from
web data, in order to better serve the needs of web-based applications. The user access log …

BiT: Robustly binarized multi-distilled transformer

…, L Xiao, S Yih, M Li, R Krishnamoorthi… - Advances in neural …, 2022 - proceedings.neurips.cc
Modern pre-trained transformers have rapidly advanced the state-of-the-art in machine
learning, but have also grown in parameters and computational complexity, making them …

Fast point cloud generation with straight flows

…, Y Xiong, R Ranjan, R Krishnamoorthi… - Proceedings of the …, 2023 - openaccess.thecvf.com
Diffusion models have emerged as a powerful tool for point cloud generation. A key component
that drives the impressive performance for generating high-quality samples from noise is …

FLO physical layer: An overview

…, F Ling, A Mantravadi, R Krishnamoorthi… - IEEE transactions on …, 2007 - ieeexplore.ieee.org
This paper provides an overview of the physical layer of the Forward Link Only (FLO) Air
Interface. The FLO Air Interface is a key component of the MediaFLO system developed by …

MobileLLM: Optimizing sub-billion parameter language models for on-device use cases

…, Y Xiong, E Chang, Y Shi, R Krishnamoorthi… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper addresses the growing need for efficient large language models (LLMs) on mobile
devices, driven by increasing cloud costs and latency concerns. We focus on designing top…