Showing 1–50 of 80 results for author: Hou, R

  1. arXiv:2410.13405  [pdf, other]

    cs.AR cs.CR

    Trinity: A General Purpose FHE Accelerator

    Authors: Xianglong Deng, Shengyu Fan, Zhicheng Hu, Zhuoyu Tian, Zihao Yang, Jiangrui Yu, Dingyuan Cao, Dan Meng, Rui Hou, Meng Li, Qian Lou, Mingzhe Zhang

    Abstract: In this paper, we present the first multi-modal FHE accelerator based on a unified architecture, which efficiently supports CKKS, TFHE, and their conversion scheme within a single accelerator. To achieve this goal, we first analyze the theoretical foundations of the aforementioned schemes and highlight their composition from a finite number of arithmetic kernels. Then, we investigate the challenge…

    Submitted 17 October, 2024; originally announced October 2024.

    Comments: To appear in MICRO 2024. The first ASIC-based FHE accelerator that supports CKKS, TFHE, and their conversions. Provides a new SOTA performance record for CKKS, TFHE, and conversion

  2. arXiv:2410.06777  [pdf, other]

    cs.CV

    HERM: Benchmarking and Enhancing Multimodal LLMs for Human-Centric Understanding

    Authors: Keliang Li, Zaifei Yang, Jiahe Zhao, Hongze Shen, Ruibing Hou, Hong Chang, Shiguang Shan, Xilin Chen

    Abstract: The significant advancements in visual understanding and instruction following from Multimodal Large Language Models (MLLMs) have opened up more possibilities for broader applications in diverse and universal human-centric scenarios. However, existing image-text data may not support the precise modality alignment and integration of multi-grained information, which is crucial for human-centric visu…

    Submitted 9 October, 2024; originally announced October 2024.

  3. arXiv:2410.05934  [pdf, other]

    cs.CR

    Chameleon: An Efficient FHE Scheme Switching Acceleration on GPUs

    Authors: Zhiwei Wang, Haoqi He, Lutan Zhao, Peinan Li, Zhihao Li, Dan Meng, Rui Hou

    Abstract: Fully homomorphic encryption (FHE) enables direct computation on encrypted data, making it a crucial technology for privacy protection. However, FHE suffers from significant performance bottlenecks. In this context, GPU acceleration offers a promising solution to bridge the performance gap. Existing efforts primarily focus on single-class FHE schemes, which fail to meet the diverse requirements of…

    Submitted 8 October, 2024; originally announced October 2024.

    Comments: 15 pages, 14 figures

  4. arXiv:2410.01335  [pdf, other]

    cs.CL cs.AI cs.LG

    Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models

    Authors: Lucas Bandarkar, Benjamin Muller, Pritish Yuvraj, Rui Hou, Nayan Singhal, Hongjiang Lv, Bing Liu

    Abstract: Model merging, such as model souping, is the practice of combining different models with the same architecture together without further training. In this work, we present a model merging methodology that addresses the difficulty of fine-tuning Large Language Models (LLMs) for target tasks in non-English languages, where task-specific data is often unavailable. We focus on mathematical reasoning an…

    Submitted 2 October, 2024; originally announced October 2024.

    Comments: 11 main pages, 23 pages total, 9 figures, 5 tables

  5. arXiv:2409.20002  [pdf, other]

    cs.CR

    The Early Bird Catches the Leak: Unveiling Timing Side Channels in LLM Serving Systems

    Authors: Linke Song, Zixuan Pang, Wenhao Wang, Zihao Wang, XiaoFeng Wang, Hongbo Chen, Wei Song, Yier Jin, Dan Meng, Rui Hou

    Abstract: The wide deployment of Large Language Models (LLMs) has given rise to strong demands for optimizing their inference performance. Today's techniques serving this purpose primarily focus on reducing latency and improving throughput through algorithmic and hardware enhancements, while largely overlooking their privacy side effects, particularly in a multi-user environment. In our research, for the fi…

    Submitted 30 September, 2024; originally announced September 2024.

  6. arXiv:2409.19951  [pdf, other]

    cs.AI cs.CL cs.CV

    Law of the Weakest Link: Cross Capabilities of Large Language Models

    Authors: Ming Zhong, Aston Zhang, Xuewei Wang, Rui Hou, Wenhan Xiong, Chenguang Zhu, Zhengxing Chen, Liang Tan, Chloe Bi, Mike Lewis, Sravya Popuri, Sharan Narang, Melanie Kambadur, Dhruv Mahajan, Sergey Edunov, Jiawei Han, Laurens van der Maaten

    Abstract: The development and evaluation of Large Language Models (LLMs) have largely focused on individual capabilities. However, this overlooks the intersection of multiple abilities across different types of expertise that are often required for real-world tasks, which we term cross capabilities. To systematically explore this concept, we first define seven core individual capabilities and then pair them…

    Submitted 2 October, 2024; v1 submitted 30 September, 2024; originally announced September 2024.

    Comments: Data, Code, & Benchmark: www.llm-cross-capabilities.org

  7. arXiv:2408.12079  [pdf, other]

    cs.CL cs.AI

    High-Quality Data Augmentation for Low-Resource NMT: Combining a Translation Memory, a GAN Generator, and Filtering

    Authors: Hengjie Liu, Ruibo Hou, Yves Lepage

    Abstract: Back translation, as a technique for extending a dataset, is widely used by researchers in low-resource language translation tasks. It typically translates from the target to the source language to ensure high-quality translation results. This paper proposes a novel way of utilizing a monolingual corpus on the source side to assist Neural Machine Translation (NMT) in low-resource settings. We real…

    Submitted 21 August, 2024; originally announced August 2024.

  8. arXiv:2408.11261  [pdf, other]

    cs.AI cs.CL

    Towards Analyzing and Mitigating Sycophancy in Large Vision-Language Models

    Authors: Yunpu Zhao, Rui Zhang, Junbin Xiao, Changxin Ke, Ruibo Hou, Yifan Hao, Qi Guo, Yunji Chen

    Abstract: Large Vision-Language Models (LVLMs) have shown significant capability in vision-language understanding. However, one critical issue that persists in these models is sycophancy, which means models are unduly influenced by leading or deceptive prompts, resulting in biased outputs and hallucinations. Despite the progress in LVLMs, evaluating and mitigating sycophancy is yet much under-explored. In t…

    Submitted 20 August, 2024; originally announced August 2024.

  9. arXiv:2408.10039  [pdf, other]

    cs.AI

    MSDiagnosis: An EMR-based Dataset for Clinical Multi-Step Diagnosis

    Authors: Ruihui Hou, Shencheng Chen, Yongqi Fan, Lifeng Zhu, Jing Sun, Jingping Liu, Tong Ruan

    Abstract: Clinical diagnosis is critical in medical practice, typically requiring a continuous and evolving process that includes primary diagnosis, differential diagnosis, and final diagnosis. However, most existing clinical diagnostic tasks are single-step processes, which does not align with the complex multi-step diagnostic procedures found in real-world clinical settings. In this paper, we propose a mu…

    Submitted 29 August, 2024; v1 submitted 19 August, 2024; originally announced August 2024.

  10. arXiv:2408.00767  [pdf]

    cs.IT cs.CL

    Quantification and Validation for Degree of Understanding in M2M Semantic Communications

    Authors: Linhan Xia, Jiaxin Cai, Ricky Yuen-Tan Hou, Seon-Phil Jeong

    Abstract: With the development of Artificial Intelligence (AI) and Internet of Things (IoT) technologies, network communications based on the Shannon-Nyquist theorem gradually reveal their limitations due to the neglect of semantic information in the transmitted content. Semantic communication (SemCom) provides a solution for extracting information meanings from the transmitted content. The semantic informa…

    Submitted 14 July, 2024; originally announced August 2024.

    Comments: ICCT 2024

  11. arXiv:2407.21783  [pdf, other]

    cs.AI cs.CL cs.CV

    The Llama 3 Herd of Models

    Authors: Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, Anirudh Goyal, Anthony Hartshorn, Aobo Yang, Archi Mitra, Archie Sravankumar, Artem Korenev, Arthur Hinsvark, Arun Rao, Aston Zhang, Aurelien Rodriguez, Austen Gregerson, Ava Spataru, Baptiste Roziere, Bethany Biron, Binh Tang, et al. (510 additional authors not shown)

    Abstract: Modern artificial intelligence (AI) systems are powered by foundation models. This paper presents a new set of foundation models, called Llama 3. It is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. Our largest model is a dense Transformer with 405B parameters and a context window of up to 128K tokens. This paper presents an extensive empirical…

    Submitted 15 August, 2024; v1 submitted 31 July, 2024; originally announced July 2024.

  12. arXiv:2406.06007  [pdf, other]

    cs.LG cs.CL cs.CV cs.CY

    CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models

    Authors: Peng Xia, Ze Chen, Juanxi Tian, Yangrui Gong, Ruibo Hou, Yue Xu, Zhenbang Wu, Zhiyuan Fan, Yiyang Zhou, Kangyu Zhu, Wenhao Zheng, Zhaoyang Wang, Xiao Wang, Xuchao Zhang, Chetan Bansal, Marc Niethammer, Junzhou Huang, Hongtu Zhu, Yun Li, Jimeng Sun, Zongyuan Ge, Gang Li, James Zou, Huaxiu Yao

    Abstract: Artificial intelligence has significantly impacted medical applications, particularly with the advent of Medical Large Vision Language Models (Med-LVLMs), sparking optimism for the future of automated and personalized healthcare. However, the trustworthiness of Med-LVLMs remains unverified, posing significant risks for future model deployment. In this paper, we introduce CARES and aim to comprehen…

    Submitted 10 June, 2024; originally announced June 2024.

  13. arXiv:2405.20596  [pdf, other]

    cs.CV cs.LG

    Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation

    Authors: Jiachen Liang, Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen

    Abstract: Traditional semi-supervised learning (SSL) assumes that the feature distributions of labeled and unlabeled data are consistent which rarely holds in realistic scenarios. In this paper, we propose a novel SSL setting, where unlabeled samples are drawn from a mixed distribution that deviates from the feature distribution of labeled samples. Under this setting, previous SSL methods tend to predict wr…

    Submitted 30 May, 2024; originally announced May 2024.

    Comments: 10 pages; Accepted by NeurIPS 2023

  14. arXiv:2405.16273  [pdf, other]

    cs.CV

    M$^3$GPT: An Advanced Multimodal, Multitask Framework for Motion Comprehension and Generation

    Authors: Mingshuang Luo, Ruibing Hou, Zhuo Li, Hong Chang, Zimo Liu, Yaowei Wang, Shiguang Shan

    Abstract: This paper presents M$^3$GPT, an advanced $\textbf{M}$ultimodal, $\textbf{M}$ultitask framework for $\textbf{M}$otion comprehension and generation. M$^3$GPT operates on three fundamental principles. The first focuses on creating a unified representation space for various motion-relevant modalities. We employ discrete vector quantization for multimodal control and generation signals, such as text,…

    Submitted 25 September, 2024; v1 submitted 25 May, 2024; originally announced May 2024.

    Comments: 18 pages, 6 figures

  15. arXiv:2404.15310  [pdf, other]

    cs.HC cs.AI cs.CY cs.LG

    Automated Assessment of Encouragement and Warmth in Classrooms Leveraging Multimodal Emotional Features and ChatGPT

    Authors: Ruikun Hou, Tim Fütterer, Babette Bühler, Efe Bozkir, Peter Gerjets, Ulrich Trautwein, Enkelejda Kasneci

    Abstract: Classroom observation protocols standardize the assessment of teaching effectiveness and facilitate comprehension of classroom interactions. Whereas these protocols offer teachers specific feedback on their teaching practices, the manual coding by human raters is resource-intensive and often unreliable. This has sparked interest in developing AI-driven, cost-effective methods for automating such h…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: Accepted as a full paper by the 25th International Conference on Artificial Intelligence in Education (AIED 2024)

    Journal ref: Proceedings of the 25th International Conference on Artificial Intelligence in Education (AIED 2024)

  16. arXiv:2404.09507  [pdf, other]

    cs.CV

    Clothes-Changing Person Re-Identification with Feasibility-Aware Intermediary Matching

    Authors: Jiahe Zhao, Ruibing Hou, Hong Chang, Xinqian Gu, Bingpeng Ma, Shiguang Shan, Xilin Chen

    Abstract: Current clothes-changing person re-identification (re-id) approaches usually perform retrieval based on clothes-irrelevant features, while neglecting the potential of clothes-relevant features. However, we observe that relying solely on clothes-irrelevant features for clothes-changing re-id is limited, since they often lack adequate identity information and suffer from large intra-class variations…

    Submitted 15 April, 2024; originally announced April 2024.

  17. arXiv:2403.10188  [pdf, other]

    cs.CR cs.AR

    Taiyi: A high-performance CKKS accelerator for Practical Fully Homomorphic Encryption

    Authors: Shengyu Fan, Xianglong Deng, Zhuoyu Tian, Zhicheng Hu, Liang Chang, Rui Hou, Dan Meng, Mingzhe Zhang

    Abstract: Fully Homomorphic Encryption (FHE), a novel cryptographic theory enabling computation directly on ciphertext data, offers significant security benefits but is hampered by substantial performance overhead. In recent years, a series of accelerator designs have significantly enhanced the performance of FHE applications, bringing them closer to real-world applicability. However, these accelerators fac…

    Submitted 15 March, 2024; originally announced March 2024.

    Comments: 14 pages, 15 figures

  18. arXiv:2402.11438  [pdf, other]

    cs.CR cs.AR

    The Road to Trust: Building Enclaves within Confidential VMs

    Authors: Wenhao Wang, Linke Song, Benshan Mei, Shuang Liu, Shijun Zhao, Shoumeng Yan, XiaoFeng Wang, Dan Meng, Rui Hou

    Abstract: Integrity is critical for maintaining system security, as it ensures that only genuine software is loaded onto a machine. Although confidential virtual machines (CVMs) function within isolated environments separate from the host, it is important to recognize that users still encounter challenges in maintaining control over the integrity of the code running within the trusted execution environments…

    Submitted 31 May, 2024; v1 submitted 17 February, 2024; originally announced February 2024.

  19. arXiv:2312.17273  [pdf, other]

    cs.CV

    X Modality Assisting RGBT Object Tracking

    Authors: Zhaisheng Ding, Haiyan Li, Ruichao Hou, Yanyu Liu, Shidong Xie, Dongming Zhou, Jinde Cao

    Abstract: Learning robust multi-modal feature representations is critical for boosting tracking performance. To this end, we propose a novel X Modality Assisting Network (X-Net) to shed light on the impact of the fusion paradigm by decoupling the visual object tracking into three distinct levels, facilitating subsequent processing. Firstly, to tackle the feature learning hurdles stemming from significant di…

    Submitted 27 December, 2023; originally announced December 2023.

  20. arXiv:2312.12423  [pdf, other]

    cs.CV cs.AI

    Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model

    Authors: Shraman Pramanick, Guangxing Han, Rui Hou, Sayan Nag, Ser-Nam Lim, Nicolas Ballas, Qifan Wang, Rama Chellappa, Amjad Almahairi

    Abstract: The ability of large language models (LLMs) to process visual inputs has given rise to general-purpose vision systems, unifying various vision-language (VL) tasks by instruction tuning. However, due to the enormous diversity in input-output formats in the vision domain, existing general-purpose models fail to successfully integrate segmentation and multi-image inputs with coarse-level tasks into a…

    Submitted 19 June, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 Highlight

  21. arXiv:2312.12021  [pdf, other]

    cs.CL cs.AI

    Synergistic Anchored Contrastive Pre-training for Few-Shot Relation Extraction

    Authors: Da Luo, Yanglei Gan, Rui Hou, Run Lin, Qiao Liu, Yuxiang Cai, Wannian Gao

    Abstract: Few-shot Relation Extraction (FSRE) aims to extract relational facts from a sparse set of labeled corpora. Recent studies have shown promising results in FSRE by employing Pre-trained Language Models (PLMs) within the framework of supervised contrastive learning, which considers both instances and label facts. However, how to effectively harness massive instance-label pairs to encompass the learne…

    Submitted 11 March, 2024; v1 submitted 19 December, 2023; originally announced December 2023.

  22. arXiv:2312.06424  [pdf, other]

    cs.IR

    Cross Domain LifeLong Sequential Modeling for Online Click-Through Rate Prediction

    Authors: Ruijie Hou, Zhaoyang Yang, Yu Ming, Hongyu Lu, Zhuobin Zheng, Yu Chen, Qinsong Zeng, Ming Chen

    Abstract: Deep neural networks (DNNs) that incorporated lifelong sequential modeling (LSM) have brought great success to recommendation systems in various social media platforms. While continuous improvements have been made in domain-specific LSM, limited work has been done in cross-domain LSM, which considers modeling of lifelong sequences of both target domain and source domain. In this paper, we propose…

    Submitted 17 May, 2024; v1 submitted 11 December, 2023; originally announced December 2023.

    Comments: Accepted by KDD 2024

  23. arXiv:2312.04032  [pdf, other]

    cs.CL cs.LG

    RoAST: Robustifying Language Models via Adversarial Perturbation with Selective Training

    Authors: Jaehyung Kim, Yuning Mao, Rui Hou, Hanchao Yu, Davis Liang, Pascale Fung, Qifan Wang, Fuli Feng, Lifu Huang, Madian Khabsa

    Abstract: Fine-tuning pre-trained language models (LMs) has become the de facto standard in many NLP tasks. Nevertheless, fine-tuned LMs are still prone to robustness issues, such as adversarial robustness and model calibration. Several perspectives of robustness for LMs have been studied independently, but lacking a unified consideration in multiple perspectives. In this paper, we propose Robustifying LMs…

    Submitted 6 December, 2023; originally announced December 2023.

    Comments: 33 pages, accepted at EMNLP 2023 Findings

  24. arXiv:2311.08024  [pdf, other]

    eess.IV cs.CV cs.LG

    MD-IQA: Learning Multi-scale Distributed Image Quality Assessment with Semi Supervised Learning for Low Dose CT

    Authors: Tao Song, Ruizhi Hou, Lisong Dai, Lei Xiang

    Abstract: Image quality assessment (IQA) plays a critical role in optimizing radiation dose and developing novel medical imaging techniques in computed tomography (CT). Traditional IQA methods relying on hand-crafted features have limitations in summarizing the subjective perceptual experience of image quality. Recent deep learning-based approaches have demonstrated strong modeling capabilities and potentia…

    Submitted 14 November, 2023; originally announced November 2023.

  25. arXiv:2311.07689  [pdf, other]

    cs.CL

    MART: Improving LLM Safety with Multi-round Automatic Red-Teaming

    Authors: Suyu Ge, Chunting Zhou, Rui Hou, Madian Khabsa, Yi-Chia Wang, Qifan Wang, Jiawei Han, Yuning Mao

    Abstract: Red-teaming is a common practice for mitigating unsafe behaviors in Large Language Models (LLMs), which involves thoroughly assessing LLMs to identify potential flaws and addressing them with responsible and accurate responses. While effective, manual red-teaming is costly, and existing automatic red-teaming typically discovers safety risks without addressing them. In this paper, we propose a Mult…

    Submitted 13 November, 2023; originally announced November 2023.

  26. arXiv:2311.06742  [pdf, ps, other]

    cs.IT

    Meta-Reinforcement Learning for Timely and Energy-efficient Data Collection in Solar-powered UAV-assisted IoT Networks

    Authors: Mengjie Yi, Xijun Wang, Juan Liu, Yan Zhang, Ronghui Hou

    Abstract: Unmanned aerial vehicles (UAVs) have the potential to greatly aid Internet of Things (IoT) networks in mission-critical data collection, thanks to their flexibility and cost-effectiveness. However, challenges arise due to the UAV's limited onboard energy and the unpredictable status updates from sensor nodes (SNs), which impact the freshness of collected data. In this paper, we investigate the ene…

    Submitted 12 November, 2023; originally announced November 2023.

  27. arXiv:2311.02849  [pdf, other]

    cs.CL cs.AI

    Co-training and Co-distillation for Quality Improvement and Compression of Language Models

    Authors: Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Hongbo Zhang, Sung Ju Hwang, Alexander Min

    Abstract: Knowledge Distillation (KD) compresses computationally expensive pre-trained language models (PLMs) by transferring their knowledge to smaller models, allowing their use in resource-constrained or real-time settings. However, most smaller models fail to surpass the performance of the original larger model, resulting in sacrificing performance to improve inference speed. To address this issue, we p…

    Submitted 7 November, 2023; v1 submitted 5 November, 2023; originally announced November 2023.

    Comments: Findings of EMNLP 2023

  28. arXiv:2309.16039  [pdf, other]

    cs.CL

    Effective Long-Context Scaling of Foundation Models

    Authors: Wenhan Xiong, Jingyu Liu, Igor Molybog, Hejia Zhang, Prajjwal Bhargava, Rui Hou, Louis Martin, Rashi Rungta, Karthik Abinav Sankararaman, Barlas Oguz, Madian Khabsa, Han Fang, Yashar Mehdad, Sharan Narang, Kshitiz Malik, Angela Fan, Shruti Bhosale, Sergey Edunov, Mike Lewis, Sinong Wang, Hao Ma

    Abstract: We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens. Our model series are built through continual pretraining from Llama 2 with longer training sequences and on a dataset where long texts are upsampled. We perform extensive evaluation on language modeling, synthetic context probing tasks, and a wide range of research benchmarks. On research benchm…

    Submitted 13 November, 2023; v1 submitted 27 September, 2023; originally announced September 2023.

  29. arXiv:2308.13165  [pdf, other]

    cs.CV

    Dual Compensation Residual Networks for Class Imbalanced Learning

    Authors: Ruibing Hou, Hong Chang, Bingpeng Ma, Shiguang Shan, Xilin Chen

    Abstract: Learning generalizable representation and classifier for class-imbalanced data is challenging for data-driven deep models. Most studies attempt to re-balance the data distribution, which is prone to overfitting on tail classes and underfitting on head classes. In this work, we propose Dual Compensation Residual Networks to better fit both tail and head classes. Firstly, we propose dual Feature Com…

    Submitted 25 August, 2023; originally announced August 2023.

    Comments: 20 pages

    Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, 2023)

  30. arXiv:2308.11447  [pdf, other]

    cs.CL cs.AI

    Aspect-oriented Opinion Alignment Network for Aspect-Based Sentiment Classification

    Authors: Xueyi Liu, Rui Hou, Yanglei Gan, Da Luo, Changlin Li, Xiaojun Shi, Qiao Liu

    Abstract: Aspect-based sentiment classification is a crucial problem in fine-grained sentiment analysis, which aims to predict the sentiment polarity of the given aspect according to its context. Previous works have made remarkable progress in leveraging attention mechanism to extract opinion words for different aspects. However, a persistent challenge is the effective management of semantic mismatches, whi…

    Submitted 22 August, 2023; originally announced August 2023.

    Comments: 8 pages, 5 figures, ECAI 2023

    ACM Class: I.2.7

  31. arXiv:2307.09288  [pdf, other]

    cs.CL cs.AI

    Llama 2: Open Foundation and Fine-Tuned Chat Models

    Authors: Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Almahairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, Dan Bikel, Lukas Blecher, Cristian Canton Ferrer, Moya Chen, Guillem Cucurull, David Esiobu, Jude Fernandes, Jeremy Fu, Wenyin Fu, Brian Fuller, Cynthia Gao, Vedanuj Goswami, Naman Goyal, Anthony Hartshorn, Saghar Hosseini, et al. (43 additional authors not shown)

    Abstract: In this work, we develop and release Llama 2, a collection of pretrained and fine-tuned large language models (LLMs) ranging in scale from 7 billion to 70 billion parameters. Our fine-tuned LLMs, called Llama 2-Chat, are optimized for dialogue use cases. Our models outperform open-source chat models on most benchmarks we tested, and based on our human evaluations for helpfulness and safety, may be…

    Submitted 19 July, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

  32. arXiv:2306.11011  [pdf, other]

    cs.CR

    virtCCA: Virtualized Arm Confidential Compute Architecture with TrustZone

    Authors: Xiangyi Xu, Wenhao Wang, Yongzheng Wu, Chenyu Wang, Huifeng Zhu, Haocheng Ma, Zhennan Min, Zixuan Pang, Rui Hou, Yier Jin

    Abstract: ARM recently introduced the Confidential Compute Architecture (CCA) as part of the upcoming ARMv9-A architecture. CCA enables the support of confidential virtual machines (cVMs) within a separate world called the Realm world, providing protection from the untrusted normal world. While CCA offers a promising future for confidential computing, the widespread availability of CCA hardware is not expec…

    Submitted 17 February, 2024; v1 submitted 19 June, 2023; originally announced June 2023.

  33. arXiv:2305.18239  [pdf, other]

    cs.CL cs.AI

    A Study on Knowledge Distillation from Weak Teacher for Scaling Up Pre-trained Language Models

    Authors: Hayeon Lee, Rui Hou, Jongpil Kim, Davis Liang, Sung Ju Hwang, Alexander Min

    Abstract: Distillation from Weak Teacher (DWT) is a method of transferring knowledge from a smaller, weaker teacher model to a larger student model to improve its performance. Previous studies have shown that DWT can be effective in the vision domain and natural language processing (NLP) pre-training stage. Specifically, DWT shows promise in practical scenarios, such as enhancing new generation or larger mo…

    Submitted 26 May, 2023; originally announced May 2023.

    Comments: Findings of ACL 2023

  34. arXiv:2305.16049  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    CN-Celeb-AV: A Multi-Genre Audio-Visual Dataset for Person Recognition

    Authors: Lantian Li, Xiaolou Li, Haoyu Jiang, Chen Chen, Ruihai Hou, Dong Wang

    Abstract: Audio-visual person recognition (AVPR) has received extensive attention. However, most datasets used for AVPR research so far are collected in constrained environments, and thus cannot reflect the true performance of AVPR systems in real-world scenarios. To meet the request for research on AVPR in unconstrained conditions, this paper presents a multi-genre AVPR dataset collected `in the wild', nam…

    Submitted 28 July, 2023; v1 submitted 25 May, 2023; originally announced May 2023.

    Comments: INTERSPEECH 2023

  35. arXiv:2305.13499  [pdf, other]

    cs.CL

    Learning Easily Updated General Purpose Text Representations with Adaptable Task-Specific Prefixes

    Authors: Kuan-Hao Huang, Liang Tan, Rui Hou, Sinong Wang, Amjad Almahairi, Ruty Rinott

    Abstract: Many real-world applications require making multiple predictions from the same text. Fine-tuning a large pre-trained language model for each downstream task causes computational burdens in the inference time due to several times of forward passes. To amortize the computational cost, freezing the language model and building lightweight models for downstream tasks based on fixed text representations…

    Submitted 14 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: Paper accepted by EMNLP 2023 Findings

  36. arXiv:2305.03937  [pdf, other]

    cs.CL cs.AI

    Residual Prompt Tuning: Improving Prompt Tuning with Residual Reparameterization

    Authors: Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Jimmy Ba, Amjad Almahairi

    Abstract: Prompt tuning is one of the successful approaches for parameter-efficient tuning of pre-trained language models. Despite being arguably the most parameter-efficient (tuned soft prompts constitute <0.1% of total parameters), it typically performs worse than other efficient tuning methods and is quite sensitive to hyper-parameters. In this work, we introduce Residual Prompt Tuning - a simple and eff…

    Submitted 6 May, 2023; originally announced May 2023.

    Comments: ACL Findings 2023

  37. arXiv:2305.00104  [pdf, other]

    cs.CV eess.AS eess.IV

    MMViT: Multiscale Multiview Vision Transformers

    Authors: Yuchen Liu, Natasha Ong, Kaiyan Peng, Bo Xiong, Qifan Wang, Rui Hou, Madian Khabsa, Kaiyue Yang, David Liu, Donald S. Williamson, Hanchao Yu

    Abstract: We present Multiscale Multiview Vision Transformers (MMViT), which introduces multiscale feature maps and multiview encodings to transformer models. Our model encodes different views of the input signal and builds several channel-resolution feature stages to process the multiple views of the input at different resolutions in parallel. At each scale stage, we use a cross-attention block to fuse inf…

    Submitted 28 April, 2023; originally announced May 2023.

  38. arXiv:2304.00325  [pdf, other]

    cs.CV

    SVT: Supertoken Video Transformer for Efficient Video Understanding

    Authors: Chenbin Pan, Rui Hou, Hanchao Yu, Qifan Wang, Senem Velipasalar, Madian Khabsa

    Abstract: Whether by processing videos with fixed resolution from start to end or incorporating pooling and down-scaling strategies, existing video transformers process the whole video content throughout the network without specially handling the large portions of redundant information. In this paper, we present a Supertoken Video Transformer (SVT) that incorporates a Semantic Pooling Module (SPM) to aggreg…

    Submitted 23 April, 2023; v1 submitted 1 April, 2023; originally announced April 2023.

  39. arXiv:2301.12314  [pdf, other]

    cs.CL cs.AI cs.LG

    Progressive Prompts: Continual Learning for Language Models

    Authors: Anastasia Razdaibiedina, Yuning Mao, Rui Hou, Madian Khabsa, Mike Lewis, Amjad Almahairi

    Abstract: We introduce Progressive Prompts - a simple and efficient approach for continual learning in language models. Our method allows forward transfer and resists catastrophic forgetting, without relying on data replay or a large number of task-specific parameters. Progressive Prompts learns a new soft prompt for each task and sequentially concatenates it with the previously learned prompts, while keepi…

    Submitted 28 January, 2023; originally announced January 2023.

  40. arXiv:2301.10472  [pdf, other]

    cs.CL cs.LG

    XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models

    Authors: Davis Liang, Hila Gonen, Yuning Mao, Rui Hou, Naman Goyal, Marjan Ghazvininejad, Luke Zettlemoyer, Madian Khabsa

    Abstract: Large multilingual language models typically rely on a single vocabulary shared across 100+ languages. As these models have increased in parameter count and depth, vocabulary size has remained largely unchanged. This \textit{vocabulary bottleneck} limits the representational capabilities of multilingual models like XLM-R. In this paper, we introduce a new approach for scaling to very large multili…

    Submitted 13 October, 2023; v1 submitted 25 January, 2023; originally announced January 2023.

    Comments: EMNLP 2023

  41. arXiv:2212.14191  [pdf, other]

    cs.AR cs.CR

    TensorFHE: Achieving Practical Computation on Encrypted Data Using GPGPU

    Authors: Shengyu Fan, Zhiwei Wang, Weizhi Xu, Rui Hou, Dan Meng, Mingzhe Zhang

    Abstract: In this paper, we propose TensorFHE, an FHE acceleration solution based on GPGPU for real applications on encrypted data. TensorFHE utilizes Tensor Core Units (TCUs) to boost the computation of the Number Theoretic Transform (NTT), which is the part of FHE with the highest time cost. Moreover, TensorFHE focuses on performing as many FHE operations as possible in a certain time period rather than reducing…

    Submitted 29 December, 2022; originally announced December 2022.

    Comments: To appear in the 29th IEEE International Symposium on High-Performance Computer Architecture (HPCA-29), 2023

  42. arXiv:2212.05195  [pdf, other]

    cs.LG

    Uniform Masking Prevails in Vision-Language Pretraining

    Authors: Siddharth Verma, Yuchen Lu, Rui Hou, Hanchao Yu, Nicolas Ballas, Madian Khabsa, Amjad Almahairi

    Abstract: Masked Language Modeling (MLM) has proven to be an essential component of Vision-Language (VL) pretraining. To implement MLM, the researcher must make two design choices: the masking strategy, which determines which tokens to mask, and the masking rate, which determines how many tokens to mask. Previous work has focused primarily on the masking strategy while setting the masking rate at a default…

    Submitted 9 December, 2022; originally announced December 2022.

  43. arXiv:2211.06056  [pdf, other]

    cs.CR

    Remapped Cache Layout: Thwarting Cache-Based Side-Channel Attacks with a Hardware Defense

    Authors: Wei Song, Rui Hou, Peng Liu, Xiaoxin Li, Peinan Li, Lutan Zhao, Xiaofei Fu, Yifei Sun, Dan Meng

    Abstract: As cache-based side-channel attacks become serious security problems, various defenses have been proposed and deployed in both software and hardware. Consequently, cache-based side-channel attacks on processes co-residing on the same core are becoming extremely difficult. Most recent attacks then shift their focus to the last-level cache (LLC). Although cache partitioning is currently the most…

    Submitted 11 November, 2022; originally announced November 2022.

  44. arXiv:2210.07467  [pdf, other]

    cs.CL

    Query Rewriting for Effective Misinformation Discovery

    Authors: Ashkan Kazemi, Artem Abzaliev, Naihao Deng, Rui Hou, Scott A. Hale, Verónica Pérez-Rosas, Rada Mihalcea

    Abstract: We propose a novel system to help fact-checkers formulate search queries for known misinformation claims and effectively search across multiple social media platforms. We introduce an adaptable rewriting strategy, where editing actions for queries containing claims (e.g., swap a word with its synonym; change verb tense into present simple) are automatically learned through offline reinforcement le…

    Submitted 2 October, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: AACL 2023 (long paper)

  45. arXiv:2209.02986  [pdf, other]

    cs.CV cs.LG eess.IV

    Shifting Perspective to See Difference: A Novel Multi-View Method for Skeleton based Action Recognition

    Authors: Ruijie Hou, Yanran Li, Ningyu Zhang, Yulin Zhou, Xiaosong Yang, Zhao Wang

    Abstract: Skeleton-based human action recognition is a longstanding challenge due to its complex dynamics. Some fine-grained details of the dynamics play a vital role in classification. Existing work largely focuses on designing incremental neural networks with more complicated adjacency matrices to capture the details of joint relationships. However, they still have difficulties distinguishing actions th…

    Submitted 7 September, 2022; originally announced September 2022.

  46. Using Chatbots to Teach Languages

    Authors: Yu Li, Chun-Yen Chen, Dian Yu, Sam Davidson, Ryan Hou, Xun Yuan, Yinghua Tan, Derek Pham, Zhou Yu

    Abstract: This paper reports on progress towards building an online language learning tool to provide learners with conversational experience by using dialog systems as conversation practice partners. Our system can adapt to users' language proficiency on the fly. We also provide automatic grammar error feedback to help users learn from their mistakes. According to our first adopters, our system is entertai…

    Submitted 31 July, 2022; originally announced August 2022.

    Comments: Accepted to Learning @ Scale 2022

  47. Immunofluorescence Capillary Imaging Segmentation: Cases Study

    Authors: Runpeng Hou, Ziyuan Ye, Chengyu Yang, Linhao Fu, Chao Liu, Quanying Liu

    Abstract: Nonunion is one of the challenges faced by orthopedic clinics, owing to the technical difficulties and high costs of photographing interosseous capillaries. Segmenting vessels and filling capillaries are critical in understanding the obstacles encountered in capillary growth. However, existing datasets for blood vessel segmentation mainly focus on the large blood vessels of the body, and the lack of la…

    Submitted 14 July, 2022; originally announced July 2022.

  48. arXiv:2206.00279  [pdf, other]

    cs.CR cs.FL

    Defensive Design of Saturating Counters Based on Differential Privacy

    Authors: Depeng Liu, Lutan Zhao, Pengfei Yang, Bow-Yaw Wang, Rui Hou, Lijun Zhang, Naijun Zhan

    Abstract: The saturating counter is the basic module of the dynamic branch predictor, which involves the core technique to improve instruction level parallelism performance in modern processors. However, most studies focus on the performance improvement and hardware consumption of saturating counters, while ignoring the security problems they may cause. In this paper, we creatively propose to study and desi…

    Submitted 1 June, 2022; originally announced June 2022.

  49. arXiv:2205.13685  [pdf, other]

    cs.CR cs.SD eess.AS

    Adversarial attacks and defenses in Speaker Recognition Systems: A survey

    Authors: Jiahe Lan, Rui Zhang, Zheng Yan, Jie Wang, Yu Chen, Ronghui Hou

    Abstract: Speaker recognition has become very popular in many application scenarios, such as smart homes and smart assistants, due to ease of use for remote control and economic-friendly features. The rapid development of SRSs is inseparable from the advancement of machine learning, especially neural networks. However, previous work has shown that machine learning models are vulnerable to adversarial attack…

    Submitted 26 May, 2022; originally announced May 2022.

    Comments: 38 pages, 2 figures, 2 tables. Journal of Systems Architecture, 2022

    ACM Class: A.1

  50. arXiv:2205.07295  [pdf, other]

    cs.LG cs.IR

    Posterior Probability Matters: Doubly-Adaptive Calibration for Neural Predictions in Online Advertising

    Authors: Penghui Wei, Weimin Zhang, Ruijie Hou, Jinquan Liu, Shaoguo Liu, Liang Wang, Bo Zheng

    Abstract: Predicting user response probabilities is vital for ad ranking and bidding. We hope that predictive models can produce accurate probabilistic predictions that reflect true likelihoods. Calibration techniques aim to post-process model predictions to posterior probabilities. Field-level calibration -- which performs calibration w.r.t. a specific field value -- is fine-grained and more practical.…

    Submitted 25 May, 2024; v1 submitted 15 May, 2022; originally announced May 2022.

    Comments: SIGIR 2022 (short)