Showing 1–50 of 86 results for author: Awadallah, A

  1. arXiv:2408.00203  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    OmniParser for Pure Vision Based GUI Agent

    Authors: Yadong Lu, Jianwei Yang, Yelong Shen, Ahmed Awadallah

    Abstract: The recent success of large vision language models shows great potential in driving the agent system operating on user interfaces. However, we argue that the power of multimodal models like GPT-4V as a general agent on multiple operating systems across different applications is largely underestimated due to the lack of a robust screen parsing technique capable of: 1) reliably identifying interactable…

    Submitted 31 July, 2024; originally announced August 2024.

  2. arXiv:2407.09879  [pdf, other]

    cs.CL

    sPhinX: Sample Efficient Multilingual Instruction Fine-Tuning Through N-shot Guided Prompting

    Authors: Sanchit Ahuja, Kumar Tanmay, Hardik Hansrajbhai Chauhan, Barun Patra, Kriti Aggarwal, Luciano Del Corro, Arindam Mitra, Tejas Indulal Dhamecha, Ahmed Awadallah, Monojit Choudhary, Vishrav Chaudhary, Sunayana Sitaram

    Abstract: Despite the remarkable success of LLMs in English, there is a significant gap in performance in non-English languages. In order to address this, we introduce a novel recipe for creating a multilingual synthetic instruction tuning dataset, sPhinX, which is created by selectively translating instruction response pairs from English into 50 languages. We test the effectiveness of sPhinX by using it to…

    Submitted 16 October, 2024; v1 submitted 13 July, 2024; originally announced July 2024.

    Comments: 20 pages, 12 tables, 5 figures

  3. arXiv:2407.03502  [pdf, other]

    cs.AI cs.CL cs.LG

    AgentInstruct: Toward Generative Teaching with Agentic Flows

    Authors: Arindam Mitra, Luciano Del Corro, Guoqing Zheng, Shweti Mahajan, Dany Rouhana, Andres Codas, Yadong Lu, Wei-ge Chen, Olga Vrousgos, Corby Rosset, Fillipe Silva, Hamed Khanpour, Yash Lara, Ahmed Awadallah

    Abstract: Synthetic data is becoming increasingly important for accelerating the development of language models, both large and small. Despite several successful use cases, researchers also raised concerns around model collapse and drawbacks of imitating other models. This discrepancy can be attributed to the fact that synthetic data varies in quality and diversity. Effective use of synthetic data usually r…

    Submitted 3 July, 2024; originally announced July 2024.

  4. arXiv:2405.21046  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Exploratory Preference Optimization: Harnessing Implicit Q*-Approximation for Sample-Efficient RLHF

    Authors: Tengyang Xie, Dylan J. Foster, Akshay Krishnamurthy, Corby Rosset, Ahmed Awadallah, Alexander Rakhlin

    Abstract: Reinforcement learning from human feedback (RLHF) has emerged as a central tool for language model alignment. We consider online exploration in RLHF, which exploits interactive access to human or AI feedback by deliberately encouraging the model to produce diverse, maximally informative responses. By allowing RLHF to confidently stray from the pre-trained model, online exploration offers the possi…

    Submitted 31 May, 2024; originally announced May 2024.

  5. arXiv:2405.02178  [pdf, other]

    cs.CL cs.AI

    Assessing and Verifying Task Utility in LLM-Powered Applications

    Authors: Negar Arabzadeh, Siqing Huo, Nikhil Mehta, Qingyun Wu, Chi Wang, Ahmed Awadallah, Charles L. A. Clarke, Julia Kiseleva

    Abstract: The rapid development of Large Language Models (LLMs) has led to a surge in applications that facilitate collaboration among multiple agents, assisting humans in their daily tasks. However, a significant gap remains in assessing to what extent LLM-powered applications genuinely enhance user experience and task execution efficiency. This highlights the need to verify utility of LLM-powered applicat…

    Submitted 12 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2402.09015

  6. arXiv:2404.14618  [pdf, other]

    cs.LG cs.AI cs.CL

    Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing

    Authors: Dujian Ding, Ankur Mallick, Chi Wang, Robert Sim, Subhabrata Mukherjee, Victor Ruhle, Laks V. S. Lakshmanan, Ahmed Hassan Awadallah

    Abstract: Large language models (LLMs) excel in most NLP tasks but also require expensive cloud servers for deployment due to their size, while smaller models that can be deployed on lower cost (e.g., edge) devices, tend to lag behind in terms of response quality. Therefore in this work we propose a hybrid inference approach which combines their respective strengths to save cost and maintain quality. Our ap…

    Submitted 22 April, 2024; originally announced April 2024.

    Comments: Accepted to ICLR 2024 (main conference)

  7. arXiv:2404.14219  [pdf, other]

    cs.CL cs.AI

    Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

    Authors: Marah Abdin, Jyoti Aneja, Hany Awadalla, Ahmed Awadallah, Ammar Ahmad Awan, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Martin Cai, Qin Cai, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Weizhu Chen, Yen-Chun Chen, Yi-Ling Chen, Hao Cheng, Parul Chopra, Xiyang Dai, et al. (104 additional authors not shown)

    Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. Our training dataset is a scaled-up version…

    Submitted 30 August, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 24 pages

  8. arXiv:2404.03715  [pdf, other]

    cs.LG cs.AI cs.CL

    Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences

    Authors: Corby Rosset, Ching-An Cheng, Arindam Mitra, Michael Santacroce, Ahmed Awadallah, Tengyang Xie

    Abstract: This paper studies post-training large language models (LLMs) using preference feedback from a powerful oracle to help a model iteratively improve over itself. The typical approach for post-training LLMs involves Reinforcement Learning from Human Feedback (RLHF), which traditionally separates reward learning and subsequent policy optimization. However, such a reward maximization approach is limite…

    Submitted 4 April, 2024; originally announced April 2024.

  9. arXiv:2402.17896  [pdf, other]

    cs.CL cs.AI

    Researchy Questions: A Dataset of Multi-Perspective, Decompositional Questions for LLM Web Agents

    Authors: Corby Rosset, Ho-Lam Chung, Guanghui Qin, Ethan C. Chau, Zhuo Feng, Ahmed Awadallah, Jennifer Neville, Nikhil Rao

    Abstract: Existing question answering (QA) datasets are no longer challenging to most powerful Large Language Models (LLMs). Traditional QA benchmarks like TriviaQA, NaturalQuestions, ELI5 and HotpotQA mainly study "known unknowns" with clear indications of both what information is missing, and how to find it to answer the question. Hence, good performance on these benchmarks provides a false sense of sec…

    Submitted 27 February, 2024; originally announced February 2024.

  10. arXiv:2402.14830  [pdf, other]

    cs.CL cs.AI

    Orca-Math: Unlocking the potential of SLMs in Grade School Math

    Authors: Arindam Mitra, Hamed Khanpour, Corby Rosset, Ahmed Awadallah

    Abstract: Mathematical word problem-solving has long been recognized as a complex task for small language models (SLMs). A recent study hypothesized that the smallest model size needed to achieve over 80% accuracy on the GSM8K benchmark is 34 billion parameters. To reach this level of performance with smaller models, researchers often train SLMs to generate Python code or use tools to help avoid calculatio…

    Submitted 16 February, 2024; originally announced February 2024.

  11. arXiv:2402.09015  [pdf, other]

    cs.CL cs.AI

    Towards better Human-Agent Alignment: Assessing Task Utility in LLM-Powered Applications

    Authors: Negar Arabzadeh, Julia Kiseleva, Qingyun Wu, Chi Wang, Ahmed Awadallah, Victor Dibia, Adam Fourney, Charles Clarke

    Abstract: The rapid development in the field of Large Language Models (LLMs) has led to a surge in applications that facilitate collaboration among multiple agents to assist humans in their daily tasks. However, a significant gap remains in assessing whether LLM-powered applications genuinely enhance user experience and task execution efficiency. This highlights the pressing need for methods to verify utili…

    Submitted 22 February, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

  12. arXiv:2312.02206  [pdf, other]

    cs.AI cs.CL

    Axiomatic Preference Modeling for Longform Question Answering

    Authors: Corby Rosset, Guoqing Zheng, Victor Dibia, Ahmed Awadallah, Paul Bennett

    Abstract: The remarkable abilities of large language models (LLMs) like GPT-4 partially stem from post-training processes like Reinforcement Learning from Human Feedback (RLHF) involving human preferences encoded in a reward model. However, these reward models (RMs) often lack direct knowledge of why, or under what principles, the preference annotations were made. In this study, we identify principles that…

    Submitted 2 December, 2023; originally announced December 2023.

    Comments: Accepted to EMNLP 2023

  13. arXiv:2311.11045  [pdf, other]

    cs.AI

    Orca 2: Teaching Small Language Models How to Reason

    Authors: Arindam Mitra, Luciano Del Corro, Shweti Mahajan, Andres Codas, Clarisse Simoes, Sahaj Agarwal, Xuxi Chen, Anastasia Razdaibiedina, Erik Jones, Kriti Aggarwal, Hamid Palangi, Guoqing Zheng, Corby Rosset, Hamed Khanpour, Ahmed Awadallah

    Abstract: Orca 1 learns from rich signals, such as explanation traces, allowing it to outperform conventional instruction-tuned models on benchmarks like BigBench Hard and AGIEval. In Orca 2, we continue exploring how improved training signals can enhance smaller LMs' reasoning abilities. Research on training small LMs has often relied on imitation learning to replicate the output of more capable models. We…

    Submitted 21 November, 2023; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: Added url to model weights fixed typo in Author name

  14. arXiv:2310.06827  [pdf, other]

    cs.CL cs.LG

    Teaching Language Models to Hallucinate Less with Synthetic Tasks

    Authors: Erik Jones, Hamid Palangi, Clarisse Simões, Varun Chandrasekaran, Subhabrata Mukherjee, Arindam Mitra, Ahmed Awadallah, Ece Kamar

    Abstract: Large language models (LLMs) frequently hallucinate on abstractive summarization tasks such as document-based question-answering, meeting summarization, and clinical report generation, even though all necessary information is included in context. However, optimizing LLMs to hallucinate less on these tasks is challenging, as hallucination is hard to efficiently evaluate at each optimization step. I…

    Submitted 7 November, 2023; v1 submitted 10 October, 2023; originally announced October 2023.

  15. arXiv:2310.03046  [pdf, other]

    cs.SE cs.AI

    EcoAssistant: Using LLM Assistant More Affordably and Accurately

    Authors: Jieyu Zhang, Ranjay Krishna, Ahmed H. Awadallah, Chi Wang

    Abstract: Today, users ask Large language models (LLMs) as assistants to answer queries that require external knowledge; they ask about the weather in a specific city, about stock prices, and even about where specific locations are within their neighborhood. These queries require the LLM to produce code that invokes external APIs to answer the user's question, yet LLMs rarely produce correct code on the fir…

    Submitted 3 October, 2023; originally announced October 2023.

  16. arXiv:2310.02842  [pdf, other]

    cs.CL cs.AI

    Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation

    Authors: Chen Dun, Mirian Hipolito Garcia, Guoqing Zheng, Ahmed Hassan Awadallah, Anastasios Kyrillidis, Robert Sim

    Abstract: Large Language Models (LLMs) have the ability to solve a variety of tasks, such as text summarization and mathematical questions, just out of the box, but they are often trained with a single task in mind. Due to high computational costs, the current trend is to use prompt instruction tuning to better adjust monolithic, pretrained LLMs for new -- but often individual -- downstream tasks. Thus, how…

    Submitted 5 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

  17. arXiv:2310.02263  [pdf, other]

    cs.CL cs.AI cs.LG

    Automatic Pair Construction for Contrastive Post-training

    Authors: Canwen Xu, Corby Rosset, Ethan C. Chau, Luciano Del Corro, Shweti Mahajan, Julian McAuley, Jennifer Neville, Ahmed Hassan Awadallah, Nikhil Rao

    Abstract: Alignment serves as an important step to steer large language models (LLMs) towards human preferences. In this paper, we propose an automatic way to construct contrastive data for LLM, using preference pairs from multiple models of varying strengths (e.g., InstructGPT, ChatGPT and GPT-4). We compare the contrastive techniques of SLiC and DPO to SFT baselines and find that DPO provides a step-funct…

    Submitted 2 April, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: NAACL 2024 (Findings)

  18. arXiv:2308.08155  [pdf, other]

    cs.AI cs.CL

    AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation

    Authors: Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W White, Doug Burger, Chi Wang

    Abstract: AutoGen is an open-source framework that allows developers to build LLM applications via multiple agents that can converse with each other to accomplish tasks. AutoGen agents are customizable, conversable, and can operate in various modes that employ combinations of LLMs, human inputs, and tools. Using AutoGen, developers can also flexibly define agent interaction behaviors. Both natural language…

    Submitted 3 October, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: 43 pages (10 pages for the main text, 3 pages for references, and 30 pages for appendices)

  19. arXiv:2307.02628  [pdf, other]

    cs.CL

    SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference

    Authors: Luciano Del Corro, Allie Del Giorno, Sahaj Agarwal, Bin Yu, Ahmed Awadallah, Subhabrata Mukherjee

    Abstract: Autoregressive large language models (LLMs) have made remarkable progress in various natural language generation tasks. However, they incur high computation cost and latency resulting from the autoregressive token-by-token generation. To address this issue, several approaches have been proposed to reduce computational cost using early-exit strategies. These strategies enable faster text generation…

    Submitted 5 July, 2023; originally announced July 2023.

  20. arXiv:2306.08586  [pdf, other]

    cs.LG cs.AI math.OC

    FedJETs: Efficient Just-In-Time Personalization with Federated Mixture of Experts

    Authors: Chen Dun, Mirian Hipolito Garcia, Guoqing Zheng, Ahmed Hassan Awadallah, Robert Sim, Anastasios Kyrillidis, Dimitrios Dimitriadis

    Abstract: One of the goals in Federated Learning (FL) is to create personalized models that can adapt to the context of each participating client, while utilizing knowledge from a shared global model. Yet, often, personalization requires a fine-tuning step using clients' labeled data in order to achieve good performance. This may not be feasible in scenarios where incoming clients are fresh and/or have priv…

    Submitted 4 October, 2023; v1 submitted 14 June, 2023; originally announced June 2023.

    Comments: 19 Pages

  21. arXiv:2306.02707  [pdf, other]

    cs.CL cs.LG

    Orca: Progressive Learning from Complex Explanation Traces of GPT-4

    Authors: Subhabrata Mukherjee, Arindam Mitra, Ganesh Jawahar, Sahaj Agarwal, Hamid Palangi, Ahmed Awadallah

    Abstract: Recent research has focused on enhancing the capability of smaller models through imitation learning, drawing on the outputs generated by large foundation models (LFMs). A number of issues impact the quality of these models, ranging from limited imitation signals from shallow LFM outputs; small scale homogeneous training data; and most notably a lack of rigorous evaluation resulting in overestimat…

    Submitted 5 June, 2023; originally announced June 2023.

  22. arXiv:2305.14676  [pdf, other]

    cs.CL

    GRILL: Grounded Vision-language Pre-training via Aligning Text and Image Regions

    Authors: Woojeong Jin, Subhabrata Mukherjee, Yu Cheng, Yelong Shen, Weizhu Chen, Ahmed Hassan Awadallah, Damien Jose, Xiang Ren

    Abstract: Generalization to unseen tasks is an important ability for few-shot learners to achieve better zero-/few-shot performance on diverse tasks. However, such generalization to vision-language tasks including grounding and generation tasks has been under-explored; existing few-shot VL models struggle to handle tasks that involve object grounding and multiple images such as visual commonsense reasoning…

    Submitted 23 May, 2023; originally announced May 2023.

    Comments: Preprint

  23. arXiv:2305.10783  [pdf, other]

    cs.AI

    Transforming Human-Centered AI Collaboration: Redefining Embodied Agents Capabilities through Interactive Grounded Language Instructions

    Authors: Shrestha Mohanty, Negar Arabzadeh, Julia Kiseleva, Artem Zholus, Milagro Teruel, Ahmed Awadallah, Yuxuan Sun, Kavya Srinet, Arthur Szlam

    Abstract: Human intelligence's adaptability is remarkable, allowing us to adjust to new tasks and multi-modal environments swiftly. This skill is evident from a young age as we acquire new abilities and solve problems by imitating others or following natural language instructions. The research community is actively pursuing the development of interactive "embodied agents" that can engage in natural conversa…

    Submitted 18 May, 2023; originally announced May 2023.

  24. arXiv:2304.10750  [pdf, other]

    cs.CL cs.AI

    Improving Grounded Language Understanding in a Collaborative Environment by Interacting with Agents Through Help Feedback

    Authors: Nikhil Mehta, Milagro Teruel, Patricio Figueroa Sanz, Xin Deng, Ahmed Hassan Awadallah, Julia Kiseleva

    Abstract: Many approaches to Natural Language Processing (NLP) tasks often treat them as single-step problems, where an agent receives an instruction, executes it, and is evaluated based on the final outcome. However, human language is inherently interactive, as evidenced by the back-and-forth nature of human conversations. In light of this, we posit that human-AI collaboration should also be interactive, w…

    Submitted 5 February, 2024; v1 submitted 21 April, 2023; originally announced April 2023.

    Comments: Findings of EACL 2024

  25. arXiv:2303.04673  [pdf, other]

    cs.CL cs.AI cs.LG

    Cost-Effective Hyperparameter Optimization for Large Language Model Generation Inference

    Authors: Chi Wang, Susan Xueqing Liu, Ahmed H. Awadallah

    Abstract: Large Language Models (LLMs) have sparked significant interest in their generative capabilities, leading to the development of various commercial applications. The high cost of using the models drives application builders to maximize the value of generation under a limited inference budget. This paper presents a study of optimizing inference hyperparameters such as the number of responses, tempera…

    Submitted 8 August, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

  26. arXiv:2302.07704  [pdf, other]

    cond-mat.mtrl-sci cond-mat.mes-hall

    Spin-strain coupling in nanodiamonds

    Authors: Asad Awadallah, Inbar Zohar, Amit Finkler

    Abstract: Fluorescent nanodiamonds have been used to a large extent in various biological systems due to their robust nature, inert properties and the relative ease of modifying their surface for attachment to different functional groups. Within a given batch, however, each nanodiamond is indistinguishable from its neighbors and so far one could only rely on fluorescence statistics for some global informati…

    Submitted 15 February, 2023; originally announced February 2023.

    Comments: SI with interesting qrcode is available at https://www.dropbox.com/s/5gjwfegiydxr5ig/SI.pdf

    Journal ref: J. Appl. Phys. 133, 145103 (2023)

  27. arXiv:2301.09211  [pdf, other]

    cs.CL cs.AI

    An Empirical Study of Metrics to Measure Representational Harms in Pre-Trained Language Models

    Authors: Saghar Hosseini, Hamid Palangi, Ahmed Hassan Awadallah

    Abstract: Large-scale Pre-Trained Language Models (PTLMs) capture knowledge from massive human-written data which contains latent societal biases and toxic contents. In this paper, we leverage the primary task of PTLMs, i.e., language modeling, and propose a new metric to quantify manifested implicit representational harms in PTLMs towards 13 marginalized demographics. Using this metric, we conducted an emp…

    Submitted 22 January, 2023; originally announced January 2023.

    Comments: 17 pages

    ACM Class: I.2.7

  28. arXiv:2212.09968  [pdf, other]

    cs.CL

    On Improving Summarization Factual Consistency from Natural Language Feedback

    Authors: Yixin Liu, Budhaditya Deb, Milagro Teruel, Aaron Halfaker, Dragomir Radev, Ahmed H. Awadallah

    Abstract: Despite the recent progress in language generation models, their outputs may not always meet user expectations. In this work, we study whether informational feedback in natural language can be leveraged to improve generation quality and user preference alignment. To this end, we consider factual consistency in summarization, the quality that the summary should only contain information supported by…

    Submitted 16 October, 2023; v1 submitted 19 December, 2022; originally announced December 2022.

    Comments: ACL 2023 Camera Ready, GitHub Repo: https://github.com/microsoft/DeFacto

  29. arXiv:2211.00688  [pdf, other]

    cs.AI cs.CL

    Learning to Solve Voxel Building Embodied Tasks from Pixels and Natural Language Instructions

    Authors: Alexey Skrynnik, Zoya Volovikova, Marc-Alexandre Côté, Anton Voronov, Artem Zholus, Negar Arabzadeh, Shrestha Mohanty, Milagro Teruel, Ahmed Awadallah, Aleksandr Panov, Mikhail Burtsev, Julia Kiseleva

    Abstract: The adoption of pre-trained language models to generate action plans for embodied agents is a promising research strategy. However, execution of instructions in real or simulated environments requires verification of the feasibility of actions as well as their relevance to the completion of a goal. We propose a new method that combines a language model and reinforcement learning for the task of bu…

    Submitted 1 November, 2022; originally announced November 2022.

    Comments: 6 pages, 3 figures

  30. arXiv:2210.17451   

    cs.CL cs.AI cs.LG

    AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning

    Authors: Yaqing Wang, Sahaj Agarwal, Subhabrata Mukherjee, Xiaodong Liu, Jing Gao, Ahmed Hassan Awadallah, Jianfeng Gao

    Abstract: Standard fine-tuning of large pre-trained language models (PLMs) for downstream tasks requires updating hundreds of millions to billions of parameters, and storing a large copy of the PLM weights for every task resulting in increased cost for storing, sharing and serving the models. To address this, parameter-efficient fine-tuning (PEFT) techniques were introduced where small trainable components…

    Submitted 1 November, 2022; v1 submitted 31 October, 2022; originally announced October 2022.

    Comments: The paper is withdrawn to avoid a duplicate version of arXiv article 2205.12410. We will include new content as an updated version

  31. arXiv:2210.11617  [pdf, other]

    cs.CL cs.LG

    Boosting Natural Language Generation from Instructions with Meta-Learning

    Authors: Budhaditya Deb, Guoqing Zheng, Ahmed Hassan Awadallah

    Abstract: Recent work has shown that language models (LMs) trained with multi-task instructional learning (MTIL) can solve diverse NLP tasks in zero- and few-shot settings with improved performance compared to prompt tuning. MTIL illustrates that LMs can extract and use information about the task from instructions beyond the surface patterns of the inputs and outputs. This suggests that meta-learni…

    Submitted 20 October, 2022; originally announced October 2022.

  32. arXiv:2210.07535  [pdf, other]

    cs.CL cs.LG

    AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation

    Authors: Ganesh Jawahar, Subhabrata Mukherjee, Xiaodong Liu, Young Jin Kim, Muhammad Abdul-Mageed, Laks V. S. Lakshmanan, Ahmed Hassan Awadallah, Sebastien Bubeck, Jianfeng Gao

    Abstract: Mixture-of-Expert (MoE) models have obtained state-of-the-art performance in Neural Machine Translation (NMT) tasks. Existing works in MoE mostly consider a homogeneous design where the same number of experts of the same size are placed uniformly throughout the network. Furthermore, existing MoE works do not consider computational constraints (e.g., FLOPs, latency) to guide their design. To this e…

    Submitted 7 June, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: ACL 2023 Findings

  33. arXiv:2208.11290  [pdf, other]

    cs.LG cs.CR

    ADMoE: Anomaly Detection with Mixture-of-Experts from Noisy Labels

    Authors: Yue Zhao, Guoqing Zheng, Subhabrata Mukherjee, Robert McCann, Ahmed Awadallah

    Abstract: Existing works on anomaly detection (AD) rely on clean labels from human annotators that are expensive to acquire in practice. In this work, we propose a method to leverage weak/noisy labels (e.g., risk scores generated by machine rules for detecting malware) that are cheaper to obtain for anomaly detection. Specifically, we propose ADMoE, the first framework for anomaly detection algorithms to le…

    Submitted 22 November, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

    Comments: AAAI 2023

  34. arXiv:2205.13771  [pdf, other]

    cs.CL

    IGLU 2022: Interactive Grounded Language Understanding in a Collaborative Environment at NeurIPS 2022

    Authors: Julia Kiseleva, Alexey Skrynnik, Artem Zholus, Shrestha Mohanty, Negar Arabzadeh, Marc-Alexandre Côté, Mohammad Aliannejadi, Milagro Teruel, Ziming Li, Mikhail Burtsev, Maartje ter Hoeve, Zoya Volovikova, Aleksandr Panov, Yuxuan Sun, Kavya Srinet, Arthur Szlam, Ahmed Awadallah

    Abstract: Human intelligence has the remarkable ability to adapt to new tasks and environments quickly. Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions. To facilitate research in this direction, we propose IGLU: Interactive Grounded Language Understanding in a Collabor…

    Submitted 27 May, 2022; originally announced May 2022.

    Comments: arXiv admin note: text overlap with arXiv:2110.06536

  35. arXiv:2205.12476  [pdf, other]

    cs.CL

    Leveraging Locality in Abstractive Text Summarization

    Authors: Yixin Liu, Ansong Ni, Linyong Nan, Budhaditya Deb, Chenguang Zhu, Ahmed H. Awadallah, Dragomir Radev

    Abstract: Neural attention models have achieved significant improvements on many natural language processing tasks. However, the quadratic memory complexity of the self-attention module with respect to the input length hinders their applications in long text summarization. Instead of designing more efficient attention modules, we approach this problem by investigating if models with a restricted context can…

    Submitted 30 October, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: Accepted to EMNLP 2022

  36. arXiv:2205.12410  [pdf, other]

    cs.CL cs.AI cs.LG

    AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning

    Authors: Yaqing Wang, Sahaj Agarwal, Subhabrata Mukherjee, Xiaodong Liu, Jing Gao, Ahmed Hassan Awadallah, Jianfeng Gao

    Abstract: Standard fine-tuning of large pre-trained language models (PLMs) for downstream tasks requires updating hundreds of millions to billions of parameters, and storing a large copy of the PLM weights for every task resulting in increased cost for storing, sharing and serving the models. To address this, parameter-efficient fine-tuning (PEFT) techniques were introduced where small trainable components…

    Submitted 1 November, 2022; v1 submitted 24 May, 2022; originally announced May 2022.

    Comments: Accepted by EMNLP 2022

  37. arXiv:2205.02388  [pdf, other]

    cs.CL cs.AI

    Interactive Grounded Language Understanding in a Collaborative Environment: IGLU 2021

    Authors: Julia Kiseleva, Ziming Li, Mohammad Aliannejadi, Shrestha Mohanty, Maartje ter Hoeve, Mikhail Burtsev, Alexey Skrynnik, Artem Zholus, Aleksandr Panov, Kavya Srinet, Arthur Szlam, Yuxuan Sun, Marc-Alexandre Côté, Katja Hofmann, Ahmed Awadallah, Linar Abdrazakov, Igor Churin, Putra Manggala, Kata Naszadi, Michiel van der Meer, Taewoon Kim

    Abstract: Human intelligence has the remarkable ability to quickly adapt to new tasks and environments. Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions. To facilitate research in this direction, we propose IGLU: Interactive Grounded Language Understanding in a Co…

    Submitted 27 May, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2110.06536

    Journal ref: Proceedings of Machine Learning Research NeurIPS 2021 Competition and Demonstration Track

  38. arXiv:2205.02370  [pdf, other]

    cs.CL cs.AI

    PREME: Preference-based Meeting Exploration through an Interactive Questionnaire

    Authors: Negar Arabzadeh, Ali Ahmadvand, Julia Kiseleva, Yang Liu, Ahmed Hassan Awadallah, Ming Zhong, Milad Shokouhi

    Abstract: The recent increase in the volume of online meetings necessitates automated tools for managing and organizing the material, especially when an attendee has missed the discussion and needs assistance in quickly exploring it. In this work, we propose a novel end-to-end framework for generating interactive questionnaires for preference-based meeting exploration. As a result, users are supplied with a…

    Submitted 26 April, 2023; v1 submitted 4 May, 2022; originally announced May 2022.

    Journal ref: EACL 2023

  39. arXiv:2204.08039  [pdf, other]

    cs.CL

    Pathologies of Pre-trained Language Models in Few-shot Fine-tuning

    Authors: Hanjie Chen, Guoqing Zheng, Ahmed Hassan Awadallah, Yangfeng Ji

    Abstract: Although adapting pre-trained language models with few examples has shown promising performance on text classification, there is a lack of understanding of where the performance gain comes from. In this work, we propose to answer this question by interpreting the adaptation behavior using post-hoc explanations from model predictions. By modeling feature statistics of explanations, we discover that…

    Submitted 17 April, 2022; originally announced April 2022.

    Comments: ACL 2022 Workshop on Insights from Negative Results in NLP

  40. arXiv:2204.07689  [pdf, other]

    cs.LG cs.CL

    Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners

    Authors: Shashank Gupta, Subhabrata Mukherjee, Krishan Subudhi, Eduardo Gonzalez, Damien Jose, Ahmed H. Awadallah, Jianfeng Gao

    Abstract: Traditional multi-task learning (MTL) methods use dense networks that use the same set of shared weights across several different tasks. This often creates interference where two or more tasks compete to pull model parameters in different directions. In this work, we study whether sparsely activated Mixture-of-Experts (MoE) improve multi-task learning by specializing some weights for learning shar…

    Submitted 15 April, 2022; originally announced April 2022.

  41. arXiv:2204.03084  [pdf, other]

    cs.CL cs.AI cs.LG

    Knowledge Infused Decoding

    Authors: Ruibo Liu, Guoqing Zheng, Shashank Gupta, Radhika Gaonkar, Chongyang Gao, Soroush Vosoughi, Milad Shokouhi, Ahmed Hassan Awadallah

    Abstract: Pre-trained language models (LMs) have been shown to memorize a substantial amount of knowledge from the pre-training corpora; however, they are still limited in recalling factually correct knowledge given a certain context. Hence, they tend to suffer from counterfactual or hallucinatory generation when used in knowledge-intensive natural language generation (NLG) tasks. Recent remedies to this pr…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: In ICLR 2022

  42. arXiv:2203.06345  [pdf, other]

    cs.LG cs.CV

    The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy

    Authors: Tianlong Chen, Zhenyu Zhang, Yu Cheng, Ahmed Awadallah, Zhangyang Wang

    Abstract: Vision transformers (ViTs) have gained increasing popularity as they are commonly believed to own higher modeling capacity and representation flexibility, than traditional convolutional networks. However, it is questionable whether such potential has been fully unleashed in practice, as the learned ViTs often suffer from over-smoothening, yielding likely redundant models. Recent works made prelimi…

    Submitted 11 March, 2022; originally announced March 2022.

  43. arXiv:2201.12507  [pdf, other]

    cs.CL

    AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models

    Authors: Dongkuan Xu, Subhabrata Mukherjee, Xiaodong Liu, Debadeepta Dey, Wenhui Wang, Xiang Zhang, Ahmed Hassan Awadallah, Jianfeng Gao

    Abstract: Knowledge distillation (KD) methods compress large models into smaller students with manually-designed student architectures given pre-specified computational cost. This requires several trials to find a viable student, and further repeating the process for each student or computational budget change. We use Neural Architecture Search (NAS) to automatically distill several compressed students with…

    Submitted 19 February, 2022; v1 submitted 29 January, 2022; originally announced January 2022.

    Comments: 15 pages, 4 figures, 10 tables

  44. arXiv:2112.05209  [pdf, other]

    cs.CL

    Compositional Generalization for Natural Language Interfaces to Web APIs

    Authors: Saghar Hosseini, Ahmed Hassan Awadallah, Yu Su

    Abstract: This paper presents Okapi, a new dataset for Natural Language to executable web Application Programming Interfaces (NL2API). This dataset is in English and contains 22,508 questions and 9,019 unique API calls, covering three domains. We define new compositional generalization tasks for NL2API which explore the models' ability to extrapolate from simple API calls in the training set to new and more…

    Submitted 9 December, 2021; originally announced December 2021.

  45. arXiv:2111.02840  [pdf, other]

    cs.CL cs.CR cs.LG

    Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models

    Authors: Boxin Wang, Chejian Xu, Shuohang Wang, Zhe Gan, Yu Cheng, Jianfeng Gao, Ahmed Hassan Awadallah, Bo Li

    Abstract: Large-scale pre-trained language models have achieved tremendous success across a wide range of natural language understanding (NLU) tasks, even surpassing human performance. However, recent studies reveal that the robustness of these models can be challenged by carefully crafted textual adversarial examples. While several individual datasets have been proposed to evaluate model robustness, a prin…

    Submitted 10 January, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

    Comments: Oral Presentation in NeurIPS 2021 (Datasets and Benchmarks Track). 24 pages, 4 figures, 12 tables

  46. arXiv:2111.02570  [pdf, other]

    cs.CL cs.LG

    CLUES: Few-Shot Learning Evaluation in Natural Language Understanding

    Authors: Subhabrata Mukherjee, Xiaodong Liu, Guoqing Zheng, Saghar Hosseini, Hao Cheng, Greg Yang, Christopher Meek, Ahmed Hassan Awadallah, Jianfeng Gao

    Abstract: Most recent progress in natural language understanding (NLU) has been driven, in part, by benchmarks such as GLUE, SuperGLUE, SQuAD, etc. In fact, many NLU models have now matched or exceeded "human-level" performance on many tasks in these benchmarks. Most of these benchmarks, however, give models access to relatively large amounts of labeled data for training. As such, the models are provided fa…

    Submitted 3 November, 2021; originally announced November 2021.

    Comments: NeurIPS 2021 Datasets and Benchmarks Track

  47. arXiv:2111.00160  [pdf, other]

    cs.LG cs.CL

    DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models

    Authors: Xuxi Chen, Tianlong Chen, Weizhu Chen, Ahmed Hassan Awadallah, Zhangyang Wang, Yu Cheng

    Abstract: Gigantic pre-trained models have become central to natural language processing (NLP), serving as the starting point for fine-tuning towards a range of downstream tasks. However, two pain points persist for this paradigm: (a) as the pre-trained models grow bigger (e.g., 175B parameters for GPT-3), even the fine-tuning process can be time-consuming and computationally expensive; (b) the fine-tuned m…

    Submitted 23 May, 2023; v1 submitted 29 October, 2021; originally announced November 2021.

    Comments: Accepted by ACL 2023

  48. arXiv:2110.10150  [pdf, other]

    cs.CL

    Summ^N: A Multi-Stage Summarization Framework for Long Input Dialogues and Documents

    Authors: Yusen Zhang, Ansong Ni, Ziming Mao, Chen Henry Wu, Chenguang Zhu, Budhaditya Deb, Ahmed H. Awadallah, Dragomir Radev, Rui Zhang

    Abstract: Text summarization helps readers capture salient information from documents, news, interviews, and meetings. However, most state-of-the-art pretrained language models (LM) are unable to efficiently process long text for many summarization tasks. In this paper, we propose Summ$^N$, a simple, flexible, and effective multi-stage framework for input texts that are longer than the maximum context lengt…

    Submitted 13 April, 2022; v1 submitted 16 October, 2021; originally announced October 2021.

    Comments: ACL 2022

  49. arXiv:2110.08419  [pdf, other]

    cs.CL cs.LG

    Robustness Challenges in Model Distillation and Pruning for Natural Language Understanding

    Authors: Mengnan Du, Subhabrata Mukherjee, Yu Cheng, Milad Shokouhi, Xia Hu, Ahmed Hassan Awadallah

    Abstract: Recent work has focused on compressing pre-trained language models (PLMs) like BERT where the major focus has been to improve the in-distribution performance for downstream tasks. However, very few of these studies have analyzed the impact of compression on the generalizability and robustness of compressed models for out-of-distribution (OOD) data. Towards this end, we study two popular model comp…

    Submitted 26 February, 2023; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: Accepted by EACL 2023

  50. arXiv:2110.08168  [pdf, other]

    cs.CL

    DYLE: Dynamic Latent Extraction for Abstractive Long-Input Summarization

    Authors: Ziming Mao, Chen Henry Wu, Ansong Ni, Yusen Zhang, Rui Zhang, Tao Yu, Budhaditya Deb, Chenguang Zhu, Ahmed H. Awadallah, Dragomir Radev

    Abstract: Transformer-based models have achieved state-of-the-art performance on short-input summarization. However, they still struggle with summarizing longer text. In this paper, we present DYLE, a novel dynamic latent extraction approach for abstractive long-input summarization. DYLE jointly trains an extractor and a generator and treats the extracted text snippets as the latent variable, allowing dynam…

    Submitted 24 April, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: ACL 2022