Showing 1–15 of 15 results for author: Chowdhury, J R

  1. arXiv:2409.01531  [pdf, ps, other]

    cs.LG cs.AI

    On the Design Space Between Transformers and Recursive Neural Nets

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: In this paper, we study two classes of models, Recursive Neural Networks (RvNNs) and Transformers, and show that a tight connection between them emerges from the recent development of two recent models - Continuous Recursive Neural Networks (CRvNN) and Neural Data Routers (NDR). On one hand, CRvNN pushes the boundaries of traditional RvNN, relaxing its discrete structure-wise composition and ends…

    Submitted 2 September, 2024; originally announced September 2024.

  2. arXiv:2402.00976  [pdf, ps, other]

    cs.LG cs.AI cs.NE

    Investigating Recurrent Transformers with Dynamic Halt

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: In this paper, we comprehensively study the inductive biases of two major approaches to augmenting Transformers with a recurrent mechanism: (1) the approach of incorporating a depth-wise recurrence similar to Universal Transformers; and (2) the approach of incorporating a chunk-wise temporal recurrence like Temporal Latent Bottleneck. Furthermore, we propose and investigate novel ways to extend an…

    Submitted 2 September, 2024; v1 submitted 1 February, 2024; originally announced February 2024.

  3. arXiv:2311.04449  [pdf, other]

    cs.LG cs.CL

    Recursion in Recursion: Two-Level Nested Recursion for Length Generalization with Scalability

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: Binary Balanced Tree RvNNs (BBT-RvNNs) enforce sequence composition according to a preset balanced binary tree structure. Thus, their non-linear recursion depth is just $\log_2 n$ ($n$ being the sequence length). Such logarithmic scaling makes BBT-RvNNs efficient and scalable on long sequence tasks such as Long Range Arena (LRA). However, such computational efficiency comes at a cost because BBT-R…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: Accepted at NeurIPS 2023

  4. arXiv:2307.10779  [pdf, other]

    cs.LG

    Efficient Beam Tree Recursion

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: Beam Tree Recursive Neural Network (BT-RvNN) was recently proposed as a simple extension of Gumbel Tree RvNN and it was shown to achieve state-of-the-art length generalization performance in ListOps while maintaining comparable performance on other tasks. However, although not the worst in its kind, BT-RvNN can be still exorbitantly expensive in memory usage. In this paper, we identify the main bo…

    Submitted 7 November, 2023; v1 submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted in NeurIPS 2023

  5. arXiv:2305.20019  [pdf, ps, other]

    cs.LG cs.AI cs.CL

    Monotonic Location Attention for Length Generalization

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: We explore different ways to utilize position-based cross-attention in seq2seq networks to enable length generalization in algorithmic tasks. We show that a simple approach of interpolating the original and reversed encoded representations combined with relative attention allows near-perfect length generalization for both forward and reverse lookup tasks or copy tasks that had been generally hard…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted in ICML 2023

  6. arXiv:2305.19999  [pdf, other]

    cs.LG cs.AI cs.CL

    Beam Tree Recursive Cells

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: We propose Beam Tree Recursive Cell (BT-Cell) - a backpropagation-friendly framework to extend Recursive Neural Networks (RvNNs) with beam search for latent structure induction. We further extend this framework by proposing a relaxation of the hard top-k operators in beam search for better propagation of gradient signals. We evaluate our proposed models in different out-of-distribution splits in b…

    Submitted 20 June, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted in ICML 2023

  7. arXiv:2305.17968  [pdf, other]

    cs.CL

    Data Augmentation for Low-Resource Keyphrase Generation

    Authors: Krishna Garg, Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: Keyphrase generation is the task of summarizing the contents of any given article into a few salient phrases (or keyphrases). Existing works for the task mostly rely on large-scale annotated datasets, which are not easy to acquire. Very few works address the problem of keyphrase generation in low-resource settings, but they still rely on a lot of additional unlabeled data for pretraining and on au…

    Submitted 29 May, 2023; originally announced May 2023.

    Comments: 9 pages, 8 tables, To appear at the Findings of the Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics, Toronto, Canada

  8. arXiv:2304.13883  [pdf, other]

    cs.CL cs.IR

    Neural Keyphrase Generation: Analysis and Evaluation

    Authors: Tuhin Kundu, Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: Keyphrase generation aims at generating topical phrases from a given text either by copying from the original text (present keyphrases) or by producing new keyphrases (absent keyphrases) that capture the semantic meaning of the text. Encoder-decoder models are most widely used for this task because of their capabilities for absent keyphrase generation. However, there has been little to no analysis…

    Submitted 26 April, 2023; originally announced April 2023.

  9. arXiv:2203.04464  [pdf, other]

    cs.CL

    On the Evaluation of Answer-Agnostic Paragraph-level Multi-Question Generation

    Authors: Jishnu Ray Chowdhury, Debanjan Mahata, Cornelia Caragea

    Abstract: We study the task of predicting a set of salient questions from a given paragraph without any prior knowledge of the precise answer. We make two main contributions. First, we propose a new method to evaluate a set of predicted questions against the set of references by using the Hungarian algorithm to assign predicted questions to references before scoring the assigned pairs. We show that our prop…

    Submitted 11 March, 2022; v1 submitted 8 March, 2022; originally announced March 2022.

  10. arXiv:2202.00535  [pdf, ps, other]

    cs.CL cs.AI

    Novelty Controlled Paraphrase Generation with Retrieval Augmented Conditional Prompt Tuning

    Authors: Jishnu Ray Chowdhury, Yong Zhuang, Shuyi Wang

    Abstract: Paraphrase generation is a fundamental and long-standing task in natural language processing. In this paper, we concentrate on two contributions to the task: (1) we propose Retrieval Augmented Prompt Tuning (RAPT) as a parameter-efficient method to adapt large pre-trained language models for paraphrase generation; (2) we propose Novelty Conditioned RAPT (NC-RAPT) as a simple model-agnostic method…

    Submitted 12 March, 2022; v1 submitted 1 February, 2022; originally announced February 2022.

    Comments: Accepted by AAAI 2022 (Oral)

  11. arXiv:2112.06776  [pdf, other]

    cs.CL

    Keyphrase Generation Beyond the Boundaries of Title and Abstract

    Authors: Krishna Garg, Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: Keyphrase generation aims at generating important phrases (keyphrases) that best describe a given document. In scholarly domains, current approaches have largely used only the title and abstract of the articles to generate keyphrases. In this paper, we comprehensively explore whether the integration of additional information from the full text of a given article or from semantically similar articl…

    Submitted 20 October, 2022; v1 submitted 13 December, 2021; originally announced December 2021.

    Comments: 9 pages, 1 figure, 7 tables

  12. arXiv:2112.01476  [pdf, other]

    cs.CL

    KPDrop: Improving Absent Keyphrase Generation

    Authors: Jishnu Ray Chowdhury, Seoyeon Park, Tuhin Kundu, Cornelia Caragea

    Abstract: Keyphrase generation is the task of generating phrases (keyphrases) that summarize the main topics of a given document. Keyphrases can be either present or absent from the given document. While the extraction of present keyphrases has received much attention in the past, only recently a stronger focus has been placed on the generation of absent keyphrases. However, generating absent keyphrases is…

    Submitted 24 October, 2022; v1 submitted 2 December, 2021; originally announced December 2021.

    Comments: Accepted in EMNLP Findings 2022

  13. arXiv:2106.06038  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Modeling Hierarchical Structures with Continuous Recursive Neural Networks

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea

    Abstract: Recursive Neural Networks (RvNNs), which compose sequences according to their underlying hierarchical syntactic structure, have performed well in several natural language processing tasks compared to similar models without structural biases. However, traditional RvNNs are incapable of inducing the latent structure in a plain text sequence on their own. Several extensions have been proposed to over…

    Submitted 10 June, 2021; originally announced June 2021.

    Comments: Accepted in ICML 2021 (long talk)

  14. arXiv:2001.01323  [pdf, other]

    cs.IR cs.CL cs.LG

    On Identifying Hashtags in Disaster Twitter Data

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea, Doina Caragea

    Abstract: Tweet hashtags have the potential to improve the search for information during disaster events. However, there is a large number of disaster-related tweets that do not have any user-provided hashtags. Moreover, only a small number of tweets that contain actionable hashtags are useful for disaster response. To facilitate progress on automatic identification (or extraction) of disaster hashtags for…

    Submitted 5 January, 2020; originally announced January 2020.

  15. arXiv:1910.07897  [pdf, other]

    cs.IR cs.CL cs.LG

    Keyphrase Extraction from Disaster-related Tweets

    Authors: Jishnu Ray Chowdhury, Cornelia Caragea, Doina Caragea

    Abstract: While keyphrase extraction has received considerable attention in recent years, relatively few studies exist on extracting keyphrases from social media platforms such as Twitter, and even fewer for extracting disaster-related keyphrases from such sources. During a disaster, keyphrases can be extremely useful for filtering relevant tweets that can enhance situational awareness. Previously, joint tr…

    Submitted 17 October, 2019; originally announced October 2019.

    Comments: 12 pages, 7 figures

    Journal ref: In The World Wide Web Conference (WWW '19), Ling Liu and Ryen White (Eds.). ACM, New York, NY, USA, 1555-1566 (2019)