Showing 1–16 of 16 results for author: Ilhan, F

  1. arXiv:2410.03953  [pdf, other]

    cs.CL cs.LG

    LLM-TOPLA: Efficient LLM Ensemble by Maximising Diversity

    Authors: Selim Furkan Tekin, Fatih Ilhan, Tiansheng Huang, Sihao Hu, Ling Liu

    Abstract: Combining large language models during training or at inference time has shown substantial performance gain over component LLMs. This paper presents LLM-TOPLA, a diversity-optimized LLM ensemble method with three unique properties: (i) We introduce the focal diversity metric to capture the diversity-performance correlation among component LLMs of an ensemble. (ii) We develop a diversity-optimized…

    Submitted 4 October, 2024; originally announced October 2024.

  2. arXiv:2409.18169  [pdf, other]

    cs.CR cs.AI cs.LG

    Harmful Fine-tuning Attacks and Defenses for Large Language Models: A Survey

    Authors: Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Ling Liu

    Abstract: Recent research demonstrates that the nascent fine-tuning-as-a-service business model exposes serious safety concerns -- fine-tuning over a few harmful data uploaded by the users can compromise the safety alignment of the model. The attack, known as harmful fine-tuning, has raised a broad research interest among the community. However, as the attack is still new, we observe from our misera…

    Submitted 21 October, 2024; v1 submitted 26 September, 2024; originally announced September 2024.

  3. arXiv:2409.01586  [pdf, other]

    cs.CL cs.AI

    Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation

    Authors: Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Ling Liu

    Abstract: Harmful fine-tuning issue \citep{qi2023fine} poses serious safety concerns for Large language models' fine-tuning-as-a-service. While existing defenses \citep{huang2024vaccine,rosati2024representation} have been proposed to mitigate the issue, their performances are still far away from satisfactory, and the root cause of the problem has not been fully recovered. For the first time in the literatur…

    Submitted 18 September, 2024; v1 submitted 2 September, 2024; originally announced September 2024.

  4. arXiv:2405.18641  [pdf, other]

    cs.LG

    Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning

    Authors: Tiansheng Huang, Sihao Hu, Fatih Ilhan, Selim Furkan Tekin, Ling Liu

    Abstract: Recent studies show that Large Language Models (LLMs) with safety alignment can be jail-broken by fine-tuning on a dataset mixed with harmful data. For the first time in the literature, we show that the jail-broken effect can be mitigated by separating states in the finetuning stage to optimize the alignment and user datasets. Unfortunately, our subsequent study shows that this simple Bi-State Optimizatio…

    Submitted 26 June, 2024; v1 submitted 28 May, 2024; originally announced May 2024.

  5. arXiv:2404.04434  [pdf, other]

    cs.CV cs.LG

    Robust Few-Shot Ensemble Learning with Focal Diversity-Based Pruning

    Authors: Selim Furkan Tekin, Fatih Ilhan, Tiansheng Huang, Sihao Hu, Ka-Ho Chow, Margaret L. Loper, Ling Liu

    Abstract: This paper presents FusionShot, a focal diversity optimized few-shot ensemble learning approach for boosting the robustness and generalization performance of pre-trained few-shot models. The paper makes three original contributions. First, we explore the unique characteristics of few-shot learning to ensemble multiple few-shot (FS) models by creating three alternative fusion channels. Second, we i…

    Submitted 5 April, 2024; originally announced April 2024.

  6. arXiv:2404.02039  [pdf, other]

    cs.AI

    A Survey on Large Language Model-Based Game Agents

    Authors: Sihao Hu, Tiansheng Huang, Fatih Ilhan, Selim Tekin, Gaowen Liu, Ramana Kompella, Ling Liu

    Abstract: The development of game agents holds a critical role in advancing towards Artificial General Intelligence (AGI). The progress of LLMs and their multimodal counterparts (MLLMs) offers an unprecedented opportunity to evolve and empower game agents with human-like decision-making capabilities in complex computer game environments. This paper provides a comprehensive overview of LLM-based game agents…

    Submitted 2 April, 2024; originally announced April 2024.

  7. arXiv:2310.01152  [pdf, other]

    cs.CR cs.AI

    Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives

    Authors: Sihao Hu, Tiansheng Huang, Fatih İlhan, Selim Furkan Tekin, Ling Liu

    Abstract: This paper provides a systematic analysis of the opportunities, challenges, and potential solutions of harnessing Large Language Models (LLMs) such as GPT-4 to dig out vulnerabilities within smart contracts based on our ongoing research. For the task of smart contract vulnerability detection, achieving practical usability hinges on identifying as many true vulnerabilities as possible while minimiz…

    Submitted 16 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

    Comments: 10 pages

    Journal ref: IEEE International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications 2023

  8. arXiv:2303.11511  [pdf, other]

    cs.CR cs.CV cs.LG

    STDLens: Model Hijacking-Resilient Federated Learning for Object Detection

    Authors: Ka-Ho Chow, Ling Liu, Wenqi Wei, Fatih Ilhan, Yanzhao Wu

    Abstract: Federated Learning (FL) has been gaining popularity as a collaborative learning framework to train deep learning-based object detection models over a distributed population of clients. Despite its advantages, FL is vulnerable to model hijacking. The attacker can control how the object detection system should misbehave by implanting Trojaned gradients using only a small number of compromised client…

    Submitted 19 May, 2023; v1 submitted 20 March, 2023; originally announced March 2023.

    Comments: CVPR 2023. Source Code: https://github.com/git-disl/STDLens

  9. arXiv:2301.07099  [pdf, other]

    cs.LG cs.AI

    Adaptive Deep Neural Network Inference Optimization with EENet

    Authors: Fatih Ilhan, Ka-Ho Chow, Sihao Hu, Tiansheng Huang, Selim Tekin, Wenqi Wei, Yanzhao Wu, Myungjin Lee, Ramana Kompella, Hugo Latapie, Gaowen Liu, Ling Liu

    Abstract: Well-trained deep neural networks (DNNs) treat all test samples equally during prediction. Adaptive DNN inference with early exiting leverages the observation that some test examples can be easier to predict than others. This paper presents EENet, a novel early-exiting scheduling framework for multi-exit DNN models. Instead of having every sample go through all DNN layers during prediction, EENet…

    Submitted 1 December, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

  10. arXiv:2006.10119  [pdf, other]

    cs.LG stat.ML

    Markovian RNN: An Adaptive Time Series Prediction Network with HMM-based Switching for Nonstationary Environments

    Authors: Fatih Ilhan, Oguzhan Karaahmetoglu, Ismail Balaban, Suleyman Serdar Kozat

    Abstract: We investigate nonlinear regression for nonstationary sequential data. In most real-life applications such as business domains including finance, retail, energy and economy, timeseries data exhibits nonstationarity due to the temporally varying dynamics of the underlying system. We introduce a novel recurrent neural network (RNN) architecture, which adaptively switches between internal regimes in…

    Submitted 17 June, 2020; originally announced June 2020.

    Comments: 11 pages

  11. arXiv:2005.12005  [pdf, other]

    stat.ML cs.LG

    Unsupervised Online Anomaly Detection On Irregularly Sampled Or Missing Valued Time-Series Data Using LSTM Networks

    Authors: Oguzhan Karaahmetoglu, Fatih Ilhan, Ismail Balaban, Suleyman Serdar Kozat

    Abstract: We study anomaly detection and introduce an algorithm that processes variable length, irregularly sampled sequences or sequences with missing values. Our algorithm is fully unsupervised but can be readily extended to supervised or semisupervised cases when the anomaly labels are present, as remarked throughout the paper. Our approach uses the Long Short Term Memory (LSTM) networks in order to…

    Submitted 25 May, 2020; originally announced May 2020.

    Comments: 11 pages

  12. arXiv:2005.08948  [pdf, other]

    cs.LG stat.ML

    Achieving Online Regression Performance of LSTMs with Simple RNNs

    Authors: N. Mert Vural, Fatih Ilhan, Selim F. Yilmaz, Salih Ergüt, Suleyman S. Kozat

    Abstract: Recurrent Neural Networks (RNNs) are widely used for online regression due to their ability to generalize nonlinear temporal dependencies. As an RNN model, Long-Short-Term-Memory Networks (LSTMs) are commonly preferred in practice, as these networks are capable of learning long-term dependencies while avoiding the vanishing gradient problem. However, due to their large number of parameters, traini…

    Submitted 31 May, 2021; v1 submitted 16 May, 2020; originally announced May 2020.

    Comments: arXiv admin note: substantial text overlap with arXiv:2003.03601

  13. Modeling of Spatio-Temporal Hawkes Processes with Randomized Kernels

    Authors: Fatih Ilhan, Suleyman Serdar Kozat

    Abstract: We investigate spatio-temporal event analysis using point processes. Inferring the dynamics of event sequences spatiotemporally has many practical applications including crime prediction, social media analysis, and traffic forecasting. In particular, we focus on spatio-temporal Hawkes processes that are commonly used due to their capability to capture excitations between event occurrences. We intr…

    Submitted 15 February, 2021; v1 submitted 7 March, 2020; originally announced March 2020.

  14. arXiv:2003.03601   

    cs.LG stat.ML

    RNN-based Online Learning: An Efficient First-Order Optimization Algorithm with a Convergence Guarantee

    Authors: N. Mert Vural, Selim F. Yilmaz, Fatih Ilhan, Suleyman S. Kozat

    Abstract: We investigate online nonlinear regression with continually running recurrent neural networks (RNNs), i.e., RNN-based online learning. For RNN-based online learning, we introduce an efficient first-order training algorithm that theoretically guarantees to converge to the optimum network parameters. Our algorithm is truly online such that it does not make any assumption on the learning envi…

    Submitted 31 May, 2021; v1 submitted 7 March, 2020; originally announced March 2020.

    Comments: This paper was an early draft of the presented results. We have written and published another paper (arXiv:2005.08948) where we have improved the material in this paper. The published paper covers most of the material presented in this paper as well. Therefore, we remove this paper from arXiv and kindly refer the interested readers to arXiv:2005.08948

  15. arXiv:1911.12258   

    cs.LG eess.SP stat.ML

    Stability of the Decoupled Extended Kalman Filter Learning Algorithm in LSTM-Based Online Learning

    Authors: Nuri Mert Vural, Fatih Ilhan, Suleyman S. Kozat

    Abstract: We investigate the convergence and stability properties of the decoupled extended Kalman filter learning algorithm (DEKF) within the long-short term memory network (LSTM) based online learning framework. For this purpose, we model DEKF as a perturbed extended Kalman filter and derive sufficient conditions for its stability during LSTM training. We show that if the perturbations -- introduced due t…

    Submitted 31 May, 2021; v1 submitted 25 November, 2019; originally announced November 2019.

    Comments: This paper was an early draft of the presented results. We have written and published another paper (arXiv:1911.12258) where we have improved on the material in this paper. The published paper covers most of the material presented in this paper as well. Therefore, we remove this paper from Arxiv and refer the interested readers to arXiv:1911.12258

  16. arXiv:1412.6161  [pdf, ps, other]

    eess.SY cs.PF nlin.AO

    Graph-Based Minimum Dwell Time and Average Dwell Time Computations for Discrete-Time Switched Linear Systems

    Authors: Ferruh İlhan, Özkan Karabacak

    Abstract: Discrete-time switched linear systems where switchings are governed by a digraph are considered. The minimum (or average) dwell time that guarantees the asymptotic stability can be computed by calculating the maximum cycle ratio (or maximum cycle mean) of a doubly weighted digraph where weights depend on the eigenvalues and eigenvectors of subsystem matrices. The graph-based method is applied to s…

    Submitted 19 October, 2017; v1 submitted 14 November, 2014; originally announced December 2014.

    Journal ref: Asian Journal of Control, 18, 2018-2026, 2016