Showing 1–50 of 56 results for author: Ozay, M

  1. arXiv:2410.08305  [pdf, other]

    cs.LG math.OC

    Randomized Asymmetric Chain of LoRA: The First Meaningful Theoretical Framework for Low-Rank Adaptation

    Authors: Grigory Malinovsky, Umberto Michieli, Hasan Abed Al Kader Hammoud, Taha Ceritli, Hayder Elesedy, Mete Ozay, Peter Richtárik

    Abstract: Fine-tuning has become a popular approach to adapting large foundational models to specific tasks. As the size of models and datasets grows, parameter-efficient fine-tuning techniques are increasingly important. One of the most widely used methods is Low-Rank Adaptation (LoRA), with adaptation update expressed as the product of two low-rank matrices. While LoRA was shown to possess strong performa…

    Submitted 10 October, 2024; originally announced October 2024.

    Comments: 36 pages, 4 figures, 2 algorithms

  2. arXiv:2407.07541  [pdf, other]

    cs.CV cs.AI cs.RO

    Swiss DINO: Efficient and Versatile Vision Framework for On-device Personal Object Search

    Authors: Kirill Paramonov, Jia-Xing Zhong, Umberto Michieli, Jijoong Moon, Mete Ozay

    Abstract: In this paper, we address a recent trend in robotic home appliances to include vision systems on personal devices, capable of personalizing the appliances on the fly. In particular, we formulate and address an important technical task of personal object search, which involves localization and identification of personal items of interest on images captured by robotic appliances, with each item refe…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 8 pages, 2 figures, accepted to IROS2024

  3. arXiv:2407.06450  [pdf, other]

    cs.CV

    Enhanced Model Robustness to Input Corruptions by Per-corruption Adaptation of Normalization Statistics

    Authors: Elena Camuffo, Umberto Michieli, Simone Milani, Jijoong Moon, Mete Ozay

    Abstract: Developing a reliable vision system is a fundamental challenge for robotic technologies (e.g., indoor service robots and outdoor autonomous robots) which can ensure reliable navigation even in challenging environments such as adverse weather conditions (e.g., fog, rain), poor lighting conditions (e.g., over/under exposure), or sensor degradation (e.g., blurring, noise), and can guarantee high perf…

    Submitted 8 July, 2024; originally announced July 2024.

    Journal ref: International Conference on Intelligent Robots and Systems (IROS), 2024

  4. arXiv:2407.02987  [pdf, other]

    cs.LG cs.AI cs.CL

    LoRA-Guard: Parameter-Efficient Guardrail Adaptation for Content Moderation of Large Language Models

    Authors: Hayder Elesedy, Pedro M. Esperança, Silviu Vlad Oprea, Mete Ozay

    Abstract: Guardrails have emerged as an alternative to safety alignment for content moderation of large language models (LLMs). Existing model-based guardrails have not been designed for resource-constrained computational portable devices, such as mobile phones, more and more of which are running LLM-based applications locally. We introduce LoRA-Guard, a parameter-efficient guardrail adaptation method that…

    Submitted 3 July, 2024; originally announced July 2024.

  5. arXiv:2407.01193  [pdf, other]

    cs.CV

    Cross-Architecture Auxiliary Feature Space Translation for Efficient Few-Shot Personalized Object Detection

    Authors: Francesco Barbato, Umberto Michieli, Jijoong Moon, Pietro Zanuttigh, Mete Ozay

    Abstract: Recent years have seen object detection robotic systems deployed in several personal devices (e.g., home robots and appliances). This has highlighted a challenge in their design, i.e., they cannot efficiently update their knowledge to distinguish between general classes and user-specific instances (e.g., a dog vs. user's dog). We refer to this challenging task as Instance-level Personalized Object…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted at IROS 2024, 8 pages, 4 figures, 6 tables

  6. arXiv:2406.14563  [pdf, other]

    cs.CL cs.AI cs.LG

    Model Merging and Safety Alignment: One Bad Model Spoils the Bunch

    Authors: Hasan Abed Al Kader Hammoud, Umberto Michieli, Fabio Pizzati, Philip Torr, Adel Bibi, Bernard Ghanem, Mete Ozay

    Abstract: Merging Large Language Models (LLMs) is a cost-effective technique for combining multiple expert LLMs into a single versatile model, retaining the expertise of the original ones. However, current approaches often overlook the importance of safety alignment during merging, leading to highly misaligned models. This work investigates the effects of model merging on alignment. We evaluate several popu…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: Under review

  7. arXiv:2405.06368  [pdf, other]

    cs.LG cs.CR cs.DC

    DP-DyLoRA: Fine-Tuning Transformer-Based Models On-Device under Differentially Private Federated Learning using Dynamic Low-Rank Adaptation

    Authors: Jie Xu, Karthikeyan Saravanan, Rogier van Dalen, Haaris Mehmood, David Tuckey, Mete Ozay

    Abstract: Federated learning (FL) allows clients to collaboratively train a global model without sharing their local data with a server. However, clients' contributions to the server can still leak sensitive information. Differential privacy (DP) addresses such leakage by providing formal privacy guarantees, with mechanisms that add randomness to the clients' contributions. The randomness makes it infeasibl…

    Submitted 22 July, 2024; v1 submitted 10 May, 2024; originally announced May 2024.

    Comments: 16 pages, 10 figures, 5 tables

  8. arXiv:2404.01397  [pdf, other]

    cs.CV cs.AI cs.RO

    Object-conditioned Bag of Instances for Few-Shot Personalized Instance Recognition

    Authors: Umberto Michieli, Jijoong Moon, Daehyun Kim, Mete Ozay

    Abstract: Nowadays, users demand increased personalization of vision systems to localize and identify personal instances of objects (e.g., my dog rather than dog) from a few-shot dataset only. Despite outstanding results of deep networks on classical label-abundant benchmarks (e.g., those of the latest YOLOv8 model for standard object detection), they struggle to maintain within-class variability to rep…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: ICASSP 2024. Copyright 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other

  9. arXiv:2403.14335  [pdf, other]

    cs.CV

    FFT-based Selection and Optimization of Statistics for Robust Recognition of Severely Corrupted Images

    Authors: Elena Camuffo, Umberto Michieli, Jijoong Moon, Daehyun Kim, Mete Ozay

    Abstract: Improving model robustness in case of corrupted images is among the key challenges to enable robust vision systems on smart devices, such as robotic agents. Particularly, robust test-time performance is imperative for most of the applications. This paper presents a novel approach to improve robustness of any classification model, especially on severely corrupted images. Our method (FROST) employs…

    Submitted 21 March, 2024; originally announced March 2024.

    Comments: ICASSP 2024.

    Journal ref: International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2023

  10. arXiv:2402.18614  [pdf, other]

    cs.LG cs.CV cs.NE

    Deep Neural Network Models Trained With A Fixed Random Classifier Transfer Better Across Domains

    Authors: Hafiz Tiomoko Ali, Umberto Michieli, Ji Joong Moon, Daehyun Kim, Mete Ozay

    Abstract: The recently discovered Neural collapse (NC) phenomenon states that the last-layer weights of Deep Neural Networks (DNNs) converge to the so-called Equiangular Tight Frame (ETF) simplex at the terminal phase of their training. This ETF geometry is equivalent to vanishing within-class variability of the last layer activations. Inspired by NC properties, we explore in this paper the transferability…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: ICASSP 2024.

  11. arXiv:2402.18449  [pdf, other]

    cs.CL cs.AI cs.LG

    HOP to the Next Tasks and Domains for Continual Learning in NLP

    Authors: Umberto Michieli, Mete Ozay

    Abstract: Continual Learning (CL) aims to learn a sequence of problems (i.e., tasks and domains) by transferring knowledge acquired on previous problems, whilst avoiding forgetting of past ones. Different from previous approaches which focused on CL for one NLP task or domain in a specific use-case, in this paper, we address a more general CL setting to learn from a sequence of problems in a unique framewor…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: AAAI 2024. Main + supplementary

  12. A Modular System for Enhanced Robustness of Multimedia Understanding Networks via Deep Parametric Estimation

    Authors: Francesco Barbato, Umberto Michieli, Mehmet Kerim Yucel, Pietro Zanuttigh, Mete Ozay

    Abstract: In multimedia understanding tasks, corrupted samples pose a critical challenge, because when fed to machine learning models they lead to performance degradation. In the past, three groups of approaches have been proposed to handle noisy data: i) enhancer and denoiser modules to improve the quality of the noisy data, ii) data augmentation approaches, and iii) domain adaptation strategies. All the a…

    Submitted 29 February, 2024; v1 submitted 28 February, 2024; originally announced February 2024.

    Comments: Accepted at ACM MMSys'24. 10 pages, 7 figures, 8 tables

  13. arXiv:2307.13343  [pdf, other]

    eess.AS cs.CR cs.SD

    On-Device Speaker Anonymization of Acoustic Embeddings for ASR based on Flexible Location Gradient Reversal Layer

    Authors: Md Asif Jalal, Pablo Peso Parada, Jisi Zhang, Karthikeyan Saravanan, Mete Ozay, Myoungji Han, Jung In Lee, Seokyeong Jung

    Abstract: Smart devices serviced by large-scale AI models necessitate user data transfer to the cloud for inference. For speech applications, this means transferring private user information, e.g., speaker identity. Our paper proposes a privacy-enhancing framework that targets speaker identity anonymization while preserving speech recognition accuracy for our downstream task - Automatic Speech Recognition…

    Submitted 25 July, 2023; originally announced July 2023.

    Comments: Proceedings of INTERSPEECH 2023

  14. arXiv:2307.12660  [pdf, other]

    cs.SD cs.LG eess.AS

    Online Continual Learning in Keyword Spotting for Low-Resource Devices via Pooling High-Order Temporal Statistics

    Authors: Umberto Michieli, Pablo Peso Parada, Mete Ozay

    Abstract: Keyword Spotting (KWS) models on embedded devices should adapt fast to new user-defined words without forgetting previous ones. Embedded devices have limited storage and computational resources, thus, they cannot save samples or update large models. We consider the setup of embedded online continual learning (EOCL), where KWS models with frozen backbone are trained to incrementally recognize new w…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: INTERSPEECH 2023

  15. arXiv:2307.12659  [pdf, other]

    cs.SD cs.CL eess.AS

    A Model for Every User and Budget: Label-Free and Personalized Mixed-Precision Quantization

    Authors: Edward Fish, Umberto Michieli, Mete Ozay

    Abstract: Recent advancement in Automatic Speech Recognition (ASR) has produced large AI models, which become impractical for deployment in mobile devices. Model quantization is effective to produce compressed general-purpose models, however such models may only be deployed to a restricted sub-domain of interest. We show that ASR models can be personalized during quantization while relying on just a small s…

    Submitted 11 February, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: INTERSPEECH 2023. Code is available at https://github.com/SamsungLabs/myQASR

  16. arXiv:2307.09827  [pdf, other]

    cs.RO cs.AI cs.CV

    Online Continual Learning for Robust Indoor Object Recognition

    Authors: Umberto Michieli, Mete Ozay

    Abstract: Vision systems mounted on home robots need to interact with unseen classes in changing environments. Robots have limited computational resources, labelled data and storage capability. These requirements pose some unique challenges: models should adapt without forgetting past knowledge in a data- and parameter-efficient way. We characterize the problem as few-shot (FS) online continual learning (OC…

    Submitted 19 July, 2023; originally announced July 2023.

    Comments: IROS 2023

  17. arXiv:2208.13404  [pdf, other]

    cs.CV

    Progressive Self-Distillation for Ground-to-Aerial Perception Knowledge Transfer

    Authors: Junjie Hu, Chenyou Fan, Mete Ozay, Hua Feng, Yuan Gao, Tin Lun Lam

    Abstract: We study a practical yet unexplored problem: how a drone can perceive in an environment from different flight heights. Unlike autonomous driving, where the perception is always conducted from a ground viewpoint, a flying drone may flexibly change its flight height due to specific tasks, requiring the capability for viewpoint invariant perception. Tackling such a problem with supervised…

    Submitted 16 April, 2023; v1 submitted 29 August, 2022; originally announced August 2022.

  18. arXiv:2208.12464  [pdf, other]

    cs.CV

    Dense Depth Distillation with Out-of-Distribution Simulated Images

    Authors: Junjie Hu, Chenyou Fan, Mete Ozay, Hualie Jiang, Tin Lun Lam

    Abstract: We study data-free knowledge distillation (KD) for monocular depth estimation (MDE), which learns a lightweight model for real-world depth perception tasks by compressing it from a trained teacher model while lacking training data in the target domain. Owing to the essential difference between image classification and dense regression, previous methods of data-free KD are not applicable to MDE. To…

    Submitted 7 December, 2023; v1 submitted 26 August, 2022; originally announced August 2022.

  19. arXiv:2207.04949  [pdf, ps, other]

    eess.AS cs.SD

    pMCT: Patched Multi-Condition Training for Robust Speech Recognition

    Authors: Pablo Peso Parada, Agnieszka Dobrowolska, Karthikeyan Saravanan, Mete Ozay

    Abstract: We propose a novel Patched Multi-Condition Training (pMCT) method for robust Automatic Speech Recognition (ASR). pMCT employs Multi-condition Audio Modification and Patching (MAMP) via mixing patches of the same utterance extracted from clean and distorted speech. Training using patch-modified signals improves robustness of models in noisy reverberant scenarios. Our proposed pMCT is evaluate…

    Submitted 11 July, 2022; originally announced July 2022.

    Comments: Accepted at Interspeech 2022

  20. arXiv:2206.02797  [pdf, ps, other]

    eess.AS cs.AI cs.CL cs.CV cs.DC cs.LG

    FedNST: Federated Noisy Student Training for Automatic Speech Recognition

    Authors: Haaris Mehmood, Agnieszka Dobrowolska, Karthikeyan Saravanan, Mete Ozay

    Abstract: Federated Learning (FL) enables training state-of-the-art Automatic Speech Recognition (ASR) models on user devices (clients) in distributed systems, hence preventing transmission of raw user data to a central server. A key challenge facing practical adoption of FL for ASR is obtaining ground-truth labels on the clients. Existing approaches rely on clients to manually transcribe their speech, whic…

    Submitted 12 July, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: Accepted at Interspeech 2022

    ACM Class: I.2.11

  21. arXiv:2205.05335  [pdf, other]

    cs.CV

    Deep Depth Completion from Extremely Sparse Data: A Survey

    Authors: Junjie Hu, Chenyu Bao, Mete Ozay, Chenyou Fan, Qing Gao, Honghai Liu, Tin Lun Lam

    Abstract: Depth completion aims at predicting dense pixel-wise depth from an extremely sparse map captured from a depth sensor, e.g., LiDARs. It plays an essential role in various applications such as autonomous driving, 3D reconstruction, augmented reality, and robot navigation. Recent successes on the task have been demonstrated and dominated by deep learning based solutions. In this article, for the firs…

    Submitted 29 August, 2022; v1 submitted 11 May, 2022; originally announced May 2022.

  22. arXiv:2203.06504  [pdf, other]

    eess.IV cs.CV

    A Mixed Quantization Network for Computationally Efficient Mobile Inverse Tone Mapping

    Authors: Juan Borrego-Carazo, Mete Ozay, Frederik Laboyrie, Paul Wisbey

    Abstract: Recovering a high dynamic range (HDR) image from a single low dynamic range (LDR) image, namely inverse tone mapping (ITM), is challenging due to the lack of information in over- and under-exposed regions. Current methods focus exclusively on training high-performing but computationally inefficient ITM models, which in turn hinder deployment of the ITM models in resource-constrained environments w…

    Submitted 12 March, 2022; originally announced March 2022.

    Comments: Presented at the British Machine Vision Conference (BMVC), 2021

  23. arXiv:2202.07421  [pdf, other]

    cs.CR cs.AI cs.LG

    Adversarial Attacks and Defense Methods for Power Quality Recognition

    Authors: Jiwei Tian, Buhong Wang, Jing Li, Zhen Wang, Mete Ozay

    Abstract: Vulnerability of various machine learning methods to adversarial examples has been recently explored in the literature. Power systems which use these vulnerable methods face a huge threat against adversarial examples. To this end, we first propose a signal-specific method and a universal signal-agnostic method to attack power systems using generated adversarial examples. Black-box attacks based on…

    Submitted 11 February, 2022; originally announced February 2022.

    Comments: Technical report

  24. arXiv:2109.05934  [pdf, other]

    cs.CV

    Task Guided Compositional Representation Learning for ZDA

    Authors: Shuang Liu, Mete Ozay

    Abstract: Zero-shot domain adaptation (ZDA) methods aim to transfer knowledge about a task learned in a source domain to a target domain, while data from target domain are not available. In this work, we address learning feature representations which are invariant to and shared among different domains considering task characteristics for ZDA. To this end, we propose a method for task-guided ZDA (TG-ZDA) whi…

    Submitted 13 September, 2021; originally announced September 2021.

  25. arXiv:2105.08982  [pdf, other]

    cs.LG cs.CV

    Prototype Guided Federated Learning of Visual Feature Representations

    Authors: Umberto Michieli, Mete Ozay

    Abstract: Federated Learning (FL) is a framework which enables distributed model training using a large corpus of decentralized training data. Existing methods aggregate models disregarding their internal representations, which are crucial for training models in vision tasks. System and statistical heterogeneity (e.g., highly imbalanced and non-i.i.d. data) further harm model training. To this end, we intro…

    Submitted 19 May, 2021; originally announced May 2021.

    Comments: 11 pages manuscript, 6 pages supplemental material

  26. arXiv:2012.06452  [pdf, other]

    cs.LG cs.AI

    A New Neural Network Architecture Invariant to the Action of Symmetry Subgroups

    Authors: Piotr Kicki, Mete Ozay, Piotr Skrzypczyński

    Abstract: We propose a computationally efficient $G$-invariant neural network that approximates functions invariant to the action of a given permutation subgroup $G \leq S_n$ of the symmetric group on input data. The key element of the proposed network architecture is a new $G$-invariant transformation module, which produces a $G$-invariant latent representation of the input data. Theoretical considerations…

    Submitted 11 December, 2020; originally announced December 2020.

    Comments: Presented as contributed talk at NeurIPS 2020 workshop on Differential Geometry meets Deep Learning

    ACM Class: I.2.6

  27. Learning from Experience for Rapid Generation of Local Car Maneuvers

    Authors: Piotr Kicki, Tomasz Gawron, Krzysztof Ćwian, Mete Ozay, Piotr Skrzypczyński

    Abstract: Being able to rapidly respond to the changing scenes and traffic situations by generating feasible local paths is of pivotal importance for car autonomy. We propose to train a deep neural network (DNN) to plan feasible and nearly-optimal paths for kinematically constrained vehicles in small constant time. Our DNN model is trained using a novel weakly supervised approach and a gradient-based policy…

    Submitted 7 December, 2020; originally announced December 2020.

    ACM Class: I.2.9; I.2.6; J.2

  28. A Computationally Efficient Neural Network Invariant to the Action of Symmetry Subgroups

    Authors: Piotr Kicki, Mete Ozay, Piotr Skrzypczyński

    Abstract: We introduce a method to design a computationally efficient $G$-invariant neural network that approximates functions invariant to the action of a given permutation subgroup $G \leq S_n$ of the symmetric group on input data. The key element of the proposed network architecture is a new $G$-invariant transformation module, which produces a $G$-invariant latent representation of the input data. This…

    Submitted 18 February, 2020; originally announced February 2020.

    ACM Class: I.2.6

  29. arXiv:1905.09054  [pdf, other]

    cs.LG cs.CV math.OC stat.ML

    Fine-grained Optimization of Deep Neural Networks

    Authors: Mete Ozay

    Abstract: In recent studies, several asymptotic upper bounds on generalization errors on deep neural networks (DNNs) are theoretically derived. These bounds are functions of several norms of weights of the DNNs, such as the Frobenius and spectral norms, and they are computed for weights grouped according to either input and output channels of the DNNs. In this work, we conjecture that if we can impose multi…

    Submitted 22 May, 2019; originally announced May 2019.

  30. arXiv:1905.08609  [pdf, other]

    cs.CV

    Improving Head Pose Estimation with a Combined Loss and Bounding Box Margin Adjustment

    Authors: Mingzhen Shao, Zhun Sun, Mete Ozay, Takayuki Okatani

    Abstract: We address a problem of estimating pose of a person's head from its RGB image. The employment of CNNs for the problem has contributed to significant improvement in accuracy in recent works. However, we show that the following two methods, despite their simplicity, can attain further improvement: (i) proper adjustment of the margin of bounding box of a detected face, and (ii) choice of loss functio…

    Submitted 14 May, 2019; originally announced May 2019.

    Comments: IEEE International Conference on Automatic Face & Gesture Recognition (FG2019)

  31. arXiv:1803.08673  [pdf, other]

    cs.CV

    Revisiting Single Image Depth Estimation: Toward Higher Resolution Maps with Accurate Object Boundaries

    Authors: Junjie Hu, Mete Ozay, Yan Zhang, Takayuki Okatani

    Abstract: This paper considers the problem of single image depth estimation. The employment of convolutional neural networks (CNNs) has recently brought about significant advancements in the research of this problem. However, most existing methods suffer from loss of spatial resolution in the estimated depth maps; a typical symptom is distorted and blurry reconstruction of object boundaries. In this paper,…

    Submitted 22 September, 2018; v1 submitted 23 March, 2018; originally announced March 2018.

  32. arXiv:1803.00370  [pdf, other]

    cs.NE

    Exploiting the Potential of Standard Convolutional Autoencoders for Image Restoration by Evolutionary Search

    Authors: Masanori Suganuma, Mete Ozay, Takayuki Okatani

    Abstract: Researchers have applied deep neural networks to image restoration tasks, in which they proposed various network architectures, loss functions, and training methods. In particular, adversarial training, which is employed in recent studies, seems to be a key ingredient to success. In this paper, we show that simple convolutional autoencoders (CAEs) built upon only standard network components, i.e.,…

    Submitted 1 March, 2018; originally announced March 2018.

    Comments: Our code is available at https://github.com/sg-nm/Evolutionary-Autoencoders

  33. arXiv:1801.07939  [pdf, ps, other]

    cs.CV

    Deep Structured Energy-Based Image Inpainting

    Authors: Fazil Altinel, Mete Ozay, Takayuki Okatani

    Abstract: In this paper, we propose a structured image inpainting method employing an energy based model. In order to learn structural relationship between patterns observed in images and missing regions of the images, we employ an energy-based structured prediction method. The structural relationship is learned by minimizing an energy function which is defined by a simple convolutional neural network. The…

    Submitted 30 August, 2018; v1 submitted 24 January, 2018; originally announced January 2018.

    Comments: Accepted to 24th International Conference on Pattern Recognition (ICPR 2018). 6 pages, 7 figures

  34. arXiv:1712.04138  [pdf, other]

    cs.CV

    A vision based system for underwater docking

    Authors: Shuang Liu, Mete Ozay, Takayuki Okatani, Hongli Xu, Kai Sun, Yang Lin

    Abstract: Autonomous underwater vehicles (AUVs) have been deployed for underwater exploration. However, their potential is confined by their limited on-board battery energy and data storage capacity. This problem has been addressed using docking systems by underwater recharging and data transfer for AUVs. In this work, we propose a vision based framework for underwater docking following these systems. The propo…

    Submitted 12 December, 2017; originally announced December 2017.

  35. arXiv:1711.01791  [pdf, other]

    cs.CV

    HyperNetworks with statistical filtering for defending adversarial examples

    Authors: Zhun Sun, Mete Ozay, Takayuki Okatani

    Abstract: Deep learning algorithms have been known to be vulnerable to adversarial perturbations in various tasks such as image classification. This problem was addressed by employing several defense methods for detection and rejection of particular types of attacks. However, training and manipulating networks according to particular defense schemes increases computational complexity of the learning algorit…

    Submitted 6 November, 2017; originally announced November 2017.

  36. arXiv:1707.07831  [pdf, other]

    stat.ML cs.LG

    Linear Discriminant Generative Adversarial Networks

    Authors: Zhun Sun, Mete Ozay, Takayuki Okatani

    Abstract: We develop a novel method for training of GANs for unsupervised and class conditional generation of images, called Linear Discriminant GAN (LD-GAN). The discriminator of an LD-GAN is trained to maximize the linear separability between distributions of hidden representations of generated and targeted samples, while the generator is updated based on the decision hyper-planes computed by performing L…

    Submitted 25 July, 2017; originally announced July 2017.

  37. arXiv:1707.07830  [pdf, other]

    cs.CV

    Improving Robustness of Feature Representations to Image Deformations using Powered Convolution in CNNs

    Authors: Zhun Sun, Mete Ozay, Takayuki Okatani

    Abstract: In this work, we address the problem of improvement of robustness of feature representations learned using convolutional neural networks (CNNs) to image deformation. We argue that higher moment statistics of feature distributions could be shifted due to image deformations, and the shift leads to degradation of performance and cannot be reduced by ordinary normalization methods as observed in experimen…

    Submitted 25 July, 2017; originally announced July 2017.

  38. arXiv:1706.04635  [pdf, other]

    cs.LG cs.IT stat.ML

    Information Potential Auto-Encoders

    Authors: Yan Zhang, Mete Ozay, Zhun Sun, Takayuki Okatani

    Abstract: In this paper, we suggest a framework to make use of mutual information as a regularization criterion to train Auto-Encoders (AEs). In the proposed framework, AEs are regularized by minimization of the mutual information between input and encoding variables of AEs during the training phase. In order to estimate the entropy of the encoding variables and the mutual information, we propose a non-para…

    Submitted 6 August, 2017; v1 submitted 14 June, 2017; originally announced June 2017.

    Comments: Information Theory

  39. arXiv:1704.00509  [pdf, other]

    cs.CV

    Truncating Wide Networks using Binary Tree Architectures

    Authors: Yan Zhang, Mete Ozay, Shuohao Li, Takayuki Okatani

    Abstract: A recent study shows that a wide deep network can obtain accuracy comparable to a deeper but narrower network. Compared to narrower and deeper networks, wide networks employ relatively fewer layers and have various important benefits, such that they have less running time on parallel computing devices, and they are less affected by gradient vanishing problems. However, the parameter size of…

    Submitted 3 April, 2017; originally announced April 2017.

    Comments: 10 pages

  40. arXiv:1701.06123  [pdf, ps, other]

    cs.CV cs.LG cs.NE

    Optimization on Product Submanifolds of Convolution Kernels

    Authors: Mete Ozay, Takayuki Okatani

    Abstract: Recent advances in optimization methods used for training convolutional neural networks (CNNs) with kernels, which are normalized according to particular constraints, have shown remarkable success. This work introduces an approach for training CNNs using ensembles of joint spaces of kernels constructed using different constraints. For this purpose, we address a problem of optimization on ensembles…

    Submitted 27 November, 2017; v1 submitted 22 January, 2017; originally announced January 2017.

    Comments: 7 pages

  41. arXiv:1610.07008  [pdf, ps, other]

    cs.CV

    Optimization on Submanifolds of Convolution Kernels in CNNs

    Authors: Mete Ozay, Takayuki Okatani

    Abstract: Kernel normalization methods have been employed to improve robustness of optimization methods to reparametrization of convolution kernels, covariate shift, and to accelerate training of Convolutional Neural Networks (CNNs). However, our understanding of theoretical properties of these methods has lagged behind their success in applications. We develop a geometric framework to elucidate underlying…

    Submitted 22 October, 2016; originally announced October 2016.

    Comments: 9 pages, 3 figures

  42. arXiv:1610.05036  [pdf, other]

    cs.CV cs.LG

    Encoding the Local Connectivity Patterns of fMRI for Cognitive State Classification

    Authors: Itir Onal Ertugrul, Mete Ozay, Fatos T. Yarman Vural

    Abstract: In this work, we propose a novel framework to encode the local connectivity patterns of brain, using Fisher Vectors (FV), Vector of Locally Aggregated Descriptors (VLAD) and Bag-of-Words (BoW) methods. We first obtain local descriptors, called Mesh Arc Descriptors (MADs) from fMRI data, by forming local meshes around anatomical regions, and estimating their relationship within a neighborhood. Then…

    Submitted 17 October, 2016; originally announced October 2016.

    Comments: 8 pages, 5 figures

  43. arXiv:1607.07695  [pdf, other]

    cs.NE cs.CV cs.LG

    Hierarchical Multi-resolution Mesh Networks for Brain Decoding

    Authors: Itir Onal Ertugrul, Mete Ozay, Fatos Tunay Yarman Vural

    Abstract: We propose a new framework, called Hierarchical Multi-resolution Mesh Networks (HMMNs), which establishes a set of brain networks at multiple time resolutions of fMRI signal to represent the underlying cognitive process. The suggested framework, first, decomposes the fMRI signal into various frequency subbands using wavelet transforms. Then, a brain network, called mesh network, is formed at each…

    Submitted 11 January, 2017; v1 submitted 12 July, 2016; originally announced July 2016.

    Comments: 18 pages

  44. arXiv:1603.01067  [pdf, other]

    cs.LG cs.AI cs.CV

    Modeling the Sequence of Brain Volumes by Local Mesh Models for Brain Decoding

    Authors: Itir Onal, Mete Ozay, Eda Mizrak, Ilke Oztekin, Fatos T. Yarman Vural

    Abstract: We represent the sequence of fMRI (Functional Magnetic Resonance Imaging) brain volumes recorded during a cognitive stimulus by a graph which consists of a set of local meshes. The corresponding cognitive process, encoded in the brain, is then represented by these meshes each of which is estimated assuming a linear relationship among the voxel time series in a predefined locality. First, we define…

    Submitted 3 March, 2016; originally announced March 2016.

    Comments: 13 pages, 10 figures, submitted to JSTSP Special Issue on Advanced Signal Processing in Brain Networks

  45. arXiv:1511.09231  [pdf, other]

    cs.CV

    Design of Kernels in Convolutional Neural Networks for Image Classification

    Authors: Zhun Sun, Mete Ozay, Takayuki Okatani

    Abstract: Despite the effectiveness of Convolutional Neural Networks (CNNs) for image classification, our understanding of the relationship between shape of convolution kernels and learned representations is limited. In this work, we explore and employ the relationship between shape of kernels which define Receptive Fields (RFs) in CNNs for learning of feature representations and image classification. For t…

    Submitted 28 November, 2016; v1 submitted 30 November, 2015; originally announced November 2015.

  46. arXiv:1511.06522  [pdf, other]

    cs.CV cs.LG

    Integrating Deep Features for Material Recognition

    Authors: Yan Zhang, Mete Ozay, Xing Liu, Takayuki Okatani

    Abstract: We propose a method for integration of features extracted using deep representations of Convolutional Neural Networks (CNNs) each of which is learned using a different image dataset of objects and materials for material recognition. Given a set of representations of multiple pre-trained CNNs, we first compute activations of features using the representations on the images to select a set of sample…

    Submitted 21 April, 2016; v1 submitted 20 November, 2015; originally announced November 2015.

    Comments: 6 pages

  47. arXiv:1503.06468  [pdf, ps, other]

    cs.LG cs.CR eess.SY

    Machine Learning Methods for Attack Detection in the Smart Grid

    Authors: Mete Ozay, Inaki Esnaola, Fatos T. Yarman Vural, Sanjeev R. Kulkarni, H. Vincent Poor

    Abstract: Attack detection problems in the smart grid are posed as statistical learning problems for different attack scenarios in which the measurements are observed in batch or online settings. In this approach, machine learning algorithms are used to classify measurements as being either secure or attacked. An attack detection framework is provided to exploit any available prior knowledge about the syste…

    Submitted 22 March, 2015; originally announced March 2015.

    Comments: 14 pages, 11 Figures

    Journal ref: A version of the manuscript was published in IEEE Transactions on Neural Networks and Learning Systems, 19 March 2015

  48. A Hierarchical Approach for Joint Multi-view Object Pose Estimation and Categorization

    Authors: Mete Ozay, Krzysztof Walas, Ales Leonardis

    Abstract: We propose a joint object pose estimation and categorization approach which extracts information about object poses and categories from the object parts and compositions constructed at different layers of a hierarchical object representation algorithm, namely Learned Hierarchy of Parts (LHOP). In the proposed approach, we first employ the LHOP to learn hierarchical part libraries which represent e…

    Submitted 4 March, 2015; originally announced March 2015.

    Comments: 7 Figures

    Journal ref: Proceedings of IEEE International Conference on Robotics and Automation (ICRA), pp. 5480 - 5487, Hong Kong, 2014

  49. Fusion of Image Segmentation Algorithms using Consensus Clustering

    Authors: Mete Ozay, Fatos T. Yarman Vural, Sanjeev R. Kulkarni, H. Vincent Poor

    Abstract: A new segmentation fusion method is proposed that ensembles the output of several segmentation algorithms applied on a remotely sensed image. The candidate segmentation sets are processed to achieve a consensus segmentation using a stochastic optimization algorithm based on the Filtered Stochastic BOEM (Best One Element Move) method. For this purpose, Filtered Stochastic BOEM is reformulated as a…

    Submitted 18 February, 2015; originally announced February 2015.

    Comments: A version of the manuscript was published in ICIP 2013

    Journal ref: 20th IEEE International Conference on Image Processing (ICIP), pp. 4049-4053, Melbourne, VIC, 15-18 Sept. 2013

  50. Semi-supervised Segmentation Fusion of Multi-spectral and Aerial Images

    Authors: Mete Ozay

    Abstract: A Semi-supervised Segmentation Fusion algorithm is proposed using consensus and distributed learning. The aim of Unsupervised Segmentation Fusion (USF) is to achieve a consensus among different segmentation outputs obtained from different segmentation algorithms by computing an approximate solution to the NP problem with less computational complexity. Semi-supervision is incorporated in USF using…

    Submitted 25 February, 2015; v1 submitted 17 February, 2015; originally announced February 2015.

    Comments: A version of the manuscript was published in ICPR 2014

    Journal ref: Proc. 22nd International Conference on Pattern Recognition, pp. 3839-3844, Stockholm, 2014