-
DRL-Assisted Dynamic QoT-Aware Service Provisioning in Multi-Band Elastic Optical Networks
Authors:
Yiran Teng,
Carlos Natalino,
Farhad Arpanaei,
Alfonso Sánchez-Macián,
Paolo Monti,
Shuangyi Yan,
Dimitra Simeonidou
Abstract:
We propose a DRL-assisted approach for service provisioning in multi-band elastic optical networks. Our simulation environment uses an accurate QoT estimator based on the GN/EGN model. Results show that the proposed approach reduces request blocking by 50% compared with heuristics from the literature.
Submitted 6 August, 2024;
originally announced August 2024.
-
Cluster-based Method for Eavesdropping Identification and Localization in Optical Links
Authors:
Haokun Song,
Rui Lin,
Andrea Sgambelluri,
Filippo Cugini,
Yajie Li,
Jie Zhang,
Paolo Monti
Abstract:
We propose a cluster-based method to detect and locate eavesdropping events in optical line systems characterized by small power losses. Our findings indicate that detecting such subtle losses from eavesdropping can be accomplished solely through optical performance monitoring (OPM) data collected at the receiver. On the other hand, the localization of such events can be effectively achieved by leveraging in-line OPM data.
Submitted 25 September, 2023;
originally announced September 2023.
-
Evolving 5G: ANIARA, an Edge-Cloud perspective
Authors:
Ian Marsh,
Wolfgang John,
Ali Balador,
Federico Tonini,
Jalil Taghia,
Andreas Johnsson,
Paolo Monti,
Jonas Gustafsson,
Pontus Sköldström,
Johan Sjöberg,
Jim Dowling
Abstract:
Emerging use-cases like smart manufacturing and smart cities pose challenges in terms of latency, which cannot be satisfied by traditional centralized networks. Edge networks, which bring computational capacity closer to the users/clients, are a promising solution for supporting these critical low latency services. Different from traditional centralized networks, the edge is distributed by nature and is usually equipped with limited connectivity and compute capacity. This creates a complex network to handle, subject to failures of different natures, that requires novel solutions to work in practice. To reduce complexity, more lightweight solutions are needed for containerization as well as smart monitoring strategies with reduced overhead. Orchestration strategies should provide reliable resource slicing with limited resources, and intelligent scaling while preserving data privacy in a distributed fashion. Power management is also critical, as providing and managing a large amount of power at the edge is unprecedented.
Submitted 6 May, 2022;
originally announced May 2022.
-
Causal Autoregressive Flows
Authors:
Ilyes Khemakhem,
Ricardo Pio Monti,
Robert Leech,
Aapo Hyvärinen
Abstract:
Two apparently unrelated fields -- normalizing flows and causality -- have recently received considerable attention in the machine learning community. In this work, we highlight an intrinsic correspondence between a simple family of autoregressive normalizing flows and identifiable causal models. We exploit the fact that autoregressive flow architectures define an ordering over variables, analogous to a causal ordering, to show that they are well-suited to performing a range of causal inference tasks, ranging from causal discovery to making interventional and counterfactual predictions. First, we show that causal models derived from both affine and additive autoregressive flows with fixed orderings over variables are identifiable, i.e. the true direction of causal influence can be recovered. This provides a generalization of the additive noise model well-known in causal discovery. Second, we derive a bivariate measure of causal direction based on likelihood ratios, leveraging the fact that flow models can estimate normalized log-densities of data. Third, we demonstrate that flows naturally allow for direct evaluation of both interventional and counterfactual queries, the latter case being possible due to the invertible nature of flows. Finally, throughout a series of experiments on synthetic and real data, the proposed method is shown to outperform current approaches for causal discovery as well as making accurate interventional and counterfactual predictions.
Submitted 24 February, 2021; v1 submitted 4 November, 2020;
originally announced November 2020.
-
Bayesian optimization for automatic design of face stimuli
Authors:
Pedro F. da Costa,
Romy Lorenz,
Ricardo Pio Monti,
Emily Jones,
Robert Leech
Abstract:
Investigating the cognitive and neural mechanisms involved with face processing is a fundamental task in modern neuroscience and psychology. To date, the majority of such studies have focused on the use of pre-selected stimuli. The absence of personalized stimuli presents a serious limitation as it fails to account for how each individual face processing system is tuned to cultural embeddings or how it is disrupted in disease. In this work, we propose a novel framework which combines generative adversarial networks (GANs) with Bayesian optimization to identify individual response patterns to many different faces. Formally, we employ Bayesian optimization to efficiently search the latent space of state-of-the-art GAN models, with the aim to automatically generate novel faces, to maximize an individual subject's response. We present results from a web-based proof-of-principle study, where participants rated images of themselves generated via performing Bayesian optimization over the latent space of a GAN. We show how the algorithm can efficiently locate an individual's optimal face while mapping out their response across different semantic transformations of a face; inter-individual analyses suggest how the approach can provide rich information about individual differences in face processing.
Submitted 20 July, 2020;
originally announced July 2020.
-
Autoregressive flow-based causal discovery and inference
Authors:
Ricardo Pio Monti,
Ilyes Khemakhem,
Aapo Hyvarinen
Abstract:
We posit that autoregressive flow models are well-suited to performing a range of causal inference tasks - ranging from causal discovery to making interventional and counterfactual predictions. In particular, we exploit the fact that autoregressive architectures define an ordering over variables, analogous to a causal ordering, in order to propose a single flow architecture to perform all three aforementioned tasks. We first leverage the fact that flow models estimate normalized log-densities of data to derive a bivariate measure of causal direction based on likelihood ratios. Whilst traditional measures of causal direction often require restrictive assumptions on the nature of causal relationships (e.g., linearity), the flexibility of flow models allows for arbitrary causal dependencies. Our approach compares favourably against alternative methods on synthetic data as well as on the Cause-Effect Pairs benchmark dataset. Subsequently, we demonstrate that the invertible nature of flows naturally allows for direct evaluation of both interventional and counterfactual predictions, which require marginalization and conditioning over latent variables respectively. We present examples over synthetic data where autoregressive flows, when trained under the correct causal ordering, are able to make accurate interventional and counterfactual predictions.
Submitted 26 July, 2020; v1 submitted 18 July, 2020;
originally announced July 2020.
-
ICE-BeeM: Identifiable Conditional Energy-Based Deep Models Based on Nonlinear ICA
Authors:
Ilyes Khemakhem,
Ricardo Pio Monti,
Diederik P. Kingma,
Aapo Hyvärinen
Abstract:
We consider the identifiability theory of probabilistic models and establish sufficient conditions under which the representations learned by a very broad family of conditional energy-based models are unique in function space, up to a simple transformation. In our model family, the energy function is the dot-product between two feature extractors, one for the dependent variable, and one for the conditioning variable. We show that under mild conditions, the features are unique up to scaling and permutation. Our results extend recent developments in nonlinear ICA, and in fact, they lead to an important generalization of ICA models. In particular, we show that our model can be used for the estimation of the components in the framework of Independently Modulated Component Analysis (IMCA), a new generalization of nonlinear ICA that relaxes the independence assumption. A thorough empirical study shows that representations learned by our model from real-world image datasets are identifiable, and improve performance in transfer learning and semi-supervised learning tasks.
Submitted 26 October, 2020; v1 submitted 26 February, 2020;
originally announced February 2020.
-
Variational Autoencoders and Nonlinear ICA: A Unifying Framework
Authors:
Ilyes Khemakhem,
Diederik P. Kingma,
Ricardo Pio Monti,
Aapo Hyvärinen
Abstract:
The framework of variational autoencoders allows us to efficiently learn deep latent-variable models, such that the model's marginal distribution over observed variables fits the data. Often, we're interested in going a step further, and want to approximate the true joint distribution over observed and latent variables, including the true prior and posterior distributions over latent variables. This is known to be generally impossible due to unidentifiability of the model. We address this issue by showing that for a broad family of deep latent-variable models, identification of the true joint distribution over observed and latent variables is actually possible up to very simple transformations, thus achieving a principled and powerful form of disentanglement. Our result requires a factorized prior distribution over the latent variables that is conditioned on an additionally observed variable, such as a class label or almost any other observation. We build on recent developments in nonlinear ICA, which we extend to the case with noisy, undercomplete or discrete observations, integrated in a maximum likelihood framework. The result also trivially contains identifiable flow-based generative models as a special case.
Submitted 21 December, 2020; v1 submitted 10 July, 2019;
originally announced July 2019.
-
Causal Discovery with General Non-Linear Relationships Using Non-Linear ICA
Authors:
Ricardo Pio Monti,
Kun Zhang,
Aapo Hyvarinen
Abstract:
We consider the problem of inferring causal relationships between two or more passively observed variables. While the problem of such causal discovery has been extensively studied especially in the bivariate setting, the majority of current methods assume a linear causal relationship, and the few methods which consider non-linear dependencies usually make the assumption of additive noise. Here, we propose a framework through which we can perform causal discovery in the presence of general non-linear relationships. The proposed method is based on recent progress in non-linear independent component analysis and exploits the non-stationarity of observations in order to recover the underlying sources or latent disturbances. We show rigorously that in the case of bivariate causal discovery, such non-linear ICA can be used to infer the causal direction via a series of independence tests. We further propose an alternative measure of causal direction based on asymptotic approximations to the likelihood ratio, as well as an extension to multivariate causal discovery. We demonstrate the capabilities of the proposed method via a series of simulation studies and conclude with an application to neuroimaging data.
Submitted 19 April, 2019;
originally announced April 2019.
-
A Unified Probabilistic Model for Learning Latent Factors and Their Connectivities from High-Dimensional Data
Authors:
Ricardo Pio Monti,
Aapo Hyvärinen
Abstract:
Connectivity estimation is challenging in the context of high-dimensional data. A useful preprocessing step is to group variables into clusters, however, it is not always clear how to do so from the perspective of connectivity estimation. Another practical challenge is that we may have data from multiple related classes (e.g., multiple subjects or conditions) and wish to incorporate constraints on the similarities across classes. We propose a probabilistic model which simultaneously performs both a grouping of variables (i.e., detecting community structure) and estimation of connectivities between the groups which correspond to latent variables. The model is essentially a factor analysis model where the factors are allowed to have arbitrary correlations, while the factor loading matrix is constrained to express a community structure. The model can be applied on multiple classes so that the connectivities can be different between the classes, while the community structure is the same for all classes. We propose an efficient estimation algorithm based on score matching, and prove the identifiability of the model. Finally, we present an extension to directed (causal) connectivities over latent variables. Simulations and experiments on fMRI data validate the practical utility of the method.
Submitted 24 May, 2018;
originally announced May 2018.
-
Adaptive regularization for Lasso models in the context of non-stationary data streams
Authors:
Ricardo Pio Monti,
Christoforos Anagnostopoulos,
Giovanni Montana
Abstract:
Large scale, streaming datasets are ubiquitous in modern machine learning. Streaming algorithms must be scalable, amenable to incremental training and robust to the presence of non-stationarity. In this work we consider the problem of learning $\ell_1$ regularized linear models in the context of streaming data. In particular, the focus of this work revolves around how to select the regularization parameter when data arrives sequentially and the underlying distribution is non-stationary (implying the choice of optimal regularization parameter is itself time-varying). We propose a framework through which to infer an adaptive regularization parameter. Our approach employs an $\ell_1$ penalty constraint where the corresponding sparsity parameter is iteratively updated via stochastic gradient descent. This serves to reformulate the choice of regularization parameter in a principled framework for online learning. The proposed method is derived for linear regression and subsequently extended to generalized linear models. We validate our approach using simulated and real datasets and present an application to a neuroimaging dataset.
Submitted 14 December, 2017; v1 submitted 28 October, 2016;
originally announced October 2016.
-
Text-mining the NeuroSynth corpus using Deep Boltzmann Machines
Authors:
Ricardo Pio Monti,
Romy Lorenz,
Robert Leech,
Christoforos Anagnostopoulos,
Giovanni Montana
Abstract:
Large-scale automated meta-analysis of neuroimaging data has recently established itself as an important tool in advancing our understanding of human brain function. This research has been pioneered by NeuroSynth, a database collecting both brain activation coordinates and associated text across a large cohort of neuroimaging research papers. One of the fundamental aspects of such meta-analysis is text-mining. To date, word counts and more sophisticated methods such as Latent Dirichlet Allocation have been proposed. In this work we present an unsupervised study of the NeuroSynth text corpus using Deep Boltzmann Machines (DBMs). The use of DBMs yields several advantages over the aforementioned methods, principal among which is the fact that it yields both word and document embeddings in a high-dimensional vector space. Such embeddings serve to facilitate the use of traditional machine learning techniques on the text corpus. The proposed DBM model is shown to learn embeddings with a clear semantic structure.
Submitted 1 May, 2016;
originally announced May 2016.