arXiv:2408.08002 [pdf, other]

Practical Privacy-Preserving Identity Verification using Third-Party Cloud Services and FHE (Role of Data Encoding in Circuit Depth Management)

Authors: Deep Inder Mohan, Srinivas Vivek

Abstract: National digital identity verification systems have played a critical role in the effective distribution of goods and services, particularly in developing countries. Due to the cost involved in deploying and maintaining such systems, combined with a lack of in-house technical expertise, governments seek to outsource this service to third-party cloud service providers to the extent possible. This leads to increased concerns regarding the privacy of users' personal data. In this work, we propose a practical privacy-preserving digital identity (ID) verification protocol where the third-party cloud services process the identity data encrypted using a (single-key) Fully Homomorphic Encryption (FHE) scheme such as BFV. Though the role of a trusted entity such as the government is not completely eliminated, our protocol does significantly reduce the computation load on such parties. A challenge in implementing a privacy-preserving ID verification protocol using FHE is to support various types of queries such as exact and/or fuzzy demographic and biometric matches, including secure age comparisons. From a cryptographic engineering perspective, our main technical contribution is a user data encoding scheme that encodes demographic and biometric user data in only two BFV ciphertexts and yet allows us to outsource various types of ID verification queries to a third-party cloud. Our encoding scheme also ensures that the only computation done by the trusted entity is a query-agnostic "extended" decryption. This is in stark contrast with recent works that outsource all the non-arithmetic operations to a trusted server. We implement our protocol using the Microsoft SEAL FHE library and demonstrate its practicality.

Submitted 27 September, 2024; v1 submitted 15 August, 2024; originally announced August 2024.

Comments: A preliminary version of this work was presented (without proceedings) at the Turing Trustworthy Digital Identity International Conference 2022 at The Alan Turing Institute, London, UK, on Sep. 16, 2022. The recently updated version now contains a detailed security analysis

arXiv:2407.12699 [pdf, other]

Mechanism Design via the Interim Relaxation

Authors: Kshipra Bhawalkar, Marios Mertzanidis, Divyarthi Mohan, Alexandros Psomas

Abstract: We study revenue maximization for agents with additive preferences, subject to downward-closed constraints on the set of feasible allocations. In seminal work, Alaei~\cite{alaei2014bayesian} introduced a powerful multi-to-single agent reduction based on an ex-ante relaxation of the multi-agent problem. This reduction employs a rounding procedure which is an online contention resolution scheme (OCRS) in disguise, a now widely-used method for rounding fractional solutions in online Bayesian and stochastic optimization problems. In this paper, we leverage our vantage point, 10 years after the work of Alaei, with a rich OCRS toolkit and modern approaches to analyzing multi-agent mechanisms; we introduce a general framework for designing non-sequential and sequential multi-agent, revenue-maximizing mechanisms, capturing a wide variety of problems Alaei's framework could not address. Our framework uses an \emph{interim} relaxation that is rounded to a feasible mechanism using what we call a two-level OCRS, which allows for some structured dependence between the activation of its input elements. For a wide family of constraints, we can construct such schemes using existing OCRSs as a black box; for other constraints, such as knapsack, we construct such schemes from scratch. We demonstrate numerous applications of our framework, including a sequential mechanism that guarantees a $\frac{2e}{e-1} \approx 3.16$ approximation to the optimal revenue for the case of additive agents subject to matroid feasibility constraints. We also show how our framework can be easily extended to multi-parameter procurement auctions, where we provide an OCRS for Stochastic Knapsack that might be of independent interest.

Submitted 17 July, 2024; originally announced July 2024.

arXiv:2405.18351 [pdf, other]

Evaluating Bayesian deep learning for radio galaxy classification

Authors: Devina Mohan, Anna M. M. Scaife

Abstract: The radio astronomy community is rapidly adopting deep learning techniques to deal with the huge data volumes expected from the next generation of radio observatories. Bayesian neural networks (BNNs) provide a principled way to model uncertainty in the predictions made by such deep learning models and will play an important role in extracting well-calibrated uncertainty estimates on their outputs. In this work, we evaluate the performance of different BNNs against the following criteria: predictive performance, uncertainty calibration and distribution-shift detection for the radio galaxy classification problem.

Submitted 28 May, 2024; originally announced May 2024.

Comments: Accepted to the 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)

arXiv:2404.06293 [pdf, ps, other]

Optimal Stopping with Interdependent Values

Authors: Simon Mauras, Divyarthi Mohan, Rebecca Reiffenhäuser

Abstract: We study online selection problems in both the prophet and secretary settings, when arriving agents have interdependent values. In the interdependent values model, introduced in the seminal work of Milgrom and Weber [1982], each agent has a private signal and the value of an agent is a function of the signals held by all agents. Results in online selection crucially rely on some degree of independence of values, which is conceptually at odds with the interdependent values model. For prophet and secretary models under the standard independent values assumption, prior works provide constant factor approximations to the welfare. On the other hand, when agents have interdependent values, prior works in Economics and Computer Science provide truthful mechanisms that obtain optimal and approximately optimal welfare under certain assumptions on the valuation functions. We bring together these two important lines of work and provide the first constant factor approximations for prophet and secretary problems with interdependent values. We consider both the algorithmic setting, where agents are non-strategic (but have interdependent values), and the mechanism design setting with strategic agents. All our results are constructive and use simple stopping rules.

Submitted 9 April, 2024; originally announced April 2024.

arXiv:2404.02973 [pdf, other]

Scaling Laws for Galaxy Images

Authors: Mike Walmsley, Micah Bowles, Anna M. M. Scaife, Jason Shingirai Makechemu, Alexander J. Gordon, Annette M. N. Ferguson, Robert G. Mann, James Pearson, Jürgen J. Popp, Jo Bovy, Josh Speagle, Hugh Dickinson, Lucy Fortson, Tobias Géron, Sandor Kruk, Chris J. Lintott, Kameswara Mantha, Devina Mohan, David O'Ryan, Inigo V. Slijepevic

Abstract: We present the first systematic investigation of supervised scaling laws outside of an ImageNet-like context - on images of galaxies. We use 840k galaxy images and over 100M annotations by Galaxy Zoo volunteers, comparable in scale to ImageNet-1K. We find that adding annotated galaxy images provides a power law improvement in performance across all architectures and all tasks, while adding trainable parameters is effective only for some (typically more subjectively challenging) tasks. We then compare the downstream performance of finetuned models pretrained on ImageNet-12k alone vs. additionally pretrained on our galaxy images. We achieve an average relative error rate reduction of 31% across 5 downstream tasks of scientific interest. Our finetuned models are more label-efficient and, unlike their ImageNet-12k-pretrained equivalents, often achieve linear transfer performance equal to that of end-to-end finetuning. We find relatively modest additional downstream benefits from scaling model size, implying that scaling alone is not sufficient to address our domain gap, and suggest that practitioners with qualitatively different images might benefit more from in-domain adaptation followed by targeted downstream labelling.

Submitted 3 April, 2024; originally announced April 2024.

Comments: 10+6 pages, 12 figures. Appendix C2 based on arxiv:2206.11927. Code, demos, documentation at https://github.com/mwalmsley/zoobot

arXiv:2404.01035 [pdf, other]

MICROSIM: A high performance phase-field solver based on CPU and GPU implementations

Authors: Tanmay Dutta, Dasari Mohan, Saurav Shenoy, Nasir Attar, Abhikshek Kalokhe, Ajay Sagar, Swapnil Bhure, Swaroop . S. Pradhan, Jitendriya Praharaj, Subham Mridha, Anshika Kushwaha, Vaishali Shah, M. P. Gururajan, V. Venkatesh Shenoi, Gandham Phanikumar, Saswata Bhattacharyya, Abhik Choudhury

Abstract: The phase-field method has become a useful tool for the simulation of classical metallurgical phase transformations as well as other phenomena related to materials science. The thermodynamic consistency that forms the basis of these formulations lends them strong predictive capabilities and utility. However, a strong impediment to the usage of the method for typical applied problems of industrial and academic relevance is the significant overhead with regard to the code development and know-how required for quantitative model formulations. In this paper, we report the development of an open-source phase-field software stack that contains generic formulations for the simulation of multi-phase and multi-component phase transformations. The solvers incorporate thermodynamic coupling that allows the realization of simulations with real alloys in scenarios directly relevant to the materials industry. Further, the solvers utilize parallelization strategies using either multiple CPUs or GPUs to provide cross-platform portability and usability on available supercomputing machines. Finally, the solver stack also contains a graphical user interface to gradually introduce the usage of the software. The user interface also provides a collection of post-processing tools that allow the estimation of useful metrics related to microstructural evolution.

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2402.12017 [pdf, ps, other]

Private Interdependent Valuations: New Bounds for Single-Item Auctions and Matroids

Authors: Alon Eden, Michal Feldman, Simon Mauras, Divyarthi Mohan

Abstract: We study auction design within the widely acclaimed model of interdependent values, introduced by Milgrom and Weber [1982]. In this model, every bidder $i$ has a private signal $s_i$ for the item for sale, and a public valuation function $v_i(s_1,\ldots,s_n)$ which maps every vector of private signals (of all bidders) into a real value. A recent line of work established the existence of approximately-optimal mechanisms within this framework, even in the more challenging scenario where each bidder's valuation function $v_i$ is also private. This body of work has primarily focused on single-item auctions with two natural classes of valuations: those exhibiting submodularity over signals (SOS) and $d$-critical valuations. In this work we advance the state of the art on interdependent values with private valuation functions, with respect to both SOS and $d$-critical valuations. For SOS valuations, we devise a new mechanism that gives an improved approximation bound of $5$ for single-item auctions. This mechanism employs a novel variant of an "eating mechanism", leveraging LP-duality to achieve feasibility with reduced welfare loss. For $d$-critical valuations, we broaden the scope of existing results beyond single-item auctions, introducing a mechanism that gives a $(d+1)$-approximation for any environment with matroid feasibility constraints on the set of agents that can be simultaneously served. Notably, this approximation bound is tight, even with respect to single-item auctions.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2312.10046 [pdf, other]

Deep Metric Learning for Computer Vision: A Brief Overview

Authors: Deen Dayal Mohan, Bhavin Jawade, Srirangaraj Setlur, Venu Govindaraj

Abstract: Objective functions that optimize deep neural networks play a vital role in creating an enhanced feature representation of the input data. Although cross-entropy-based loss formulations have been extensively used in a variety of supervised deep-learning applications, these methods tend to be less adequate when there is large intra-class variance and low inter-class variance in input data distribution. Deep Metric Learning seeks to develop methods that aim to measure the similarity between data samples by learning a representation function that maps these data samples into a representative embedding space. It leverages carefully designed sampling strategies and loss functions that aid in optimizing the generation of a discriminative embedding space even for distributions having low inter-class and high intra-class variances. In this chapter, we will provide an overview of recent progress in this area and discuss state-of-the-art Deep Metric Learning approaches.

Submitted 1 December, 2023; originally announced December 2023.

Comments: Book Chapter Published In Handbook of Statistics, Special Issue - Deep Learning 48, 59

arXiv:2311.08243 [pdf, other]

MCMC to address model misspecification in Deep Learning classification of Radio Galaxies

Authors: Devina Mohan, Anna Scaife

Abstract: The radio astronomy community is adopting deep learning techniques to deal with the huge data volumes expected from the next generation of radio observatories. Bayesian neural networks (BNNs) provide a principled way to model uncertainty in the predictions made by deep learning models and will play an important role in extracting well-calibrated uncertainty estimates from the outputs of these models. However, most commonly used approximate Bayesian inference techniques such as variational inference and MCMC-based algorithms experience a "cold posterior effect" (CPE), according to which the posterior must be down-weighted in order to get good predictive performance. The CPE has been linked to several factors such as data augmentation or dataset curation leading to a misspecified likelihood and prior misspecification. In this work we use MCMC sampling to show that a Gaussian parametric family is a poor variational approximation to the true posterior and gives rise to the CPE previously observed in morphological classification of radio galaxies using variational inference based BNNs.

Submitted 14 November, 2023; originally announced November 2023.

Comments: Accepted in Machine Learning and the Physical Sciences Workshop at NeurIPS 2023; 6 pages, 1 figure, 1 table

arXiv:2310.00958 [pdf, other]

Constant Approximation for Private Interdependent Valuations

Authors: Alon Eden, Michal Feldman, Kira Goldner, Simon Mauras, Divyarthi Mohan

Abstract: The celebrated model of auctions with interdependent valuations, introduced by Milgrom and Weber in 1982, has been studied almost exclusively under private signals $s_1, \ldots, s_n$ of the $n$ bidders and public valuation functions $v_i(s_1, \ldots, s_n)$. Recent work in TCS has shown that this setting admits a constant approximation to the optimal social welfare if the valuations satisfy a natural property called submodularity over signals (SOS). More recently, Eden et al. (2022) have extended the analysis of interdependent valuations to include settings with private signals and private valuations, and established $O(\log^2 n)$-approximation for SOS valuations. In this paper we show that this setting admits a constant factor approximation, settling the open question raised by Eden et al. (2022).

Submitted 2 October, 2023; originally announced October 2023.

Comments: In 64th IEEE Symposium on Foundations of Computer Science (FOCS 2023)

arXiv:2309.04691 [pdf, other]

Asynchronous Majority Dynamics on Binomial Random Graphs

Authors: Divyarthi Mohan, Pawel Pralat

Abstract: We study information aggregation in networks when agents interact to learn a binary state of the world. Initially each agent privately observes an independent signal which is "correct" with probability $\frac{1}{2}+δ$ for some $δ > 0$. At each round, a node is selected uniformly at random to update their public opinion to match the majority of their neighbours (breaking ties in favour of their initial private signal). Our main result shows that for sparse and connected binomial random graphs $\mathcal G(n,p)$ the process stabilizes in a "correct" consensus in $\mathcal O(n\log^2 n/\log\log n)$ steps with high probability. In fact, when $\log n/n \ll p = o(1)$ the process terminates at time $\hat T = (1+o(1))n\log n$, where $\hat T$ is the first time when all nodes have been selected at least once. However, in dense binomial random graphs with $p=Ω(1)$, there is an information cascade where the process terminates in the "incorrect" consensus with probability bounded away from zero.

Submitted 9 September, 2023; originally announced September 2023.
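
The update rule in the abstract above (pick a uniformly random node, adopt the majority opinion of its neighbours, break ties with the private signal) can be simulated in a few lines. The following is a minimal sketch, not the authors' code. It assumes that the true state is 1, that public opinions start unannounced (`None`), and that a selected node counts only neighbours that have already announced; the abstract does not spell out the initial public opinions, so that last choice is an assumption. All function names are hypothetical.

```python
import random


def gnp_graph(n, p, rng):
    """Sample adjacency lists of a binomial random graph G(n, p)."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def majority_opinion(v, adj, opinion, signal):
    """Majority of v's announced neighbours; ties go to v's private signal."""
    # Assumption: neighbours that have not announced yet (None) are ignored.
    announced = [opinion[u] for u in adj[v] if opinion[u] is not None]
    ones = sum(announced)
    zeros = len(announced) - ones
    if ones > zeros:
        return 1
    if zeros > ones:
        return 0
    return signal[v]


def is_stable(adj, opinion, signal):
    """True once every node has announced and no update would change anything."""
    return all(
        opinion[v] is not None
        and opinion[v] == majority_opinion(v, adj, opinion, signal)
        for v in range(len(adj))
    )


def simulate(n, p, delta, max_steps, seed=0):
    """Run the asynchronous dynamics; the true state is taken to be 1."""
    rng = random.Random(seed)
    adj = gnp_graph(n, p, rng)
    # Each private signal is correct (equal to 1) with probability 1/2 + delta.
    signal = [1 if rng.random() < 0.5 + delta else 0 for _ in range(n)]
    opinion = [None] * n
    for step in range(1, max_steps + 1):
        v = rng.randrange(n)  # node selected uniformly at random
        opinion[v] = majority_opinion(v, adj, opinion, signal)
        # Stability is only checked every n steps to keep the loop cheap,
        # so the returned step count is an upper estimate.
        if step % n == 0 and is_stable(adj, opinion, signal):
            return opinion, step
    return opinion, max_steps
```

For example, `opinion, steps = simulate(n=200, p=0.05, delta=0.1, max_steps=20000)` runs one trajectory; repeating over several seeds and values of `p` gives a rough empirical feel for the sparse and dense regimes the abstract contrasts, though it proves nothing about the asymptotic statements.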

arXiv:2307.10237 [pdf, other]

CoNAN: Conditional Neural Aggregation Network For Unconstrained Face Feature Fusion

Authors: Bhavin Jawade, Deen Dayal Mohan, Dennis Fedorishin, Srirangaraj Setlur, Venu Govindaraju

Abstract: Face recognition from image sets acquired under unregulated and uncontrolled settings, such as at large distances, low resolutions, varying viewpoints, illumination, pose, and atmospheric conditions, is challenging. Face feature aggregation, which involves aggregating a set of N feature representations present in a template into a single global representation, plays a pivotal role in such recognition systems. Existing works in traditional face feature aggregation either utilize metadata or high-dimensional intermediate feature representations to estimate feature quality for aggregation. However, generating high-quality metadata or style information is not feasible for extremely low-resolution faces captured in long-range and high altitude settings. To overcome these limitations, we propose a feature distribution conditioning approach called CoNAN for template aggregation. Specifically, our method aims to learn a context vector conditioned over the distribution information of the incoming feature set, which is utilized to weigh the features based on their estimated informativeness. The proposed method produces state-of-the-art results on long-range unconstrained face recognition datasets such as BTS and DroneSURF, validating the advantages of such an aggregation strategy.

Submitted 16 July, 2023; originally announced July 2023.

Comments: Paper accepted at IJCB 2023

arXiv:2307.05563 [pdf, other]

doi 10.1109/IJCB54206.2022.10007936

RidgeBase: A Cross-Sensor Multi-Finger Contactless Fingerprint Dataset

Authors: Bhavin Jawade, Deen Dayal Mohan, Srirangaraj Setlur, Nalini Ratha, Venu Govindaraju

Abstract: Contactless fingerprint matching using smartphone cameras can alleviate major challenges of traditional fingerprint systems including hygienic acquisition, portability and presentation attacks. However, development of practical and robust contactless fingerprint matching techniques is constrained by the limited availability of large scale real-world datasets. To motivate further advances in contactless fingerprint matching across sensors, we introduce the RidgeBase benchmark dataset. RidgeBase consists of more than 15,000 contactless and contact-based fingerprint image pairs acquired from 88 individuals under different background and lighting conditions using two smartphone cameras and one flatbed contact sensor. Unlike existing datasets, RidgeBase is designed to promote research under different matching scenarios that include Single Finger Matching and Multi-Finger Matching for both contactless-to-contactless (CL2CL) and contact-to-contactless (C2CL) verification and identification. Furthermore, due to the high intra-sample variance in contactless fingerprints belonging to the same finger, we propose a set-based matching protocol inspired by the advances in facial recognition datasets. This protocol is specifically designed for pragmatic contactless fingerprint matching that can account for variances in focus, polarity and finger-angles. We report qualitative and quantitative baseline results for different protocols using a COTS fingerprint matcher (Verifinger) and a Deep CNN based approach on the RidgeBase dataset. The dataset can be downloaded here: https://www.buffalo.edu/cubs/research/datasets/ridgebase-benchmark-dataset.html

Submitted 9 July, 2023; originally announced July 2023.

Comments: Paper accepted at IJCB 2022

Journal ref: 2022 IEEE International Joint Conference on Biometrics (IJCB), Abu Dhabi, United Arab Emirates, 2022, pp. 1-9

arXiv:2304.00714 [pdf, other]

Ensemble prosody prediction for expressive speech synthesis

Authors: Tian Huey Teh, Vivian Hu, Devang S Ram Mohan, Zack Hodari, Christopher G. R. Wallis, Tomás Gomez Ibarrondo, Alexandra Torresquintero, James Leoni, Mark Gales, Simon King

Abstract: Generating expressive speech with rich and varied prosody continues to be a challenge for Text-to-Speech. Most efforts have focused on sophisticated neural architectures intended to better model the data distribution. Yet, in evaluations it is generally found that no single model is preferred for all input texts. This suggests an approach that has rarely been used before for Text-to-Speech: an ensemble of models. We apply ensemble learning to prosody prediction. We construct simple ensembles of prosody predictors by varying either model architecture or model parameter values. To automatically select amongst the models in the ensemble when performing Text-to-Speech, we propose a novel, and computationally trivial, variance-based criterion. We demonstrate that even a small ensemble of prosody predictors yields useful diversity, which, combined with the proposed selection criterion, outperforms any individual model from the ensemble.

Submitted 3 April, 2023; originally announced April 2023.

Comments: ICASSP 2023

arXiv:2303.09446 [pdf, other]

Controllable Prosody Generation With Partial Inputs

Authors: Dan Andrei Iliescu, Devang Savita Ram Mohan, Tian Huey Teh, Zack Hodari

Abstract: We address the problem of human-in-the-loop control for generating prosody in the context of text-to-speech synthesis. Controlling prosody is challenging because existing generative models lack an efficient interface through which users can modify the output quickly and precisely. To solve this, we introduce a novel framework whereby the user provides partial inputs and the generative model generates the missing features. We propose a model that is specifically designed to encode partial prosodic features and output complete audio. We show empirically that our model displays two essential qualities of a human-in-the-loop control mechanism: efficiency and robustness. With even a very small number of input values (~4), our model enables users to improve the quality of the output significantly in terms of listener preference (4:1).

Submitted 15 April, 2024; v1 submitted 14 March, 2023; originally announced March 2023.

Comments: 5 pages

arXiv:2211.03019 [pdf, other]

Hear The Flow: Optical Flow-Based Self-Supervised Visual Sound Source Localization

Authors: Dennis Fedorishin, Deen Dayal Mohan, Bhavin Jawade, Srirangaraj Setlur, Venu Govindaraju

Abstract: Learning to localize the sound source in videos without explicit annotations is a novel area of audio-visual research. Existing work in this area focuses on creating attention maps to capture the correlation between the two modalities to localize the source of the sound. In a video, oftentimes, the objects exhibiting movement are the ones generating the sound. In this work, we capture this characteristic by modeling the optical flow in a video as a prior to better aid in localizing the sound source. We further demonstrate that the addition of flow-based attention substantially improves visual sound source localization. Finally, we benchmark our method on standard sound source localization datasets and achieve state-of-the-art performance on the Soundnet Flickr and VGG Sound Source datasets. Code: https://github.com/denfed/heartheflow.

Submitted 5 November, 2022; originally announced November 2022.

Comments: Accepted to WACV 2023

arXiv:2206.02948 [pdf, other]

Simple Mechanisms for Welfare Maximization in Rich Advertising Auctions

Authors: Gagan Aggarwal, Kshipra Bhawalkar, Aranyak Mehta, Divyarthi Mohan, Alexandros Psomas

Abstract: Internet ad auctions have evolved from a few lines of text to richer informational layouts that include images, sitelinks, videos, etc. Ads in these new formats occupy varying amounts of space, and an advertiser can provide multiple formats, only one of which can be shown. The seller is now faced with a multi-parameter mechanism design problem. Computing an efficient allocation is computationally intractable, and therefore the standard Vickrey-Clarke-Groves (VCG) auction, while truthful and welfare-optimal, is impractical. In this paper, we tackle a fundamental problem in the design of modern ad auctions. We adopt a "Myersonian" approach and study allocation rules that are monotone both in the bid and set of rich ads. We show that such rules can be paired with a payment function to give a truthful auction. Our main technical challenge is designing a monotone rule that yields a good approximation to the optimal welfare. Monotonicity doesn't hold for standard algorithms, e.g. the incremental bang-per-buck order, that give good approximations to "knapsack-like" problems such as ours. In fact, we show that no deterministic monotone rule can approximate the optimal welfare within a factor better than $2$ (while there is a non-monotone FPTAS). Our main result is a new, simple, greedy and monotone allocation rule that guarantees a $3$ approximation. In ad auctions in practice, monotone allocation rules are often paired with the so-called Generalized Second Price (GSP) payment rule, which charges the minimum threshold price below which the allocation changes. We prove that, even though our monotone allocation rule paired with GSP is not truthful, its Price of Anarchy (PoA) is bounded. Under the standard no-overbidding assumption, we prove a pure PoA bound of $6$ and a Bayes-Nash PoA bound of $\frac{6}{(1 - \frac{1}{e})}$. Finally, we experimentally test our algorithms on real-world data.
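To put the two Price of Anarchy bounds quoted in this abstract side by side, the short sketch below simply evaluates the closed-form expression numerically. The variable names are illustrative only and nothing here comes from the paper beyond the two stated bounds.

```python
import math

# Bounds as stated in the abstract (names are illustrative).
pure_poa_bound = 6.0
bayes_nash_poa_bound = 6.0 / (1.0 - 1.0 / math.e)

# The Bayes-Nash bound is weaker (larger) than the pure bound
# by a factor of e / (e - 1).
ratio = bayes_nash_poa_bound / pure_poa_bound

print(f"pure PoA bound:       {pure_poa_bound:.3f}")
print(f"Bayes-Nash PoA bound: {bayes_nash_poa_bound:.3f}")
print(f"ratio e/(e-1):        {ratio:.3f}")
```

Numerically, the Bayes-Nash bound is a little under 9.5.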

Submitted 6 June, 2022; originally announced June 2022.

arXiv:2205.13461 [pdf, ps, other]

Communicating with Anecdotes

Authors: Nika Haghtalab, Nicole Immorlica, Brendan Lucier, Markus Mobius, Divyarthi Mohan

Abstract: We study a communication game between a sender and a receiver. The sender chooses one of her signals about the state of the world (i.e., anecdotes) and communicates to the receiver who takes an action affecting both players. The sender and the receiver both care about the state of the world but are also influenced by personal preferences, so their ideal actions can differ. We characterize perfect Bayesian equilibria. The sender faces a temptation to persuade: she wants to select a biased anecdote to influence the receiver's action. Anecdotes are still informative to the receiver (who will debias at equilibrium) but the attempt to persuade comes at a cost to precision. This gives rise to informational homophily where the receiver prefers to listen to like-minded senders because they provide higher-precision signals. Communication becomes polarized when the sender is an expert with access to many signals, with the sender choosing extreme outlier anecdotes at equilibrium (unless preferences are perfectly aligned). This polarization dissipates all gains from communication with an increasingly well-informed sender when the anecdote distribution is heavy-tailed. Experts can therefore face a curse of informedness: receivers will prefer to listen to less-informed senders who cannot pick biased signals as easily.

Submitted 17 July, 2024; v1 submitted 26 May, 2022; originally announced May 2022.

Comments: Extended Abstract appeared at ITCS 2024. A preliminary version of this paper appeared under the title "Persuading with Anecdotes" as an NBER working paper

arXiv:2204.08044 [pdf, other]

doi 10.1137/1.9781611977554.ch18

Interdependent Public Projects

Authors: Avi Cohen, Michal Feldman, Divyarthi Mohan, Inbal Talgam-Cohen

Abstract: In the interdependent values (IDV) model introduced by Milgrom and Weber [1982], agents have private signals that capture their information about different social alternatives, and the valuation of every agent is a function of all agent signals. While interdependence has been mainly studied for auctions, it is extremely relevant for a large variety of social choice settings, including the canonical setting of public projects. The IDV model is very challenging relative to standard independent private values, and welfare guarantees have been achieved through two alternative conditions known as single-crossing and submodularity over signals (SOS). In either case, the existing theory falls short of solving the public projects setting. Our contribution is twofold: (i) We give a workable characterization of truthfulness for IDV public projects for the largest class of valuations for which such a characterization exists, and term this class decomposable valuations; (ii) We provide possibility and impossibility results for welfare approximation in public projects with SOS valuations. Our main impossibility result is that, in contrast to auctions, no universally truthful mechanism performs better for public projects with SOS valuations than choosing a project at random. Our main positive result applies to excludable public projects with SOS, for which we establish a constant factor approximation similar to auctions. Our results suggest that exclusion may be a key tool for achieving welfare guarantees in the IDV model.

Submitted 5 July, 2022; v1 submitted 17 April, 2022; originally announced April 2022.

arXiv:2201.01203 [pdf, other]

doi 10.1093/mnras/stac223

Quantifying Uncertainty in Deep Learning Approaches to Radio Galaxy Classification

Authors: Devina Mohan, Anna M. M. Scaife, Fiona Porter, Mike Walmsley, Micah Bowles

Abstract: In this work we use variational inference to quantify the degree of uncertainty in deep learning model predictions of radio galaxy classification. We show that the level of model posterior variance for individual test samples is correlated with human uncertainty when labelling radio galaxies. We explore the model performance and uncertainty calibration for different weight priors and suggest that a sparse prior produces more well-calibrated uncertainty estimates. Using the posterior distributions for individual weights, we demonstrate that we can prune 30% of the fully-connected layer weights without significant loss of performance by removing the weights with the lowest signal-to-noise ratio. A larger degree of pruning can be achieved using a Fisher information based ranking, but both pruning methods affect the uncertainty calibration for Fanaroff-Riley type I and type II radio galaxies differently. Like other work in this field, we experience a cold posterior effect, whereby the posterior must be down-weighted to achieve good predictive performance. We examine whether adapting the cost function to accommodate model misspecification can compensate for this effect, but find that it does not make a significant difference. We also examine the effect of principled data augmentation and find that this improves upon the baseline but also does not compensate for the observed effect. We interpret this as the cold posterior effect being due to the overly effective curation of our training sample leading to likelihood misspecification, and raise this as a potential issue for Bayesian deep learning approaches to radio galaxy classification in future.
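The signal-to-noise pruning step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a mean-field Gaussian posterior per weight and the common definition SNR = |mean| / standard deviation; the abstract does not spell out either choice, and the function name and toy values below are hypothetical.

```python
def snr_prune_mask(means, stds, fraction=0.3):
    """Return a keep-mask that drops the `fraction` of weights with the
    lowest signal-to-noise ratio, taken here as |mean| / std (an assumption)."""
    if len(means) != len(stds):
        raise ValueError("means and stds must have the same length")
    snr = [abs(m) / s for m, s in zip(means, stds)]
    n_prune = round(fraction * len(snr))
    # Indices sorted from lowest to highest SNR; the first n_prune are pruned.
    order = sorted(range(len(snr)), key=lambda i: snr[i])
    pruned = set(order[:n_prune])
    return [i not in pruned for i in range(len(snr))]


# Toy posterior over ten weights (made-up numbers, not from the paper).
means = [0.9, -0.05, 0.4, 0.01, -1.2, 0.3, 0.02, -0.7, 0.15, 0.6]
stds = [0.1, 0.5, 0.2, 0.4, 0.3, 0.3, 0.6, 0.2, 0.5, 0.1]
mask = snr_prune_mask(means, stds, fraction=0.3)
print(mask)
```

In this toy case the three weights whose posterior mean is small relative to its spread (indices 1, 3 and 6) are dropped, while large-magnitude, low-variance weights are kept.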

Submitted 24 January, 2022; v1 submitted 4 January, 2022; originally announced January 2022.

Comments: accepted by MNRAS

arXiv:2112.14801 [pdf, other]

doi 10.1080/17477778.2022.2039570

Modeling Prejudice and Its Effect on Societal Prosperity

Authors: Deep Inder Mohan, Arjun Verma, Shrisha Rao

Abstract: Existing studies on prejudice, which is important in multi-group dynamics in societies, focus on the social-psychological knowledge behind the processes involving prejudice and its propagation. We instead create a multi-agent framework that simulates the propagation of prejudice and measures its tangible impact on the prosperity of individuals as well as of larger social structures, including groups and factions within. Groups in society help us define prejudice, and factions represent smaller tight-knit circles of individuals with similar opinions. We model social interactions using the Continuous Prisoner's Dilemma (CPD) and a type of agent called a prejudiced agent, whose cooperation is affected by a prejudice attribute, updated over time based both on the agent's own experiences and those of others in its faction. Our simulations show that modeling prejudice as an exclusively out-group phenomenon generates implicit in-group promotion, which eventually leads to higher relative prosperity of the prejudiced population. This skew in prosperity is shown to be correlated to factors such as size difference between groups and the number of prejudiced agents in a group. Although prejudiced agents achieve higher prosperity within prejudiced societies, their presence degrades the overall prosperity levels of their societies. Our proposed system model can serve as a basis for promoting a deeper understanding of origins, propagation, and ramifications of prejudice through rigorous simulative studies grounded in apt theoretical backgrounds. This can help conduct impactful research on prominent social issues such as racism, religious discrimination, and unfair immigrant treatment. This model can also serve as a foundation to study other socio-psychological phenomena in tandem with prejudice such as the distribution of wealth, social status, and ethnocentrism in a society.

Submitted 29 December, 2021; originally announced December 2021.

Comments: 12 pages, 7 figures, 4 tables

Journal ref: Journal of Simulation, vol. 17 (6), 2023

arXiv:2111.11654 [pdf, other]

Weight Pruning and Uncertainty in Radio Galaxy Classification

Authors: Devina Mohan, Anna Scaife

Abstract: In this work we use variational inference to quantify the degree of epistemic uncertainty in model predictions of radio galaxy classification and show that the level of model posterior variance for individual test samples is correlated with human uncertainty when labelling radio galaxies. We explore the model performance and uncertainty calibration for a variety of different weight priors and suggest that a sparse prior produces more well-calibrated uncertainty estimates. Using the posterior distributions for individual weights, we show that signal-to-noise ratio (SNR) ranking allows pruning of the fully-connected layers to the level of 30% without significant loss of performance, and that this pruning increases the predictive uncertainty in the model. Finally we show that, like other work in this field, we experience a cold posterior effect. We examine whether adapting the cost function in our model to accommodate model misspecification can compensate for this effect, but find that it does not make a significant difference. We also examine the effect of principled data augmentation and find that it improves upon the baseline but does not compensate for the observed effect fully. We interpret this as the cold posterior effect being due to the overly effective curation of our training sample leading to likelihood misspecification, and raise this as a potential issue for Bayesian deep learning approaches to radio galaxy classification in future.

Submitted 29 November, 2021; v1 submitted 23 November, 2021; originally announced November 2021.

Comments: Accepted in: Fourth Workshop on Machine Learning and the Physical Sciences (35th Conference on Neural Information Processing Systems; NeurIPS2021); final version (corrected typo)

arXiv:2106.08352 [pdf, other]

Ctrl-P: Temporal Control of Prosodic Variation for Speech Synthesis

Authors: Devang S Ram Mohan, Vivian Hu, Tian Huey Teh, Alexandra Torresquintero, Christopher G. R. Wallis, Marlene Staib, Lorenzo Foglianti, Jiameng Gao, Simon King

Abstract: Text does not fully specify the spoken form, so text-to-speech models must be able to learn from speech data that vary in ways not explained by the corresponding text. One way to reduce the amount of unexplained variation in training data is to provide acoustic information as an additional learning signal. When generating speech, modifying this acoustic information enables multiple distinct renditions of a text to be produced. Since much of the unexplained variation is in the prosody, we propose a model that generates speech explicitly conditioned on the three primary acoustic correlates of prosody: $F_{0}$, energy and duration. The model is flexible about how the values of these features are specified: they can be externally provided, or predicted from text, or predicted then subsequently modified. Compared to a model that employs a variational auto-encoder to learn unsupervised latent features, our model provides more interpretable, temporally-precise, and disentangled control. When automatically predicting the acoustic features from text, it generates speech that is more natural than that from a Tacotron 2 model with reference encoder. Subsequent human-in-the-loop modification of the predicted acoustic features can significantly further increase naturalness.

Submitted 15 June, 2021; originally announced June 2021.

Comments: To be published in Interspeech 2021. 5 pages, 4 figures

arXiv:2106.08321 [pdf, other]

ADEPT: A Dataset for Evaluating Prosody Transfer

Authors: Alexandra Torresquintero, Tian Huey Teh, Christopher G. R. Wallis, Marlene Staib, Devang S Ram Mohan, Vivian Hu, Lorenzo Foglianti, Jiameng Gao, Simon King

Abstract: Text-to-speech is now able to achieve near-human naturalness and research focus has shifted to increasing expressivity. One popular method is to transfer the prosody from a reference speech sample. There have been considerable advances in using prosody transfer to generate more expressive speech, but the field lacks a clear definition of what successful prosody transfer means and a method for measuring it. We introduce a dataset of prosodically-varied reference natural speech samples for evaluating prosody transfer. The samples include global variations reflecting emotion and interpersonal attitude, and local variations reflecting topical emphasis, propositional attitude, syntactic phrasing and marked tonicity. The corpus only includes prosodic variations that listeners are able to distinguish with reasonable accuracy, and we report these figures as a benchmark against which text-to-speech prosody transfer can be compared. We conclude the paper with a demonstration of our proposed evaluation methodology, using the corpus to evaluate two text-to-speech models that perform prosody transfer.

Submitted 21 July, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

Comments: 5 pages, 1 figure, accepted to Interspeech 2021

arXiv:2104.09226 [pdf]

Machine learning approach to dynamic risk modeling of mortality in COVID-19: a UK Biobank study

Authors: Mohammad A. Dabbah, Angus B. Reed, Adam T. C. Booth, Arrash Yassaee, Alex Despotovic, Benjamin Klasmer, Emily Binning, Mert Aral, David Plans, Alain B. Labrique, Diwakar Mohan

Abstract: The COVID-19 pandemic has created an urgent need for robust, scalable monitoring tools supporting stratification of high-risk patients. This research aims to develop and validate prediction models, using the UK Biobank, to estimate COVID-19 mortality risk in confirmed cases. From the 11,245 participants testing positive for COVID-19, we develop a data-driven random forest classification model with excellent performance (AUC: 0.91), using baseline characteristics, pre-existing conditions, symptoms, and vital signs, such that the score could dynamically assess mortality risk with disease deterioration. We also identify several significant novel predictors of COVID-19 mortality with equivalent or greater predictive value than established high-risk comorbidities, such as detailed anthropometrics and prior acute kidney failure, urinary tract infection, and pneumonias. The model design and feature selection enable utility in outpatient settings. Possible applications include supporting individual-level risk profiling and monitoring disease progression across patients with COVID-19 at scale, especially in hospital-at-home settings.

Submitted 19 April, 2021; originally announced April 2021.

Comments: 20 pages, 3 figures

arXiv:2012.12123 [pdf]

Machine Learning Algorithm for NLOS Millimeter Wave in 5G V2X Communication

Authors: Deepika Mohan, G. G. Md. Nawaz Ali, Peter Han Joo Chong

Abstract: 5G vehicle-to-everything (V2X) communication for autonomous and semi-autonomous driving relies on wireless technology, and millimeter wave (mmWave) bands are widely used in this kind of vehicular network application. The main purpose of this paper is to broadcast messages from the mmWave Base Station (mmBS) to vehicles at LOS (line-of-sight) and NLOS (non-LOS). A Relay using Machine Learning (RML) algorithm is formulated to train the mmBS to identify blockages within its coverage area and to broadcast messages to vehicles at NLOS using LOS nodes as relays. Transmission is faster, with higher throughput, and covers a wider bandwidth that is reused; therefore, when machine learning is performed within the coverage area of the mmBS, most vehicles at NLOS can benefit. A unique relay mechanism combined with machine learning is proposed to communicate with mobile nodes at NLOS.

Submitted 16 December, 2020; originally announced December 2020.

Comments: 14 pages, 9 figures, conference 7th International conference on Computer Networks and Communications (CCNET 2020), AIRCC Publishing Corporation

Report number: Volume 10, number 17 MSC Class: 68M11 ACM Class: I.6; I.2.2; J.7

arXiv:2008.04107 [pdf, other]

doi 10.21437/Interspeech.2020-1821

Phonological Features for 0-shot Multilingual Speech Synthesis

Authors: Marlene Staib, Tian Huey Teh, Alexandra Torresquintero, Devang S Ram Mohan, Lorenzo Foglianti, Raphael Lenain, Jiameng Gao

Abstract: Code-switching---the intra-utterance use of multiple languages---is prevalent across the world. Within text-to-speech (TTS), multilingual models have been found to enable code-switching. By modifying the linguistic input to sequence-to-sequence TTS, we show that code-switching is possible for languages unseen during training, even within monolingual models. We use a small set of phonological features derived from the International Phonetic Alphabet (IPA), such as vowel height and frontness, consonant place and manner. This allows the model topology to stay unchanged for different languages, and enables new, previously unseen feature combinations to be interpreted by the model. We show that this allows us to generate intelligible, code-switched speech in a new language at test time, including the approximation of sounds never seen in training.

Submitted 6 August, 2020; originally announced August 2020.

Comments: 5 pages, to be presented at INTERSPEECH 2020

arXiv:2008.03096 [pdf, other]

doi 10.21437/Interspeech.2020-1822

Incremental Text to Speech for Neural Sequence-to-Sequence Models using Reinforcement Learning

Authors: Devang S Ram Mohan, Raphael Lenain, Lorenzo Foglianti, Tian Huey Teh, Marlene Staib, Alexandra Torresquintero, Jiameng Gao

Abstract: Modern approaches to text to speech require the entire input character sequence to be processed before any audio is synthesised. This latency limits the suitability of such models for time-sensitive tasks like simultaneous interpretation. Interleaving the action of reading a character with that of synthesising audio reduces this latency. However, the order of this sequence of interleaved actions varies across sentences, which raises the question of how the actions should be chosen. We propose a reinforcement learning based framework to train an agent to make this decision. We compare our performance against that of deterministic, rule-based systems. Our results demonstrate that our agent successfully balances the trade-off between the latency of audio generation and the quality of synthesised audio. More broadly, we show that neural sequence-to-sequence models can be adapted to run in an incremental manner.

Submitted 7 August, 2020; originally announced August 2020.

Comments: To be published in Interspeech 2020. 5 pages, 4 figures

arXiv:2006.12495 [pdf, other]

doi 10.1016/j.ecoser.2020.101176

Using graph theory and social media data to assess cultural ecosystem services in coastal areas: Method development and application

Authors: Ana Ruiz-Frau, Andres Ospina-Alvarez, Sebastián Villasante, Pablo Pita, Isidro Maya-Jariego, Silvia de Juan Mohan

Abstract: The use of social media (SM) data has emerged as a promising tool for the assessment of cultural ecosystem services (CES). Most studies have focused on the use of single SM platforms and on the analysis of photo content to assess the demand for CES. Here, we introduce a novel methodology for the assessment of CES using SM data through the application of graph theory network analyses (GTNA) on hashtags associated with SM posts and compare it to photo content analysis. We applied the proposed methodology on two SM platforms, Instagram and Twitter, on three worldwide known case study areas, namely Great Barrier Reef, Galapagos Islands and Easter Island. Our results indicate that the analysis of hashtags through graph theory offers similar capabilities to photo content analysis in the assessment of CES provision and the identification of CES providers. More importantly, GTNA provides greater capabilities for identifying relational values and eudaimonic aspects associated with nature, elusive aspects for photo content analysis. In addition, GTNA contributes to the reduction of the interpreter's bias associated with photo content analyses, since GTNA is based on the tags provided by the users themselves. The study also highlights the importance of considering data from different social media platforms, as the type of users and the information offered by these platforms can show different CES attributes. The ease of application and short computing processing times involved in the application of GTNA make it a cost-effective method with the potential of being applied to large geographical scales.

Submitted 20 June, 2020; originally announced June 2020.

Comments: 23 pages, 5 figures, 2 appendices

MSC Class: 14J60 (Primary) 92F05; 91D30; 91B76 (Secondary) ACM Class: J.3

Journal ref: Ecosystem Services, Volume 45, October 2020, 101176

arXiv:1912.10798 [pdf, other]

Deep learning predictions of sand dune migration

Authors: Kelly Kochanski, Divya Mohan, Jenna Horrall, Barry Rountree, Ghaleb Abdulla

Abstract: A dry decade in the Navajo Nation has killed vegetation, desiccated soils, and released once-stable sand into the wind. This sand now covers one-third of the Nation's land, threatening roads, gardens and hundreds of homes. Many arid regions have similar problems: global warming has increased dune movement across farmland in Namibia and Angola, and the southwestern US. Current dune models, unfortunately, do not scale well enough to provide useful forecasts for the $\sim$5\% of land surfaces covered by mobile sand. We test the ability of two deep learning algorithms, a GAN and a CNN, to model the motion of sand dunes. The models are trained on simulated data from a community-standard cellular automaton model of sand dunes. Preliminary results show the GAN producing reasonable forward predictions of dune migration at ten million times the speed of the existing model.

Submitted 12 December, 2019; originally announced December 2019.

Comments: Workshop on Tackling climate change with machine learning at NeurIPS. Vancouver, Canada, December 2019

arXiv:1907.05823 [pdf, ps, other]

doi 10.4230/LIPIcs.ICALP.2020.8

Asynchronous Majority Dynamics in Preferential Attachment Trees

Authors: Maryam Bahrani, Nicole Immorlica, Divyarthi Mohan, S. Matthew Weinberg

Abstract: We study information aggregation in networks where agents make binary decisions (labeled incorrect or correct). Agents initially form independent private beliefs about the better decision, which is correct with probability $1/2+δ$. The dynamics we consider are asynchronous (each round, a single agent updates their announced decision) and non-Bayesian (agents simply copy the majority announcements among their neighbors, tie-breaking in favor of their private signal). Our main result proves that when the network is a tree formed according to the preferential attachment model \cite{BarabasiA99}, with high probability, the process stabilizes in a correct majority within $O(n \log n/ \log\log n)$ rounds. We extend our results to other tree structures, including balanced $M$-ary trees for any $M$.
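The dynamics described in this abstract can be simulated in a few lines. The sketch below is an illustration under stated assumptions, not the authors' setup: the updating agent is drawn uniformly at random each round, every agent initially announces its private belief, and the tree is grown by attaching each new node to an existing node with probability proportional to its degree. All function names are hypothetical.

```python
import random


def preferential_attachment_tree(n, rng):
    """Grow a tree on nodes 0..n-1 (n >= 2): each new node attaches to one
    existing node chosen with probability proportional to its degree."""
    neighbors = {0: [1], 1: [0]}
    endpoints = [0, 1]  # every node appears once per incident edge
    for v in range(2, n):
        u = rng.choice(endpoints)  # degree-proportional choice
        neighbors[v] = [u]
        neighbors[u].append(v)
        endpoints.extend([u, v])
    return neighbors


def run_majority_dynamics(n, delta, rounds, seed=0):
    """Asynchronous majority dynamics; 1 means 'correct', 0 'incorrect'."""
    rng = random.Random(seed)
    neighbors = preferential_attachment_tree(n, rng)
    # Private beliefs are independently correct with probability 1/2 + delta.
    private = [1 if rng.random() < 0.5 + delta else 0 for _ in range(n)]
    announced = list(private)  # assumption: first announcement = private belief
    for _ in range(rounds):
        v = rng.randrange(n)  # assumption: uniformly random scheduler
        ones = sum(announced[u] for u in neighbors[v])
        zeros = len(neighbors[v]) - ones
        if ones > zeros:
            announced[v] = 1
        elif zeros > ones:
            announced[v] = 0
        else:
            announced[v] = private[v]  # tie-break with the private signal
    return announced


final = run_majority_dynamics(n=200, delta=0.1, rounds=5000, seed=1)
print("fraction announcing the correct decision:", sum(final) / len(final))
```

Running it for several seeds and values of `delta` gives an informal feel for how often the process ends in a correct majority; it is not a substitute for the high-probability guarantee stated in the abstract.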

Submitted 7 July, 2020; v1 submitted 12 July, 2019; originally announced July 2019.

Comments: ICALP 2020

arXiv:1905.05231 [pdf, ps, other]

doi 10.1109/FOCS.2019.00023

Approximation Schemes for a Unit-Demand Buyer with Independent Items via Symmetries

Authors: Pravesh Kothari, Divyarthi Mohan, Ariel Schvartzman, Sahil Singla, S. Matthew Weinberg

Abstract: We consider a revenue-maximizing seller with $n$ items facing a single buyer. We introduce the notion of symmetric menu complexity of a mechanism, which counts the number of distinct options the buyer may purchase, up to permutations of the items. Our main result is that a mechanism of quasi-polynomial symmetric menu complexity suffices to guarantee a $(1-\varepsilon)$-approximation when the buyer is unit-demand over independent items, even when the value distribution is unbounded, and that this mechanism can be found in quasi-polynomial time. Our key technical result is a polynomial time, (symmetric) menu-complexity-preserving black-box reduction from achieving a $(1-\varepsilon)$-approximation for unbounded valuations that are subadditive over independent items to achieving a $(1-O(\varepsilon))$-approximation when the values are bounded (and still subadditive over independent items). We further apply this reduction to deduce approximation schemes for a suite of valuation classes beyond our main result. Finally, we show that selling separately (which has exponential menu complexity) can be approximated up to a $(1-\varepsilon)$ factor with a menu of efficient-linear $(f(\varepsilon) \cdot n)$ symmetric menu complexity.
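One plausible reading of "distinct options up to permutations of the items" is sketched below: each menu option is a pair (allocation vector, price), and two options are treated as the same if one allocation is a reordering of the other at the same price. This is only an illustration of that reading; the paper's formal definition may differ, and the function name and toy menu are hypothetical.

```python
def symmetric_menu_complexity(menu):
    """Count menu options up to permutations of the items.

    `menu` is an iterable of (allocation, price) pairs, where `allocation`
    is a tuple with one entry per item. Sorting the allocation gives a
    canonical representative of its orbit under item permutations.
    """
    canonical = {(tuple(sorted(allocation)), price) for allocation, price in menu}
    return len(canonical)


# Toy menu over three items: any single item at price 5, or nothing for free.
menu = [
    ((1, 0, 0), 5),
    ((0, 1, 0), 5),
    ((0, 0, 1), 5),
    ((0, 0, 0), 0),
]
print("menu complexity:          ", len(set(menu)))
print("symmetric menu complexity:", symmetric_menu_complexity(menu))
```

In the toy menu the three single-item options collapse into one class, so four options count as two once item identities are ignored.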

Submitted 19 November, 2019; v1 submitted 13 May, 2019; originally announced May 2019.

Comments: FOCS 2019

arXiv:1803.08235 [pdf, other]

doi 10.1140/epjc/s10052-018-6105-5

Bayesian analysis of bulk viscous matter dominated universe

Authors: Athira Sasidharan, N D Jerin Mohan, Moncy V John, Titus K Mathew

Abstract: In our previous works, we analyzed the evolution of a bulk viscous matter dominated universe with a more general form for the bulk viscous coefficient, $ζ=ζ_{0}+ζ_{1}\frac{\dot{a}}{a}+ζ_{2}\frac{\ddot{a}}{\dot{a}}$, and also carried out a dynamical system analysis. We found that the model reasonably describes the evolution of the universe if the viscous coefficient is a constant. In the present work we contrast this model with the standard $Λ$CDM model of the universe using the Bayesian method. We show that, even though the viscous model gives a reasonable background evolution of the universe, the Bayes factor indicates that the model is not strongly superior to the $Λ$CDM model, but has a slight advantage over it.
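For orientation, the coefficient quoted in this abstract can be rewritten in terms of the Hubble parameter. The only ingredient not stated in the abstract is the standard definition $H=\dot{a}/a$, which is assumed here.

```latex
\begin{align}
H &\equiv \frac{\dot a}{a}, \qquad
\dot H = \frac{\ddot a}{a} - H^{2}
\;\Longrightarrow\;
\frac{\ddot a}{\dot a} = \frac{\ddot a / a}{\dot a / a} = H + \frac{\dot H}{H}, \\
\zeta &= \zeta_{0} + \zeta_{1} H + \zeta_{2}\left(H + \frac{\dot H}{H}\right)
       = \zeta_{0} + (\zeta_{1} + \zeta_{2})\,H + \zeta_{2}\,\frac{\dot H}{H}.
\end{align}
```

Setting $\zeta_{1}=\zeta_{2}=0$ recovers the constant-coefficient case that the abstract singles out as giving a reasonable description.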

Submitted 20 April, 2018; v1 submitted 22 March, 2018; originally announced March 2018.

Comments: 15 pages, 9 figures

arXiv:1103.4712 [pdf]

doi 10.5121/sipij.2011.2111

Distributed Video Coding: Codec Architecture and Implementation

Authors: Vijay Kumar Kodavalla, Dr. P. G. Krishna Mohan

Abstract: Distributed Video Coding (DVC) is a new coding paradigm for video compression, based on the Slepian-Wolf (lossless coding) and Wyner-Ziv (lossy coding) information theoretic results. DVC is useful for emerging applications such as wireless video cameras, wireless low-power surveillance networks and disposable video cameras for medical applications. The primary objective of DVC is low-complexity video encoding, where the bulk of the computation is shifted to the decoder, as opposed to the low-complexity decoder of conventional video compression standards such as H.264 and MPEG. There are a couple of early architectures and implementations of DVC: from Stanford University[2][3] in 2002, Berkeley University PRISM (Power-efficient, Robust, hIgh-compression, Syndrome-based Multimedia coding)[4][5] in 2002 and the European project DISCOVER (DIStributed COding for Video SERvices)[6] in 2007. There are primarily two types of DVC techniques: pixel-domain and transform-domain. A transform-domain design has better rate-distortion (RD) performance because it exploits the spatial correlation between neighbouring samples and compacts the block energy into as few transform coefficients as possible (energy compaction). In this paper, the architecture, implementation details and "C" model results of our transform-domain DVC are presented.

Submitted 24 March, 2011; originally announced March 2011.

Comments: Signal & Image Processing : An International Journal(SIPIJ) Vol.2, No.1, March 2011

Showing 1–34 of 34 results for author: Mohan, D