subscribe to arXiv mailings

Persuasion Games using Large Language Models

Authors: Ganesh Prasath Ramani, Shirish Karande, Santhosh V, Yash Bhatia

Abstract: Large Language Models (LLMs) have emerged as formidable instruments capable of comprehending and producing human-like text. This paper explores the potential of LLMs, to shape user perspectives and subsequently influence their decisions on particular tasks. This capability finds applications in diverse domains such as Investment, Credit cards and Insurance, wherein they assist users in selecting a… ▽ More Large Language Models (LLMs) have emerged as formidable instruments capable of comprehending and producing human-like text. This paper explores the potential of LLMs, to shape user perspectives and subsequently influence their decisions on particular tasks. This capability finds applications in diverse domains such as Investment, Credit cards and Insurance, wherein they assist users in selecting appropriate insurance policies, investment plans, Credit cards, Retail, as well as in Behavioral Change Support Systems (BCSS). We present a sophisticated multi-agent framework wherein a consortium of agents operate in collaborative manner. The primary agent engages directly with user agents through persuasive dialogue, while the auxiliary agents perform tasks such as information retrieval, response analysis, development of persuasion strategies, and validation of facts. Empirical evidence from our experiments demonstrates that this collaborative methodology significantly enhances the persuasive efficacy of the LLM. We continuously analyze the resistance of the user agent to persuasive efforts and counteract it by employing a combination of rule-based and LLM-based resistance-persuasion mapping techniques. We employ simulated personas and generate conversations in insurance, banking, and retail domains to evaluate the proficiency of large language models (LLMs) in recognizing, adjusting to, and influencing various personality types. Concurrently, we examine the resistance mechanisms employed by LLM simulated personas. Persuasion is quantified via measurable surveys before and after interaction, LLM-generated scores on conversation, and user decisions (purchase or non-purchase). △ Less

Submitted 1 September, 2024; v1 submitted 28 August, 2024; originally announced August 2024.

arXiv:2407.18541 [pdf, other]

Towards Improving NAM-to-Speech Synthesis Intelligibility using Self-Supervised Speech Models

Authors: Neil Shah, Shirish Karande, Vineet Gandhi

Abstract: We propose a novel approach to significantly improve the intelligibility in the Non-Audible Murmur (NAM)-to-speech conversion task, leveraging self-supervision and sequence-to-sequence (Seq2Seq) learning techniques. Unlike conventional methods that explicitly record ground-truth speech, our methodology relies on self-supervision and speech-to-speech synthesis to simulate ground-truth speech. Despi… ▽ More We propose a novel approach to significantly improve the intelligibility in the Non-Audible Murmur (NAM)-to-speech conversion task, leveraging self-supervision and sequence-to-sequence (Seq2Seq) learning techniques. Unlike conventional methods that explicitly record ground-truth speech, our methodology relies on self-supervision and speech-to-speech synthesis to simulate ground-truth speech. Despite utilizing simulated speech, our method surpasses the current state-of-the-art (SOTA) by 29.08% improvement in the Mel-Cepstral Distortion (MCD) metric. Additionally, we present error rates and demonstrate our model's proficiency to synthesize speech in novel voices of interest. Moreover, we present a methodology for augmenting the existing CSTR NAM TIMIT Plus corpus, setting a benchmark with a Word Error Rate (WER) of 42.57% to gauge the intelligibility of the synthesized speech. Speech samples can be found at https://nam2speech.github.io/NAM2Speech/ △ Less

Submitted 26 July, 2024; originally announced July 2024.

Comments: Accepted at Interspeech 2024

arXiv:2312.01398 [pdf, other]

Towards Mitigating Perceived Unfairness in Contracts from a Non-Legal Stakeholder's Perspective

Authors: Anmol Singhal, Preethu Rose Anish, Shirish Karande, Smita Ghaisas

Abstract: Commercial contracts are known to be a valuable source for deriving project-specific requirements. However, contract negotiations mainly occur among the legal counsel of the parties involved. The participation of non-legal stakeholders, including requirement analysts, engineers, and solution architects, whose primary responsibility lies in ensuring the seamless implementation of contractual terms,… ▽ More Commercial contracts are known to be a valuable source for deriving project-specific requirements. However, contract negotiations mainly occur among the legal counsel of the parties involved. The participation of non-legal stakeholders, including requirement analysts, engineers, and solution architects, whose primary responsibility lies in ensuring the seamless implementation of contractual terms, is often indirect and inadequate. Consequently, a significant number of sentences in contractual clauses, though legally accurate, can appear unfair from an implementation perspective to non-legal stakeholders. This perception poses a problem since requirements indicated in the clauses are obligatory and can involve punitive measures and penalties if not implemented as committed in the contract. Therefore, the identification of potentially unfair clauses in contracts becomes crucial. In this work, we conduct an empirical study to analyze the perspectives of different stakeholders regarding contractual fairness. We then investigate the ability of Pre-trained Language Models (PLMs) to identify unfairness in contractual sentences by comparing chain of thought prompting and semi-supervised fine-tuning approaches. Using BERT-based fine-tuning, we achieved an accuracy of 84% on a dataset consisting of proprietary contracts. It outperformed chain of thought prompting using Vicuna-13B by a margin of 9%. △ Less

Submitted 3 December, 2023; originally announced December 2023.

Comments: 9 pages, 2 figures, to be published in Natural Legal Language Processing Workshop at EMNLP 2023

arXiv:2311.13885 [pdf, other]

Can Physics Informed Neural Operators Self Improve?

Authors: Ritam Majumdar, Amey Varhade, Shirish Karande, Lovekesh Vig

Abstract: Self-training techniques have shown remarkable value across many deep learning models and tasks. However, such techniques remain largely unexplored when considered in the context of learning fast solvers for systems of partial differential equations (Eg: Neural Operators). In this work, we explore the use of self-training for Fourier Neural Operators (FNO). Neural Operators emerged as a data drive… ▽ More Self-training techniques have shown remarkable value across many deep learning models and tasks. However, such techniques remain largely unexplored when considered in the context of learning fast solvers for systems of partial differential equations (Eg: Neural Operators). In this work, we explore the use of self-training for Fourier Neural Operators (FNO). Neural Operators emerged as a data driven technique, however, data from experiments or traditional solvers is not always readily available. Physics Informed Neural Operators (PINO) overcome this constraint by utilizing a physics loss for the training, however the accuracy of PINO trained without data does not match the performance obtained by training with data. In this work we show that self-training can be used to close this gap in performance. We examine canonical examples, namely the 1D-Burgers and 2D-Darcy PDEs, to showcase the efficacy of self-training. Specifically, FNOs, when trained exclusively with physics loss through self-training, approach 1.07x for Burgers and 1.02x for Darcy, compared to FNOs trained with both data and physics loss. Furthermore, we discover that pseudo-labels can be used for self-training without necessarily training to convergence in each iteration. A consequence of this is that we are able to discover self-training schedules that improve upon the baseline performance of PINO in terms of accuracy as well as time. △ Less

Submitted 23 November, 2023; originally announced November 2023.

Comments: Paper accepted as a Spotlight talk at Symbiosis of Deep Learning and Differential Equations, Neural Information Processing Systems 2023

arXiv:2308.09293 [pdf, other]

How important are specialized transforms in Neural Operators?

Authors: Ritam Majumdar, Shirish Karande, Lovekesh Vig

Abstract: Simulating physical systems using Partial Differential Equations (PDEs) has become an indispensible part of modern industrial process optimization. Traditionally, numerical solvers have been used to solve the associated PDEs, however recently Transform-based Neural Operators such as the Fourier Neural Operator and Wavelet Neural Operator have received a lot of attention for their potential to prov… ▽ More Simulating physical systems using Partial Differential Equations (PDEs) has become an indispensible part of modern industrial process optimization. Traditionally, numerical solvers have been used to solve the associated PDEs, however recently Transform-based Neural Operators such as the Fourier Neural Operator and Wavelet Neural Operator have received a lot of attention for their potential to provide fast solutions for systems of PDEs. In this work, we investigate the importance of the transform layers to the reported success of transform based neural operators. In particular, we record the cost in terms of performance, if all the transform layers are replaced by learnable linear layers. Surprisingly, we observe that linear layers suffice to provide performance comparable to the best-known transform-based layers and seem to do so with a compute time advantage as well. We believe that this observation can have significant implications for future work on Neural Operators, and might point to other sources of efficiencies for these architectures. △ Less

Submitted 18 August, 2023; originally announced August 2023.

Comments: 8 pages, 3 figures, 4 tables

arXiv:2308.09290 [pdf, other]

HyperLoRA for PDEs

Authors: Ritam Majumdar, Vishal Jadhav, Anirudh Deodhar, Shirish Karande, Lovekesh Vig, Venkataramana Runkana

Abstract: Physics-informed neural networks (PINNs) have been widely used to develop neural surrogates for solutions of Partial Differential Equations. A drawback of PINNs is that they have to be retrained with every change in initial-boundary conditions and PDE coefficients. The Hypernetwork, a model-based meta learning technique, takes in a parameterized task embedding as input and predicts the weights of… ▽ More Physics-informed neural networks (PINNs) have been widely used to develop neural surrogates for solutions of Partial Differential Equations. A drawback of PINNs is that they have to be retrained with every change in initial-boundary conditions and PDE coefficients. The Hypernetwork, a model-based meta learning technique, takes in a parameterized task embedding as input and predicts the weights of PINN as output. Predicting weights of a neural network however, is a high-dimensional regression problem, and hypernetworks perform sub-optimally while predicting parameters for large base networks. To circumvent this issue, we use a low ranked adaptation (LoRA) formulation to decompose every layer of the base network into low-ranked tensors and use hypernetworks to predict the low-ranked tensors. Despite the reduced dimensionality of the resulting weight-regression problem, LoRA-based Hypernetworks violate the underlying physics of the given task. We demonstrate that the generalization capabilities of LoRA-based hypernetworks drastically improve when trained with an additional physics-informed loss component (HyperPINN) to satisfy the governing differential equations. We observe that LoRA-based HyperPINN training allows us to learn fast solutions for parameterized PDEs like Burger's equation and Navier Stokes: Kovasznay flow, while having an 8x reduction in prediction parameters on average without compromising on accuracy when compared to all other baselines. △ Less

Submitted 18 August, 2023; originally announced August 2023.

Comments: 8 pages, 4 figures, 3 Tables

arXiv:2304.03287 [pdf, other]

Synthesis of Mathematical programs from Natural Language Specifications

Authors: Ganesh Prasath, Shirish Karande

Abstract: Several decision problems that are encountered in various business domains can be modeled as mathematical programs, i.e. optimization problems. The process of conducting such modeling often requires the involvement of experts trained in operations research and advanced algorithms. Surprisingly, despite the significant advances in the methods for program and code synthesis, AutoML, learning to opti… ▽ More Several decision problems that are encountered in various business domains can be modeled as mathematical programs, i.e. optimization problems. The process of conducting such modeling often requires the involvement of experts trained in operations research and advanced algorithms. Surprisingly, despite the significant advances in the methods for program and code synthesis, AutoML, learning to optimize etc., there has been little or no attention paid to automating the task of synthesizing mathematical programs. We imagine a scenario where the specifications for modeling, i.e. the objective and constraints are expressed in an unstructured form in natural language (NL) and the mathematical program has to be synthesized from such an NL specification. In this work we evaluate the efficacy of employing CodeT5 with data augmentation and post-processing of beams. We utilize GPT-3 with back translation for generation of synthetic examples. Further we apply rules of linear programming to score beams and correct beams based on common error patterns. We observe that with these enhancements CodeT5 base gives an execution accuracy of 0.73 which is significantly better than zero-shot execution accuracy of 0.41 by ChatGPT and 0.36 by Codex. △ Less

Submitted 30 March, 2023; originally announced April 2023.

Comments: Accepted in ICLR 2023 DL4C stream

arXiv:2303.14194 [pdf, other]

DeepEpiSolver: Unravelling Inverse problems in Covid, HIV, Ebola and Disease Transmission

Authors: Ritam Majumdar, Shirish Karande, Lovekesh Vig

Abstract: The spread of many infectious diseases is modeled using variants of the SIR compartmental model, which is a coupled differential equation. The coefficients of the SIR model determine the spread trajectories of disease, on whose basis proactive measures can be taken. Hence, the coefficient estimates must be both fast and accurate. Shaier et al. in the paper "Disease Informed Neural Networks" used P… ▽ More The spread of many infectious diseases is modeled using variants of the SIR compartmental model, which is a coupled differential equation. The coefficients of the SIR model determine the spread trajectories of disease, on whose basis proactive measures can be taken. Hence, the coefficient estimates must be both fast and accurate. Shaier et al. in the paper "Disease Informed Neural Networks" used Physics Informed Neural Networks (PINNs) to estimate the parameters of the SIR model. There are two drawbacks to this approach. First, the training time for PINNs is high, with certain diseases taking close to 90 hrs to train. Second, PINNs don't generalize for a new SIDR trajectory, and learning its corresponding SIR parameters requires retraining the PINN from scratch. In this work, we aim to eliminate both of these drawbacks. We generate a dataset between the parameters of ODE and the spread trajectories by solving the forward problem for a large distribution of parameters using the LSODA algorithm. We then use a neural network to learn the mapping between spread trajectories and coefficients of SIDR in an offline manner. This allows us to learn the parameters of a new spread trajectory without having to retrain, enabling generalization at test time. We observe a speed-up of 3-4 orders of magnitude with accuracy comparable to that of PINNs for 11 highly infectious diseases. Further finetuning of neural network inferred ODE coefficients using PINN further leads to 2-3 orders improvement of estimated coefficients. △ Less

Submitted 24 March, 2023; originally announced March 2023.

Comments: Publication accepted at International Conference for Learning Representations 2023: First Workshop in Machine Learning and Global Health

arXiv:2303.07009 [pdf, other]

Symbolic Regression for PDEs using Pruned Differentiable Programs

Authors: Ritam Majumdar, Vishal Jadhav, Anirudh Deodhar, Shirish Karande, Lovekesh Vig, Venkataramana Runkana

Abstract: Physics-informed Neural Networks (PINNs) have been widely used to obtain accurate neural surrogates for a system of Partial Differential Equations (PDE). One of the major limitations of PINNs is that the neural solutions are challenging to interpret, and are often treated as black-box solvers. While Symbolic Regression (SR) has been studied extensively, very few works exist which generate analytic… ▽ More Physics-informed Neural Networks (PINNs) have been widely used to obtain accurate neural surrogates for a system of Partial Differential Equations (PDE). One of the major limitations of PINNs is that the neural solutions are challenging to interpret, and are often treated as black-box solvers. While Symbolic Regression (SR) has been studied extensively, very few works exist which generate analytical expressions to directly perform SR for a system of PDEs. In this work, we introduce an end-to-end framework for obtaining mathematical expressions for solutions of PDEs. We use a trained PINN to generate a dataset, upon which we perform SR. We use a Differentiable Program Architecture (DPA) defined using context-free grammar to describe the space of symbolic expressions. We improve the interpretability by pruning the DPA in a depth-first manner using the magnitude of weights as our heuristic. On average, we observe a 95.3% reduction in parameters of DPA while maintaining accuracy at par with PINNs. Furthermore, on an average, pruning improves the accuracy of DPA by 7.81% . We demonstrate our framework outperforms the existing state-of-the-art SR solvers on systems of complex PDEs like Navier-Stokes: Kovasznay flow and Taylor-Green Vortex flow. Furthermore, we produce analytical expressions for a complex industrial use-case of an Air-Preheater, without suffering from performance loss viz-a-viz PINNs. △ Less

Submitted 13 March, 2023; originally announced March 2023.

Comments: Publication accepted at International Conference for Learning Representations 2023: Physics for Machine Learning

arXiv:2303.02648 [pdf, other]

Comparative study of Transformer and LSTM Network with attention mechanism on Image Captioning

Authors: Pranav Dandwate, Chaitanya Shahane, Vandana Jagtap, Shridevi C. Karande

Abstract: In a globalized world at the present epoch of generative intelligence, most of the manual labour tasks are automated with increased efficiency. This can support businesses to save time and money. A crucial component of generative intelligence is the integration of vision and language. Consequently, image captioning become an intriguing area of research. There have been multiple attempts by the res… ▽ More In a globalized world at the present epoch of generative intelligence, most of the manual labour tasks are automated with increased efficiency. This can support businesses to save time and money. A crucial component of generative intelligence is the integration of vision and language. Consequently, image captioning become an intriguing area of research. There have been multiple attempts by the researchers to solve this problem with different deep learning architectures, although the accuracy has increased, but the results are still not up to standard. This study buckles down to the comparison of Transformer and LSTM with attention block model on MS-COCO dataset, which is a standard dataset for image captioning. For both the models we have used pretrained Inception-V3 CNN encoder for feature extraction of the images. The Bilingual Evaluation Understudy score (BLEU) is used to checked the accuracy of caption generated by both models. Along with the transformer and LSTM with attention block models,CLIP-diffusion model, M2-Transformer model and the X-Linear Attention model have been discussed with state of the art accuracy. △ Less

Submitted 5 March, 2023; originally announced March 2023.

Comments: 13 pages, 7 figures, 2 tables

arXiv:2212.10032 [pdf, other]

Real-time Health Monitoring of Heat Exchangers using Hypernetworks and PINNs

Authors: Ritam Majumdar, Vishal Jadhav, Anirudh Deodhar, Shirish Karande, Lovekesh Vig, Venkataramana Runkana

Abstract: We demonstrate a Physics-informed Neural Network (PINN) based model for real-time health monitoring of a heat exchanger, that plays a critical role in improving energy efficiency of thermal power plants. A hypernetwork based approach is used to enable the domain-decomposed PINN learn the thermal behavior of the heat exchanger in response to dynamic boundary conditions, eliminating the need to re-t… ▽ More We demonstrate a Physics-informed Neural Network (PINN) based model for real-time health monitoring of a heat exchanger, that plays a critical role in improving energy efficiency of thermal power plants. A hypernetwork based approach is used to enable the domain-decomposed PINN learn the thermal behavior of the heat exchanger in response to dynamic boundary conditions, eliminating the need to re-train. As a result, we achieve orders of magnitude reduction in inference time in comparison to existing PINNs, while maintaining the accuracy on par with the physics-based simulations. This makes the approach very attractive for predictive maintenance of the heat exchanger in digital twin environments. △ Less

Submitted 20 December, 2022; originally announced December 2022.

Comments: Neural Information Processing Systems 2022: The Machine Learning and the Physical Sciences workshop

arXiv:2207.06240 [pdf, ps, other]

Physics Informed Symbolic Networks

Authors: Ritam Majumdar, Vishal Jadhav, Anirudh Deodhar, Shirish Karande, Lovekesh Vig, Venkataramana Runkana

Abstract: We introduce Physics Informed Symbolic Networks (PISN) which utilize physics-informed loss to obtain a symbolic solution for a system of Partial Differential Equations (PDE). Given a context-free grammar to describe the language of symbolic expressions, we propose to use weighted sum as continuous approximation for selection of a production rule. We use this approximation to define multilayer symb… ▽ More We introduce Physics Informed Symbolic Networks (PISN) which utilize physics-informed loss to obtain a symbolic solution for a system of Partial Differential Equations (PDE). Given a context-free grammar to describe the language of symbolic expressions, we propose to use weighted sum as continuous approximation for selection of a production rule. We use this approximation to define multilayer symbolic networks. We consider Kovasznay flow (Navier-Stokes) and two-dimensional viscous Burger's equations to illustrate that PISN are able to provide a performance comparable to PINNs across various start-of-the-art advances: multiple outputs and governing equations, domain-decomposition, hypernetworks. Furthermore, we propose Physics-informed Neurosymbolic Networks (PINSN) which employ a multilayer perceptron (MLP) operator to model the residue of symbolic networks. PINSNs are observed to give 2-3 orders of performance gain over standard PINN. △ Less

Submitted 20 December, 2022; v1 submitted 11 July, 2022; originally announced July 2022.

Comments: Neural Information Processing Systems 2022: The Symbiosis of Deep Learning and Differential Equations Workshop

arXiv:2203.10085 [pdf, other]

I Know Therefore I Score: Label-Free Crafting of Scoring Functions using Constraints Based on Domain Expertise

Authors: Ragja Palakkadavath, Sarath Sivaprasad, Shirish Karande, Niranjan Pedanekar

Abstract: Several real-life applications require crafting concise, quantitative scoring functions (also called rating systems) from measured observations. For example, an effectiveness score needs to be created for advertising campaigns using a number of engagement metrics. Experts often need to create such scoring functions in the absence of labelled data, where the scores need to reflect business insights… ▽ More Several real-life applications require crafting concise, quantitative scoring functions (also called rating systems) from measured observations. For example, an effectiveness score needs to be created for advertising campaigns using a number of engagement metrics. Experts often need to create such scoring functions in the absence of labelled data, where the scores need to reflect business insights and rules as understood by the domain experts. Without a way to capture these inputs systematically, this becomes a time-consuming process involving trial and error. In this paper, we introduce a label-free practical approach to learn a scoring function from multi-dimensional numerical data. The approach incorporates insights and business rules from domain experts in the form of easily observable and specifiable constraints, which are used as weak supervision by a machine learning model. We convert such constraints into loss functions that are optimized simultaneously while learning the scoring function. We examine the efficacy of the approach using a synthetic dataset as well as four real-life datasets, and also compare how it performs vis-a-vis supervised learning models. △ Less

Submitted 18 March, 2022; originally announced March 2022.

arXiv:1703.07131 [pdf, other]

Knowledge distillation using unlabeled mismatched images

Authors: Mandar Kulkarni, Kalpesh Patil, Shirish Karande

Abstract: Current approaches for Knowledge Distillation (KD) either directly use training data or sample from the training data distribution. In this paper, we demonstrate effectiveness of 'mismatched' unlabeled stimulus to perform KD for image classification networks. For illustration, we consider scenarios where this is a complete absence of training data, or mismatched stimulus has to be used for augment… ▽ More Current approaches for Knowledge Distillation (KD) either directly use training data or sample from the training data distribution. In this paper, we demonstrate effectiveness of 'mismatched' unlabeled stimulus to perform KD for image classification networks. For illustration, we consider scenarios where this is a complete absence of training data, or mismatched stimulus has to be used for augmenting a small amount of training data. We demonstrate that stimulus complexity is a key factor for distillation's good performance. Our examples include use of various datasets for stimulating MNIST and CIFAR teachers. △ Less

Submitted 21 March, 2017; originally announced March 2017.

arXiv:1703.07115 [pdf, other]

Layer-wise training of deep networks using kernel similarity

Authors: Mandar Kulkarni, Shirish Karande

Abstract: Deep learning has shown promising results in many machine learning applications. The hierarchical feature representation built by deep networks enable compact and precise encoding of the data. A kernel analysis of the trained deep networks demonstrated that with deeper layers, more simple and more accurate data representations are obtained. In this paper, we propose an approach for layer-wise trai… ▽ More Deep learning has shown promising results in many machine learning applications. The hierarchical feature representation built by deep networks enable compact and precise encoding of the data. A kernel analysis of the trained deep networks demonstrated that with deeper layers, more simple and more accurate data representations are obtained. In this paper, we propose an approach for layer-wise training of a deep network for the supervised classification task. A transformation matrix of each layer is obtained by solving an optimization aimed at a better representation where a subsequent layer builds its representation on the top of the features produced by a previous layer. We compared the performance of our approach with a DNN trained using back-propagation which has same architecture as ours. Experimental results on the real image datasets demonstrate efficacy of our approach. We also performed kernel analysis of layer representations to validate the claim of better feature encoding. △ Less

Submitted 21 March, 2017; originally announced March 2017.

Journal ref: Deep Learning for Pattern Recognition (DLPR) workshop at ICPR 2016

arXiv:1609.05001 [pdf, other]

Stamp processing with examplar features

Authors: Yash Bhalgat, Mandar Kulkarni, Shirish Karande, Sachin Lodha

Abstract: Document digitization is becoming increasingly crucial. In this work, we propose a shape based approach for automatic stamp verification/detection in document images using an unsupervised feature learning. Given a small set of training images, our algorithm learns an appropriate shape representation using an unsupervised clustering. Experimental results demonstrate the effectiveness of our framewo… ▽ More Document digitization is becoming increasingly crucial. In this work, we propose a shape based approach for automatic stamp verification/detection in document images using an unsupervised feature learning. Given a small set of training images, our algorithm learns an appropriate shape representation using an unsupervised clustering. Experimental results demonstrate the effectiveness of our framework in challenging scenarios. △ Less

Submitted 16 September, 2016; originally announced September 2016.

arXiv:1609.02271 [pdf, other]

Ashwin: Plug-and-Play System for Machine-Human Image Annotation

Authors: Anand Sriraman, Mandar Kulkarni, Rahul Kumar, Kanika Kalra, Purushotam Radadia, Shirish Karande

Abstract: We present an end-to-end machine-human image annotation system where each component can be attached in a plug-and-play fashion. These components include Feature Extraction, Machine Classifier, Task Sampling and Crowd Consensus. We present an end-to-end machine-human image annotation system where each component can be attached in a plug-and-play fashion. These components include Feature Extraction, Machine Classifier, Task Sampling and Crowd Consensus. △ Less

Submitted 9 September, 2016; v1 submitted 8 September, 2016; originally announced September 2016.

Comments: HCOMP 2016 Demonstrations Track

arXiv:1609.02043 [pdf, ps, other]

Feasibility of Post-Editing Speech Transcriptions with a Mismatched Crowd

Authors: Purushotam Radadia, Shirish Karande

Abstract: Manual correction of speech transcription can involve a selection from plausible transcriptions. Recent work has shown the feasibility of employing a mismatched crowd for speech transcription. However, it is yet to be established whether a mismatched worker has sufficiently fine-granular speech perception to choose among the phonetically proximate options that are likely to be generated from the t… ▽ More Manual correction of speech transcription can involve a selection from plausible transcriptions. Recent work has shown the feasibility of employing a mismatched crowd for speech transcription. However, it is yet to be established whether a mismatched worker has sufficiently fine-granular speech perception to choose among the phonetically proximate options that are likely to be generated from the trellis of an ASRU. Hence, we consider five languages, Arabic, German, Hindi, Russian and Spanish. For each we generate synthetic, phonetically proximate, options which emulate post-editing scenarios of varying difficulty. We consistently observe non-trivial crowd ability to choose among fine-granular options. △ Less

Submitted 7 September, 2016; originally announced September 2016.

Comments: HCOMP 2016 Works-in-Progress

arXiv:0809.3600 [pdf, ps, other]

On the Capacity Improvement of Multicast Traffic with Network Coding

Authors: Zheng Wang, Shirish Karande, Hamid R. Sadjadpour, J. J. Garcia-Luna-Aceves

Abstract: In this paper, we study the contribution of network coding (NC) in improving the multicast capacity of random wireless ad hoc networks when nodes are endowed with multi-packet transmission (MPT) and multi-packet reception (MPR) capabilities. We show that a per session throughput capacity of $Θ(nT^{3}(n))$, where $n$ is the total number of nodes and T(n) is the communication range, can be achieve… ▽ More In this paper, we study the contribution of network coding (NC) in improving the multicast capacity of random wireless ad hoc networks when nodes are endowed with multi-packet transmission (MPT) and multi-packet reception (MPR) capabilities. We show that a per session throughput capacity of $Θ(nT^{3}(n))$, where $n$ is the total number of nodes and T(n) is the communication range, can be achieved as a tight bound when each session contains a constant number of sinks. Surprisingly, an identical order capacity can be achieved when nodes have only MPR and MPT capabilities. This result proves that NC does not contribute to the order capacity of multicast traffic in wireless ad hoc networks when MPR and MPT are used in the network. The result is in sharp contrast to the general belief (conjecture) that NC improves the order capacity of multicast. Furthermore, if the communication range is selected to guarantee the connectivity in the network, i.e., $T(n)\ge Θ(\sqrt{\log n/n})$, then the combination of MPR and MPT achieves a throughput capacity of $Θ(\frac{\log^{3/2} n}{\sqrt{n}})$ which provides an order capacity gain of $Θ(\log^2 n)$ compared to the point-to-point multicast capacity with the same number of destinations. △ Less

Submitted 21 September, 2008; originally announced September 2008.

Showing 1–19 of 19 results for author: Karande, S