-
SpeechCaps: Advancing Instruction-Based Universal Speech Models with Multi-Talker Speaking Style Captioning
Authors:
Chien-yu Huang,
Min-Han Shih,
Ke-Han Lu,
Chi-Yuan Hsiao,
Hung-yi Lee
Abstract:
Instruction-based speech processing is becoming popular. Studies show that training with multiple tasks boosts performance, but collecting diverse, large-scale tasks and datasets is expensive. Thus, it is highly desirable to design a fundamental task that benefits other downstream tasks. This paper introduces a multi-talker speaking style captioning task to enhance the understanding of speaker and prosodic information. We used large language models to generate descriptions for multi-talker speech. Then, we trained our model with pre-training on this captioning task followed by instruction tuning. Evaluation on Dynamic-SUPERB shows our model outperforming the baseline pre-trained only on single-talker tasks, particularly in speaker and emotion recognition. Additionally, tests on a multi-talker QA task reveal that current models struggle with attributes such as gender, pitch, and speaking rate. The code and dataset are available at https://github.com/cyhuang-tw/speechcaps.
Submitted 25 August, 2024;
originally announced August 2024.
-
Electrically-Driven Two-Dimensional Semiconductor Microcavity Laser
Authors:
Zheng-Zhe Chen,
Hsiang-Ting Lin,
Chiao-Yun Chang,
Adil Muhammad,
Po-Cheng Tsai,
Tsung Sheng Kao,
Chi Chen,
Shu-Wei Chang,
Shih-Yen Lin,
Min-Hsiung Shih
Abstract:
Two-dimensional (2-D) monolayer transition-metal dichalcogenides (TMDCs) are promising materials for realizing ultracompact, low-threshold semiconductor lasers. The development of electrically driven TMDC devices is crucial for enhancing the integration potential of practical optoelectronic systems. However, to date, an electrically driven 2-D TMDC laser has not been realized. Herein, we develop the first electrically driven 2-D TMDC microcavity laser.
Submitted 13 August, 2024;
originally announced August 2024.
-
The IBEX Knowledge-Base: Achieving more together with open science
Authors:
Andrea J. Radtke,
Ifeanyichukwu Anidi,
Leanne Arakkal,
Armando Arroyo-Mejias,
Rebecca T. Beuschel,
Katy Borner,
Colin J. Chu,
Beatrice Clark,
Menna R. Clatworthy,
Jake Colautti,
Joshua Croteau,
Saven Denha,
Rose Dever,
Walderez O. Dutra,
Sonja Fritzsche,
Spencer Fullam,
Michael Y. Gerner,
Anita Gola,
Kenneth J. Gollob,
Jonathan M. Hernandez,
Jyh Liang Hor,
Hiroshi Ichise,
Zhixin Jing,
Danny Jonigk,
Evelyn Kandov
, et al. (33 additional authors not shown)
Abstract:
Iterative Bleaching Extends multipleXity (IBEX) is a versatile method for highly multiplexed imaging of diverse tissues. Based on open science principles, we created the IBEX Knowledge-Base, a resource for reagents, protocols and more, to empower innovation.
Submitted 26 July, 2024;
originally announced July 2024.
-
A Surrogate Endpoint Based Provisional Approval Causal Roadmap
Authors:
Peter B. Gilbert,
James Peng,
Larry Han,
Theis Lange,
Yun Lu,
Lei Nie,
Mei-Chiung Shih,
Salina P. Waddy,
Ken Wiley,
Margot Yann,
Zafar Zafari,
Debashis Ghosh,
Dean Follmann,
Michal Juraska,
Iván Díaz
Abstract:
For many rare diseases with no approved preventive interventions, promising interventions exist, yet it has been difficult to conduct a pivotal phase 3 trial that could provide direct evidence demonstrating a beneficial effect on the target disease outcome. When a promising putative surrogate endpoint(s) for the target outcome is available, surrogate-based provisional approval of an intervention may be pursued. We apply the Causal Roadmap rubric to define a surrogate endpoint based provisional approval causal roadmap, which combines observational study data that estimates the relationship between the putative surrogate and the target outcome, with a phase 3 surrogate endpoint study that collects the same data but is very under-powered to assess the treatment effect (TE) on the target outcome. The objective is conservative estimation/inference for the TE with an estimated lower uncertainty bound that allows (through two bias functions) for an imperfect surrogate and imperfect transport of the conditional target outcome risk in the untreated between the observational and phase 3 studies. Two estimators of TE (plug-in, nonparametric efficient one-step) with corresponding inference procedures are developed. Finite-sample performance of the plug-in estimator is evaluated in two simulation studies, with R code provided. The roadmap is illustrated with contemporary Group B Streptococcus vaccine development.
Submitted 8 July, 2024;
originally announced July 2024.
-
Modeling Ambient Scene Dynamics for Free-view Synthesis
Authors:
Meng-Li Shih,
Jia-Bin Huang,
Changil Kim,
Rajvi Shah,
Johannes Kopf,
Chen Gao
Abstract:
We introduce a novel method for dynamic free-view synthesis of ambient scenes from a monocular capture, bringing an immersive quality to the viewing experience. Our method builds upon the recent advancements in 3D Gaussian Splatting (3DGS) that can faithfully reconstruct complex static scenes. Previous attempts to extend 3DGS to represent dynamics have been confined to bounded scenes or require multi-camera captures, and often fail to generalize to unseen motions, limiting their practical application. Our approach overcomes these constraints by leveraging the periodicity of ambient motions to learn the motion trajectory model, coupled with careful regularization. We also propose important practical strategies to improve the visual quality of the baseline 3DGS static reconstructions and to improve memory efficiency critical for GPU-memory intensive learning. We demonstrate high-quality photorealistic novel view synthesis of several ambient natural scenes with intricate textures and fine structural elements.
Submitted 13 June, 2024;
originally announced June 2024.
-
ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models
Authors:
Meng-Li Shih,
Wei-Chiu Ma,
Lorenzo Boyice,
Aleksander Holynski,
Forrester Cole,
Brian L. Curless,
Janne Kontkanen
Abstract:
We propose ExtraNeRF, a novel method for extrapolating the range of views handled by a Neural Radiance Field (NeRF). Our main idea is to leverage NeRFs to model scene-specific, fine-grained details, while capitalizing on diffusion models to extrapolate beyond our observed data. A key ingredient is to track visibility to determine what portions of the scene have not been observed, and focus on reconstructing those regions consistently with diffusion models. Our primary contributions include a visibility-aware diffusion-based inpainting module that is fine-tuned on the input imagery, yielding an initial NeRF with moderate quality (often blurry) inpainted regions, followed by a second diffusion model trained on the input imagery to consistently enhance, notably sharpen, the inpainted imagery from the first pass. We demonstrate high-quality results, extrapolating beyond a small number of (typically six or fewer) input views, effectively outpainting the NeRF as well as inpainting newly disoccluded regions inside the original viewing volume. We compare with related work both quantitatively and qualitatively and show significant gains over prior art.
Submitted 10 June, 2024;
originally announced June 2024.
-
Sustained Robust Exciton Emission in Suspended Monolayer WSe_2 within the Low Carrier Density Regime for Quantum Emitter Applications
Authors:
Zheng-Zhe Chen,
Chiao-Yun Chang,
Ya-Ting Tsai,
Po-Cheng Tsai,
Shih-Yen Lin,
Min-Hsiung Shih
Abstract:
The development of semiconductor optoelectronic devices is moving toward low power consumption and miniaturization, especially for high-efficiency quantum emitters. However, most of these quantum sources work in the low carrier density regime, where Shockley-Read-Hall (SRH) recombination may dominate and seriously reduce the emission efficiency. To diminish the effect of carrier trapping and sustain strong photoluminescence (PL) emission under low-power pumping, we investigated the influence of suspension on monolayer tungsten diselenide (WSe_2), a novel two-dimensional quantum material. Both the PL intensity and the fundamental photoluminescence quantum yield (PLQY) exhibited an order-of-magnitude enhancement through suspension. Surprisingly, the PLQY improvement was far more significant under low pumping power and tended to increase exponentially toward even lower carrier densities. With its strong excitonic effect, suspended WSe_2 offers a way to reduce carrier trapping and the participation of carriers in non-radiative processes. Moreover, in the low-power range where SRH recombination dominates, suspended WSe_2 exhibited a remarkably higher percentage of excitonic radiation than contacted WSe_2. Herein, we quantitatively demonstrate the significance of the suspended WSe_2 monolayer in the low carrier density regime, highlighting its potential for developing compact, low-power quantum emitters in the future.
Submitted 27 February, 2024;
originally announced February 2024.
-
GSQA: An End-to-End Model for Generative Spoken Question Answering
Authors:
Min-Han Shih,
Ho-Lam Chung,
Yu-Chi Pai,
Ming-Hao Hsu,
Guan-Ting Lin,
Shang-Wen Li,
Hung-yi Lee
Abstract:
In recent advancements in spoken question answering (QA), end-to-end models have made significant strides. However, previous research has primarily focused on extractive span selection. While this extractive-based approach is effective when answers are present directly within the input, it falls short in addressing abstractive questions, where answers are not directly extracted but inferred from the given information. To bridge this gap, we introduce the first end-to-end Generative Spoken Question Answering (GSQA) model that empowers the system to engage in abstractive reasoning. The challenge in training our GSQA model lies in the absence of a spoken abstractive QA dataset. We propose using text models for initialization and leveraging the extractive QA dataset to transfer knowledge from the text generative model to the spoken generative model. Experimental results indicate that our model surpasses the previous extractive model by 3% on extractive QA datasets. Furthermore, the GSQA model has only been fine-tuned on the spoken extractive QA dataset. Despite not having seen any spoken abstractive QA data, it can still closely match the performance of the cascade model. In conclusion, our GSQA model shows the potential to generalize to a broad spectrum of questions, thus further expanding the spoken question answering capabilities of abstractive QA. Our code is available at https://voidful.github.io/GSQA
Submitted 21 July, 2024; v1 submitted 15 December, 2023;
originally announced December 2023.
-
A Causal Roadmap for Generating High-Quality Real-World Evidence
Authors:
Lauren E Dang,
Susan Gruber,
Hana Lee,
Issa Dahabreh,
Elizabeth A Stuart,
Brian D Williamson,
Richard Wyss,
Iván Díaz,
Debashis Ghosh,
Emre Kıcıman,
Demissie Alemayehu,
Katherine L Hoffman,
Carla Y Vossen,
Raymond A Huml,
Henrik Ravn,
Kajsa Kvist,
Richard Pratley,
Mei-Chiung Shih,
Gene Pennello,
David Martin,
Salina P Waddy,
Charles E Barr,
Mouna Akacha,
John B Buse,
Mark van der Laan
, et al. (1 additional authors not shown)
Abstract:
Increasing emphasis on the use of real-world evidence (RWE) to support clinical policy and regulatory decision-making has led to a proliferation of guidance, advice, and frameworks from regulatory agencies, academia, professional societies, and industry. A broad spectrum of studies use real-world data (RWD) to produce RWE, ranging from randomized controlled trials with outcomes assessed using RWD to fully observational studies. Yet many RWE study proposals lack sufficient detail to evaluate adequacy, and many analyses of RWD suffer from implausible assumptions, other methodological flaws, or inappropriate interpretations. The Causal Roadmap is an explicit, itemized, iterative process that guides investigators to pre-specify analytic study designs; it addresses a wide range of guidance within a single framework. By requiring transparent evaluation of causal assumptions and facilitating objective comparisons of design and analysis choices based on pre-specified criteria, the Roadmap can help investigators to evaluate the quality of evidence that a given study is likely to produce, specify a study to generate high-quality RWE, and communicate effectively with regulatory agencies and other stakeholders. This paper aims to disseminate and extend the Causal Roadmap framework for use by clinical and translational researchers, with companion papers demonstrating application of the Causal Roadmap for specific use cases.
Submitted 11 May, 2023;
originally announced May 2023.
-
Falsification of Internal and External Validity in Observational Studies via Conditional Moment Restrictions
Authors:
Zeshan Hussain,
Ming-Chieh Shih,
Michael Oberst,
Ilker Demirel,
David Sontag
Abstract:
Randomized Controlled Trials (RCTs) are relied upon to assess new treatments, but suffer from limited power to guide personalized treatment decisions. On the other hand, observational (i.e., non-experimental) studies have large and diverse populations, but are prone to various biases (e.g., residual confounding). To safely leverage the strengths of observational studies, we focus on the problem of falsification, whereby RCTs are used to validate causal effect estimates learned from observational data. In particular, we show that, given data from both an RCT and an observational study, assumptions on internal and external validity have an observable, testable implication in the form of a set of Conditional Moment Restrictions (CMRs). Further, we show that expressing these CMRs with respect to the causal effect, or "causal contrast", as opposed to individual counterfactual means, provides a more reliable falsification test. In addition to giving guarantees on the asymptotic properties of our test, we demonstrate superior power and type I error of our approach on semi-synthetic and real-world datasets. Our approach is interpretable, allowing a practitioner to visualize which subgroups in the population lead to falsification of an observational study.
Submitted 6 March, 2023; v1 submitted 30 January, 2023;
originally announced January 2023.
-
Systematic Analysis for Pretrained Language Model Priming for Parameter-Efficient Fine-tuning
Authors:
Shih-Cheng Huang,
Shih-Heng Wang,
Min-Han Shih,
Saurav Sahay,
Hung-yi Lee
Abstract:
Parameter-efficient (PE) methods (like Prompts or Adapters) for adapting pre-trained language models (PLM) to downstream tasks have been popular recently. However, hindrances still prevent these methods from reaching their full potential. For example, two significant challenges are few-shot adaptation and cross-task generalization. To tackle these issues, we propose a general PE priming framework to enhance and explore the few-shot adaptation and generalization ability of PE methods. In this framework, PLMs are primed with PE methods for rapidly adapting to various target tasks. To evaluate the generalization ability of these PE methods, we conduct experiments on a few-shot cross-domain benchmark containing 160 diverse NLP tasks. Our experiment not only reveals the best priming strategy but also verifies that priming facilitates the adaptation to target tasks.
Submitted 30 May, 2024; v1 submitted 2 December, 2022;
originally announced December 2022.
-
Falsification before Extrapolation in Causal Effect Estimation
Authors:
Zeshan Hussain,
Michael Oberst,
Ming-Chieh Shih,
David Sontag
Abstract:
Randomized Controlled Trials (RCTs) represent a gold standard when developing policy guidelines. However, RCTs are often narrow, and lack data on broader populations of interest. Causal effects in these populations are often estimated using observational datasets, which may suffer from unobserved confounding and selection bias. Given a set of observational estimates (e.g. from multiple studies), we propose a meta-algorithm that attempts to reject observational estimates that are biased. We do so using validation effects, causal effects that can be inferred from both RCT and observational data. After rejecting estimators that do not pass this test, we generate conservative confidence intervals on the extrapolated causal effects for subgroups not observed in the RCT. Under the assumption that at least one observational estimator is asymptotically normal and consistent for both the validation and extrapolated effects, we provide guarantees on the coverage probability of the intervals output by our algorithm. To facilitate hypothesis testing in settings where causal effect transportation across datasets is necessary, we give conditions under which a doubly-robust estimator of group average treatment effects is asymptotically normal, even when flexible machine learning methods are used for estimation of nuisance parameters. We illustrate the properties of our approach on semi-synthetic and real world datasets, and show that it compares favorably to standard meta-analysis techniques.
Submitted 6 March, 2023; v1 submitted 27 September, 2022;
originally announced September 2022.
-
Automated Fidelity Assessment for Strategy Training in Inpatient Rehabilitation using Natural Language Processing
Authors:
Hunter Osterhoudt,
Courtney E. Schneider,
Haneef A Mohammad,
Minmei Shih,
Alexandra E. Harper,
Leming Zhou,
Elizabeth R Skidmore,
Yanshan Wang
Abstract:
Strategy training is a multidisciplinary rehabilitation approach that teaches skills to reduce disability among those with cognitive impairments following a stroke. Strategy training has been shown in randomized, controlled clinical trials to be a more feasible and efficacious intervention for promoting independence than traditional rehabilitation approaches. A standardized fidelity assessment is used to measure adherence to treatment principles by examining guided and directed verbal cues in video recordings of rehabilitation sessions. Although the fidelity assessment for detecting guided and directed verbal cues is valid and feasible for single-site studies, it can become labor intensive, time consuming, and expensive in large, multi-site pragmatic trials. To address this challenge to widespread strategy training implementation, we leveraged natural language processing (NLP) techniques to automate the strategy training fidelity assessment, i.e., to automatically identify guided and directed verbal cues from video recordings of rehabilitation sessions. We developed a rule-based NLP algorithm, a long short-term memory (LSTM) model, and a Bidirectional Encoder Representations from Transformers (BERT) model for this task. The best performance was achieved by the BERT model, with an F1 score of 0.8075. This BERT model was verified on an external validation dataset collected from a separate major regional health system and achieved an F1 score of 0.8259, which shows that the BERT model generalizes well. The findings from this study hold widespread promise in psychology and rehabilitation intervention research and practice.
Submitted 24 January, 2023; v1 submitted 14 September, 2022;
originally announced September 2022.
-
3D Photography using Context-aware Layered Depth Inpainting
Authors:
Meng-Li Shih,
Shih-Yang Su,
Johannes Kopf,
Jia-Bin Huang
Abstract:
We propose a method for converting a single RGB-D input image into a 3D photo - a multi-layer representation for novel view synthesis that contains hallucinated color and depth structures in regions occluded in the original view. We use a Layered Depth Image with explicit pixel connectivity as the underlying representation, and present a learning-based inpainting model that synthesizes new local color-and-depth content into the occluded region in a spatial context-aware manner. The resulting 3D photos can be efficiently rendered with motion parallax using standard graphics engines. We validate the effectiveness of our method on a wide range of challenging everyday scenes and show fewer artifacts compared with the state of the art.
Submitted 10 June, 2020; v1 submitted 9 April, 2020;
originally announced April 2020.
-
P2FAAS: Toward Privacy-Preserving Fuzzing as a Service
Authors:
Fan Sang,
Daehee Jang,
Ming-Wei Shih,
Taesoo Kim
Abstract:
Global corporations (e.g., Google and Microsoft) have recently introduced a new model of cloud services, fuzzing-as-a-service (FaaS). Despite effectively alleviating the cost of fuzzing, the model comes with privacy concerns. For example, the end user has to trust both cloud and service providers who have access to the application to be fuzzed. Such concerns arise because the platform is under the control of its provider and the application and the fuzzer are highly coupled. In this paper, we propose P2FaaS, a new ecosystem that preserves the end user's privacy while providing FaaS in the cloud. The key idea of P2FaaS is to utilize Intel SGX to prevent cloud and service providers from learning information about the application. Our preliminary evaluation shows that P2FaaS imposes a 45% runtime overhead on fuzzing compared to the baseline. In addition, P2FaaS demonstrates that, with the recently introduced Intel SGX Card hardware, the fuzzing service can be scaled up to multiple servers without native SGX support.
Submitted 24 September, 2019;
originally announced September 2019.
-
Random Sampling for Group-By Queries
Authors:
Trong Duc Nguyen,
Ming-Hung Shih,
Sai Sree Parvathaneni,
Bojian Xu,
Divesh Srivastava,
Srikanta Tirthapura
Abstract:
Random sampling has been widely used in approximate query processing on large databases, due to its potential to significantly reduce resource usage and response times, at the cost of a small approximation error. We consider random sampling for answering the ubiquitous class of group-by queries, which first group data according to one or more attributes, and then aggregate within each group after filtering through a predicate. The challenge with group-by queries is that a sampling method cannot focus on optimizing the quality of a single answer (e.g. the mean of selected data), but must simultaneously optimize the quality of a set of answers (one per group).
We present CVOPT, a query- and data-driven sampling framework for a set of group-by queries. To evaluate the quality of a sample, CVOPT defines a metric based on the norm (e.g. $\ell_2$ or $\ell_\infty$) of the coefficients of variation (CVs) of different answers, and constructs a stratified sample that provably optimizes the metric. CVOPT can handle group-by queries on data where groups have vastly different statistical characteristics, such as frequencies, means, or variances. CVOPT jointly optimizes for multiple aggregations and multiple group-by clauses, and provides a way to prioritize specific groups or aggregates. It can be tuned to cases when partial information about a query workload is known, such as a data warehouse where queries are scheduled to run periodically.
Our experimental results show that CVOPT outperforms the current state of the art on sample quality and estimation accuracy for group-by queries. On a set of queries on two real-world data sets, CVOPT yields relative errors that are 5x smaller than competing approaches, under the same space budget.
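The metric described in this abstract, a norm over the coefficients of variation (CVs) of the per-group answers, can be illustrated with a small sketch. This is not the CVOPT implementation. It assumes each group's answer is a sample mean drawn without finite-population correction, so that CV = std / (|mean| * sqrt(n)), and the function names (`cv_of_mean`, `allocate_for_l2`) are hypothetical. Under that assumption, minimizing the $\ell_2$ norm of the CVs for a fixed total sample size gives per-group sample sizes proportional to std / |mean|.

```python
import math

def cv_of_mean(mean, std, n):
    """CV of a sample-mean estimate from n rows (assumed: no finite-population correction)."""
    return std / (abs(mean) * math.sqrt(n))

def l2_norm(values):
    return math.sqrt(sum(v * v for v in values))

def linf_norm(values):
    return max(abs(v) for v in values)

def allocate_for_l2(groups, budget):
    """Split `budget` sample rows across groups in proportion to std / |mean|.

    `groups` maps a group name to (mean, std). Under the assumption above this
    minimizes the l2 norm of the per-group CVs. Sizes are left fractional here;
    a real sampler would round them.
    """
    ratios = {g: std / abs(mean) for g, (mean, std) in groups.items()}
    total = sum(ratios.values())
    return {g: budget * r / total for g, r in ratios.items()}

if __name__ == "__main__":
    groups = {"a": (10.0, 2.0), "b": (10.0, 6.0)}  # hypothetical per-group statistics
    alloc = allocate_for_l2(groups, budget=100)
    cvs = [cv_of_mean(*groups[g], alloc[g]) for g in groups]
    uniform = [cv_of_mean(*groups[g], 50) for g in groups]
    print(alloc)
    print("l2 (proportional):", l2_norm(cvs), "l2 (uniform):", l2_norm(uniform))
    print("l_inf (proportional):", linf_norm(cvs))
```

The high-variance group receives more of the budget, which lowers the $\ell_2$ norm of the CVs relative to an even split of the same budget.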
Submitted 12 September, 2019; v1 submitted 5 September, 2019;
originally announced September 2019.
-
Self-Supervised Learning of Depth and Camera Motion from 360° Videos
Authors:
Fu-En Wang,
Hou-Ning Hu,
Hsien-Tzu Cheng,
Juan-Ting Lin,
Shang-Ta Yang,
Meng-Li Shih,
Hung-Kuo Chu,
Min Sun
Abstract:
As 360° cameras become prevalent in many autonomous systems (e.g., self-driving cars and drones), efficient 360° perception becomes more and more important. We propose a novel self-supervised learning approach for predicting the omnidirectional depth and camera motion from a 360° video. In particular, starting from SfMLearner, which is designed for cameras with a normal field of view, we introduce three key features to process 360° images efficiently. Firstly, we convert each image from equirectangular projection to cubic projection in order to avoid image distortion. In each network layer, we use Cube Padding (CP), which pads intermediate features from adjacent faces, to avoid image boundaries. Secondly, we propose a novel "spherical" photometric consistency constraint on the whole viewing sphere. In this way, no pixel will be projected outside the image boundary, which typically happens in images with a normal field of view. Finally, rather than naively estimating six independent camera motions (i.e., naively applying SfMLearner to each face of a cube), we propose a novel camera pose consistency loss to ensure that the estimated camera motions reach consensus. To train and evaluate our approach, we collect a new PanoSUNCG dataset containing a large number of 360° videos with ground-truth depth and camera motion. Our approach achieves state-of-the-art depth prediction and camera motion estimation on PanoSUNCG, with faster inference speed compared to equirectangular projection. In real-world indoor videos, our approach can also achieve qualitatively reasonable depth prediction by using a model pre-trained on PanoSUNCG.
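The first step mentioned in this abstract, moving from equirectangular to cubic projection, rests on deciding which cube face each equirectangular pixel belongs to. The sketch below shows only that face-assignment step, not Cube Padding or the authors' pipeline. The axis convention (image center looks along +z, +y is up) and the function names are assumptions made for illustration.

```python
import math

def equirect_to_direction(u, v):
    """Map normalized equirectangular coordinates (u, v in [0, 1]) to a unit direction.

    Assumed convention: u = 0.5, v = 0.5 looks along +z; v = 0 is straight up (+y).
    """
    lon = (u - 0.5) * 2.0 * math.pi   # longitude in [-pi, pi]
    lat = (0.5 - v) * math.pi         # latitude in [-pi/2, pi/2]
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return x, y, z

def cube_face(direction):
    """Return the cube face ('+x', '-x', '+y', '-y', '+z', '-z') hit by a direction.

    The face is the axis with the largest absolute component.
    """
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    if ax >= ay and ax >= az:
        return "+x" if x > 0 else "-x"
    if ay >= az:
        return "+y" if y > 0 else "-y"
    return "+z" if z > 0 else "-z"

if __name__ == "__main__":
    for u, v in [(0.5, 0.5), (0.75, 0.5), (0.5, 0.0)]:
        print((u, v), cube_face(equirect_to_direction(u, v)))
```

Resampling the image onto each face would then project the direction onto that face's plane and interpolate, a step omitted here.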
Submitted 13 November, 2018;
originally announced November 2018.
-
SPX: Preserving End-to-End Security for Edge Computing
Authors:
Ketan Bhardwaj,
Ming-Wei Shih,
Ada Gavrilovska,
Taesoo Kim,
Chengyu Song
Abstract:
Beyond point solutions, the vision of edge computing is to enable web services to deploy their edge functions in a multi-tenant infrastructure present at the edge of mobile networks. However, edge functions can be rendered useless because of one critical issue: Web services are delivered over end-to-end encrypted connections, so edge functions cannot operate on encrypted traffic without compromising security or degrading performance. Any solution to this problem must interoperate with existing protocols like TLS, as well as with new emerging security protocols for client and IoT devices. The edge functions must remain invisible to client-side endpoints but may require explicit control from their service-side web services. Finally, a solution must operate within overhead margins which do not obviate the benefits of the edge.
To address this problem, this paper presents SPX, a solution for edge-ready and end-to-end secure protocol extensions that can efficiently maintain end-to-edge-to-end ($E^3$) security semantics. Using our SPX prototype, we allow edge functions to operate on encrypted traffic while ensuring that the security semantics of secure protocols still hold. SPX uses Intel SGX to bind the communication channel with remote attestation. The resulting solution defends against potential attacks with low performance overhead, and it neither mandates any changes on the end-user side nor breaks interoperability with existing protocols.
Submitted 24 September, 2018;
originally announced September 2018.
-
Variance-Optimal Offline and Streaming Stratified Random Sampling
Authors:
Trong Duc Nguyen,
Ming-Hung Shih,
Divesh Srivastava,
Srikanta Tirthapura,
Bojian Xu
Abstract:
Stratified random sampling (SRS) is a fundamental sampling technique that provides accurate estimates for aggregate queries using a small size sample, and has been used widely for approximate query processing. A key question in SRS is how to partition a target sample size among different strata. While Neyman allocation provides a solution that minimizes the variance of an estimate using this sample, it works under the assumption that each stratum is abundant, i.e., has a large number of data points to choose from. This assumption may not hold in general: one or more strata may be bounded, and may not contain a large number of data points, even though the total data size may be large.
We first present VOILA, an offline method for allocating sample sizes to strata in a variance-optimal manner, even for the case when one or more strata may be bounded. We next consider SRS on streaming data that are continuously arriving. We show a lower bound: any streaming algorithm for SRS must have (in the worst case) a variance that is a factor of Ω(r) away from the optimal, where r is the number of strata. We present S-VOILA, a practical streaming algorithm for SRS that is locally variance-optimal in its allocation of sample sizes to different strata. Our experiments on real and synthetic data show that VOILA can have significantly (1.4 to 50.0 times) smaller variance than Neyman allocation. The streaming algorithm S-VOILA results in a variance that is typically close to that of VOILA, which is given the entire input beforehand.
Submitted 20 February, 2018; v1 submitted 27 January, 2018;
originally announced January 2018.
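For readers unfamiliar with the allocation problem described in the abstract above, the following is a minimal sketch of classical Neyman allocation, which assigns each stratum a share of the sample budget proportional to (stratum size × standard deviation). To handle bounded strata it adds a simple iterative capping step: any stratum whose share would exceed its size is fully sampled and the remaining budget is redistributed. This capping is a common textbook heuristic and is not claimed to be the VOILA algorithm; the function name `neyman_allocation` and its parameters are hypothetical, and the result is real-valued (rounding to integers is left out).

```python
from typing import List, Sequence


def neyman_allocation(sizes: Sequence[int],
                      stds: Sequence[float],
                      budget: float) -> List[float]:
    """Real-valued Neyman allocation with iterative capping.

    sizes[i] is the number of data points in stratum i, stds[i] its
    standard deviation, and budget the total sample size to split.
    Illustrative heuristic only; not the VOILA algorithm.
    """
    alloc = [0.0] * len(sizes)
    active = set(range(len(sizes)))
    remaining = float(budget)

    while active:
        # Neyman weight of each uncapped stratum: size * std.
        weight = {i: sizes[i] * stds[i] for i in active}
        total = sum(weight.values())
        if total == 0:
            break  # No variance left to reduce in the active strata.

        share = {i: remaining * weight[i] / total for i in active}
        # Strata that would receive at least as many samples as they hold.
        over = [i for i in active if share[i] >= sizes[i]]
        if not over:
            for i in active:
                alloc[i] = share[i]
            break

        # Fully sample the bounded strata and redistribute the rest.
        for i in over:
            alloc[i] = float(sizes[i])
            remaining -= sizes[i]
            active.discard(i)

    return alloc
```

As a usage example, `neyman_allocation([10, 1000], [100.0, 1.0], 50)` fully samples the small, high-variance stratum and gives the rest of the budget to the large one, whereas plain Neyman allocation would request 25 samples from a stratum that holds only 10 points.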
-
Resonance in modulation instability from non-instantaneous nonlinearities
Authors:
Ray-Ching Hong,
Chun-Yan Lin,
You-Lin Chuang,
Chien-Ming Wu,
Yonan Su,
Jeng Yi Lee,
Chien-Chung Jeng,
Ming-Feng Shih,
Ray-Kuang Lee
Abstract:
To explore resonance phenomena in the nonlinear region, we show by experimental measurements and theoretical analyses that resonance happens in modulation instability (MI) from non-instantaneous nonlinearities in photorefractive crystals. With a temporally periodic modulation in the external bias voltage, corresponding to a modulation in the nonlinear strength, an enhancement in the visibility of MI at the resonant frequency is reported through spontaneous optical pattern formation. With the system modeled by such a temporally periodic nonlinear driving force, theoretical curves obtained from a nonlinear non-instantaneous Schrödinger equation agree well with the experimental data. As MI is a universal signature of symmetry-breaking phenomena, our observation of resonance in MI may provide a means of controlling chaotic, solitary, and turbulent waves.
Submitted 31 December, 2017;
originally announced January 2018.
-
Tactics of Adversarial Attack on Deep Reinforcement Learning Agents
Authors:
Yen-Chen Lin,
Zhang-Wei Hong,
Yuan-Hong Liao,
Meng-Li Shih,
Ming-Yu Liu,
Min Sun
Abstract:
We introduce two tactics to attack agents trained by deep reinforcement learning algorithms using adversarial examples, namely the strategically-timed attack and the enchanting attack. In the strategically-timed attack, the adversary aims at minimizing the agent's reward by only attacking the agent at a small subset of time steps in an episode. Limiting the attack activity to this subset helps prevent detection of the attack by the agent. We propose a novel method to determine when an adversarial example should be crafted and applied. In the enchanting attack, the adversary aims at luring the agent to a designated target state. This is achieved by combining a generative model and a planning algorithm: while the generative model predicts the future states, the planning algorithm generates a preferred sequence of actions for luring the agent. A sequence of adversarial examples is then crafted to lure the agent to take the preferred sequence of actions. We apply the two tactics to agents trained by state-of-the-art deep reinforcement learning algorithms, including DQN and A3C. In 5 Atari games, our strategically-timed attack reduces as much reward as the uniform attack (i.e., attacking at every time step) while attacking the agent 4 times less often. Our enchanting attack lures the agent toward designated target states with a success rate of more than 70%. Videos are available at http://yenchenlin.me/adversarial_attack_RL/
Submitted 12 November, 2019; v1 submitted 7 March, 2017;
originally announced March 2017.
-
Environment-insensitive and gate-controllable photocurrent enabled by bandgap engineering of MoS2 junctions
Authors:
Fu-Yu Shih,
Yueh-Chun Wu,
Yi-Siang Shih,
Ming-Chiuan Shih,
Po-Hsun Ho,
Chun-Wei Chen,
Yang-Fang Chen,
Ya-Ping Chiu,
Wei-Hua Wang
Abstract:
Two-dimensional (2D) materials are composed of atomically thin crystals with an enormous surface-to-volume ratio, and their physical properties are easily affected by changes in the chemical environment. Encapsulation with other layered materials, such as hexagonal boron nitride, is a common practice; however, this approach often requires intricate fabrication processes. Alternatively, it is intriguing to explore methods to control transport properties without an encapsulation layer. This is very challenging because of the ubiquitous presence of adsorbents, which can lead to charged-impurity scattering sites, charge traps, and recombination centers. Here, we show that the short-circuit photocurrent originating from the built-in electric field at the MoS2 junction is surprisingly insensitive to the gaseous environment over the range from a vacuum of $1\times10^{-6}$ Torr to ambient conditions. The environmental insensitivity of the short-circuit photocurrent is attributed to the characteristics of the diffusion current, which is associated with the gradient of carrier density. Conversely, the photocurrent with bias exhibits typical persistent photoconductivity and depends greatly on the gaseous environment. The observation of environment-insensitive short-circuit photocurrent demonstrates an alternative method to design device structures for 2D-material-based optoelectronic applications.
Submitted 3 March, 2017;
originally announced March 2017.
-
Navigable videos for presenting scientific data on head-mounted displays
Authors:
Jacqueline Chu,
Leonardo Ferrer,
Min Shih,
Kwan-Liu Ma
Abstract:
Immersive, stereoscopic viewing enables scientists to better analyze the spatial structures of visualized physical phenomena. However, their findings cannot be properly presented in traditional media, which lack these core attributes. Creating a presentation tool that captures this environment poses unique challenges, namely related to poor viewing accessibility. Immersive scientific renderings often require high-end equipment, which can be impractical to obtain. We address these challenges with our authoring tool and navigational interface, which is designed for affordable head-mounted displays. With the authoring tool, scientists can show salient data features as connected 360° video paths, resulting in a "choose-your-own-adventure" experience. Our navigational interface features bidirectional video playback for added viewing control when users traverse the tailor-made content. We evaluate our system's benefits by authoring case studies on several data sets and conducting a usability study on the navigational interface's design. In summary, our approach provides scientists an immersive medium to visually present their research to the intended audience--spanning from students to colleagues--on affordable virtual reality headsets.
Submitted 27 November, 2016;
originally announced November 2016.
-
Inferring Fine-grained Control Flow Inside SGX Enclaves with Branch Shadowing
Authors:
Sangho Lee,
Ming-Wei Shih,
Prasun Gera,
Taesoo Kim,
Hyesoon Kim,
Marcus Peinado
Abstract:
In this paper, we explore a new, yet critical, side-channel attack against Intel Software Guard Extensions (SGX), called a branch shadowing attack, which can reveal fine-grained control flows (i.e., each branch) of an enclave program running on real SGX hardware. The root cause of this attack is that Intel SGX does not clear the branch history when switching from enclave mode to non-enclave mode, leaving the fine-grained traces to the outside world through a branch-prediction side channel. However, exploiting the channel is not straightforward in practice because 1) measuring branch prediction/misprediction penalties based on timing is too inaccurate to distinguish fine-grained control-flow changes and 2) it requires sophisticated control over the enclave execution to force its execution to the interesting code blocks. To overcome these challenges, we developed two novel exploitation techniques: 1) Intel PT- and LBR-based history-inferring techniques and 2) an APIC-based technique to control the execution of enclave programs in a fine-grained manner. As a result, we could demonstrate our attack by breaking recent security constructs, including ORAM schemes, Sanctum, SGX-Shield, and T-SGX. Not limiting our work to the attack itself, we thoroughly studied the feasibility of hardware-based solutions (e.g., branch history clearing) and also proposed a software-based countermeasure, called Zigzagger, to mitigate the branch shadowing attack in practice.
Submitted 1 June, 2017; v1 submitted 21 November, 2016;
originally announced November 2016.
-
Controlling and maximizing effective thermal properties by manipulating transient behaviors during energy-system cycles
Authors:
Z. J. Gao,
T. M. Shih,
H. Merlitz,
P. J. Pagni,
Z. Chen
Abstract:
Transient processes generally constitute part of energy-system cycles. If skillfully manipulated, they can help systems behave in ways that suit designers' needs. In the present study, behaviors related to both thermal conductivities ($\kappa$) and heat capacities ($c_{v}$) are analyzed. Along with solutions of the temperature and the flow velocity obtained by means of theories and simulations, three findings are reported herein: (1) effective $\kappa$ and effective $c_{v}$ can be controlled to vary from their intrinsic material-property values to a few orders of magnitude larger; (2) a parameter, tentatively named "nonlinear thermal bias", is identified and can be used as a criterion in estimating energies transferred into the system during heating processes and effective operating ranges of system temperatures; (3) when a body of water, such as the immense ocean, is subject to the boundary condition of cold bottom and hot top, it may be feasible to manipulate transient behaviors of a solid propeller-like system such that the system can be turned by a weak buoyancy force, induced by the top-to-bottom heat conduction through the propeller, provided that the density of the propeller is selected to be close to that of the water. Such a turning motion serves the dual purpose of performing hydraulic work and increasing the effective thermal conductivity of the system.
Submitted 20 October, 2014;
originally announced October 2014.
-
Entropy variation rate divided by temperature always decreases
Authors:
T. M. Shih,
Z. J. Gao,
H. Merlitz,
L. Rondoni,
P. J. Pagni,
Z. Chen
Abstract:
For an isolated assembly that comprises a system and its surrounding reservoirs, the total entropy ($S_{a}$) always monotonically increases as time elapses. This phenomenon is known as the second law of thermodynamics ($dS_{a}/dt\geq0$). Here we analytically prove that, unlike the entropy itself, the entropy variation rate ($B=dS_{a}/dt$) defies monotonicity for multiple reservoirs ($n\geq2$). In other words, there always exist minima. For example, when a system is heated by two reservoirs from $T=300\,K$ initially to $T=400\,K$ at the final steady state, $B$ decreases steadily at first. Then it suddenly turns around and starts to increase at $387\,K$ until it reaches its steady-state value, exhibiting peculiar dipping behavior. In addition, the crux of our work is the proof that a newly defined variable, $B/T$, always decreases. Our proof involves Newton's law of cooling, in which the heat transfer coefficient is assumed to be constant. These theoretical macro-scale findings are validated by numerical experiments using the Crank-Nicolson method, and are illustrated with practical examples. They constitute an alternative to the traditional second-law statement, and may provide useful references for future micro-scale entropy-related research.
Submitted 20 October, 2014; v1 submitted 20 October, 2014;
originally announced October 2014.
-
Coherence controlled soliton interactions
Authors:
Ting-Sen Ku,
Ming-Feng Shih,
Andrey A. Sukhorukov,
Yuri S. Kivshar
Abstract:
We demonstrate theoretically and subsequently observe in experiment a novel type of soliton interaction when a pair of closely spaced spatial optical solitons as a whole is made partially incoherent. We explain how the character of the soliton interaction can be controlled by the total partial incoherence, and show that the soliton interaction can be changed from attractive to repulsive, or vice versa, near a certain threshold in the coherence parameter.
Submitted 25 October, 2004;
originally announced October 2004.
-
Spatial coherence singularities and incoherent vortex solitons
Authors:
Kristian Motzek,
Yuri S. Kivshar,
Ming-Feng Shih,
Grover A. Swartzlander Jr
Abstract:
We study spatially localized optical vortices created by self-trapping of partially incoherent light with a phase dislocation in a biased photorefractive crystal. In contrast to the decay of coherent self-trapped vortex beams due to the azimuthal instability, the incoherent vortices are stabilized when the spatial incoherence of light exceeds a certain threshold. We analyze the spatial coherence properties of the incoherent optical vortices and reveal the existence of ring-like singularities in the spatial coherence function of a vortex field that can characterize the stable propagation of vortices through nonlinear media.
Submitted 24 October, 2004;
originally announced October 2004.
-
Partially incoherent optical vortices in self-focusing nonlinear media
Authors:
Chien-Chung Jeng,
Ming-Feng Shih,
Kristian Motzek,
Yuri Kivshar
Abstract:
We observe stable propagation of spatially localized single- and double-charge optical vortices in a self-focusing nonlinear medium. The vortices are created by self-trapping of partially incoherent light carrying a phase dislocation, and they are stabilized when the spatial incoherence of light exceeds a certain threshold. We confirm the vortex stabilization effect by numerical simulations and also show that a similar stabilization mechanism applies to higher-order vortices.
Submitted 9 September, 2003;
originally announced September 2003.
-
Soliton transverse instabilities in anisotropic nonlocal self-focusing media
Authors:
Kristian Motzek,
Friedemann Kaiser,
Wen-Hen Chu,
Ming-Feng Shih,
Yuri Kivshar
Abstract:
We study, both theoretically and experimentally, the transverse modulational instability of spatial stripe solitons in anisotropic nonlocal photorefractive media. We demonstrate that the instability scenarios depend strongly on the stripe orientation, but the anisotropy-induced features are largely suppressed for spatial solitons created by self-trapping of partially incoherent light.
Submitted 13 August, 2003;
originally announced August 2003.
-
Induced Coherence and Stable Soliton Spiraling
Authors:
Alexander V. Buryak,
Yuri S. Kivshar,
Ming-feng Shih,
Mordechai Segev
Abstract:
We develop a theory of soliton spiraling in a bulk nonlinear medium and reveal a new physical mechanism: periodic power exchange via induced coherence, which can lead to stable spiraling and the formation of dynamical two-soliton states. Our theory not only explains earlier observations, but also provides a number of predictions, which are verified experimentally. Finally, we show theoretically and experimentally that soliton spiraling can be controlled by the degree of mutual initial coherence.
Submitted 14 December, 1998;
originally announced December 1998.