subscribe to arXiv mailings

An Offline Adaptation Framework for Constrained Multi-Objective Reinforcement Learning

Authors: Qian Lin, Zongkai Liu, Danying Mo, Chao Yu

Abstract: In recent years, significant progress has been made in multi-objective reinforcement learning (RL) research, which aims to balance multiple objectives by incorporating preferences for each objective. In most existing studies, specific preferences must be provided during deployment to indicate the desired policies explicitly. However, designing these preferences depends heavily on human prior knowl… ▽ More In recent years, significant progress has been made in multi-objective reinforcement learning (RL) research, which aims to balance multiple objectives by incorporating preferences for each objective. In most existing studies, specific preferences must be provided during deployment to indicate the desired policies explicitly. However, designing these preferences depends heavily on human prior knowledge, which is typically obtained through extensive observation of high-performing demonstrations with expected behaviors. In this work, we propose a simple yet effective offline adaptation framework for multi-objective RL problems without assuming handcrafted target preferences, but only given several demonstrations to implicitly indicate the preferences of expected policies. Additionally, we demonstrate that our framework can naturally be extended to meet constraints on safety-critical objectives by utilizing safe demonstrations, even when the safety thresholds are unknown. Empirical results on offline multi-objective and safe tasks demonstrate the capability of our framework to infer policies that align with real preferences while meeting the constraints implied by the provided demonstrations. △ Less

Submitted 15 September, 2024; originally announced September 2024.

arXiv:2408.13727 [pdf, other]

LogParser-LLM: Advancing Efficient Log Parsing with Large Language Models

Authors: Aoxiao Zhong, Dengyao Mo, Guiyang Liu, Jinbu Liu, Qingda Lu, Qi Zhou, Jiesheng Wu, Quanzheng Li, Qingsong Wen

Abstract: Logs are ubiquitous digital footprints, playing an indispensable role in system diagnostics, security analysis, and performance optimization. The extraction of actionable insights from logs is critically dependent on the log parsing process, which converts raw logs into structured formats for downstream analysis. Yet, the complexities of contemporary systems and the dynamic nature of logs pose sig… ▽ More Logs are ubiquitous digital footprints, playing an indispensable role in system diagnostics, security analysis, and performance optimization. The extraction of actionable insights from logs is critically dependent on the log parsing process, which converts raw logs into structured formats for downstream analysis. Yet, the complexities of contemporary systems and the dynamic nature of logs pose significant challenges to existing automatic parsing techniques. The emergence of Large Language Models (LLM) offers new horizons. With their expansive knowledge and contextual prowess, LLMs have been transformative across diverse applications. Building on this, we introduce LogParser-LLM, a novel log parser integrated with LLM capabilities. This union seamlessly blends semantic insights with statistical nuances, obviating the need for hyper-parameter tuning and labeled training data, while ensuring rapid adaptability through online parsing. Further deepening our exploration, we address the intricate challenge of parsing granularity, proposing a new metric and integrating human interactions to allow users to calibrate granularity to their specific needs. Our method's efficacy is empirically demonstrated through evaluations on the Loghub-2k and the large-scale LogPub benchmark. In evaluations on the LogPub benchmark, involving an average of 3.6 million logs per dataset across 14 datasets, our LogParser-LLM requires only 272.5 LLM invocations on average, achieving a 90.6% F1 score for grouping accuracy and an 81.1% for parsing accuracy. These results demonstrate the method's high efficiency and accuracy, outperforming current state-of-the-art log parsers, including pattern-based, neural network-based, and existing LLM-enhanced approaches. △ Less

Submitted 25 August, 2024; originally announced August 2024.

Comments: Accepted by ACM KDD 2024

arXiv:2408.02859 [pdf, other]

Multistain Pretraining for Slide Representation Learning in Pathology

Authors: Guillaume Jaume, Anurag Vaidya, Andrew Zhang, Andrew H. Song, Richard J. Chen, Sharifa Sahai, Dandan Mo, Emilio Madrigal, Long Phi Le, Faisal Mahmood

Abstract: Developing self-supervised learning (SSL) models that can learn universal and transferable representations of H&E gigapixel whole-slide images (WSIs) is becoming increasingly valuable in computational pathology. These models hold the potential to advance critical tasks such as few-shot classification, slide retrieval, and patient stratification. Existing approaches for slide representation learnin… ▽ More Developing self-supervised learning (SSL) models that can learn universal and transferable representations of H&E gigapixel whole-slide images (WSIs) is becoming increasingly valuable in computational pathology. These models hold the potential to advance critical tasks such as few-shot classification, slide retrieval, and patient stratification. Existing approaches for slide representation learning extend the principles of SSL from small images (e.g., 224 x 224 patches) to entire slides, usually by aligning two different augmentations (or views) of the slide. Yet the resulting representation remains constrained by the limited clinical and biological diversity of the views. Instead, we postulate that slides stained with multiple markers, such as immunohistochemistry, can be used as different views to form a rich task-agnostic training signal. To this end, we introduce Madeleine, a multimodal pretraining strategy for slide representation learning. Madeleine is trained with a dual global-local cross-stain alignment objective on large cohorts of breast cancer samples (N=4,211 WSIs across five stains) and kidney transplant samples (N=12,070 WSIs across four stains). We demonstrate the quality of slide representations learned by Madeleine on various downstream evaluations, ranging from morphological and molecular classification to prognostic prediction, comprising 21 tasks using 7,299 WSIs from multiple medical centers. Code is available at https://github.com/mahmoodlab/MADELEINE. △ Less

Submitted 5 August, 2024; originally announced August 2024.

Comments: ECCV'24

arXiv:2308.12577 [pdf, other]

doi 10.1016/j.knosys.2024.111563

REB: Reducing Biases in Representation for Industrial Anomaly Detection

Authors: Shuai Lyu, Dongmei Mo, Waikeung Wong

Abstract: Existing representation-based methods usually conduct industrial anomaly detection in two stages: obtain feature representations with a pre-trained model and perform distance measures for anomaly detection. Among them, K-nearest neighbor (KNN) retrieval-based anomaly detection methods show promising results. However, the features are not fully exploited as these methods ignore domain bias of pre-t… ▽ More Existing representation-based methods usually conduct industrial anomaly detection in two stages: obtain feature representations with a pre-trained model and perform distance measures for anomaly detection. Among them, K-nearest neighbor (KNN) retrieval-based anomaly detection methods show promising results. However, the features are not fully exploited as these methods ignore domain bias of pre-trained models and the difference of local density in feature space, which limits the detection performance. In this paper, we propose Reducing Biases (REB) in representation by considering the domain bias and building a self-supervised learning task for better domain adaption with a defect generation strategy (DefectMaker) that ensures a strong diversity in the synthetic defects. Additionally, we propose a local-density KNN (LDKNN) to reduce the local density bias in the feature space and obtain effective anomaly detection. The proposed REB method achieves a promising result of 99.5\% Im.AUROC on the widely used MVTec AD, with smaller backbone networks such as Vgg11 and Resnet18. The method also achieves an impressive 88.8\% Im.AUROC on the MVTec LOCO AD dataset and a remarkable 96.0\% on the BTAD dataset, outperforming other representation-based approaches. These results indicate the effectiveness and efficiency of REB for practical industrial applications. Code:https://github.com/ShuaiLYU/REB. △ Less

Submitted 17 May, 2024; v1 submitted 24 August, 2023; originally announced August 2023.

Comments: 14 pages, 7 figures, 7 tables

arXiv:2308.07013 [pdf, other]

Learning to Optimize LSM-trees: Towards A Reinforcement Learning based Key-Value Store for Dynamic Workloads

Authors: Dingheng Mo, Fanchao Chen, Siqiang Luo, Caihua Shan

Abstract: LSM-trees are widely adopted as the storage backend of key-value stores. However, optimizing the system performance under dynamic workloads has not been sufficiently studied or evaluated in previous work. To fill the gap, we present RusKey, a key-value store with the following new features: (1) RusKey is a first attempt to orchestrate LSM-tree structures online to enable robust performance under t… ▽ More LSM-trees are widely adopted as the storage backend of key-value stores. However, optimizing the system performance under dynamic workloads has not been sufficiently studied or evaluated in previous work. To fill the gap, we present RusKey, a key-value store with the following new features: (1) RusKey is a first attempt to orchestrate LSM-tree structures online to enable robust performance under the context of dynamic workloads; (2) RusKey is the first study to use Reinforcement Learning (RL) to guide LSM-tree transformations; (3) RusKey includes a new LSM-tree design, named FLSM-tree, for an efficient transition between different compaction policies -- the bottleneck of dynamic key-value stores. We justify the superiority of the new design with theoretical analysis; (4) RusKey requires no prior workload knowledge for system adjustment, in contrast to state-of-the-art techniques. Experiments show that RusKey exhibits strong performance robustness in diverse workloads, achieving up to 4x better end-to-end performance than the RocksDB system under various settings. △ Less

Submitted 17 September, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

Comments: 25 pages, 13 figures

arXiv:2207.09179 [pdf, other]

doi 10.14778/3551793.3551866

SCARA: Scalable Graph Neural Networks with Feature-Oriented Optimization

Authors: Ningyi Liao, Dingheng Mo, Siqiang Luo, Xiang Li, Pengcheng Yin

Abstract: Recent advances in data processing have stimulated the demand for learning graphs of very large scales. Graph Neural Networks (GNNs), being an emerging and powerful approach in solving graph learning tasks, are known to be difficult to scale up. Most scalable models apply node-based techniques in simplifying the expensive graph message-passing propagation procedure of GNN. However, we find such ac… ▽ More Recent advances in data processing have stimulated the demand for learning graphs of very large scales. Graph Neural Networks (GNNs), being an emerging and powerful approach in solving graph learning tasks, are known to be difficult to scale up. Most scalable models apply node-based techniques in simplifying the expensive graph message-passing propagation procedure of GNN. However, we find such acceleration insufficient when applied to million- or even billion-scale graphs. In this work, we propose SCARA, a scalable GNN with feature-oriented optimization for graph computation. SCARA efficiently computes graph embedding from node features, and further selects and reuses feature computation results to reduce overhead. Theoretical analysis indicates that our model achieves sub-linear time complexity with a guaranteed precision in propagation process as well as GNN training and inference. We conduct extensive experiments on various datasets to evaluate the efficacy and efficiency of SCARA. Performance comparison with baselines shows that SCARA can reach up to 100x graph propagation acceleration than current state-of-the-art methods with fast convergence and comparable accuracy. Most notably, it is efficient to process precomputation on the largest available billion-scale GNN dataset Papers100M (111M nodes, 1.6B edges) in 100 seconds. △ Less

Submitted 19 July, 2022; originally announced July 2022.

Journal ref: Proceedings of the VLDB Endowment 15 (2022) 3240-3248

arXiv:1811.12322 [pdf, other]

Binary Sequence Set Design for Interferer Rejection in Multi-Branch Modulation

Authors: Dian Mo, Marco F. Duarte

Abstract: Wideband communication is often expected to deal with a very wide spectrum, which in many environments of interest includes strong interferers. Thus receivers for the wideband communication systems often need to mitigate interferers to reduce the distortion caused by the amplifier nonlinearity and noise. Recently, a new architecture for communication receivers known as random modulation mixes a si… ▽ More Wideband communication is often expected to deal with a very wide spectrum, which in many environments of interest includes strong interferers. Thus receivers for the wideband communication systems often need to mitigate interferers to reduce the distortion caused by the amplifier nonlinearity and noise. Recently, a new architecture for communication receivers known as random modulation mixes a signal with different pseudorandom sequences using multiple branches of channels before sampling. While random modulation is used in these receivers to acquire the signal at low sampling rates, the modulation sequences used lack the ability to suppress interferers due to their flat spectra. In previous work, we introduced the design of a single spectrally shaped binary sequence that mitigates interferers to replace the pseudorandom sequence in a channel. However, the designed sequences cannot provide the stable recovery achieved by pseudorandom sequence approaches. In this paper, we extend our previous sequence design to guarantee stable recovery by designing a set of sequences to be orthogonal to each other. We show that it is difficult to find the necessary number of sequences featuring mutual orthogonality and introduce oversampling to the sequence set design to improve the recovery performance. We propose an algorithm for multi-branch sequence design as a binary optimization problem, which is solved using a semidefinite program relaxation and randomized projection. While it is common to model narrowband interferers as a subspace spanned by a subset of elements from the Fourier basis, we show that the Slepian basis provides an alternative and more suitable compact representation for signals with components contained in narrow spectrum bands. Numerical experiments using the proposed sequence sets show their advantages against pseudorandom sequences and our previous work. △ Less

Submitted 3 June, 2020; v1 submitted 29 November, 2018; originally announced November 2018.

Comments: 11 pages, 6 figures. To appear in IEEE Transactions on Signal Processing

arXiv:1811.05873 [pdf, other]

Design of Spectrally Shaped Binary Sequences via Randomized Convex Relaxation

Authors: Dian Mo, Marco F. Duarte

Abstract: Wideband communication receivers often deal with the problems of detecting weak signals from distant sources received together with strong nearby interferers. When the techniques of random modulation are used in communication system receivers, one can design a spectrally shaped sequence that mitigates interferer bands while preserving message bands. Common implementation constraints require sequen… ▽ More Wideband communication receivers often deal with the problems of detecting weak signals from distant sources received together with strong nearby interferers. When the techniques of random modulation are used in communication system receivers, one can design a spectrally shaped sequence that mitigates interferer bands while preserving message bands. Common implementation constraints require sequence quantization, which turns the design problem formulation to an integer optimization problem solved using a semidefinite program on a matrix that is restricted to have rank one. Common approximation schemes for this problem are not amenable due to the distortion to the spectrum caused by the required quantization. We propose a method that leverages a randomized projection and quantization of the solution of a semidefinite program, an approach that has been previously used for related integer programs. We provide a theoretical and numerical analysis on the feasibility and quality of the approximation provided by the proposed approach. Furthermore, numerical simulations show that our proposed approach returns the same sequence as an exhaustive search (when feasible), showcasing its accuracy and efficiency. Furthermore, our proposed method succeeds in finding suitable spectrally shaped sequences for cases where exhaustive search is not feasible, achieving better performance than existing alternatives. △ Less

Submitted 14 November, 2018; originally announced November 2018.

Comments: 27 Pages, 7 figures

arXiv:1412.6724 [pdf, other]

doi 10.1016/j.sigpro.2017.07.003

Performance of Compressive Parameter Estimation via K-Median Clustering

Authors: Dian Mo, Marco F. Duarte

Abstract: Compressive sensing (CS) has attracted significant attention in parameter estimation tasks, where parametric dictionaries (PDs) collect signal observations for a sampling of the parameter space and yield sparse representations for signals of interest when the sampling is dense. While this sampling also leads to high dictionary coherence, one can leverage structured sparsity models to prevent highl… ▽ More Compressive sensing (CS) has attracted significant attention in parameter estimation tasks, where parametric dictionaries (PDs) collect signal observations for a sampling of the parameter space and yield sparse representations for signals of interest when the sampling is dense. While this sampling also leads to high dictionary coherence, one can leverage structured sparsity models to prevent highly coherent dictionary elements from appearing simultaneously in the recovered signal. However, the resulting approaches depend heavily on the careful setting of the maximum allowable coherence; furthermore, their guarantees are not concerned with general parameter estimation performance. We propose the use of earth mover's distance (EMD), as applied to a pair of true and estimated PD coefficient vectors, to measure the parameter estimation error. We formally analyze the connection between the EMD and the parameter estimation error and show that the EMD provides a better-suited metric for parameter estimation performance than the Euclidean distance. Additionally, we analyze the previously described relationship between K-median clustering and EMD-optimal sparse approximation and leverage it to develop improved PD-based parameter estimation algorithms. Finally, our numerical experiments verify our theoretical results and show that the proposed compressive parameter estimation algorithms have improved performance over existing approaches. △ Less

Submitted 11 May, 2017; v1 submitted 21 December, 2014; originally announced December 2014.

Comments: 41 pages, 8 figures; revision includes additional discussions and experiment

Showing 1–9 of 9 results for author: Mo, D