
Scaling Vision Transformers to Gigapixel Images via Hierarchical Self-Supervised Learning

Authors: Richard J. Chen, Chengkuan Chen, Yicong Li, Tiffany Y. Chen, Andrew D. Trister, Rahul G. Krishnan, Faisal Mahmood

Abstract: Vision Transformers (ViTs) and their multi-scale and hierarchical variations have been successful at capturing image representations, but their use has generally been studied for low-resolution images (e.g., 256x256, 384x384). For gigapixel whole-slide imaging (WSI) in computational pathology, WSIs can be as large as 150000x150000 pixels at 20X magnification and exhibit a hierarchical structure of visual tokens across varying resolutions: from 16x16 images capturing spatial patterns among cells, to 4096x4096 images characterizing interactions within the tissue microenvironment. We introduce a new ViT architecture called the Hierarchical Image Pyramid Transformer (HIPT), which leverages the natural hierarchical structure inherent in WSIs using two levels of self-supervised learning to learn high-resolution image representations. HIPT is pretrained across 33 cancer types using 10,678 gigapixel WSIs, 408,218 4096x4096 images, and 104M 256x256 images. We benchmark HIPT representations on 9 slide-level tasks, and demonstrate that: 1) HIPT with hierarchical pretraining outperforms current state-of-the-art methods for cancer subtyping and survival prediction, 2) self-supervised ViTs are able to model important inductive biases about the hierarchical structure of phenotypes in the tumor microenvironment.

Submitted 6 June, 2022; originally announced June 2022.

Comments: Accepted to CVPR 2022 (Oral)
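
The image sizes quoted in the HIPT abstract above nest evenly, which a few lines of arithmetic make concrete. The sketch below assumes non-overlapping square tiling at each level (the abstract does not state the tiling scheme), and the helper names are hypothetical. Under that assumption, fully tiling the 408,218 regions gives an upper bound that is consistent with the "104M 256x256 images" figure.

```python
# Sizes quoted in the abstract, in pixels per side.
REGION, PATCH, TOKEN = 4096, 256, 16
N_REGIONS = 408_218  # number of 4096x4096 images quoted in the abstract


def tiles_per_side(outer: int, inner: int) -> int:
    """Non-overlapping inner tiles along one side of an outer square (assumed tiling)."""
    if outer % inner != 0:
        raise ValueError("outer size must be a multiple of inner size")
    return outer // inner


def tiles(outer: int, inner: int) -> int:
    """Total non-overlapping inner tiles inside an outer square."""
    return tiles_per_side(outer, inner) ** 2


patches_per_region = tiles(REGION, PATCH)  # 16 per side, 256 in total
tokens_per_patch = tiles(PATCH, TOKEN)     # 16 per side, 256 in total

# Upper bound on 256x256 images if every region were fully tiled.
max_patches = N_REGIONS * patches_per_region

print(patches_per_region, tokens_per_patch, max_patches)
```

Each level of the hierarchy therefore sees a sequence of 256 tiles from the level below, which is what makes a two-level aggregation tractable for a transformer.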

arXiv:1712.03120 [pdf, other]

Learning Disease vs Participant Signatures: a permutation test approach to detect identity confounding in machine learning diagnostic applications

Authors: Elias Chaibub Neto, Abhishek Pratap, Thanneer M Perumal, Meghasyam Tummalacherla, Brian M Bot, Andrew D Trister, Stephen H Friend, Lara Mangravite, Larsson Omberg

Abstract: Recently, Saeb et al (2017) showed that, in diagnostic machine learning applications, having data of each subject randomly assigned to both training and test sets (record-wise data split) can lead to massive underestimation of the cross-validation prediction error, due to the presence of "subject identity confounding" caused by the classifier's ability to identify subjects, instead of recognizing disease. To solve this problem, the authors recommended the random assignment of the data of each subject to either the training or the test set (subject-wise data split). The adoption of subject-wise split has been criticized in Little et al (2017), on the basis that it can violate assumptions required by cross-validation to consistently estimate generalization error. In particular, adopting subject-wise splitting in heterogeneous data-sets might lead to model under-fitting and larger classification errors. Hence, Little et al argue that perhaps the overestimation of prediction errors with subject-wise cross-validation, rather than underestimation with record-wise cross-validation, is the reason for the discrepancies between prediction error estimates generated by the two splitting strategies. In order to shed light on this controversy, we focus on simpler classification performance metrics and develop permutation tests that can detect identity confounding. By focusing on permutation tests, we are able to evaluate the merits of record-wise and subject-wise data splits under more general statistical dependencies and distributional structures of the data, including situations where cross-validation breaks down. We illustrate the application of our tests using synthetic and real data from a Parkinson's disease study.

Submitted 6 July, 2018; v1 submitted 8 December, 2017; originally announced December 2017.
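
The two splitting strategies contrasted in the abstract above can be sketched in a few lines. This is only an illustration of the definitions given there, not the authors' permutation test. The function names and the toy `(subject_id, record_id)` data are hypothetical.

```python
import random
from collections import defaultdict


def record_wise_split(records, test_fraction=0.5, seed=0):
    """Shuffle individual records, so one subject can land in both sets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def subject_wise_split(records, test_fraction=0.5, seed=0):
    """Assign whole subjects to either the training or the test set."""
    rng = random.Random(seed)
    by_subject = defaultdict(list)
    for rec in records:
        by_subject[rec[0]].append(rec)
    subjects = sorted(by_subject)
    rng.shuffle(subjects)
    n_test = int(len(subjects) * test_fraction)
    test_subjects = set(subjects[:n_test])
    train = [r for r in records if r[0] not in test_subjects]
    test = [r for r in records if r[0] in test_subjects]
    return train, test


def shared_subjects(train, test):
    """Subjects that appear on both sides of a split."""
    return {r[0] for r in train} & {r[0] for r in test}


# Hypothetical toy data: 5 subjects with 4 records each.
records = [(s, i) for s in range(5) for i in range(4)]
rw_train, rw_test = record_wise_split(records)
sw_train, sw_test = subject_wise_split(records)

print("record-wise overlap:", sorted(shared_subjects(rw_train, rw_test)))
print("subject-wise overlap:", sorted(shared_subjects(sw_train, sw_test)))
```

With this toy data the record-wise test set holds 10 records, which is not a multiple of 4, so at least one subject is always split across both sets. That overlap is what gives a classifier the opportunity to recognize subjects rather than disease.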

arXiv:1607.00091 [pdf, ps, other]

Reducing overfitting in challenge-based competitions

Authors: Elias Chaibub Neto, Bruce R Hoff, Chris Bare, Brian M Bot, Thomas Yu, Lara Magravite, Andrew D Trister, Thea Norman, Pablo Meyer, Julio Saez-Rodrigues, James C Costello, Justin Guinney, Gustavo Stolovitzky

Abstract: Over-fitting is a dreaded foe in challenge-based competitions. Because participants rely on public leaderboards to evaluate and refine their models, there is always the danger they might over-fit to the holdout data supporting the leaderboard. The recently published Ladder algorithm aims to address this problem by preventing the participants from exploiting, willingly or inadvertently, minor fluctuations in public leaderboard scores during model refinement. In this paper, we report a vulnerability of the Ladder that induces severe over-fitting of the leaderboard when the sample size is small. To circumvent this attack, we propose a variation of the Ladder that releases a bootstrapped estimate of the public leaderboard score instead of providing participants with a direct measure of performance. We also extend the scope of the Ladder to arbitrary performance metrics by relying on a more broadly applicable testing procedure based on the Bayesian bootstrap. Our method makes it possible to use a leaderboard, with the technical and social advantages that it provides, even in cases where data is scant.

Submitted 30 June, 2016; originally announced July 2016.
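
To make "a bootstrapped estimate of the public leaderboard score" concrete, here is a minimal sketch of an ordinary bootstrap resample of a holdout set. It is not the authors' Ladder variation, whose details are not given in the abstract above. The function names, the accuracy metric, and the toy labels are all assumptions made for illustration.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrapped_score(y_true, y_pred, metric=accuracy, seed=None):
    """Score a submission on one bootstrap resample of the holdout set.

    The participant sees a perturbed score rather than the exact holdout
    score, which blunts attempts to chase small fluctuations.
    """
    rng = random.Random(seed)
    n = len(y_true)
    idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
    return metric([y_true[i] for i in idx], [y_pred[i] for i in idx])


# Hypothetical toy holdout labels and one submission.
y_true = [0, 1, 1, 0, 1, 0, 1, 1]
y_pred = [0, 1, 0, 0, 1, 1, 1, 1]

exact = accuracy(y_true, y_pred)
released = bootstrapped_score(y_true, y_pred, seed=1)
print("exact holdout accuracy:", exact)
print("released bootstrapped score:", released)
```

Because `metric` is a plain callable, the same resampling step applies to any performance measure computed from paired labels and predictions.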

arXiv:1604.01055 [pdf, ps, other]

Towards personalized causal inference of medication response in mobile health: an instrumental variable approach for randomized trials with imperfect compliance

Authors: Elias Chaibub Neto, Ross L Prentice, Brian M Bot, Mike Kellen, Stephen H Friend, Andrew D Trister, Larsson Omberg, Lara Mangravite

Abstract: Mobile health studies can leverage longitudinal sensor data from smartphones to guide the application of personalized medical interventions. In this paper, we propose that adoption of an instrumental variable approach for randomized trials with imperfect compliance provides a natural framework for personalized causal inference of medication response in mobile health studies. Randomized treatment suggestions can be easily delivered to the study participants via electronic messages popping up on the smart-phone screen. Under quite general assumptions we can identify the causal effect of the actual treatment on the response in the presence of unobserved confounders. We implement a personalized randomization test of the null hypothesis of no causal effect of treatment on response, and evaluate its performance in a large scale simulation study encompassing data generated from linear and non-linear time series models under several simulation conditions. In particular, we evaluate the empirical power of the proposed test under varying degrees of compliance between the suggested and actual treatment adopted by the participant. Our investigations provide encouraging results in terms of power and control of type I error rates. Finally, we compare the proposed instrumental variable approach to a simple intent-to-treat strategy, and develop randomization confidence intervals for the causal effects.

Submitted 31 July, 2017; v1 submitted 4 April, 2016; originally announced April 2016.

Comments: Main text, appendixes, and supplementary materials were re-organized
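
The contrast drawn in the abstract above between an instrumental variable approach and a simple intent-to-treat strategy can be illustrated with the textbook Wald estimator for a binary instrument. This is a generic sketch, not the authors' personalized randomization test. The function names and the toy data (a randomized suggestion `z`, the treatment actually taken `x`, and a response `y`) are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def group_means(z, values):
    """Mean of `values` among units with z == 1 and among units with z == 0."""
    suggested = [v for zi, v in zip(z, values) if zi == 1]
    not_suggested = [v for zi, v in zip(z, values) if zi == 0]
    return mean(suggested), mean(not_suggested)


def intent_to_treat(z, y):
    """Effect of the randomized suggestion on the response."""
    y1, y0 = group_means(z, y)
    return y1 - y0


def wald_iv_estimate(z, x, y):
    """Textbook Wald estimator: intent-to-treat effect scaled by the compliance gap."""
    x1, x0 = group_means(z, x)
    compliance_gap = x1 - x0
    if compliance_gap == 0:
        raise ValueError("suggestion does not change treatment uptake")
    return intent_to_treat(z, y) / compliance_gap


# Hypothetical toy data: one of four suggested doses is skipped.
z = [0, 0, 0, 0, 1, 1, 1, 1]   # randomized suggestion
x = [0, 0, 0, 0, 1, 1, 1, 0]   # treatment actually taken
y = [3 * xi for xi in x]       # response with a built-in effect of 3

print("intent-to-treat:", intent_to_treat(z, y))
print("Wald IV estimate:", wald_iv_estimate(z, x, y))
```

In this toy setup the intent-to-treat contrast is diluted by the skipped dose, while dividing by the compliance gap recovers the effect of 3 that was built into the data.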

Showing 1–4 of 4 results for author: Trister, A D