-
Benchmarking AutoML Frameworks for Disease Prediction Using Medical Claims
Authors:
Roland Albert A. Romero,
Mariefel Nicole Y. Deypalan,
Suchit Mehrotra,
John Titus Jungao,
Natalie E. Sheils,
Elisabetta Manduchi,
Jason H. Moore
Abstract:
We ascertain and compare the performances of AutoML tools on large, highly imbalanced healthcare datasets.
We generated a large dataset using historical administrative claims including demographic information and flags for disease codes in four different time windows prior to 2019. We then trained three AutoML tools on this dataset to predict six different disease outcomes in 2019 and evaluated model performances on several metrics.
The AutoML tools showed improvement from the baseline random forest model but did not differ significantly from each other. All models recorded low area under the precision-recall curve and failed to predict true positives while keeping the true negative rate high. Model performance was not directly related to prevalence. We provide a specific use-case to illustrate how to select a threshold that gives the best balance between true and false positive rates, as this is an important consideration in medical applications.
Healthcare datasets present several challenges for AutoML tools, including large sample size, high imbalance, and limitations in the available feature types. Improvements in scalability, combinations of imbalance-learning resampling and ensemble approaches, and curated feature selection are possible next steps to achieve better performance.
Among the three explored, no AutoML tool consistently outperforms the rest in terms of predictive performance. The performances of the models in this study suggest that there may be room for improvement in handling medical claims data. Finally, selection of the optimal prediction threshold should be guided by the specific practical application.
Submitted 22 July, 2021;
originally announced July 2021.
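The threshold-selection point in the abstract above can be illustrated with a small sketch. It assumes one common criterion, Youden's J statistic (true positive rate minus false positive rate); the abstract does not say which criterion the authors used, and the function name `youden_threshold` is hypothetical.

```python
import numpy as np


def youden_threshold(y_true, y_score):
    """Return (threshold, tpr, fpr) maximizing Youden's J = TPR - FPR.

    ASSUMPTION: Youden's J is used only as an illustrative criterion;
    it is not taken from the paper.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    n_pos = y_true.sum()
    n_neg = (~y_true).sum()
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    best = None
    # Every distinct score is a candidate cut-off: predict positive if score >= t.
    for t in np.unique(y_score):
        pred = y_score >= t
        tpr = (pred & y_true).sum() / n_pos
        fpr = (pred & ~y_true).sum() / n_neg
        j = tpr - fpr
        if best is None or j > best[0]:
            best = (j, float(t), float(tpr), float(fpr))
    return best[1], best[2], best[3]


# Toy, made-up scores for a rare outcome (2 positives among 6 cases).
labels = [0, 0, 0, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.6, 0.55, 0.9]
threshold, tpr, fpr = youden_threshold(labels, scores)
```

In a medical setting the two error types rarely cost the same, so a weighted criterion may be more appropriate than the unweighted difference used here.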
-
Modeling differential rates of aging using routine laboratory data; Implications for morbidity and health care expenditure
Authors:
Alix Jean Santos,
Xavier Eugenio Asuncion,
Camille Rivero-Co,
Maria Eloisa Ventura,
Reynaldo Geronia II,
Lauren Bangerter,
Natalie E. Sheils
Abstract:
Aging is a multidimensional process where phenotypes change at varying rates. Longitudinal studies of aging typically involve following a cohort of individuals over the course of several years. This design is hindered by cost, attrition, and subsequently small sample size. Alternative methodologies are therefore warranted. In this study, we used a variational autoencoder to estimate rates of aging from cross-sectional data from routine laboratory tests of 1.4 million individuals collected from 2016 to 2019. By incorporating metrics that ensure the model's stability and the distinctness of the dimensions, we uncovered four aging dimensions that represent the following bodily functions: 1) kidney, 2) thyroid, 3) white blood cells, and 4) liver and heart. We then examined the relationship of rates of aging with morbidity and health care expenditure. In general, faster agers along these dimensions are more likely to develop chronic diseases that are related to these bodily functions. They also had higher health care expenditures compared to the slower agers. K-means clustering of individuals based on rate of aging revealed that clusters with higher odds of developing morbidity had the highest cost across all types of health care services. Results suggest that cross-sectional laboratory data can be leveraged as an alternative methodology to understand age along the different dimensions. Moreover, rates of aging are differentially related to future costs, which can aid in the development of interventions to delay disease progression.
Submitted 17 March, 2021;
originally announced March 2021.
-
Revivals and Fractalisation in the Linear Free Space Schrödinger Equation
Authors:
Peter J Olver,
Natalie E Sheils,
David A Smith
Abstract:
We consider the one-dimensional linear free space Schrödinger equation on a bounded interval subject to homogeneous linear boundary conditions. We prove that, in the case of pseudoperiodic boundary conditions, the solution of the initial-boundary value problem exhibits the phenomenon of revival at specific (`rational') times, meaning that it is a linear combination of a certain number of copies of the initial datum. Equivalently, the fundamental solution at these times is a finite linear combination of delta functions. At other (`irrational') times, for suitably rough initial data, e.g., a step or more general piecewise constant function, the solution exhibits a continuous but fractal-like profile. Further, we express the solution for general homogeneous linear boundary conditions in terms of numerically computable eigenfunctions. Alternative solution formulas are derived using the Unified Transform Method (UTM), which can prove useful in more general situations. We then investigate the effects of general linear boundary conditions, including Robin, and find novel `dissipative' revivals in the case of energy-decreasing conditions.
Submitted 20 December, 2018;
originally announced December 2018.
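The revival phenomenon described in the abstract above can be made concrete in the simplest setting. The following is a sketch under stated assumptions, not the paper's general result: it assumes purely periodic boundary conditions on $[0, 2\pi]$, the sign convention $i u_t + u_{xx} = 0$, and an initial datum $f$ with Fourier coefficients $c_k$; the symbols $p$, $q$, and $a_j$ are introduced here for illustration.

```latex
% Periodic solution of i u_t + u_{xx} = 0 with u(0,x) = f(x) = \sum_k c_k e^{ikx}:
\[
  u(t,x) = \sum_{k \in \mathbb{Z}} c_k \, e^{i(kx - k^2 t)} .
\]
% At the rational time t = 2\pi p/q, the factor e^{-2\pi i p k^2/q} depends only
% on k mod q, so it can be expanded in a discrete Fourier sum of length q:
\[
  e^{-2\pi i p k^2/q} = \sum_{j=0}^{q-1} a_j \, e^{-2\pi i j k/q},
  \qquad
  a_j = \frac{1}{q} \sum_{m=0}^{q-1} e^{-2\pi i p m^2/q} \, e^{2\pi i j m/q} .
\]
% Substituting back gives a finite combination of translates of the initial datum:
\[
  u\!\left(\tfrac{2\pi p}{q}, x\right)
  = \sum_{j=0}^{q-1} a_j \sum_{k \in \mathbb{Z}} c_k \, e^{ik\left(x - 2\pi j/q\right)}
  = \sum_{j=0}^{q-1} a_j \, f\!\left(x - \tfrac{2\pi j}{q}\right) .
\]
```

Here $f$ is extended periodically, so at these times the solution is a combination of at most $q$ shifted copies of the initial datum.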
-
The time-dependent Schrödinger equation with piecewise constant potentials
Authors:
Natalie E Sheils,
Bernard Deconinck
Abstract:
The linear Schrödinger equation with piecewise constant potential in one spatial dimension is a well-studied textbook problem. It is one of only a few solvable models in quantum mechanics and shares many qualitative features with physically important models. In examples such as "particle in a box" and tunneling, attention is restricted to the time-independent Schrödinger equation. This paper combines the Unified Transform Method and recent insights for interface problems to present fully explicit solutions for the time-dependent problem.
Submitted 28 June, 2018; v1 submitted 7 September, 2017;
originally announced September 2017.
-
Multilayer diffusion in a composite medium with imperfect contact
Authors:
Natalie E. Sheils
Abstract:
The problem of heat conduction in one-dimensional piecewise homogeneous composite materials is examined by providing an explicit solution of the one-dimensional heat equation in each domain. The location of the interfaces is known, but neither temperature nor heat flux is prescribed there. We find a solution using the Unified Transform Method, due to Fokas and collaborators, applied to interface problems and compute solutions numerically.
Submitted 21 December, 2016; v1 submitted 6 October, 2016;
originally announced October 2016.
-
The Linear KdV Equation with an Interface
Authors:
Bernard Deconinck,
Natalie E. Sheils,
David A. Smith
Abstract:
The interface problem for the linear Korteweg-de Vries (KdV) equation in one-dimensional piecewise homogeneous domains is examined by constructing an explicit solution in each domain. The location of the interface is known and a number of compatibility conditions at the boundary are imposed. We provide an explicit characterization of sufficient interface conditions for the construction of a solution using Fokas's Unified Transform Method. The problem and the method considered here extend those of earlier papers to problems with more than two spatial derivatives.
Submitted 19 September, 2016; v1 submitted 14 August, 2015;
originally announced August 2015.
-
Initial-to-Interface Maps for the Heat Equation on Composite Domains
Authors:
Natalie E. Sheils,
Bernard Deconinck
Abstract:
A map from the initial conditions to the values of the function and its first spatial derivative evaluated at the interface is constructed for the heat equation on finite and infinite domains with $n$ interfaces. The existence of this map allows the problem at hand to be changed from an interface problem to a boundary value problem, which offers an alternative to finding a closed-form solution to the interface problem.
Submitted 7 April, 2016; v1 submitted 28 June, 2015;
originally announced June 2015.
-
Heat equation on a network using the Fokas method
Authors:
N. E. Sheils,
D. A. Smith
Abstract:
The problem of heat conduction on networks of multiply connected rods is solved by providing an explicit solution of the one-dimensional heat equation in each domain. The size and connectivity of the rods are known, but neither temperature nor heat flux is prescribed at the interface. Instead, the physical assumptions of continuity at the interfaces are the only conditions imposed. This work generalizes that of Deconinck, Pelloni, and Sheils, 2014, for heat conduction on a series of one-dimensional rods connected end-to-end to the case of general configurations.
Submitted 30 July, 2015; v1 submitted 17 March, 2015;
originally announced March 2015.
-
Interface Problems for Dispersive equations
Authors:
Natalie E Sheils,
Bernard Deconinck
Abstract:
The interface problem for the linear Schrödinger equation in one-dimensional piecewise homogeneous domains is examined by providing an explicit solution in each domain. The location of the interfaces is known, and the continuity of the wave function and a jump in its derivative at the interface are the only conditions imposed. The problem of two semi-infinite domains and that of two finite-sized domains are examined in detail. The problem and the method considered here extend those of an earlier paper by Deconinck, Pelloni and Sheils (2014). The dispersive nature of the problem presents additional difficulties that are addressed here.
Submitted 2 August, 2014; v1 submitted 13 May, 2014;
originally announced May 2014.