Augmented Curation of Unstructured Clinical Notes from a Massive EHR System Reveals Specific Phenotypic Signature of Impending COVID-19 Diagnosis
Authors:
FNU Shweta,
Karthik Murugadoss,
Samir Awasthi,
AJ Venkatakrishnan,
Arjun Puranik,
Martin Kang,
Brian W. Pickering,
John C. O'Horo,
Philippe R. Bauer,
Raymund R. Razonable,
Paschalis Vergidis,
Zelalem Temesgen,
Stacey Rizza,
Maryam Mahmood,
Walter R. Wilson,
Douglas Challener,
Praveen Anand,
Matt Liebers,
Zainab Doctor,
Eli Silvert,
Hugo Solomon,
Tyler Wagner,
Gregory J. Gores,
Amy W. Williams,
John Halamka, et al. (2 additional authors not shown)
Abstract:
Understanding the temporal dynamics of COVID-19 patient phenotypes is necessary to derive fine-grained resolution of pathophysiology. Here we use state-of-the-art deep neural networks over an institution-wide machine intelligence platform for the augmented curation of 15.8 million clinical notes from 30,494 patients subjected to COVID-19 PCR diagnostic testing. By contrasting the Electronic Health Record (EHR)-derived clinical phenotypes of COVID-19-positive (COVIDpos, n=635) versus COVID-19-negative (COVIDneg, n=29,859) patients over each day of the week preceding the PCR testing date, we identify anosmia/dysgeusia (37.4-fold), myalgia/arthralgia (2.6-fold), diarrhea (2.2-fold), fever/chills (2.1-fold), respiratory difficulty (1.9-fold), and cough (1.8-fold) as significantly amplified in COVIDpos over COVIDneg patients. The specific combination of cough and diarrhea has a 3.2-fold amplification in COVIDpos patients during the week prior to PCR testing, and along with anosmia/dysgeusia, constitutes the earliest EHR-derived signature of COVID-19 (4-7 days prior to typical PCR testing date). This study introduces an Augmented Intelligence platform for the real-time synthesis of institutional knowledge captured in EHRs. The platform holds tremendous potential for scaling up curation throughput, with minimal need for retraining underlying neural networks, thus promising EHR-powered early diagnosis for a broad spectrum of diseases.
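The abstract reports each phenotype as an "N-fold" amplification without defining how it is computed. One plausible reading, offered here as an assumption rather than the authors' stated method, is the ratio of phenotype prevalence in the COVIDpos cohort to its prevalence in the COVIDneg cohort. The sketch below illustrates that arithmetic. The cohort sizes come from the abstract; the function name `fold_amplification` and the per-phenotype counts are hypothetical and chosen only for illustration.

```python
def fold_amplification(pos_with, pos_total, neg_with, neg_total):
    """Ratio of phenotype prevalence in the positive cohort to the negative cohort.

    This is an assumed definition of "fold amplification", not one taken
    from the paper.
    """
    if pos_total <= 0 or neg_total <= 0:
        raise ValueError("cohort sizes must be positive")
    if neg_with == 0:
        raise ValueError("ratio is undefined when the negative cohort has no cases")
    return (pos_with / pos_total) / (neg_with / neg_total)


# Cohort sizes as stated in the abstract.
COVIDPOS_N = 635
COVIDNEG_N = 29_859

# Hypothetical counts of patients whose notes mention a given phenotype.
# These are NOT figures from the study.
hypothetical_pos_with_phenotype = 50
hypothetical_neg_with_phenotype = 1_000

ratio = fold_amplification(
    hypothetical_pos_with_phenotype,
    COVIDPOS_N,
    hypothetical_neg_with_phenotype,
    COVIDNEG_N,
)
print(f"{ratio:.1f}-fold amplification in COVIDpos over COVIDneg")
```

Because the two cohorts differ in size by a factor of roughly 47, comparing prevalence rather than raw counts is what makes such a ratio meaningful; a statistical test would still be needed to support a claim of significance.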
Submitted 28 April, 2020; v1 submitted 17 April, 2020; originally announced April 2020.