Computer Science > Computation and Language
[Submitted on 7 Aug 2023 (v1), last revised 11 Jan 2024 (this version, v3)]
Title:CORAL: Expert-Curated medical Oncology Reports to Advance Language Model Inference
View PDFAbstract:Both medical care and observational studies in oncology require a thorough understanding of a patient's disease progression and treatment history, often elaborately documented in clinical notes. Despite their vital role, no current oncology information representation and annotation schema fully encapsulates the diversity of information recorded within these notes. Although large language models (LLMs) have recently exhibited impressive performance on various medical natural language processing tasks, due to the current lack of comprehensively annotated oncology datasets, an extensive evaluation of LLMs in extracting and reasoning with the complex rhetoric in oncology notes remains understudied. We developed a detailed schema for annotating textual oncology information, encompassing patient characteristics, tumor characteristics, tests, treatments, and temporality. Using a corpus of 40 de-identified breast and pancreatic cancer progress notes at University of California, San Francisco, we applied this schema to assess the zero-shot abilities of three recent LLMs (GPT-4, GPT-3.5-turbo, and FLAN-UL2) to extract detailed oncological history from two narrative sections of clinical progress notes. Our team annotated 9028 entities, 9986 modifiers, and 5312 relationships. The GPT-4 model exhibited overall best performance, with an average BLEU score of 0.73, an average ROUGE score of 0.72, an exact-match F1-score of 0.51, and an average accuracy of 68% on complex tasks (expert manual evaluation on subset). Notably, it was proficient in tumor characteristic and medication extraction, and demonstrated superior performance in relational inference like adverse event detection. However, further improvements are needed before using it to reliably extract important facts from cancer progress notes needed for clinical research, complex population management, and documenting quality patient care.
Submission history
From: Madhumita Sushil [view email][v1] Mon, 7 Aug 2023 18:03:10 UTC (1,627 KB)
[v2] Wed, 11 Oct 2023 22:41:37 UTC (1,857 KB)
[v3] Thu, 11 Jan 2024 16:45:56 UTC (2,175 KB)
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
alphaXiv (What is alphaXiv?)
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Hugging Face (What is Huggingface?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.