Identifying patient cohorts is fundamental to numerous healthcare tasks, including clinical trial recruitment and retrospective studies. Current cohort retrieval methods in healthcare organizations rely on automated queries of structured data combined with manual curation, a process that is time-consuming, labor-intensive, and often yields low-quality results. Recent advances in large language models (LLMs) and information retrieval (IR) offer promising ways to improve these systems.
This paper introduces a new task called Automatic Cohort Retrieval (ACR), which extends clinical trial matching and cohort selection to large-scale, longitudinal data. The researchers evaluate the performance of LLMs and commercial, domain-specific neuro-symbolic approaches on this task.
Methods: The study introduces a benchmark task, a query dataset of 113 complex oncology queries, an EMR dataset of 1,436 patients with 115,865 medical records, and an evaluation framework. Three baseline approaches are evaluated: a retriever-only method using dense retrieval, a retrieve-then-read method combining dense retrieval with an LLM reader, and a neuro-symbolic approach using a commercial product called Hypercube.
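To make the difference between the first two baselines concrete, here is a minimal sketch. It is not the study's implementation: the `embed` and `reader` callables are hypothetical placeholders for a real dense encoder and an LLM reader, and the toy bag-of-words embedding, the keyword reader, the 0.7 threshold, and the three example patients are invented purely for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Set, Tuple

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k_records(query: str, records: List[str],
                  embed: Callable[[str], Vector], k: int) -> List[Tuple[float, str]]:
    """Rank one patient's records by similarity to the query, keep the best k."""
    q = embed(query)
    scored = [(cosine(q, embed(r)), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def retriever_only(query: str, patients: Dict[str, List[str]],
                   embed: Callable[[str], Vector], threshold: float, k: int = 3) -> Set[str]:
    """Include a patient if their best-matching record clears a score threshold."""
    cohort = set()
    for pid, records in patients.items():
        hits = top_k_records(query, records, embed, k)
        if hits and hits[0][0] >= threshold:
            cohort.add(pid)
    return cohort


def retrieve_then_read(query: str, patients: Dict[str, List[str]],
                       embed: Callable[[str], Vector],
                       reader: Callable[[str, List[str]], bool], k: int = 3) -> Set[str]:
    """Retrieve the top-k records per patient, then let a reader make the decision."""
    cohort = set()
    for pid, records in patients.items():
        evidence = [r for _, r in top_k_records(query, records, embed, k)]
        if reader(query, evidence):
            cohort.add(pid)
    return cohort


# --- Toy stand-ins (assumptions, not real models) ---
VOCAB = ["breast", "cancer", "lung", "her2", "positive", "negative"]


def toy_embed(text: str) -> List[float]:
    """Word counts over a tiny vocabulary; a real system would use a dense encoder."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]


def toy_reader(query: str, evidence: List[str]) -> bool:
    """Keyword check standing in for an LLM that reads the retrieved records."""
    return any("her2 positive" in r.lower() for r in evidence)


patients = {
    "p1": ["breast cancer her2 positive", "follow up visit"],
    "p2": ["breast cancer her2 negative"],
    "p3": ["lung cancer"],
}
query = "breast cancer her2 positive"

retrieved = retriever_only(query, patients, toy_embed, threshold=0.7)
read = retrieve_then_read(query, patients, toy_embed, toy_reader)
print(sorted(retrieved))  # ['p1', 'p2']
print(sorted(read))       # ['p1']
```

In this toy case the retriever alone admits the HER2-negative patient because the record is lexically close to the query, while the reading step rejects it. The reader is called once per patient, which is where the extra computational cost of a retrieve-then-read design comes from.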
Results: The neuro-symbolic approach (Hypercube) consistently outperformed the LLM-only baselines on F1-score, by a margin of 10.1% to 26.72%. The retrieve-then-read method showed significant quality gains over the retriever-only method, but at a substantial increase in computational cost. All methods degraded in performance as patient records grew longer, highlighting the challenge of longitudinal reasoning.
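For readers unfamiliar with how F1 applies to cohorts, the following sketch computes precision, recall, and F1 for a single query by comparing a predicted set of patient IDs with a gold-standard set. How the study aggregates scores across its queries is not stated in this summary, so the per-query form and the function name `cohort_prf` are assumptions.

```python
from typing import Set, Tuple


def cohort_prf(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for one query's predicted cohort vs. the gold cohort."""
    tp = len(predicted & gold)  # patients correctly retrieved
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Example: 3 patients retrieved, 2 of them correct, 4 patients in the gold cohort.
p, r, f1 = cohort_prf({"p1", "p2", "p3"}, {"p1", "p2", "p4", "p5"})
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.667 0.5 0.571
```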
The study also introduced metrics for evaluating hallucination tendencies and set-theoretic consistency of ACR systems. The neuro-symbolic approach showed lower hallucination rates and better consistency compared to LLM-only methods.
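This summary does not define the consistency metric, so the sketch below shows only one plausible reading of "set-theoretic consistency": the cohort a system returns for "A AND B" should match the intersection of the cohorts it returns for A and for B, and likewise "A OR B" should match the union. Agreement is scored with Jaccard similarity. The function names and the scoring choice are assumptions, not the study's definitions.

```python
from typing import Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def conjunction_consistency(cohort_a: Set[str], cohort_b: Set[str],
                            cohort_a_and_b: Set[str]) -> float:
    """How closely the answer to 'A AND B' matches cohort(A) intersected with cohort(B)."""
    return jaccard(cohort_a_and_b, cohort_a & cohort_b)


def disjunction_consistency(cohort_a: Set[str], cohort_b: Set[str],
                            cohort_a_or_b: Set[str]) -> float:
    """How closely the answer to 'A OR B' matches cohort(A) united with cohort(B)."""
    return jaccard(cohort_a_or_b, cohort_a | cohort_b)


# Hypothetical system outputs for three related queries.
a = {"p1", "p2", "p3"}
b = {"p2", "p3", "p4"}
and_score = conjunction_consistency(a, b, {"p2", "p3", "p5"})  # p5 appears in neither A nor B
or_score = disjunction_consistency(a, b, {"p1", "p2", "p3", "p4"})
print(round(and_score, 3), or_score)  # 0.667 1.0
```

A score below 1.0 on the conjunction, as with the stray `p5` here, flags a patient the system returned for the combined query without returning them for either component query.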
Conclusions: While the neuro-symbolic approach currently leads in performance, LLMs show potential to help automate the retrieval of patient cohorts from extensive, longitudinal datasets. The study highlights the importance of developing accurate yet efficient ACR systems to advance clinical research applications. It also underscores the potential of integrating expert knowledge with LLMs in healthcare, a domain rich in explicit knowledge.
Future work should focus on refining and adapting LLM technologies to meet the specific needs of medical researchers, improving model interpretability, and controlling hallucinations. These advancements could support better clinical decision-making, patient outcomes, and the development of new treatments and interventions.