Research Paper

Revolutionizing Patient Cohort Identification with AI – Insights from Mendel’s ACR Benchmark

Identifying patient cohorts is fundamental to numerous healthcare tasks, including clinical trial recruitment and retrospective studies. Current cohort retrieval methods in healthcare organizations combine automated queries of structured data with manual curation, a process that is time-consuming and labor-intensive and often yields low-quality results. Recent advancements in large language models (LLMs) and information retrieval (IR) offer promising avenues to revolutionize these systems.

This paper introduces a new task called Automatic Cohort Retrieval (ACR), which extends clinical trial matching and cohort selection to large-scale, longitudinal data. The researchers evaluate the performance of LLMs and commercial, domain-specific neuro-symbolic approaches on this task.

Methods: The study introduces a benchmark task, a query dataset of 113 complex oncology queries, an EMR dataset of 1,436 patients with 115,865 medical records, and an evaluation framework. Three baseline approaches are evaluated: a retriever-only method using dense retrieval, a retrieve-then-read method combining dense retrieval with an LLM reader, and a neuro-symbolic approach using a commercial product called Hypercube.
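The difference between the retriever-only and retrieve-then-read baselines can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: a bag-of-words cosine stands in for the dense encoder, `toy_reader` is a hypothetical keyword rule standing in for the LLM reader, and the three patient notes are invented. It is not the paper's implementation.

```python
import math
from collections import Counter
from typing import Callable

def similarity(a: str, b: str) -> float:
    """Cosine similarity over word counts (toy stand-in for dense embeddings)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0

def retrieve(query: str, records: dict[str, list[str]], threshold: float) -> dict[str, list[str]]:
    """Retriever-only step: keep patients with at least one note scoring >= threshold.

    Returns patient id -> matching notes, which double as evidence for a reader.
    """
    hits: dict[str, list[str]] = {}
    for patient_id, notes in records.items():
        matched = [n for n in notes if similarity(query, n) >= threshold]
        if matched:
            hits[patient_id] = matched
    return hits

def retrieve_then_read(
    query: str,
    records: dict[str, list[str]],
    threshold: float,
    reader: Callable[[str, list[str]], bool],
) -> set[str]:
    """Retrieve candidates, then let a reader accept or reject each one."""
    candidates = retrieve(query, records, threshold)
    return {pid for pid, evidence in candidates.items() if reader(query, evidence)}

def toy_reader(query: str, evidence: list[str]) -> bool:
    """Hypothetical reader: accept if any retrieved note is not an explicit negation."""
    return any("no evidence of" not in note.lower() for note in evidence)

# Invented example data, not taken from the benchmark.
records = {
    "p1": ["stage iv lung adenocarcinoma treated with osimertinib"],
    "p2": ["no evidence of lung adenocarcinoma on biopsy"],
    "p3": ["her2 positive breast cancer on trastuzumab"],
}
query = "lung adenocarcinoma"

retriever_only = set(retrieve(query, records, threshold=0.3))
with_reader = retrieve_then_read(query, records, threshold=0.3, reader=toy_reader)
print(sorted(retriever_only), sorted(with_reader))
```

In this toy case the retriever alone returns both p1 and p2 because the negated note shares the query terms, while the reader step filters p2 out. The extra reader call per candidate is also where the added computational cost comes from.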

Results: The neuro-symbolic approach (Hypercube) consistently outperformed the LLM-only baselines on F1-scores, with a 10.1% to 26.72% gap. The retrieve-then-read method showed significant quality gains over the retriever-only method but at a substantial increase in computational cost. All methods showed performance degradation as patient records grew longer, highlighting the challenge of longitudinal reasoning.
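For readers unfamiliar with how an F1-score applies to cohort retrieval, the sketch below computes precision, recall, and F1 for a single query by comparing the returned patient set with a gold cohort. The function name `cohort_scores` and the patient IDs are hypothetical, and how the study aggregates scores across its 113 queries is not stated in this summary.

```python
def cohort_scores(predicted: set[str], gold: set[str]) -> dict[str, float]:
    """Set-based precision, recall, and F1 for one cohort query."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Invented example: 2 of 3 returned patients are correct, 2 of 4 gold patients are found.
scores = cohort_scores({"p1", "p2", "p3"}, {"p1", "p2", "p4", "p5"})
print(scores)  # precision 2/3, recall 1/2, F1 4/7
```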

The study also introduced metrics for evaluating hallucination tendencies and set-theoretic consistency of ACR systems. The neuro-symbolic approach showed lower hallucination rates and better consistency compared to LLM-only methods.
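One way to think about set-theoretic consistency is that a system's answer to "A and B" should match the intersection of its answers to A and to B, and its answer to "A or B" should match their union. The sketch below scores that agreement with Jaccard overlap. This is an assumed formulation for illustration only; the summary does not give the paper's exact metric definitions, and all names and patient IDs here are hypothetical.

```python
def jaccard(x: set[str], y: set[str]) -> float:
    """Jaccard overlap; two empty sets count as fully consistent."""
    union = x | y
    return len(x & y) / len(union) if union else 1.0

def conjunction_consistency(a: set[str], b: set[str], a_and_b: set[str]) -> float:
    """Agreement between the answer to 'A and B' and the intersection of the answers."""
    return jaccard(a_and_b, a & b)

def disjunction_consistency(a: set[str], b: set[str], a_or_b: set[str]) -> float:
    """Agreement between the answer to 'A or B' and the union of the answers."""
    return jaccard(a_or_b, a | b)

# Invented system outputs for three related queries.
cohort_a = {"p1", "p2", "p3"}
cohort_b = {"p2", "p3", "p4"}
answer_and = {"p2", "p5"}              # misses p3 and adds p5, which is in neither cohort
answer_or = {"p1", "p2", "p3", "p4"}   # exactly the union

and_score = conjunction_consistency(cohort_a, cohort_b, answer_and)
or_score = disjunction_consistency(cohort_a, cohort_b, answer_or)
print(and_score, or_score)
```

A score of 1.0 means the composed query agrees exactly with the set operation on its parts; lower values indicate that the system contradicts its own simpler answers.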

Conclusions: While the neuro-symbolic approach currently leads in performance, LLMs exhibit potential in helping automate the retrieval of patient cohorts from extensive, longitudinal datasets. The study highlights the importance of developing accurate yet efficient ACR systems to advance clinical research applications. It also underscores the potential of integrating expert knowledge with LLMs in healthcare, a domain rich with explicit knowledge.

Future work should focus on refining and adapting LLM technologies to meet the specific needs of medical researchers, improving model interpretability, and controlling hallucinations. These advancements could support better clinical decision-making, patient outcomes, and the development of new treatments and interventions.


Exploring the Future of Healthcare AI: A Conversation with Kristin Maloney

The recent podcast featuring Kristin Maloney, hosted on Oncology Data Advisor, delves into Mendel AI's transformative role in healthcare. Kristin highlights how Mendel’s clinical AI solutions—such as Retina, Resolve, and Hypercube—are revolutionizing data-driven decision-making, empowering clinicians to extract critical insights from complex datasets quickly and accurately. Mendel AI's mission is clear: turning unstructured and structured healthcare data into actionable intelligence, bridging gaps in clinical care, and providing physicians with tools to deliver optimal patient outcomes.

Introducing Mendel's New Brand Focus: Supercharging Clinical Data Workflows in Healthcare

Mendel has evolved its brand to “Supercharge Your Clinical Data Workflows,” a shift that reflects our commitment to delivering AI solutions that genuinely enhance clinical data management. In healthcare, where talent shortages demand efficient and reliable tech, our Hypercube solution and neuro-symbolic AI bring unmatched cost-efficiency, speed, and accuracy to workflows. This shift emphasizes our focus on alleviating healthcare’s talent strain with tech that builds trust—eliminating errors and reducing the risk of hallucinations. Discover how Mendel’s transformative approach can optimize your workflows with validated solutions trusted by leaders in the industry.

Revolutionizing Patient Cohort Identification with AI – Insights from Mendel’s ACR Benchmark

Introducing ACR: A New Benchmark for Patient Cohort Retrieval

This study introduces Automatic Cohort Retrieval (ACR), a novel task for efficiently identifying patient groups from large-scale medical data. Comparing AI-powered approaches, including large language models and neuro-symbolic systems, the research reveals promising advancements in automating cohort selection for clinical trials and studies. The findings highlight the potential of AI to revolutionize healthcare data analysis, while emphasizing the need for continued improvements in accuracy, efficiency, and reliability.

Introduction to Hypercube’s Ontology and Reasoning Engine

Large Language Models (LLMs) hold the potential to transform healthcare by generating clinical insights and supporting decision-making. However, LLMs face challenges such as hallucinations, lack of explainability, and limited reasoning capabilities, which restrict their effectiveness in clinical settings. Mendel's Hypercube platform addresses these limitations by integrating LLMs with structured clinical ontologies, enhancing both inference and decision-making. Unlike standard ontologies focused mainly on documentation, Mendel’s generative ontology prioritizes scalable reasoning through reductionism and emergentism, enabling more accurate clinical reasoning and streamlined data integration.

Mendel Unveils Groundbreaking Neuro-Symbolic AI System Outperforming GPT-4 for Automatic Cohort Retrieval in New Study

“Our latest research at Mendel marks a significant milestone in the field of AI in general, and healthcare in particular,” said Wael Salloum, Cofounder and Chief Science Officer at Mendel. “We are the leader in clinical reasoning by coupling LLMs with our hypergraph reasoning, enhancing both the effectiveness and efficiency of patient cohort retrieval.”

Improving Clinical Trial Participant Prescreening With Artificial Intelligence (AI): A Comparison of the Results of AI Assisted vs Standard Methods in 3 Oncology Trials

Delays in clinical trial enrollment and difficulties enrolling representative samples continue to vex sponsors, sites, and patient populations. Here we investigated use of an artificial intelligence-powered technology, Mendel.ai, as a means of overcoming bottlenecks and potential biases associated with standard patient prescreening processes in an oncology setting.

Coupling Symbolic Reasoning with Language Modeling for Efficient Longitudinal Understanding of Unstructured Electronic Medical Records

The application of Artificial Intelligence (AI) in healthcare has been revolutionary, especially with the recent advancements in transformer-based Large Language Models (LLMs). However, the task of understanding unstructured electronic medical records remains a challenge given the nature of the records (e.g., disorganization, inconsistency, and redundancy) and the inability of LLMs to derive reasoning paradigms that allow for comprehensive understanding of medical variables. In this work, we examine the power of coupling symbolic reasoning with language modeling toward improved understanding of unstructured clinical texts. We show that such a combination improves the extraction of several medical variables from unstructured records. In addition, we show that the state-of-the-art commercially-free LLMs enjoy retrieval capabilities comparable to those provided by their commercial counterparts. Finally, we elaborate on the need for LLM steering through the application of symbolic reasoning as the exclusive use of LLMs results in the lowest performance.

How to Approach De-Identification

Organizations that use patient data for internal or external research need to take steps to prevent the exposure of PHI to those who are not authorized to view it. They do this by redacting specific categories of identifiers from every patient document. Once the identifiers are masked, the risk profile of these datasets is significantly reduced. But how do you ensure that redaction engines are working to the highest accuracy?


