Leveraging large language models to automate the identification of healthcare access barriers for Veterans

Abstract:

Objective: To develop and evaluate an automated system for identifying healthcare access barriers, focusing on transportation issues, in Veterans' clinical notes using large language models (LLMs), and to assess the impact of different prompting strategies on classification performance and explanation consistency.

Methods: We developed a hybrid system combining pattern matching for templated notes with LLM analysis for free-text notes. Using 2000 manually annotated clinical notes, we compared four prompting strategies (dual-role short, dual-role long, analysis-first, analysis-only) across Mistral-7B and Llama-3.1 models. Classification performance was evaluated using standard metrics, and explanation consistency was assessed through embedding similarity analysis.

Results: The analysis-first strategy achieved the best performance, with Mistral-7B reaching an F1 score of 0.914 and outperforming traditional machine learning approaches (GBM: 0.786, BERT: 0.811). LLMs demonstrated higher explanation consistency within models (mean cosine similarity 0.887–0.908) than across models (0.767–0.872). Pattern matching deterministically handled the 6.7% of notes that were templated. Mistral-7B showed greater internal consistency but a higher abstention rate than Llama-3.1.

Conclusion: Requiring LLMs to analyze evidence before classification improves both accuracy and explanation consistency when identifying transportation barriers in clinical notes. This approach enables automated barrier detection at scale while providing clinically relevant explanations, supporting both population-level healthcare planning and individual patient care decisions.
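The explanation-consistency measure described in the Methods can be sketched as pairwise cosine similarity over explanation embeddings. A minimal illustration follows, assuming the LLM explanations have already been embedded as NumPy vectors; the function names are illustrative and not taken from the paper.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def mean_pairwise_similarity(embeddings):
    """Mean cosine similarity over all distinct pairs of explanation
    embeddings, e.g. explanations produced by one model for the same note."""
    n = len(embeddings)
    sims = [cosine_similarity(embeddings[i], embeddings[j])
            for i in range(n) for j in range(i + 1, n)]
    return sum(sims) / len(sims)
```

Within-model consistency would apply `mean_pairwise_similarity` to embeddings of one model's explanations; cross-model similarity would compare embeddings drawn from different models.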
