eurekahealth/EurekaMD

EurekaMD

Overview

EurekaMD is a medical prompting framework designed to improve the accuracy and reasoning of LLMs on medical tasks. Building on the MedPrompt framework, EurekaMD uses an LLM Judge to select the strongest chain-of-thought reasoning for each question, reaching 93.3% accuracy on the USMLE questions in the MedQA 4-options dataset.

This repository provides the scripts and workflows needed to replicate EurekaMD's training and evaluation process.

Installation

Follow the steps below to configure your environment:

  1. Install Python 3.10 or above.

  2. Deploy gpt-4o via the Azure OpenAI Service: The scripts in this repo call the Azure OpenAI Service, so you need a gpt-4o deployment there. Follow the Azure OpenAI Service documentation to create one.

  3. Set Up Environment Variables: After configuring the Azure OpenAI Service, set the following environment variables:

    export AZURE_OPENAI_API_KEY="your-azure-openai-api-key"
    export AZURE_OPENAI_ENDPOINT_URL="your-azure-openai-endpoint-url"

  4. Install Dependencies: Use pip to install all required dependencies:

    pip install -r requirements.txt
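
Before running any script, it can help to confirm that both environment variables from step 3 are visible to Python. The following is a minimal sketch, not part of the repository; the helper name `missing_env_vars` is hypothetical.

```python
import os

# Variable names taken from step 3 of the installation instructions.
REQUIRED_VARS = ("AZURE_OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT_URL")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to os.environ; passing a plain dict makes the check
    easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments returns an empty list when the environment is configured as described above.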

Running Scripts

Determine Similar Training Questions

For each question in the test set, this script finds the most similar questions in the training set. Similarity is calculated from embeddings produced by OpenAI's text-embedding-3-large model.

To Run

python src/determine-similar-training-questions.py
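
The retrieval idea can be illustrated with a small nearest-neighbor sketch over embedding vectors. This is an assumption-laden illustration, not the script's actual code: it assumes cosine similarity as the metric and a default of `k=5` neighbors, neither of which is stated in this README, and the function names are hypothetical. In practice the vectors would come from text-embedding-3-large rather than the toy lists used here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(test_vec, train_vecs, k=5):
    """Return indices of the k training vectors most similar to test_vec.

    Assumption: cosine similarity is the metric; the repository may use another.
    """
    scored = [(cosine_similarity(test_vec, vec), idx)
              for idx, vec in enumerate(train_vecs)]
    # Highest similarity first; the sort is stable, so ties keep training order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]


# Toy 2-dimensional "embeddings" standing in for real model output.
train = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
nearest = top_k_similar([1.0, 0.1], train, k=2)
```

Here `nearest` holds the indices of the two training vectors closest in direction to the test vector.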

Generate Candidate Reasoning Paths

This script generates multiple candidate reasoning paths for each question in the training set and writes them to an output file.

To Run

python src/generate-candidate-reasoning-paths.py
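
As a rough sketch of what generating several candidates per question might look like, the function below samples a model repeatedly with the same chain-of-thought prompt. Everything here is illustrative: `sample_fn` is a hypothetical callable standing in for a request to the gpt-4o deployment, and the prompt wording and the default of five paths are assumptions, not values taken from the repository.

```python
from typing import Callable, List


def generate_candidate_paths(
    question: str,
    sample_fn: Callable[[str], str],
    num_paths: int = 5,
) -> List[str]:
    """Collect `num_paths` independently sampled reasoning paths for one question.

    `sample_fn` is a hypothetical stand-in for a model call; it receives the
    prompt and returns one completion. Sampling with a nonzero temperature is
    what would make the paths differ from each other.
    """
    if num_paths < 1:
        raise ValueError("num_paths must be at least 1")
    prompt = f"{question}\n\nThink step by step, then state the final answer."
    return [sample_fn(prompt) for _ in range(num_paths)]
```

Passing the model call in as a function keeps the sketch independent of any particular client library and lets it run offline with a stub.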

Select Best Reasoning Paths

This script uses an LLM Judge to select the best reasoning path for each question in the training set.

To Run

python src/select-best-reasoning-paths.py
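
One way to picture the LLM Judge step is as a prompt that lists the numbered candidates and asks for the number of the strongest one. The sketch below works under that assumption; the prompt text, the `judge_fn` callable, and the fallback to the first candidate when the reply cannot be parsed are all hypothetical choices rather than the repository's behavior.

```python
import re
from typing import Callable, List


def build_judge_prompt(question: str, paths: List[str]) -> str:
    """Format the question and numbered candidate paths into a single judge prompt."""
    numbered = "\n\n".join(
        f"Candidate {i}:\n{path}" for i, path in enumerate(paths, start=1)
    )
    return (
        f"Question:\n{question}\n\n{numbered}\n\n"
        "Reply with the number of the candidate whose reasoning is strongest."
    )


def select_best_path(
    question: str,
    paths: List[str],
    judge_fn: Callable[[str], str],
) -> str:
    """Return the path chosen by the judge.

    `judge_fn` is a hypothetical stand-in for a model call. The first integer
    in its reply is read as a 1-based candidate number; if none is found or it
    is out of range, the first candidate is returned (an assumed fallback).
    """
    if not paths:
        raise ValueError("paths must not be empty")
    reply = judge_fn(build_judge_prompt(question, paths))
    match = re.search(r"\d+", reply)
    if match is None:
        return paths[0]
    index = int(match.group()) - 1
    if not 0 <= index < len(paths):
        return paths[0]
    return paths[index]
```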

Calculate USMLE Accuracy

The final script evaluates EurekaMD's accuracy on the USMLE test set, using the reasoning paths selected by the previous script as few-shot examples. The evaluated questions come from the MedQA 4-options dataset.

To Run

python src/calculate-usmle-accuracy.py
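
The accuracy figure itself is the fraction of test questions answered correctly. A minimal sketch of that calculation follows, assuming predictions and gold answers are option letters such as "A" to "D"; the function name and the case-insensitive comparison are illustrative choices, not the script's actual logic.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], gold_answers: Sequence[str]) -> float:
    """Fraction of predictions that match the gold answers.

    Comparison ignores case and surrounding whitespace (an assumption made
    for this sketch). An empty evaluation set yields 0.0.
    """
    if len(predictions) != len(gold_answers):
        raise ValueError("predictions and gold_answers must have the same length")
    if not gold_answers:
        return 0.0
    correct = sum(
        pred.strip().upper() == gold.strip().upper()
        for pred, gold in zip(predictions, gold_answers)
    )
    return correct / len(gold_answers)


# Three of the four toy predictions match their gold answers.
score = accuracy(["A", "b", "C", "D"], ["A", "B", "D", "D"])
```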

About

Updates to MedPrompt, scoring a new SOTA for the USMLE (MedQA)
