MA-LLM Pipeline — README

This repository contains the MA-LLM screening pipeline, a tool for automated screening of PubMed articles using Large Language Models (LLMs). It supports two modes:

Screening Selection Comparison: compare LLM-based selections with a manual gold standard.
Comparison-free Screening (Freeform): run PubMed searches and screen articles without a comparison set.

This README explains how to run the project from source, notes on the provided .exe (if any), required Python packages, common troubleshooting steps, and recommended small fixes and naming conventions. It is mainly intended for use when the provided .exe file does not work on your system (for example, on macOS or unsupported platforms).

Recommended usage of the pipeline

We have created an .exe of the code including the necessary packages and the .html file, thus it is recommended to download the latest release from Github and use the executable. If you are running on Linux or MacOS and you are not willing to download software to execute Windows executables you can clone the Github Repo und follow the instructions below. The functionality does not differ between the executable and the python code it is just meant to enhance usability.

Project structure (important files/folders)

MALLM_Pipeline/MALLM.py — main Flask app and processing logic (entrypoint)
MALLM_Pipeline/templates/MALLM.html — front-end UI used by the Flask app
MALLM_Pipeline/ExampleFiles/ — example PMIDs, prompts and gold-standard files (use these to test input formats)
requirements.txt — Python package list
Readme.md

Quick start (from source / ZIP)

Clone or extract the ZIP and open a terminal in the repository root.
Create and activate a Python virtual environment (recommended):

python3 -m venv .venv
source .venv/bin/activate    # zsh/bash

Install dependencies:

pip install -r requirements.txt

Start the Flask UI (the web UI will open automatically):

python "MALLM_Pipeline/MALLM.py"

Required Python packages

The project expects a set of Python packages. Ensure requirements.txt includes at least the following (add or pin versions as needed):
pandas
biopython
flask
openai (if using OpenAI provider)
anthropic (if using Anthropic provider)
google-generative-ai (if using Google provider)
ollama (if using Ollama)
openpyxl

How the web UI submits work

The front-end sends a POST to either /run_comparison (goldstandard mode) or /run_freeform (freeform mode).
The server starts processing in a background thread and the front-end polls /status for progress.

Common problems and troubleshooting

HTTP 500 "Submission failed: HTTP error! status: 500"
- Meaning: the server raised an unhandled exception while processing the submission. This is a server-side error, not a front-end problem.
- What to do: open your browser DevTools → Network, find the POST to /run_comparison or /run_freeform, and inspect the Response body. The server now returns JSON with message and traceback fields to help debugging.
- Likely causes in this codebase:
  - Required form fields or uploaded files were missing (the server expects initial_file, goldstandard_file, and prompts_file for comparison mode).
  - prompts_file is not an Excel file or has unexpected columns; pd.read_excel will raise on invalid input.
  - AI provider initialization failed (missing provider library, unsupported provider name, or authentication error).
  - If using Ollama (Local), no API key should be required; the UI currently asks for an API key for all providers — leave the API key blank for Ollama or update the UI/server to skip the key requirement for Ollama.
Unclear Screening Mode error message
- If you get an error about not choosing a screening mode, the UI will show a message. Make sure Screening Mode is set to either Screening Selection Comparison (goldstandard) or Comparison-free Screening (freeform) before submitting.
Problems reading prompts.xlsx
- The pipeline expects columns named like TitlePrompt, AbstractPrompt, screen_titles, and screen_abstracts in the prompts Excel file. If your file uses different column names, rename them or adapt the code.

Frontend & server behavior notes

The UI requires initial_file and goldstandard_file (text files with PMIDs) and prompts_file (Excel) when you select the Comparison mode. Make sure files are uploaded in the form.
For Freeform mode you need to provide a PubMed search query, screening prompt, and max articles.
For the Ollama provider the server constructs a local client and should not require an API key. If the UI still shows the API key field for Ollama, you can safely leave it empty and start the server.

Name		Name	Last commit message	Last commit date
Latest commit History 285 Commits
MALLM_Pipeline		MALLM_Pipeline
Paper		Paper
.gitignore		.gitignore
MA-LLM graphic tutorial.pdf		MA-LLM graphic tutorial.pdf
Readme.md		Readme.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

MA-LLM Pipeline — README

Table of contents

Recommended usage of the pipeline

Project structure (important files/folders)

Quick start (from source / ZIP)

Required Python packages

How the web UI submits work

Common problems and troubleshooting

Frontend & server behavior notes

License & citation

About

Uh oh!

Releases 2

Packages

Contributors 8

Uh oh!

Languages

StudentNOS/MA-LLM

Folders and files

Latest commit

History

Repository files navigation

MA-LLM Pipeline — README

Table of contents

Recommended usage of the pipeline

Project structure (important files/folders)

Quick start (from source / ZIP)

Required Python packages

How the web UI submits work

Common problems and troubleshooting

Frontend & server behavior notes

License & citation

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 2

Packages 0

Contributors 8

Uh oh!

Languages

Packages