AutoGKB

Goals:

Fetch annotated articles from the variantAnnotations stored in the PharmGKB API
Create a general benchmark that outputs a score for an extraction system. Given: an article and its ground-truth variants (manually extracted and recorded in var_drug_ann.tsv). Input: extracted variants. Output: score.
Build a system for extracting drug-related variant annotations from an article, i.e. associations in which the variant affects a drug dose, response, metabolism, etc.
Continuously fetch new pharmacogenomic articles
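
The benchmark goal (ground-truth variants and extracted variants in, score out) could be sketched as a simple set-overlap score. This is only an illustration, not the scorer used in this repository: the function name `score_extraction`, the case-insensitive normalization, and the choice of precision/recall/F1 are all assumptions.

```python
from __future__ import annotations

from typing import Iterable


def _normalize(variants: Iterable[str]) -> set[str]:
    """Lowercase and strip labels so trivial formatting differences are not counted as misses."""
    return {v.strip().lower() for v in variants if v and v.strip()}


def score_extraction(ground_truth: Iterable[str], extracted: Iterable[str]) -> dict[str, float]:
    """Hypothetical naive score: exact-match overlap between two sets of variant labels."""
    truth = _normalize(ground_truth)
    found = _normalize(extracted)
    matches = truth & found

    # Guard against empty inputs so the score is always defined.
    precision = len(matches) / len(found) if found else 0.0
    recall = len(matches) / len(truth) if truth else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    return {
        "matches": float(len(matches)),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Exact matching treats a near-miss the same as a complete miss, so this would only be a starting point before any group-wise or distance-based scoring.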

Description

This repository contains Python scripts for building and running a pharmacogenomic agentic system that annotates and labels genetic variants based on their phenotypic associations reported in journal articles.

Dependencies

We manage a few repos externally:

PubMed Downloader: This repo is used to download the markdown files for all PMIDs represented in var_drug_ann.tsv
Huggingface/AutoGKB: This converts the annotations and article text into a dataset format for benchmarking

Progress Tracker

Category	Task	Status
Initial Download	Download the zip of variants from PharmGKB	✅
	Get a PMID list from the variants TSV (column PMID)	✅
	Convert the PMID to PMCID	✅
	Update to use non-official PMID to PMCID conversion (Aaron's method)
	Fetch the content from the PMCID	✅
Benchmark	Create pairings of annotations to articles	✅
	Create a naive score of number of matches
	Create group-wise score
	Look into advanced scoring based on distance from truth per term
Workflows	Integrate Aaron's current approach	✅
	Document individual annotation meanings
	Delegate annotation groupings to team members
New Article Fetching	Replicate PharmGKB's current workflow
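
As a rough illustration of the "Get a PMID list" and "Create pairings of annotations to articles" steps, the sketch below groups annotation rows by the PMID column of a tab-separated file. Only the column name PMID comes from the tracker above. The function name `group_annotations_by_pmid` and the `Variant` column in the sample are hypothetical, and this is not the repository's actual implementation.

```python
from __future__ import annotations

import csv
import io
from collections import defaultdict
from typing import TextIO


def group_annotations_by_pmid(tsv_file: TextIO) -> dict[str, list[dict[str, str]]]:
    """Hypothetical helper: map each PMID to the annotation rows that cite it."""
    reader = csv.DictReader(tsv_file, delimiter="\t")
    grouped: dict[str, list[dict[str, str]]] = defaultdict(list)
    for row in reader:
        pmid = (row.get("PMID") or "").strip()
        if pmid:  # skip rows with no PMID
            grouped[pmid].append(row)
    return dict(grouped)


if __name__ == "__main__":
    # Tiny in-memory sample; the real input would be a file such as var_drug_ann.tsv.
    sample = "PMID\tVariant\n123\trsA\n123\trsB\n456\trsC\n"
    pairs = group_annotations_by_pmid(io.StringIO(sample))
    print(sorted(pairs))  # the PMID list
```

The keys of the returned dictionary give the PMID list, and each value holds the ground-truth rows to pair with that article's text.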

System Overview

Downloading the data

pixi run gdown --id 1qtQWvi0x_k5_JofgrfsgkWzlIdb6isr9
unzip autogkb-data.zip

