Add python script to repo #1

pmb59 · 2025-01-29T11:07:24Z

Adds a Python script that dynamically fetches and parses experiment metadata from the EBI Gene Expression Atlas (GXA) API and outputs structured data in valid YAML format.

The script retrieves experiment accessions, extracts key details such as organism, experiment type, and assay groups, and ensures properly formatted YAML for downstream processing.
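
A minimal sketch of the clean-up step that keeps the YAML well-formed, based on the commit in this PR that removes None, empty strings, whitespace-only values, and accidental commas. The function names and the record keys (`organism`, `experiment_type`, `assay_groups`) are hypothetical, and treating "accidental commas" as leading or trailing commas is an assumption; the actual script may differ.

```python
def clean_value(value):
    """Return a tidied value, or None if it carries no information."""
    if value is None:
        return None
    if not isinstance(value, str):
        # Leave numbers, booleans, etc. untouched.
        return value
    # Assumption: "accidental commas" means stray leading/trailing commas.
    text = value.strip().strip(",").strip()
    return text or None


def clean_record(record):
    """Drop keys whose cleaned value is empty; clean list items one by one."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, list):
            items = [clean_value(item) for item in value]
            items = [item for item in items if item is not None]
            if items:
                cleaned[key] = items
        else:
            item = clean_value(value)
            if item is not None:
                cleaned[key] = item
    return cleaned
```

Cleaning before serialisation means the YAML writer never sees empty or whitespace-only scalars, which is one way to avoid lint warnings downstream.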

A GitHub Action is included for linting the YAML and Python code.

pythonic-query.py

anilthanki · 2025-02-03T11:02:02Z

Consider renaming pythonic-query.py to something self-explanatory, for example fetch_gxa_metadata.py or gxa_metadata_extractor.py.

anilthanki · 2025-02-03T11:50:09Z

Given Christina's recent comments about the table data format, converting the metadata to a dict makes more sense, since it can then be converted to whichever formats we need, such as YAML, JSON, TSV/CSV, or XML.
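
A rough sketch of what this could look like, using only the standard library: the metadata is held as a list of dicts and each output format is a thin serialiser on top. The function names, the record keys, and the `;` separator for list values are all assumptions for illustration, not what the script currently does.

```python
import csv
import io
import json


def to_json(records):
    """Serialise the list of dicts as indented JSON."""
    return json.dumps(records, indent=2)


def to_tsv(records):
    """Serialise the list of dicts as TSV, one row per record."""
    # Collect column names in first-seen order across all records.
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)

    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fieldnames,
        delimiter="\t",
        lineterminator="\n",
        restval="",  # missing keys become empty cells
    )
    writer.writeheader()
    for record in records:
        # Assumption: list values are flattened with ';' for tabular output.
        row = {
            key: ";".join(value) if isinstance(value, list) else value
            for key, value in record.items()
        }
        writer.writerow(row)
    return buffer.getvalue()
```

YAML output would follow the same pattern through a YAML library, for example PyYAML's `yaml.safe_dump(records)`.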

Co-authored-by: Anil Thanki <[email protected]>

… files, and edit readme

pmb59 · 2025-02-10T16:26:16Z

comment addressed here: b8cace9

anilthanki · 2025-02-10T16:53:51Z

fetch_gxa_metadata.py

+parser.add_argument("yaml_filename", help="Output Yaml filename")
+parser.add_argument("tsv_filename", help="Output Tsv filename")


Are both yaml_filename and tsv_filename required, or is either one enough?

If the aim is to generate both files per run, consider taking just one optional argument, such as file_name, with a default value, and only changing the extension, so the user needs to provide one argument or none.

If the aim is to generate either one, the script needs some logic at the end of the file to decide which output to write.
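
For the "both files per run" case, something along these lines might work. This is only a sketch: the default base name `gxa_metadata` and the helper names `build_parser` and `output_paths` are made up for illustration.

```python
import argparse
from pathlib import Path


def build_parser():
    """Parser with a single optional positional argument for the base name."""
    parser = argparse.ArgumentParser(
        description="Fetch GXA experiment metadata"
    )
    parser.add_argument(
        "file_name",
        nargs="?",  # optional: fall back to the default when omitted
        default="gxa_metadata",  # hypothetical default
        help="Output base name without extension (default: %(default)s)",
    )
    return parser


def output_paths(file_name):
    """Derive the YAML and TSV output paths from one base name."""
    base = Path(file_name)
    # Tolerate a user passing an extension we are going to add anyway.
    if base.suffix.lower() in {".yaml", ".yml", ".tsv"}:
        base = base.with_suffix("")
    return Path(f"{base}.yaml"), Path(f"{base}.tsv")
```

With this, running the script with no arguments would write `gxa_metadata.yaml` and `gxa_metadata.tsv`, and passing `studies` would write `studies.yaml` and `studies.tsv`.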

anilthanki · 2025-02-10T16:55:31Z

The code outside the defined functions is a bit scattered; consider using a main function to make it more readable.

anilthanki · 2025-02-10T17:00:12Z

I had a second look and it's not scattered, just divided into two blocks at the top and bottom, so this comment can be ignored. Using a main function might still benefit it, though.
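
As an illustration of the shape I mean (not the script's real functions; `fetch_metadata` and `format_records` are hypothetical stand-ins for the existing top and bottom blocks):

```python
def fetch_metadata():
    """Placeholder for the fetch-and-parse block (hypothetical data)."""
    return [{"accession": "E-EXAMPLE-1", "organism": "Homo sapiens"}]


def format_records(records):
    """Placeholder for the output block: one tab-separated line per record."""
    return "".join(
        f"{record['accession']}\t{record['organism']}\n" for record in records
    )


def main():
    """Run the steps in order, so the top-level flow reads in one place."""
    records = fetch_metadata()
    print(format_records(records), end="")
    return 0


if __name__ == "__main__":
    main()
```

Keeping the module-level code down to the `__main__` guard also makes the functions importable for testing without triggering a fetch.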

Update README.md

ea4bf32

pmb59 self-assigned this Jan 29, 2025

pmb59 and others added 17 commits January 29, 2025 11:47

add pythonic-query.py

31c60f7

add GitHub action

eb4899e

Create gxa-studies.yaml

f90d708

add --- to gxa-studies.yaml

bb074c0

add --- to validate-code.yml

d3f9a50

attempt to fix truthy value warnings in CI

218506b

Update validate-code.yml

5355484

--max-line-length to 150

6d6bddd

fix python linting

338a8e3

fix python linting

a3a5a41

final python linting fix

332e758

Remove None, empty strings, whitespace-only values, and accidental co…

93cea41

…mmas & add yaml

set --max-line-length 200 to yamlllint

e898979

revert

4d6d7f9

Create .yamllint

fe99ee8

add functions to pythonic-query.py

858c705

add /try except to json load

e57a383

pmb59 requested review from anilthanki and irisdianauy January 31, 2025 19:39

anilthanki reviewed Feb 3, 2025
