Duplicate IDs in the demographics cause difficult-to-interpret error messages #87

@0rC0

Hello everyone,

Thank you for releasing this project as open source.

I found a minor bug in io_meld.py for version 2.2.2, which is also present in the main branch. If the demographics contain duplicate entries for an ID, the pipeline returns the error message reported below, which is not easy to interpret.

As a user, I believe the pipeline should either drop duplicates or return a clearer error message.
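To illustrate both options, here is a minimal sketch of a check that could run right after the demographics are loaded. It assumes the demographics are held in a pandas DataFrame; the helper name `check_unique_ids` and the column names `ID` and `Scanner` are hypothetical and not taken from io_meld.py.

```python
import pandas as pd


def check_unique_ids(demographics, id_column="ID", drop=False):
    """Hypothetical helper: fail early, or deduplicate, when an ID is repeated."""
    # Mark every row whose ID occurs more than once.
    duplicated = demographics[id_column].duplicated(keep=False)
    if not duplicated.any():
        return demographics
    if drop:
        # Option 1: keep only the first row for each ID.
        return demographics.drop_duplicates(subset=id_column, keep="first")
    # Option 2: stop with a message that names the offending IDs.
    repeated = sorted(demographics.loc[duplicated, id_column].unique().tolist())
    raise ValueError(f"Duplicate IDs in the demographics: {repeated}")


# Made-up example data, not from a real cohort.
demographics = pd.DataFrame(
    {"ID": ["sub-001", "sub-001", "sub-002"], "Scanner": ["3T", "3T", "1.5T"]}
)
deduplicated = check_unique_ids(demographics, drop=True)
```

Silently dropping rows could hide conflicting entries for the same ID, so raising the clearer error is probably the safer default.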

A similar issue occurs in meld_cohort.py when it searches for the scanner. I will submit my fix for this as a pull request.

Best,
A.D.

Traceback (most recent call last):
  File "/home/orco/data/FCD/code/meld_graph/scripts/new_patient_pipeline/new_pt_pipeline.py", line 134, in <module>
    result = run_script_segmentation(
  File "/home/orco/data/FCD/code/meld_graph/scripts/new_patient_pipeline/run_script_segmentation.py", line 378, in run_script_segmentation
    result = run_subject_segmentation(subject_id,  harmo_code = harmo_code, use_fastsurfer = use_fastsurfer, verbose=verbose)
  File "/home/orco/data/FCD/code/meld_graph/scripts/new_patient_pipeline/run_script_segmentation.py", line 350, in run_subject_segmentation
    result = extract_features(subject_id, fs_folder=fs_folder, output_dir=output_dir, verbose=verbose)
  File "/home/orco/data/FCD/code/meld_graph/scripts/new_patient_pipeline/run_script_segmentation.py", line 224, in extract_features
    result = create_training_data_hdf5(subject_id, fs_folder, output_dir  )
  File "/home/orco/data/FCD/code/meld_graph/scripts/data_preparation/extract_features/create_training_data_hdf5.py", line 23, in create_training_data_hdf5
    failed = save_subject(subject,
  File "/home/orco/data/FCD/code/meld_graph/scripts/data_preparation/extract_features/io_meld.py", line 128, in save_subject
    if scanner in ("15T" , "1.5T" , "15t" , "1.5t" ):
  File "/home/orco/miniforge3/envs/meld_graph/lib/python3.9/site-packages/pandas/core/generic.py", line 1535, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
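
For context, the following sketch reproduces the same ValueError outside the pipeline. It assumes the scanner is looked up by ID through the DataFrame index, which may differ from what io_meld.py actually does; the column names and values are made up.

```python
import pandas as pd

# Made-up demographics with one duplicated ID, indexed by ID.
demographics = pd.DataFrame(
    {"ID": ["sub-001", "sub-001", "sub-002"], "Scanner": ["3T", "3T", "1.5T"]}
).set_index("ID")

# A unique ID yields a single value ...
unique_lookup = demographics.loc["sub-002", "Scanner"]
# ... while a duplicated ID yields a Series with one entry per matching row.
duplicate_lookup = demographics.loc["sub-001", "Scanner"]

error_message = None
try:
    # Same membership test as in the traceback: comparing a Series with each
    # string gives another Series, whose truth value pandas refuses to guess.
    if duplicate_lookup in ("15T", "1.5T", "15t", "1.5t"):
        pass
except ValueError as error:
    error_message = str(error)
```

The message only mentions a Series, so nothing in it points back to the duplicated ID in the demographics.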
