Duplicate IDs in the demographics cause difficult-to-interpret error messages

Hello everyone,

Thank you for releasing this project as open source.

I found a minor bug in io_meld.py in version 2.2.2; it is also present on the main branch. If the demographics contain duplicate entries for an ID, the pipeline fails with the error reported below. The message is hard to interpret: the traceback ends in a pandas ValueError about the truth value of a Series and never mentions the duplicate IDs.

As a user, I believe the pipeline should either drop duplicates or return a clearer error message.
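To illustrate the two options, here is a minimal sketch, not the project's actual code. It assumes the demographics are loaded into a pandas DataFrame; the column name `ID` and the helper names `check_unique_ids` and `drop_duplicate_ids` are hypothetical.

```python
import pandas as pd


def check_unique_ids(demographics: pd.DataFrame, id_column: str = "ID") -> None:
    """Raise a readable error if any ID appears more than once.

    `id_column` is a hypothetical column name used for illustration.
    """
    ids = demographics[id_column]
    # keep=False marks every occurrence of a repeated ID, not only the later ones
    duplicated = ids[ids.duplicated(keep=False)]
    if not duplicated.empty:
        names = ", ".join(sorted(duplicated.astype(str).unique()))
        raise ValueError(
            f"Duplicate IDs found in demographics column '{id_column}': {names}"
        )


def drop_duplicate_ids(demographics: pd.DataFrame, id_column: str = "ID") -> pd.DataFrame:
    """Alternative option: keep only the first row for each ID."""
    return demographics.drop_duplicates(subset=id_column, keep="first")
```

Raising an error is the more conservative of the two, because silently dropping rows could hide conflicting entries for the same ID.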

A similar issue occurs in meld_cohort.py when it searches for the scanner. I will submit my fix for this as a pull request.

Best,
A.D. 

```
Traceback (most recent call last):
  File "/home/orco/data/FCD/code/meld_graph/scripts/new_patient_pipeline/new_pt_pipeline.py", line 134, in <module>
    result = run_script_segmentation(
  File "/home/orco/data/FCD/code/meld_graph/scripts/new_patient_pipeline/run_script_segmentation.py", line 378, in run_script_segmentation
    result = run_subject_segmentation(subject_id,  harmo_code = harmo_code, use_fastsurfer = use_fastsurfer, verbose=verbose)
  File "/home/orco/data/FCD/code/meld_graph/scripts/new_patient_pipeline/run_script_segmentation.py", line 350, in run_subject_segmentation
    result = extract_features(subject_id, fs_folder=fs_folder, output_dir=output_dir, verbose=verbose)
  File "/home/orco/data/FCD/code/meld_graph/scripts/new_patient_pipeline/run_script_segmentation.py", line 224, in extract_features
    result = create_training_data_hdf5(subject_id, fs_folder, output_dir  )
  File "/home/orco/data/FCD/code/meld_graph/scripts/data_preparation/extract_features/create_training_data_hdf5.py", line 23, in create_training_data_hdf5
    failed = save_subject(subject,
  File "/home/orco/data/FCD/code/meld_graph/scripts/data_preparation/extract_features/io_meld.py", line 128, in save_subject
    if scanner in ("15T" , "1.5T" , "15t" , "1.5t" ):
  File "/home/orco/miniforge3/envs/meld_graph/lib/python3.9/site-packages/pandas/core/generic.py", line 1535, in __nonzero__
    raise ValueError(
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
```
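For context, the same ValueError can be reproduced outside the pipeline. This is a minimal sketch under one assumption: that the duplicate rows cause `scanner` to be a pandas Series with one value per matching row rather than a single string. How io_meld.py actually obtains `scanner` is not shown in the traceback.

```python
import pandas as pd

# Assumption: with two rows for the same ID, the lookup yields a Series
# holding one scanner value per row instead of a single string.
scanner = pd.Series(["3T", "3T"])

message = None
try:
    # Same membership test as in the traceback. Each comparison against the
    # Series returns a boolean Series, and `in` then asks for its truth value.
    if scanner in ("15T", "1.5T", "15t", "1.5t"):
        pass
except ValueError as err:
    message = str(err)

print(message)
```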

Duplicate IDs in the demographics cause difficult-to-interpret error messages #87

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Duplicate IDs in the demographics cause difficult-to-interpret error messages #87

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions