-
Notifications
You must be signed in to change notification settings - Fork 2
Draft of raw_to_ready function #97
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
stellaprins
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work! I merged Alessandro's PR that adds a column with custom mask paths to the input csv file, so that should probably be taken into account here.
Thanks for adding the TODOs, they're helpful! If denoising isn’t going to be part of this PR, we should create a separate issue for it.
| image = correct_image_brightness(image, spacing=vox_sizes_mm) | ||
|
|
||
| mask_config = config.mask | ||
| mask = create_mask( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the step to be skipped in case a custo mask is provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but we have to determine exactly how...
Relevant discussions in brainglobe/atlas-forge#1
|
This looks nice @K-Meech, I did a first-pass on this draft and left some initial comments that came to mind. |
|
Thanks for the comments all! Here's a rough list of the main changes to be made before this comes out of draft:
|
My vote is pro a new csv file. I think it's better for tracing back what happened to the data, and it's a natural place to write something to file because we are also writing the "raw" images and the QC plots to file at the same stage. |
…der into km/raw_to_ready
|
I've added tests covering most scenarios I could think off. The only thing I just realised that is not covered is testing a scenario in which the automatic mask generation is overruled. The I will mark this PR ready for review and create an issue for adding this "mask test" later. |
K-Meech
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the tests @stellaprins ! I've added some general comments below - but Alessandro + others should also take a look (as I was involved in writing this 😅 )
One important point is that there are some output files currently not covered by tests e.g. the QC plots, and the two .txt files with brain/mask paths.
| masks ready for template creation. | ||
|
|
||
| This assumes source_to_raw has already been run to downsample images, | ||
| re-orient them to ASR and save them to the derivatives directory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| re-orient them to ASR and save them to the derivatives directory. | |
| re-orient them to ASR and save them to the raw directory. |
| @@ -0,0 +1,253 @@ | |||
| from pathlib import Path | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some suggestions for other tests:
- It would be good to test that an input csv file with a
usecolumn correctly excludes subjects withuse=False - As you already mentioned, it would be good to test that provided masks are re-used correctly (rather than creating a new one)
- Maybe something around checking that changing the config does change the result? E.g. could try two configs with different values for
pad_pixels+ verify the output image/mask size is as expected
| """Test that raw_to_ready creates expected derivatives directory.""" | ||
| csv_path, config_path = create_raw_test_data | ||
| raw_to_ready(csv_path, config_path) | ||
| assert (create_raw_test_data[0].parents[0] / "derivatives").exists() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to test for the presence of more files? E.g.:
- the
all_processed_brain_paths.txt+ check it has the correct paths inside - the
all_processed_mask_paths.txt+ check it has the correct paths inside - QC plots for each subject
- image / mask files for each subject
I know the image / mask files are referenced in other tests, but I don't think the QC plots and .txt files are touched on anywhere yet?
| assert (create_raw_test_data[0].parents[0] / "derivatives").exists() | ||
|
|
||
|
|
||
| def test_create_subject_dir(tmp_path: Path) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably personal preference, but I'd favour testing more of this at the top level by calling raw_to_ready directly? raw_to_ready is the function users will actually be calling + the one we want to be sure of correct behaviour - while underscore functions like _create_subject_dir are more implementation details. Testing more at the top level, leaves us free to refactor how it works underneath freely (i.e. renaming underscore functions / creating more / splitting them up...) without having to update the tests in any way.
Not sure what @alessandrofelder and others think? (as a lot of this comes down to personal preference!)
| tmp_path: Path, | ||
| test_stacks: dict[str, NDArray[np.float64]], | ||
| ) -> None: | ||
| """Test lr-flipped asym images differ from non-flipped ones.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure you need to do this with both the symmetric and asymmetric input images?
The important point is to test that they are lr-flipped which is only possible to test with the asymmetric input. i.e. I'm not sure the symmetric input is adding anything here - it just proves that symmetric images are left-right symmetric, rather than something about the behaviour of the function.
| ) | ||
|
|
||
| image = load_any(image_path) | ||
| flipped_image = load_any(flipped_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As well as testing that the flipped / non-flipped images aren't identical. You could also do something like:
np.testing.assert_equal(flipped_image, np.flip(image, axis=2))
to check that they are really flipped copies of the same image.
|
|
||
| def create_test_csv(path: Path, test_data: list[dict[str, Any]]) -> Path: | ||
| """Creates "raw_data" CSV file and returns it's path.""" | ||
| for data in test_data: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need to remove image from test_data before you make the csv file. At the moment, the numpy array is being included in the csv file.
| "image": make_stack(), | ||
| "mask": make_stack(mask=True), | ||
| "image_asym": make_stack(offset=5), | ||
| "mask_asym": make_stack(mask=True, offset=5), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure you need the symmetric image / mask here? Maybe just always use the asymmetric one so any flips / re-orientations etc can be tested?
| for data in test_data: | ||
| image_path = path / data["subject_id"] / f"{data['subject_id']}.nii.gz" | ||
| image_path.parent.mkdir() | ||
| save_as_asr_nii(data["image"], data["voxel_size"], image_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how precise we want to be here, but this function needs the voxel sizes in mm in xyz order (rather than in microns, as it is in the csv files). So these values * 0.001 - it's only really relevant if we add any tests checking output voxel sizes and similar.
Description
What is this PR
Why is this PR needed?
Currently, atlas example scripts repeat much of the same processing. This PR aims to abstract the main processing steps into a re-useable function:
raw_to_readyWhat does this PR do?
Bundles pre-processing steps into a re-useable function - see brainglobe/atlas-forge#10 (comment) for an overview of the various steps agreed for the two main processing steps:
source_to_rawandraw_to_ready.Some steps are still marked as todo in the code:
n4 bias field correction is waiting on Add alternative to ants.n4_bias_field_correction #89Feedback on any part of this is very welcome! Particularly:
raw_to_readyagnostic of specific template atlas-forge#14 about the steps / potential output fromsource_to_rawmask_filepathis provided in the input csv. There's an open issue about this though: Handling masks created from processed nifti images atlas-forge#16References
brainglobe/atlas-forge#10 and brainglobe/atlas-forge#14
How has this PR been tested?
I tested this locally with two example nifiti files arranged in two subject dirs:
source_data.csvwas in the standard input csv formatThe json config file was:
Is this a breaking change?
No
Does this PR require an update to the documentation?
TODO - once brought out of draft
Checklist: