-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
57 use tifffiles #58
base: main
Are you sure you want to change the base?
57 use tifffiles #58
Conversation
Remaining:
I'll get to these tonight. |
I can't test all the datasets myself (since I don't know the proper setups for most of them), but we should check if the results are still correct after the changes above. I've tested MADOS, and will test Sen1Floods11. Can you guys help out with the ones that you have access to? |
It seems that tifffile loads images as (H, W, C), whereas rasterio loads them as (C, H, W). I rearranged the dimensions for the SpaceNet 7 dataset. I tested this and it works. BTW I also had to pip install |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for SpaceNet 7.
The imagecodecs thing is interesting, it is in the dependencies list of tifffiles. Just conda installing tifffiles worked for me. Thanks for spotting the dimension order mismatch, I'll have to modify the code in a few places. I missed it with mados since it only uses 1-channel images :( |
5886809
to
865e2ea
Compare
Are we sure that changing to tifffile does not bring some algorithmic changes? For example, I doubt that rasterio resampling, torch.interpolate and cv2.resize are equivalent (even if we use nearest neighbors in every case). For this reason, I suggest we keep the rasterio dependency so we can ensure that we use the datasets as they are released. |
I would be very concerned if there was any difference in a nearest neighbor (or even bilinear) interpolate between these libraries. These are standard algorithms with well defined behaviors. If there is any difference apart from floating point noise I would file an issue on their github :D (I wouldn't even be surprised if they were calling the same libraries under the hood.) |
So I ran some tests, and there are indeed differences in rounding patterns (eg. which pixel gets chosen if resizing to 0.5). I see two upsides to removing rasterio/any other tiff reading module:
On the other hand, if we decide to keep all dataloaders with their original packages, we'll have to keep doing this for all datasets we add in the future, no matter how many different dependencies we accumulate. There's no guarantee this is even possible in the same venv. |
New update, I did a bit of RTFM. CV2 has a nearest interpolate mode |
So about the channel order in tiffiles: reading the files in Sen1Floods11 returns files in (C, H, W). So I'll have to go over each dataset and see which format they save their data in. Luckily, the info is available in TiffFile.series[0].axes as a string. |
I wanted to check on the progress of this PR. Are all the checks complete and is it ready to be merged? Please let me know if there's anything else needed, and I’d be happy to assist. |
I'm now thinking it might be more complication than it's worth to remove rasterio from the dependencies, so I put it on the backburner for a bit. If we still want to do this, we'll either need to manually check the dimension order for each remaining dataset, or we need to write a function that takes the dimension order from tiffiles and converts it to (C, H, W). Einops could probably handle it quite easy. Then we'll need to check each dataset and see if it works. |
No description provided.