Description
When the BMI-enabled version of Snow-17 is initialized with many catchments (e.g. more than 44) there is a unit number conflict when opening an input file.
Current behavior
The program crashes.
Expected behavior
That is the subject of this issue; the options are discussed below.
Open forum
The issue is that files remain open for the duration of a Snow-17 run, which limits the number of catchments that can be processed in a given run. Background for this issue comes from an email from Andy Wood:
"My original code setup for those models (pre-BMI) looped through each zone (e.g. U, L) and opened/closed the files before moving to the next zone, since, as you say, they don't interact. When I refactored it to use BMI, I changed it to open all the files in the initialize step and close them all in the finalize step, and I didn't think about the upper limits. I/we could easily change the numbering scheme for the files to enable it to keep many 1000s of files open, which most machines would support, and which would enable a reasonably large (but not infinite) standalone run case. And when run in nextgen, snow17 should be getting forcings from the framework and not opening its files. The current numbering scheme envisions running a basin at a time, and the basin might have some number of elevation zones but probably never more than 20 (in RFC world the max is about 3).
An alternative might be tricky – given the way the update() function works. It would be inefficient to have that function re-open and close all the files just to read the forcing for a single timestep. If all the forcings are in one netcdf file, instead of individual csvs, then that could simplify the problem. Is the reason that noah-om doesn't have this issue because it doesn't try to run sub-catchment level zones (or basin sub-catchments)? I think all the noah-om dev. work ran standalone over single catchments (i.e. one forcing file per catchment)."
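To make the pattern described above concrete, here is a rough sketch of the open-everything-in-initialize approach. The module, subroutine, and file names are hypothetical (not the actual Snow-17 BMI code), and the base unit of 50 is only an assumption; the point is that open units accumulate over the run and a fixed numbering scheme eventually lands on a unit that is already in use.

```fortran
! Rough sketch of the open-in-initialize / close-in-finalize pattern
! described above.  Names and the base unit are hypothetical; the number
! of simultaneously open units grows with the number of catchments/zones,
! and a fixed numbering scheme eventually collides with a unit that is
! already connected elsewhere.
module snow17_io_sketch
  implicit none
  integer, parameter :: forcing_base = 50          ! assumed base unit
  integer, allocatable :: forcing_units(:)
contains

  subroutine initialize_forcing(n_zones)
    integer, intent(in) :: n_zones
    integer :: i
    character(len=64) :: fname
    allocate (forcing_units(n_zones))
    do i = 1, n_zones
      forcing_units(i) = forcing_base + i          ! fixed numbering scheme
      write (fname, '(a,i0,a)') 'zone_', i, '_forcing.csv'
      open (unit=forcing_units(i), file=fname, status='old', action='read')
    end do
  end subroutine initialize_forcing

  subroutine finalize_forcing()
    integer :: i
    do i = 1, size(forcing_units)
      close (forcing_units(i))
    end do
    deallocate (forcing_units)
  end subroutine finalize_forcing

end module snow17_io_sketch
```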
Possible alternatives:
1. No code changes other than an error check that stops or warns when the number of catchments exceeds the count at which a unit number conflict would occur when opening a file (a sketch of such a check follows this list)
2. Reformulate the code so that many more CSV files can be open at once (a sketch using newunit= also follows this list)
3. Put all forcings into a single netCDF file
4. Close each file after reading the data for the current time interval
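For alternative (1), the guard could be as small as the following sketch, placed before any forcing files are opened. The names and the limit of 44 are placeholders; the real limit would be whatever catchment count first produces a unit conflict under the current numbering scheme.

```fortran
! Sketch of a guard for alternative (1).  Placeholder names; the real
! limit is whatever catchment count first causes a unit conflict.
subroutine check_catchment_count(n_catchments)
  implicit none
  integer, intent(in) :: n_catchments
  integer, parameter  :: max_catchments = 44   ! assumed limit under the current scheme

  if (n_catchments > max_catchments) then
    write (*, '(a,i0,a,i0,a)') 'ERROR: ', n_catchments, &
      ' catchments requested, but the current unit-numbering scheme supports at most ', &
      max_catchments, '; aborting.'
    stop 1
  end if
end subroutine check_catchment_count
```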
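For alternative (2), one possible reformulation (not necessarily what Andy has in mind) is to stop hand-assigning unit numbers and let the runtime pick free units via the Fortran 2008 newunit= specifier. That removes unit conflicts entirely, although the operating system limit on open files still applies, so it is not unbounded. A sketch with hypothetical file names:

```fortran
! Sketch for alternative (2): let the runtime assign a free unit for each
! forcing file via newunit= (Fortran 2008).  Hypothetical file names.
! Unit collisions go away, but the OS limit on open files still applies.
subroutine open_forcing_files(n_catchments, units, ios)
  implicit none
  integer, intent(in)  :: n_catchments
  integer, intent(out) :: units(n_catchments)
  integer, intent(out) :: ios
  integer :: i
  character(len=64) :: fname

  do i = 1, n_catchments
    write (fname, '(a,i0,a)') 'forcing_', i, '.csv'
    open (newunit=units(i), file=fname, status='old', action='read', iostat=ios)
    if (ios /= 0) return      ! e.g. a missing file or too many open files
  end do
end subroutine open_forcing_files
```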
Alternative discussion:
1. If Snow-17 is executed on a single catchment at a time, then the only change we need to make is to check for multiple catchments and either error out or warn of a unit number conflict. This approach also scales well, because we avoid running out of open file handles or exceeding memory constraints when an arbitrary number of catchments is processed in a single run.
2. If we could realistically need 1000s of files open at a single time, then this option is worth discussing. One consideration is that it seems problematic design-wise, and the maximum number of open file handles is operating system dependent. It also does not scale.
3. Putting all forcings into a single netCDF file avoids the problem of many open files but complicates formatting the forcing input. If the file remains open, then it seems to me that the data from different catchments would need to be interleaved, which would make for a complicated input file format. If, instead, direct access is used, then we would need an algorithm for tracking record numbers. In either case, this solution would not scale memory-wise. (A sketch of a per-timestep netCDF read follows this list.)
4. Similar to (3), this solution would require direct access to the input file and tracking of record numbers. Also, as you mention, it would be computationally expensive. In this case we may need to track multiple record numbers (although they would likely be the same :] ), and it would not scale for an arbitrarily large number of catchments. (A sketch of the reopen-and-skip approach follows this list.)
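To illustrate point (3), here is a sketch of what a per-timestep read from a single netCDF file might look like, assuming a hypothetical precip variable laid out as (catchment, time). It does not settle the interleaving or scaling questions above; it only shows the read side, with the library doing the indexing.

```fortran
! Sketch for alternative (3): read one timestep of forcing for all
! catchments from a single netCDF file.  The variable name and the
! (catchment, time) layout are assumptions for illustration; ncid comes
! from an earlier nf90_open call.
subroutine read_forcing_step(ncid, istep, n_catchments, precip)
  use netcdf
  implicit none
  integer, intent(in)  :: ncid, istep, n_catchments
  real,    intent(out) :: precip(n_catchments)
  integer :: varid, status

  status = nf90_inq_varid(ncid, 'precip', varid)
  if (status /= nf90_noerr) stop 'precip variable not found'

  ! Read all catchments for a single timestep (catchment varies fastest).
  status = nf90_get_var(ncid, varid, precip, &
                        start=[1, istep], count=[n_catchments, 1])
  if (status /= nf90_noerr) stop 'error reading forcing'
end subroutine read_forcing_step
```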
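And to illustrate point (4), a sketch of the reopen-and-skip variant for sequential CSV files. The file naming, the header line, and the two numeric columns per record are assumptions; the skip loop is where the per-timestep cost comes from, and a direct-access variant would replace it with a rec= read at the price of fixed-length records.

```fortran
! Sketch for alternative (4): reopen a forcing CSV each timestep and skip
! to the current record.  Hypothetical file naming; one header line and
! two numeric columns per record assumed.  The skip loop is what makes
! this approach expensive for long runs.
subroutine read_one_step(catchment_id, istep, precip, tair)
  implicit none
  integer, intent(in)  :: catchment_id, istep
  real,    intent(out) :: precip, tair
  integer :: iunit, i
  character(len=64)  :: fname
  character(len=256) :: line

  write (fname, '(a,i0,a)') 'forcing_', catchment_id, '.csv'
  open (newunit=iunit, file=fname, status='old', action='read')

  read (iunit, '(a)') line                 ! skip the assumed header line
  do i = 1, istep - 1                      ! skip records already consumed
    read (iunit, '(a)') line
  end do
  read (iunit, *) precip, tair             ! read the current timestep

  close (iunit)
end subroutine read_one_step
```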
I don’t know the answer to your question about whether noah-om was tested using single catchments, but that does seem to be a critical point. If it is advisable to run Snow-17 a single catchment at a time in the noah-om context, then the file I/O issues become less consequential.