What's the best method for running a single site multi-instance case? #974
-
I've never gone big enough to run into this, @JessicaNeedham, but it may be worth posting on the CTSM forum to see whether anyone there has.
-
@JessicaNeedham are you running with --multi-driver in your call to ./create_newcase? In my ensemble runs I do have multiple user_nl_datm files, but I don't remember exactly how this all works. I thought I was running all land instances with one datm instance, and I may have been editing the env_mach_pes.xml file to do it. Maybe @ekluzek or @billsacks can comment on how to do this properly? Sorry for not being very helpful.
-
@JessicaNeedham so you are running with 780 members and just running ./preview_namelist takes 4 hours? How much time is just preview_namelist? And what machine is this on?

The build is slower on some machines than others, but it shouldn't vary based on the number of instances. The namelist build is going to be slower with more instances, though. We've designed the system around the single-instance case, where the time it takes is acceptable. You might need to send the preview_namelist step to the compute nodes.

Technically it would be possible to build the namelist for each instance in parallel, so it could be sped up that way. But I don't see that we would have the resources to do that, so you'd have to figure it out mostly on your own. If you had time to do that, we could give you some consulting on how to do it.

For your case, do you want to have one datm that is identical for all instances? @pollybuotte is right to bring up multi-driver, but it also depends on what you need to do here. This isn't something I use that often, so I would have to look it up.
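To illustrate the "each instance in parallel" idea in general terms, here is a minimal Python sketch. It is not CIME code: `build_one` is a hypothetical placeholder for whatever generates a single instance's namelist, the `datm_in_NNNN` names are only illustrative, and a real change would have to live inside CIME's own namelist scripts.

```python
from concurrent.futures import ThreadPoolExecutor


def build_one(instance):
    """Hypothetical stand-in for generating one instance's namelist.

    A real version would do the per-instance work here; this placeholder
    only returns an illustrative file name.
    """
    return f"datm_in_{instance:04d}"


def build_all(ninst, workers=8):
    """Run build_one for instances 1..ninst using a pool of workers.

    pool.map returns results in instance order, whatever order the
    workers happen to finish in.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(build_one, range(1, ninst + 1)))
```

Calling `build_all(780)` would then hand the 780 per-instance tasks to eight workers instead of looping over them one at a time. Whether this helps in practice depends on where the four hours are actually being spent.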
-
@ekluzek I think it was both the ‘create namelist for component datm’ step and preview_namelist that were taking hours to run. Judging by the timestamps of the files, preview_namelist appeared to be recreating the datm files in the case run directory. I am not familiar with CIME or with building cases, so I’m not sure whether that is supposed to happen. I am running on Compy, PNNL’s machine. All instances in the case should have the same datm; only the parameter file is supposed to differ between them. I tried multi-driver, but it didn’t seem to make much difference, at least not to the namelist build time.
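For anyone setting up something similar, this is roughly how the "only the parameter file differs" part could be scripted. It is a minimal sketch with assumptions: that the per-instance land namelist files in the case directory are named `user_nl_clm_0001`, `user_nl_clm_0002`, and so on, and that the FATES parameter file is set through `fates_paramfile`. The function name and the paths in the usage note are hypothetical.

```python
from pathlib import Path


def write_instance_namelists(case_dir, param_files):
    """Append one fates_paramfile line to each per-instance user_nl_clm file.

    Assumes files named user_nl_clm_NNNN (4-digit, 1-based) in case_dir.
    Appending keeps any header already in the file, so running this twice
    will add the line twice.
    """
    case_dir = Path(case_dir)
    written = []
    for i, param_file in enumerate(param_files, start=1):
        nl_path = case_dir / f"user_nl_clm_{i:04d}"
        with nl_path.open("a") as nl:
            nl.write(f"fates_paramfile = '{param_file}'\n")
        written.append(nl_path)
    return written
```

For example, `write_instance_namelists("/path/to/case", ["/path/to/params_0001.nc", "/path/to/params_0002.nc"])` would touch two files. This does nothing about the shared datm question or the namelist build time.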
-
I'm trying to set up a large ensemble of runs at a single site using the multi-instance functionality. The problem is that namelist building is taking over four hours in the case.build step and again in case.submit.
From the old issue ESCOMP/CTSM#515, it looks like one can avoid at least the second namelist build by running preview_namelist before or after case.build (I'm not sure which), and then running case.submit with the --skip-preview-namelist flag.
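Written out as a dry-run Python sketch, the sequence I have in mind looks like this. Assumptions: the commands are run from the case directory, the CIME script is spelled `./preview_namelists`, and the explicit preview step is left optional and placed after the build, since whether it is needed and on which side of the build is exactly what I am unsure about. The function names and the case path are hypothetical.

```python
import subprocess
from pathlib import Path


def submit_steps(extra_preview=False):
    """Return the case-directory commands in the order they would run."""
    steps = [["./case.build"]]
    if extra_preview:
        # Optional explicit namelist generation after the build.
        steps.append(["./preview_namelists"])
    # Ask case.submit not to regenerate the namelists.
    steps.append(["./case.submit", "--skip-preview-namelist"])
    return steps


def run_steps(case_dir, steps, dry_run=True):
    """Print each command; execute it in case_dir only when dry_run is False."""
    for cmd in steps:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, cwd=Path(case_dir), check=True)


if __name__ == "__main__":
    run_steps("/path/to/case", submit_steps(), dry_run=True)
```

With `dry_run=True` this only prints the commands, which makes it easy to check the order before committing to a multi-hour build.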
Is there a way to have a multi-instance case where ensemble members share data streams? I was reading the clone_case documentation here but it looks like that would create a separate case for each instance?
This is more of a CIME issue, but it might be helpful for FATES users to have some guidelines documented here.